# Plotting in JuliaDB

JuliaDB tables are plotted with StatPlots.jl and a few plot recipes that JuliaDB defines itself.

In [64]:
using JuliaDB, OnlineStats, Plots
gr()

Plots.GRBackend()

## Plotting with [StatPlots.jl](https://github.com/JuliaPlots/StatPlots.jl)

StatPlots and its `@df` macro make the full power and flexibility of [Plots](https://github.com/JuliaPlots/Plots.jl) available to JuliaDB tables: inside an `@df` call, columns are referred to by symbol (for example `:x`).

In [65]:
using StatPlots

t = table(@NT(x = randn(100), y = randn(100)))

@df t scatter(:x, :y)

## Plotting with `partitionplot`

`partitionplot` is intended for large datasets that have too many points to render individually. The data is split into sections, each summarized by an `OnlineStat`.
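As a rough illustration of the split-then-summarize idea, and not JuliaDB's actual implementation, the sketch below orders the data by `x`, cuts it into `nparts` contiguous sections, and keeps only the extrema of `y` in each section. The helper name `summarize_partitions` is hypothetical, plain tuples stand in for an `OnlineStat`, and only Base Julia is used.

```julia
# Conceptual sketch only: NOT how JuliaDB implements partitionplot.
# `summarize_partitions` is a hypothetical helper written for illustration.
function summarize_partitions(x, y; nparts = 10)
    n = length(x)
    order = sortperm(x)                  # visit observations in increasing x
    summaries = Tuple{Float64,Float64,Float64,Float64}[]
    for k in 1:nparts
        # contiguous, near-equal-sized index ranges that together cover 1:n
        lo = div((k - 1) * n, nparts) + 1
        hi = div(k * n, nparts)
        hi < lo && continue              # skip empty sections when nparts > n
        idx = order[lo:hi]
        ymin, ymax = extrema(y[idx])     # the per-section summary
        # (first x in section, last x in section, min of y, max of y)
        push!(summaries, (x[idx[1]], x[idx[end]], ymin, ymax))
    end
    return summaries
end

x = collect(1.0:100.0)
y = 2 .* x
s = summarize_partitions(x, y; nparts = 4)
```

With 100 points and `nparts = 4`, a plot would only need to draw four summaries instead of 100 markers, which is the reason this kind of approach scales to large tables.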

In [66]:
diamonds = loadtable("diamonds.csv"; indexcols = [:carat])

Table with 53940 rows, 10 columns:
carat  cut          color  clarity  depth  table  price  x      y      z
───────────────────────────────────────────────────────────────────────────
0.2    "Premium"    "E"    "SI2"    60.2   62.0   345    3.79   3.75   2.27
0.2    "Premium"    "E"    "VS2"    59.8   62.0   367    3.79   3.77   2.26
0.2    "Premium"    "E"    "VS2"    59.0   60.0   367    3.81   3.78   2.24
0.2    "Premium"    "E"    "VS2"    61.1   59.0   367    3.81   3.78   2.32
0.2    "Premium"    "E"    "VS2"    59.7   62.0   367    3.84   3.8    2.28
0.2    "Ideal"      "E"    "VS2"    59.7   55.0   367    3.86   3.84   2.3
0.2    "Premium"    "F"    "VS2"    62.6   59.0   367    3.73   3.71   2.33
0.2    "Ideal"      "D"    "VS2"    61.5   57.0   367    3.81   3.77   2.33
0.2    "Very Good"  "E"    "VS2"    63.4   59.0   367    3.74   3.71   2.36
0.2    "Ideal"      "E"    "VS2"    62.2   57.0   367    3.76   3.73   2.33
0.2    "Premium"    "D"    "VS2"    62.3   60.0 

In [67]:
partitionplot(diamonds, :price)

In [68]:
partitionplot(diamonds, :price; stat=Hist(25), color=:blues, dropmissing=true)

In [69]:
partitionplot(diamonds, :cut, :color, stat=CountMap(String), bar_width=1, dropmissing=true)

In [70]:
partitionplot(diamonds, :carat, :price, by=:cut, stat=Extrema(), nparts=50,
    layout=(5,1), xlim=(0,3.5), size=(900,500))

## Missing Data

In [71]:
using DataValues
t = table(DataValueArray(randn(10^6), rand(Bool, 10^6)), randn(10^6))

Table with 1000000 rows, 2 columns:
1          2
─────────────────────
#NA        1.59696
0.354054   -0.194102
-0.147972  1.13246
#NA        -0.242726
#NA        -0.0875334
#NA        0.667347
#NA        0.449333
#NA        -1.18035
#NA        0.30503
#NA        -1.19647
#NA        1.42612
-1.99916   -1.13445
⋮
0.225049   -0.415386
0.483655   0.915587
-1.47836   -0.245382
#NA        -0.0177316
1.80047    0.831666
#NA        1.08034
#NA        -0.121978
#NA        -0.716103
-0.503971  -0.196417
0.654761   0.160032
0.452883   0.852256

In [72]:
partitionplot(t, 1, dropmissing=true)

In [73]:
partitionplot(t, 2, dropmissing=true)

In [74]:
partitionplot(t, 1, 2, dropmissing=true, stat = Hist(10))