# Quick Intro to OnlineStats

- OnlineStats provides on-line (single-pass) algorithms for statistics, and the same algorithms can run in parallel.

In [1]:
addprocs()
using OnlineStats

## Every stat is a type

In [2]:
m, v = Mean(), Variance()

(Mean(0.0), Variance(0.0))

## Stats are collected in a `Series`

In [3]:
s = Series(m, v)

▦ Series{0}
│ EqualWeight | nobs=0
├── Mean(0.0)
└── Variance(0.0)

## A `Series` can be `fit!`-ted with more data

In [4]:
fit!(s, randn(100))

▦ Series{0}
│ EqualWeight | nobs=100
├── Mean(-0.233069)
└── Variance(0.878082)
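
To make the idea of `fit!`-ting concrete, here is a minimal sketch of an on-line mean and variance that is updated one observation at a time using Welford's update. It is not how OnlineStats is implemented: the names `RunningStat`, `update!`, and `variance` are hypothetical and exist only for this illustration.

```julia
# Hypothetical stand-in for a (Mean, Variance) pair; NOT the OnlineStats implementation.
mutable struct RunningStat
    n::Int         # number of observations seen so far
    mean::Float64  # running mean
    m2::Float64    # running sum of squared deviations from the mean
end
RunningStat() = RunningStat(0, 0.0, 0.0)

# Welford's update: fold one observation into the summary in O(1) time and memory.
function update!(s::RunningStat, x::Real)
    s.n += 1
    delta = x - s.mean
    s.mean += delta / s.n
    s.m2 += delta * (x - s.mean)
    return s
end

# Fold a whole batch, one observation at a time.
function update!(s::RunningStat, xs::AbstractVector{<:Real})
    for x in xs
        update!(s, x)
    end
    return s
end

# Unbiased sample variance (NaN until two observations have been seen).
variance(s::RunningStat) = s.n > 1 ? s.m2 / (s.n - 1) : NaN

s = RunningStat()
update!(s, [1.0, 2.0, 3.0])  # first batch
update!(s, [4.0, 5.0])       # more data arrives later; no need to revisit the first batch
println((s.n, s.mean, variance(s)))  # prints (5, 3.0, 2.5)
```

The summary stays the same size no matter how much data has been seen, which is what makes the single-pass approach practical for data that does not fit in memory.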

## `Series` can be merged together

In [5]:
s2 = Series(randn(100), Mean(), Variance())
merge!(s, s2)

▦ Series{0}
│ EqualWeight | nobs=200
├── Mean(-0.169194)
└── Variance(1.03277)
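
Merging works because a mean and variance computed on two separate chunks can be combined without revisiting the data. The sketch below uses the standard pairwise formula (Chan et al.) and assumes non-empty chunks with more than one observation in total. The names `Summary`, `summarize`, `combine`, and `variance` are hypothetical, and this is not a description of what `merge!` does internally.

```julia
# Hypothetical summary of one data chunk: count, mean, and sum of squared deviations.
struct Summary
    n::Int
    mean::Float64
    m2::Float64
end

# Build a summary directly from a non-empty chunk (two passes; fine for a sketch).
function summarize(xs::AbstractVector{<:Real})
    n = length(xs)
    mu = sum(xs) / n
    return Summary(n, mu, sum((x - mu)^2 for x in xs))
end

# Combine two summaries without touching the underlying data.
function combine(a::Summary, b::Summary)
    n = a.n + b.n
    n == 0 && return Summary(0, 0.0, 0.0)
    delta = b.mean - a.mean
    mu = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta^2 * a.n * b.n / n
    return Summary(n, mu, m2)
end

# Unbiased sample variance; assumes at least two observations.
variance(s::Summary) = s.m2 / (s.n - 1)

left, right = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]
merged = combine(summarize(left), summarize(right))  # e.g. two workers, one chunk each
whole = summarize(vcat(left, right))                 # the same data in a single pass
println(merged.mean ≈ whole.mean && variance(merged) ≈ variance(whole))  # prints true
```

This is the property that lets each process summarize its own chunk in parallel and hand back only a small summary to be combined.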

# `fit!`-ting and `merge!`-ing work well alongside JuliaDB

<img src="https://user-images.githubusercontent.com/8075494/32748459-519986e8-c88a-11e7-89b3-80dedf7f261b.png" width=400>

# Jump into an Example

- OnlineStats integration is available through the `reduce` and `groupreduce` functions.
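
Conceptually, a reduction folds every row into one running statistic, while a grouped reduction keeps one running statistic per key. The plain-Julia sketch below illustrates the grouped case with a running mean. It does not use JuliaDB or show how JuliaDB implements `groupreduce`; the function `grouped_mean` is hypothetical and the rows are toy values made up for the example.

```julia
# Fold (key, value) rows into one running mean per key, in a single pass.
# Each accumulator is a (count, mean) tuple; all names here are illustrative.
function grouped_mean(rows)
    acc = Dict{String,Tuple{Int,Float64}}()
    for (key, value) in rows
        n, mu = get(acc, key, (0, 0.0))  # start a fresh accumulator for a new key
        n += 1
        mu += (value - mu) / n           # on-line mean update
        acc[key] = (n, mu)
    end
    return Dict(k => v[2] for (k, v) in acc)  # keep only the mean for each key
end

# Toy rows, made up for illustration; NOT taken from the stock files.
rows = [("aapl", 10.0), ("msft", 20.0), ("aapl", 14.0), ("msft", 22.0), ("aapl", 12.0)]
means = grouped_mean(rows)
println(means["aapl"], " ", means["msft"])  # prints 12.0 21.0
```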

In [24]:
;ls stocksample

aapl.us.txt
amzn.us.txt
dis.us.txt
googl.us.txt
ibm.us.txt
msft.us.txt
nflx.us.txt
tsla.us.txt


In [7]:
using JuliaDB
t = loadtable("stocksample", filenamecol = :Ticker, indexcols = [:Ticker, :Date])

Distributed Table with 56023 rows in 8 chunks:
Columns:
[1m#  [22m[1mcolname  [22m[1mtype[22m
───────────────────
1  Ticker   String
2  Date     Date
3  Open     Float64
4  High     Float64
5  Low      Float64
6  Close    Float64
7  Volume   Int64
8  OpenInt  Int64