# Time series analysis with Julia
Time series analysis in Julia is easiest with the pandas-like `TimeSeries` package. The package is registered, so `Pkg.add` can install it by name.

In [None]:
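# Pkg.add("TimeSeries")
using TimeSeries
# on Julia 0.7 and later, `Date`, `mean` and `std` come from these standard libraries
using Dates, Statistics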

# Basics
## TimeArray
At the heart of TimeSeries is `TimeArray`, an array-like collection of columns whose rows are indexed by time. A `TimeArray` is constructed as follows:
```julia
tarr = TimeArray(dt, val, colnames)
```
where `dt` is a `(T,)` collection of `Date` instances, `val` is a `(T,N)` array of values, and `colnames` is a `(N,)` collection of column names.

Let's create a `TimeArray` of two series called 'MSFT' and 'TSLA', populated with random numbers:

In [None]:
# let's take T=20 days in January 1999
T, N = 20, 2

# dates are a sequence of T days from January 1, 1999
start_date = Date(1999, 1, 1)
dt = collect(start_date : start_date + Dates.Day(T-1))

# column names are two strings
colnames = String["MSFT", "TSLA"]

# values are random normally distributed numbers
val = randn(T, N)

# construct array
tarr = TimeArray(dt, val, colnames)

print(tarr)

## Extracting values and dates
Sometimes it is useful to extract the values and dates from a `TimeArray`. This is done with the functions `values` and `timestamp`:

In [None]:
println("\nvalues of tarr:\n")
print(values(tarr))
println("\n\ndates of tarr:\n")
print(timestamp(tarr))

## Handy functions
### Selectors

In [None]:
# print first `k` values
k = 6
print(head(tarr, k))

In [None]:
# print last `k` values
print(tail(tarr, k))

In [None]:
# select column "MSFT"
print(head(tarr["MSFT"], k))

In [None]:
# select values from some date onwards
date_from = Date(1999, 1, 10)
print(from(tarr, date_from))

In [None]:
# select values from some date to some other date
date_from = Date(1999, 1, 10)
date_to = Date(1999, 1, 12)
# equivalent: print(to(from(tarr, date_from), date_to))
print(tarr[date_from : date_to])

In [None]:
# select rows by some condition
println("\nstock values on Tuesdays:\n")
print(when(tarr, dayofweek, 2))

More conditions:

| Method          | Example       |
| -------------   | :-----------: |
| `month`         | 1             |
| `quarterofyear` | 4             |
| `year`          | 2000          |

Yet more are listed in the official [docs](http://timeseriesjl.readthedocs.io/en/latest/split.html#when).
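
Each condition function maps a date to an integer, and `when` keeps the rows where that integer equals the given value. The idea can be sketched with the `Dates` standard library alone. The names `dts` and `tuesdays` are illustrative, and this is not how `when` itself is implemented:

```julia
using Dates

# the same 20 days that index `tarr`
dts = collect(Date(1999, 1, 1):Day(1):Date(1999, 1, 20))

# `dayofweek` numbers days from 1 (Monday) to 7 (Sunday), so 2 is Tuesday
tuesdays = filter(d -> dayofweek(d) == 2, dts)
println(tuesdays)

# the other condition functions map a date to an integer in the same way
d = Date(1999, 1, 12)
println((month(d), quarterofyear(d), year(d)))
```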

### Shifters

In [None]:
# lag values by `p`
p = 1
tarr_lagged = lag(tarr, p)

println("\nfirst values of the old array\n")
print(head(tarr, 4))
println("\nfirst values of the array lagged by 1\n")
print(head(tarr_lagged, 4))

In [None]:
# lead values by 1
println("\nfirst values of the old array\n")
print(head(tarr, 4))
println("\nfirst values of the array led by 1\n")
print(head(lead(tarr, 1), 4))
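
What a shift does to the pairing of dates and values can be sketched on plain vectors. This is an illustration only: it assumes that rows left without a value are dropped rather than padded, and the names `dts` and `vals` are made up for the example:

```julia
using Dates

dts  = collect(Date(1999, 1, 1):Day(1):Date(1999, 1, 5))
vals = [10.0, 20.0, 30.0, 40.0, 50.0]
p = 1

# lag by p: each date receives the value observed p rows earlier,
# so the first p dates have no value and are dropped here
lag_dts, lag_vals = dts[p+1:end], vals[1:end-p]

# lead by p: each date receives the value observed p rows later,
# so the last p dates are dropped
lead_dts, lead_vals = dts[1:end-p], vals[p+1:end]

println(collect(zip(lag_dts, lag_vals)))
println(collect(zip(lead_dts, lead_vals)))
```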

### Resampler
Syntax:
```julia
collapse(ta, period, timestamp, value)
```
where `period` is the aggregation period, e.g. `month` or `week`; `timestamp` is the function applied to the dates within each period, e.g. `last` to keep the last date of the period or `first` to keep the first one; `value` is the function applied to the values, e.g. `mean` to average within each period.

In [None]:
# resample weekly by summing values and reindexing with the last date of each week
#     (`last` picks it; for a complete week that is a Sunday);
#     if values are daily log-returns, this gives weekly log-returns
print(collapse(tarr, week, last, sum))
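
The roles of the three function arguments may be easier to see in a plain-Julia sketch. `collapse_sketch` is a hypothetical helper written for this illustration, not the package's implementation. It assumes the dates are sorted and groups consecutive rows that share a period label:

```julia
using Dates

# hypothetical helper: group consecutive rows sharing `period(date)`,
# then reduce the dates with `ts_fun` and the values with `val_fun`
function collapse_sketch(dts, vals, period, ts_fun, val_fun)
    out_dts = eltype(dts)[]
    out_vals = Float64[]
    start = 1
    for i in eachindex(dts)
        # a group ends at the last row or where the period label changes
        if i == lastindex(dts) || period(dts[i + 1]) != period(dts[i])
            push!(out_dts, ts_fun(dts[start:i]))
            push!(out_vals, val_fun(vals[start:i]))
            start = i + 1
        end
    end
    return out_dts, out_vals
end

dts = collect(Date(1999, 1, 1):Day(1):Date(1999, 1, 20))
vals = collect(1.0:20.0)

wk_dts, wk_vals = collapse_sketch(dts, vals, week, last, sum)
println(wk_dts)
println(wk_vals)
```

In this sketch the last group covers an incomplete week, so its date is the last available day (a Wednesday) rather than a Sunday.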

### Concatenation
Horizontal concatenation is done with `merge`, vertical concatenation with `vcat` (the column names must coincide).

In [None]:
# rename the columns of `tarr` and stack the result horizontally with the original
tarr_renamed = rename(tarr, ["MSFT0", "TSLA0"])
tarr_2 = merge(tarr, tarr_renamed)
print(tarr_2)

## Calculations on TimeArrays
The operations are only performed on values that share a timestamp, e.g. when adding two TimeArrays, the values on 1999-01-03 in the first array are added to those on 1999-01-03 of the second array and so on.

Let's create a second array, with some dates overlapping the dates of `tarr`:

In [None]:
# let's take 10 days in January 1999
T, N = 10, 1

# dates; the first few overlap the last dates of `tarr`
dates = [timestamp(tarr)[end-4] + Dates.Day(d) for d in 1:T]

# column names are a single string
colnames = String["Rf"]

# values are uniform random numbers in [0, 100);
# named `rf_val` so that the `values` function used earlier is not overwritten
rf_val = rand(T, N)*100

# construct array
rf = TimeArray(dates, rf_val, colnames)

print(rf)

In [None]:
# subtract risk-free rate from both
println("\nmind that only 'common' dates are left!\n")
print(tarr .- rf)
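
The alignment on common dates can be sketched without the package. The vectors below are illustrative stand-ins for one column of `tarr` and for `rf`, and the approach assumes both date vectors are sorted:

```julia
using Dates

# stand-in for one column of `tarr`: 20 days of values
dts_a = collect(Date(1999, 1, 1):Day(1):Date(1999, 1, 20))
a = collect(1.0:20.0)

# stand-in for `rf`: 10 days, the first 4 of which overlap `dts_a`
dts_b = collect(Date(1999, 1, 17):Day(1):Date(1999, 1, 26))
b = fill(0.5, 10)

# keep only the timestamps present in both series
common = intersect(dts_a, dts_b)
ia = findall(in(common), dts_a)
ib = findall(in(common), dts_b)

# element-wise subtraction on the aligned rows
excess = a[ia] .- b[ib]
println(collect(zip(common, excess)))
```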

## Plotting

In [None]:
using Plots
gr()

We'll wrap plotting in a function that keeps our settings in one place. The only setting so far is a line width of 1.5:

In [None]:
"""
Plot a TimeArray on the date axis.
    
Parameters
----------
ta : TimeArray
    
Returns
-------
the plot object created by `plot`
"""
function prettyplot(ta::TimeArray)
    plot(ta, lw=1.5)
end

In [None]:
# plot
prettyplot(tarr)

# Time series analysis

## Filters
A handy feature is rolling and expanding calculations:

In [None]:
# 5-period moving average
w = 5
prettyplot(moving(tarr, mean, w))

In [None]:
# 10-period moving Sharpe ratio (risk-free rate taken to be 0)
w = 10
prettyplot(moving(tarr, mean, w) ./ moving(tarr, std, w))

In [None]:
# rolling 7-period weird statistic: max / min
# define a function calculating the desired quantity
weird_fun = arr -> maximum(arr) / minimum(arr)
prettyplot(moving(tarr, weird_fun, 7))

In [None]:
# expanding minimum
prettyplot(upto(tarr, minimum))
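
The difference between a rolling and an expanding calculation can be sketched on a plain vector. This is an illustration with made-up data, not the package's implementation:

```julia
using Statistics

x = [5.0, 3.0, 4.0, 1.0, 2.0, 6.0]
w = 3

# rolling: apply the function to the last `w` observations ending at row i,
# so the first w-1 rows produce no result
roll_mean = [mean(x[i-w+1:i]) for i in w:length(x)]

# expanding: apply the function to all observations up to and including row i
exp_min = [minimum(x[1:i]) for i in 1:length(x)]

println(roll_mean)
println(exp_min)
```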

# Converting TimeArrays to DataFrames
This is done with the `IterableTables` package (registered):

In [None]:
using Pkg  # needed on Julia 0.7 and later
Pkg.add("IterableTables")

In [None]:
using DataFrames, IterableTables

In [None]:
# convert
df = DataFrame(tarr)
print(df)