<a href="https://colab.research.google.com/github/xKDR/Julia-Workshop/blob/main/DataStructuresForSpeed.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# <img src="https://github.com/JuliaLang/julia-logo-graphics/raw/master/images/julia-logo-color.png" height="100" /> _Colab Notebook Template_

## Instructions
1. Work on a copy of this notebook: _File_ > _Save a copy in Drive_ (you will need a Google account). Alternatively, you can download the notebook using _File_ > _Download .ipynb_, then upload it to [Colab](https://colab.research.google.com/).
2. If you need a GPU: _Runtime_ > _Change runtime type_ > _Hardware accelerator_ = _GPU_.
3. Execute the following cell (click on it and press Ctrl+Enter) to install Julia, IJulia and other packages (if needed, update `JULIA_VERSION` and the other parameters). This takes a couple of minutes.
4. Reload this page (press Ctrl+R, or ⌘+R, or the F5 key) and continue to the next section.

_Notes_:
* If your Colab Runtime gets reset (e.g., due to inactivity), repeat steps 2, 3 and 4.
* After installation, if you want to change the Julia version or activate/deactivate the GPU, you will need to reset the Runtime: _Runtime_ > _Factory reset runtime_ and repeat steps 3 and 4.

In [None]:
%%shell
set -e

#---------------------------------------------------#
JULIA_VERSION="1.10.4" # any version ≥ 0.7.0
JULIA_PACKAGES="IJulia BenchmarkTools"
JULIA_PACKAGES_IF_GPU="CUDA" # or CuArrays for older Julia versions
JULIA_NUM_THREADS=2
#---------------------------------------------------#

if [ -z `which julia` ]; then
  # Install Julia
  JULIA_VER=`cut -d '.' -f -2 <<< "$JULIA_VERSION"`
  echo "Installing Julia $JULIA_VERSION on the current Colab Runtime..."
  BASE_URL="https://julialang-s3.julialang.org/bin/linux/x64"
  URL="$BASE_URL/$JULIA_VER/julia-$JULIA_VERSION-linux-x86_64.tar.gz"
  wget -nv $URL -O /tmp/julia.tar.gz # -nv means "not verbose"
  tar -x -f /tmp/julia.tar.gz -C /usr/local --strip-components 1
  rm /tmp/julia.tar.gz

  # Install Packages
  nvidia-smi -L &> /dev/null && export GPU=1 || export GPU=0
  if [ $GPU -eq 1 ]; then
    JULIA_PACKAGES="$JULIA_PACKAGES $JULIA_PACKAGES_IF_GPU"
  fi
  for PKG in `echo $JULIA_PACKAGES`; do
    echo "Installing Julia package $PKG..."
    julia -e 'using Pkg; pkg"add '$PKG'; precompile;"' &> /dev/null
  done

  # Install kernel and rename it to "julia"
  echo "Installing IJulia kernel..."
  julia -e 'using IJulia; IJulia.installkernel("julia", env=Dict(
      "JULIA_NUM_THREADS"=>"'"$JULIA_NUM_THREADS"'"))'
  KERNEL_DIR=`julia -e "using IJulia; print(IJulia.kerneldir())"`
  KERNEL_NAME=`ls -d "$KERNEL_DIR"/julia*`
  mv -f $KERNEL_NAME "$KERNEL_DIR"/julia

  echo ''
  echo "Successfully installed `julia -v`!"
  echo "Please reload this page (press Ctrl+R, ⌘+R, or the F5 key) then"
  echo "jump to the 'Checking the Installation' section."
fi

Installing Julia 1.10.4 on the current Colab Runtime...
2024-10-23 05:56:43 URL:https://julialang-s3.julialang.org/bin/linux/x64/1.10/julia-1.10.4-linux-x86_64.tar.gz [173704015/173704015] -> "/tmp/julia.tar.gz" [1]
Installing Julia package IJulia...
Installing Julia package BenchmarkTools...


In [1]:
versioninfo()

Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 2 × Intel(R) Xeon(R) CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, broadwell)
Threads: 2 default, 0 interactive, 1 GC (on 2 virtual cores)
Environment:
  LD_LIBRARY_PATH = /usr/local/nvidia/lib:/usr/local/nvidia/lib64
  JULIA_NUM_THREADS = 2


# Data structures for speed

Julia is a strong choice when execution speed matters for manipulating
tabular data. In this session we cover the basics of manipulating
tabular data with DataFrames.jl and time-series data with TSFrames.jl.

In [2]:
using Pkg
Pkg.add("DataFrames")
Pkg.add("TSFrames")
Pkg.add("RDatasets")
Pkg.add("CSV")
Pkg.add("MarketData")
Pkg.add("Impute")

    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
   Installed Crayons ───────────────────── v4.1.1
   Installed SentinelArrays ────────────── v1.4.7
   Installed PooledArrays ──────────────── v1.4.3
   Installed TableTraits ───────────────── v1.0.1
   Installed DataAPI ───────────────────── v1.16.0
   Installed Tables ────────────────────── v1.12.0
   Installed InlineStrings ─────────────── v1.4.2
   Installed PrettyTables ──────────────── v2.4.0
   Installed IteratorInterfaceExtensions ─ v1.0.0
   Installed OrderedCollections ────────── v1.7.0
   Installed LaTeXStrings ──────────────── v1.4.0
   Installed InvertedIndices ───────────── v1.3.0
   Installed DataVal

In [3]:
using DataFrames

In [4]:
df = DataFrame()  # an empty data frame with no columns

In [5]:
df = DataFrame(a=[1,2], b=[2,3])

Row,a,b
,Int64,Int64
1,1,2
2,2,3


In [97]:
using CSV
aapl_df = CSV.read("aapl.csv", DataFrame)  # assumes aapl.csv already exists in the working directory

Row,timestamp,Open,High,Low,Close,AdjClose,Volume
,Date,Float64,Float64,Float64,Float64,Float64,Float64
1,1980-12-12,0.128348,0.128906,0.128348,0.128348,0.0988345,4.69034e8
2,1980-12-15,0.12221,0.12221,0.121652,0.121652,0.0936782,1.75885e8
3,1980-12-16,0.113281,0.113281,0.112723,0.112723,0.0868024,1.05728e8
4,1980-12-17,0.115513,0.116071,0.115513,0.115513,0.0889509,8.64416e7
5,1980-12-18,0.118862,0.11942,0.118862,0.118862,0.0915298,7.34496e7
6,1980-12-19,0.126116,0.126674,0.126116,0.126116,0.0971157,4.86304e7
7,1980-12-22,0.132254,0.132813,0.132254,0.132254,0.101842,3.73632e7
8,1980-12-23,0.137835,0.138393,0.137835,0.137835,0.10614,4.69504e7
9,1980-12-24,0.145089,0.145647,0.145089,0.145089,0.111726,4.80032e7
10,1980-12-26,0.158482,0.15904,0.158482,0.158482,0.122039,5.55744e7


In [7]:
## Pkg.add("MySQL")
## Pkg.add("JSON")

In [8]:
using RDatasets
iris = dataset("datasets", "iris")

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Float64,Cat…
1,5.1,3.5,1.4,0.2,setosa
2,4.9,3.0,1.4,0.2,setosa
3,4.7,3.2,1.3,0.2,setosa
4,4.6,3.1,1.5,0.2,setosa
5,5.0,3.6,1.4,0.2,setosa
6,5.4,3.9,1.7,0.4,setosa
7,4.6,3.4,1.4,0.3,setosa
8,5.0,3.4,1.5,0.2,setosa
9,4.4,2.9,1.4,0.2,setosa
10,4.9,3.1,1.5,0.1,setosa


In [9]:
describe(iris)

Row,variable,mean,min,median,max,nmissing,eltype
,Symbol,Union…,Any,Union…,Any,Int64,DataType
1,SepalLength,5.84333,4.3,5.8,7.9,0,Float64
2,SepalWidth,3.05733,2.0,3.0,4.4,0,Float64
3,PetalLength,3.758,1.0,4.35,6.9,0,Float64
4,PetalWidth,1.19933,0.1,1.3,2.5,0,Float64
5,Species,,setosa,,virginica,0,"CategoricalValue{String, UInt8}"


In [10]:
first(iris)

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Float64,Cat…
1,5.1,3.5,1.4,0.2,setosa


In [11]:
first(iris, 10)

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Float64,Cat…
1,5.1,3.5,1.4,0.2,setosa
2,4.9,3.0,1.4,0.2,setosa
3,4.7,3.2,1.3,0.2,setosa
4,4.6,3.1,1.5,0.2,setosa
5,5.0,3.6,1.4,0.2,setosa
6,5.4,3.9,1.7,0.4,setosa
7,4.6,3.4,1.4,0.3,setosa
8,5.0,3.4,1.5,0.2,setosa
9,4.4,2.9,1.4,0.2,setosa
10,4.9,3.1,1.5,0.1,setosa


In [12]:
last(iris, 10)

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Float64,Cat…
1,6.7,3.1,5.6,2.4,virginica
2,6.9,3.1,5.1,2.3,virginica
3,5.8,2.7,5.1,1.9,virginica
4,6.8,3.2,5.9,2.3,virginica
5,6.7,3.3,5.7,2.5,virginica
6,6.7,3.0,5.2,2.3,virginica
7,6.3,2.5,5.0,1.9,virginica
8,6.5,3.0,5.2,2.0,virginica
9,6.2,3.4,5.4,2.3,virginica
10,5.9,3.0,5.1,1.8,virginica


In [13]:
iris[1, :]

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Float64,Cat…
1,5.1,3.5,1.4,0.2,setosa


In [14]:
iris[:, 1]  # `:` returns a copy of the first column

150-element Vector{Float64}:
 5.1
 4.9
 4.7
 4.6
 5.0
 5.4
 4.6
 5.0
 4.4
 4.9
 5.4
 4.8
 4.8
 ⋮
 6.0
 6.9
 6.7
 6.9
 5.8
 6.8
 6.7
 6.7
 6.3
 6.5
 6.2
 5.9

In [15]:
iris[!, 1]  # `!` returns the stored column itself, without copying

150-element Vector{Float64}:
 5.1
 4.9
 4.7
 4.6
 5.0
 5.4
 4.6
 5.0
 4.4
 4.9
 5.4
 4.8
 4.8
 ⋮
 6.0
 6.9
 6.7
 6.9
 5.8
 6.8
 6.7
 6.7
 6.3
 6.5
 6.2
 5.9
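
The two cells above print the same values, but `:` and `!` differ in whether the column is copied. A minimal sketch of the difference, using a small made-up data frame (`toy`, `col_copy` and `col_ref` are illustrative names, not part of the session):

```julia
using DataFrames

toy = DataFrame(a = [1, 2, 3])   # hypothetical toy data

col_copy = toy[:, :a]   # `:` materializes a new vector
col_ref  = toy[!, :a]   # `!` hands back the vector stored in the data frame

col_copy[1] = 100       # does not touch `toy`
col_ref[2]  = 200       # changes `toy` as well

toy.a                   # first element is still 1, second element is now 200
```

Prefer `:` when the result will be mutated independently, and `!` when avoiding the extra allocation matters.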

In [16]:
iris[!, [1, 2]]

Row,SepalLength,SepalWidth
,Float64,Float64
1,5.1,3.5
2,4.9,3.0
3,4.7,3.2
4,4.6,3.1
5,5.0,3.6
6,5.4,3.9
7,4.6,3.4
8,5.0,3.4
9,4.4,2.9
10,4.9,3.1


In [17]:
iris[!, :SepalLength]

150-element Vector{Float64}:
 5.1
 4.9
 4.7
 4.6
 5.0
 5.4
 4.6
 5.0
 4.4
 4.9
 5.4
 4.8
 4.8
 ⋮
 6.0
 6.9
 6.7
 6.9
 5.8
 6.8
 6.7
 6.7
 6.3
 6.5
 6.2
 5.9

In [18]:
iris[!, [:SepalLength, :SepalWidth]]

Row,SepalLength,SepalWidth
,Float64,Float64
1,5.1,3.5
2,4.9,3.0
3,4.7,3.2
4,4.6,3.1
5,5.0,3.6
6,5.4,3.9
7,4.6,3.4
8,5.0,3.4
9,4.4,2.9
10,4.9,3.1


In [19]:
iris.SepalLength

150-element Vector{Float64}:
 5.1
 4.9
 4.7
 4.6
 5.0
 5.4
 4.6
 5.0
 4.4
 4.9
 5.4
 4.8
 4.8
 ⋮
 6.0
 6.9
 6.7
 6.9
 5.8
 6.8
 6.7
 6.7
 6.3
 6.5
 6.2
 5.9

In [20]:
iris.SepalWidth

150-element Vector{Float64}:
 3.5
 3.0
 3.2
 3.1
 3.6
 3.9
 3.4
 3.4
 2.9
 3.1
 3.7
 3.4
 3.0
 ⋮
 3.0
 3.1
 3.1
 3.1
 2.7
 3.2
 3.3
 3.0
 2.5
 3.0
 3.4
 3.0

In [21]:
iris[!, r"Sepal"]

Row,SepalLength,SepalWidth
,Float64,Float64
1,5.1,3.5
2,4.9,3.0
3,4.7,3.2
4,4.6,3.1
5,5.0,3.6
6,5.4,3.9
7,4.6,3.4
8,5.0,3.4
9,4.4,2.9
10,4.9,3.1


In [22]:
iris[!, Not(r"Sepal")]

Row,PetalLength,PetalWidth,Species
,Float64,Float64,Cat…
1,1.4,0.2,setosa
2,1.4,0.2,setosa
3,1.3,0.2,setosa
4,1.5,0.2,setosa
5,1.4,0.2,setosa
6,1.7,0.4,setosa
7,1.4,0.3,setosa
8,1.5,0.2,setosa
9,1.4,0.2,setosa
10,1.5,0.1,setosa


In [23]:
iris[!, Not(:SepalLength)]

Row,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Cat…
1,3.5,1.4,0.2,setosa
2,3.0,1.4,0.2,setosa
3,3.2,1.3,0.2,setosa
4,3.1,1.5,0.2,setosa
5,3.6,1.4,0.2,setosa
6,3.9,1.7,0.4,setosa
7,3.4,1.4,0.3,setosa
8,3.4,1.5,0.2,setosa
9,2.9,1.4,0.2,setosa
10,3.1,1.5,0.1,setosa


In [24]:
iris[!, Between(:SepalWidth, :PetalWidth)]

Row,SepalWidth,PetalLength,PetalWidth
,Float64,Float64,Float64
1,3.5,1.4,0.2
2,3.0,1.4,0.2
3,3.2,1.3,0.2
4,3.1,1.5,0.2
5,3.6,1.4,0.2
6,3.9,1.7,0.4
7,3.4,1.4,0.3
8,3.4,1.5,0.2
9,2.9,1.4,0.2
10,3.1,1.5,0.1


In [25]:
iris[!, Between(2, 4)]

Row,SepalWidth,PetalLength,PetalWidth
,Float64,Float64,Float64
1,3.5,1.4,0.2
2,3.0,1.4,0.2
3,3.2,1.3,0.2
4,3.1,1.5,0.2
5,3.6,1.4,0.2
6,3.9,1.7,0.4
7,3.4,1.4,0.3
8,3.4,1.5,0.2
9,2.9,1.4,0.2
10,3.1,1.5,0.1


In [26]:
iris[!, Cols(r"Petal")]

Row,PetalLength,PetalWidth
,Float64,Float64
1,1.4,0.2
2,1.4,0.2
3,1.3,0.2
4,1.5,0.2
5,1.4,0.2
6,1.7,0.4
7,1.4,0.3
8,1.5,0.2
9,1.4,0.2
10,1.5,0.1


In [27]:
iris[iris.SepalLength .> 4, :]

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Float64,Cat…
1,5.1,3.5,1.4,0.2,setosa
2,4.9,3.0,1.4,0.2,setosa
3,4.7,3.2,1.3,0.2,setosa
4,4.6,3.1,1.5,0.2,setosa
5,5.0,3.6,1.4,0.2,setosa
6,5.4,3.9,1.7,0.4,setosa
7,4.6,3.4,1.4,0.3,setosa
8,5.0,3.4,1.5,0.2,setosa
9,4.4,2.9,1.4,0.2,setosa
10,4.9,3.1,1.5,0.1,setosa


In [28]:
iris[iris.Species .== "setosa", :]

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Float64,Cat…
1,5.1,3.5,1.4,0.2,setosa
2,4.9,3.0,1.4,0.2,setosa
3,4.7,3.2,1.3,0.2,setosa
4,4.6,3.1,1.5,0.2,setosa
5,5.0,3.6,1.4,0.2,setosa
6,5.4,3.9,1.7,0.4,setosa
7,4.6,3.4,1.4,0.3,setosa
8,5.0,3.4,1.5,0.2,setosa
9,4.4,2.9,1.4,0.2,setosa
10,4.9,3.1,1.5,0.1,setosa


In [29]:
iris[(iris.SepalLength .> 4) .& (iris.PetalLength .> 3), :]

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Float64,Cat…
1,7.0,3.2,4.7,1.4,versicolor
2,6.4,3.2,4.5,1.5,versicolor
3,6.9,3.1,4.9,1.5,versicolor
4,5.5,2.3,4.0,1.3,versicolor
5,6.5,2.8,4.6,1.5,versicolor
6,5.7,2.8,4.5,1.3,versicolor
7,6.3,3.3,4.7,1.6,versicolor
8,4.9,2.4,3.3,1.0,versicolor
9,6.6,2.9,4.6,1.3,versicolor
10,5.2,2.7,3.9,1.4,versicolor


In [30]:
DataFrames.subset(iris,
                    :SepalLength => s -> s .> 4,
                    :PetalLength => p -> p .> 3)

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Float64,Cat…
1,7.0,3.2,4.7,1.4,versicolor
2,6.4,3.2,4.5,1.5,versicolor
3,6.9,3.1,4.9,1.5,versicolor
4,5.5,2.3,4.0,1.3,versicolor
5,6.5,2.8,4.6,1.5,versicolor
6,5.7,2.8,4.5,1.3,versicolor
7,6.3,3.3,4.7,1.6,versicolor
8,4.9,2.4,3.3,1.0,versicolor
9,6.6,2.9,4.6,1.3,versicolor
10,5.2,2.7,3.9,1.4,versicolor


In [31]:
DataFrames.subset(iris, :Species => s -> s .== "setosa")

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Float64,Cat…
1,5.1,3.5,1.4,0.2,setosa
2,4.9,3.0,1.4,0.2,setosa
3,4.7,3.2,1.3,0.2,setosa
4,4.6,3.1,1.5,0.2,setosa
5,5.0,3.6,1.4,0.2,setosa
6,5.4,3.9,1.7,0.4,setosa
7,4.6,3.4,1.4,0.3,setosa
8,5.0,3.4,1.5,0.2,setosa
9,4.4,2.9,1.4,0.2,setosa
10,4.9,3.1,1.5,0.1,setosa


In [32]:
iriscopy = copy(iris)

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Float64,Cat…
1,5.1,3.5,1.4,0.2,setosa
2,4.9,3.0,1.4,0.2,setosa
3,4.7,3.2,1.3,0.2,setosa
4,4.6,3.1,1.5,0.2,setosa
5,5.0,3.6,1.4,0.2,setosa
6,5.4,3.9,1.7,0.4,setosa
7,4.6,3.4,1.4,0.3,setosa
8,5.0,3.4,1.5,0.2,setosa
9,4.4,2.9,1.4,0.2,setosa
10,4.9,3.1,1.5,0.1,setosa


In [33]:
DataFrames.subset!(iriscopy, :Species => s -> s .== "setosa")

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Float64,Cat…
1,5.1,3.5,1.4,0.2,setosa
2,4.9,3.0,1.4,0.2,setosa
3,4.7,3.2,1.3,0.2,setosa
4,4.6,3.1,1.5,0.2,setosa
5,5.0,3.6,1.4,0.2,setosa
6,5.4,3.9,1.7,0.4,setosa
7,4.6,3.4,1.4,0.3,setosa
8,5.0,3.4,1.5,0.2,setosa
9,4.4,2.9,1.4,0.2,setosa
10,4.9,3.1,1.5,0.1,setosa


In [34]:
nrow(iris)

150

In [35]:
nrow(iriscopy)

50

In [36]:
select(iris, Not(:SepalLength))

Row,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Cat…
1,3.5,1.4,0.2,setosa
2,3.0,1.4,0.2,setosa
3,3.2,1.3,0.2,setosa
4,3.1,1.5,0.2,setosa
5,3.6,1.4,0.2,setosa
6,3.9,1.7,0.4,setosa
7,3.4,1.4,0.3,setosa
8,3.4,1.5,0.2,setosa
9,2.9,1.4,0.2,setosa
10,3.1,1.5,0.1,setosa


In [37]:
select(iris, :SepalLength => s -> s * 2)

Row,SepalLength_function
,Float64
1,10.2
2,9.8
3,9.4
4,9.2
5,10.0
6,10.8
7,9.2
8,10.0
9,8.8
10,9.8


In [38]:
select(iris, :SepalLength => s -> s * 2, :SepalWidth)

Row,SepalLength_function,SepalWidth
,Float64,Float64
1,10.2,3.5
2,9.8,3.0
3,9.4,3.2
4,9.2,3.1
5,10.0,3.6
6,10.8,3.9
7,9.2,3.4
8,10.0,3.4
9,8.8,2.9
10,9.8,3.1


In [39]:
select(iris, :SepalLength => s -> s * 2, [:SepalLength, :SepalWidth] => ((x, y) -> x .+ y) => :X)  # X is the row-wise sum of the two columns

Row,SepalLength_function,X
,Float64,Float64
1,10.2,8.6
2,9.8,7.9
3,9.4,7.9
4,9.2,7.7
5,10.0,8.6
6,10.8,9.3
7,9.2,8.0
8,10.0,8.4
9,8.8,7.3
10,9.8,8.0


In [40]:
select(iris, :SepalLength => :S1, :SepalWidth => :S2) ## Rename columns

Row,S1,S2
,Float64,Float64
1,5.1,3.5
2,4.9,3.0
3,4.7,3.2
4,4.6,3.1
5,5.0,3.6
6,5.4,3.9
7,4.6,3.4
8,5.0,3.4
9,4.4,2.9
10,4.9,3.1


In [41]:
#select!(iris, :SepalLength => :S1, :SepalWidth => :S2) ## In-place version: modifies iris itself instead of returning a new data frame

In [42]:
transform(iris, Not(:SepalLength))

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Float64,Cat…
1,5.1,3.5,1.4,0.2,setosa
2,4.9,3.0,1.4,0.2,setosa
3,4.7,3.2,1.3,0.2,setosa
4,4.6,3.1,1.5,0.2,setosa
5,5.0,3.6,1.4,0.2,setosa
6,5.4,3.9,1.7,0.4,setosa
7,4.6,3.4,1.4,0.3,setosa
8,5.0,3.4,1.5,0.2,setosa
9,4.4,2.9,1.4,0.2,setosa
10,4.9,3.1,1.5,0.1,setosa


In [43]:
transform(iris, Not(:SepalLength)) == select(iris, Not(:SepalLength)) # false: transform keeps every column, select keeps only the selected ones

false
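
The comparison above returns `false` because `transform` keeps every original column, while `select` returns only the columns it was asked for. A minimal sketch on a made-up data frame (`toy`, `doubled` and `a2` are illustrative names, not part of the session):

```julia
using DataFrames

toy = DataFrame(a = [1, 2], b = [3, 4])   # hypothetical toy data
doubled = :a => (x -> 2 .* x) => :a2      # one transformation, reused for both calls

names(select(toy, doubled))      # only the requested column: ["a2"]
names(transform(toy, doubled))   # the original columns plus the new one: ["a", "b", "a2"]
```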

In [44]:
transform(iris, :SepalLength => s -> s * 2) # returns new column

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species,SepalLength_function
,Float64,Float64,Float64,Float64,Cat…,Float64
1,5.1,3.5,1.4,0.2,setosa,10.2
2,4.9,3.0,1.4,0.2,setosa,9.8
3,4.7,3.2,1.3,0.2,setosa,9.4
4,4.6,3.1,1.5,0.2,setosa,9.2
5,5.0,3.6,1.4,0.2,setosa,10.0
6,5.4,3.9,1.7,0.4,setosa,10.8
7,4.6,3.4,1.4,0.3,setosa,9.2
8,5.0,3.4,1.5,0.2,setosa,10.0
9,4.4,2.9,1.4,0.2,setosa,8.8
10,4.9,3.1,1.5,0.1,setosa,9.8


In [45]:
transform(iris, :SepalLength => (s -> s * 2) => :SepalLength2) # returns new column

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species,SepalLength2
,Float64,Float64,Float64,Float64,Cat…,Float64
1,5.1,3.5,1.4,0.2,setosa,10.2
2,4.9,3.0,1.4,0.2,setosa,9.8
3,4.7,3.2,1.3,0.2,setosa,9.4
4,4.6,3.1,1.5,0.2,setosa,9.2
5,5.0,3.6,1.4,0.2,setosa,10.0
6,5.4,3.9,1.7,0.4,setosa,10.8
7,4.6,3.4,1.4,0.3,setosa,9.2
8,5.0,3.4,1.5,0.2,setosa,10.0
9,4.4,2.9,1.4,0.2,setosa,8.8
10,4.9,3.1,1.5,0.1,setosa,9.8


In [46]:
transform(iris, :SepalLength => s -> s * 2, [:SepalLength, :SepalWidth] => ((x, y) -> x .+ y) => :X)  # X is the row-wise sum of the two columns

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species,SepalLength_function,X
,Float64,Float64,Float64,Float64,Cat…,Float64,Float64
1,5.1,3.5,1.4,0.2,setosa,10.2,8.6
2,4.9,3.0,1.4,0.2,setosa,9.8,7.9
3,4.7,3.2,1.3,0.2,setosa,9.4,7.9
4,4.6,3.1,1.5,0.2,setosa,9.2,7.7
5,5.0,3.6,1.4,0.2,setosa,10.0,8.6
6,5.4,3.9,1.7,0.4,setosa,10.8,9.3
7,4.6,3.4,1.4,0.3,setosa,9.2,8.0
8,5.0,3.4,1.5,0.2,setosa,10.0,8.4
9,4.4,2.9,1.4,0.2,setosa,8.8,7.3
10,4.9,3.1,1.5,0.1,setosa,9.8,8.0


In [47]:
combine(iris, :SepalLength .=> sum)

Row,SepalLength_sum
,Float64
1,876.5


In [48]:
combine(iris, Not(:Species) .=> sum)

Row,SepalLength_sum,SepalWidth_sum,PetalLength_sum,PetalWidth_sum
,Float64,Float64,Float64,Float64
1,876.5,458.6,563.7,179.9


In [49]:
combine(iris, :SepalLength => x -> sum(x * 10))

Row,SepalLength_function
,Float64
1,8765.0


In [50]:
df = DataFrame(x=[1, 2, missing], y=[1, missing, missing])

Row,x,y
,Int64?,Int64?
1,1,1
2,2,missing
3,missing,missing


In [51]:
combine(df, All() .=> x -> x * 10)

Row,x_function,y_function
,Int64?,Int64?
1,10,10
2,20,missing
3,missing,missing


In [52]:
combine(df, All() .=> x -> sum(x * 10))

Row,x_function,y_function
,Missing,Missing
1,missing,missing


In [53]:
combine(df, All() .=> x -> sum(skipmissing(x * 10)))

Row,x_function,y_function
,Int64,Int64
1,30,10
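
The last two results differ because `missing` propagates through arithmetic and reductions unless it is skipped explicitly. A minimal sketch in plain Julia, with no packages needed (`x` is an illustrative vector, not part of the session):

```julia
x = [1, 2, missing]

sum(x)                # missing: a single missing value makes the whole sum missing
sum(skipmissing(x))   # 3: missing entries are dropped before summing
```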


In [54]:
gd = groupby(iris, :Species)

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Float64,Cat…
1,5.1,3.5,1.4,0.2,setosa
2,4.9,3.0,1.4,0.2,setosa
3,4.7,3.2,1.3,0.2,setosa
4,4.6,3.1,1.5,0.2,setosa
5,5.0,3.6,1.4,0.2,setosa
6,5.4,3.9,1.7,0.4,setosa
7,4.6,3.4,1.4,0.3,setosa
8,5.0,3.4,1.5,0.2,setosa
9,4.4,2.9,1.4,0.2,setosa
10,4.9,3.1,1.5,0.1,setosa

Row,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
,Float64,Float64,Float64,Float64,Cat…
1,6.3,3.3,6.0,2.5,virginica
2,5.8,2.7,5.1,1.9,virginica
3,7.1,3.0,5.9,2.1,virginica
4,6.3,2.9,5.6,1.8,virginica
5,6.5,3.0,5.8,2.2,virginica
6,7.6,3.0,6.6,2.1,virginica
7,4.9,2.5,4.5,1.7,virginica
8,7.3,2.9,6.3,1.8,virginica
9,6.7,2.5,5.8,1.8,virginica
10,7.2,3.6,6.1,2.5,virginica


In [55]:
combine(gd, :SepalLength => sum)

Row,Species,SepalLength_sum
,Cat…,Float64
1,setosa,250.3
2,versicolor,296.8
3,virginica,329.4


In [56]:
combine(gd, Not(:Species) .=> sum)

Row,Species,SepalLength_sum,SepalWidth_sum,PetalLength_sum,PetalWidth_sum
,Cat…,Float64,Float64,Float64,Float64
1,setosa,250.3,171.4,73.1,12.3
2,versicolor,296.8,138.5,213.0,66.3
3,virginica,329.4,148.7,277.6,101.3


In [57]:
combine(gd, Not(:Species) .=> sum, DataFrames.nrow)

Row,Species,SepalLength_sum,SepalWidth_sum,PetalLength_sum,PetalWidth_sum,nrow
,Cat…,Float64,Float64,Float64,Float64,Int64
1,setosa,250.3,171.4,73.1,12.3,50
2,versicolor,296.8,138.5,213.0,66.3,50
3,virginica,329.4,148.7,277.6,101.3,50


In [58]:
using Statistics
combine(gd, Not(:Species) .=> mean, DataFrames.nrow)

Row,Species,SepalLength_mean,SepalWidth_mean,PetalLength_mean,PetalWidth_mean,nrow
,Cat…,Float64,Float64,Float64,Float64,Int64
1,setosa,5.006,3.428,1.462,0.246,50
2,versicolor,5.936,2.77,4.26,1.326,50
3,virginica,6.588,2.974,5.552,2.026,50
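
The split-apply-combine pattern used in the cells above can be sketched on a tiny made-up data frame, where the result is easy to check by hand (`toy`, `g`, `v` and `out` are illustrative names, not part of the session):

```julia
using DataFrames, Statistics

toy = DataFrame(g = ["a", "a", "b"], v = [1.0, 3.0, 10.0])   # hypothetical toy data

# Split by `g`, then compute the mean of `v` and the row count for each group.
out = combine(groupby(toy, :g), :v => mean, nrow)
# `out` has one row per group and the columns g, v_mean and nrow
```

Output column names are generated as `<column>_<function>`, which is why the session's results show names such as `SepalLength_mean`.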


In [59]:
combine(gd, AsTable([:SepalLength, :PetalLength]) => ByRow((x) -> x[1] / x[2]) => :Ratio)

Row,Species,Ratio
,Cat…,Float64
1,setosa,3.64286
2,setosa,3.5
3,setosa,3.61538
4,setosa,3.06667
5,setosa,3.57143
6,setosa,3.17647
7,setosa,3.28571
8,setosa,3.33333
9,setosa,3.14286
10,setosa,3.26667


In [60]:
using TSFrames
ts = TSFrame(1:10)

10×1 TSFrame with Int64 Index
 Index  x1    
 Int64  Int64 
──────────────
     1      1
     2      2
     3      3
     4      4
     5      5
     6      6
     7      7
     8      8
     9      9
    10     10

In [61]:
ts = TSFrame(1:10, 2301:2310)

10×1 TSFrame with Int64 Index
 Index  x1    
 Int64  Int64 
──────────────
  2301      1
  2302      2
  2303      3
  2304      4
  2305      5
  2306      6
  2307      7
  2308      8
  2309      9
  2310     10

In [62]:
using MarketData
aapl_df = DataFrame(MarketData.yahoo(:AAPL))

Row,timestamp,Open,High,Low,Close,AdjClose,Volume
,Date,Float64,Float64,Float64,Float64,Float64,Float64
1,1980-12-12,0.128348,0.128906,0.128348,0.128348,0.0988345,4.69034e8
2,1980-12-15,0.12221,0.12221,0.121652,0.121652,0.0936782,1.75885e8
3,1980-12-16,0.113281,0.113281,0.112723,0.112723,0.0868024,1.05728e8
4,1980-12-17,0.115513,0.116071,0.115513,0.115513,0.0889508,8.64416e7
5,1980-12-18,0.118862,0.11942,0.118862,0.118862,0.0915297,7.34496e7
6,1980-12-19,0.126116,0.126674,0.126116,0.126116,0.0971157,4.86304e7
7,1980-12-22,0.132254,0.132813,0.132254,0.132254,0.101842,3.73632e7
8,1980-12-23,0.137835,0.138393,0.137835,0.137835,0.10614,4.69504e7
9,1980-12-24,0.145089,0.145647,0.145089,0.145089,0.111726,4.80032e7
10,1980-12-26,0.158482,0.15904,0.158482,0.158482,0.122039,5.55744e7


In [63]:
aapl_ts = TSFrame(MarketData.yahoo(:AAPL))

11088×6 TSFrame with Date Index
 Index       Open        High        Low         Close       AdjClose     Volume    
 Date        Float64     Float64     Float64     Float64     Float64      Float64   
────────────────────────────────────────────────────────────────────────────────────
 1980-12-12    0.128348    0.128906    0.128348    0.128348    0.0988345  4.69034e8
 1980-12-15    0.12221     0.12221     0.121652    0.121652    0.0936782  1.75885e8
 1980-12-16    0.113281    0.113281    0.112723    0.112723    0.0868024  1.05728e8
 1980-12-17    0.115513    0.116071    0.115513    0.115513    0.0889509  8.64416e7
 1980-12-18    0.118862    0.11942     0.118862    0.118862    0.0915298  7.34496e7
 1980-12-19    0.126116    0.126674    0.126116    0.126116    0.0971157  4.86304e7
 1980-12-22    0.132254    0.132813    0.132254    0.132254    0.101842   3.73632e

In [64]:
CSV.write("aapl.csv", aapl_ts.coredata)  # save the data first so that the file exists
aapl_ts = CSV.read("aapl.csv", TSFrame)  # the first column becomes the index

In [65]:
nr(aapl_ts)

11088

In [66]:
nc(aapl_ts)

6

In [67]:
size(aapl_ts)

(11088, 6)

In [68]:
length(aapl_ts)

11088

In [69]:
names(aapl_ts)

6-element Vector{String}:
 "Open"
 "High"
 "Low"
 "Close"
 "AdjClose"
 "Volume"

In [70]:
index(aapl_ts)

11088-element Vector{Date}:
 1980-12-12
 1980-12-15
 1980-12-16
 1980-12-17
 1980-12-18
 1980-12-19
 1980-12-22
 1980-12-23
 1980-12-24
 1980-12-26
 1980-12-29
 1980-12-30
 1980-12-31
 ⋮
 2024-11-19
 2024-11-20
 2024-11-21
 2024-11-22
 2024-11-25
 2024-11-26
 2024-11-27
 2024-11-29
 2024-12-02
 2024-12-03
 2024-12-04
 2024-12-05

In [71]:
TSFrames.describe(aapl_ts)

Row,variable,mean,min,median,max,nmissing,eltype
,Symbol,Union…,Any,Union…,Any,Int64,DataType
1,Index,,1980-12-12,,2024-12-05,0,Date
2,Open,23.953,0.049665,0.540179,243.99,0,Float64
3,High,24.2085,0.049665,0.549107,244.54,0,Float64
4,Low,23.7092,0.049107,0.53125,242.13,0,Float64
5,Close,23.9705,0.049107,0.540179,243.04,0,Float64
6,AdjClose,23.1337,0.0378149,0.442485,243.04,0,Float64
7,Volume,315896000.0,0.0,203896000.0,7.42164e9,0,Float64


In [72]:
aapl_ts[1]

1×6 TSFrame with Date Index
 Index       Open      High      Low       Close     AdjClose   Volume    
 Date        Float64   Float64   Float64   Float64   Float64    Float64   
──────────────────────────────────────────────────────────────────────────
 1980-12-12  0.128348  0.128906  0.128348  0.128348  0.0988345  4.69034e8

In [73]:
aapl_ts[2, 1]

0.12221000343561172

In [74]:
aapl_ts[2, [1]]

1×1 TSFrame with Date Index
 Index       Open    
 Date        Float64 
─────────────────────
 1980-12-15  0.12221

In [75]:
aapl_ts[[2, 3], [1, 2, 3, 4]]

2×4 TSFrame with Date Index
 Index       Open      High      Low       Close    
 Date        Float64   Float64   Float64   Float64  
────────────────────────────────────────────────────
 1980-12-15  0.12221   0.12221   0.121652  0.121652
 1980-12-16  0.113281  0.113281  0.112723  0.112723

In [76]:
aapl_ts[[2, 3], [:Open, :High, :Low, :Close]]

2×4 TSFrame with Date Index
 Index       Open      High      Low       Close    
 Date        Float64   Float64   Float64   Float64  
────────────────────────────────────────────────────
 1980-12-15  0.12221   0.12221   0.121652  0.121652
 1980-12-16  0.113281  0.113281  0.112723  0.112723

In [77]:
aapl_ts.Open

11088-element Vector{Float64}:
   0.1283479928970337
   0.12221000343561172
   0.1132809966802597
   0.11551299691200256
   0.11886200308799744
   0.12611599266529083
   0.1322540044784546
   0.13783499598503113
   0.14508900046348572
   0.15848200023174286
   0.16071400046348572
   0.15736599266529083
   0.1529020071029663
   ⋮
 226.97999572753906
 228.05999755859375
 228.8800048828125
 228.05999755859375
 231.4600067138672
 233.3300018310547
 234.47000122070312
 234.80999755859375
 237.27000427246094
 239.80999755859375
 242.8699951171875
 243.99000549316406

In [78]:
using Dates  # Date, Year, Month, Quarter and Week come from the Dates standard library
aapl_ts[Date(2007, 1, 10)]

1×6 TSFrame with Date Index
 Index       Open     High     Low      Close    AdjClose  Volume    
 Date        Float64  Float64  Float64  Float64  Float64   Float64   
─────────────────────────────────────────────────────────────────────
 2007-01-10  3.38393  3.49286   3.3375  3.46429   2.92229  2.95288e9

In [79]:
aapl_ts[Date(2007, 1, 10), [:Open, :High, :Low, :Close]]

1×4 TSFrame with Date Index
 Index       Open     High     Low      Close   
 Date        Float64  Float64  Float64  Float64 
────────────────────────────────────────────────
 2007-01-10  3.38393  3.49286   3.3375  3.46429

In [80]:
aapl_ts[Year(2007), Month(1)]

20×6 TSFrame with Date Index
 Index       Open     High     Low      Close    AdjClose  Volume    
 Date        Float64  Float64  Float64  Float64  Float64   Float64   
─────────────────────────────────────────────────────────────────────
 2007-01-03  3.08179  3.09214  2.925    2.99286   2.52462  1.23832e9
 2007-01-04  3.00179  3.06964  2.99357  3.05929   2.58065  8.4726e8
 2007-01-05  3.06321  3.07857  3.01429  3.0375    2.56227  8.34742e8
 2007-01-08  3.07     3.09036  3.04571  3.0525    2.57493  7.97107e8
 2007-01-09  3.0875   3.32071  3.04107  3.30607   2.78883  3.3493e9
 2007-01-10  3.38393  3.49286  3.3375   3.46429   2.92229  2.95288e9
 2007-01-11  3.42643  3.45643  3.39643  3.42143   2.88614  1.44025e9
 2007-01-12  3.37821  3.395    3.32964  3.37929   2.85059  1.31269e9
 2007-01-16  3.41714  3.47321  3.40893  3.46786   2.9253   1.24408e9
 2007-01-17  3.

In [81]:
aapl_ts[Year(2007), Month(1)][:, [:Open, :High, :Low, :Close]]

20×4 TSFrame with Date Index
 Index       Open     High     Low      Close   
 Date        Float64  Float64  Float64  Float64 
────────────────────────────────────────────────
 2007-01-03  3.08179  3.09214  2.925    2.99286
 2007-01-04  3.00179  3.06964  2.99357  3.05929
 2007-01-05  3.06321  3.07857  3.01429  3.0375
 2007-01-08  3.07     3.09036  3.04571  3.0525
 2007-01-09  3.0875   3.32071  3.04107  3.30607
 2007-01-10  3.38393  3.49286  3.3375   3.46429
 2007-01-11  3.42643  3.45643  3.39643  3.42143
 2007-01-12  3.37821  3.395    3.32964  3.37929
 2007-01-16  3.41714  3.47321  3.40893  3.46786
 2007-01-17  3.48429  3.48571  3.38643  3.39107
 2007-01-18  3.28929  3.28964  3.18036  3.18107
 2007-01-19  3.16536  3.20179  3.14714  3.16071
 2007-01-22  3.18357  3.18429  3.05893  3.09964
 2007-01-23  3.06179  3.12536  3.05393  3.06071
 2007-01-24  3.09571  3.1125   3.07429  3.09643
 2007-01-25  

In [82]:
aapl_ts[Year(2007), Quarter(1)][:, [:Open, :High, :Low, :Close]]

61×4 TSFrame with Date Index
 Index       Open     High     Low      Close   
 Date        Float64  Float64  Float64  Float64 
────────────────────────────────────────────────
 2007-01-03  3.08179  3.09214  2.925    2.99286
 2007-01-04  3.00179  3.06964  2.99357  3.05929
 2007-01-05  3.06321  3.07857  3.01429  3.0375
 2007-01-08  3.07     3.09036  3.04571  3.0525
 2007-01-09  3.0875   3.32071  3.04107  3.30607
 2007-01-10  3.38393  3.49286  3.3375   3.46429
 2007-01-11  3.42643  3.45643  3.39643  3.42143
 2007-01-12  3.37821  3.395    3.32964  3.37929
 2007-01-16  3.41714  3.47321  3.40893  3.46786
 2007-01-17  3.48429  3.48571  3.38643  3.39107
 2007-01-18  3.28929  3.28964  3.18036  3.18107
     ⋮          ⋮        ⋮        ⋮        ⋮
 2007-03-19  3.22286  3.26964  3.19964  3.25464
 2007-03-20  3.2625   3.28     3.25214  3.26714
 2007-03-21  3.28536  3.35714  3.27321  3.3525
 2007-03-22  3.34

In [83]:
# Pkg.add("Plots")
# using Plots
# plot(aapl_ts, [:AdjClose])

In [84]:
aapl_monthly = apply(aapl_ts, Month(1), last)

529×6 TSFrame with Date Index
 Index       Open_last   High_last   Low_last    Close_last  AdjClose_last  Volume_last 
 Date        Float64     Float64     Float64     Float64     Float64        Float64     
────────────────────────────────────────────────────────────────────────────────────────
 1980-12-12    0.152902    0.152902    0.152344    0.152344      0.117313     3.57504e7
 1981-01-02    0.127232    0.127232    0.126116    0.126116      0.0971157    4.61888e7
 1981-02-02    0.118304    0.11942     0.118304    0.118304      0.0911001    1.47616e7
 1981-03-02    0.110491    0.110491    0.109375    0.109375      0.0842243    1.59936e7
 1981-04-01    0.126674    0.12779     0.126674    0.126674      0.0975454    1.26112e7
 1981-05-01    0.147879    0.148438    0.147879    0.147879      0.113874     5.93824e7
 1981-06-01    0.116629    0.116629    0.116071 

In [85]:
aapl_weekly = apply(aapl_ts, Week(1), Statistics.std)

2296×6 TSFrame with Date Index
 Index       Open_std      High_std      Low_std       Close_std     AdjClose_std  Volume_std  
 Date        Float64       Float64       Float64       Float64       Float64       Float64     
───────────────────────────────────────────────────────────────────────────────────────────────
 1980-12-12  NaN           NaN           NaN           NaN           NaN           NaN
 1980-12-15    0.00513892    0.00522604    0.00522604    0.00522604    0.0040243     4.82168e7
 1980-12-22    0.0113361     0.0113358     0.0113361     0.0113361     0.00872936    7.46981e6
 1980-12-29    0.0035291     0.00356931    0.00365905    0.00365905    0.00281768    3.23055e7
 1981-01-05    0.00601539    0.00601797    0.00601797    0.00601797    0.00463415    1.25715e7
 1981-01-12    0.00231418    0.0023209     0.00205022    0.00205022    0.00157878    5.

In [86]:
# Same weekly aggregation; the fourth argument `last` labels each week by its last date instead of its first.
aapl_weekly = apply(aapl_ts, Week(1), Statistics.std, last)

2296×6 TSFrame with Date Index
 Index       Open_std      High_std      Low_std       Close_std     AdjClose_std  Volume_std
 Date        Float64       Float64       Float64       Float64       Float64       Float64
───────────────────────────────────────────────────────────────────────────────────────────────
 1980-12-12  NaN           NaN           NaN           NaN           NaN           NaN
 1980-12-19    0.00513892    0.00522604    0.00522604    0.00522604    0.0040243     4.82168e7
 1980-12-26    0.0113361     0.0113358     0.0113361     0.0113361     0.00872936    7.46981e6
 1981-01-02    0.0035291     0.00356931    0.00365905    0.00365905    0.00281768    3.23055e7
 1981-01-09    0.00601539    0.00601797    0.00601797    0.00601797    0.00463415    1.25715e7
 1981-01-16    0.00231418    0.0023209     0.00205022    0.00205022    0.00157878    5.
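
The weekly aggregation above can be sketched with only the `Dates` and `Statistics` standard libraries. This is a rough illustration of the idea, not how TSFrames implements `apply`; the sample data and the names `closes`, `week_of`, and `weekly` are hypothetical, and the sketch assumes weeks start on Monday.

```julia
using Dates, Statistics

# Hypothetical daily observations (illustrative values only), sorted by date.
dates  = [Date(2021, 6, 1), Date(2021, 6, 2), Date(2021, 6, 4),
          Date(2021, 6, 7), Date(2021, 6, 9)]
closes = [10.0, 12.0, 11.0, 20.0, 24.0]

# Bucket each observation by the Monday of its week.
week_of = Dates.firstdayofweek.(dates)

# For each week, aggregate with `std` and label the row with the last observed date.
weekly = map(unique(week_of)) do w
    in_week = week_of .== w
    (index = last(dates[in_week]), value = std(closes[in_week]))
end
```

Swapping `last` for `first` in the `index` field would label each week by its first observed date instead.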

In [87]:
# Fetch IBM daily prices from Yahoo Finance via MarketData.jl and wrap them in a TSFrame.
ibm_ts = TSFrame(MarketData.yahoo(:IBM))

13575×6 TSFrame with Date Index
 Index       Open      High      Low       Close     AdjClose   Volume
 Date        Float64   Float64   Float64   Float64   Float64    Float64
──────────────────────────────────────────────────────────────────────────────
 1971-02-08   16.0851   16.336    15.8939   16.2882    3.49648  719648.0
 1971-02-09   16.2882   16.3121   16.1687   16.1807    3.47339  673624.0
 1971-02-10   16.1568   16.1568   15.9775   16.1209    3.46057  648520.0
 1971-02-11   16.1209   16.2285   16.097    16.1926    3.47596  579484.0
 1971-02-12   16.1926   16.2285   16.1329   16.2285    3.48366  382836.0
 1971-02-16   16.2285   16.4197   16.1926   16.3301    3.50546  684084.0
 1971-02-17   16.2703   16.2703   16.0492   16.0851    3.45287  652704.0
 1971-02-18   16.0851   16.1209   15.7385   15.7385    3.37848  822156.0
 1971-02-19   15.738

In [88]:
# Restrict the series to 2021-06-01 through 2021-12-31.
date_from = Date(2021, 06, 01);
date_to = Date(2021, 12, 31);
ibm = TSFrames.subset(ibm_ts, date_from, date_to)

150×6 TSFrame with Date Index
 Index       Open     High     Low      Close    AdjClose  Volume
 Date        Float64  Float64  Float64  Float64  Float64   Float64
─────────────────────────────────────────────────────────────────────
 2021-06-01  138.623  139.417  137.428  137.849   117.809  2.5287e6
 2021-06-02  138.26   139.34   137.772  139.312   119.06   2.9151e6
 2021-06-03  138.537  139.465  137.706  139.149   118.921  4.32061e6
 2021-06-04  139.579  141.061  139.35   140.937   120.448  3.26132e6
 2021-06-07  141.061  142.199  140.698  141.511   120.939  3.62198e6
 2021-06-08  141.606  143.595  141.606  142.514   121.797  5.31378e6
 2021-06-09  142.476  144.426  142.275  144.044   123.104  5.54725e6
 2021-06-10  144.809  146.119  143.174  143.92    122.998  4.97739e6
 2021-06-11  143.815  145.172  143.757  144.627   123.602  3.59646e6
 2021-06-14  1

In [89]:
aapl = TSFrames.subset(aapl_ts, date_from, date_to)

150×6 TSFrame with Date Index
 Index       Open     High     Low      Close    AdjClose  Volume
 Date        Float64  Float64  Float64  Float64  Float64   Float64
─────────────────────────────────────────────────────────────────────
 2021-06-01   125.08   125.35   123.94   124.28   121.916  6.76371e7
 2021-06-02   124.28   125.24   124.05   125.06   122.681  5.92789e7
 2021-06-03   124.68   124.85   123.13   123.54   121.19   7.62292e7
 2021-06-04   124.07   126.16   123.85   125.89   123.496  7.51693e7
 2021-06-07   126.17   126.32   124.83   125.9    123.505  7.10576e7
 2021-06-08   126.6    128.46   126.21   126.74   124.329  7.44038e7
 2021-06-09   127.21   127.75   126.52   127.13   124.712  5.68779e7
 2021-06-10   127.02   128.19   125.94   126.11   123.711  7.11864e7
 2021-06-11   126.53   127.44   126.1    127.35   124.928  5.35224e7
 2021-06-14 

In [90]:
# Join the two adjusted-close series on the date index; :JoinBoth keeps only dates present in both.
ibm_aapl = TSFrames.join(ibm[:, ["AdjClose"]], aapl[:, ["AdjClose"]]; jointype = :JoinBoth)

150×2 TSFrame with Date Index
 Index       AdjClose  AdjClose_1
 Date        Float64   Float64
──────────────────────────────────
 2021-06-01   117.809     121.916
 2021-06-02   119.06      122.681
 2021-06-03   118.921     121.19
 2021-06-04   120.448     123.496
 2021-06-07   120.939     123.505
 2021-06-08   121.797     124.329
 2021-06-09   123.104     124.712
 2021-06-10   122.998     123.711
 2021-06-11   123.602     124.928
 2021-06-14   122.581     127.998
 2021-06-15   122.034     127.174
     ⋮          ⋮          ⋮
 2021-12-17   111.603     168.382
 2021-12-20   111.305     167.014
 2021-12-21   112.978     170.202
 2021-12-22   113.661     172.809
 2021-12-23   114.432     173.439
 2021-12-27   115.299     177.424
 2021-12-28   116.184     176.4
 2021-12-29   116.815     176.489
 2021-12-30   117.305     175.328
 2021-12-31   117.086     174.708
                  129 rows omitted

In [91]:
TSFrames.rename!(ibm_aapl, [:IBM, :AAPL])

150×2 TSFrame with Date Index
 Index       IBM      AAPL
 Date        Float64  Float64
──────────────────────────────
 2021-06-01  117.809  121.916
 2021-06-02  119.06   122.681
 2021-06-03  118.921  121.19
 2021-06-04  120.448  123.496
 2021-06-07  120.939  123.505
 2021-06-08  121.797  124.329
 2021-06-09  123.104  124.712
 2021-06-10  122.998  123.711
 2021-06-11  123.602  124.928
 2021-06-14  122.581  127.998
 2021-06-15  122.034  127.174
     ⋮          ⋮        ⋮
 2021-12-17  111.603  168.382
 2021-12-20  111.305  167.014
 2021-12-21  112.978  170.202
 2021-12-22  113.661  172.809
 2021-12-23  114.432  173.439
 2021-12-27  115.299  177.424
 2021-12-28  116.184  176.4
 2021-12-29  116.815  176.489
 2021-12-30  117.305  175.328
 2021-12-31  117.086  174.708
              129 rows omitted

In [92]:
using Impute
# Fill any missing values by carrying the last observation forward (LOCF).
ibm_aapl = ibm_aapl |> Impute.locf()

150×2 TSFrame with Date Index
 Index       IBM      AAPL
 Date        Float64  Float64
──────────────────────────────
 2021-06-01  117.809  121.916
 2021-06-02  119.06   122.681
 2021-06-03  118.921  121.19
 2021-06-04  120.448  123.496
 2021-06-07  120.939  123.505
 2021-06-08  121.797  124.329
 2021-06-09  123.104  124.712
 2021-06-10  122.998  123.711
 2021-06-11  123.602  124.928
 2021-06-14  122.581  127.998
 2021-06-15  122.034  127.174
     ⋮          ⋮        ⋮
 2021-12-17  111.603  168.382
 2021-12-20  111.305  167.014
 2021-12-21  112.978  170.202
 2021-12-22  113.661  172.809
 2021-12-23  114.432  173.439
 2021-12-27  115.299  177.424
 2021-12-28  116.184  176.4
 2021-12-29  116.815  176.489
 2021-12-30  117.305  175.328
 2021-12-31  117.086  174.708
              129 rows omitted
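
As a rough illustration of what "last observation carried forward" means, here is a minimal sketch in plain Julia. The helper `locf_sketch` is hypothetical and is not part of Impute.jl; in this sketch a leading `missing` has nothing before it and is left as is.

```julia
# Hypothetical helper: replace each `missing` with the most recent earlier value.
function locf_sketch(xs::Vector)
    out = copy(xs)
    for i in 2:length(out)
        if ismissing(out[i])
            out[i] = out[i - 1]   # carry the previous (already filled) value forward
        end
    end
    return out
end

# Illustrative input only.
filled = locf_sketch([1.0, missing, missing, 4.0, missing])
```

Because the loop reads from `out` rather than `xs`, a run of consecutive gaps is filled with the same earlier value.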

In [93]:
# Downsample to weekly frequency, keeping the last observation of each week.
ibm_aapl_weekly = to_weekly(ibm_aapl)

31×2 TSFrame with Date Index
 Index       IBM      AAPL
 Date        Float64  Float64
──────────────────────────────
 2021-06-04  120.448  123.496
 2021-06-11  123.602  124.928
 2021-06-18  116.935  127.979
 2021-06-25  119.975  130.578
 2021-07-02  114.402  137.298
 2021-07-09  115.628  142.35
 2021-07-16  113.487  143.606
 2021-07-23  115.481  145.734
 2021-07-30  115.17   143.086
 2021-08-06  117.728  143.575
 2021-08-13  118.331  146.483
     ⋮          ⋮        ⋮
 2021-10-29  103.389  147.171
 2021-11-05  106.857  148.842
 2021-11-12  104.209  147.573
 2021-11-19  101.66   157.962
 2021-11-26  101.45   154.283
 2021-12-03  104.104  159.232
 2021-12-10  108.703  176.558
 2021-12-17  111.603  168.382
 2021-12-23  114.432  173.439
 2021-12-31  117.086  174.708
               10 rows omitted

In [94]:
# Weekly log returns, log(p_t) - log(p_{t-1}); the first row has no predecessor and is `missing`.
ibm_aapl_weekly_returns = diff(log.(ibm_aapl_weekly))

31×2 TSFrame with Date Index
 Index       IBM_log            AAPL_log
 Date        Float64?           Float64?
─────────────────────────────────────────────────
 2021-06-04  missing            missing
 2021-06-11        0.0258468          0.0115307
 2021-06-18       -0.0554491          0.0241274
 2021-06-25        0.0256602          0.0201093
 2021-07-02       -0.0475582          0.0501808
 2021-07-09        0.0106556          0.0361353
 2021-07-16       -0.0186868          0.00878216
 2021-07-23        0.0174141          0.0147147
 2021-07-30       -0.00269219        -0.0183416
 2021-08-06        0.0219621          0.00341467
 2021-08-13        0.00511153         0.0200523
     ⋮               ⋮                 ⋮
 2021-10-29       -0.0219788          0.00743744
 2021-11-05        0.0329913          0.0112899
 2021-11-12       -0.0250882         -0.0085636
 2021-11-19       -0.02476

In [95]:
TSFrames.rename!(ibm_aapl_weekly_returns, [:IBM, :AAPL])

31×2 TSFrame with Date Index
 Index       IBM                AAPL
 Date        Float64?           Float64?
─────────────────────────────────────────────────
 2021-06-04  missing            missing
 2021-06-11        0.0258468          0.0115307
 2021-06-18       -0.0554491          0.0241274
 2021-06-25        0.0256602          0.0201093
 2021-07-02       -0.0475582          0.0501808
 2021-07-09        0.0106556          0.0361353
 2021-07-16       -0.0186868          0.00878216
 2021-07-23        0.0174141          0.0147147
 2021-07-30       -0.00269219        -0.0183416
 2021-08-06        0.0219621          0.00341467
 2021-08-13        0.00511153         0.0200523
     ⋮               ⋮                 ⋮
 2021-10-29       -0.0219788          0.00743744
 2021-11-05        0.0329913          0.0112899
 2021-11-12       -0.0250882         -0.0085636
 2021-11-19       -0.02476

In [96]:
ibm_std = std(skipmissing(ibm_aapl_weekly_returns[:, :IBM]))

0.034079310360476005
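
The last two steps (log returns, then their standard deviation) can be reproduced on a plain vector using only the `Statistics` standard library. This is a minimal sketch with made-up prices; the names `prices`, `log_returns`, and `weekly_vol` are hypothetical.

```julia
using Statistics

# Hypothetical weekly closing prices (illustrative values only).
prices = [100.0, 102.0, 99.0, 101.0]

# Log return for week t is log(p_t) - log(p_{t-1}). On a Vector, `diff` returns
# n-1 values; the TSFrame version above keeps n rows and marks the first as `missing`.
log_returns = diff(log.(prices))

# Sample standard deviation of the weekly log returns (n-1 denominator by default).
weekly_vol = std(log_returns)
```

With a TSFrame column, `skipmissing` plays the role of dropping that first `missing` row before calling `std`.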

# References

- Working with DataFrames: https://dataframes.juliadata.org/stable/man/working_with_dataframes/
- DataFrames API reference: https://dataframes.juliadata.org/stable/lib/functions/
- TSFrames user guide: https://xkdr.github.io/TSFrames.jl/stable/user_guide/
- Basic demo of TSFrames: https://xkdr.github.io/TSFrames.jl/stable/demo_finance/

Add new code cells by clicking the `+ Code` button (or _Insert_ > _Code cell_).

Have fun!

<img src="https://raw.githubusercontent.com/JuliaLang/julia-logo-graphics/master/images/julia-logo-mask.png" height="100" />