---
# **Running Julia in Google Colab**

---

Quant Researcher: Ali Ghaziasgar

MMF 2024-2025



# Preliminaries

In [None]:
# Installation must be repeated every time a new Colab session starts.
# Package-install output is suppressed below to avoid delays during installation.

In [1]:
# Install Julia 1.9.3
!wget -q https://julialang-s3.julialang.org/bin/linux/x64/1.9/julia-1.9.3-linux-x86_64.tar.gz -O julia.tar.gz
!tar -xzf julia.tar.gz
!mv julia-1.9.3 /usr/local/julia

# Update PATH environment variable
import os
os.environ["PATH"] += os.pathsep + "/usr/local/julia/bin"

In [2]:
# Install DataFrames silently.
# To see the installation log (useful if the install fails), run without the redirect:
# !julia -e 'using Pkg; Pkg.add("DataFrames")'
!julia -e 'using Pkg; Pkg.add("DataFrames")' > /dev/null 2>&1

In [None]:
# Install XLSX
!julia -e 'using Pkg; Pkg.add("XLSX")' > /dev/null 2>&1

In [None]:
# Install PlotlyJS
!julia -e 'using Pkg; Pkg.add("PlotlyJS")' > /dev/null 2>&1

In [None]:
# Install IJulia
!julia -e 'using Pkg; Pkg.add("IJulia")' > /dev/null 2>&1

In [None]:
# Precompile packages
!julia -e 'using Pkg; Pkg.precompile()'

In [None]:
# Verify DataFrames
!julia -e 'using DataFrames; println("DataFrames loaded successfully.")'

# Verify XLSX
!julia -e 'using XLSX; println("XLSX loaded successfully.")'

# Verify PlotlyJS
!julia -e 'using PlotlyJS; println("PlotlyJS loaded successfully.")'

# Verify IJulia
!julia -e 'using IJulia; println("IJulia loaded successfully.")'


DataFrames loaded successfully.
XLSX loaded successfully.
PlotlyJS loaded successfully.
IJulia loaded successfully.


In [None]:
!julia --version

julia version 1.9.3


In [None]:
# After running the cells above, go to Runtime > Change runtime type, set the runtime to Julia, and save.
# The remaining cells are Julia code.

## Libraries

In [None]:
# Install necessary packages if not already installed
# Use the Julia package manager (Pkg) to add the required libraries
using Pkg
# Pkg.add("CSV")           # CSV package for handling CSV files
# Pkg.add("DataFrames")    # DataFrames package for data manipulation
# Pkg.add("PlotlyJS")      # PlotlyJS package for visualizations

Pkg.add([
    "Downloads",
    "DataFrames",
    "XLSX",
    "PlotlyJS",
    "GoogleDrive",
    "Dates"
])

   Resolving package versions...
    Updating `~/.julia/environments/v1.9/Project.toml`
  [ade2ca70] + Dates
  No Changes to `~/.julia/environments/v1.9/Manifest.toml`


In [None]:
# Import required libraries
# using CSV                # To read CSV files
using XLSX
using Downloads
using Dates
using GoogleDrive
using DataFrames         # For data manipulation
using PlotlyJS           # For visualizations

## Input Data

In [None]:
# Function to download files from Google Drive
function download_from_google_drive(file_id::String, destination::String)
    url = "https://drive.google.com/uc?export=download&id=$(file_id)"
    Downloads.download(url, destination)
end

download_from_google_drive (generic function with 1 method)
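
Google Drive may answer a direct-download request with an HTML page instead of the file (for example, when the file is not shared publicly), and the download call will still save that response to disk. Because an `.xlsx` file is a ZIP archive, it should start with the bytes `PK\x03\x04`. The sketch below checks for that signature before the file is handed to XLSX. The helper name `looks_like_xlsx` is hypothetical and not part of the original notebook.

```julia
# Hypothetical helper (an assumption, not part of the original notebook):
# returns true when the file starts with the ZIP signature "PK\x03\x04",
# which every .xlsx workbook is expected to have.
const ZIP_MAGIC = UInt8[0x50, 0x4b, 0x03, 0x04]

function looks_like_xlsx(path::AbstractString)
    isfile(path) || return false          # missing file: not a workbook
    open(path, "r") do io
        header = read(io, 4)              # reads at most 4 bytes
        return header == ZIP_MAGIC
    end
end

# Possible usage after the download step:
# looks_like_xlsx("Data_withoutNAs.xlsx") ||
#     error("Download did not produce an Excel file; check the sharing settings.")
```

This only inspects the first four bytes, so it catches an HTML error page but does not prove that the workbook is otherwise valid.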

In [None]:
# File IDs
file_id = "1DbvPCaF4cNzSk1gfdDogC5mRReprnxQX"   # Data_withoutNAs.xlsx
file_id_2 = "16abOhxeTo1IPh1wcivsMnPwetwxjG-2E" # Data_withNAs.xlsx
# Download the files
download_from_google_drive(file_id, "Data_withoutNAs.xlsx")
download_from_google_drive(file_id_2, "Data_withNAs.xlsx")

# Sheet names
sheet_name = "prices"
sheet_name_2 = "volume"

"volume"

In [None]:
# Read Excel files into DataFrames
data_df = DataFrame(XLSX.readtable("Data_withoutNAs.xlsx", sheet_name))
# Convert 'Date' column to Date type
data_df.Date = Dates.Date.(data_df.Date)
data_df_v = DataFrame(XLSX.readtable("Data_withoutNAs.xlsx", sheet_name_2))
# Convert 'Date' column to Date type
data_df_v.Date = Dates.Date.(data_df_v.Date)

data_df_2 = DataFrame(XLSX.readtable("Data_withNAs.xlsx", sheet_name))
data_df_2.Date = Dates.Date.(data_df_2.Date)

data_df_v_2 = DataFrame(XLSX.readtable("Data_withNAs.xlsx", sheet_name_2))
data_df_v_2.Date = Dates.Date.(data_df_v_2.Date);

In [None]:
# Display the first few rows of each DataFrame
println("Data without NAs - Prices:")
display(first(data_df, 5))

println("\nData without NAs - Volume:")
display(first(data_df_v, 5))

println("\nData with NAs - Prices:")
display(first(data_df_2, 5))

println("\nData with NAs - Volume:")
display(first(data_df_v_2, 5))

Data without NAs - Prices:


| Row | Date | SP500 | Bitcoin | Ethereum | Litecoin | VIX_Index | Dollar_Index |
|---|---|---|---|---|---|---|---|
| | Date | Any | Any | Any | Any | Any | Any |
| 1 | 2017-01-03 | 2257.83 | 1037.5 | 9.661 | 4.593 | 12.85 | 103.21 |
| 2 | 2017-01-04 | 2270.75 | 1139.6 | 11.075 | 4.59 | 11.85 | 102.7 |
| 3 | 2017-01-05 | 2269.0 | 1003.2 | 10.294 | 4.213 | 11.67 | 101.52 |
| 4 | 2017-01-06 | 2276.98 | 898.5 | 10.1 | 3.84 | 11.32 | 102.22 |
| 5 | 2017-01-09 | 2268.9 | 903.0 | 10.343 | 4.34 | 11.56 | 101.93 |



Data without NAs - Volume:


| Row | Date | SP500 | Bitcoin | Ethereum | Litecoin |
|---|---|---|---|---|---|
| | Date | Any | Any | Any | Any |
| 1 | 2017-01-03 | 649219406 | 1410800.0 | 18997200.0 | 1881980.0 |
| 2 | 2017-01-04 | 573172248 | 5372880.0 | 17491900.0 | 4193390.0 |
| 3 | 2017-01-05 | 573475802 | 9120480.0 | 22340400.0 | 9904260.0 |
| 4 | 2017-01-06 | 485214839 | 5740510.0 | 4942570.0 | 7021300.0 |
| 5 | 2017-01-09 | 532332873 | 1658730.0 | 6786430.0 | 5032990.0 |



Data with NAs - Prices:


| Row | Date | SP500 | Bitcoin | Ethereum | Litecoin | VIX_Index | Dollar_Index |
|---|---|---|---|---|---|---|---|
| | Date | Any | Any | Any | Any | Any | Any |
| 1 | 2017-01-03 | 2257.83 | 1037.5 | 9.661 | 4.593 | 12.85 | 103.21 |
| 2 | 2017-01-04 | 2270.75 | 1139.6 | 11.075 | 4.59 | 11.85 | 102.7 |
| 3 | 2017-01-05 | 2269.0 | 1003.2 | 10.294 | 4.213 | 11.67 | 101.52 |
| 4 | 2017-01-06 | 2276.98 | 898.5 | 10.1 | 3.84 | 11.32 | 102.22 |
| 5 | 2017-01-09 | 2268.9 | 903.0 | 10.343 | 4.34 | 11.56 | 101.93 |



Data with NAs - Volume:


| Row | Date | SP500 | Bitcoin | Ethereum | Litecoin |
|---|---|---|---|---|---|
| | Date | Any | Any | Any | Any |
| 1 | 2017-01-03 | 649219406 | 1410800.0 | 18997200.0 | 1881980.0 |
| 2 | 2017-01-04 | 573172248 | 5372880.0 | 17491900.0 | 4193390.0 |
| 3 | 2017-01-05 | 573475802 | 9120480.0 | 22340400.0 | 9904260.0 |
| 4 | 2017-01-06 | 485214839 | 5740510.0 | 4942570.0 | 7021300.0 |
| 5 | 2017-01-09 | 532332873 | 1658730.0 | 6786430.0 | 5032990.0 |
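
The tables above report element type `Any` for every price and volume column, which tends to slow numeric work and can hide mixed integer and float cells (the SP500 volumes print as integers, the others as floats). A minimal sketch of one way to tighten the types is shown below. The helper name `to_float_column` is hypothetical, and the sketch assumes that empty Excel cells arrive as `missing` and that all other cells are numbers. XLSX.jl's `readtable` also offers an `infer_eltypes` keyword that may make this step unnecessary, depending on the installed version.

```julia
# Hypothetical helper (an assumption, not part of the original notebook):
# convert a column of numeric cells to Float64 while keeping `missing` entries.
# Assumes every non-missing cell is a number; a string such as "NA" would throw.
function to_float_column(col::AbstractVector)
    return [ismissing(x) ? missing : Float64(x) for x in col]
end

# Small stand-alone demonstration with values taken from the volume table:
volumes = Any[649219406, 1410800.0, missing]
converted = to_float_column(volumes)
println(eltype(converted))

# Possible usage on the notebook's DataFrames (every column except Date):
# for name in names(data_df)[2:end]
#     data_df[!, name] = to_float_column(data_df[!, name])
# end
```

A column without gaps comes back as a plain `Vector{Float64}`, while a column with gaps keeps its `missing` entries in a `Vector{Union{Missing, Float64}}`.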


In [None]:
# Display the last few rows of each DataFrame
println("Data without NAs - Prices:")
display(last(data_df, 5))

println("\nData without NAs - Volume:")
display(last(data_df_v, 5))

println("\nData with NAs - Prices:")
display(last(data_df_2, 5))

println("\nData with NAs - Volume:")
display(last(data_df_v_2, 5))

Data without NAs - Prices:


| Row | Date | SP500 | Bitcoin | Ethereum | Litecoin | VIX_Index | Dollar_Index |
|---|---|---|---|---|---|---|---|
| | Date | Any | Any | Any | Any | Any | Any |
| 1 | 2024-10-07 | 5695.94 | 63066.0 | 2444.1 | 65.705 | 22.64 | 102.537 |
| 2 | 2024-10-08 | 5751.13 | 62445.0 | 2446.9 | 66.011 | 21.42 | 102.549 |
| 3 | 2024-10-09 | 5792.04 | 60447.0 | 2358.2 | 64.383 | 20.86 | 102.928 |
| 4 | 2024-10-10 | 5780.05 | 59790.0 | 2367.0 | 64.071 | 20.93 | 102.988 |
| 5 | 2024-10-11 | 5815.03 | 63070.0 | 2460.0 | 65.685 | 20.46 | 102.89 |



Data without NAs - Volume:


| Row | Date | SP500 | Bitcoin | Ethereum | Litecoin |
|---|---|---|---|---|---|
| | Date | Any | Any | Any | Any |
| 1 | 2024-10-07 | 645340786 | 53144500.0 | 46747400.0 | 764460.0 |
| 2 | 2024-10-08 | 611810958 | 54113800.0 | 30009500.0 | 611609.0 |
| 3 | 2024-10-09 | 613680656 | 43626500.0 | 20620300.0 | 626045.0 |
| 4 | 2024-10-10 | 562858168 | 65771000.0 | 29421800.0 | 503762.0 |
| 5 | 2024-10-11 | 605684238 | 41739100.0 | 23048600.0 | 333545.0 |



Data with NAs - Prices:


| Row | Date | SP500 | Bitcoin | Ethereum | Litecoin | VIX_Index | Dollar_Index |
|---|---|---|---|---|---|---|---|
| | Date | Any | Any | Any | Any | Any | Any |
| 1 | 2024-10-07 | 5695.94 | 63066.0 | 2444.1 | 65.705 | 22.64 | 102.537 |
| 2 | 2024-10-08 | 5751.13 | 62445.0 | 2446.9 | 66.011 | 21.42 | 102.549 |
| 3 | 2024-10-09 | 5792.04 | 60447.0 | 2358.2 | 64.383 | 20.86 | 102.928 |
| 4 | 2024-10-10 | 5780.05 | 59790.0 | 2367.0 | 64.071 | 20.93 | 102.988 |
| 5 | 2024-10-11 | 5815.03 | 63070.0 | 2460.0 | 65.685 | 20.46 | 102.89 |



Data with NAs - Volume:


| Row | Date | SP500 | Bitcoin | Ethereum | Litecoin |
|---|---|---|---|---|---|
| | Date | Any | Any | Any | Any |
| 1 | 2024-10-07 | 645340786 | 53144500.0 | 46747400.0 | 764460.0 |
| 2 | 2024-10-08 | 611810958 | 54113800.0 | 30009500.0 | 611609.0 |
| 3 | 2024-10-09 | 613680656 | 43626500.0 | 20620300.0 | 626045.0 |
| 4 | 2024-10-10 | 562858168 | 65771000.0 | 29421800.0 | 503762.0 |
| 5 | 2024-10-11 | 605684238 | 41739100.0 | 23048600.0 | 333545.0 |
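
The first and last five rows of the "with NAs" tables show no gaps, so the missing values must sit somewhere in between. A minimal sketch for locating them is to count the `missing` entries per column. The helper name `missing_counts` is hypothetical, and the sketch assumes that empty Excel cells were read as `missing`.

```julia
# Hypothetical helper (an assumption, not part of the original notebook):
# count `missing` entries per column. `columns` is any iterable of
# name => vector pairs, for example pairs(eachcol(df)) for a DataFrame.
function missing_counts(columns)
    return Dict(String(name) => count(ismissing, col) for (name, col) in columns)
end

# Stand-alone demonstration on a small named tuple of columns:
toy = (SP500 = [2257.83, missing, 2269.0], Bitcoin = [1037.5, 1139.6, 1003.2])
println(missing_counts(pairs(toy)))

# Possible usage on the notebook's DataFrames:
# missing_counts(pairs(eachcol(data_df_2)))     # prices with NAs
# missing_counts(pairs(eachcol(data_df_v_2)))   # volume with NAs
```

Columns with a count of zero can be treated like their counterparts in the "without NAs" files.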


In [None]:
## original code from the professor
# Define the path to the directory on Google Drive (manually set this path)
# notebook_directory = "/content/drive/MyDrive/<Your Notebook Directory>/data"

# Construct the path to the CSV file
# csv_file_path = joinpath(notebook_directory, "data_file_name.csv")

# Load the CSV file into a DataFrame
# data_df = CSV.read(csv_file_path, DataFrame)

# Display the first few rows of the DataFrame
# first_five_rows = first(data_df, 5)
# println(first_five_rows)

## Data Management

In [None]:
# Not used in this project (the data come from Bloomberg); Python snippet kept for reference
# import yfinance as yf
# import pandas as pd
# tickers = ['KO', 'PEP']
# data = yf.download(tickers, start='2015-09-01', end='2024-09-21')
# # Convert the index to datetime and format it as YYYY-MM-DD strings; the timezone was causing an error
# data.index = pd.to_datetime(data.index).strftime('%Y-%m-%d')
# data['Adj Close'].to_csv('stock_data.csv', index=True)  # write the date index as the first column


# Analytics

# Simulation Results

# Empirical Results