
Introduction to Financial Analytics and Time Series Data

Develop a basic understanding of financial analytics: its definition and specific examples

Use an overview framework of financial analytics to generalize the procedure of financial analysis: the sources of data, the tools used to analyze it, and its application to enhancing operating performance (from automate to transform and automate)

Understand time series data and how to work with it in R to build forecasting models that can be applied to enhance business performance

In [3]:
install.packages("xts") 
install.packages("tidyverse") 
install.packages("lubridate") 
install.packages("forecast") 

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

also installing the dependency ‘tseries’




In [4]:
# SUPPRESS PACKAGE STARTUP MESSAGES
quietly <- suppressPackageStartupMessages

# DISABLE SCIENTIFIC NOTATION
options(scipen = 9999)

# LOAD PACKAGES, SUPPRESSING STARTUP MESSAGES
quietly(library(xts))
quietly(library(tidyverse))
quietly(library(lubridate))
quietly(library(forecast))

“running command 'timedatectl' had status 1”
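The `scipen` option set above adds a penalty against scientific notation when R chooses how to print a number. The sketch below, using base R only and an arbitrary example value, compares the default penalty with the large one used in this notebook. The variable names are illustrative and not part of the notebook.

```r
# Temporarily switch to R's default penalty and remember the previous options
old <- options(scipen = 0)
default_fmt <- format(1e10)      # scientific notation is shorter, so R prefers it

# A large penalty makes fixed notation win for practically any number
options(scipen = 9999)
fixed_fmt <- format(1e10)

# Restore whatever was set before this sketch ran
options(old)

print(c(default = default_fmt, fixed = fixed_fmt))
```

With the default setting, large volumes and very small prices can show up in forms such as `1e+10`, which makes columns like `volume` harder to scan.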


Data Import and Exploration

In [5]:
# Import the kraken dataset as kraken_df
kraken_df = read_csv("/content/sample_data/kraken.csv")

[1m[1mRows: [1m[22m[34m[34m19285[34m[39m [1m[1mColumns: [1m[22m[34m[34m8[34m[39m

[36m──[39m [1m[1mColumn specification[1m[22m [36m────────────────────────────────────────────────────────[39m
[1mDelimiter:[22m ","
[31mchr[39m  (2): crypto, trend
[32mdbl[39m  (3): pct_change, price, volume
[33mlgl[39m  (2): all_time_high, new_crypto
[34mdttm[39m (1): datetime


[36mℹ[39m Use [30m[47m[30m[47m`spec()`[47m[30m[49m[39m to retrieve the full column specification for this data.
[36mℹ[39m Specify the column types or set [30m[47m[30m[47m`show_col_types = FALSE`[47m[30m[49m[39m to quiet this message.



In [6]:
head(kraken_df)

datetime,crypto,pct_change,price,volume,trend,all_time_high,new_crypto
<dttm>,<chr>,<dbl>,<dbl>,<dbl>,<chr>,<lgl>,<lgl>
2020-08-05 03:48:49,XBT,0.004,11194.0,111100000,down,False,False
2020-08-05 03:48:49,ETH,0.008,389.81,68800000,up,False,False
2020-08-05 03:48:49,XRP,0.032,0.301,20300000,down,False,False
2020-08-05 03:48:49,USDT,0.0001,1.0005,18600000,down,False,False
2020-08-05 03:48:49,LINK,0.065,9.8602,14800000,up,False,False
2020-08-05 03:48:49,XTZ,0.054,3.2424,10300000,up,False,False


Data Pre-processing


We will convert several variables in the following two cells:

Change the trend variable to numeric (up/down/flat to 1/2/0)

Split the datetime variable into two variables, date and time (hours:minutes:seconds)

Convert the crypto variable from character to factor

Apply binary encoding to the all_time_high and new_crypto variables (TRUE = 1, FALSE = 0)
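To make the four conversions easy to check by eye, here is a minimal sketch on a three-row toy data frame. It uses base R only and recodes `trend` with a named lookup vector instead of chained string replacements, with the same codes as the cell that follows (up = 1, down = 2, flat = 0). The `toy` data frame, its values, and `trend_codes` are made up for illustration.

```r
# Toy rows standing in for kraken_df (values are invented for illustration)
toy <- data.frame(
  datetime      = as.POSIXct(c("2020-08-05 03:48:49", "2020-08-05 03:48:49",
                               "2020-08-04 21:15:00"), tz = "UTC"),
  crypto        = c("XBT", "ETH", "XBT"),
  trend         = c("down", "up", "flat"),
  all_time_high = c(FALSE, TRUE, FALSE),
  new_crypto    = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# 1. Recode trend through a named lookup vector (label -> numeric code)
trend_codes <- c(up = 1, down = 2, flat = 0)
toy$trend <- unname(trend_codes[toy$trend])

# 2. Split datetime into a Date column and an HH:MM:SS string column
toy$date <- as.Date(toy$datetime)
toy$time <- format(toy$datetime, "%H:%M:%S")

# 3. Character to factor, keeping levels in order of first appearance
toy$crypto <- factor(toy$crypto, levels = unique(toy$crypto))

# 4. Binary encoding of the logical flags (TRUE = 1, FALSE = 0)
toy$all_time_high <- as.numeric(toy$all_time_high)
toy$new_crypto    <- as.numeric(toy$new_crypto)

str(toy)
```

A lookup vector keeps the mapping in one place and returns `NA` for any label that is not listed, which makes unexpected values in `trend` easy to spot.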

In [8]:
kraken_df$trend = kraken_df$trend %>% 
                    str_replace_all("up","1") %>%
                    str_replace_all("down","2") %>%
                    str_replace_all("flat","0") %>%
                    as.numeric()

In [9]:
kraken_df$date = as.Date(kraken_df$datetime)
kraken_df$time = format(kraken_df$datetime,"%H:%M:%S")
kraken_df$crypto = as_factor(kraken_df$crypto)
kraken_df$all_time_high = as.numeric(kraken_df$all_time_high)
kraken_df$new_crypto = as.numeric(kraken_df$new_crypto)


In [10]:
# Let's check out the summary statistics of variables we have:
summary(kraken_df[c("date","crypto","price","volume")])

      date                crypto          price               volume         
 Min.   :2017-06-12   XBT    : 1053   Min.   :    0.001   Min.   :      518  
 1st Qu.:2018-05-16   ETH    : 1053   1st Qu.:    1.000   1st Qu.:   286709  
 Median :2019-04-01   XRP    : 1053   Median :   11.560   Median :  1090000  
 Mean   :2019-03-06   USDT   : 1053   Mean   :  487.156   Mean   : 11007116  
 3rd Qu.:2020-01-19   LTC    : 1053   3rd Qu.:  114.770   3rd Qu.:  4440000  
 Max.   :2020-08-05   XLM    : 1053   Max.   :19020.000   Max.   :829000000  
                      (Other):12967                                          

In [11]:
summary(kraken_df[c("trend","pct_change","all_time_high", "new_crypto")])

     trend         pct_change      all_time_high        new_crypto      
 Min.   :0.000   Min.   :0.00000   Min.   :0.000000   Min.   :0.000000  
 1st Qu.:1.000   1st Qu.:0.00960   1st Qu.:0.000000   1st Qu.:0.000000  
 Median :1.000   Median :0.02480   Median :0.000000   Median :0.000000  
 Mean   :1.452   Mean   :0.04029   Mean   :0.007156   Mean   :0.001141  
 3rd Qu.:2.000   3rd Qu.:0.05300   3rd Qu.:0.000000   3rd Qu.:0.000000  
 Max.   :2.000   Max.   :1.07400   Max.   :1.000000   Max.   :1.000000  
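Because all_time_high and new_crypto are now 0/1 variables, their means in the summary are the share of rows where the flag was TRUE. The sketch below first shows this on a tiny made-up vector, then uses the row count reported by `read_csv` (19285) and the printed means to back out approximate counts. The counts are estimates, since the printed means are rounded, and the variable names are illustrative.

```r
# The mean of a 0/1 variable equals the proportion of ones
flags <- as.numeric(c(TRUE, FALSE, FALSE, FALSE))
share_toy <- mean(flags)                   # one TRUE out of four rows

# Values copied from the notebook output above
n_rows    <- 19285                         # rows reported when the CSV was read
share_ath <- 0.007156                      # Mean of all_time_high
share_new <- 0.001141                      # Mean of new_crypto

# Approximate number of flagged rows (the printed means are rounded)
approx_ath <- round(n_rows * share_ath)
approx_new <- round(n_rows * share_new)

print(c(all_time_high = approx_ath, new_crypto = approx_new))
```

Under these figures, roughly 138 rows record an all-time high and about 22 mark a newly listed crypto, so both flags are rare events in this dataset.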