# Steps for Causal Impact

1. Define the Pre and Post Period.

2. Retrieve the data we need.

3. Check whether the variables are correlated in the pre-period.

4. Remove non-correlated data.

5. Use Causal Impact.
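Steps 3 and 4 can be sketched on synthetic data as follows. The column names, the random series, and the `threshold` value are assumptions made for illustration; they are not taken from the analysis below.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 40
base = np.cumsum(rng.normal(0, 1, n))  # shared random-walk trend

# Synthetic target "y", one related series, and one unrelated series
df = pd.DataFrame(
    {
        "y": base + rng.normal(0, 0.1, n),
        "related": 2 * base + rng.normal(0, 0.1, n),
        "noise": rng.normal(0, 1, n),
    },
    index=pd.date_range("2020-09-01", periods=n, name="Date"),
)

# Step 3: correlation with the target, computed on the pre-period only
pre = df.loc["2020-09-01":"2020-10-05"]
corr = pre.corr()["y"].drop("y")

# Step 4: keep only the series whose absolute correlation clears the threshold
threshold = 0.3  # hypothetical cut-off, tune for your own data
keep = corr[corr.abs() >= threshold].index.tolist()
filtered = df[["y"] + keep]

print(corr.round(2))
print(filtered.columns.tolist())
```

Restricting the correlation check to the pre-period matters: correlations measured after the intervention may already be distorted by the effect being estimated.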

# Loading the necessary libraries


- **yfinance** retrieves financial data from Yahoo Finance.

- **tfcausalimpact** is a Python implementation of Google's Causal Impact algorithm, built on top of TensorFlow Probability.

- **TensorFlow Probability** is a library for probabilistic reasoning and statistical analysis in TensorFlow.

- **Causal Impact** is the original R package for causal inference using Bayesian structural time-series models.

In [None]:
# Install the required packages
!pip install yfinance
!pip install tfcausalimpact

In [2]:
# Import the libraries
import yfinance as yf
from causalimpact import CausalImpact
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

In [3]:
# Define Dates
training_start = "2020-09-01"  # Aim for roughly a 10:1 ratio of training days to the 4 treatment days
training_end = "2020-10-19"
treatment_start = "2020-10-20"
treatment_end = "2020-10-23"
end_stock = "2020-10-24"  # yfinance treats `end` as exclusive, so use the day after treatment_end
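Rather than typing `end_stock` by hand, it can be derived from `treatment_end`. The sketch below uses only the standard library and also counts the calendar days in each period. Note that these are calendar days, not trading days, so the counts for stock data will be smaller.

```python
from datetime import date, timedelta

training_start = date.fromisoformat("2020-09-01")
training_end = date.fromisoformat("2020-10-19")
treatment_start = date.fromisoformat("2020-10-20")
treatment_end = date.fromisoformat("2020-10-23")

# The download end date is exclusive, so step one day past treatment_end
end_stock = (treatment_end + timedelta(days=1)).isoformat()

# Inclusive calendar-day counts for each period
pre_days = (training_end - training_start).days + 1
post_days = (treatment_end - treatment_start).days + 1

print(end_stock)            # 2020-10-24
print(pre_days, post_days)  # 49 4
```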

# Loading Financial Data

In [4]:
# Bitcoin Data
y = ["BTC-USD"]
y = yf.download(tickers = y,
                start = training_start,
                end = end_stock,
                interval = "1d")
y

[*********************100%%**********************]  1 of 1 completed


Date,Open,High,Low,Close,Adj Close,Volume
2020-09-01,11679.316406,12067.081055,11601.128906,11970.478516,11970.478516,27311555343
2020-09-02,11964.823242,11964.823242,11290.793945,11414.03418,11414.03418,28037405299
2020-09-03,11407.191406,11443.022461,10182.464844,10245.296875,10245.296875,31927261555
2020-09-04,10230.365234,10663.919922,10207.94043,10511.813477,10511.813477,29965130374
2020-09-05,10512.530273,10581.571289,9946.675781,10169.567383,10169.567383,44916565292
2020-09-06,10167.216797,10353.927734,10056.885742,10280.351562,10280.351562,37071460174
2020-09-07,10280.998047,10399.15332,9916.493164,10369.563477,10369.563477,33703098409
2020-09-08,10369.306641,10414.775391,9945.110352,10131.516602,10131.516602,33430927462
2020-09-09,10134.151367,10350.542969,10017.250977,10242.347656,10242.347656,24128292755
2020-09-10,10242.330078,10503.912109,10238.135742,10363.138672,10363.138672,54406443211


- Of all these columns, we only need **Adj Close**, so we select it:

In [5]:
y = y['Adj Close']
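Depending on the installed yfinance version, the downloaded frame may not contain an `Adj Close` column, since recent releases appear to adjust prices by default and return only `Close`. A minimal sketch of a defensive selection follows. It assumes flat column names as in the output above; `pick_price_column` is a hypothetical helper and the frames are synthetic stand-ins for a real download.

```python
import pandas as pd


def pick_price_column(frame: pd.DataFrame) -> pd.Series:
    """Return 'Adj Close' if present, otherwise fall back to 'Close'.

    Hypothetical helper for illustration; not part of yfinance.
    """
    for name in ("Adj Close", "Close"):
        if name in frame.columns:
            return frame[name]
    raise KeyError("no 'Adj Close' or 'Close' column found")


# Synthetic frames standing in for two possible download layouts
with_adj = pd.DataFrame({"Close": [10.0, 11.0], "Adj Close": [9.5, 10.5]})
without_adj = pd.DataFrame({"Close": [10.0, 11.0]})

print(pick_price_column(with_adj).tolist())     # [9.5, 10.5]
print(pick_price_column(without_adj).tolist())  # [10.0, 11.0]
```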

In [6]:
y

Date
2020-09-01    11970.478516
2020-09-02    11414.034180
2020-09-03    10245.296875
2020-09-04    10511.813477
2020-09-05    10169.567383
2020-09-06    10280.351562
2020-09-07    10369.563477
2020-09-08    10131.516602
2020-09-09    10242.347656
2020-09-10    10363.138672
2020-09-11    10400.915039
2020-09-12    10442.170898
2020-09-13    10323.755859
2020-09-14    10680.837891
2020-09-15    10796.951172
2020-09-16    10974.905273
2020-09-17    10948.990234
2020-09-18    10944.585938
2020-09-19    11094.346680
2020-09-20    10938.271484
2020-09-21    10462.259766
2020-09-22    10538.459961
2020-09-23    10246.186523
2020-09-24    10760.066406
2020-09-25    10692.716797
2020-09-26    10750.723633
2020-09-27    10775.269531
2020-09-28    10709.652344
2020-09-29    10844.640625
2020-09-30    10784.491211
2020-10-01    10619.452148
2020-10-02    10575.974609
2020-10-03    10549.329102
2020-10-04    10669.583008
2020-10-05    10793.339844
2020-10-06    10604.406250
2020-10-07    10668.968

In [8]:
type(y)

pandas.core.series.Series

- Rename the Series to `y` so the target is easy to identify once it is combined with the other series:

In [9]:
y = y.rename("y")
y.tail()

Date
2020-10-19    11742.037109
2020-10-20    11916.334961
2020-10-21    12823.689453
2020-10-22    12965.891602
2020-10-23    12931.539062
Name: y, dtype: float64

- The training group serves as the **control group**.

- The treatment and control groups are assumed to share the same KPI (here, the adjusted closing price).

- The more control series that correlate with the target, the more precise the analysis tends to be.

- Keep the post-period (treatment duration) as short as possible.
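To illustrate why control series matter, the sketch below fits an ordinary least-squares regression of the target on one control during the pre-period, then uses it to predict what the target would have looked like in the post-period without the intervention. This is a deliberately simplified stand-in, not the Bayesian structural time-series model that Causal Impact fits, and all data and the injected effect of 10 are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pre, n_post = 40, 4
n = n_pre + n_post

# Synthetic control series and a target that tracks it closely
control = np.linspace(100, 120, n) + rng.normal(0, 0.5, n)
y = 2.0 * control + 5.0 + rng.normal(0, 0.5, n)

# Inject a synthetic treatment effect into the post-period
true_effect = 10.0
y[n_pre:] += true_effect

# Fit y ~ intercept + control on the pre-period only
X = np.column_stack([np.ones(n), control])
coef, *_ = np.linalg.lstsq(X[:n_pre], y[:n_pre], rcond=None)

# Counterfactual: what the pre-period relationship predicts for the post-period
counterfactual = X[n_pre:] @ coef
estimated_effect = (y[n_pre:] - counterfactual).mean()

print(round(float(estimated_effect), 2))
```

The estimate should land near the injected effect. The closer the control tracks the target before the intervention, the tighter the counterfactual, which is the intuition behind preferring well-correlated controls.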

In [15]:
# Loading the Stock Data
stocks = ["CARL-B.CO", "ZAL.DE", "SQ", "CRSP", "TRMB", "JD", "DE", "KTOS", "GOOG"]
x = yf.download(tickers = stocks,
                start = training_start,
                end = end_stock,
                interval = '1d')

x.head()

[*********************100%%**********************]  9 of 9 completed


,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Close,Close,Close,Close,Close,Close,Close,Close,Close,High,High,High,High,High,High,High,High,High,Low,Low,Low,Low,Low,Low,Low,Low,Low,Open,Open,Open,Open,Open,Open,Open,Open,Open,Volume,Volume,Volume,Volume,Volume,Volume,Volume,Volume,Volume
Date,CARL-B.CO,CRSP,DE,GOOG,JD,KTOS,SQ,TRMB,ZAL.DE,CARL-B.CO,CRSP,DE,GOOG,JD,KTOS,SQ,TRMB,ZAL.DE,CARL-B.CO,CRSP,DE,GOOG,JD,KTOS,SQ,TRMB,ZAL.DE,CARL-B.CO,CRSP,DE,GOOG,JD,KTOS,SQ,TRMB,ZAL.DE,CARL-B.CO,CRSP,DE,GOOG,JD,KTOS,SQ,TRMB,ZAL.DE,CARL-B.CO,CRSP,DE,GOOG,JD,KTOS,SQ,TRMB,ZAL.DE
2020-09-01,794.451111,93.419998,209.291962,83.0355,79.267685,19.700001,166.660004,53.41,77.0,860.0,93.419998,217.690002,83.0355,82.489998,19.700001,166.660004,53.41,77.0,874.599976,94.5,217.720001,83.286499,83.0,19.93,170.460007,53.470001,77.5,860.0,90.730003,208.270004,81.611,79.144997,19.23,162.0,52.099998,74.779999,873.200012,91.989998,208.520004,81.831497,79.489998,19.459999,164.809998,52.669998,74.959999,269532,779500.0,1873400.0,36506000.0,11431400.0,587600.0,12306400.0,542300.0,790198
2020-09-02,802.765198,93.93,208.734344,86.414001,79.959564,19.91,162.880005,54.310001,77.0,869.0,93.93,217.110001,86.414001,83.209999,19.91,162.880005,54.310001,77.0,873.599976,94.410004,219.710007,86.658997,86.580002,20.0,170.610001,54.509998,78.080002,861.200012,90.714996,215.669998,83.316498,81.800003,19.58,158.110001,52.959999,76.220001,865.0,94.150002,218.240005,83.688751,85.459,19.73,170.600006,53.779999,77.5,202539,532000.0,1983000.0,50224000.0,13860900.0,612800.0,11214800.0,728800.0,496554
2020-09-03,818.099915,85.690002,202.677383,82.092003,75.952454,19.52,152.860001,50.900002,75.080002,885.599976,85.690002,210.809998,82.092003,79.040001,19.52,152.860001,50.900002,75.080002,892.599976,92.128998,218.729996,85.485703,81.690002,20.15,157.229996,54.130001,77.940002,874.0,84.195,209.070007,80.752998,76.029999,19.299999,149.509995,50.66,74.400002,874.799988,91.690002,217.270004,85.485703,81.375,20.0,157.0,54.130001,77.260002,248867,1278900.0,1975200.0,62156000.0,19254000.0,847100.0,16421200.0,1220300.0,736259
2020-09-04,804.612732,82.019997,203.186951,79.552002,76.874954,19.26,146.389999,49.959999,71.739998,871.0,82.019997,211.339996,79.552002,80.0,19.26,146.389999,49.959999,71.739998,885.599976,87.0,214.0,82.255501,80.800003,19.83,152.220001,51.529999,76.300003,867.200012,76.709999,208.179993,77.380653,75.400002,18.84,134.0,49.07,71.300003,874.799988,86.489998,213.190002,81.212997,77.966003,19.68,149.630005,51.040001,74.800003,221091,1570300.0,1662300.0,52172000.0,21500900.0,1060000.0,17995200.0,914700.0,662250
2020-09-07,812.926697,,,,,,,,74.0,880.0,,,,,,,,74.0,880.0,,,,,,,,74.019997,870.200012,,,,,,,,71.940002,870.200012,,,,,,,,72.239998,90498,,,,,,,,362062


In [16]:
# Keep the first len(stocks) columns, which hold the Adj Close prices
x = x.iloc[:,:len(stocks)]
x.head(1)

,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close
Date,CARL-B.CO,CRSP,DE,GOOG,JD,KTOS,SQ,TRMB,ZAL.DE
2020-09-01,794.451111,93.419998,209.291962,83.0355,79.267685,19.700001,166.660004,53.41,77.0


In [18]:
# Drop the top column level ("Adj Close") so only the tickers remain
x.columns = x.columns.droplevel()
x.head()

Date,CARL-B.CO,CRSP,DE,GOOG,JD,KTOS,SQ,TRMB,ZAL.DE
2020-09-01,794.451111,93.419998,209.291962,83.0355,79.267685,19.700001,166.660004,53.41,77.0
2020-09-02,802.765198,93.93,208.734344,86.414001,79.959564,19.91,162.880005,54.310001,77.0
2020-09-03,818.099915,85.690002,202.677383,82.092003,75.952454,19.52,152.860001,50.900002,75.080002
2020-09-04,804.612732,82.019997,203.186951,79.552002,76.874954,19.26,146.389999,49.959999,71.739998
2020-09-07,812.926697,,,,,,,,74.0
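Selecting by position works only while the `Adj Close` block comes first in the column order. A sketch of an alternative that selects the block by name is shown below on a synthetic two-level frame; the tickers `AAA` and `BBB` are made up. Indexing a MultiIndex column frame with a top-level key returns the sub-frame with that level already removed, so no separate `droplevel` call is needed.

```python
import numpy as np
import pandas as pd

tickers = ["AAA", "BBB"]  # hypothetical tickers
fields = ["Adj Close", "Close"]
cols = pd.MultiIndex.from_product([fields, tickers])

# Synthetic frame mimicking the two-level layout of a multi-ticker download
raw = pd.DataFrame(
    np.arange(12, dtype=float).reshape(3, 4),
    index=pd.date_range("2020-09-01", periods=3, name="Date"),
    columns=cols,
)

# Positional approach: slice the first block, then drop the top level
by_position = raw.iloc[:, : len(tickers)].copy()
by_position.columns = by_position.columns.droplevel()

# Name-based approach: select the top-level key directly
by_name = raw["Adj Close"]

print(by_name.columns.tolist())  # ['AAA', 'BBB']
print((by_position.values == by_name.values).all())  # True
```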


In [19]:
# Combine the target and the control series, dropping dates where any series is missing
df = pd.concat([y, x], axis=1).dropna()
df.head()

Date,y,CARL-B.CO,CRSP,DE,GOOG,JD,KTOS,SQ,TRMB,ZAL.DE
2020-09-01,11970.478516,794.451111,93.419998,209.291962,83.0355,79.267685,19.700001,166.660004,53.41,77.0
2020-09-02,11414.03418,802.765198,93.93,208.734344,86.414001,79.959564,19.91,162.880005,54.310001,77.0
2020-09-03,10245.296875,818.099915,85.690002,202.677383,82.092003,75.952454,19.52,152.860001,50.900002,75.080002
2020-09-04,10511.813477,804.612732,82.019997,203.186951,79.552002,76.874954,19.26,146.389999,49.959999,71.739998
2020-09-08,10131.516602,799.439575,81.459999,202.129395,76.619499,73.242615,19.23,139.110001,49.139999,73.440002
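Notice that the combined frame jumps from 2020-09-04 to 2020-09-08. Bitcoin trades every day, but the stocks have no rows on weekends or exchange holidays, so `dropna()` discards those dates. A small sketch with synthetic values (the column name `stock` is made up) shows the mechanism:

```python
import pandas as pd

# Daily series, like a cryptocurrency that trades every calendar day
daily = pd.Series(
    [1.0, 2.0, 3.0, 4.0, 5.0],
    index=pd.date_range("2020-09-04", periods=5),
    name="y",
)

# Series with gaps, like a stock that skips the weekend and a holiday
business = pd.DataFrame(
    {"stock": [10.0, 11.0]},
    index=pd.to_datetime(["2020-09-04", "2020-09-08"]),
)

# concat aligns on the date index; missing dates become NaN
combined = pd.concat([daily, business], axis=1)

# dropna keeps only the dates where every series has a value
clean = combined.dropna()

print(len(combined), len(clean))  # 5 2
print(clean.index.strftime("%Y-%m-%d").tolist())  # ['2020-09-04', '2020-09-08']
```

Because every row with a gap in any one series is dropped, adding controls from exchanges with different holiday calendars can shrink the usable pre-period.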
