## Interest Rate

In [4]:
import pandas as pd

- **path_ir**: path of the interest rate file for a single country (the data can only be downloaded one country at a time)
- **country_code**: country code used throughout (refer to README)
- **fred_code**: alphanumeric series code that the FRED website assigns to each country's dataset

In [5]:
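def get_interest_rate(path_ir, country_code, fred_code):
    """Load one country's FRED interest rate file and prepare it for merging."""
    df_ir = pd.read_csv(path_ir)
    df_ir['DATE'] = pd.to_datetime(df_ir['DATE'])
    # month and year are the keys used for the merges below
    df_ir['month'] = df_ir['DATE'].dt.month
    df_ir['year'] = df_ir['DATE'].dt.year
    df_ir = df_ir.rename(columns={fred_code: f'{country_code}_IR', 'DATE': 'index'})
    # convert the rate from percent to a decimal fraction
    df_ir[f'{country_code}_IR'] = df_ir[f'{country_code}_IR'] / 100

    return df_ir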
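
To make the transformation concrete, here is a minimal sketch that applies the same steps to a tiny in-memory sample instead of a downloaded file. The two-row CSV is a hypothetical stand-in that assumes the FRED export has a `DATE` column plus one column named after the series code, with rates in percent; the variable name `sample` is illustrative only.

```python
import io
import pandas as pd

# Hypothetical two-row sample in the assumed layout of a FRED CSV export:
# a DATE column plus one column named after the series code (values in percent).
sample_csv = io.StringIO(
    "DATE,IRSTCI01USM156N\n"
    "2000-01-01,5.45\n"
    "2000-02-01,5.73\n"
)

fred_code = "IRSTCI01USM156N"
country_code = "USD"

sample = pd.read_csv(sample_csv)
sample["DATE"] = pd.to_datetime(sample["DATE"])
sample["month"] = sample["DATE"].dt.month
sample["year"] = sample["DATE"].dt.year
sample = sample.rename(columns={fred_code: f"{country_code}_IR", "DATE": "index"})
# Percent to decimal fraction, e.g. 5.45 becomes approximately 0.0545
sample[f"{country_code}_IR"] = sample[f"{country_code}_IR"] / 100

print(sample.columns.tolist())
print(sample)
```

The resulting columns should line up with the `df_ir` output shown further down: the date column renamed to `index`, the rate column named after the country code, and the two merge keys.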

In [18]:
def append_df(df, df_ir):
    # drop the date column on a copy so the caller's df_ir is left unchanged
    # and the cell can be rerun without a KeyError
    df_ir = df_ir.drop(columns='index')
    df = pd.merge(df, df_ir, on=['month', 'year'])

    return df

In [6]:
def data_combine(path, df_ir):

    exchange_df = pd.read_csv(path)
    print(exchange_df.shape)

    df_with_ir = pd.merge(exchange_df, df_ir, on=['month', 'year'])
    df_with_ir.drop(columns='index', inplace=True)

    return df_with_ir
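
Because `pd.merge` defaults to an inner join, any exchange rate row whose (`month`, `year`) pair is missing from the interest rate data would be dropped without warning. The sketch below shows one possible way to check for that, using the `how`, `validate` and `indicator` parameters of `pd.merge`. The frames `daily` and `monthly` and their values are hypothetical toy data, not the notebook's datasets.

```python
import pandas as pd

# Hypothetical daily exchange-rate rows; the March row has no matching monthly rate.
daily = pd.DataFrame({
    "Time Series": ["2000-01-03", "2000-01-04", "2000-02-01", "2000-03-01"],
    "AUD_USD": [0.65, 0.66, 0.63, 0.61],
    "month": [1, 1, 2, 3],
    "year": [2000, 2000, 2000, 2000],
})
# Hypothetical monthly interest rates covering January and February only.
monthly = pd.DataFrame({
    "month": [1, 2],
    "year": [2000, 2000],
    "USD_IR": [0.0545, 0.0573],
})

# Default inner merge: the unmatched March row disappears silently.
inner = pd.merge(daily, monthly, on=["month", "year"])

# Left merge keeps every daily row; indicator=True adds a _merge column that
# flags rows without a match, and validate="m:1" raises if a (month, year)
# pair appears more than once in the monthly frame.
checked = pd.merge(daily, monthly, on=["month", "year"], how="left",
                   validate="m:1", indicator=True)
unmatched = checked[checked["_merge"] == "left_only"]

print(len(inner), len(checked), len(unmatched))
```

Comparing the row count before and after the merge, as the notebook does with the two `shape` printouts, serves the same purpose; the indicator column additionally shows which rows failed to match.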

In [89]:
# replace path_ir, country_code and fred_code with the values for the chosen interest rate file

path_ir = '<path to interest rate file>'
country_code = 'USD'
fred_code = 'IRSTCI01USM156N'

In [90]:
df_ir = get_interest_rate(path_ir, country_code, fred_code)
df_ir

Unnamed: 0,index,USD_IR,month,year
0,2000-01-01,0.0545,1,2000
1,2000-02-01,0.0573,2,2000
2,2000-03-01,0.0585,3,2000
3,2000-04-01,0.0602,4,2000
4,2000-05-01,0.0627,5,2000
...,...,...,...,...
235,2019-08-01,0.0213,8,2019
236,2019-09-01,0.0204,9,2019
237,2019-10-01,0.0183,10,2019
238,2019-11-01,0.0155,11,2019


Because the data can only be downloaded one country at a time, the interest rate values have to be merged into one dataframe manually. To do so, a copy of the first 'df_ir' is set aside as the dataframe ('df') to which the other interest rate values are appended. From the second 'df_ir' onward, each one is merged into 'df' on 'month' and 'year', producing a dataset that contains only the interest rate values.
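
The copy-then-append workflow described above could also be expressed as a single fold over a list of per-country frames. This is only a sketch under assumptions: `make_frame` and `append_rates` are hypothetical helpers introduced for illustration, and the frames are built in memory from the first two months of the AUD, NZD and USD values rather than loaded from files.

```python
from functools import reduce
import pandas as pd

dates = pd.to_datetime(["2000-01-01", "2000-02-01"])

def make_frame(code, values):
    # Hypothetical helper: builds a frame shaped like the output of get_interest_rate.
    return pd.DataFrame({
        "index": dates,
        f"{code}_IR": values,
        "month": dates.month,
        "year": dates.year,
    })

frames = [
    make_frame("AUD", [0.0500, 0.0548]),
    make_frame("NZD", [0.0512, 0.0526]),
    make_frame("USD", [0.0545, 0.0573]),
]

def append_rates(left, right):
    # Hypothetical helper: merge on the month/year keys, dropping the duplicate
    # date column from the right-hand frame without mutating it.
    return pd.merge(left, right.drop(columns="index"), on=["month", "year"])

# The first frame plays the role of the initial copy; the rest are merged in order.
combined = reduce(append_rates, frames)
print(combined.columns.tolist())
```

With this approach no cell has to be commented out between runs, at the cost of loading every country's frame before merging.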

In [86]:
# run only for the first country's df_ir; comment out afterwards
df = df_ir.copy()

In [91]:
# run for the second country's df_ir onward
df = append_df(df, df_ir)

In [92]:
df #240 rows

Unnamed: 0,index,AUD_IR,month,year,NZD_IR,GBP_IR,BRL_IR,CND_IR,CNY_IR,IDR_IR,KRW_IR,MXN_IR,ZAR_IR,DKK_IR,JPY_IR,NOK_IR,SEK_IR,CHF_IR,USD_IR
0,2000-01-01,0.0500,1,2000,0.0512,0.053527,0.190,0.047700,0.0324,0.1148,0.0478,0.1529,0.1175,0.031547,0.000205,0.0584,0.0198,0.011843,0.0545
1,2000-02-01,0.0548,2,2000,0.0526,0.058408,0.190,0.049714,0.0324,0.1113,0.0502,0.1518,0.1175,0.034249,0.000345,0.0592,0.0200,0.019418,0.0573
2,2000-03-01,0.0550,3,2000,0.0551,0.057620,0.185,0.052486,0.0324,0.1103,0.0510,0.1367,0.1175,0.036486,0.000214,0.0587,0.0200,0.016124,0.0585
3,2000-04-01,0.0572,4,2000,0.0582,0.059347,0.185,0.052586,0.0324,0.1100,0.0510,0.1248,0.1175,0.038424,0.000200,0.0611,0.0247,0.026279,0.0602
4,2000-05-01,0.0598,5,2000,0.0624,0.059122,0.185,0.057529,0.0324,0.1108,0.0512,0.1251,0.1175,0.039739,0.000210,0.0620,0.0250,0.021906,0.0627
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
235,2019-08-01,0.0100,8,2019,0.0110,0.007101,0.060,0.017481,0.0290,0.0554,0.0151,0.0658,0.0650,-0.005500,-0.000450,0.0125,-0.0015,-0.010500,0.0213
236,2019-09-01,0.0100,9,2019,0.0096,0.007098,0.055,0.017454,0.0290,0.0524,0.0152,0.0646,0.0650,-0.005500,-0.000590,0.0135,-0.0015,-0.010000,0.0204
237,2019-10-01,0.0076,10,2019,0.0097,0.007108,0.050,0.017472,0.0290,0.0504,0.0135,0.0625,0.0650,-0.006300,-0.000210,0.0150,-0.0015,-0.009100,0.0183
238,2019-11-01,0.0075,11,2019,0.0099,0.007105,0.050,0.017487,0.0290,0.0483,0.0128,0.0611,0.0650,-0.006400,-0.000430,0.0150,-0.0015,-0.008300,0.0155


In [103]:
df.to_csv('<path to save interest rate dataset>', index=False)

- **forex_path**: path of the dataset with exchange rates

In [93]:
forex_path = '<path of main dataset>'

In [101]:
df_with_ir = data_combine(forex_path, df)
print(df_with_ir.shape) #4997 rows
print(df_with_ir.isna().sum())

(4997, 21)
(4997, 37)
Time Series    0
AUD_USD        0
NZD_USD        0
GBP_USD        0
BRL_USD        0
CND_USD        0
CNY_USD        0
IDR_USD        0
KRW_USD        0
MXN_USD        0
ZAR_USD        0
DKK_USD        0
JPY_USD        0
NOK_USD        0
SEK_USD        0
CHF_USD        0
month          0
year           0
USD_USD        0
price_gold     0
fc_year        0
AUD_IR         0
NZD_IR         0
GBP_IR         0
BRL_IR         0
CND_IR         0
CNY_IR         0
IDR_IR         0
KRW_IR         0
MXN_IR         0
ZAR_IR         0
DKK_IR         0
JPY_IR         0
NOK_IR         0
SEK_IR         0
CHF_IR         0
USD_IR         0
dtype: int64


In [102]:
df_with_ir.to_csv('<path to save the new main dataset 1>', index=False)