## Imports

In [None]:
!pip install transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting transformers
  Downloading transformers-4.24.0-py3-none-any.whl (5.5 MB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading tokenizers-0.13.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB)
Collecting huggingface-hub<1.0,>=0.10.0
  Downloading huggingface_hub-0.10.1-py3-none-any.whl (163 kB)
Installing collected packages: tokenizers, huggingface-hub, transformers
Successfully installed huggingface-hub-0.10.1 tokenizers-0.13.1 transformers-4.24.0


In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch

from transformers import pipeline, AutoModel, AutoTokenizer

plt.style.use('seaborn-whitegrid')

In [None]:
from google.colab import drive
drive.mount('/content/gdrive/')

Mounted at /content/gdrive/


In [None]:
BASE_PATH = "gdrive/MyDrive/ChamoAnalytics/data/dataset"

In [None]:
OUT_PATH = "gdrive/MyDrive/ChamoAnalytics/results"

## Processing the data

### Processing CanBank/monetary_policy_report.csv



In [None]:
monetary_policy_reports = pd.read_csv(f"{BASE_PATH}/CanBank/monetary_policy_report.csv")

In [None]:
monetary_policy_reports.iloc[0]['text']

"Monetary Policy\r\nReport\r\n\r\nJanuary 2021\r\n\r\n\x0cCanada’s infl ation-control strategy\r\nCanada’s infl ation-control strategy1\r\nInfl ation targeting and the economy\r\n\uf0a7 The Bank’s mandate is to conduct monetary policy to promote the\r\n\r\neconomic and fi nancial well-being of Canadians.\r\n\r\n\uf0a7 Canada’s experience with infl ation targeting since 1991 has shown\r\nthat the best way to foster confi dence in the value of money and to\r\ncontribute to sustained economic growth, employment gains and\r\nimproved living standards is by keeping infl ation low, stable and\r\npredictable.\r\n\r\n\uf0a7\r\n\r\nIn 2016, the Government and the Bank of Canada renewed\r\nCanada’s infl ation-control target for a further fi ve-year period, ending\r\nDecember 31, 2021. The target, as measured by the rate of infl ation\r\nof the consumer price index (CPI), remains at the 2\xa0percent midpoint\r\nof the control range of 1 to 3 percent.\r\n\r\nMonetary policy tools\r\n\uf0a7 Monetar

### Processing CanBank/can_bank_statements.csv

In [None]:
can_bank_statements = pd.read_csv(f"{BASE_PATH}/CanBank/can_bank_statements.csv")

In [None]:
can_bank_statements.iloc[0]['text']

'MONETARY\r\nPOLICY\r\nREPORT\r\n\r\nJuly 2014\r\n\r\n\x0cCanada’s Inﬂ ation-Control Strategy1\r\n\r\nInfl ation targeting and the economy\r\n•  the Bank’s mandate is to conduct monetary policy to pro-\r\nmote the economic and ﬁ nancial well-being of Canadians . \r\n\r\n•  Canada’s experience with infl ation targeting since 1991 \r\nhas shown that the best way to foster conﬁ dence in the \r\nvalue of money and to contribute to sustained economic \r\ngrowth, employment gains and improved living standards \r\nis by keeping infl ation low, stable and predictable . \r\n\r\n• \r\n\r\nIn 2011, the Government and the Bank of Canada renewed \r\nCanada’s infl ation-control target for a further ﬁ ve-year \r\nperiod, ending 31 december 2016 . the target, as measured \r\nby the total consumer price index (CPI), remains at the \r\n2\xa0per cent midpoint of the control range of 1 to 3 per cent .\r\n\r\nThe monetary policy instrument\r\n•  the Bank carries out monetary policy through changes \r\nin t

### Processing FOMC/meeting_script.csv

In [None]:
meeting_script = pd.read_csv(f"{BASE_PATH}/FOMC/meeting_script.csv")

In [None]:
meeting_script.iloc[0]['contents']



### Processing FOMC/minutes.csv

In [None]:
minutes = pd.read_csv(f"{BASE_PATH}/FOMC/minutes.csv")

In [None]:
minutes.iloc[0]['contents']

'A meeting of the Federal Open Market Committee was held in \r\n    the offices of the Board of Governors of the Federal Reserve System in \r\n    Washington, D.C., on Tuesday, February 2, 1993, at 2:30 p.m. and was \r\n    continued on Wednesday, February 3, 1993, at 9:00 a.m.\n\n[SECTION]\n\nPRESENT:\n\n[SECTION]\n\nMr. Greenspan, Chairman\r\n      Mr. Corrigan, Vice Chairman\r\n      Mr. Angell\r\n      Mr. Boehne\r\n      Mr. Keehn\r\n      Mr. Kelley\r\n      Mr. LaWare\r\n      Mr. Lindsey\r\n      Mr. McTeer\r\n      Mr. Mullins\r\n      Ms. Phillips\r\n      Mr. Stern\n\n[SECTION]\n\nMessrs. Broaddus, Jordan, Forrestal, and Parry, Alternate\r\n      Members of the Federal Open Market Committee\n\n[SECTION]\n\nMessrs. Hoenig, Melzer, and Syron, Presidents of the Federal\r\n      Reserve Banks of Kansas City, St. Louis, and Boston,\r\n      respectively\n\n[SECTION]\n\nMr. Kohn, Secretary and Economist\r\n      Mr. Bernard, Deputy Secretary\r\n      Mr. Coyne, Assistant Secretary

### Processing FOMC/presconf_script.csv



In [None]:
presconf_script = pd.read_csv(f"{BASE_PATH}/FOMC/presconf_script.csv")

In [None]:
presconf_script.iloc[0]['contents']

'CHAIRMAN BERNANKE.  Good afternoon.  Welcome.  In my opening remarks, I’d like to briefly first review today’s policy decision.  I’ll then turn next to the Federal Open Market Committee’s quarterly economic projections also being released today, and I’ll place today’s policy decision in the context of the Committee’s projections and the Federal Reserve’s statutory mandate to foster maximum employment and price stability.  I’ll then be glad to take your questions.  Throughout today’s briefing, my goal will be to reflect the consensus of the Committee, while taking note of the diversity of views as appropriate.  Of course, my remarks and interpretations are my own responsibility.   In its policy statement released earlier today, the Committee announced, first, that it is maintaining its existing policy of reinvesting principal payments from its security holdings, and, second, that it will complete its planned purchases of $600 billion of longer-term Treasury securities by the end of the

### Processing FOMC/speech.csv

In [None]:
speech = pd.read_csv(f"{BASE_PATH}/FOMC/speech.csv")

In [None]:
speech.iloc[0]['contents']

'Remarks by Chairman Alan Greenspan\nBank supervision in a world economy\r\nAt the International Conference of Banking Supervisors, Stockholm, Sweden\r\nJune 13, 1996\n\n[SECTION]\n\n\n\n[SECTION]\n\nI am honored to present the William Taylor Memorial\r\n     Lecture to such a distinguished group of senior bank supervisors\r\n     from around the world.  I am especially delighted to have with us\r\n     Bill\'s wife, Sharon, and daughter, Claire.  This visit gives them\r\n     the opportunity to meet more of Bill\'s colleagues and to\r\n     appreciate, once again, the great importance of the work he did.\n\n[SECTION]\n\nThose of you who had the opportunity to know Bill can\r\n     recall him as a dedicated bank supervisor and an outstanding\r\n     public servant.  We in the United States were certainly fortunate\r\n     to have had him lead our bank supervisory functions at the\r\n     Federal Reserve and the FDIC while the U.S. banking system was\r\n     experiencing quite difficult

### Processing FOMC/statement.csv

In [None]:
statement = pd.read_csv(f"{BASE_PATH}/FOMC/statement.csv")

In [None]:
statement.iloc[0]['contents']

"\n\n[SECTION]\n\nChairman Alan Greenspan announced today that the Federal Open Market Committee decided to increase slightly the degree of pressure on reserve positions. The action is expected to be associated with a small increase in short-term money market interest rates.\n\n[SECTION]\n\nThe decision was taken to move toward a less accommodative stance in monetary policy in order to sustain and enhance the economic expansion.\n\n[SECTION]\n\nChairman Greenspan decided to announce this action immediately so as to avoid any misunderstanding of the Committee's purposes, given the fact that this is the first firming of reserve market conditions by the Committee since early 1989.\n\n[SECTION]\n\n"

### Processing FOMC/testimony.csv

In [None]:
testimony = pd.read_csv(f"{BASE_PATH}/FOMC/testimony.csv")

In [None]:
testimony.iloc[0]['contents']

"\n\n[SECTION]\n\n\n\n[SECTION]\n\n\n\n[SECTION]\n\n\n\n[SECTION]\n\nIt is a pleasure to appear before this Subcommittee to discuss\r\n\r\n     the supervision of bank sales practices on behalf of the Federal\r\n\r\n     Reserve.  The recent publication of various survey results has\r\n\r\n     focused attention on the performance of the banking and\r\n\r\n     securities industries in educating customers about the critical\r\n\r\n     differences between FDIC-insured deposits and uninsured\r\n\r\n     investment products sold on bank premises.\n\n[SECTION]\n\nThe Board has a long history of concerns about possible\r\n\r\n     customer confusion between insured deposit instruments and\r\n\r\n     uninsured investment products sold on bank premises.  We have\r\n\r\n     worked and continue to work diligently to minimize customer\r\n\r\n     confusion through a number of supervisory and educational\r\n\r\n     initiatives.  These initiatives include coordination among the\r\n\r\n     ban

### Processing FXUSDCAD.csv

In [None]:
exchange_rate_USDCAD = pd.read_csv(f"{BASE_PATH}/FXUSDCAD.csv")

In [None]:
exchange_rate_USDCAD

Unnamed: 0,date,FXUSDCAD
0,2017-01-03,1.3435
1,2017-01-04,1.3315
2,2017-01-05,1.3244
3,2017-01-06,1.3214
4,2017-01-09,1.3240
...,...,...
1454,2022-10-28,1.3615
1455,2022-10-31,1.3649
1456,2022-11-01,1.3614
1457,2022-11-02,1.3630


## Sentiment Analysis

In [None]:
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

In [None]:
# Use GPU 0 when CUDA is available; -1 tells the pipeline to run on the CPU.
gen_sentiment_analysis = pipeline("sentiment-analysis", model="siebert/sentiment-roberta-large-english", device=0 if use_cuda else -1, return_all_scores=True)


In [None]:
# Tokenizers run on the CPU, so no device argument is needed here.
gen_tokenizer = AutoTokenizer.from_pretrained("siebert/sentiment-roberta-large-english")

In [None]:
# Use GPU 0 when CUDA is available; -1 tells the pipeline to run on the CPU.
fin_sentiment_analysis = pipeline("sentiment-analysis", model="ProsusAI/finbert", device=0 if use_cuda else -1, return_all_scores=True)


In [None]:
# Tokenizers run on the CPU, so no device argument is needed here.
fin_tokenizer = AutoTokenizer.from_pretrained("ProsusAI/finbert")

In [None]:
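def predict_fin_sentiment(sent_pipeline, tokenizer, text):
    """Score a long text with FinBERT: 0 = negative, 0.5 = neutral, 1 = positive."""
    # Tokenize the whole document without special tokens so it can be split freely.
    encoded_text = tokenizer(text, add_special_tokens=False)
    tokenized_text = tokenizer.convert_ids_to_tokens(encoded_text.input_ids)

    # Keep a margin below the model limit; the pipeline adds special tokens
    # when it re-tokenizes each chunk.
    max_len = tokenizer.model_max_length - 10
    size = len(tokenized_text)

    # Split the tokens into consecutive chunks of at most max_len tokens.
    tokenized_texts = [tokenized_text[i:i + max_len] for i in range(0, size, max_len)]

    # The pipeline expects raw strings, so turn each chunk back into text.
    sents = [tokenizer.convert_tokens_to_string(chunk) for chunk in tokenized_texts]

    sentiments = sent_pipeline(sents)

    # Sum the per-label scores over all chunks.
    sum_sentiments = {'negative': 0, 'neutral': 0, 'positive': 0}
    for sentiment in sentiments:
        for feel in sentiment:
            sum_sentiments[feel['label']] += feel['score']

    # Weighted average of the labels, normalized by the total score mass.
    total_sentiment = sum(sum_sentiments.values())
    avg_sentiment = (0 * sum_sentiments['negative'] + 0.5 * sum_sentiments['neutral'] + 1 * sum_sentiments['positive']) / total_sentiment

    return avg_sentiment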
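
To make the aggregation step easier to check by hand, here is a minimal sketch that reproduces only the weighted average, without loading any model. The helper name `weighted_sentiment`, the `FIN_WEIGHTS` mapping and the example scores are hypothetical; the input is assumed to have the shape the pipeline returns with `return_all_scores=True` (one list of `{'label', 'score'}` dicts per chunk).

```python
def weighted_sentiment(chunk_scores, weights):
    """Collapse per-chunk label scores into a single weighted average."""
    # Sum the scores of each label over all chunks.
    totals = {label: 0.0 for label in weights}
    for chunk in chunk_scores:
        for entry in chunk:
            totals[entry["label"]] += entry["score"]

    total = sum(totals.values())
    if total == 0:
        # No chunks (for example an empty document): no score can be computed.
        return float("nan")

    # Weight each label and normalize by the total score mass.
    return sum(weights[label] * totals[label] for label in weights) / total


# Hypothetical weights mirroring the FinBERT mapping used above.
FIN_WEIGHTS = {"negative": 0.0, "neutral": 0.5, "positive": 1.0}

# Made-up scores for two chunks, for illustration only.
example_scores = [
    [{"label": "positive", "score": 0.7},
     {"label": "negative", "score": 0.1},
     {"label": "neutral", "score": 0.2}],
    [{"label": "positive", "score": 0.1},
     {"label": "negative", "score": 0.6},
     {"label": "neutral", "score": 0.3}],
]

example_value = weighted_sentiment(example_scores, FIN_WEIGHTS)
print(example_value)  # roughly 0.525: (0.5 * 0.5 + 1.0 * 0.8) / 2.0
```

Unlike the function above, this sketch returns NaN instead of raising a `ZeroDivisionError` when there are no chunks; that guard is an addition, not part of the original notebook.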

In [None]:
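def predict_gen_sentiment(sent_pipeline, tokenizer, text):
    """Score a long text with the general model: 0 = negative, 1 = positive."""
    # Tokenize the whole document without special tokens so it can be split freely.
    encoded_text = tokenizer(text, add_special_tokens=False)
    tokenized_text = tokenizer.convert_ids_to_tokens(encoded_text.input_ids)

    # Reserve two positions for the special tokens the pipeline adds to each chunk.
    max_len = tokenizer.model_max_length - 2
    size = len(tokenized_text)

    # Split the tokens into consecutive chunks of at most max_len tokens.
    tokenized_texts = [tokenized_text[i:i + max_len] for i in range(0, size, max_len)]

    # The pipeline expects raw strings, so turn each chunk back into text.
    sents = [tokenizer.convert_tokens_to_string(chunk) for chunk in tokenized_texts]

    sentiments = sent_pipeline(sents)

    # Sum the per-label scores over all chunks.
    sum_sentiments = {'NEGATIVE': 0, 'POSITIVE': 0}
    for sentiment in sentiments:
        for feel in sentiment:
            sum_sentiments[feel['label']] += feel['score']

    # Share of the total score mass assigned to POSITIVE.
    total_sentiment = sum(sum_sentiments.values())
    avg_sentiment = (0 * sum_sentiments['NEGATIVE'] + 1 * sum_sentiments['POSITIVE']) / total_sentiment

    return avg_sentiment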
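
The chunking used by both prediction functions can be illustrated on a plain list, with no tokenizer involved. This is only a sketch: `split_into_chunks` is a hypothetical helper, the token strings are placeholders, and the chunk size of 502 assumes a `model_max_length` of 512 minus the margin of 10 used for FinBERT.

```python
def split_into_chunks(tokens, max_len):
    """Split a token list into consecutive chunks of at most max_len items."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    # Step through the list in strides of max_len; the last chunk may be shorter.
    return [tokens[start:start + max_len] for start in range(0, len(tokens), max_len)]


# Placeholder tokens standing in for a tokenized document.
tokens = [f"tok{i}" for i in range(1200)]

# Assumed chunk size: 512 (model limit) - 10 (margin) = 502.
chunks = split_into_chunks(tokens, 502)

print(len(chunks))               # 3
print([len(c) for c in chunks])  # [502, 502, 196]
```

The number of chunks matches the ceiling division `(size // max_len) + (size % max_len != 0)`, and an empty token list yields no chunks at all.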

In [None]:
# TODO: 
# model_pipeline = pipeline("text-classification", model=model, tokenizer=tokenizer, device=0, return_all_scores=True)

# tokenizer_kwargs = {'padding':True,'truncation':True,'max_length':512,'return_tensors':'pt'}

# prediction = model_pipeline('sample text to predict',**tokenizer_kwargs)

### Analysis of CanBank/monetary_policy_report.csv 

In [None]:
monetary_policy_reports_sentiment_results = monetary_policy_reports.copy()

In [None]:
monetary_policy_reports_sentiment_results['fin_sentiment'] = float('NaN')

In [None]:
for idx, item in monetary_policy_reports.iterrows():
    sentiment = predict_fin_sentiment(fin_sentiment_analysis, fin_tokenizer, item['text'])
    print(idx, sentiment)
    monetary_policy_reports_sentiment_results.loc[idx, 'fin_sentiment'] = sentiment

0 0.5184578730623643
1 0.34097209455466587
2 0.24263447436475122
3 0.3362908244193005
4 0.24499734663910516
5 0.45114706195977927
6 0.48592653405432906
7 0.39687580366372094
8 0.4383694795867561
9 0.3172228075469145
10 0.3363231686233246
11 0.5758708801228957
12 0.3282633277509836
13 0.41402397745860003
14 0.49558834185531414
15 0.36320098132069445
16 0.4700314027098188
17 0.37519783719524696
18 0.43662104397116125
19 0.44206309577200537
20 0.41984348613707684
21 0.3360960306633479
22 0.3044515428946183
23 0.31144206934102253
24 0.5214128384358407
25 0.5360540300505887
26 0.428968558441116
27 0.2587928722704856


In [None]:
monetary_policy_reports_sentiment_results.to_csv(f"{OUT_PATH}/monetary_policy_reports.csv")

In [None]:
monetary_policy_reports_sentiment_results['gen_sentiment'] = float('NaN')

In [None]:
for idx, item in monetary_policy_reports.iterrows():
    sentiment = predict_gen_sentiment(gen_sentiment_analysis, gen_tokenizer, item['text'])
    print(idx, sentiment)
    monetary_policy_reports_sentiment_results.loc[idx, 'gen_sentiment'] = sentiment

Token indices sequence length is longer than the specified maximum sequence length for this model (22623 > 512). Running this sequence through the model will result in indexing errors


0 0.9060360197821731
1 0.80771540832594
2 0.7524197973474983
3 0.7829088667983558
4 0.8567787397311
5 0.8916587472067687
6 0.8940512469686298
7 0.737348254002062
8 0.9217970596206514
9 0.7795367031145802
10 0.7312404052625084
11 0.914190770957121
12 0.8367140197840996
13 0.9736847585391878
14 0.9732455094460379
15 0.8295653752640607
16 0.9245692262420625
17 0.8467519065171083
18 0.8482082104837788
19 0.9270659215409649
20 0.8044919781874894
21 0.8855572782582251
22 0.7679810346325835
23 0.8791815366478278
24 0.9103669589241271
25 0.863653171219789
26 0.8347406577180753
27 0.7194906774074318


In [None]:
monetary_policy_reports_sentiment_results.to_csv(f"{OUT_PATH}/monetary_policy_reports.csv")

### Analysis of CanBank/can_bank_statements.csv 

In [None]:
can_bank_statements_sentiment_results = can_bank_statements.copy()

In [None]:
can_bank_statements_sentiment_results['fin_sentiment'] = float('NaN')

In [None]:
for idx, item in can_bank_statements.iterrows():
    sentiment = predict_fin_sentiment(fin_sentiment_analysis, fin_tokenizer, item['text'])
    print(idx, sentiment)
    can_bank_statements_sentiment_results.loc[idx, 'fin_sentiment'] = sentiment

Token indices sequence length is longer than the specified maximum sequence length for this model (16336 > 512). Running this sequence through the model will result in indexing errors


0 0.4204229953918228
1 0.41038768486543886
2 0.35188868378527904
3 0.3163179081205288
4 0.379334287886353
5 0.5371179301548835
6 0.3787179950193509
7 0.5039095211056135
8 0.3445800399679772
9 0.3109541381119383
10 0.565733599841904
11 0.5290484771933963
12 0.45004769526030713
13 0.3480617450307568
14 0.4129315979796043
15 0.43852015050269316
16 0.4831246437803667
17 0.459216736091335
18 0.41036405152689703
19 0.5794011730835285
20 0.5112620300876723
21 0.28820599501735605
22 0.28621395926541515
23 0.5046986012533368
24 0.30361423171218804
25 0.5968241059105853
26 0.5889598148371038
27 0.25923359343778735
28 0.2921010015156305
29 0.4748647980124588
30 0.446638824351803
31 0.397109131944482
32 0.44546687988465605
33 0.3837020355063354
34 0.39192663543044165
35 0.5200933106716629
36 0.3683544071532381
37 0.4697196058470583
38 0.34354455489552077
39 0.5384489277860067
40 0.4917136755280058
41 0.43038080949113666
42 0.4070759457902397
43 0.3306215476128939
44 0.43212755076780984
45 0.3773382335574425
46 0.44146646875968976
47 0.4328616455814775
48 0.4995316807668161
49 0.5283863563236659
50 0.33485470735482253
51 0.2480966925992331
52 0.33162742140536094
53 0.5254440257824246
54 0.3069826912710276

In [None]:
can_bank_statements_sentiment_results.to_csv(f"{OUT_PATH}/can_bank_statements.csv")

In [None]:
can_bank_statements_sentiment_results['gen_sentiment'] = float('NaN')

In [None]:
for idx, item in can_bank_statements.iterrows():
    sentiment = predict_gen_sentiment(gen_sentiment_analysis, gen_tokenizer, item['text'])
    print(idx, sentiment)
    can_bank_statements_sentiment_results.loc[idx, 'gen_sentiment'] = sentiment

Token indices sequence length is longer than the specified maximum sequence length for this model (30325 > 512). Running this sequence through the model will result in indexing errors


0 0.7333943132397961
1 0.8081476045509532
2 0.8946921633037076
3 0.703847312270714
4 0.7666767621498237
5 0.9127344325626778
6 0.8275638096907609
7 0.9474534507170549
8 0.7742880418721454
9 0.6087229814077303
10 0.9418393364562856
11 0.9180822055473048
12 0.9253610460072103
13 0.7839986677271567
14 0.7921640929813224
15 0.9162331924651611
16 0.9946743582037209
17 0.9928784678089816
18 0.7694574169924586
19 0.9401141860983323
20 0.894111046081688
21 0.7193724516506412
22 0.8165355834415852
23 0.9048901945745833
24 0.7532726778536745
25 0.9023432644303552
26 0.9940844897758563
27 0.7396400388406716
28 0.6708113769164821
29 0.9405461477729962
30 0.8656873550638262
31 0.8350863936677057
32 0.8778682823097735
33 0.8751789049339601
34 0.7754334154710419
35 0.9452226524861328
36 0.8084952387788745
37 0.8956381114105629
38 0.8100839625164344
39 0.9463055817459228
40 0.9042290138279041
41 0.8453688053050145
42 0.9189600629832216
43 0.9178871961558379
44 0.9275218238896259
45 0.8483047343975482


In [None]:
can_bank_statements_sentiment_results.to_csv(f"{OUT_PATH}/can_bank_statements.csv")

### Analysis of FOMC/minutes.csv

In [None]:
minutes_sentiment_results = minutes.copy()

In [None]:
minutes_sentiment_results['fin_sentiment'] = float('NaN')

In [None]:
for idx, item in minutes.iterrows():
    sentiment = predict_fin_sentiment(fin_sentiment_analysis, fin_tokenizer, item['contents'])
    print(idx, sentiment)
    minutes_sentiment_results.loc[idx, 'fin_sentiment'] = sentiment

0 0.43113205532735277
1 0.3264678447253472
2 0.258085805782067
3 0.3594591279421873
4 0.3322222100734643
5 0.32432106793070165
6 0.435830918308735
7 0.551245306357966
8 0.49027709605821534
9 0.37329432632221327
10 0.37329432632221327
11 0.47966800386290076
12 0.3702724527863844
13 0.5548154083310345
14 0.4550059044204572
15 0.5830889915076077
16 0.5979980143872491
17 0.4649346896534201
18 0.33659634203201294
19 0.34712560010116317
20 0.4697644177022829
21 0.5609125472969774
22 0.43274728495004183
23 0.34042021496373814
24 0.36675774065847605
25 0.45849078125342146
26 0.4135347737714986
27 0.5237451211164592
28 0.419145355655565
29 0.31764993316108864
30 0.4503632848690316
31 0.41099279053491844
32 0.5742546233060808
33 0.49961394188540764
34 0.4931928443516141
35 0.5016341003856156
36 0.4330776010515844
37 0.477093973121264
38 0.4890250257495199
39 0.48333016495844194
40 0.5316563212218522
41 0.4548084279787683
42 0.44061810597078366
43 0.32531719702951123
44 0.39475091914065197
45 0.3

In [None]:
minutes_sentiment_results.to_csv(f"{OUT_PATH}/minutes.csv")

In [None]:
minutes_sentiment_results['gen_sentiment'] = float('NaN')

In [None]:
for idx, item in minutes.iterrows():
    sentiment = predict_gen_sentiment(gen_sentiment_analysis, gen_tokenizer, item['contents'])
    print(idx, sentiment)
    minutes_sentiment_results.loc[idx, 'gen_sentiment'] = sentiment

0 0.913638012270127
1 0.7715853424723685
2 0.7870300360332011
3 0.9919684876199621
4 0.8911359148306911
5 0.8948657548793517
6 0.9519177162854818
7 0.9912795772510068
8 0.9687165682757236
9 0.7984258662369187
10 0.7984258662369187
11 0.937916638137671
12 0.9570318969525862
13 0.9963651428764759
14 0.9354362128494076
15 0.8924616200786276
16 0.9453222585655181
17 0.9303287034927552
18 0.861820410386246
19 0.8377105400633503
20 0.9577522205056771
21 0.9896245449729791
22 0.9448836278938282
23 0.8402650323224714
24 0.975307040793688
25 0.9438591534107511
26 0.9967229141829533
27 0.9242005386504639
28 0.9926814712377351
29 0.8304368040275114
30 0.9933184157184214
31 0.934163689703269
32 0.9671451683939175
33 0.9937410651578967
34 0.9295125307749178
35 0.9698800623888277
36 0.9498036411674987
37 0.9957159016825391
38 0.9343371343475771
39 0.9273977437727053
40 0.9935104267488359
41 0.8955922365143573
42 0.8767263809263152
43 0.8825084777675283
44 0.9458788625611009
45 0.9368480645303765
46 

In [None]:
minutes_sentiment_results.to_csv(f"{OUT_PATH}/minutes.csv")

### Analysis of FOMC/presconf_script.csv

In [None]:
presconf_script_sentiment_results = presconf_script.copy()

In [None]:
presconf_script_sentiment_results['fin_sentiment'] = float('NaN')

In [None]:
for idx, item in presconf_script.iterrows():
    sentiment = predict_fin_sentiment(fin_sentiment_analysis, fin_tokenizer, item['contents'])
    print(idx, sentiment)
    presconf_script_sentiment_results.loc[idx, 'fin_sentiment'] = sentiment

0 0.44759217110572125
1 0.48911375281008196
2 0.44589632084791997
3 0.5156473181257889
4 0.473124129827837
5 0.4917305733883026
6 0.4997560252367778
7 0.48437809579952507
8 0.41986383350096046
9 0.502780208351864
10 0.4911142113683487
11 0.4467113925942843
12 0.41819283123075535
13 0.4293523975670179
14 0.3985043720099876
15 0.4790734479967422
16 0.5107837601324386
17 0.41125374263898706
18 0.41960962316481076
19 0.5357758260615605
20 0.4904126890497897
21 0.4978421425134013
22 0.5172524888251777
23 0.5042942633346561
24 0.5111774637241714
25 0.4660540655215852
26 0.48899168308061886
27 0.50129654190298
28 0.5052514398690824
29 0.46825514647375044
30 0.49739187265744955
31 0.4773886430688003
32 0.5410097586552927
33 0.546229119835208
34 0.4785899953739154
35 0.4179015721649998
36 0.4542932984466648
37 0.45504937905057863
38 0.5002086162575465
39 0.5301275347989524
40 0.5248033217242727
41 0.5033733889812004
42 0.5178685232272469
43 0.5075061258089241
44 0.5163309729706285
45 0.47450518

In [None]:
presconf_script_sentiment_results.to_csv(f"{OUT_PATH}/presconf_script.csv")

In [None]:
presconf_script_sentiment_results['gen_sentiment'] = float('NaN')

In [None]:
for idx, item in presconf_script.iterrows():
    sentiment = predict_gen_sentiment(gen_sentiment_analysis, gen_tokenizer, item['contents'])
    print(idx, sentiment)
    presconf_script_sentiment_results.loc[idx, 'gen_sentiment'] = sentiment

0 0.9249333296186323
1 0.9253674535838367
2 0.8515081428440767
3 0.9471602138364529
4 0.9507601660006454
5 0.8469150090511041
6 0.9935033133093132
7 0.9032118023499136
8 0.98420300860445
9 0.8822296161416749
10 0.9546261949255609
11 0.9955407105323889
12 0.9000087306679256
13 0.9935256680949149
14 0.9328542208123725
15 0.9807511796993928
16 0.8923768944127382
17 0.9058705336499486
18 0.9509528875923984
19 0.8989008113710836
20 0.8531201455133274
21 0.9949370616025348
22 0.9614188379362792
23 0.9968512899087548
24 0.9950466567793901
25 0.9463135789701167
26 0.9000709669695264
27 0.9480905985906443
28 0.9108340916121102
29 0.9653203081628005
30 0.9967776011482867
31 0.952450235270503
32 0.9599548506870462
33 0.9973404420831039
34 0.9475133121444361
35 0.9114857448161636
36 0.9394960413409748
37 0.9959536405305024
38 0.9937623744897853
39 0.955859996426175
40 0.8989359921548286
41 0.9960827944748935
42 0.9965169345842273
43 0.9958712648008349
44 0.9966651179117431
45 0.9603226311982004
46

In [None]:
presconf_script_sentiment_results.to_csv(f"{OUT_PATH}/presconf_script.csv")

### Analysis of FOMC/statement.csv

In [None]:
statement_sentiment_results = statement.copy()

In [None]:
statement_sentiment_results['fin_sentiment'] = float('NaN')

In [None]:
for idx, item in statement.iterrows():
    sentiment = predict_fin_sentiment(fin_sentiment_analysis, fin_tokenizer, item['contents'])
    print(idx, sentiment)
    statement_sentiment_results.loc[idx, 'fin_sentiment'] = sentiment

0 0.610876429405106
1 0.5821184434785859
2 0.564979855455308
3 0.5357449487121211
4 0.537660386132744
5 0.561012420756164
6 0.8085361046194256
7 0.13517611951908068
8 0.05953759932836595
9 0.41429948316260157
10 0.522166261911286
11 0.4121163616616889
12 0.2615739767950572
13 0.3151484679731322
14 0.4105533606127857
15 0.45994741161585967
16 0.46828634957009124
17 0.3679348833573029
18 0.47429229804437906
19 0.45355843446769006
20 0.44684687833310005
21 0.4270122451232526
22 0.41420997643979596
23 0.47352441523151945
24 0.45466361999200566
25 0.5232032519147497
26 0.46455877990742706
27 0.3380640582500085
28 0.28355960008683967
29 0.24986515752609498
30 0.23117667328632513
31 0.29537798067690285
32 0.32312088557598856
33 0.1547327509939497
34 0.220907201225098
35 0.1506061599885271
36 0.40259909437917296
37 0.42912650125456536
38 0.3309591975889743
39 0.49836568163062867
40 0.47783185005042955
41 0.48337749441333133
42 0.5048400378035011
43 0.5162918660358045
44 0.36909941450848915
45 

In [None]:
statement_sentiment_results.to_csv(f"{OUT_PATH}/statement.csv")

In [None]:
statement_sentiment_results['gen_sentiment'] = float('NaN')

In [None]:
for idx, item in statement.iterrows():
    sentiment = predict_gen_sentiment(gen_sentiment_analysis, gen_tokenizer, item['contents'])
    print(idx, sentiment)
    statement_sentiment_results.loc[idx, 'gen_sentiment'] = sentiment

0 0.9954305275358122
1 0.9892770540264285
2 0.9864577393784963
3 0.9907111746976787
4 0.9968707904548114
5 0.9944895505905151
6 0.9971698617123324
7 0.9939745908872835
8 0.9949880041471045
9 0.7192031790240536
10 0.9741937868442246
11 0.42734511800240815
12 0.7632775692492147
13 0.9486727418669187
14 0.6701123688109081
15 0.9748090948648237
16 0.8827078865106185
17 0.8819556112410398
18 0.8019678708625252
19 0.6367714691373678
20 0.8056790088380194
21 0.8060615446001121
22 0.9223641922316002
23 0.7453293506204415
24 0.7617689437626832
25 0.7598119631274012
26 0.5037365283629784
27 0.5728119040728378
28 0.9098116642790892
29 0.8423571412067395
30 0.8087749668428644
31 0.8319370488267274
32 0.8710653441211154
33 0.9915124905787177
34 0.9797320274829874
35 0.983193409060789
36 0.9937011112235375
37 0.9959777901337392
38 0.9961162103599205
39 0.9960183198941687
40 0.995923151484928
41 0.9971806603841912
42 0.9974102237202559
43 0.9957313036660773
44 0.5440646728811028
45 0.9959980473909554

In [None]:
statement_sentiment_results.to_csv(f"{OUT_PATH}/statement.csv")

### Analysis of FOMC/testimony.csv

In [None]:
testimony_sentiment_results = testimony.copy()

In [None]:
testimony_sentiment_results['fin_sentiment'] = float('NaN')

In [None]:
for idx, item in testimony.iterrows():
    sentiment = predict_fin_sentiment(fin_sentiment_analysis, fin_tokenizer, item['contents'])
    print(idx, sentiment)
    testimony_sentiment_results.loc[idx, 'fin_sentiment'] = sentiment

0 0.575679120697835
1 0.4400732680032159
2 0.5886216617605757
3 0.3932202994823445
4 0.5068910739891543
5 0.5074398566491529
6 0.42648880584515975
7 0.35860475055256164
8 0.43609232206207066
9 0.5077065713871103
10 0.3700725648595483
11 0.4672196735958991
12 0.48727596326248734
13 0.4577469024374379
14 0.48474484779004046
15 0.5003829269459366
16 0.5346949756882926
17 0.5174312232290066
18 0.5068665490635442
19 0.4708678459169062
20 0.43174870942085214
21 0.48001528533445137
22 0.5460070158757531
23 0.5512643660003341
24 0.4665534217650607
25 0.4630337647354233
26 0.5393290832341227
27 0.48905519630363836
28 0.47335456932331343
29 0.6201569160773827
30 0.49437594738132096
31 0.3720524599446259
32 0.5255836668117567
33 0.3300006950972715
34 0.49987700381938927
35 0.4310828453924055
36 0.46760452603437236
37 0.3168538889745148
38 0.2990540883287847
39 0.49969063028237964
40 0.40132607404991144
41 0.43700038767279714
42 0.38950173172732205
43 0.5283252680674195
44 0.45129879521411453
45 0

In [None]:
testimony_sentiment_results.to_csv(f"{OUT_PATH}/testimony.csv")

In [None]:
testimony_sentiment_results['gen_sentiment'] = float('NaN')

In [None]:
for idx, item in testimony.iterrows():
    sentiment = predict_gen_sentiment(gen_sentiment_analysis, gen_tokenizer, item['contents'])
    print(idx, sentiment)
    testimony_sentiment_results.loc[idx, 'gen_sentiment'] = sentiment

0 0.9967998896689902
1 0.5107453722802243
2 0.9962410545771384
3 0.897642448174809
4 0.8261005217302443
5 0.9340565128659039
6 0.5626665555955854
7 0.8357434035992478
8 0.7457348161759532
9 0.8306975084517026
10 0.8297632224375479
11 0.7137058850701257
12 0.9056076732267546
13 0.8956595869181557
14 0.9948545529119923
15 0.9969977761620678
16 0.9902983866416969
17 0.9957669318786089
18 0.8845998924002445
19 0.8513830736788311
20 0.9968401505838353
21 0.9965610573554994
22 0.9980100551127604
23 0.9980343698937191
24 0.997293069067956
25 0.6717422938575067
26 0.714912919170081
27 0.9478460672485338
28 0.886199375788527
29 0.9981506442652225
30 0.9950289810090507
31 0.6927250666481346
32 0.9810943545014456
33 0.7513288587391714
34 0.8320457310251352
35 0.751582624689231
36 0.6670603219700829
37 0.6964398337718092
38 0.7440085151369235
39 0.9453709589818305
40 0.7782850851753925
41 0.9950985568050607
42 0.9861110203702197
43 0.7488338420326854
44 0.9877409585804594
45 0.8782436548403244
46 

In [None]:
testimony_sentiment_results.to_csv(f"{OUT_PATH}/testimony.csv")