In [7]:
import os
import logging
import pandas as pd
import numpy as np
import torch
from transformers import (
    AutoModelForSequenceClassification,
    BertTokenizer,
    pipeline
)
from finbert.finbert import predict
import nltk

logger = logging.getLogger()
logger.setLevel(logging.CRITICAL)
nltk.download('punkt')

def softmax(x):
    """Compute row-wise softmax for a 2-D array of scores (one row per sample)."""
    # Subtract each row's maximum before exponentiating to avoid overflow.
    e_x = np.exp(x - np.max(x, axis=1)[:, None])
    return e_x / np.sum(e_x, axis=1)[:, None]

[nltk_data] Downloading package punkt to /home/oonisim/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
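
The `softmax` helper above subtracts each row's maximum before exponentiating. A minimal pure-Python sketch of why that matters follows; `naive_softmax` and `stable_softmax` are illustrative names, not part of the notebook, and they work on a single row rather than a NumPy array.

```python
import math

def naive_softmax(row):
    """Softmax without any shift: overflows for large scores."""
    exps = [math.exp(v) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def stable_softmax(row):
    """Softmax with the row maximum subtracted first (same trick as above)."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

small = [1.0, 2.0, 3.0]
large = [1000.0, 1001.0, 1002.0]  # same gaps between scores, shifted by 999

# math.exp(1000.0) exceeds the float range, so the naive version fails.
try:
    naive_softmax(large)
    naive_ok = True
except OverflowError:
    naive_ok = False

print("naive version survived large scores:", naive_ok)
print("stable, small scores:", stable_softmax(small))
print("stable, large scores:", stable_softmax(large))
```

Softmax depends only on the differences between scores, so the shifted version returns the same probabilities for both rows.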


---
# FinBERT - Financial Sentiment Prediction

* [Github FinBERT: Financial Sentiment Analysis with BERT](https://github.com/ProsusAI/finBERT)

> FinBERT is a pre-trained NLP model to analyze sentiment of financial text. It is built by further training the BERT language model in the finance domain, using a large financial corpus and thereby fine-tuning it for financial sentiment classification. For the details, please see [FinBERT: Financial Sentiment Analysis with BERT](https://medium.com/prosus-ai-tech-blog/finbert-financial-sentiment-analysis-with-bert-b277a3607101)
>
> FinBERT sentiment analysis model is now available on Hugging Face model hub. You can get the model here. Or, you can download the models from the links below:
>
>* Language model trained on TRC2
>* Sentiment analysis model trained on Financial PhraseBank



---
## Example: sentiment analysis of an ABC News article
* [Bitcoin back above $40k as US regulates; ASX closes flat as US stocks end losing streak](https://www.abc.net.au/news/2021-05-21/bitcoin-regulation-us-sharemarket-asx-200/100154742)

In [2]:
text = """Bitcoin back above $40k as US regulates; ASX closes flat as US stocks end losing streak. 
The Australian share market recovered from its earlier losses to end the week trading higher, 
after stocks on Wall Street snapped a three-day losing streak with help from technology companies. 
After a choppy day, the benchmark the ASX 200 closed 0.15 per cent up at 7,030, 
while the broader All Ordinaries rose 0.18 per cent to 7,265. 
Education, energy, basic materials, real estate and energy stocks weighed on the market. 
EML payments was the top performer (+15.9pc) on the ASX 200, after its share price halved yesterday, 
following Irish regulators raising concerns about anti-money laundering compliance in its subsidiary there. 
Webjet and A2 Milk Company gained 5.3 per cent, followed by Xero (+4.1pc) and Corporate Travel Management (+3.9pc). Afterpay closed down 0.9 per cent after rising 6.9 per cent in early trading. Retailer Kogan's shares had tumbled 13.9 per cent after the company announced its earnings would be well below market expectations. Next in line for losses were Nufarm (-6.9pc), Appen (-5.2pc) and Redbubble (-5.2pc). The Australian dollar was down a quarter of a per cent and worth 77.55 US cents at 4:20pm AEST. "We still expect AUD can lift towards 0.80 by quarter end," ANZ analysts said in a briefing this morning. 
"We judge it is undervalued against its fundamentals." Spot gold was flat and selling for $US1,875 an ounce.

US markets end three-day losing streak All the major US markets rose, with the tech-laden 
Nasdaq holding up the market with a 1.7 per cent gain. Over in Europe, all three majors were up, too. 
"Equity markets in the London, Europe and the US were back in the black today," NAB analysts said. 
"That was helped by the combination of some further easing in oil prices, general market stability 
including bitcoin and especially in the US session, a marked reduction in US Treasury yields, mainly 
from a pull-back in market-implied inflation rates." The spot price of Brent crude oil was up by 0.25
per cent to $US65.28 a barrel, while West Texas crude oil was unchanged at $US62.05 a barrel, after rising on 
sentiment that Iran could soon start boosting supply. The . Iranian President Hassan Rouhani said 
in a televised speech that sanctions on oil, shipping, petrochemicals, insurance and the central 
bank had been dealt with in diplomatic talks. Bitcoin . It was back above $US40,0000, at 4:20pm AEST. 
That is even despite a new announcement by US regulators that they will require larger 
cryptocurrency transfers to be reported to its Internal Revenue Service. 
The Federal Reserve has also flagged a closer look at cryptocurrencies. 
Fed chair Jerome Powell released a message about how "technological advances are driving 
rapid change in the global payments market". He also flagged that it was looking at 
issuing its own digital currency. Rabobank's head of financial research for the Asia-Pacific, 
Michael Every, said regulating cryptocurrency would diminish its ability to act as a true money substitute.

"Imagine if every time you transferred more than $US10,000, as businesses do multiple times every day, 
you had to: report it; and pay a tax rate, let's assume 15 per cent on the difference in the exchange 
rate of USD against EUR in a world where EUR/USD moves 30 per cent in a day," he said. 
"It makes crypto unviable at any scale; and if there is no successful strategy for that scale to emerge, 
then any recent increase in pricing is just speculation." Mr Every said that speculation would no doubt 
continue regardless.
"""

---
# News summary


In [3]:
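# Tokenizer and classifier are used for the sentiment step further down.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("ProsusAI/finbert")
# No model name is given, so the pipeline falls back to the library's default summarization model.
summarizer = pipeline("summarization")
# Beam search (num_beams=3) without sampling keeps the summary deterministic.
summary = summarizer(text, max_length=100, min_length=30, do_sample=False, num_beams=3)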

for sentence in summary[0]["summary_text"].split('. '):
    print(f"{sentence.strip()}")

The ASX 200 closed 0.15 per cent up at 7,030, the broader All Ordinaries rose 0.18 per cent to 7,265
EHL payments was the top performer after its share price halved yesterday
The Australian dollar was down a quarter of a per cent and worth 77.55 US cents at 4:20pm AEST
Bitcoin back above $40k as US regulators regulate cryptocurrencies .


* Note: the summary says **EHL** payments, but the source article says **EML** payments. The summarizer altered the company name.
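
A slip like this can be caught mechanically by listing the summary words that never occur in the source. The sketch below is one hedged way to do that; `unseen_words` is a hypothetical helper, and the two strings are short excerpts copied from the article and the summary above.

```python
import re
from typing import List

def unseen_words(summary: str, source: str) -> List[str]:
    """Return summary words that never occur in the source (case-insensitive)."""
    def tokenize(s: str) -> List[str]:
        # Words start with a letter; digits, apostrophes and hyphens may follow.
        return re.findall(r"[A-Za-z][A-Za-z0-9'-]*", s)

    source_vocab = {w.lower() for w in tokenize(source)}
    reported = set()
    missing = []
    for word in tokenize(summary):
        key = word.lower()
        if key not in source_vocab and key not in reported:
            reported.add(key)
            missing.append(word)
    return missing

source_excerpt = (
    "EML payments was the top performer (+15.9pc) on the ASX 200, "
    "after its share price halved yesterday,"
)
summary_line = "EHL payments was the top performer after its share price halved yesterday"

print(unseen_words(summary_line, source_excerpt))
```

Run against the full article and summary, a check like this would also flag harmless rewordings, so its output is a list of candidates to review rather than a list of errors.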

# Sentiment


In [4]:
sentiments = {0: 'Positive', 1: 'Negative', 2: 'Neutral'}

In [5]:
tokens = tokenizer(
    text,
    padding=True,
    truncation=True,
    max_length=512,
    return_tensors="pt"
)
with torch.no_grad():
    logits = model(**tokens).logits

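# Convert logits to probabilities; scores has shape (1, 3) for a single input text.
scores = softmax(logits.numpy())
# Take the argmax along the class axis so the lookup stays correct for batched input.
prediction = sentiments[int(np.argmax(scores, axis=1)[0])]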

print(f"Sentiment:{prediction}")
print(f"Positive:{scores[0][0]:.02f} Negative:{scores[0][1]:.02f} Neutral:{scores[0][2]:.02f}")

Sentiment:Positive
Positive:0.51 Negative:0.45 Neutral:0.03
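
The predicted label is simply the class with the highest probability. A small sketch using the rounded scores printed above shows how narrow the winning margin is; the variable names here are illustrative and not part of the notebook.

```python
# Values copied from the printed output above (rounded to two decimals).
labels = {0: "Positive", 1: "Negative", 2: "Neutral"}
scores = [0.51, 0.45, 0.03]

# Index of the highest score, i.e. the argmax used for the prediction.
best = max(range(len(scores)), key=scores.__getitem__)

# Gap between the top two classes: a rough indication of how decisive the call is.
ranked = sorted(scores, reverse=True)
margin = ranked[0] - ranked[1]

print(f"{labels[best]} wins by {margin:.2f}")
```

Because the tokenizer call uses `truncation=True` with `max_length=512`, only the first 512 tokens of the article contribute to this score. The rest of the article is not seen by the model in this step.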


## Sentence by sentence sentiment

In [11]:
pd.options.display.max_colwidth = 300
#df.style.set_properties(**{'text-align': 'right'})
predict(text, model)

| | sentence | logit | prediction | sentiment_score |
|---|---|---|---|---|
| 0 | Bitcoin back above $40k as US regulates; ASX closes flat as US stocks end losing streak. | [0.1292165, 0.8272539, 0.04352963] | negative | -0.698037 |
| 1 | The Australian share market recovered from its earlier losses to end the week trading higher, after stocks on Wall Street snapped a three-day losing streak with help from technology companies. | [0.9049035, 0.0731437, 0.021952903] | positive | 0.83176 |
| 2 | After a choppy day, the benchmark the ASX 200 closed 0.15 per cent up at 7,030, while the broader All Ordinaries rose 0.18 per cent to 7,265. | [0.92177886, 0.042954043, 0.035267085] | positive | 0.878825 |
| 3 | Education, energy, basic materials, real estate and energy stocks weighed on the market. | [0.022441179, 0.80385184, 0.17370698] | negative | -0.781411 |
| 4 | EML payments was the top performer (+15.9pc) on the ASX 200, after its share price halved yesterday, following Irish regulators raising concerns about anti-money laundering compliance in its subsidiary there. | [0.49818787, 0.48233065, 0.019481493] | positive | 0.015857 |
| 5 | Webjet and A2 Milk Company gained 5.3 per cent, followed by Xero (+4.1pc) and Corporate Travel Management (+3.9pc). | [0.6103204, 0.04448278, 0.3451968] | positive | 0.565838 |
| 6 | Afterpay closed down 0.9 per cent after rising 6.9 per cent in early trading. | [0.009027276, 0.9739793, 0.016993508] | negative | -0.964952 |
| 7 | Retailer Kogan's shares had tumbled 13.9 per cent after the company announced its earnings would be well below market expectations. | [0.0073878197, 0.97456145, 0.01805074] | negative | -0.967174 |
| 8 | Next in line for losses were Nufarm (-6.9pc), Appen (-5.2pc) and Redbubble (-5.2pc). | [0.015159756, 0.9164471, 0.06839311] | negative | -0.901287 |
| 9 | The Australian dollar was down a quarter of a per cent and worth 77.55 US cents at 4:20pm AEST. | [0.009595359, 0.97070223, 0.019702388] | negative | -0.961107 |
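
In the rows shown, each `logit` triple sums to roughly 1 and `sentiment_score` matches the first entry minus the second. That suggests the column holds softmax probabilities in the order (positive, negative, neutral) and that the score is positive minus negative. The sketch below checks this on the ten rows above and averages the scores. The plain mean is an assumption made for illustration, not something the `finbert` package is known to compute.

```python
# (probabilities, reported sentiment_score) copied from the ten rows above.
rows = [
    ([0.1292165, 0.8272539, 0.04352963], -0.698037),
    ([0.9049035, 0.0731437, 0.021952903], 0.83176),
    ([0.92177886, 0.042954043, 0.035267085], 0.878825),
    ([0.022441179, 0.80385184, 0.17370698], -0.781411),
    ([0.49818787, 0.48233065, 0.019481493], 0.015857),
    ([0.6103204, 0.04448278, 0.3451968], 0.565838),
    ([0.009027276, 0.9739793, 0.016993508], -0.964952),
    ([0.0073878197, 0.97456145, 0.01805074], -0.967174),
    ([0.015159756, 0.9164471, 0.06839311], -0.901287),
    ([0.009595359, 0.97070223, 0.019702388], -0.961107),
]

# Each triple should sum to about 1 if it holds probabilities rather than raw logits.
sums_to_one = all(abs(sum(probs) - 1.0) < 1e-5 for probs, _ in rows)

# Recompute the score as P(positive) - P(negative) and compare with the reported value.
recomputed = [probs[0] - probs[1] for probs, _ in rows]
matches = all(abs(r - reported) < 1e-5 for r, (_, reported) in zip(recomputed, rows))

# Illustrative aggregate: unweighted mean over the sentences shown.
mean_score = sum(recomputed) / len(recomputed)

print("probabilities sum to 1:", sums_to_one)
print("score == positive - negative:", matches)
print(f"mean sentence score: {mean_score:.3f}")
```

The mean over these ten sentences is negative, whereas the whole-article call earlier returned a narrow positive result, so the two views of the same text can disagree.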
