# Sentiment Analysis

Goal: Scrape data from a website such as Bloomberg, separate the news into categories, then assign a sentiment score to each item.

spaCy is used for preprocessing.

The following libraries are to be explored:

1. VADER
2. TextBlob
3. Flair
4. LLM
5. Models
   - RoBERTa (HuggingFace)
   - DistilBERT (HuggingFace)
   - spaCy (spaCy pipeline)

In [7]:
# setup
test_sentence = "Trump to Leave G-7 Tonight Due to Middle East Crisis"

### 1. VADER

In [8]:
# Prebuilt VADER sentiment package

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
analyzer = SentimentIntensityAnalyzer()
scores = analyzer.polarity_scores(test_sentence)
print(scores)

{'neg': 0.398, 'neu': 0.602, 'pos': 0.0, 'compound': -0.6486}
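The `compound` value is VADER's normalized overall score in the range [-1, 1]. A minimal sketch of turning it into a coarse label, assuming the ±0.05 cutoffs that the VADER README suggests as typical thresholds. `label_from_compound` is a hypothetical helper for this notebook, not part of the library:

```python
def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER compound score in [-1, 1] to a coarse label.

    The default threshold of 0.05 is an assumption based on the
    cutoffs suggested in the VADER README; tune it for your data.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Compound score copied from the VADER output above
print(label_from_compound(-0.6486))  # negative
```

Under these cutoffs the test headline is classified as negative, which matches the sign of the score.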


### 2. TextBlob

In [9]:
# Prebuilt TextBlob sentiment package

from textblob import TextBlob
text = TextBlob(test_sentence)
score = text.sentiment
print(score)

Sentiment(polarity=-0.0625, subjectivity=0.1875)


### 3. Flair

Flair is optimized for sequence labeling but also ships a prebuilt sentiment classifier.

In [16]:
from flair.data import Sentence
from flair.nn import Classifier

sentence = Sentence(test_sentence)
tagger = Classifier.load('sentiment')
tagger.predict(sentence)
print(sentence)

Sentence[10]: "Trump to Leave G-7 Tonight Due to Middle East Crisis" → NEGATIVE (0.9858)
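Flair reports a label plus a confidence, while VADER and TextBlob report signed scores, so the three outputs are not directly comparable. One possible way to put them on a common axis is sketched below, assuming the classifier only emits `POSITIVE` or `NEGATIVE`. `signed_score` is a hypothetical helper, and the input values are copied by hand from the printed result above:

```python
def signed_score(label: str, confidence: float) -> float:
    """Convert a (label, confidence) pair into a signed score in [-1, 1].

    Unknown labels map to 0.0, which is an assumption made here so that
    unexpected output does not raise.
    """
    sign = {"POSITIVE": 1.0, "NEGATIVE": -1.0}.get(label.upper(), 0.0)
    return sign * confidence


# Label and confidence taken from the Flair output above
print(signed_score("NEGATIVE", 0.9858))  # -0.9858
```

Note that a classifier confidence is not the same quantity as a lexicon-based polarity, so the signed value is only a rough basis for side-by-side comparison.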


### 4. HuggingFace Transformers

In [4]:
# Shell commands need a leading "!" inside a notebook cell;
# without it the line is parsed as Python and raises a SyntaxError
!python -m spacy download en_core_web_trf

In [3]:
from transformers import pipeline
import spacy
nlp = spacy.load('en_core_web_trf')

OSError: [E050] Can't find model 'en_core_web_trf'. It doesn't seem to be a Python package or a valid path to a data directory.

#### - RoBERTa

In [None]:
classifier = pipeline('sentiment-analysis', model='cardiffnlp/twitter-roberta-base-sentiment')
result = classifier(test_sentence)
print(result)
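This checkpoint returns generic label ids rather than readable names. A small sketch of mapping them, assuming the id order given on the `cardiffnlp/twitter-roberta-base-sentiment` model card (0 = negative, 1 = neutral, 2 = positive). `ROBERTA_LABELS` and `readable` are hypothetical names, and the sample input is illustrative rather than a recorded model output:

```python
# Label ids as documented on the model card (assumed unchanged)
ROBERTA_LABELS = {
    "LABEL_0": "negative",
    "LABEL_1": "neutral",
    "LABEL_2": "positive",
}


def readable(result: list[dict]) -> list[dict]:
    """Replace LABEL_n ids in a pipeline result with readable names."""
    return [
        {"label": ROBERTA_LABELS.get(r["label"], r["label"]), "score": r["score"]}
        for r in result
    ]


# Illustrative input only, shaped like a sentiment-analysis pipeline result
sample = [{"label": "LABEL_0", "score": 0.5}]
print(readable(sample))
```

Labels that are not in the mapping are passed through unchanged, so the helper is also safe to apply to models that already return readable names.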

#### - DistilBERT

In [None]:
classifier = pipeline('sentiment-analysis', model='distilbert-base-uncased-finetuned-sst-2-english')
result = classifier(test_sentence)
print(result)

#### - spaCy

In [None]:
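import en_core_web_trf

# Load the transformer-based English pipeline
nlp = en_core_web_trf.load()
doc = nlp(test_sentence)
# Note: en_core_web_trf has no sentiment component, so this only echoes the text
print(doc)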