# Model training: 

Generally, for sentiment analysis, many techinques have been used for a long time. Some of them namely being:
- Bag of words
- Tf-IDF
- Word2vec, Glove

But, in the last few years, after Attention is all you need came up, transformers have been on the rise, leading to the creation of BERT. This bi directional model provides a lot of growth in accuracy as well as performance. 

In this notebook, the idea is to finetune a pretrained bert model to classify the sentiment of the news article. 

## Importing libraries

In [3]:
import transformers
import pandas as pd
from tqdm.notebook import tqdm
import numpy as np

#BERT data prep:

from transformers import BertTokenizer

from sklearn.model_selection import train_test_split


# Checking GPU 

In [4]:
import torch

if torch.cuda.is_available():       
    device = torch.device("cuda")
    print(f'There are {torch.cuda.device_count()} GPU(s) available.')
    print('Device name:', torch.cuda.get_device_name(0))

else:
    print('No GPU available, using the CPU instead.')
    device = torch.device("cpu")

No GPU available, using the CPU instead.


## Setting up data

In [6]:
df = pd.read_csv('financial_sentiment_data.csv',delimiter=',',encoding='latin-1')

In [7]:
df = df.rename(columns={'neutral':'sentiment','According to Gran , the company has no plans to move all production to Russia , although that is where the company is growing .':'Message'})

In [8]:
sentiment_mapping = {'positive': 0, 'negative': 1, 'neutral': 2}

# Create a new column 'sentiment_numeric' with mapped values
df['sentiment_numeric'] = df['sentiment'].map(sentiment_mapping)

# Data Cleaning

- Prepare data for Bert using the BERT tokenizer

In order to apply the pre-trained BERT, we must prepare the data by using the tokenizer provided by the library. This is due to: 
- The model has a specific vocabulary 
- The BERT tokenizer has a certain way of preparing out of vocabulary words. 


Apart from this, BERT needs special tokens to the start and end of each sentence. 
Each sentence is also supposed to be padded and truncated to a single length, and each padding token is specified with an attention mask. 

Method will generally follow this approach: 
- Split text into tokens. 
- Add CLS and SEP tokens. 
- Convert tokens into indexes. 
- Pad or Truncate sentences to max length
- Create attention mask



In [9]:
# Splitting data into Validation and Training set
X = df['Message']
y = df['sentiment_numeric']


X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

In [10]:
# Loading the bert tokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased', do_lower_case=True)

# using the bert base model, which has 768 tokens as the size. The uncased implies that data will be in a 
#single case and will not have multiple cases. This will allow the model to not overfit based on characters. 

In [11]:
# Function for preprocessing based on the steps provided: 

def preprocessing_for_bert(data):
    """
    
    Perform required preprocessing steps for pretrained BERT.
    
    Input: 
        - Param: data: Array of texts. 
    
    Returns: 
        - input_ids in the form of torch.Tensor
        - attention masks in torch.Tensor. These are indices which specify which tokens should be focused on by 
        the model
  
    """
    # Create empty lists to store outputs
    input_ids = []
    attention_masks = []

    # For every sentence...
    for sent in tqdm(data):
        # `encode_plus` will:
        #    (1) Tokenize the sentence
        #    (2) Add the `[CLS]` and `[SEP]` token to the start and end
        #    (3) Truncate/Pad sentence to max length
        #    (4) Map tokens to their IDs
        #    (5) Create attention mask
        #    (6) Return a dictionary of outputs
        

        encoded_sent = tokenizer.encode_plus(
            text=sent,  
            add_special_tokens=True,        # Add `[CLS]` and `[SEP]`
            max_length=MAX_LEN,             # Max length to truncate/pad
            pad_to_max_length=True,         # Pad or Truncate sentences to max length
            return_attention_mask=True      # Return attention mask
            )
        
        # Add the outputs to the lists
        input_ids.append(encoded_sent.get('input_ids'))
        attention_masks.append(encoded_sent.get('attention_mask'))

    # Convert lists to tensors
    input_ids = torch.tensor(input_ids)
    attention_masks = torch.tensor(attention_masks)

    return input_ids, attention_masks

Since the approach would require sentences to be under 512 tokens, it is imperitive to proceed with a different method. 

There are two approaches that could be followed here: 

1. Each sequence truncated to 512 and measured. 
2. Each sequence split into senteces, seach sentence is encoded and then averaged out. 
3. Use a window method to split and encode. 

Generally, it has been found, that encoding to the first 512 is a great method to begin with. 

Thus, if that doesn't perform well, I will use a different approach. 

In [12]:
# Testing with first sentence
MAX_LEN = 128  # As that is the max length that is generally encoded. 

# Print sentence 0 and its encoded token ids
token_ids = list(preprocessing_for_bert([df['Message'][0]])[0].squeeze().numpy())
print('Original: ', df['Message'][0] )
# print('Token IDs: ', token_ids)



  0%|          | 0/1 [00:00<?, ?it/s]

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.


Original:  Technopolis plans to develop in stages an area of no less than 100,000 square meters in order to host companies working in computer technologies and telecommunications , the statement said .




In [13]:
print('Tokenizing Data')
train_inputs, train_masks = preprocessing_for_bert(X_train)
val_inputs, val_masks = preprocessing_for_bert(X_val)

Tokenizing Data


  0%|          | 0/3876 [00:00<?, ?it/s]

  0%|          | 0/969 [00:00<?, ?it/s]

# Creating a Pytorch Dataloader

A dataloader allows for faster loading of data, thus making the training more efficient and it also helps to save on memory. 

In [14]:
from torch.utils.data import TensorDataset, DataLoader, RandomSampler, SequentialSampler

# Convert other data types to torch.Tensor
train_labels = torch.tensor(np.array(y_train))
val_labels = torch.tensor(np.array(y_val))

# For fine-tuning BERT, the authors recommend a batch size of 16 or 32.
batch_size =  16

# Create the DataLoader for our training set
train_data = TensorDataset(train_inputs, train_masks, train_labels)
train_sampler = RandomSampler(train_data)
train_dataloader = DataLoader(train_data, sampler=train_sampler, batch_size=batch_size)

# Create the DataLoader for our validation set
val_data = TensorDataset(val_inputs, val_masks, val_labels)
val_sampler = SequentialSampler(val_data)
val_dataloader = DataLoader(val_data, sampler=val_sampler, batch_size=batch_size)

In [15]:
len(train_inputs)

3876

# Training the model: 

### Create BertClassifier
BERT-base consists of 12 transformer layers, each transformer layer takes in a list of token embeddings, and produces the same number of embeddings with the same hidden size (or dimensions) on the output. The output of the final transformer layer of the [CLS] token is used as the features of the sequence to feed a classifier.

The transformers library has the BertForSequenceClassification class which is designed for classification tasks. However, we will create a new class so we can specify our own choice of classifiers.

Below we will create a BertClassifier class with a BERT model to extract the last hidden layer of the [CLS] token and a single-hidden-layer feed-forward neural network as our classifier.

In [21]:
import torch
import torch.nn as nn
from transformers import BertModel

# Create the BertClassfier class
class BertClassifier(nn.Module):
    """Bert Model for Classification Tasks.
    """
    def __init__(self, freeze_bert=False):
        """
        @param    bert: a BertModel object
        @param    classifier: a torch.nn.Module classifier
        @param    freeze_bert (bool): Set `False` to fine-tune the BERT model
        """
        super(BertClassifier, self).__init__()
        # Specify hidden size of BERT, hidden size of our classifier, and number of labels
        D_in, H, D_out = 768, 50, 3

        # Instantiate BERT model
        self.bert = BertModel.from_pretrained('bert-base-uncased')

        # Instantiate an one-layer feed-forward classifier
        self.classifier = nn.Sequential(
            nn.Linear(D_in, H),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(H, D_out)
        )

        # Freeze the BERT model
        if freeze_bert:
            for param in self.bert.parameters():
                param.requires_grad = False
        
    def forward(self, input_ids, attention_mask):
        """
        Feed input to BERT and the classifier to compute logits.
        @param    input_ids (torch.Tensor): an input tensor with shape (batch_size,
                      max_length)
        @param    attention_mask (torch.Tensor): a tensor that hold attention mask
                      information with shape (batch_size, max_length)
        @return   logits (torch.Tensor): an output tensor with shape (batch_size,
                      num_labels)
        """
        # Feed input to BERT
        outputs = self.bert(input_ids=input_ids,
                            attention_mask=attention_mask)
        
        # Extract the last hidden state of the token `[CLS]` for classification task
        last_hidden_state_cls = outputs[0][:, 0, :]

        # Feed input to classifier to compute logits
        logits = self.classifier(last_hidden_state_cls)

        return logits

### Optimizer
To fine-tune our Bert Classifier, we need to create an optimizer. The authors recommend following hyper-parameters:

Batch size: 
- 16 or 32
- Learning rate (Adam): 5e-5, 3e-5 or 2e-5
- Number of epochs: 2, 3, 4


Huggingface provided the run_glue.py script, an examples of implementing the transformers library. In the script, the AdamW optimizer is used.

In [17]:
from transformers import AdamW, get_linear_schedule_with_warmup

def initialize_model(epochs=4):
    """Initialize the Bert Classifier, the optimizer and the learning rate scheduler.
    """
    # Instantiate Bert Classifier
    bert_classifier = BertClassifier(freeze_bert=False)

    # Tell PyTorch to run the model on GPU
    bert_classifier.to(device)

    # Create the optimizer
    optimizer = AdamW(bert_classifier.parameters(),
                      lr=3e-5,    # Default learning rate
                      eps=1e-8    # Default epsilon value
                      )

    # Total number of training steps
    total_steps = len(train_dataloader) * epochs

    # Set up the learning rate scheduler
    scheduler = get_linear_schedule_with_warmup(optimizer,
                                                num_warmup_steps=10, # Default value
                                                num_training_steps=total_steps)
    return bert_classifier, optimizer, scheduler

# Train: 

We will train our Bert Classifier for 4 epochs. In each epoch, we will train our model and evaluate its performance on the validation set. In more details, we will:

Training:

- Unpack our data from the dataloader and load the data onto the GPU
- Zero out gradients calculated in the previous pass
- Perform a forward pass to compute logits and loss
- Perform a backward pass to compute gradients (loss.backward())
- Clip the norm of the gradients to 1.0 to prevent "exploding gradients"
- Update the model's parameters (optimizer.step())
- Update the learning rate (scheduler.step())


Evaluation:

- Unpack our data and load onto the GPU
- Forward pass
- Compute loss and accuracy rate over the validation set

The script below is commented with the details of our training and evaluation loop.

In [22]:
import random
import time

# Specify loss function
loss_fn = nn.CrossEntropyLoss()

def set_seed(seed_value=42):
    """Set seed for reproducibility.
    """
    random.seed(seed_value)
    np.random.seed(seed_value)
    torch.manual_seed(seed_value)
    torch.cuda.manual_seed_all(seed_value)

def train(model, train_dataloader, val_dataloader=None, epochs=4, evaluation=False):
    """Train the BertClassifier model.
    """
    # Start training loop
    print("Start training...\n")
    for epoch_i in range(epochs):
        # =======================================
        #               Training
        # =======================================
        # Print the header of the result table
        print(f"{'Epoch':^7} | {'Batch':^7} | {'Train Loss':^12} | {'Val Loss':^10} | {'Val Acc':^9} | {'Elapsed':^9}")
        print("-"*70)

        # Measure the elapsed time of each epoch
        t0_epoch, t0_batch = time.time(), time.time()

        # Reset tracking variables at the beginning of each epoch
        total_loss, batch_loss, batch_counts = 0, 0, 0

        # Put the model into the training mode
        model.train()

        # For each batch of training data...
        for step, batch in enumerate(train_dataloader):
            batch_counts +=1
            # Load batch to GPU
            b_input_ids, b_attn_mask, b_labels = tuple(t.to(device) for t in batch)

            # Zero out any previously calculated gradients
            model.zero_grad()

            # Perform a forward pass. This will return logits.
            logits = model(b_input_ids, b_attn_mask)

            # Compute loss and accumulate the loss values
            loss = loss_fn(logits, b_labels)
            batch_loss += loss.item()
            total_loss += loss.item()

            # Perform a backward pass to calculate gradients
            loss.backward()

            # Clip the norm of the gradients to 1.0 to prevent "exploding gradients"
            torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)

            # Update parameters and the learning rate
            optimizer.step()
            scheduler.step()

            # Print the loss values and time elapsed for every 20 batches
            if (step % 20 == 0 and step != 0) or (step == len(train_dataloader) - 1):
                # Calculate time elapsed for 20 batches
                time_elapsed = time.time() - t0_batch

                # Print training results
                print(f"{epoch_i + 1:^7} | {step:^7} | {batch_loss / batch_counts:^12.6f} | {'-':^10} | {'-':^9} | {time_elapsed:^9.2f}")

                # Reset batch tracking variables
                batch_loss, batch_counts = 0, 0
                t0_batch = time.time()

        # Calculate the average loss over the entire training data
        avg_train_loss = total_loss / len(train_dataloader)

        print("-"*70)
        # =======================================
        #               Evaluation
        # =======================================
        if evaluation == True:
            # After the completion of each training epoch, measure the model's performance
            # on our validation set.
            val_loss, val_accuracy = evaluate(model, val_dataloader)

            # Print performance over the entire training data
            time_elapsed = time.time() - t0_epoch
            
            print(f"{epoch_i + 1:^7} | {'-':^7} | {avg_train_loss:^12.6f} | {val_loss:^10.6f} | {val_accuracy:^9.2f} | {time_elapsed:^9.2f}")
            print("-"*70)
        print("\n")
    
    print("Training complete!")


def evaluate(model, val_dataloader):
    """After the completion of each training epoch, measure the model's performance
    on our validation set.
    """
    # Put the model into the evaluation mode. The dropout layers are disabled during
    # the test time.
    model.eval()

    # Tracking variables
    val_accuracy = []
    val_loss = []

    # For each batch in our validation set...
    for batch in val_dataloader:
        # Load batch to GPU
        b_input_ids, b_attn_mask, b_labels = tuple(t.to(device) for t in batch)

        # Compute logits
        with torch.no_grad():
            logits = model(b_input_ids, b_attn_mask)

        # Compute loss
        loss = loss_fn(logits, b_labels)
        val_loss.append(loss.item())

        # Get the predictions
        preds = torch.argmax(logits, dim=1).flatten()

        # Calculate the accuracy rate
        accuracy = (preds == b_labels).cpu().numpy().mean() * 100
        val_accuracy.append(accuracy)

    # Compute the average accuracy and loss over the validation set.
    val_loss = np.mean(val_loss)
    val_accuracy = np.mean(val_accuracy)

    return val_loss, val_accuracy

In [19]:
print(len(train_dataloader))

243


In [23]:
set_seed(42)    # Set seed for reproducibility
bert_classifier, optimizer, scheduler = initialize_model(epochs=2)
print("hi")
train(bert_classifier, train_dataloader, val_dataloader, epochs=3, evaluation=True)

hi
Start training...

 Epoch  |  Batch  |  Train Loss  |  Val Loss  |  Val Acc  |  Elapsed 
----------------------------------------------------------------------
   1    |   20    |   0.959710   |     -      |     -     |   37.25  
   1    |   40    |   0.874787   |     -      |     -     |   34.74  
   1    |   60    |   0.664152   |     -      |     -     |   33.78  
   1    |   80    |   0.638176   |     -      |     -     |   35.40  
   1    |   100   |   0.544360   |     -      |     -     |   34.28  
   1    |   120   |   0.593673   |     -      |     -     |   35.98  
   1    |   140   |   0.579258   |     -      |     -     |   34.77  
   1    |   160   |   0.477073   |     -      |     -     |   33.90  
   1    |   180   |   0.498212   |     -      |     -     |   33.78  
   1    |   200   |   0.442593   |     -      |     -     |   33.37  
   1    |   220   |   0.410930   |     -      |     -     |   33.02  
   1    |   240   |   0.377819   |     -      |     -     |   33.30

In [23]:
torch.cuda.empty_cache()

# Evaluation on validation set

In [41]:
import torch.nn.functional as F

def bert_predict(model, test_dataloader):
    """Perform a forward pass on the trained BERT model to predict probabilities
    on the test set.
    """
    # Put the model into the evaluation mode. The dropout layers are disabled during
    # the test time.
    model.eval()

    all_logits = []

    # For each batch in our test set...
    for batch in test_dataloader:
        # Load batch to GPU
        b_input_ids, b_attn_mask = tuple(t.to(device) for t in batch)[:2]

        # Compute logits
        with torch.no_grad():
            logits = model(b_input_ids, b_attn_mask)
        all_logits.append(logits)
    
    # Concatenate logits from each batch
    all_logits = torch.cat(all_logits, dim=0)

    # Apply softmax to calculate probabilities
    probs = F.softmax(all_logits, dim=1).cpu().numpy()

    
    return probs

In [25]:
from sklearn.metrics import accuracy_score, roc_curve, auc
import matplotlib.pyplot as plt

def evaluate_roc(probs, y_true):
    """
    - Print AUC and accuracy on the test set
    - Plot ROC
    @params    probs (np.array): an array of predicted probabilities with shape (len(y_true), 2)
    @params    y_true (np.array): an array of the true values with shape (len(y_true),)
    """
    preds = probs[:, 1]
    fpr, tpr, threshold = roc_curve(y_true, preds)
    roc_auc = auc(fpr, tpr)
    print(f'AUC: {roc_auc:.4f}')
       
    # Get accuracy over the test set
    y_pred = np.where(preds >= 0.5, 1, 0)
    accuracy = accuracy_score(y_true, y_pred)
    print(f'Accuracy: {accuracy*100:.2f}%')
    
    # Plot ROC AUC
    plt.title('Receiver Operating Characteristic')
    plt.plot(fpr, tpr, 'b', label = 'AUC = %0.2f' % roc_auc)
    plt.legend(loc = 'lower right')
    plt.plot([0, 1], [0, 1],'r--')
    plt.xlim([0, 1])
    plt.ylim([0, 1])
    plt.ylabel('True Positive Rate')
    plt.xlabel('False Positive Rate')
    plt.show()

In [42]:
def predict_sentiment(url_text):
    '''
        Input:
            @param: url. A url from which information will be fetched. 
        Output:
            Sentiment: Boolean. 0/1. 0=Neg, 1=Pos
    
    
    '''


    #Preprocess
    test_inputs, test_masks = preprocessing_for_bert([url_text])
    
    #Add to dataloader
    test_dataset = TensorDataset(test_inputs, test_masks)
    test_sampler = SequentialSampler(test_dataset)
    test_dataloader = DataLoader(test_dataset, sampler=test_sampler, batch_size=32)
    
    
    # Compute predicted. Threshold kept as 90% positive.
    
    # Compute predicted probabilities on the test set
    probs = bert_predict(bert_classifier, test_dataloader)

    # Get predictions from the probabilities
   

    sentiment = probs
    
    return sentiment

In [1]:
op = predict_sentiment('Tim Cook rocked ‘Made on iPad’ Nike Air Max 1 ’86s during Apple’s ‘Let Loose’ event')

NameError: name 'predict_sentiment' is not defined

In [47]:
print(op)
sentiment_mapping = {'positive': 0, 'negative': 1, 'neutral': 2}

[[0.92344797 0.01378809 0.06276395]]


In [27]:
torch.save(bert_classifier.state_dict(), "bert_news_sentiment.pth")

In [48]:
torch.save(bert_classifier.state_dict(), "bert_news_sentiment.h5")

In [27]:
from transformers.models import save_pretrained
# Optional: Include tokenizer and training arguments
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')  # Assuming you used this tokenizer
save_pretrained(bert_classifier, "bert_for_sentiment", tokenizer=tokenizer)

ImportError: cannot import name 'save_pretrained' from 'transformers.models' (/Users/prinks/Library/Python/3.9/lib/python/site-packages/transformers/models/__init__.py)

In [30]:
torch.save(bert_classifier,"bert_for_sentimet.pt")

# Model save and run

In [1]:
from transformers import BertTokenizer
import torch.nn.functional as F
import torch.nn as nn
import torch
from transformers import BertTokenizer,BertModel
from torch.utils.data import TensorDataset, DataLoader, RandomSampler, SequentialSampler
import tqdm
import numpy as np

In [2]:
class BertClassifier(nn.Module):
    """Bert Model for Classification Tasks.
    """
    def __init__(self, freeze_bert=False):
        """
        @param    bert: a BertModel object
        @param    classifier: a torch.nn.Module classifier
        @param    freeze_bert (bool): Set `False` to fine-tune the BERT model
        """
        super(BertClassifier, self).__init__()
        # Specify hidden size of BERT, hidden size of our classifier, and number of labels
        D_in, H, D_out = 768, 50, 3

        # Instantiate BERT model
        self.bert = BertModel.from_pretrained('bert-base-uncased')

        # Instantiate an one-layer feed-forward classifier
        self.classifier = nn.Sequential(
            nn.Linear(D_in, H),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(H, D_out)
        )

        # Freeze the BERT model
        if freeze_bert:
            for param in self.bert.parameters():
                param.requires_grad = False
        
    def forward(self, input_ids, attention_mask):
        """
        Feed input to BERT and the classifier to compute logits.
        @param    input_ids (torch.Tensor): an input tensor with shape (batch_size,
                      max_length)
        @param    attention_mask (torch.Tensor): a tensor that hold attention mask
                      information with shape (batch_size, max_length)
        @return   logits (torch.Tensor): an output tensor with shape (batch_size,
                      num_labels)
        """
        # Feed input to BERT
        outputs = self.bert(input_ids=input_ids,
                            attention_mask=attention_mask)
        
        # Extract the last hidden state of the token `[CLS]` for classification task
        last_hidden_state_cls = outputs[0][:, 0, :]

        # Feed input to classifier to compute logits
        logits = self.classifier(last_hidden_state_cls)

        return logits


 # Replace with your tokenizer name if different


def find_sentiment(text):
    if torch.cuda.is_available():       
        device = torch.device("cuda")
    
    else:
        device = torch.device("cpu")
    # Define the path to your saved model
    model_path = "bert_for_sentimet.pt"

    # Load the model (assuming BertClassifier is your model class)
    model = torch.load(model_path)
    # Import the tokenizer (assuming you saved it or can download it)
    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')  # Replace with your tokenizer name if different

    # Function to preprocess text for BERT
    def preprocess_for_bert(text):
        """Performs tokenization and mask creation for a single text input."""
        encoded_text = tokenizer(text, add_special_tokens=True, return_tensors='pt')
        return encoded_text['input_ids'], encoded_text['attention_mask']

        # Define your test text
        

        # Preprocess the test text
    test_inputs, test_masks = preprocess_for_bert(text)

        # Create a dummy dataset and dataloader for the test case (batch size of 1)
    test_dataset = TensorDataset(test_inputs, test_masks)
    test_sampler = SequentialSampler(test_dataset)
    test_dataloader = DataLoader(test_dataset, sampler=test_sampler, batch_size=1)

    # Function to predict sentiment (assuming yours returns probabilities)
    def predict_sentiment(model, test_dataloader):
        """Perform a forward pass on the model to predict probabilities on the test set."""
        all_logits = []
        with torch.no_grad():
            for batch in test_dataloader:
                b_input_ids, b_attn_mask = tuple(t.to(device) for t in batch)[:2]
                logits = model(b_input_ids, b_attn_mask)
                all_logits.append(logits)
        all_logits = torch.cat(all_logits, dim=0)
        probs = F.softmax(all_logits, dim=1).cpu().numpy()
        return probs

    # Predict sentiment
    sentiment_probs = predict_sentiment(model, test_dataloader)[0]
    sentiment_probs = list(sentiment_probs)
    sentiment_mapping = { 0:'positive',  1:'negative',  2:'neutral'}
    maximum = max(sentiment_probs)
    sentiment = sentiment_mapping[sentiment_probs.index(maximum)]
    
    # Print the predicted sentiment probabilities
    return sentiment,text


def create_user(email,password):
  try:
      user= auth.create_user_with_email_and_password(email,password)
      return "User Creation Successful. You can now login"
  except Exception as e :
      error_message = str(e)
      return f"OOPS Something went wrong \n{error_message}"

def sentiment_analysis(headlines, article_links, descriptions):
    sentiment_results = []
    for headline, link, description in zip(headlines, article_links, descriptions):
        # Sentiment Analysis Placeholders
        sentiment = find_sentiment(headline)[0]  # Placeholder sentiment analysis result

        # Store headline, link, description, and sentiment in a dictionary
        result = OrderedDict()
        result['headline'] = headline
        result['link'] = link
        result['description'] = description
        result['sentiment'] = sentiment
        sentiment_results.append(result)


    return sentiment_results


In [6]:
find_sentiment('This is a bad day for apple as we head toward a drop in stock')

('negative', 'This is a bad day for apple as we head toward a drop in stock')

In [4]:
find_sentiment("Despite the fall in previous quaters, the dip this quater is not as prominent")

('neutral',
 'Despite the fall in previous quaters, the dip this quater is not as prominent')

In [5]:
find_sentiment('against all odds, microsoft rises after fall last week')

('positive', 'against all odds, microsoft rises after fall last week')