# 🎉 QAG Sentiment Analysis: Understanding Sentiments in French Political Interventions 🇫🇷

Welcome to the QAG Sentiment Analysis notebook! This notebook aims to analyze the sentiment of interventions during the "Questions Au Gouvernement" (QAG) sessions in French politics. By leveraging Natural Language Processing (NLP) techniques and state-of-the-art pre-trained models, we'll infer the sentiment of each intervention to better understand the overall tone of the discussions. 🗣️

## 🔍 Overview

We use the Hugging Face Transformers library and the `nlptown/bert-base-multilingual-uncased-sentiment` model to perform sentiment analysis on the QAG dataset. This pre-trained model was fine-tuned on product reviews in six languages, including French, and predicts a rating from 1 to 5 stars. The predicted rating and its score are appended to the original dataset, giving a per-intervention view of sentiment across QAG sessions. 📊

## 📝 Steps

1. Install the Transformers library. 📦
2. Import the necessary libraries and load the QAG dataset. 📚
3. Create a sentiment analysis pipeline using the pre-trained multilingual sentiment model. 🧪
4. Test the sentiment analysis pipeline with sample sentences. 📖
5. Define a custom function (`sentiment_to_df`) that tokenizes each intervention, truncates it to 512 tokens, and infers its sentiment. 🛠️
6. Apply the custom function to the entire dataset, creating the new `sentiment_stars` and `sentiment_score` columns. 📈
7. Save the updated dataset with sentiment scores as a new CSV file. 💾
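
Step 5 truncates each intervention to the model's 512-token limit, so anything past that point in a long speech is ignored. One possible alternative is to split a long token sequence into windows and score each window separately. The following is a minimal sketch of the windowing logic only, on plain Python lists. The helper `chunk_token_ids` and the window size of 510 (512 minus two slots for the special tokens that BERT tokenizers add) are assumptions for illustration, not part of the notebook.

```python
from typing import List


def chunk_token_ids(token_ids: List[int], window: int = 510) -> List[List[int]]:
    """Split a token-id sequence into consecutive windows of at most `window` ids."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [token_ids[i:i + window] for i in range(0, len(token_ids), window)]


# Hypothetical intervention of 1,200 tokens (the ids are placeholders)
long_intervention = list(range(1200))
chunks = chunk_token_ids(long_intervention)
print([len(c) for c in chunks])  # [510, 510, 180]
```

Each window could then be scored on its own and the results combined, for example by averaging. How best to combine them is a design choice this sketch leaves open.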

## 📌 Results

Once the sentiment analysis is complete, the updated dataset contains two new columns for each intervention: `sentiment_stars` and `sentiment_score`. These can be used to visualize trends, perform further analysis, and gain insights into the dynamics of the QAG sessions. 📉

````
   session_date  ... sentiment_score
0    2021-12-15  ...        0.987380
1    2021-12-15  ...        0.999869
2    2021-12-15  ...        0.999779
3    2021-12-15  ...        0.998926
4    2021-12-15  ...        0.996955

[5 rows x 7 columns]
````


This notebook provides a straightforward way to score the sentiment of political interventions during QAG sessions, giving a first look at the tone and mood of these discussions. 💡🌟
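
As one example of the further analysis mentioned above, the ratings can be averaged per speaker. This sketch uses plain Python on a few made-up rows so that it runs without the dataset. The rows and the `mean_stars_by_speaker` helper are hypothetical; only the column names `speaker_name` and `sentiment_stars` come from the notebook.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List


def mean_stars_by_speaker(rows: List[dict]) -> Dict[str, float]:
    """Average the `sentiment_stars` value of each speaker's interventions."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["speaker_name"]].append(row["sentiment_stars"])
    return {speaker: mean(values) for speaker, values in grouped.items()}


# Made-up rows for illustration only, not taken from the QAG dataset
rows = [
    {"speaker_name": "Speaker A", "sentiment_stars": 4.0},
    {"speaker_name": "Speaker A", "sentiment_stars": 2.0},
    {"speaker_name": "Speaker B", "sentiment_stars": 0.0},
]
print(mean_stars_by_speaker(rows))  # {'Speaker A': 3.0, 'Speaker B': 0.0}
```

On the notebook's DataFrame, the equivalent pandas call would be `df.groupby("speaker_name")["sentiment_stars"].mean()`.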



# Install transformers

In [1]:
!pip install transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


# Import libraries and load dataset

In [2]:
import numpy as np
import pandas as pd
import seaborn as sns
from tqdm.notebook import tqdm
from transformers import pipeline

pd.options.mode.chained_assignment = None  # silence SettingWithCopyWarning (default='warn')
tqdm.pandas()  # register DataFrame.progress_apply

In [3]:
df = pd.read_csv("/content/drive/MyDrive/AI/QAG/raw_qag/complete_qag_15.csv", index_col=0)

In [4]:
df.head()

Unnamed: 0,legislature_number,official_date,qag_number,intervention_number,speaker_name,intervention_sentences
0,15,11/04/2018,806,0,M. le président.,"La parole est à Mme Sophie Auconie, pour le gr..."
1,15,11/04/2018,806,1,Mme Sophie Auconie.,Madame la ministre des solidarités et de la sa...
2,15,11/04/2018,806,2,Mme Danièle Obono.,Très bien !
3,15,11/04/2018,806,3,Mme Sophie Auconie.,Il est temps de nous préoccuper du mal-être de...
4,15,11/04/2018,806,4,M. Éric Coquerel et Mme Laurence Dumont .,Très bien !


In [5]:
model_path = "nlptown/bert-base-multilingual-uncased-sentiment"
sentiment_task = pipeline("sentiment-analysis", model=model_path, tokenizer=model_path)
# A notebook cell displays only its last expression, so the output below is for the final sentence.
sentiment_task("Covid cases are increasing fast!")
sentiment_task("Ce fromage est dégueulasse")
sentiment_task("J'adore les fleurs")

[{'label': '5 stars', 'score': 0.5750746130943298}]
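
The pipeline returns a list with one dictionary per input, where `label` is a string such as `'5 stars'` and `score` is the probability assigned to that label. To work with the rating numerically, the leading number can be parsed out of the label. A small sketch, assuming labels always take the form `'<n> star'` or `'<n> stars'`; `parse_stars` is a hypothetical helper, not part of Transformers.

```python
def parse_stars(prediction: dict) -> int:
    """Extract the integer rating from a prediction like {'label': '5 stars', ...}."""
    return int(prediction["label"].split()[0])


# Output format copied from the cell above
result = [{"label": "5 stars", "score": 0.5750746130943298}]
print(parse_stars(result[0]))  # 5
```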

In [6]:
import torch

In [7]:
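from transformers import pipeline

# device=0 selects the first CUDA GPU; use device=-1 to run on CPU.
sentiment_pipeline = pipeline("sentiment-analysis", model="nlptown/bert-base-multilingual-uncased-sentiment", tokenizer="nlptown/bert-base-multilingual-uncased-sentiment", device=0)

def sentiment_to_df(sentences):
    """Return [class index, max logit] for one intervention."""
    # Tokenize and truncate to the model's 512-token limit
    tokens = sentiment_pipeline.tokenizer.encode(sentences, truncation=True, max_length=512)
    input_ids = torch.tensor(tokens).unsqueeze(0).to(sentiment_pipeline.device)  # Add batch dimension, move to GPU
    with torch.no_grad():  # Inference only: skip gradient tracking
        outputs = sentiment_pipeline.model(input_ids)
    result = outputs.logits.cpu().numpy()  # Move logits back to CPU
    stars = np.argmax(result, axis=1)[0]  # Zero-based class index: 0 is "1 star", 4 is "5 stars"
    score = np.max(result, axis=1)[0]  # Raw logit of the predicted class, not a probability

    return [stars, score]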

df[['sentiment_stars', 'sentiment_score']] = df.progress_apply(lambda row: sentiment_to_df(row['intervention_sentences']), axis=1, result_type="expand")

  0%|          | 0/57680 [00:00<?, ?it/s]

In [8]:
df.head()

Unnamed: 0,legislature_number,official_date,qag_number,intervention_number,speaker_name,intervention_sentences,sentiment_stars,sentiment_score
0,15,11/04/2018,806,0,M. le président.,"La parole est à Mme Sophie Auconie, pour le gr...",4.0,0.80393
1,15,11/04/2018,806,1,Mme Sophie Auconie.,Madame la ministre des solidarités et de la sa...,4.0,2.354871
2,15,11/04/2018,806,2,Mme Danièle Obono.,Très bien !,4.0,2.843999
3,15,11/04/2018,806,3,Mme Sophie Auconie.,Il est temps de nous préoccuper du mal-être de...,2.0,0.683606
4,15,11/04/2018,806,4,M. Éric Coquerel et Mme Laurence Dumont .,Très bien !,4.0,2.843999
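
Values of `sentiment_score` above 1 in this table show that the column holds raw logits rather than probabilities. Likewise, `sentiment_stars` holds the zero-based class index, so assuming the model's usual label order, 4.0 corresponds to 5 stars. If a probability and a 1-to-5 rating are preferred, a softmax over the five logits gives both. The sketch below uses only the standard library and made-up logits; `logits_to_rating` is a hypothetical helper, not part of the notebook.

```python
import math
from typing import List, Tuple


def logits_to_rating(logits: List[float]) -> Tuple[int, float]:
    """Return (stars from 1 to 5, softmax probability of that rating)."""
    peak = max(logits)
    # Subtract the maximum before exponentiating for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = probs.index(max(probs))
    return best + 1, probs[best]


# Made-up logits for illustration, one per class from "1 star" to "5 stars"
stars, prob = logits_to_rating([-1.2, -0.8, 0.1, 1.5, 2.8])
print(stars, round(prob, 3))
```

Inside `sentiment_to_df`, the same conversion could be applied to `result[0]` before returning, which would keep `sentiment_score` between 0 and 1.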


In [9]:
df.to_csv("/content/drive/MyDrive/AI/QAG/sentiment/sentiment_qag_15.csv")