# French sentiment analysis with BERT

This notebook acts as an online demo for [this repository](https://github.com/TheophileBlard/french-sentiment-analysis-with-bert).

With this notebook, you can run inference on your own sentences.
Training is not supported here; to train the model, please clone the repo.

*This is still experimental, so let me know if something doesn't work!*

**Author**: Théophile Blard ([LinkedIn](https://www.linkedin.com/in/theophile-blard))

## Preliminary steps

*Please run these cells, otherwise inference won't work.*

In [1]:
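# Quote the requirement so the shell does not treat ">" as a redirection
!pip install "transformers>=4.0"
!pip install sentencepiece

import tensorflow as tf
# Compare the major version as an integer; string comparison misorders e.g. "10.0" and "2.0"
assert int(tf.__version__.split(".")[0]) >= 2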

Collecting sentencepiece
  Downloading sentencepiece-0.1.96-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)

In [2]:
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
from transformers import pipeline

tokenizer = AutoTokenizer.from_pretrained("tblard/tf-allocine", use_fast=True)
model = TFAutoModelForSequenceClassification.from_pretrained("tblard/tf-allocine")

nlp = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

All model checkpoint layers were used when initializing TFCamembertForSequenceClassification.

All the layers of TFCamembertForSequenceClassification were initialized from the model checkpoint at tblard/tf-allocine.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFCamembertForSequenceClassification for predictions without further training.


## Inference

Here, you can enter your own sentences and click the "CLASSIFY !" button to feed them to the model. You can expand the text area by dragging its bottom-right corner.

*Please run this cell first, or nothing will happen. You don't need to click SHOW CODE.*

In [3]:
#@title
import ipywidgets as widgets
from IPython.display import display

# ANSI escape codes used to color the printed prediction
class Color:
  GREEN = '\033[92m'
  RED = '\033[91m'
  BOLD = '\033[1m'
  END = '\033[0m'

button = widgets.Button(
    description='CLASSIFY !',
    button_style='success'
)

text_area = widgets.Textarea(
    value='',
    placeholder='Type something',
    description='',
    disabled=False
)
output = widgets.Output()

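def on_button_clicked(b):
  # Classify the current contents of the text area
  text = text_area.value
  result = nlp(text)
  prediction = result[0]["label"]

  # Green for positive predictions, red otherwise
  if prediction == "POSITIVE":
    color = Color.GREEN
  else:
    color = Color.RED

  # Print the label followed by the first 50 characters of the input
  with output:
    print(Color.BOLD + color + f'{prediction}: ' + Color.END + f'"{text[:50]}"')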

button.on_click(on_button_clicked)
display(text_area, button, output)

Textarea(value='', placeholder='Type something')

Button(button_style='success', description='CLASSIFY !', style=ButtonStyle())

Output()

In [4]:
result = nlp("J'aim bon heure.")
prediction = result[0]  # dict with "label" and "score" keys
print(f'prediction: {prediction}')

prediction: {'label': 'POSITIVE', 'score': 0.6319448947906494}
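
As the output above shows, the pipeline returns one dictionary per input, each with a `label` and a `score`. The sketch below is one possible way to classify several sentences at once and flag predictions the model is unsure about. The helper name `summarize`, the 0.75 threshold, and the stand-in `fake_nlp` callable are assumptions made for illustration; they are not part of the repository. In the notebook, you would pass the real `nlp` object instead of `fake_nlp`.

```python
def summarize(classifier, sentences, threshold=0.75):
    """Classify sentences and mark predictions whose score is below threshold.

    `classifier` is any callable mapping a list of strings to a list of
    {"label": ..., "score": ...} dicts, like the `nlp` pipeline above.
    The threshold value is an arbitrary choice for this sketch.
    """
    results = classifier(sentences)
    lines = []
    for sentence, result in zip(sentences, results):
        flag = "" if result["score"] >= threshold else " (low confidence)"
        lines.append(
            f'{result["label"]} {result["score"]:.2f}{flag}: "{sentence[:50]}"'
        )
    return lines


# Hypothetical stand-in so the sketch runs without downloading the model.
# It returns the same label and score as the cell output above for every input.
def fake_nlp(sentences):
    return [{"label": "POSITIVE", "score": 0.6319448947906494} for _ in sentences]


for line in summarize(fake_nlp, ["J'aim bon heure."]):
    print(line)
```

A score of about 0.63, as in the example above, is fairly close to 0.5, so flagging it may be more informative than showing the label alone.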
