## Installation on Google Colab

Skip this step if you are not running on Google Colab.

In [2]:
!pip install transformers[sentencepiece]

Collecting transformers[sentencepiece]
  Downloading https://files.pythonhosted.org/packages/b5/d5/c6c23ad75491467a9a84e526ef2364e523d45e2b0fae28a7cbe8689e7e84/transformers-4.8.1-py3-none-any.whl (2.5MB)

## Super easy NLP with the pipeline interface

### Sentiment analysis in English

In [3]:
from transformers import pipeline

In [15]:
classifier = pipeline("sentiment-analysis")

In [16]:
classifier(["this is a great tutorial, thank you", 
            "your content just sucks"])

[{'label': 'POSITIVE', 'score': 0.9998582601547241},
 {'label': 'NEGATIVE', 'score': 0.9971919059753418}]

In [21]:
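# French inputs: "Your tutorial is really good" and "it is completely useless".
# The second sentence is negative, so compare it with the label returned below.
classifier(["Ton tuto est vraiment bien", 
            "il est complètement nul"])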

[{'label': 'POSITIVE', 'score': 0.7650704979896545},
 {'label': 'POSITIVE', 'score': 0.8282670974731445}]
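
The second French sentence ("it is completely useless") is negative, yet it is labeled POSITIVE, and both French scores are noticeably lower than the English ones. One rough way to surface such cases is to flag predictions whose score falls under a threshold. The sketch below is only an illustration: `flag_uncertain` is a hypothetical helper, the 0.9 threshold is an arbitrary assumption, and the predictions are copied from the outputs above. A low score does not prove a prediction is wrong, and a high score does not prove it is right.

```python
def flag_uncertain(predictions, threshold=0.9):
    """Return the indices of predictions whose score is below `threshold`.

    `threshold` is an arbitrary, illustrative value, not a library default.
    """
    return [i for i, pred in enumerate(predictions) if pred["score"] < threshold]


# Predictions copied from the two classifier outputs above.
english_preds = [
    {"label": "POSITIVE", "score": 0.9998582601547241},
    {"label": "NEGATIVE", "score": 0.9971919059753418},
]
french_preds = [
    {"label": "POSITIVE", "score": 0.7650704979896545},
    {"label": "POSITIVE", "score": 0.8282670974731445},
]

uncertain_en = flag_uncertain(english_preds)
uncertain_fr = flag_uncertain(french_preds)
print(uncertain_en)  # []
print(uncertain_fr)  # [0, 1]
```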

### Sentiment analysis in Dutch, German, French, Spanish and Italian

Search the Hub for a French text-classification model: https://huggingface.co/models?filter=fr&pipeline_tag=text-classification&sort=downloads

In [18]:
multilang_classifier = pipeline("sentiment-analysis", 
                                model="nlptown/bert-base-multilingual-uncased-sentiment")

In [20]:
# Same two French sentences as in the previous section.
multilang_classifier(["Ton tuto est vraiment bien", 
                      "il est complètement nul"])

[{'label': '5 stars', 'score': 0.5787978172302246},
 {'label': '1 star', 'score': 0.9223358035087585}]
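
This model returns star ratings rather than POSITIVE/NEGATIVE labels. If a coarse polarity is needed, the labels can be post-processed. The sketch below assumes labels always look like `'1 star'` or `'5 stars'`, as in the output above. `stars_to_polarity` is a hypothetical helper, and the cut-offs (1-2 negative, 3 neutral, 4-5 positive) are an assumption rather than anything defined by the model.

```python
def stars_to_polarity(prediction):
    """Map a prediction such as {'label': '5 stars', ...} to (stars, polarity).

    The polarity cut-offs are an illustrative assumption.
    """
    stars = int(prediction["label"].split()[0])  # '5 stars' -> 5
    if stars <= 2:
        polarity = "NEGATIVE"
    elif stars == 3:
        polarity = "NEUTRAL"
    else:
        polarity = "POSITIVE"
    return stars, polarity


# Predictions copied from the multilingual classifier output above.
multilang_preds = [
    {"label": "5 stars", "score": 0.5787978172302246},
    {"label": "1 star", "score": 0.9223358035087585},
]

polarities = [stars_to_polarity(pred) for pred in multilang_preds]
print(polarities)  # [(5, 'POSITIVE'), (1, 'NEGATIVE')]
```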

### Translation 

In [112]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
  
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-fr-en")

model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-fr-en")

In [127]:
french = "Ton tutoriel est vraiment bien"

In [128]:
tokens = tokenizer.tokenize(french)
tokens

['▁Ton', '▁tutoriel', '▁est', '▁vraiment', '▁bien']

In [129]:
model_inputs = tokenizer(french, 
                         return_tensors="pt", padding=True, truncation=True)
model_inputs

{'input_ids': tensor([[ 9923, 43821,    43,  1836,   229,     0]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1]])}
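
Note that `tokenize` returned five tokens while `input_ids` holds six IDs. The sketch below lines the two up, using values copied from the outputs above, to show which ID has no matching token. The trailing `0` is most likely the end-of-sequence token that the tokenizer appends. That is an assumption here and can be checked in the notebook with `tokenizer.convert_ids_to_tokens`.

```python
# Values copied from the two cell outputs above.
tokens = ["▁Ton", "▁tutoriel", "▁est", "▁vraiment", "▁bien"]
input_ids = [9923, 43821, 43, 1836, 229, 0]

# Pair each token with its ID; zip stops at the shorter sequence.
paired = list(zip(tokens, input_ids))

# IDs left over after pairing were added by the tokenizer call itself.
extra_ids = input_ids[len(tokens):]

for token, token_id in paired:
    print(f"{token!r:14} -> {token_id}")
print("IDs without a token:", extra_ids)
```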

In [130]:
outputs = model.generate(**model_inputs)
decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(decoded)

Your tutorial is really good.
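
The tokenize, generate and decode steps above can be wrapped in one function that handles several sentences at once. This is a minimal sketch, not the library's own interface: `translate_batch` is a hypothetical helper that expects the `tokenizer` and `model` objects loaded earlier, and it uses `batch_decode` to decode every generated sequence instead of only `outputs[0]`.

```python
def translate_batch(texts, tokenizer, model):
    """Translate a list of sentences with an already loaded tokenizer and model."""
    # Tokenize all sentences together; padding brings them to a common length.
    model_inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    # Generate one output sequence per input sentence.
    outputs = model.generate(**model_inputs)
    # Decode every sequence, dropping special tokens such as padding.
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

In the notebook, `translate_batch([french, "il est complètement nul"], tokenizer, model)` should return one English string per input sentence.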
