In [1]:
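import pandas as pd
from transformers import pipeline
from transformers import TFAutoModelForTokenClassification  # only used by the disabled experiment below

# Disabled experiment: loading a multilingual checkpoint explicitly.
# modelId = "bert-base-multilingual-cased"
# model = TFAutoModelForTokenClassification.from_pretrained(modelId)
modelId = None  # the pipelines below fall back to their default models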

### Text classification

In [2]:
classifier = pipeline('text-classification')

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
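
The warning above recommends naming the model and revision explicitly. A minimal sketch of one way to do that, reusing the model names and revision hashes printed in this notebook's own log messages. The `PINNED` table and the `pinned_pipeline` helper are hypothetical names introduced here for illustration; the `transformers` import is deferred so the table can be inspected without loading any model.

```python
# Default checkpoints and revisions as reported in this notebook's log messages.
PINNED = {
    "text-classification": ("distilbert-base-uncased-finetuned-sst-2-english", "af0f99b"),
    "ner": ("dbmdz/bert-large-cased-finetuned-conll03-english", "f2482bf"),
    "question-answering": ("distilbert-base-cased-distilled-squad", "626af31"),
    "summarization": ("sshleifer/distilbart-cnn-12-6", "a4f8f3e"),
}


def pinned_pipeline(task, **kwargs):
    """Build a pipeline with an explicit model and revision (hypothetical helper)."""
    from transformers import pipeline  # deferred: only needed when a pipeline is built

    model, revision = PINNED[task]
    return pipeline(task, model=model, revision=revision, **kwargs)


# Usage (downloads the checkpoint on first call):
# classifier = pinned_pipeline("text-classification")
```

Pinning this way reproduces the defaults and silences the warning; it does not change the fact that these default checkpoints were trained on English data while the input text is French.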


In [3]:
text = """Cet arbre trois fois centenaire de la forêt de Brocéliande est tombé à cause de la tempête Ciaran
Vieux de 300 ans, le hêtre de Ponthus était l’un des arbres remarquables de la forêt de Paimpont (Ille-et-Vilaine). 
Il s’est cassé, victime de la tempête Ciaran dans la nuit du 1er au 2 novembre 2023. « Nous sommes tous très tristes. 
Il était notre grand sage de la forêt », commente, ému, Philippe Manguin, photographe."""

In [4]:
outputs = classifier(text) 
pd.DataFrame(outputs)

      label     score
0  NEGATIVE  0.977983


### Named Entity Recognition

#### Note: passing bert-base-multilingual-cased as the model raises an exception.

In [5]:

ner_tagger = pipeline("ner", aggregation_strategy="simple")
outputs = ner_tagger(text)
pd.DataFrame(outputs)

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


   entity_group     score                 word  start  end
0           PER  0.524530                   Ce      0    2
1           LOC  0.713280          Brocéliande     47   58
2           PER  0.443098                    C     91   92
3           LOC  0.357545               ##iara     92   96
4           PER  0.215788                  ##n     96   97
5           LOC  0.985734              Ponthus    128  135
6           LOC  0.830006             Paimpont    186  194
7           LOC  0.973486  Ille - et - Vilaine    196  211
8           PER  0.522639                    C    253  254
9           ORG  0.391045               ##iara    254  258
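
In the output above, "Ciaran" is split into the word pieces `C`, `##iara` and `##n`, each with a different label. The pipeline's `aggregation_strategy` argument also accepts `"first"`, `"average"` and `"max"`, which group word pieces into whole words. As a library-independent sketch of the same idea, the hypothetical helper below merges fragments whose character spans touch and keeps the label of the highest-scoring fragment. The offsets are recomputed for a one-word sample; the scores are copied from the table.

```python
def merge_adjacent_entities(entities, text):
    """Merge fragments whose spans touch (end == next start) into one entity."""
    merged = []
    for ent in sorted(entities, key=lambda e: e["start"]):
        prev = merged[-1] if merged else None
        if prev is not None and ent["start"] == prev["end"]:
            # Keep the label and score of the most confident fragment.
            if ent["score"] > prev["score"]:
                prev["entity_group"] = ent["entity_group"]
                prev["score"] = ent["score"]
            prev["end"] = ent["end"]
            # Rebuild the surface form from the text instead of joining word pieces.
            prev["word"] = text[prev["start"]:prev["end"]]
        else:
            merged.append(dict(ent))
    return merged


sample = "Ciaran"
fragments = [
    {"entity_group": "PER", "score": 0.443098, "word": "C", "start": 0, "end": 1},
    {"entity_group": "LOC", "score": 0.357545, "word": "##iara", "start": 1, "end": 5},
    {"entity_group": "PER", "score": 0.215788, "word": "##n", "start": 5, "end": 6},
]
merged = merge_adjacent_entities(fragments, sample)
print(merged)
```

The merged entity is still labelled `PER`, which is wrong for a storm name: merging repairs the fragmentation, not the underlying mismatch between an English CoNLL-03 model and French text.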


### Question answering

In [6]:
reader = pipeline('question-answering')  
question = "Que s'est il passé dans la nuit du 1°er au 2 Novembre 2023 ?"
outputs = reader(question=question,context=text)
print(pd.DataFrame([outputs])) 
question = "Qu'est il arrivé à l'arbre dans la nuit du 1°er au 2 Novembre 2023 ?"
outputs = reader(question=question,context=text)
pd.DataFrame([outputs])


No model was supplied, defaulted to distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


      score  start  end          answer
0  0.073886    215  229  Il s’est cassé


      score  start  end          answer
0  0.061515    215  229  Il s’est cassé
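
Both answers come back with scores below 0.1, so the extracted span happens to be plausible but the model is not confident in it. A minimal sketch of filtering on that score before trusting an answer; the `best_answer` helper and the 0.5 threshold are assumptions made for illustration, not part of the library.

```python
def best_answer(results, min_score=0.5):
    """Return the highest-scoring answer, or None if it scores below min_score."""
    if not results:
        return None
    top = max(results, key=lambda r: r["score"])
    return top if top["score"] >= min_score else None


# The two results printed above, one per question.
results = [
    {"score": 0.073886, "start": 215, "end": 229, "answer": "Il s’est cassé"},
    {"score": 0.061515, "start": 215, "end": 229, "answer": "Il s’est cassé"},
]

rejected = best_answer(results)                  # threshold 0.5: nothing passes
accepted = best_answer(results, min_score=0.05)  # a looser threshold keeps the first
print(rejected, accepted["answer"])
```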


### Summarization

In [7]:
summarizer = pipeline('summarization') 
outputs = summarizer(text, max_length=45,clean_up_tokenization_spaces=True) 
print("<%s>" % outputs[0]['summary_text']) 
t = outputs[0]['summary_text'] 
len(t)

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


Your min_length=56 must be inferior than your max_length=45.


< Le hêtre de Ponthus était l’un des arbres remarquables de la forêt de Paimpont (Ille-et-Vilaine)>


97
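
The `min_length=56 must be inferior than your max_length=45` message appears because only `max_length` was passed, so the minimum length was left at 56, apparently the checkpoint's default. A minimal sketch of passing both bounds together; the `summary_length_kwargs` helper and its `min_fraction` parameter are hypothetical names introduced for illustration.

```python
def summary_length_kwargs(max_length, min_fraction=0.25):
    """Return consistent min_length/max_length keyword arguments for generation."""
    min_length = max(1, int(max_length * min_fraction))
    return {"min_length": min_length, "max_length": max_length}


kwargs = summary_length_kwargs(45)
print(kwargs)

# Usage with the summarizer defined in the cell above:
# outputs = summarizer(text, clean_up_tokenization_spaces=True, **kwargs)
```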

### Translation

In [11]:
model = "Helsinki-NLP/opus-mt-fr-en"
translator = pipeline('translation_fr_to_en', model=model)
outputs = translator(text, clean_up_tokenization_spaces=True, min_length=100)
print(outputs[0]['translation_text'])


This three-hundred-year-old tree of the Broceliande forest fell because of the storm Ciaran 300 years old, the beech of Ponthus was one of the remarkable trees of the forest of Paimpont (Ille-et-Vilaine). It broke down, victim of the storm Ciaran on the night of November 1 to 2, 2023. "We are all very sad. He was our great sage of the forest", commented, moved, Philippe Manguin, photographer.


### Text generation

In [16]:
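# Note: this checkpoint is a French-to-English translation model, not a model
# trained for text generation, which likely explains the degenerate output below.
model = "Helsinki-NLP/opus-mt-fr-en"
generator = pipeline('text-generation', model=model)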
response = "Le hêtre de Ponthus était l’un des arbres remarquables de la forêt de Paimpont (Ille-et-Vilaine)"
prompt = text + "\n\nCommentaire du journal de 20 heures :\n" + response 
outputs = generator(prompt,max_length=400)
print(outputs[0]['generated_text'])

Cet arbre trois fois centenaire de la forêt de Brocéliande est tombé à cause de la tempête Ciaran
Vieux de 300 ans, le hêtre de Ponthus était l’un des arbres remarquables de la forêt de Paimpont (Ille-et-Vilaine). 
Il s’est cassé, victime de la tempête Ciaran dans la nuit du 1er au 2 novembre 2023. « Nous sommes tous très tristes. 
Il était notre grand sage de la forêt », commente, ému, Philippe Manguin, photographe.

Commentaire du journal de 20 heures :
Le hêtre de Ponthus était l’un des arbres remarquables de la forêt de Paimpont (Ille-et-Vilaine)ColumNIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUETRIQUETRIQUETRIQUETRIQUETRIQUERIQUESTULTIQUERIQUERIQUERIQUERIQUERIQUERIQUERIQUESTTRIRTRIRTRIRTRIRUL