
In [1]:
!pip install transformers



Getting started on a task with a pipeline

The easiest way to use a pretrained model on a given task is to use the pipeline function. 🤗 Transformers provides the following tasks out of the box:
Sentiment analysis: is a text positive or negative?
Text generation (in English): provide a prompt and the model will generate what follows.
Named entity recognition (NER): in an input sentence, label each word with the entity it represents (person, place, etc.).
Question answering: provide the model with some context and a question, extract the answer from the context.
Filling masked text: given a text with masked words (e.g., replaced by [MASK]), fill the blanks.
Summarization: generate a summary of a long text.
Translation: translate a text into another language.
Feature extraction: return a tensor representation of the text.
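
As a rough sketch of how the tasks above map onto code, each one is selected by passing a task identifier string to pipeline. The identifiers below are assumed from Transformers 4.x releases and may not all be available in other versions; `translation_en_to_fr` is just one example language pair, and `build_pipeline` is a hypothetical helper, not part of the library.

```python
# Task identifier strings (assumed from Transformers 4.x; check your installed version).
TASK_IDS = {
    "Sentiment analysis": "sentiment-analysis",
    "Text generation": "text-generation",
    "Named entity recognition": "ner",
    "Question answering": "question-answering",
    "Filling masked text": "fill-mask",
    "Summarization": "summarization",
    "Translation": "translation_en_to_fr",  # one example language pair
    "Feature extraction": "feature-extraction",
}


def build_pipeline(task_name):
    """Hypothetical helper: create a pipeline for one of the tasks listed above."""
    # Imported lazily; creating a pipeline downloads a default model on first use.
    from transformers import pipeline
    return pipeline(TASK_IDS[task_name])


# Show which identifier would be passed for each task, without downloading anything.
for name, task_id in TASK_IDS.items():
    print(f"{name}: pipeline({task_id!r})")
```
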
Let's see how this works for sentiment analysis (the other tasks are all covered in the task summary):

In [2]:
from transformers import pipeline
classifier = pipeline("sentiment-analysis")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use cuda:0
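
The log message above warns against relying on the default model in production. One way to address it, sketched here under the assumption that the model name and revision printed in that message are the ones you want to keep, is to pass them explicitly. `build_pinned_classifier` is a hypothetical helper name.

```python
# Model name and revision copied from the log message above.
PINNED = {
    "task": "sentiment-analysis",
    "model": "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "714eb0f",
}


def build_pinned_classifier():
    """Hypothetical helper: build the classifier with an explicit model and revision."""
    # Lazy import so the settings can be inspected without the library installed.
    from transformers import pipeline
    return pipeline(**PINNED)


print(PINNED["model"], "@", PINNED["revision"])
```

Calling `build_pinned_classifier()` should give a classifier equivalent to the default one, without the warning about a missing model name.
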


In [3]:
classifier("the pizza was not that great but the crust is awesome")

[{'label': 'POSITIVE', 'score': 0.9998167157173157}]

In [7]:
results = classifier(["we hope you dont hate it.","We are very happy to show you the huggingface transformers library.","I love AI.","I hate the wait."])

for result in results:
  print(f"label: {result['label']}, with score: {round(result['score'],4)}")



label: NEGATIVE, with score: 0.9796
label: POSITIVE, with score: 0.9998
label: POSITIVE, with score: 0.9998
label: NEGATIVE, with score: 0.9969


In [10]:
results = classifier(["we hope you dont hate it.","We are very happy to show you the huggingface transformers library.","I love AI.","I hate the wait."])

for result in results:
  print(result)

{'label': 'NEGATIVE', 'score': 0.9795596599578857}
{'label': 'POSITIVE', 'score': 0.9997598528862}
{'label': 'POSITIVE', 'score': 0.9998161196708679}
{'label': 'NEGATIVE', 'score': 0.9969117045402527}
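
Since each result is a plain dict with a label and a score, it can be post-processed with ordinary Python. The sketch below reuses the scores printed above and flags predictions under a confidence threshold; the threshold value 0.99 and the helper name `flag_uncertain` are illustrative assumptions, not recommendations.

```python
# Inputs and results copied from the cell and output above.
texts = [
    "we hope you dont hate it.",
    "We are very happy to show you the huggingface transformers library.",
    "I love AI.",
    "I hate the wait.",
]
results = [
    {"label": "NEGATIVE", "score": 0.9795596599578857},
    {"label": "POSITIVE", "score": 0.9997598528862},
    {"label": "POSITIVE", "score": 0.9998161196708679},
    {"label": "NEGATIVE", "score": 0.9969117045402527},
]


def flag_uncertain(texts, results, threshold=0.99):
    """Return (text, label, score) for predictions scoring below the threshold."""
    return [
        (text, res["label"], res["score"])
        for text, res in zip(texts, results)
        if res["score"] < threshold
    ]


uncertain = flag_uncertain(texts, results)
for text, label, score in uncertain:
    print(f"low confidence: {text!r} -> {label} ({score:.4f})")
```
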


In [11]:
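from transformers import pipeline
# No task is given here, so the pipeline infers it from the model.
# Note: return_all_scores is deprecated in recent Transformers releases in favor of
# top_k=None, which may change the nesting of the output shown below.
classifier = pipeline(
    model="lxyuan/distilbert-base-multilingual-cased-sentiments-student",
    return_all_scores=True
)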

Device set to use cuda:0


In [12]:
classifier("I love this movie and i would watch it again and again!")

[[{'label': 'positive', 'score': 0.9731043577194214},
  {'label': 'neutral', 'score': 0.016910096630454063},
  {'label': 'negative', 'score': 0.00998549722135067}]]

In [13]:
results = classifier(["we hope you dont hate it.","We are very happy to show you the huggingface transformers library.","I love AI.","I hate the wait."])

for result in results:
  print(result)

[{'label': 'positive', 'score': 0.04246370494365692}, {'label': 'neutral', 'score': 0.03570447117090225}, {'label': 'negative', 'score': 0.9218318462371826}]
[{'label': 'positive', 'score': 0.9483305811882019}, {'label': 'neutral', 'score': 0.026439577341079712}, {'label': 'negative', 'score': 0.02522985078394413}]
[{'label': 'positive', 'score': 0.970418393611908}, {'label': 'neutral', 'score': 0.02003868669271469}, {'label': 'negative', 'score': 0.009542925283312798}]
[{'label': 'positive', 'score': 0.01888461783528328}, {'label': 'neutral', 'score': 0.0287020280957222}, {'label': 'negative', 'score': 0.9524133205413818}]
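
When all scores are returned, each input yields one list with an entry per label. A minimal sketch of picking the most likely label from that structure, using the values printed above (the helper name `top_label` is hypothetical):

```python
# Per-label scores copied from the output above, one inner list per input sentence.
all_scores = [
    [{"label": "positive", "score": 0.04246370494365692},
     {"label": "neutral", "score": 0.03570447117090225},
     {"label": "negative", "score": 0.9218318462371826}],
    [{"label": "positive", "score": 0.9483305811882019},
     {"label": "neutral", "score": 0.026439577341079712},
     {"label": "negative", "score": 0.02522985078394413}],
    [{"label": "positive", "score": 0.970418393611908},
     {"label": "neutral", "score": 0.02003868669271469},
     {"label": "negative", "score": 0.009542925283312798}],
    [{"label": "positive", "score": 0.01888461783528328},
     {"label": "neutral", "score": 0.0287020280957222},
     {"label": "negative", "score": 0.9524133205413818}],
]


def top_label(scores):
    """Return the entry with the highest score among one input's label scores."""
    return max(scores, key=lambda entry: entry["score"])


for scores in all_scores:
    best = top_label(scores)
    total = sum(entry["score"] for entry in scores)
    # The scores for one input form a probability distribution, so they sum to about 1.
    print(f"{best['label']}: {best['score']:.4f} (sum of scores: {total:.4f})")
```
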


In [14]:
classifier = pipeline('sentiment-analysis', model="nlptown/bert-base-multilingual-uncased-sentiment")

Device set to use cuda:0


In [15]:
classifier("Esperamos que no lo odie.")

[{'label': '3 stars', 'score': 0.3368820548057556}]

In [16]:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="nlptown/bert-base-multilingual-uncased-sentiment")

Device set to use cuda:0


In [20]:
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "nlptown/bert-base-multilingual-uncased-sentiment"

model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

Device set to use cuda:0
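
To make it clearer what the pipeline does with the model and tokenizer loaded above, here is a rough sketch of the same steps done by hand: tokenize, run the model, and turn the logits into probabilities with a softmax. It assumes a PyTorch model; `classify_manually` is a hypothetical helper, and the logits in the demo at the bottom are made-up numbers, not model output.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_manually(text, model, tokenizer):
    """Hypothetical helper: approximate what the pipeline does for one input."""
    import torch  # lazy import; assumes the model is a PyTorch model

    # Tokenize and move the tensors to the device the model lives on.
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": model.config.id2label[best], "score": probs[best]}


# Demo on made-up logits for five classes (not real model output).
demo_probs = softmax([0.1, 0.4, 1.2, 1.5, 0.3])
print([round(p, 3) for p in demo_probs], "sum =", round(sum(demo_probs), 6))
```

With the objects from the cell above, `classify_manually("I am a good boy", model, tokenizer)` should return a dict in the same shape as the pipeline output.
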


In [21]:
classifier("I am a good boy")

[{'label': '4 stars', 'score': 0.4229269027709961}]

In [24]:
classifier(["You have won a prize!","I love this product, but it's too expensive"])

[{'label': '5 stars', 'score': 0.6135199069976807},
 {'label': '3 stars', 'score': 0.41727641224861145}]
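
This model reports star ratings rather than positive/negative labels. If a coarser sentiment is needed, the label string can be parsed into a number and bucketed. The sketch below uses the results printed above; the bucket boundaries (1-2 negative, 3 neutral, 4-5 positive) and both helper names are assumptions made for illustration.

```python
# Results copied from the output above.
star_results = [
    {"label": "5 stars", "score": 0.6135199069976807},
    {"label": "3 stars", "score": 0.41727641224861145},
]


def parse_stars(label):
    """Extract the integer rating from a label such as '3 stars' or '1 star'."""
    return int(label.split()[0])


def stars_to_sentiment(label):
    """Map a star label to a coarse sentiment (thresholds are an assumption)."""
    stars = parse_stars(label)
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


for res in star_results:
    print(res["label"], "->", stars_to_sentiment(res["label"]))
```
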

In [26]:
# Use a name other than `input` to avoid shadowing the Python builtin.
encoded = tokenizer("I am a good boy")
print(encoded)

{'input_ids': [101, 151, 10345, 143, 12050, 14140, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1]}
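
The tokenizer output above has three fields: input_ids (the token ids, here framed by 101 and 102, which are presumably the [CLS] and [SEP] special tokens), token_type_ids (all 0 for a single sentence), and attention_mask (1 for every real token). The attention mask matters once sequences of different lengths are batched together. The sketch below pads a batch by hand to show the idea; the second sequence is a made-up shorter input, and the pad id 0 is an assumption based on the standard BERT vocabularies. In practice the tokenizer can do this itself when called with padding=True.

```python
# First sequence copied from the output above; the second is a hypothetical shorter input.
seq_a = [101, 151, 10345, 143, 12050, 14140, 102]
seq_b = [101, 151, 102]
PAD_ID = 0  # assumption: [PAD] has id 0, as in the standard BERT vocabularies


def pad_batch(sequences, pad_id=PAD_ID):
    """Pad sequences to a common length and build the matching attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    # 1 marks a real token, 0 marks padding the model should ignore.
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = pad_batch([seq_a, seq_b])
for ids, mask in zip(batch["input_ids"], batch["attention_mask"]):
    print(ids, mask)
```
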
