# Sentiment Analysis

In [1]:
from transformers import TFAutoModelForSequenceClassification, AutoTokenizer, pipeline

In [2]:
model_name = "lxyuan/distilbert-base-multilingual-cased-sentiments-student"
# from_pt=True loads the checkpoint's PyTorch weights into the TensorFlow model class.
model = TFAutoModelForSequenceClassification.from_pretrained(model_name, from_pt=True)
tokenizer = AutoTokenizer.from_pretrained(model_name)
classifier = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

All PyTorch model weights were used when initializing TFDistilBertForSequenceClassification.

All the weights of TFDistilBertForSequenceClassification were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForSequenceClassification for predictions without further training.


In [3]:
classifier.predict("Very bad")

[{'label': 'negative', 'score': 0.7982096672058105}]

# Text Generation

In [4]:
generator = pipeline('text-generation', model='distilgpt2')

In [14]:
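# max_length counts the prompt tokens as well as the generated ones,
# so this call leaves room for roughly one new token per sequence.
generator("I like", max_length=3, num_return_sequences=3)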

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'I like to'},
 {'generated_text': 'I like to'},
 {'generated_text': 'I like a'}]
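
With so few generated tokens, the returned sequences can repeat, as the two identical `'I like to'` entries above show. A minimal sketch of removing duplicates while keeping the original order follows. The helper name `unique_generations` is hypothetical, and the `outputs` list is copied from the result above rather than produced by running the model.

```python
def unique_generations(outputs):
    """Return the distinct generated texts, preserving first-seen order."""
    seen = set()
    unique = []
    for item in outputs:
        text = item["generated_text"]
        if text not in seen:
            seen.add(text)
            unique.append(text)
    return unique


# Copied from the pipeline output above (not regenerated here).
outputs = [
    {"generated_text": "I like to"},
    {"generated_text": "I like to"},
    {"generated_text": "I like a"},
]

print(unique_generations(outputs))
```

Raising `max_length` is the more direct way to get varied continuations; deduplication only tidies up what was already generated.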

# Zero-Shot Classification

Zero-shot text classification assigns a text to one of several candidate labels that the model was never explicitly trained on. This lets you reuse an existing model for a new classification task without collecting labeled training data. The Hugging Face pipeline does this with a natural language inference (NLI) model, scoring how strongly the text entails each candidate label.

In [6]:
classifier = pipeline('zero-shot-classification')

No model was supplied, defaulted to facebook/bart-large-mnli and revision c626438 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.
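
The warning above recommends naming the model and revision explicitly. A minimal sketch of doing so follows, assuming the model names and revision hashes reported in this notebook's own log messages. The `PINNED_MODELS` table and the `build_pipeline` helper are hypothetical names, and `transformers` is imported lazily so the table can be inspected without loading a model.

```python
# Model names and revisions as reported in this notebook's log output.
PINNED_MODELS = {
    "zero-shot-classification": ("facebook/bart-large-mnli", "c626438"),
    "fill-mask": ("distilroberta-base", "ec58a5b"),
    "summarization": ("sshleifer/distilbart-cnn-12-6", "a4f8f3e"),
    "question-answering": ("distilbert-base-cased-distilled-squad", "626af31"),
}


def build_pipeline(task):
    """Create a pipeline for `task` with an explicit model and revision."""
    # Imported lazily: nothing is downloaded until this function is called.
    from transformers import pipeline

    model_name, revision = PINNED_MODELS[task]
    return pipeline(task, model=model_name, revision=revision)


# Example (downloads the model on first use):
# classifier = build_pipeline("zero-shot-classification")
```

Pinning both values keeps results reproducible if the library's default model for a task changes in a later release.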


In [12]:
classifier("I make students learn ", candidate_labels=["teacher", "security guard"])

{'sequence': 'I make students learn ',
 'labels': ['teacher', 'security guard'],
 'scores': [0.9903699159622192, 0.009630076587200165]}
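
The two scores above sum to 1. In its default single-label mode, the zero-shot pipeline turns each candidate label into a hypothesis (the default template is "This example is {}."), asks the NLI model how strongly the text entails it, and applies a softmax over the entailment logits of all labels. A minimal sketch of that last step in plain Python follows. The logit values are made up for illustration and are not taken from the model, and the helper names are hypothetical.

```python
import math


def build_hypotheses(labels, template="This example is {}."):
    """Turn each candidate label into an NLI hypothesis sentence."""
    return [template.format(label) for label in labels]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["teacher", "security guard"]
hypotheses = build_hypotheses(labels)

# Hypothetical entailment logits, one per hypothesis (illustrative values only).
entailment_logits = [3.2, -1.4]
scores = softmax(entailment_logits)

# Print labels from most to least likely.
for label, score in sorted(zip(labels, scores), key=lambda pair: -pair[1]):
    print(f"{label}: {score:.4f}")
```

Because the softmax is taken across labels, the scores are relative: adding or removing a candidate label changes every score.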

# Fill Mask

The fill-mask task uses a model trained with masked language modeling to predict a masked token from the surrounding context in a sentence. The pipeline returns the most likely replacements for the mask, each with a score.

In [8]:
unmasker = pipeline("fill-mask")
unmasker("I live in <mask> and i want to be a data scientist", top_k=2)

No model was supplied, defaulted to distilroberta-base and revision ec58a5b (https://huggingface.co/distilroberta-base).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at distilroberta-base were not used when initializing RobertaForMaskedLM: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


[{'score': 0.034582920372486115,
  'token': 14415,
  'token_str': ' NYC',
  'sequence': 'I live in NYC and i want to be a data scientist'},
 {'score': 0.024085989221930504,
  'token': 20071,
  'token_str': ' Bangalore',
  'sequence': 'I live in Bangalore and i want to be a data scientist'}]
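
The `<mask>` placeholder is specific to the tokenizer: RoBERTa-style checkpoints such as distilroberta-base use `<mask>`, while BERT-style checkpoints use `[MASK]`. A small sketch of building the prompt from a template, so the mask token is not hard-coded, follows. `fill_template` is a hypothetical helper.

```python
def fill_template(template, mask_token):
    """Insert the model-specific mask token into a '{mask}' placeholder."""
    return template.format(mask=mask_token)


template = "I live in {mask} and i want to be a data scientist"

roberta_prompt = fill_template(template, "<mask>")  # RoBERTa-style tokenizers
bert_prompt = fill_template(template, "[MASK]")     # BERT-style tokenizers

print(roberta_prompt)
print(bert_prompt)
```

With a loaded pipeline, the token can be read from `unmasker.tokenizer.mask_token` and passed to the helper instead of a literal string.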

# Summarization

In [9]:
summarizer = pipeline("summarization")
summarizer("""In the realm of natural language processing (NLP), transformers are a powerful neural network architecture renowned for their exceptional performance on a wide range of tasks. Here's a comprehensive breakdown of their key characteristics and applications:

Core Components:

Encoder-Decoder Structure: Transformers essentially consist of two sub-networks:

Encoder: This sub-network processes the input text, capturing its meaning and the relationships between words. It often utilizes multiple stacked encoder layers.
Decoder: This sub-network generates the output text, conditioned on the encoded representation from the encoder.
Self-Attention Mechanism: A key innovation in transformers is the self-attention mechanism. Unlike traditional recurrent neural networks (RNNs) that process sequences one element at a time, self-attention allows transformers to:

Focus on important parts of the input sequence: The model can attend to relevant words in the entire sentence simultaneously, not just the previous word.
Learn long-range dependencies: This mechanism enables the model to capture relationships between words that might be far apart in the sentence, a limitation faced by RNNs.
Benefits of Transformers:

Superior Performance: Transformers have consistently achieved state-of-the-art results on various NLP tasks, including:

Text classification (sentiment analysis, topic labeling)
Question answering
Machine translation
Text summarization
Text generation
Parallelization Potential: Due to their ability to process the entire input sequence at once, transformers have the potential for parallelization, leading to faster training compared to RNNs.

Applications:

Chatbots and Virtual Assistants: Transformers power the ability of chatbots and virtual assistants to understand complex questions and provide informative responses.
Machine Translation: Transformers have revolutionized machine translation, enabling more accurate and nuanced translations between languages.
Text Summarization: Transformers can effectively condense lengthy text into concise summaries while preserving key information.
Content Creation: They are used in applications like generating creative text formats, marketing copy, or different writing styles.
Example:

Consider the sentence "The quick brown fox jumps over the lazy dog." In traditional RNNs, the network processes words sequentially, potentially missing long-range dependencies like the relationship between "fox" and "jumps" (which are not directly next to each other). However, transformers can attend to the entire sentence at once, capturing these crucial connections.

Overall, transformers represent a significant advancement in NLP, offering superior performance, versatility, and the potential for faster training. Their ability to effectively capture long-range dependencies and contextual information makes them a cornerstone of modern NLP applications.""")

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'summary_text': ' Transformers are a powerful neural network architecture renowned for their exceptional performance on a wide range of tasks . Transformers essentially consist of two sub-networks: Encoder-Decoder Structure and Decoder Structure . Transformers have consistently achieved state-of-the-art results on various NLP tasks, including text classification, machine translation and question answering .'}]

# Question Answering

In [10]:
question_answer = pipeline("question-answering")
# The context does not mention an age; the pipeline still returns its highest-scoring span.
question_answer(question="What is my age?", context="Iam an engineering student")

No model was supplied, defaulted to distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


{'score': 0.48041731119155884,
 'start': 7,
 'end': 26,
 'answer': 'engineering student'}
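
The `start` and `end` values are character offsets into the context, and the pipeline returns its best span even when the context does not contain the answer (here, no age is mentioned). A minimal sketch of checking the offsets and discarding low-confidence answers follows. The result dictionary is copied from the output above, and both the `accept_answer` helper and the 0.5 threshold are hypothetical choices for illustration.

```python
context = "Iam an engineering student"

# Copied from the pipeline output above (not recomputed here).
result = {
    "score": 0.48041731119155884,
    "start": 7,
    "end": 26,
    "answer": "engineering student",
}

# The offsets index directly into the context string.
span = context[result["start"]:result["end"]]
print(span == result["answer"])


def accept_answer(result, min_score=0.5):
    """Return the answer text, or None when the score is below min_score."""
    if result["score"] < min_score:
        return None
    return result["answer"]


print(accept_answer(result))
```

A suitable threshold depends on the model and the data, so it is best chosen by inspecting scores on questions with known answers.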

# Translation

In [11]:
# Use a pipeline as a high-level helper
from transformers import pipeline

trans = pipeline("translation", model="Helsinki-NLP/opus-mt-zh-en")
trans("你好你好吗")

[{'translation_text': 'Hello. How are you?'}]