In [4]:
from transformers import pipeline


In [37]:
# Sentiment Analysis

classifier = pipeline('sentiment-analysis')


No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
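
The warning above recommends naming the model and revision explicitly. A minimal sketch of that, reusing the model id and short revision hash printed in the log. `PINNED_SENTIMENT` and `build_sentiment_classifier` are illustrative names, not part of the library, and calling the helper would download the model:

```python
# Model id and revision copied from the log message above.
PINNED_SENTIMENT = {
    "task": "sentiment-analysis",
    "model": "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "af0f99b",
}


def build_sentiment_classifier():
    """Build the pipeline with an explicit model and revision (hypothetical helper)."""
    # Imported lazily so that defining the helper does not load the library.
    from transformers import pipeline

    return pipeline(**PINNED_SENTIMENT)
```

With the model and revision pinned, the pipeline should no longer fall back to a default that can change between library versions.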


In [38]:
classifier(['worst book,haha kidding its the best and its absolutely useful',
            'hello where are you',
            'war'])

[{'label': 'NEGATIVE', 'score': 0.9731588363647461},
 {'label': 'POSITIVE', 'score': 0.9970099925994873},
 {'label': 'NEGATIVE', 'score': 0.9994986057281494}]
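
The pipeline returns one dictionary per input, in the same order as the inputs. A small sketch that pairs each text with its prediction, using the output above as literal data so it runs without loading a model; `pair_with_inputs` is an illustrative helper name:

```python
texts = [
    'worst book,haha kidding its the best and its absolutely useful',
    'hello where are you',
    'war',
]

# Output recorded above, copied as literal data.
results = [
    {'label': 'NEGATIVE', 'score': 0.9731588363647461},
    {'label': 'POSITIVE', 'score': 0.9970099925994873},
    {'label': 'NEGATIVE', 'score': 0.9994986057281494},
]


def pair_with_inputs(texts, results):
    """Return (text, label, rounded score) tuples, relying on matching order."""
    return [(t, r['label'], round(r['score'], 3)) for t, r in zip(texts, results)]


paired = pair_with_inputs(texts, results)
for row in paired:
    print(row)
```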

In [39]:
# Zero Shot Classification

classifier = pipeline('zero-shot-classification')


No model was supplied, defaulted to facebook/bart-large-mnli and revision c626438 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.


In [40]:

classifier('worst book,haha kidding its the best and its absolutely useful',
           candidate_labels=['Purchase', 'Will not purchase'])


{'sequence': 'worst book,haha kidding its the best and its absolutely useful',
 'labels': ['Purchase', 'Will not purchase'],
 'scores': [0.7698447108268738, 0.23015525937080383]}
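
In the result above, the labels come back sorted by score, and the two scores sum to 1 because the pipeline by default normalizes across the candidate labels (passing `multi_label=True` scores each label independently instead). A short sketch on the recorded output; `top_label` is an illustrative helper name:

```python
# Output recorded above, copied as literal data.
result = {
    'sequence': 'worst book,haha kidding its the best and its absolutely useful',
    'labels': ['Purchase', 'Will not purchase'],
    'scores': [0.7698447108268738, 0.23015525937080383],
}


def top_label(result):
    """Return the best label and its score; labels are sorted by descending score."""
    return result['labels'][0], result['scores'][0]


label, score = top_label(result)
total = sum(result['scores'])
print(label, round(score, 3), round(total, 6))
```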

In [42]:

# Text Generation

generator = pipeline('text-generation')

generator('BERT Models main idea to have a model that understand')

No model was supplied, defaulted to openai-community/gpt2 and revision 6c0e608 (https://huggingface.co/openai-community/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'BERT Models main idea to have a model that understand the various aspects of language is to use different models of the same text to predict what language that text contains. The way certain words work can be mapped to other words, either by making the type'}]

In [3]:
from transformers import pipeline
generator = pipeline('text-generation', model='distilgpt2')

generator('BERT Models main idea to have a model that understand',
          max_length=100, num_return_sequences=2)

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'BERT Models main idea to have a model that understand the behavior of the animals. To understand this problem, many more than once in a while, we are confronted with the question: How much do animals like to eat or that do? As one of my main authors and others who has worked on modeling, there are a few important questions to consider for the purpose of understanding the behavior of animals.\n\n\n\nFigure 1. A model model of the model of the model of the model of'},
 {'generated_text': 'BERT Models main idea to have a model that understand what we want:\n\nThe idea is to create a model that can predict and predict a behavior. It takes an algorithm to have a model that can predict and predict behavior. So our model is to have a model that identifies things we want and identifies them based on the model to then predict behavior. But it is not exactly an ideal solution because it was designed for using the concepts of the model and the models it has created. For examp

In [7]:
# Mask Filling

mask_fill = pipeline('fill-mask')


No model was supplied, defaulted to distilbert/distilroberta-base and revision ec58a5b (https://huggingface.co/distilbert/distilroberta-base).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at distilbert/distilroberta-base were not used when initializing RobertaForMaskedLM: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [16]:
sentence = 'Eating <mask> every day, reserve your health'
[result['token_str'] for result in mask_fill(sentence,top_k=5)]

[' healthy', ' healthier', ' yogurt', ' vegetables', ' well']
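
Each `token_str` above starts with a space because the RoBERTa tokenizer treats the preceding space as part of the token. A minimal sketch that strips it and substitutes each candidate back into the sentence, using the recorded tokens as literal data; `fill` is an illustrative helper name:

```python
sentence = 'Eating <mask> every day, reserve your health'

# Candidates recorded above, copied as literal data.
token_strs = [' healthy', ' healthier', ' yogurt', ' vegetables', ' well']


def fill(sentence, token_str, mask='<mask>'):
    """Replace the first mask with the candidate token, dropping its leading space."""
    return sentence.replace(mask, token_str.strip(), 1)


filled = [fill(sentence, t) for t in token_strs]
for s in filled:
    print(s)
```

Each result dictionary from the pipeline also carries a `sequence` key with the filled-in sentence, so the helper is mainly useful when only the tokens were kept.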

In [24]:
# NER

ner = pipeline('ner',aggregation_strategy="simple")


No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [27]:
[(ent['word'],ent['entity_group']) for ent in ner('Ali is the president of Jordan')]

[('Ali', 'PER'), ('Jordan', 'LOC')]
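
With `aggregation_strategy="simple"`, sub-word pieces are merged into whole entities, so the `(word, entity_group)` pairs above can be grouped directly by type. A sketch on the recorded output; `group_by_type` is an illustrative helper name:

```python
from collections import defaultdict

# Pairs recorded above, copied as literal data.
entities = [('Ali', 'PER'), ('Jordan', 'LOC')]


def group_by_type(entities):
    """Collect entity words under their entity group, keeping input order."""
    grouped = defaultdict(list)
    for word, group in entities:
        grouped[group].append(word)
    return dict(grouped)


grouped = group_by_type(entities)
print(grouped)
```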

In [30]:
# Question Answering

qa = pipeline('question-answering')

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


In [33]:
jordan_text = "Jordan, a captivating country in the heart of the Middle East, is renowned for its rich history, stunning landscapes, and warm hospitality. Its capital, Amman, is a vibrant city that seamlessly blends the ancient with the modern. Amman's bustling streets are lined with a mix of traditional souks and contemporary shopping centers, offering a unique cultural experience. Visitors can explore historical sites such as the ancient Roman Amphitheatre and the Citadel, which provide panoramic views of the city. Amman's diverse culinary scene, featuring delicious Jordanian cuisine, adds to the city's charm. The friendly locals and the city's lively atmosphere make Amman a must-visit destination for travelers seeking a blend of history and modernity."

qa(question = "what places do you recomend to visit in jordan",
   context=jordan_text)

{'score': 0.37246036529541016,
 'start': 421,
 'end': 463,
 'answer': 'ancient Roman Amphitheatre and the Citadel'}
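
The `start` and `end` values above are character offsets into the context, so slicing the context with them gives the answer text. A minimal sketch: the recorded result is copied as literal data, while the short context and its offsets are illustrative and computed in the code rather than produced by the model:

```python
def answer_span(context, result):
    """Slice the answer out of the context using the character offsets."""
    return context[result['start']:result['end']]


# Result recorded above: the span length equals the length of the answer string.
recorded = {
    'score': 0.37246036529541016,
    'start': 421,
    'end': 463,
    'answer': 'ancient Roman Amphitheatre and the Citadel',
}
span_length = recorded['end'] - recorded['start']

# Illustrative example on one sentence of the context (offsets computed here).
context = ("Its capital, Amman, is a vibrant city that seamlessly blends "
           "the ancient with the modern.")
answer = 'Amman'
start = context.index(answer)
example = {'start': start, 'end': start + len(answer), 'answer': answer}

print(span_length, answer_span(context, example))
```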

In [36]:
# Summarization

summarize = pipeline('summarization')

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


In [37]:
summarize(jordan_text)

[{'summary_text': " Jordan's capital, Amman, is renowned for its rich history, stunning landscapes and warm hospitality . Amman's bustling streets are lined with a mix of traditional souks and contemporary shopping centers . Visitors can explore historical sites such as the ancient Roman Amphitheatre and the Citadel ."}]
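
The recorded summary has a leading space and a space before each period. A small cleanup sketch on that output, copied as literal data; `clean_summary` is an illustrative helper name and only handles the spacing seen here:

```python
# Output recorded above, copied as literal data.
summary = [{'summary_text': " Jordan's capital, Amman, is renowned for its rich history, stunning landscapes and warm hospitality . Amman's bustling streets are lined with a mix of traditional souks and contemporary shopping centers . Visitors can explore historical sites such as the ancient Roman Amphitheatre and the Citadel ."}]


def clean_summary(text):
    """Trim surrounding whitespace and remove the space before each period."""
    return text.strip().replace(' .', '.')


cleaned = clean_summary(summary[0]['summary_text'])
print(cleaned)
```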

In [40]:
# Translation (English to German)

translator = pipeline('translation_en_to_de', model='google-t5/t5-small')

In [48]:
translator('You are my friend is the best, let us talk about Jordan.')[0]['translation_text']

'Sie sind mein Freund ist das Beste, sprechen wir über Jordanien.'

#### Major types of Transformer-based LLMs

**Causal language modeling**:
 - Autoregressive, decoder-only
 - Predicts the next token from the tokens to its left
 - Used to generate text
 - Example: GPT

**Masked language modeling**:
 - Bidirectional, auto-encoding, encoder-only
 - Predicts masked tokens using the context on both sides
 - Used to classify and analyze text
 - Example: BERT
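
One way to picture the difference between the two objectives is which positions each token may attend to. A toy sketch in plain Python: real models apply masks like these inside their attention layers, and the helper names here are illustrative:

```python
def causal_mask(n):
    """Row i marks the positions token i may attend to: itself and earlier tokens."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def bidirectional_mask(n):
    """Every token may attend to every position, left and right."""
    return [[1] * n for _ in range(n)]


for row in causal_mask(3):
    print(row)
for row in bidirectional_mask(3):
    print(row)
```

The lower-triangular pattern is what lets a decoder-only model generate text one token at a time, while the full pattern lets an encoder-only model use context on both sides of a masked token.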



**Encoder-only models:** Good for tasks that require understanding of the input, such as sentence classification and named entity recognition.
- Masked token prediction
- Sentiment analysis
- Named entity recognition
- Extractive question answering

**Decoder-only models:** Good for generative tasks such as text generation.
- Text generation
- Chatbots

**Encoder-decoder (sequence-to-sequence) models:** Good for generative tasks that require an input, such as translation or summarization.
- Sequence transduction, such as translation
- Inputs can be images or text
- Summarization
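
To tie these notes back to the pipelines used earlier, the sketch below lists each task with the model named in the logs or code above and its architecture family. The family labels are my own reading of those model types (DistilBERT, RoBERTa and BERT as encoder-only, GPT-2 as decoder-only, BART and T5 as encoder-decoder), not something the pipeline reports, and `tasks_for` is an illustrative helper:

```python
# task -> (model used above, architecture family)
DEFAULT_MODEL_FAMILY = {
    'sentiment-analysis': (
        'distilbert/distilbert-base-uncased-finetuned-sst-2-english', 'encoder-only'),
    'zero-shot-classification': ('facebook/bart-large-mnli', 'encoder-decoder'),
    'text-generation': ('openai-community/gpt2', 'decoder-only'),
    'fill-mask': ('distilbert/distilroberta-base', 'encoder-only'),
    'ner': ('dbmdz/bert-large-cased-finetuned-conll03-english', 'encoder-only'),
    'question-answering': (
        'distilbert/distilbert-base-cased-distilled-squad', 'encoder-only'),
    'summarization': ('sshleifer/distilbart-cnn-12-6', 'encoder-decoder'),
    'translation': ('google-t5/t5-small', 'encoder-decoder'),
}


def tasks_for(family):
    """Return the tasks whose model belongs to the given family, sorted by name."""
    return sorted(task for task, (_, fam) in DEFAULT_MODEL_FAMILY.items()
                  if fam == family)


for family in ('encoder-only', 'decoder-only', 'encoder-decoder'):
    print(family, tasks_for(family))
```

Zero-shot classification is a classification task, yet its default model here is an encoder-decoder (BART), so the task-to-family mapping is a tendency rather than a strict rule.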