# Pretrained Models


- Pretrained models are models that have already been trained on large, general-purpose datasets, typically with a generic objective such as predicting missing or upcoming words, before being fine-tuned for a more specific application.
- The idea is that these models learn general language patterns and representations during the initial training on a massive corpus of text, which can then be fine-tuned on smaller datasets for specific tasks like text classification, named entity recognition, machine translation, etc.


## Using Hugging Face Transformers Library:

- Hugging Face is a company that provides an open-source platform for natural language processing (NLP).
- Its Transformers library simplifies the use and management of pretrained models.
- The library gives easy access to a wide variety of pretrained models, including BERT, GPT, RoBERTa, T5, and more, through a unified interface for various NLP tasks.

- Users can easily download and use pretrained models from Hugging Face using a consistent and easy-to-use API. This allows researchers and developers to leverage powerful models without having to implement them from scratch.

- Hugging Face supports a wide range of models, architectures, and tasks, making it a go-to resource for the NLP community.

### BERT (Bidirectional Encoder Representations from Transformers):

- BERT is a pretrained model architecture developed by Google. It's designed to understand the context of words in a sentence by considering the surrounding words bidirectionally.
- BERT has been pretrained on a massive amount of text data and can be fine-tuned for specific NLP tasks. It has become a widely used and influential model in the NLP community.

### GPT (Generative Pretrained Transformer):
- GPT is another pretrained model architecture, developed by OpenAI. GPT is a generative model, meaning it can generate coherent and contextually relevant text.
- GPT, like BERT, has been pretrained on large datasets and can be fine-tuned for specific tasks. GPT-3, a later model in the series, is particularly known for its large scale and impressive language generation capabilities.
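
The difference between BERT's bidirectional context and GPT's left-to-right generation can be pictured as two attention masks. The following is a simplified sketch in plain Python, not code from either model: the `attention_mask` helper and the three-token example are assumptions made for illustration, and real implementations also handle padding and multiple attention heads.

```python
def attention_mask(num_tokens, causal):
    """Return a num_tokens x num_tokens grid.

    A 1 at [row][col] means the token at position `row` may attend to
    the token at position `col`.
    """
    return [
        [1 if (not causal or col <= row) else 0 for col in range(num_tokens)]
        for row in range(num_tokens)
    ]


tokens = ["the", "cat", "sat"]  # hypothetical example sentence

# BERT-style: every token sees every other token, left and right.
bidirectional = attention_mask(len(tokens), causal=False)

# GPT-style: each token sees only itself and the tokens before it.
left_to_right = attention_mask(len(tokens), causal=True)

for name, mask in [("bidirectional", bidirectional), ("causal", left_to_right)]:
    print(name)
    for token, row in zip(tokens, mask):
        print(f"  {token:>4} {row}")
```

The causal mask is what lets a GPT-style model generate text one token at a time, since no position depends on tokens that have not been produced yet.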

In [1]:
pip install transformers

Note: you may need to restart the kernel to use updated packages.


In [1]:
pip install --upgrade transformers

Collecting transformers
  Using cached transformers-4.34.1-py3-none-any.whl.metadata (121 kB)
Collecting huggingface-hub<1.0,>=0.16.4 (from transformers)
  Using cached huggingface_hub-0.18.0-py3-none-any.whl.metadata (13 kB)
Collecting tokenizers<0.15,>=0.14 (from transformers)
  Using cached tokenizers-0.14.1-cp311-none-win_amd64.w

In [2]:
# Import everything the cells below rely on in one place
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

### Text Generation:

In [4]:
from transformers import pipeline

# Load pre-trained text generation model
text_generation = pipeline("text-generation")

# Generate text
result = text_generation("Once upon a time in a", max_length=50, num_return_sequences=1)
print(result[0]['generated_text'])

No model was supplied, defaulted to gpt2 and revision 6c0e608 (https://huggingface.co/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.



Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Once upon a time in a galaxy far away, we see some vaping accessories, including e-juice that are sold by the likes of BestBuy, as a part of a small sample set of vaporizers available from a retail shop. The same


**"Once upon a time in a":** This is the input prompt or seed for text generation. The model will generate text starting from this input.

**max_length=50:** This parameter sets the maximum total length of the output sequence, measured in tokens (words or subwords). For a model such as GPT-2, the count includes the prompt's tokens, so the prompt and the generated continuation together are limited to 50 tokens.

**num_return_sequences=1:** This parameter specifies the number of different text sequences to generate. Here, it's set to 1, meaning you want one sequence as the output.
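
The log message printed by the text-generation cell recommends naming the model and revision explicitly. A minimal sketch of doing so follows; the model name and revision are copied from that log message, while the helper names `build_generator` and `generate` and the default of 40 new tokens are assumptions made for this example. It also uses `max_new_tokens`, which bounds only the continuation rather than the prompt plus continuation.

```python
# Values taken from the pipeline's own log message.
MODEL_NAME = "gpt2"
MODEL_REVISION = "6c0e608"


def build_generator():
    """Create a text-generation pipeline pinned to one model and revision."""
    # Imported here so that defining this helper does not load the library.
    from transformers import pipeline

    return pipeline("text-generation", model=MODEL_NAME, revision=MODEL_REVISION)


def generate(generator, prompt, new_tokens=40, num_sequences=1):
    """Return the generated strings for a prompt."""
    results = generator(
        prompt,
        max_new_tokens=new_tokens,           # limits only the newly generated tokens
        num_return_sequences=num_sequences,
        do_sample=num_sequences > 1,         # several distinct outputs require sampling
    )
    return [r["generated_text"] for r in results]
```

Calling `generate(build_generator(), "Once upon a time in a")` would download the model on first use and return a list containing one string.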

### Translation:

In [5]:
from transformers import pipeline

# Load pre-trained translation model and tokenizer
translation = pipeline("translation", model="Helsinki-NLP/opus-mt-en-hi")

# Translate text from English to Hindi
result = translation("Hello, how are you?")
print(result[0]['translation_text'])

हैलो, तुम कैसे हो?
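
Pipelines generally accept a list of inputs as well as a single string. The sketch below assumes the translation pipeline returns one dictionary with a `translation_text` key per input sentence, as it does for the single sentence above; the helper names `build_translator` and `translate_all` are hypothetical.

```python
def build_translator():
    """Create the English-to-Hindi translation pipeline used above."""
    # Imported here so that defining this helper does not load the library.
    from transformers import pipeline

    return pipeline("translation", model="Helsinki-NLP/opus-mt-en-hi")


def translate_all(translator, sentences):
    """Translate a list of sentences and return only the translated strings."""
    results = translator(sentences)
    return [r["translation_text"] for r in results]
```

For example, `translate_all(build_translator(), ["Hello, how are you?", "Good morning."])` would return a list of two Hindi strings.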


### Sentiment Analysis:

In [6]:
sentiment_analysis = pipeline("sentiment-analysis")
result = sentiment_analysis("I love using pre-trained NLP models!")
print(result)

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'label': 'POSITIVE', 'score': 0.9979884624481201}]
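
The result is a list with one dictionary per input, each holding a `label` and a `score` between 0 and 1. A small sketch of reading that structure follows; the `summarize` helper and its 0.9 cutoff are assumptions chosen for illustration, not part of the library.

```python
# Output copied from the sentiment-analysis cell above.
result = [{'label': 'POSITIVE', 'score': 0.9979884624481201}]


def summarize(predictions, min_score=0.9):
    """Format each prediction; min_score is an arbitrary illustrative cutoff."""
    lines = []
    for prediction in predictions:
        confidence = "confident" if prediction["score"] >= min_score else "uncertain"
        lines.append(f"{prediction['label']} ({prediction['score']:.1%}, {confidence})")
    return lines


print(summarize(result))
```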


### Named Entity Recognition (NER):

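In [7]:
ner_model = AutoModelForTokenClassification.from_pretrained("dbmdz/bert-large-cased-finetuned-conll03-english")
ner_tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-large-cased-finetuned-conll03-english")

sequence = "This is a good home"
inputs = ner_tokenizer(sequence, return_tensors="pt")

outputs = ner_model(**inputs)

# argmax over the label dimension gives one label id per token (not a vocabulary id)
label_ids = outputs.logits.argmax(dim=2)[0].tolist()
tokens = ner_tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())

# Map each label id to its tag name through the model configuration
predictions = [(token, ner_model.config.id2label[label_id]) for token, label_id in zip(tokens, label_ids)]
print(predictions)

Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

**outputs.logits.argmax(dim=2):** This picks the highest-scoring label id for each token. These are label ids, not vocabulary ids, so they are mapped to tag names through `ner_model.config.id2label`. Passing them to `ner_tokenizer.batch_decode` instead prints `['[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]']`, because every token in this sentence is assigned label id 0 and vocabulary id 0 is `[PAD]`.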
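
A token-classification model produces one score per label for every token, so its predictions have to be read as label ids and mapped to tag names. The sketch below shows that step, plus grouping `B-`/`I-` tags into entities, in plain Python. The label set, tokens, and scores are hypothetical values invented for illustration and do not come from the model above.

```python
# Hypothetical label set and per-token scores, for illustration only.
id2label = {0: "O", 1: "B-PER", 2: "I-PER", 3: "B-LOC", 4: "I-LOC"}
tokens = ["Ada", "Lovelace", "lived", "in", "London"]
logits = [
    [0.1, 2.5, 0.3, 0.0, 0.0],
    [0.2, 0.4, 3.1, 0.0, 0.1],
    [4.0, 0.1, 0.2, 0.1, 0.0],
    [3.7, 0.0, 0.1, 0.3, 0.2],
    [0.3, 0.1, 0.0, 2.9, 0.4],
]


def argmax(scores):
    """Index of the largest score in a list."""
    return max(range(len(scores)), key=scores.__getitem__)


# One label id per token, then one tag name per token.
tags = [id2label[argmax(row)] for row in logits]


def group_entities(tokens, tags):
    """Merge a B- tag and the matching I- tags that follow it into one entity."""
    entities, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            current = (tag[2:], [token])
            entities.append(current)
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(token)
        else:
            current = None  # "O" or an I- tag with no open entity ends the span
    return [(kind, " ".join(words)) for kind, words in entities]


print(list(zip(tokens, tags)))
print(group_entities(tokens, tags))
```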
