<p style="text-align:center">
    <a href="https://skills.network" target="_blank">
    <img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/assets/logos/SN_web_lightmode.png" width="200" alt="Skills Network Logo"  />
    </a>
</p>


# **Loading Models and Inference with Hugging Face Inferences**


Estimated time needed: **20** minutes


In this lab, you will explore how to leverage the Hugging Face `transformers` library for various natural language processing (NLP) tasks. You will start by performing text classification and text generation using pretrained models like DistilBERT and GPT-2 without using the `pipeline()` function, understanding the steps involved in loading models, tokenizing input, performing inference, and processing outputs. Then, you will discover the simplicity and efficiency of using the `pipeline()` function to accomplish the same tasks with minimal code. By comparing both approaches, you will appreciate how the `pipeline()` function streamlines the process, making it easier and faster to implement NLP solutions.


## __Table of Contents__

<ol>
    <li><a href="#Objectives">Objectives</a></li>
    <li>
        <a href="#Setup">Setup</a>
        <ol>
            <li><a href="#Installing-required-libraries">Installing required libraries</a></li>
            <li><a href="#Importing-required-libraries">Importing required libraries</a></li>
        </ol>
    </li>
    <li>
        <a href="#Text-classification-with-DistilBERT">Text classification with DistilBERT</a>
    </li>
    <li>
        <a href="#Text-generation-with-GPT-2">Text generation with GPT-2</a>
    </li>
    <li>
        <a href="#Hugging-Face-pipeline()-function">Hugging Face `pipeline()` function</a>
        <ol>
            <li><a href="#Definition">Definition</a></li>
            <li><a href="#Parameters">Parameters</a></li>
            <li><a href="#Task-types">Task types</a>
                <ol>
                    <li><a href="#Example-1:-Text-classification-using-pipeline()">Example 1: Text classification using `pipeline()`</a>
                    </li>
                    <li><a href="#Example-2:-Language-detection-using-pipeline()">Example 2: Language detection using `pipeline()`</a>
                    </li>
                    <li><a href="#Example-3:-Text-generation-using-pipeline()">Example 3: Text generation using `pipeline()`</a>
                    </li>
                    <li><a href="#Example-4:-Text-generation-using-T5-with-pipeline()">Example 4: Text generation using T5 with `pipeline()`</a>
                    </li>
                </ol>
            </li>
            <li><a href="#Benefits-of-using-pipeline()">Benefits of using `pipeline()`</a></li>
            <li><a href="#When-to-use-pipeline()">When to use `pipeline()`</a></li>
            <li><a href="#When-to-avoid-pipeline()">When to avoid `pipeline()`</a></li>
        </ol>
    </li>
    <li>
        <a href="#Exercise:-Fill-mask-task-using-BERT-with-pipeline()">Exercise: Fill-mask task using BERT with `pipeline()`</a>
    </li>
</ol>


## Objectives

- Learn to set up and use the Hugging Face `transformers` library.
- Perform text classification and text generation using DistilBERT and GPT-2 models without `pipeline()`.
- Understand and utilize the `pipeline()` function to simplify various NLP tasks.
- Compare the ease of using models directly versus using the `pipeline()` function.


----


## Setup


For this lab, you will be using the following libraries:

*   [`torch`](https://pytorch.org/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMML0187ENSkillsNetwork31430127-2021-01-01) for deep learning and neural network modeling.
*   [`transformers`](https://huggingface.co/transformers/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMML0187ENSkillsNetwork31430127-2021-01-01) for accessing pretrained models and performing various NLP tasks with ease.


### Installing required libraries


In [1]:
!pip install torch
!pip install transformers

Collecting torch
  Downloading torch-2.5.1-cp311-cp311-manylinux1_x86_64.whl.metadata (28 kB)
Collecting filelock (from torch)
  Downloading filelock-3.16.1-py3-none-any.whl.metadata (2.9 kB)
Collecting networkx (from torch)
  Downloading networkx-3.4.2-py3-none-any.whl.metadata (6.3 kB)
Collecting fsspec (from torch)
  Downloading fsspec-2024.12.0-py3-none-any.whl.metadata (11 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)

### Importing required libraries

_It is recommended that you import all required libraries in one place (here):_


In [2]:
from transformers import pipeline
from transformers import DistilBertForSequenceClassification, DistilBertTokenizer
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

# You can also use this section to suppress warnings generated by your code:
def warn(*args, **kwargs):
    pass
import warnings
warnings.warn = warn
warnings.filterwarnings('ignore')

# Text classification with DistilBERT


## Load the model and tokenizer

First, let's initialize a tokenizer and a model for sentiment analysis using DistilBERT fine-tuned on the SST-2 dataset. This setup is useful for tasks where you need to quickly classify the sentiment of a piece of text with a pretrained, efficient transformer model.


In [3]:
# Load the tokenizer and model
tokenizer = DistilBertTokenizer.from_pretrained(pretrained_model_name_or_path="distilbert-base-uncased-finetuned-sst-2-english")
model = DistilBertForSequenceClassification.from_pretrained(pretrained_model_name_or_path="distilbert-base-uncased-finetuned-sst-2-english")

tokenizer_config.json:   0%|          | 0.00/48.0 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/232k [00:00<?, ?B/s]

config.json:   0%|          | 0.00/629 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/268M [00:00<?, ?B/s]

## Preprocess the input text
Tokenize the input text and convert it to a format suitable for the model:


`return_tensors="pt"`: Returns the tokenized output as PyTorch tensors (`torch.Tensor`).

Other options include `"tf"` for TensorFlow tensors (`tf.Tensor`) and `"np"` for NumPy arrays (`np.ndarray`).


`attention_mask`: Indicates which tokens should be attended to (i.e., not ignored) by the model.

In [4]:
# Sample text
text = "Congratulations! You've won a free ticket to the Bahamas. Reply WIN to claim."

# Tokenize the input text
inputs = tokenizer(text=text, return_tensors="pt")

print(inputs)

{'input_ids': tensor([[  101, 23156,   999,  2017,  1005,  2310,  2180,  1037,  2489,  7281,
          2000,  1996, 17094,  1012,  7514,  2663,  2000,  4366,  1012,   102]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}
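To make the mapping from text to `input_ids` concrete, here is a minimal, framework-free sketch of the lookup step. It is not the real DistilBERT tokenizer: `TOY_VOCAB` and `toy_encode` are hypothetical names, the vocabulary is a small subset whose IDs were read off the printed output above by position, and real WordPiece tokenization also splits unknown words into subword pieces, which this sketch does not attempt.

```python
import re

# Toy vocabulary (assumption): IDs inferred by aligning the sample sentence
# with the `input_ids` printed above. 101 and 102 mark the start and end.
TOY_VOCAB = {
    "[CLS]": 101, "[SEP]": 102,
    "congratulations": 23156, "!": 999, "you": 2017, "'": 1005, "ve": 2310,
    "won": 2180, "a": 1037, "free": 2489, "ticket": 7281, "to": 2000,
    "the": 1996, "bahamas": 17094, ".": 1012, "reply": 7514, "win": 2663,
    "claim": 4366,
}

def toy_encode(text):
    """Lowercase, split into words and punctuation, then look up each ID."""
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    tokens = ["[CLS]"] + tokens + ["[SEP]"]
    input_ids = [TOY_VOCAB[token] for token in tokens]
    # No padding is involved here, so every position is a real token.
    attention_mask = [1] * len(input_ids)
    return {"input_ids": input_ids, "attention_mask": attention_mask}

toy_encoded = toy_encode(
    "Congratulations! You've won a free ticket to the Bahamas. Reply WIN to claim."
)
print(toy_encoded["input_ids"])
```

The sketch reproduces the 20 IDs shown in the cell output, including the repeated IDs for the two occurrences of "to" and of the period.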


The `input_ids` are the indexes of the tokens in the model's vocabulary. The `attention_mask` is essential for correctly processing padded sequences: it tells the model which positions hold actual content (1) and which hold padding (0). Here the mask is all ones because a single sequence needs no padding, but the mask becomes critical when sequences of different lengths are batched together.
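The role of the mask is easier to see with a batch of unequal-length sequences. The following is a minimal sketch in plain Python, not the tokenizer's own padding code: `pad_batch` is a hypothetical helper, the token IDs are illustrative, and the padding ID of 0 is an assumption that matches the usual `[PAD]` entry of BERT-style vocabularies.

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad lists of token IDs to a common length and build the masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}

toy_batch = pad_batch([[101, 2017, 2180, 102], [101, 7514, 2663, 2000, 4366, 102]])
print(toy_batch["input_ids"])
print(toy_batch["attention_mask"])
```

The shorter sequence gains two padding positions, and its mask ends in two zeros so those positions can be excluded from attention.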


###  Perform inference
The `torch.no_grad()` context manager disables gradient calculation. This reduces memory consumption and speeds up computation, because gradients are not needed for inference (that is, when you are not training the model). The `**inputs` syntax unpacks a dictionary into keyword arguments, so `model(**inputs)` passes `input_ids` and `attention_mask` to the model as separate named arguments.


In [9]:
# Perform inference
with torch.no_grad():
    outputs = model(**inputs)
    print(type(outputs))
    print(outputs)

<class 'transformers.modeling_outputs.SequenceClassifierOutput'>
SequenceClassifierOutput(loss=None, logits=tensor([[-3.9954,  4.3336]]), hidden_states=None, attentions=None)


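Alternatively, you can pass `input_ids` and `attention_mask` as explicit keyword arguments. Because this call runs outside `torch.no_grad()`, the returned logits carry a `grad_fn`.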


In [7]:
model(input_ids=inputs['input_ids'], attention_mask=inputs['attention_mask'])

SequenceClassifierOutput(loss=None, logits=tensor([[-3.9954,  4.3336]], grad_fn=<AddmmBackward0>), hidden_states=None, attentions=None)
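The two calling styles are equivalent as far as argument passing goes. The sketch below shows this with plain Python; `toy_forward` is a hypothetical stand-in for a model's forward call and does no real computation.

```python
def toy_forward(input_ids, attention_mask=None):
    """Hypothetical stand-in for a model call; counts the attended tokens."""
    if attention_mask is None:
        attention_mask = [1] * len(input_ids)
    return {"attended_tokens": sum(attention_mask)}

toy_inputs = {"input_ids": [101, 2017, 102, 0], "attention_mask": [1, 1, 1, 0]}

# Dictionary unpacking: each key becomes a keyword argument.
unpacked = toy_forward(**toy_inputs)

# The same call with the keyword arguments written out by hand.
explicit = toy_forward(
    input_ids=toy_inputs["input_ids"],
    attention_mask=toy_inputs["attention_mask"],
)

print(unpacked == explicit)
```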

#### Get the logits
The logits are the raw, unnormalized predictions of the model. Let's extract the logits from the model's outputs to perform further processing, such as determining the predicted class or calculating probabilities.


In [11]:
logits = outputs.logits
print(logits)
print(f"Logits shape = {logits.shape}")

tensor([[-3.9954,  4.3336]])
Logits shape = torch.Size([1, 2])


## Post-process the output
Convert the logits to probabilities and get the predicted class:


In [12]:
# Convert logits to probabilities
probs = torch.softmax(input=logits, dim=-1)
print(probs)
# Get the predicted class
predicted_class = torch.argmax(input=probs, dim=-1)

# Map the predicted class index to its label
# (.item() converts the one-element tensor to a plain Python int)
labels = ["NEGATIVE", "POSITIVE"]
predicted_label = labels[predicted_class.item()]

print(f"Predicted label: {predicted_label}")

tensor([[2.4134e-04, 9.9976e-01]])
Predicted label: POSITIVE
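As a sanity check, the softmax step can be reproduced by hand. This is a minimal sketch in plain Python using the logits printed earlier; `toy_softmax` is a hypothetical helper and is not how `torch.softmax` is implemented internally.

```python
import math

def toy_softmax(values):
    """Exponentiate and normalize so the results sum to 1."""
    peak = max(values)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [-3.9954, 4.3336]  # copied from the logits printed above
toy_probs = toy_softmax(toy_logits)

# argmax: index of the largest probability
toy_class = max(range(len(toy_probs)), key=toy_probs.__getitem__)
print(toy_probs)
print(["NEGATIVE", "POSITIVE"][toy_class])
```

The second probability comes out at roughly 0.99976, in line with the tensor printed by the cell above.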


# Text generation with GPT-2 


## Load tokenizer
Load the pretrained GPT-2 tokenizer. The tokenizer converts text into the token IDs that the model expects.


In [13]:
# Load the tokenizer
tokenizer = GPT2Tokenizer.from_pretrained(pretrained_model_name_or_path="gpt2")

tokenizer_config.json:   0%|          | 0.00/26.0 [00:00<?, ?B/s]

vocab.json:   0%|          | 0.00/1.04M [00:00<?, ?B/s]

merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.36M [00:00<?, ?B/s]

config.json:   0%|          | 0.00/665 [00:00<?, ?B/s]

Load the pretrained GPT-2 model with a language modeling head. The model generates text based on the input tokens.


In [14]:
# Load the model

model = GPT2LMHeadModel.from_pretrained(pretrained_model_name_or_path="gpt2")

model.safetensors:   0%|          | 0.00/548M [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/124 [00:00<?, ?B/s]

## Preprocess the input text  
Tokenize the input text and convert it to a format suitable for the model. As before, the result (`inputs`) contains the token indexes and the attention mask.


In [15]:
# Prompt
prompt = "Once upon a time"

# Tokenize the input text
inputs = tokenizer(text=prompt, return_tensors="pt")
inputs

{'input_ids': tensor([[7454, 2402,  257,  640]]), 'attention_mask': tensor([[1, 1, 1, 1]])}

## Perform inference  
Generate text using the model's `generate()` method with the following arguments:

```inputs:``` Input token IDs from the tokenizer

```attention_mask:``` Mask indicating which tokens to attend to

```pad_token_id:``` Padding token ID, set here to the end-of-sequence token ID

```max_length:``` Maximum length of the generated sequences, counting the prompt tokens

```num_return_sequences:``` Number of sequences to generate


In [18]:
# Generate text
output_ids = model.generate(
    inputs=inputs.input_ids, 
    attention_mask=inputs.attention_mask,
    pad_token_id=tokenizer.eos_token_id, # 50256
    max_length=50, 
    num_return_sequences=1
)

print(output_ids.shape)
output_ids

torch.Size([1, 50])


tensor([[7454, 2402,  257,  640,   11,  262,  995,  373,  257, 1295,  286, 1049,
         8737,  290, 1049, 3514,   13,  383,  995,  373,  257, 1295,  286, 1049,
         3514,   11,  290,  262,  995,  373,  257, 1295,  286, 1049, 3514,   13,
          383,  995,  373,  257, 1295,  286, 1049, 3514,   11,  290,  262,  995,
          373,  257]])

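Alternatively, you can call the model directly. Note that this runs a single forward pass and returns the next-token logits for each input position; it does not generate a sequence the way `generate()` does:


```python
with torch.no_grad():
    outputs = model(**inputs)

outputs
```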


## Post-process the output  
Decode the generated tokens to get the text:


In [19]:
# Decode the generated text
print(f"Output Id shape = {output_ids.shape}")
generated_text = tokenizer.decode(token_ids=output_ids[0], skip_special_tokens=True)

print(generated_text)

Output Id shape = torch.Size([1, 50])
Once upon a time, the world was a place of great beauty and great danger. The world was a place of great danger, and the world was a place of great danger. The world was a place of great danger, and the world was a
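The repetition in the generated text is typical of greedy decoding, where the single most likely next token is chosen at every step. With the arguments used above (no sampling, one returned sequence), `generate()` most likely decodes greedily, though that is an assumption about the default settings. The sketch below shows the idea with a toy lookup table instead of GPT-2: `TOY_SCORES` and `greedy_generate` are hypothetical, and the scores are made up for illustration.

```python
# Hypothetical next-token scores standing in for a language model.
TOY_SCORES = {
    "time": {",": 0.9, "the": 0.1},
    ",": {"the": 0.8, "a": 0.2},
    "the": {"world": 0.7, "time": 0.3},
    "world": {"was": 0.9, ",": 0.1},
    "was": {"the": 0.6, "a": 0.4},
}

def greedy_generate(prompt_tokens, max_length):
    """Append the highest-scoring next token until max_length is reached."""
    tokens = list(prompt_tokens)
    while len(tokens) < max_length:  # the prompt counts toward max_length
        candidates = TOY_SCORES.get(tokens[-1])
        if not candidates:
            break  # no known continuation for this token
        tokens.append(max(candidates, key=candidates.get))
    return tokens

toy_generated = greedy_generate(["Once", "upon", "a", "time"], max_length=12)
print(" ".join(toy_generated))
```

Because the same token always leads to the same choice, the toy output falls into a loop ("the world was the world was ..."), much like the GPT-2 output above. The four prompt tokens count toward `max_length`, so only eight new tokens are added.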


# Hugging Face `pipeline()` function

The `pipeline()` function from the Hugging Face `transformers` library is a high-level API designed to simplify the usage of pretrained models for various natural language processing (NLP) tasks. It abstracts the complexities of model loading, tokenization, inference, and post-processing, allowing users to perform complex NLP tasks with just a few lines of code.
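Conceptually, a pipeline chains three stages: preprocessing (tokenization), a forward pass, and post-processing. The following framework-free sketch mimics that structure with a toy keyword scorer. Everything in it (`toy_pipeline`, the word lists, the scoring rule) is hypothetical and only illustrates the flow of data, not how the `transformers` library implements it.

```python
import math

# Made-up keyword lists standing in for a trained model.
POSITIVE_WORDS = {"won", "free", "congratulations", "great"}
NEGATIVE_WORDS = {"lost", "bad", "terrible", "scam"}

def preprocess(text):
    """Stage 1: turn raw text into normalized tokens."""
    return [word.strip(".,!?'\"").lower() for word in text.split()]

def forward(tokens):
    """Stage 2: produce raw scores (logits) for [NEGATIVE, POSITIVE]."""
    neg = sum(token in NEGATIVE_WORDS for token in tokens)
    pos = sum(token in POSITIVE_WORDS for token in tokens)
    return [float(neg), float(pos)]

def postprocess(logits):
    """Stage 3: softmax, then report the best label and its score."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    probs = [e / sum(exps) for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return [{"label": ["NEGATIVE", "POSITIVE"][best], "score": probs[best]}]

def toy_pipeline(text):
    return postprocess(forward(preprocess(text)))

toy_result = toy_pipeline("Congratulations! You've won a free ticket.")
print(toy_result)
```

The return value is a list of dictionaries with `label` and `score` keys, which is the same shape that the text classification pipeline produces.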

## Definition

```python
transformers.pipeline(
    task: str,
    model: Optional = None,
    config: Optional = None,
    tokenizer: Optional = None,
    feature_extractor: Optional = None,
    framework: Optional = None,
    revision: str = 'main',
    use_fast: bool = True,
    model_kwargs: Dict[str, Any] = None,
    **kwargs
)
```

## Parameters

- **task**: `str`
  - The task to perform, such as "text-classification", "text-generation", "question-answering", etc.
  - Example: `"text-classification"`

- **model**: `Optional`
  - The model to use. This can be a string (model identifier from Hugging Face model hub), a path to a directory containing model files, or a pre-loaded model instance.
  - Example: `"distilbert-base-uncased-finetuned-sst-2-english"`

- **config**: `Optional`
  - The configuration to use. This can be a string, a path to a directory, or a pre-loaded config object.
  - Example: `{"output_attentions": True}`

- **tokenizer**: `Optional`
  - The tokenizer to use. This can be a string, a path to a directory, or a pre-loaded tokenizer instance.
  - Example: `"bert-base-uncased"`

- **feature_extractor**: `Optional`
  - The feature extractor to use for tasks that require it (e.g., image processing).
  - Example: `"facebook/detectron2"`

- **framework**: `Optional`
  - The framework to use, either `"pt"` for PyTorch or `"tf"` for TensorFlow. If not specified, it will be inferred.
  - Example: `"pt"`

- **revision**: `str`, default `'main'`
  - The specific model version to use (branch, tag, or commit hash).
  - Example: `"v1.0"`

- **use_fast**: `bool`, default `True`
  - Whether to use the fast version of the tokenizer if available.
  - Example: `True`

- **model_kwargs**: `Dict[str, Any]`, default `None`
  - Additional keyword arguments passed to the model during initialization.
  - Example: `{"output_hidden_states": True}`

- **kwargs**: `Any`
  - Additional keyword arguments passed to the pipeline components.

## Task types

The `pipeline()` function supports a wide range of NLP tasks. Here are some of the common tasks:

1. **Text Classification**: `text-classification`
   - **Purpose**: Classify text into predefined categories.
   - **Use Cases**: Sentiment analysis, spam detection, topic classification.


2. **Text Generation**: `text-generation`
   - **Purpose**: Generate coherent text based on a given prompt.
   - **Use Cases**: Creative writing, dialogue generation, story completion.


3. **Question Answering**: `question-answering`
   - **Purpose**: Answer questions based on a given context.
   - **Use Cases**: Building Q&A systems, information retrieval from documents.


4. **Named Entity Recognition (NER)**: `ner` (or `token-classification`)
   - **Purpose**: Identify and classify named entities (like people, organizations, locations) in text.
   - **Use Cases**: Extracting structured information from unstructured text.


5. **Summarization**: `summarization`
   - **Purpose**: Summarize long pieces of text into shorter, coherent summaries.
   - **Use Cases**: Document summarization, news summarization.

6. **Translation**: `translation_xx_to_yy` (e.g., `translation_en_to_fr`)
   - **Purpose**: Translate text from one language to another.
   - **Use Cases**: Language translation, multilingual applications.


7. **Fill-Mask**: `fill-mask`
   - **Purpose**: Predict masked words in a sentence (useful for masked language modeling).
   - **Use Cases**: Language modeling tasks, understanding model predictions.


8. **Zero-Shot Classification**: `zero-shot-classification`
   - **Purpose**: Classify text into categories without needing training data for those categories.
   - **Use Cases**: Flexible and adaptable classification tasks.


9. **Feature Extraction**: `feature-extraction`
   - **Purpose**: Extract hidden state features from text.
   - **Use Cases**: Downstream tasks requiring text representations, such as clustering, similarity, or further custom model training.


### Example 1: Text classification using `pipeline()`

In this example, you will use the `pipeline()` function to perform text classification. You will load a pretrained text classification model and use it to classify a sample text.

#### Load the text classification model:
We initialize the pipeline for the `text-classification` task, specifying the model `"distilbert-base-uncased-finetuned-sst-2-english"`. This model is fine-tuned for sentiment analysis.

#### Classify the sample text:
We use the classifier to classify a sample text: "Congratulations! You've won a free ticket to the Bahamas. Reply WIN to claim." The `classifier` function returns the classification result, which is then printed.


In [20]:
# Load a general text classification model
classifier = pipeline(task="text-classification", model="distilbert-base-uncased-finetuned-sst-2-english")

# Classify a sample text
result = classifier("Congratulations! You've won a free ticket to the Bahamas. Reply WIN to claim.")
print(result)

Device set to use cpu


[{'label': 'POSITIVE', 'score': 0.9997586607933044}]


#### Output

The output will be a list of dictionaries, where each dictionary contains:

- `label`: The predicted label (e.g., "POSITIVE" or "NEGATIVE").
- `score`: The confidence score for the prediction.
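Since the result is an ordinary list of dictionaries, it can be handled with plain Python. As a small sketch, the hypothetical helper below keeps only predictions whose score reaches a threshold; the sample data is copied from the output above, and the threshold values are arbitrary.

```python
# Copied from the classifier output printed above.
example_result = [{"label": "POSITIVE", "score": 0.9997586607933044}]

def confident_predictions(predictions, threshold=0.9):
    """Return (label, rounded score) pairs at or above the threshold."""
    return [
        (p["label"], round(p["score"], 4))
        for p in predictions
        if p["score"] >= threshold
    ]

print(confident_predictions(example_result))
print(confident_predictions(example_result, threshold=0.9999))
```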


### Example 2: Language detection using `pipeline()`

In this example, you will use the `pipeline()` function to perform language detection. You will load a pretrained language detection model and use it to identify the language of a sample text.

#### Load the language detection model:
We initialize the pipeline for the `text-classification` task, specifying the model `"papluca/xlm-roberta-base-language-detection"`. This model is fine-tuned for language detection.

#### Classify the sample text:
We use the classifier to detect the language of a sample text: "Bonjour, comment ça va?" The `classifier` function returns the classification result, which is then printed.


In [21]:
from transformers import pipeline

classifier = pipeline(task="text-classification", model="papluca/xlm-roberta-base-language-detection")
result = classifier("Bonjour, comment ça va?")
print(result)

config.json:   0%|          | 0.00/1.42k [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/1.11G [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/502 [00:00<?, ?B/s]

sentencepiece.bpe.model:   0%|          | 0.00/5.07M [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/9.08M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/239 [00:00<?, ?B/s]

Device set to use cpu


[{'label': 'fr', 'score': 0.9934879541397095}]


#### Output
The output will be a list of dictionaries, where each dictionary contains:

- `label`: The predicted language label (e.g., "fr" for French).
- `score`: The confidence score for the prediction.


### Example 3: Text generation using `pipeline()`

In this example, you will use the `pipeline()` function to perform text generation. You will load a pretrained text generation model and use it to generate text based on a given prompt.

#### Initialize the text generation model:
We initialize the pipeline for the `text-generation` task, specifying the model `"gpt2"`. GPT-2 is a well-known model for text generation tasks.


In [22]:
# Initialize the text generation pipeline with GPT-2
generator = pipeline(task="text-generation", model="gpt2")

Device set to use cpu


#### Generate text based on a given prompt:
We use the generator to generate text based on a prompt: "Once upon a time". Let's specify `max_length=50` to cap the total length (prompt plus generated tokens) at 50 tokens, `truncation=True` so that the tokenizer truncates a prompt that is too long, and `num_return_sequences=1` to generate one sequence. Calling `generator` returns the generated text, which is then printed.


In [23]:
# Generate text based on a given prompt
prompt = "Once upon a time"
result = generator(prompt, max_length=50, num_return_sequences=1, truncation=True)

# Print the generated text
print(result[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Once upon a time, as a rule for some odd reason, there are many (often extremely bizarre) creatures that try to take over other people's space. The main thing is that the original space, while very strange, is now a relatively safe


#### Output
The output will be a list of dictionaries, where each dictionary contains:

- `generated_text`: The generated text based on the input prompt.


### Example 4: Text generation using T5 with `pipeline()`

In this example, you will use the `pipeline()` function to perform text-to-text generation with the T5 model. You will load a pretrained T5 model and use it to translate a sentence from English to French based on a given prompt.

#### Initialize the text generation model:
We initialize the pipeline for the `text2text-generation` task, specifying the model `"t5-small"`. T5 is a versatile model that can perform various text-to-text generation tasks, including translation.


In [24]:
# Initialize the text generation pipeline with T5
generator = pipeline(task="text2text-generation", model="t5-small")

config.json:   0%|          | 0.00/1.21k [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/242M [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/147 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/2.32k [00:00<?, ?B/s]

spiece.model:   0%|          | 0.00/792k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.39M [00:00<?, ?B/s]

Device set to use cpu


#### Generate text based on a given prompt:
We use the generator to translate a sentence from English to French based on the prompt: "translate English to French: How are you?". Let's specify `max_length=50` to limit the generated text to 50 tokens and `num_return_sequences=1` to generate one sequence. The `generator` function returns the translated text, which is then printed.


In [25]:
# Generate text based on a given prompt
prompt = "translate English to French: How are you?"
result = generator(prompt, max_length=50, num_return_sequences=1)

# Print the generated text
print(result[0]['generated_text'])

Comment êtes-vous?


#### Output
The output will be a list of dictionaries, where each dictionary contains:

- `generated_text`: The generated text based on the input prompt.
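T5 selects its task from a plain-text prefix at the start of the prompt, such as `translate English to French:`. The helper below is a minimal sketch for building such prompts. `build_t5_prompt` is a hypothetical function, and which prefixes a given checkpoint actually understands depends on how it was trained, so treat the supported tasks here as assumptions.

```python
def build_t5_prompt(task, text, source_lang=None, target_lang=None):
    """Prepend a T5-style task prefix to the input text."""
    if task == "translate":
        if not source_lang or not target_lang:
            raise ValueError("Translation needs source_lang and target_lang.")
        return f"translate {source_lang} to {target_lang}: {text}"
    if task == "summarize":
        return f"summarize: {text}"
    raise ValueError(f"Unsupported task: {task}")

t5_prompt = build_t5_prompt("translate", "How are you?", "English", "French")
print(t5_prompt)
```

The resulting string matches the prompt used in the example above and could be passed to the `text2text-generation` pipeline in the same way.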


## Benefits of using `pipeline()`

- **Reduced Boilerplate Code**: Simplifies the code required to perform NLP tasks.
- **Improved Readability**: Makes code more readable and expressive.
- **Time Efficiency**: Saves time by handling model loading, tokenization, inference, and post-processing automatically.
- **Consistent API**: Provides a consistent API across different tasks, allowing for easy experimentation and quick prototyping.
- **Automatic Framework Handling**: Automatically handles the underlying framework (TensorFlow or PyTorch).

## When to use `pipeline()`

- **Quick Prototyping**: When you need to quickly prototype an NLP application or experiment with different models.
- **Simple Tasks**: When performing simple or common NLP tasks that are well-supported by the `pipeline()` function.
- **Deployment**: When deploying NLP models in environments where simplicity and ease of use are crucial.

## When to avoid `pipeline()`

- **Custom Tasks**: When you need to perform highly customized tasks that are not well-supported by the `pipeline()` function.
- **Performance Optimization**: When you need fine-grained control over the model and tokenization process for performance optimization or specific use cases.


# Exercise: Fill-mask task using BERT with `pipeline()`

In this exercise, you will use the `pipeline()` function to perform a fill-mask task using the BERT model. You will load a pretrained BERT model and use it to predict the masked word in a given sentence.


### Instructions

1. **Initialize the fill-mask pipeline** with the BERT model.
2. **Create a prompt** with a masked token.
3. **Generate text** by filling in the masked token.
4. **Print the generated text** with the predictions.


In [26]:
from transformers import pipeline

fill_mask = pipeline(task="fill-mask", model="bert-large-cased-whole-word-masking")

prompt = "The quick brown fox jump over the [MASK]>"
result = fill_mask(prompt)
print(result)
print(result[0]['token_str'])

config.json:   0%|          | 0.00/625 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/1.34G [00:00<?, ?B/s]

BertForMaskedLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.
  - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).
  - If you are not the owner of the model architecture class, please contact the model code owner to update it.
Some weights of the model checkpoint at bert-large-cased-whole-word-masking were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or wi

tokenizer_config.json:   0%|          | 0.00/49.0 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/213k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/436k [00:00<?, ?B/s]

Device set to use cpu


[{'score': 0.377121239900589, 'token': 8617, 'token_str': 'fence', 'sequence': 'The quick brown fox jump over the fence >'}, {'score': 0.06637156009674072, 'token': 3676, 'token_str': 'dog', 'sequence': 'The quick brown fox jump over the dog >'}, {'score': 0.018470467999577522, 'token': 2095, 'token_str': 'wall', 'sequence': 'The quick brown fox jump over the wall >'}, {'score': 0.015676548704504967, 'token': 6164, 'token_str': 'wolf', 'sequence': 'The quick brown fox jump over the wolf >'}, {'score': 0.01443182211369276, 'token': 4665, 'token_str': 'hill', 'sequence': 'The quick brown fox jump over the hill >'}]


fence
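Each prediction in the list printed above is a dictionary with `score`, `token`, `token_str`, and `sequence` keys, so the candidates can be ranked and trimmed with plain Python. The sketch below uses the scores and tokens copied from that output; `top_tokens` is a hypothetical helper, and the `token` and `sequence` fields are left out for brevity.

```python
# Scores and tokens copied from the fill-mask output above.
mask_predictions = [
    {"score": 0.377121239900589, "token_str": "fence"},
    {"score": 0.06637156009674072, "token_str": "dog"},
    {"score": 0.018470467999577522, "token_str": "wall"},
    {"score": 0.015676548704504967, "token_str": "wolf"},
    {"score": 0.01443182211369276, "token_str": "hill"},
]

def top_tokens(predictions, k=3):
    """Return the k highest-scoring candidate tokens with rounded scores."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [(p["token_str"], round(p["score"], 3)) for p in ranked[:k]]

print(top_tokens(mask_predictions))
```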

## Authors


[Ashutosh Sagar](https://www.linkedin.com/in/ashutoshsagar/) is completing his MS in CS from Dalhousie University. He has previous experience working with Natural Language Processing and as a Data Scientist.


© Copyright IBM Corporation. All rights reserved.
