Gentle Introduction to Hugging Face Transformers Library
===

This notebook demonstrates the code examples from this [article](https://tobeadatascientist.substack.com/p/gentle-introduction-to-hugging-face-transformers), showcasing the before and after of each technique.

For more resources like this, visit [tobeadatascientist.com](https://tobeadatascientist.com).



---



For learning purposes, consider smaller, lightweight models like:

**GPT-2:** A compact and efficient model for text generation.

**DistilBERT:** Great for classification tasks with reduced memory requirements.

*Using these models keeps loading times short, making it easier to explore Hugging Face’s capabilities without buying an expensive GPU or paying for Google Colab Pro.*
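
To make the memory point concrete, here is a rough back-of-the-envelope estimate. It assumes the commonly cited parameter counts of about 124 million for the smallest GPT-2 checkpoint and about 66 million for DistilBERT base, stored as 32-bit floats (4 bytes each). The helper name `estimate_weights_mb` is made up for this sketch, and the real footprint will be somewhat higher once activations and framework overhead are included.

```python
# Rough size of a model's weights in memory, assuming 32-bit floats.
BYTES_PER_FLOAT32 = 4


def estimate_weights_mb(num_parameters: int, bytes_per_param: int = BYTES_PER_FLOAT32) -> float:
    """Return the approximate weight size in megabytes (1 MB = 1,000,000 bytes)."""
    return num_parameters * bytes_per_param / 1_000_000


# Approximate, commonly cited parameter counts (assumptions, not measured here).
approx_params = {
    "gpt2": 124_000_000,
    "distilbert-base-uncased": 66_000_000,
}

for name, count in approx_params.items():
    print(f"{name}: ~{estimate_weights_mb(count):.0f} MB of weights")
```

Under these assumptions both models need only a few hundred megabytes for their weights, which is why they run comfortably on a CPU.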

# Method 1: Using the pipeline API

In [1]:
from transformers import pipeline

# Initialize the text-generation pipeline
generator = pipeline("text-generation", model="gpt2")

# Generate text (max_length counts the prompt tokens plus the generated ones)
prompt = "In Python, a list comprehension is a more concise way to"
output = generator(prompt, max_length=30)

Device set to use cpu
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In [2]:
# Print the generated text
print(f"Input: {prompt}")
print(f"Generated text: {output[0]['generated_text']}")

Input: In Python, a list comprehension is a more concise way to
Generated text: In Python, a list comprehension is a more concise way to extract an element from a list. For example, the following code might look like this:
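
As the output above shows, `generated_text` starts with the prompt itself. If only the continuation is needed, one option is to slice the prompt off. The helper below is a hypothetical convenience function, not part of the library. It works on the same list-of-dicts shape that `generator(...)` returned above, and the sample data is a hard-coded stand-in for that output so the snippet runs without downloading a model.

```python
def continuation_only(prompt: str, results: list[dict]) -> list[str]:
    """Strip the leading prompt from each pipeline result, if present."""
    continuations = []
    for item in results:
        text = item["generated_text"]
        if text.startswith(prompt):
            text = text[len(prompt):]
        continuations.append(text.strip())
    return continuations


prompt = "In Python, a list comprehension is a more concise way to"
# Hard-coded stand-in for the result of `generator(prompt, max_length=30)`.
sample_output = [
    {"generated_text": prompt + " extract an element from a list."}
]

print(continuation_only(prompt, sample_output))
```

The text-generation pipeline also accepts `return_full_text=False`, which asks it to return only the newly generated text, so the helper is mainly useful when the full text has already been produced.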


# Method 2: Using AutoTokenizer and AutoModelForCausalLM

In [3]:
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the GPT-2 tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Set a padding token to avoid warnings (GPT-2 doesn't have one by default)
tokenizer.pad_token = tokenizer.eos_token

# Define the starting text
input_text = "In Python, a list comprehension is a more concise way to"

# Convert text into tokens (numbers the model can understand)
tokens = tokenizer(
    input_text,
    return_tensors="pt",  # Convert text into a PyTorch tensor (needed for the model)
    truncation=True,  # If the input is too long, cut it down to max_length
    max_length=50  # Limit input to 50 tokens
)

# Generate new text based on the input
output_ids = model.generate(
    tokens["input_ids"],  # Token IDs of the prompt
    attention_mask=tokens["attention_mask"],  # Marks real tokens vs. padding; avoids the attention-mask warning
    max_new_tokens=50,  # Generate up to 50 new tokens
    do_sample=True,  # Sample from the distribution instead of always picking the top token
    temperature=0.8,  # Controls randomness (higher = more varied output)
    top_k=50,  # Only consider the 50 most likely tokens at each step
    top_p=0.9,  # Nucleus sampling: keep the smallest set of tokens whose cumulative probability reaches 0.9
    repetition_penalty=1.2,  # Penalize tokens that have already appeared to reduce repetition
    pad_token_id=tokenizer.eos_token_id  # GPT-2 has no pad token, so reuse the end-of-sequence token
)

# Convert the generated tokens back into readable text
generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
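
The sampling arguments passed to `generate` are easier to reason about on a toy example. The sketch below is a simplified, pure-Python illustration of temperature scaling followed by top-k and top-p (nucleus) filtering over a made-up five-token vocabulary. The scores and the function name `filter_distribution` are invented for illustration, and the library's own implementation differs in detail since it operates on tensors of logits.

```python
import math


def _softmax(scores):
    """Turn a {token: score} dict into a {token: probability} dict."""
    max_score = max(scores.values())
    exps = {tok: math.exp(s - max_score) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def filter_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token: probability} after temperature, top-k and top-p filtering."""
    # 1. Temperature: divide the scores before the softmax.
    #    Values below 1 sharpen the distribution, values above 1 flatten it.
    scaled = {tok: score / temperature for tok, score in logits.items()}

    # 2. Top-k: keep only the k highest-scoring tokens (0 means "keep all").
    ranked = sorted(scaled.items(), key=lambda kv: kv[1], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]
    probs = _softmax(dict(ranked))

    # 3. Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = {}, 0.0
    for tok, _ in ranked:
        kept[tok] = probs[tok]
        cumulative += probs[tok]
        if cumulative >= top_p:
            break

    # 4. Renormalize so the surviving probabilities sum to 1.
    kept_total = sum(kept.values())
    return {tok: p / kept_total for tok, p in kept.items()}


# Made-up next-token scores, purely for illustration.
toy_logits = {"create": 3.0, "build": 2.5, "write": 2.0, "eat": 0.5, "purple": -1.0}

print(filter_distribution(toy_logits))                              # full distribution
print(filter_distribution(toy_logits, top_k=3))                     # three best tokens only
print(filter_distribution(toy_logits, temperature=0.8, top_p=0.9))  # nucleus of the sharpened distribution
```

The next token would then be sampled from the filtered distribution, so unlikely tokens such as `"purple"` can no longer be picked once they fall outside the top-k or top-p cut.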


In [4]:
# Print the input and the generated text
print("Input:", input_text)
print("\nGenerated text:", generated_text)

Input: In Python, a list comprehension is a more concise way to

Generated text: In Python, a list comprehension is a more concise way to understand your program.
The following example shows how an array of integers represents the data in each dimension: >>> from collections import ArrayList < T > as BoundedArray def enumerate ( x , y ): return [ 'a' ] b =
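
The `repetition_penalty=1.2` argument used in the call above can be illustrated in the same toy fashion. The sketch below follows the rule commonly used for this setting, where the score of every token that has already been generated is pushed down: positive scores are divided by the penalty and negative scores are multiplied by it. The function name and the scores are made up for this example, and it is a simplified stand-in rather than the library's code.

```python
def apply_repetition_penalty(logits, generated_tokens, penalty=1.2):
    """Return a copy of `logits` with already-generated tokens made less likely."""
    adjusted = dict(logits)  # copy so the caller's dict is left untouched
    for tok in set(generated_tokens):
        if tok not in adjusted:
            continue
        score = adjusted[tok]
        # Dividing a positive score or multiplying a negative one
        # both lower the score when penalty > 1.
        adjusted[tok] = score / penalty if score > 0 else score * penalty
    return adjusted


# Made-up next-token scores and generation history, purely for illustration.
toy_logits = {"list": 2.4, "loop": 1.2, "the": -0.6, "tuple": 0.9}
already_generated = ["the", "list", "the"]

print(apply_repetition_penalty(toy_logits, already_generated, penalty=1.2))
```

A penalty of 1.0 leaves the scores unchanged, while larger values discourage repetition more strongly at the cost of occasionally steering the model away from words it legitimately needs to repeat.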


*Find more information in the official [documentation](https://huggingface.co/docs/transformers/en/index)*