<a href="https://colab.research.google.com/github/RamsesMDLC/Smolagent_Project_1/blob/main/Smolagents_Project_1_YT.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **1. LOADING LIBRARIES / MODULES / CLASSES**

In [1]:
#Installs the smolagents library along with extensions defined in the [toolkit] option.
!pip install smolagents[toolkit]

#Components from smolagents
  #CodeAgent: The "agent". It orchestrates reasoning and tool usage.
  #DuckDuckGoSearchTool: The "tool". It lets the agent fetch information from the web.
  #TransformersModel: The "model wrapper". It loads a Hugging Face Transformers model and runs it locally so the agent can use it.
from smolagents import CodeAgent, DuckDuckGoSearchTool, TransformersModel

#API key
  #Provides a secure way to access stored secrets (like API tokens) within Google Colab.
from google.colab import userdata
  #Allows programmatic login to Hugging Face Hub.
from huggingface_hub import login

#AutoTokenizer: class in the Hugging Face Transformers library that converts input text ("prompts") into model inputs and converts model outputs back into text ("answers").
  #It forms the bridge in both directions:
    #Input text → tokens/tensors → Model
      #Splitting text into tokens (smaller pieces such as words or subwords).
      #Converting these tokens into numbers called input IDs, returned as tensors, which the model uses for computation.
      #Managing extra elements like special tokens (e.g., [CLS], [SEP], padding).
    #Model output tokens/tensors → decoded text
  #It automatically loads and configures the correct tokenizer for a specified model (i.e., there is no need to know the model-specific tokenizer class).
from transformers import AutoTokenizer

Collecting smolagents[toolkit]
  Downloading smolagents-1.21.3-py3-none-any.whl.metadata (16 kB)
Collecting ddgs>=9.0.0 (from smolagents[toolkit])
  Downloading ddgs-9.5.5-py3-none-any.whl.metadata (18 kB)
Collecting markdownify>=0.14.1 (from smolagents[toolkit])
  Downloading markdownify-1.2.0-py3-none-any.whl.metadata (9.9 kB)
Collecting primp>=0.15.0 (from ddgs>=9.0.0->smolagents[toolkit])
  Downloading primp-0.15.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB)
Collecting lxml>=6.0.0 (from ddgs>=9.0.0->smolagents[toolkit])
  Downloading lxml-6.0.1-cp312-cp312-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl.metadata (3.8 kB)
Downloading ddgs-9.5.5-py3-none-any.whl (37 kB)
Downloading markdownify-1.2.0-py3-none-any.whl (15 kB)
Downloading smolagents-1.21.3-py3-none-any.whl (145 kB)
Downloading lxml-6.0.1-cp312-cp312-manylinux_2_26_x8

In [2]:
# Securely get Hugging Face token and login
hf_token = userdata.get('HF_TOKEN')
if hf_token:
    login(hf_token)
    print("Successfully logged in to Hugging Face!")
else:
    print("Token not found. Please add HF_TOKEN secret.")

Successfully logged in to Hugging Face!


    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token

Why this matters

Padding tokens are critical for batch processing inputs of varying lengths.

A padding token is a special token used in natural language processing (NLP) to make input sequences (like sentences or documents) have the same fixed length when processed by machine learning models.
Why is it needed?

    Most NLP models, especially deep learning models like transformers or RNNs, require inputs to be uniform in length.

    Real-world texts vary in length, so to batch-process multiple sequences efficiently, shorter sequences are padded with these special tokens until they match the longest sequence length in the batch.

    Padding tokens carry no meaningful information and are meant only to fill space for model input consistency.

Example:

If you have these sentences tokenized into token IDs (the IDs shown here are illustrative):

    Sentence 1: [1, 2, 3, 4, 5, 6] (length 6)

    Sentence 2: [7, 8] (length 2)

To process them together, Sentence 2 might be padded with four pad tokens (often token ID 0):

    Sentence 2 padded: [7, 8, 0, 0, 0, 0]

How it works in the model:

    The model uses attention masks (binary flags) to ignore these padding tokens during processing, so they do not affect predictions or training loss.

    When padding is appended after the real tokens (right padding), each real token keeps its original position in the sequence, allowing consistent indexing.
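
The padding and masking described above can be sketched in plain Python. This is a minimal illustration, not how the Transformers library implements it: `pad_batch` is a hypothetical helper, the pad ID of 0 is an assumption, and the token IDs are made up for the example.

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad every sequence to the longest length and build attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        # Real tokens first, then pad tokens to fill the remaining slots.
        input_ids.append(list(seq) + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


# Illustrative token IDs only; real IDs depend on the tokenizer's vocabulary.
batch = [[1, 2, 3, 4, 5, 6], [7, 8]]
input_ids, attention_mask = pad_batch(batch)
print(input_ids)       # [[1, 2, 3, 4, 5, 6], [7, 8, 0, 0, 0, 0]]
print(attention_mask)  # [[1, 1, 1, 1, 1, 1], [1, 1, 0, 0, 0, 0]]
```

In practice the tokenizer produces both lists itself when called with `padding=True`, so a helper like this is only useful for seeing what happens.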

Summary

Padding tokens let models handle variable-length text by standardizing inputs into fixed-size sequences. This allows efficient batch training and inference on multiple texts at once while preserving the order and meaning of the original content.

In [None]:
#Defining Model (from Hugging Face)
model_id = "Qwen/Qwen1.5-1.8B"

#Initialize tokenizer
  #Load a pretrained tokenizer for the given model identified by model_id (in this case "Qwen/Qwen1.5-1.8B")
    #The tokenizer includes vocabulary, tokenization rules, special tokens, and associated settings needed to convert raw text into token IDs.
tokenizer = AutoTokenizer.from_pretrained(model_id)

#Ensure the tokenizer has a designated padding token.
  #Padding tokens make all input sequences in a batch the same length by adding special "pad" tokens to shorter sequences.
  #If no padding token is set, the tokenizer raises an error when padding is requested, and generation may emit warnings.
  #Assigning the EOS token as the padding token keeps things working even when the model does not define a dedicated pad token.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load the model
model = TransformersModel(model_id=model_id)

# Fix pad_token_id in model config if not set
if model.model.config.pad_token_id is None:
    model.model.config.pad_token_id = tokenizer.pad_token_id
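
The pad-token fallback in the cell above can be wrapped in a small reusable helper. The sketch below is an assumption-laden illustration: `ensure_pad_token` is a hypothetical name, and it is exercised with a stand-in object rather than a real Hugging Face tokenizer so that it runs without downloading anything.

```python
from types import SimpleNamespace


def ensure_pad_token(tokenizer):
    """Fall back to the EOS token when no pad token is defined.

    Returns True if the pad token was changed, False otherwise.
    """
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
        return True
    return False


# Stand-in object for illustration only, not a real Hugging Face tokenizer.
fake_tokenizer = SimpleNamespace(pad_token=None, eos_token="<|endoftext|>")
changed = ensure_pad_token(fake_tokenizer)
print(changed, fake_tokenizer.pad_token)  # True <|endoftext|>
```

Returning a flag makes it easy to log whether the fallback was actually needed for a given model.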

In [None]:
# Initialize agent
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=model)

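# Prepare input text and tokenize it; the tokenizer also returns an attention mask
input_text = "How long would it take for an elephant to cross the United States from Florida to California?"
inputs = tokenizer(input_text, return_tensors="pt", padding=True, truncation=True)

# Move the tensors to the same device as the model to avoid a device-mismatch error
input_ids = inputs["input_ids"].to(model.model.device)
attention_mask = inputs["attention_mask"].to(model.model.device)

# Since agent.run() might not allow passing attention_mask directly,
# call the underlying model's generate() yourself for reliable behavior.
# Note: this bypasses the agent and its search tool; use agent.run(input_text) to run the agent itself.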

generated_ids = model.model.generate(
    input_ids=input_ids,
    attention_mask=attention_mask,
    pad_token_id=model.model.config.pad_token_id,
    max_new_tokens=50
)

# Decode generated tokens (the output sequence starts with the prompt tokens)
generated_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print("Generated text:")
print(generated_text)