# RUN A SIMPLE LLM USING HUGGINGFACE

### LOGIN TO HUGGINGFACE

In [2]:
import subprocess
from pathlib import Path
import os
from dotenv import load_dotenv

def huggingface_login():
    """
    Log in to Hugging Face using the token stored in the HF_TOKEN
    environment variable (or in a .env file).
    """

    load_dotenv()
    token = os.getenv("HF_TOKEN")

    if not token:
        raise ValueError("HF_TOKEN not found in environment variables or .env file")
    
    try:
        token_path = Path.home() / ".huggingface" / "token"
        token_path.parent.mkdir(parents=True, exist_ok=True)
        token_path.write_text(token)

        os.environ["HF_TOKEN"] = token

        subprocess.run(["huggingface-cli", "login", "--token", token], check=True)
        subprocess.run(["git", "config", "--global", "credential.helper", "store"], check=True)
        print("Successfully logged in to HuggingFace!")

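    except (subprocess.CalledProcessError, FileNotFoundError) as e:
        # FileNotFoundError is raised when huggingface-cli or git is not on PATH
        raise RuntimeError(f"Failed to log in to Hugging Face: {e}") from e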

huggingface_login()

Successfully logged in to HuggingFace!


### Tokenizers in Large Language Models (LLMs)

A **tokenizer** is the component of a large language model (LLM) that converts text into smaller pieces—called **tokens**—which the model can understand and process numerically.

For example, take the sentence:  
> “I love Machine Learning!”

A tokenizer might split it into tokens like:  
`["I", " love", " Machine", " Learning", "!"]`

Each token is then mapped to a unique number (an ID), such as:  
`[100, 567, 8921, 2205, 33]`

These IDs are what the LLM actually reads. Different tokenizers can split text differently—some by words, others by subwords or even characters—depending on how they were trained.

The reverse process, **decoding**, converts token IDs back into readable text. For instance, decoding `[100, 567, 8921, 2205, 33]` would reconstruct the original:  
> “I love Machine Learning!”

In short, **tokenization** turns human language into numbers for the model, while **decoding** turns the model’s numeric outputs back into human language.
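
To make the encode/decode round trip concrete, here is a minimal sketch of a toy tokenizer. It assumes a hand-written five-entry vocabulary that reuses the illustrative IDs from the example above; `VOCAB`, `encode`, and `decode` are hypothetical names, and a real tokenizer has tens of thousands of entries and a more careful matching algorithm.

```python
# Toy vocabulary reusing the illustrative IDs from the text.
# This is NOT the vocabulary of any real model.
VOCAB = {"I": 100, " love": 567, " Machine": 8921, " Learning": 2205, "!": 33}
ID_TO_TOKEN = {token_id: token for token, token_id in VOCAB.items()}


def encode(text):
    """Convert text to token IDs using greedy longest-match lookup in VOCAB."""
    ids = []
    pos = 0
    while pos < len(text):
        # Try the longest remaining substring first, then shorter ones.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                pos = end
                break
        else:
            raise ValueError(f"No token matches the text at position {pos}")
    return ids


def decode(ids):
    """Convert token IDs back to text by concatenating their token strings."""
    return "".join(ID_TO_TOKEN[token_id] for token_id in ids)


ids = encode("I love Machine Learning!")
print(ids)          # [100, 567, 8921, 2205, 33]
print(decode(ids))  # I love Machine Learning!
```

Note that the leading spaces are part of the tokens themselves, which is why decoding can rebuild the sentence by plain concatenation.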


### How Tokenizers Are Trained
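
A tokenizer is trained separately from the model, on a large text corpus, before the model itself is trained. Most modern LLM tokenizers use a subword algorithm such as byte-pair encoding (BPE), WordPiece, or Unigram. BPE, for example, starts from individual characters or bytes and repeatedly merges the most frequent adjacent pair of symbols into a new token until the vocabulary reaches a target size. Frequent words end up as single tokens, while rare words are split into several smaller pieces.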
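
As one illustration of tokenizer training, the sketch below implements a toy version of byte-pair encoding (BPE), which repeatedly merges the most frequent adjacent pair of symbols in a corpus. The function name `train_bpe` and the tiny word list are hypothetical, chosen only for this example; real tokenizers are trained on large corpora with optimized libraries, and the tokenizer loaded later in this notebook may have been trained with a different algorithm.

```python
from collections import Counter


def train_bpe(words, num_merges):
    """Learn up to `num_merges` BPE merge rules from a list of words."""
    # Start with every word split into single characters.
    corpus = Counter(tuple(word) for word in words)
    merges = []

    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs in the corpus.
        pair_counts = Counter()
        for symbols, freq in corpus.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol

        # Pick the most frequent pair and record it as a merge rule.
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)

        # Rewrite the corpus, replacing each occurrence of the pair
        # with one merged symbol.
        new_corpus = Counter()
        for symbols, freq in corpus.items():
            merged = []
            i = 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_corpus[tuple(merged)] += freq
        corpus = new_corpus

    return merges


merges = train_bpe(["low", "low", "lower", "lowest"], num_merges=2)
print(merges)
```

After two merges on this word list, the frequent string "low" has become a single symbol, while the rarer endings "er" and "est" are still split into characters.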

In [30]:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

import warnings
warnings.filterwarnings("ignore")

# torch.backends.cudnn.enabled = False
         
device = "cuda" if torch.cuda.is_available() else "cpu"
# device = "cpu" # "cpu" or "cuda"

# model_name = "google/gemma-3-270m"
model_name = "google/gemma-3-270m-it"
prompt = "How you would explain machine learning in simple words?"

# load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    # torch_dtype sets the precision used to store the model weights and run computations
    torch_dtype=torch.float32,  # NOTE: float32 is the safest choice on CPU; half precision is mainly useful on GPU
    device_map=device,  # e.g. "auto", "cpu", "cuda", "cuda:0"
)

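# wrap the prompt in Gemma's chat format: each turn sits between <start_of_turn> and <end_of_turn>
formatted_prompt = f"<start_of_turn>user\n{prompt}<end_of_turn>\n<start_of_turn>model\n"
# move the input tensors to the model's device to avoid a CPU/GPU mismatch in generate()
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)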

print(f"User's Prompt:\n{prompt}")
print("Bot:")
with torch.no_grad():
    generated_ids = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7
    )

### EXTRACT RESPONSE
# generate() returns the prompt tokens followed by the new tokens,
# so drop the prompt by token count rather than by decoded text length
prompt_length = inputs["input_ids"].shape[1]
new_token_ids = generated_ids[0][prompt_length:]

# decode only the new tokens, skipping special tokens
response_text = tokenizer.decode(new_token_ids, skip_special_tokens=True)

# cleanup (fallback in case <end_of_turn> is not marked as special, plus whitespace)
response_text = response_text.replace("<end_of_turn>", "").strip()

print(response_text)

Setting `pad_token_id` to `eos_token_id`:1 for open-end generation.


User's Prompt:
How you would explain machine learning in simple words?
Bot:
Imagine you have a bunch of data. Machine learning is like teaching a computer to learn from that data! 

Here's how it works:

*   **Learning from data:** You feed the computer lots of examples.
*   **Finding patterns:** The computer looks at the examples and tries to find patterns, like "If you see this example, it's likely to be a spam email."
*   **Making predictions:** Based on these patterns, the computer tries to make predictions about new, unseen data.
*   **Giving it feedback:** You give the computer feedback, like "That's not right."
*   **Learning from feedback:** The computer adjusts its model based on the feedback, making it better at predicting the future.

So, machine learning is like teaching a computer to learn and improve over time by analyzing data and making predictions.
