<a href="https://colab.research.google.com/github/SamathaReddy-Web/assign_1/blob/main/Asmnt.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install -q transformers
# Installs the Hugging Face transformers library, which loads and runs pre-trained language models such as FLAN-T5.
# !            => Tells Google Colab to run the rest of the line as a shell command instead of Python.
# pip          => The Python package installer; it downloads and installs packages from the Python Package Index (PyPI).
# -q           => Quiet mode; it suppresses most installation output, while warnings and errors are still shown.
# transformers => The package to install. It is used here to load a pre-trained model from the Hugging Face Hub, tokenize input text, and generate responses.
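
After the install cell finishes, it can help to confirm which version was actually installed. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and is not part of pip or transformers.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the install cell succeeded, otherwise None.
print("transformers version:", installed_version("transformers"))
```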

In [None]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

model_name = "google/flan-t5-small"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Create a text2text-generation pipeline
generator = pipeline("text2text-generation", model=model, tokenizer=tokenizer)

# AutoTokenizer => A class that loads the correct tokenizer for a given model name. A tokenizer
# converts human-readable text into token IDs (integers) that the model can process.

# AutoModelForSeq2SeqLM => Loads a sequence-to-sequence (encoder-decoder) language model such as FLAN-T5. Suitable for
# tasks that map an input text to an output text (e.g., translation, summarization, question answering).

# pipeline => A high-level function that wraps the model and tokenizer into a ready-to-use callable.
# Given a task name such as "text2text-generation", it handles tokenization, generation, and decoding.

# from_pretrained(model_name) => Downloads the required files from the Hugging Face Hub
# (if they are not already cached locally) and builds the tokenizer or model from them.

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Device set to use cpu
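
To make the idea of tokenization concrete, here is a toy sketch of the round trip from text to IDs and back. It is not the SentencePiece tokenizer that FLAN-T5 uses: the vocabulary, the whitespace splitting, and the names `TOY_VOCAB`, `encode`, and `decode` are all made up for illustration.

```python
# Toy word-level vocabulary (hypothetical; real tokenizers use learned subword vocabularies).
TOY_VOCAB = {"<unk>": 0, "what": 1, "is": 2, "the": 3, "capital": 4, "of": 5, "japan": 6}
ID_TO_TOKEN = {token_id: token for token, token_id in TOY_VOCAB.items()}


def encode(text):
    """Lowercase, split on whitespace, and map each word to its ID (0 if unknown)."""
    return [TOY_VOCAB.get(word, TOY_VOCAB["<unk>"]) for word in text.lower().split()]


def decode(token_ids):
    """Map IDs back to words and join them with spaces."""
    return " ".join(ID_TO_TOKEN[token_id] for token_id in token_ids)


ids = encode("What is the capital of Japan")
print(ids)          # [1, 2, 3, 4, 5, 6]
print(decode(ids))  # what is the capital of japan
```

Words outside the toy vocabulary map to the `<unk>` ID, which is one reason real tokenizers split text into subword pieces instead of whole words.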


In [None]:
prompt = "What is the capital of Japan?"
output = generator(prompt)
print("Answer:\n", output[0]["generated_text"])

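# This runs the prompt through the FLAN-T5 model using the text2text-generation pipeline.

# Internally, the pipeline:
# 1. Tokenizes the prompt.
# 2. Passes the token IDs through the model to generate output token IDs.
# 3. Decodes the output token IDs back into human-readable text.

# Note: the answer printed below is wrong (the capital of Japan is Tokyo). flan-t5-small is the
# smallest FLAN-T5 checkpoint and can be unreliable on factual questions.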

Answer:
 yugoslavia
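
The three internal pipeline steps can also be written out by hand. This is a rough sketch that assumes the `tokenizer` and `model` objects from the loading cell and a PyTorch backend; the function name `generate_text` and the `max_new_tokens=20` default are illustrative choices, not the pipeline's own settings.

```python
def generate_text(tokenizer, model, prompt, max_new_tokens=20):
    """Run one prompt through tokenizer -> model -> decoder by hand."""
    # 1. Tokenize: text -> tensors of token IDs ("pt" requests PyTorch tensors).
    inputs = tokenizer(prompt, return_tensors="pt")
    # 2. Generate: the model produces a batch of output token ID sequences.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # 3. Decode: turn the first sequence of IDs back into text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the objects from the loading cell, `generate_text(tokenizer, model, "What is the capital of Japan?")` should return the same kind of string the pipeline produces, although the exact text can differ if the generation settings differ.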


In [None]:
# Zero-shot Summarization
prompt = "Summarize: Artificial Intelligence is changing industries by automating processes, enhancing decision-making, and enabling new innovations."
output = generator(prompt)
print("Zero-shot Summary:\n", output[0]['generated_text'])


# Zero-shot prompting: the model is asked directly to perform the task, without being given any examples.

Zero-shot Summary:
 Artificial intelligence is transforming industries by automating processes, enhancing decision-making, and enabling new innovations.
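
One quick sanity check on a summary is whether it is actually shorter than its input. The sketch below compares word counts for the input sentence and the output printed above; `compression_ratio` is a hypothetical helper, and counting whitespace-separated words is only a crude proxy for length.

```python
def compression_ratio(source_text, summary_text):
    """Return summary word count divided by source word count (whitespace split)."""
    return len(summary_text.split()) / len(source_text.split())


source = ("Artificial Intelligence is changing industries by automating processes, "
          "enhancing decision-making, and enabling new innovations.")
summary = ("Artificial intelligence is transforming industries by automating processes, "
           "enhancing decision-making, and enabling new innovations.")

print(compression_ratio(source, summary))  # 1.0
```

A ratio of 1.0 means the output is a light paraphrase of the input rather than a condensed summary.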


In [None]:
# Zero-shot Translation
prompt = "Translate to French: I am learning machine learning."
output = generator(prompt)
print("Zero-shot Translation:\n", output[0]['generated_text'])

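# FLAN-T5 can attempt translation because its instruction fine-tuning data included translation tasks.
# `generator(prompt)` sends the prompt to the model and returns the generated translation.
# Note: the output printed below is not grammatical French; a correct translation would be
# "J'apprends l'apprentissage automatique."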

Zero-shot Translation:
 J'ai apprendre à l'apprentissage des machines.


In [None]:
# Few-shot QA with 2 examples
prompt = """Q: What is the capital of France?
A: Paris

Q: Who wrote '1984'?
A: George Orwell

Q: What is the largest ocean on Earth?
A:"""

output = generator(prompt)
print("Few-shot QA:\n", output[0]['generated_text'])

# Few-shot prompting: the prompt includes a few example question-answer pairs before the real question.
# The model follows the demonstrated pattern, which shows it the expected answer format and often improves accuracy.


Few-shot QA:
 Pacific Ocean
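
Writing few-shot prompts by hand gets tedious once the examples change often. As a small sketch, the same Q/A layout can be assembled from a list of example pairs; `build_few_shot_prompt` is a hypothetical helper, not part of transformers.

```python
def build_few_shot_prompt(examples, question):
    """Format (question, answer) pairs followed by the unanswered question."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


examples = [
    ("What is the capital of France?", "Paris"),
    ("Who wrote '1984'?", "George Orwell"),
]
prompt = build_few_shot_prompt(examples, "What is the largest ocean on Earth?")
print(prompt)
```

The resulting string matches the hand-written prompt in the cell above, so it can be passed to `generator(prompt)` unchanged.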


In [None]:
# Chain-of-thought reasoning
prompt = """Question: If you have 3 apples and you buy 2 more, how many apples do you have now?
Let's think step by step."""
output = generator(prompt, max_new_tokens=60)
print("Chain-of-Thought Reasoning:\n", output[0]['generated_text'])

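# Chain-of-thought prompting: the phrase "Let's think step by step" asks the model to write out
# intermediate reasoning before giving its final answer.
# Note: the reasoning printed below is wrong (3 + 2 = 5, not 6), so a step-by-step prompt
# does not guarantee a correct answer from this small model.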

Chain-of-Thought Reasoning:
 If you buy 2 more apples, you have 3 + 2 = 6 apples. If you buy 2 more apples, you have 6 + 2 = 6 apples. The answer: 6.
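
Chain-of-thought output is free text, so checking it usually means pulling out the final number and comparing it with a known result. A minimal sketch, assuming the final answer is the last integer in the text; `extract_final_answer` is a hypothetical helper and this heuristic will not suit every output format.

```python
import re


def extract_final_answer(text):
    """Return the last integer in the model's text, or None if there is none."""
    numbers = re.findall(r"-?\d+", text)
    return int(numbers[-1]) if numbers else None


cot_output = ("If you buy 2 more apples, you have 3 + 2 = 6 apples. "
              "If you buy 2 more apples, you have 6 + 2 = 6 apples. The answer: 6.")
expected = 3 + 2

predicted = extract_final_answer(cot_output)
print(predicted, predicted == expected)  # 6 False
```

Here the check confirms that the model's final answer (6) disagrees with the correct sum (5).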
