Text generation (GPT-2)

Summarization (T5)

Public dataset usage

Evaluation using ROUGE

Simple observability (logging prompts & outputs)

Ethical disclosure example

In [1]:
!pip install -q transformers datasets evaluate torch sentencepiece


In [2]:
!pip install -q rouge_score



**Step 1: Import Libraries**

In [3]:
import torch
from transformers import pipeline
from datasets import load_dataset
import evaluate
import datetime




**Step 2: Device Setup**

In [4]:
device = 0 if torch.cuda.is_available() else -1
print("Using GPU" if device == 0 else "Using CPU")


Using CPU


**Step 3: Text Generation (Generative NLP Core)**

In [5]:
text_generator = pipeline(
    "text-generation",
    model="gpt2",
    device=device
)

prompt = "Generative AI is important for natural language processing because"

generated_text = text_generator(
    prompt,
    max_length=80,
    temperature=0.8,
    top_p=0.9
)[0]["generated_text"]

print("=== Generated Text ===")
print(generated_text)


Device set to use cpu
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=80) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


=== Generated Text ===
Generative AI is important for natural language processing because it's the only way to learn what you're looking for. If you want to learn a new language, you have to go to a machine learning library like TensorFlow or Google Deep Learning.

What is the best language for learning about the world?

It's hard to pick a better language for learning about the world, because the world is changing. There's lots of new things happening in the world, like in the world of digital currency and the internet.

If you were to go and talk to the smartest people in the world, you'd probably be talking about AI.

If you're interested in the world of AI, then you should definitely study Deep Learning.

If you're interested in the world of Artificial Intelligence, then you should definitely study AI.

What is the best AI training tool that can teach you to be a great AI?

A good training tool is called a "Machine Learning Engine." It's a machine learning engine that gives you the

In [7]:
# Alternate: set only max_new_tokens to avoid the max_length / max_new_tokens conflict warned about above
from transformers import pipeline

text_generator = pipeline(
    "text-generation",
    model="gpt2",
    device=device
)

prompt = "Generative AI is important for natural language processing because"

generated_text = text_generator(
    prompt,
    max_new_tokens=80,          # Use only max_new_tokens
    temperature=0.8,
    top_p=0.9,
    truncation=True,            # Explicit truncation
    pad_token_id=50256          # GPT-2 EOS token
)[0]["generated_text"]

print("=== Generated Text ===")
print(generated_text)


Device set to use cpu


=== Generated Text ===
Generative AI is important for natural language processing because it allows us to solve real problems without having to write any code. In fact, it can be used to improve computer vision. The more data we collect, the more we can understand how our AI is doing.

A better understanding of how AI works, by using neural networks, is very important. In fact, it is a huge advantage to have the ability to make a machine more intelligent
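
To make the `temperature` and `top_p` arguments used above more concrete, here is a minimal pure-Python sketch of temperature scaling followed by nucleus (top-p) filtering over a toy distribution. The token names and logit values are made up for illustration, and the helper names `softmax_with_temperature` and `top_p_filter` are hypothetical. This is not how `transformers` implements sampling internally, only the idea behind the two parameters.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of most likely tokens whose total mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

# Toy vocabulary and scores (illustrative values, not real GPT-2 logits).
toy_tokens = ["it", "the", "models", "zebra"]
toy_logits = [2.0, 1.5, 0.5, -3.0]

probs = softmax_with_temperature(toy_logits, temperature=0.8)
nucleus = top_p_filter(probs, top_p=0.9)
for i, p in nucleus.items():
    print(f"{toy_tokens[i]}: {p:.3f}")
```

Sampling then draws the next token only from the tokens that survive the filter, which is why unlikely continuations are suppressed while some variety remains.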


In [6]:
# Log in to the Hugging Face Hub: create an access token, store it in your notebook secrets, then run the following
from huggingface_hub import login
login()



**Step 4: Load Public Dataset (Academic Resource)**

In [9]:
dataset = load_dataset("cnn_dailymail", "3.0.0", split="train[:1%]")

article = dataset[0]["article"]
reference_summary = dataset[0]["highlights"]

print("\n=== Article (Excerpt) ===")
print(article[:400])



=== Article (Excerpt) ===
LONDON, England (Reuters) -- Harry Potter star Daniel Radcliffe gains access to a reported £20 million ($41.1 million) fortune as he turns 18 on Monday, but he insists the money won't cast a spell on him. Daniel Radcliffe as Harry Potter in "Harry Potter and the Order of the Phoenix" To the disappointment of gossip columnists around the world, the young actor says he has no plans to fritter his ca


**Step 5: Summarization Using Open-Source Model**

In [10]:
summarizer = pipeline(
    "summarization",
    model="t5-small",
    device=device
)

generated_summary = summarizer(
    article,
    max_length=120,
    min_length=50,
    do_sample=False
)[0]["summary_text"]

print("\n=== Generated Summary ===")
print(generated_summary)


Device set to use cpu
Token indices sequence length is longer than the specified maximum sequence length for this model (638 > 512). Running this sequence through the model will result in indexing errors
Both `max_new_tokens` (=256) and `max_length`(=120) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)



=== Generated Summary ===
the young actor says he has no plans to fritter his cash away on fast cars, drink and celebrity parties . he will be able to gamble in a casino, buy a drink or see the horror film "Hostel: Part II" his agent and publicist had no comment on his plans . his latest outing as the boy wizard is breaking records on both sides of the Atlantic .
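
The warning above reports that the article is longer than the model's 512-token limit (638 tokens). One common way to handle this is to split the input into pieces that fit and summarize each piece separately. The following is a minimal sketch that uses whitespace-separated words as a rough stand-in for tokens. That is an assumption: real token counts come from the model's tokenizer and are usually higher than word counts. The helper name `chunk_words` and the 300-word budget are hypothetical choices.

```python
def chunk_words(text, max_words=300):
    """Split text into consecutive chunks of at most max_words whitespace-separated words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]

# Toy input standing in for a long article (not the real dataset text).
toy_article = " ".join(f"word{i}" for i in range(700))
chunks = chunk_words(toy_article, max_words=300)
print([len(c.split()) for c in chunks])  # [300, 300, 100]
```

Each chunk could then be passed to the `summarizer` in turn. Alternatively, passing `truncation=True` in the `summarizer(...)` call should cut the input at the model's limit, at the cost of ignoring the rest of the article.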


**Step 6: Evaluation Using ROUGE (Free Benchmarking Tool)**

In [11]:
import evaluate

rouge = evaluate.load("rouge")

scores = rouge.compute(
    predictions=[generated_summary],
    references=[reference_summary]
)

print("\n=== ROUGE Evaluation Scores ===")
for key, value in scores.items():
    print(f"{key}: {value:.4f}")





=== ROUGE Evaluation Scores ===
rouge1: 0.2857
rouge2: 0.2136
rougeL: 0.2476
rougeLsum: 0.2667
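
To show roughly what `rouge1` measures, here is a minimal sketch of unigram-overlap F1 in plain Python. It is deliberately simplified (lowercasing, splitting on non-alphanumeric characters, no stemming), so its numbers need not match those reported by `evaluate`. The function names `tokenize` and `rouge1_f1` and the toy sentences are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a simplification)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def rouge1_f1(prediction, reference):
    """F1 over unigram overlap, counting each word at most as often as it appears in both texts."""
    pred_counts = Counter(tokenize(prediction))
    ref_counts = Counter(tokenize(reference))
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

toy_prediction = "the actor has no plans to spend his cash"
toy_reference = "young actor says he has no plans to fritter his cash away"
print(f"{rouge1_f1(toy_prediction, toy_reference):.4f}")  # 0.6667
```

Here 7 of the 9 predicted words appear in the 12-word reference, so precision is 7/9, recall is 7/12, and F1 is 14/21. `rouge2` applies the same idea to word pairs, and `rougeL` uses the longest common subsequence instead of n-gram overlap.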


**Step 7: Simple LLMOps – Observability & Logging**

In [12]:
log_entry = {
    "timestamp": datetime.datetime.now().isoformat(),
    "model_used": "gpt2 (generation), t5-small (summarization)",
    "prompt": prompt,
    "generated_text": generated_text,
    "generated_summary": generated_summary,
    "evaluation_scores": scores
}

print("\n=== Logged Interaction ===")
log_entry



=== Logged Interaction ===


{'timestamp': '2026-01-20T12:53:33.428095',
 'model_used': 'gpt2 (generation), t5-small (summarization)',
 'prompt': 'Generative AI is important for natural language processing because',
 'generated_text': 'Generative AI is important for natural language processing because it allows us to solve real problems without having to write any code. In fact, it can be used to improve computer vision. The more data we collect, the more we can understand how our AI is doing.\n\nA better understanding of how AI works, by using neural networks, is very important. In fact, it is a huge advantage to have the ability to make a machine more intelligent',
 'generated_summary': 'the young actor says he has no plans to fritter his cash away on fast cars, drink and celebrity parties . he will be able to gamble in a casino, buy a drink or see the horror film "Hostel: Part II" his agent and publicist had no comment on his plans . his latest outing as the boy wizard is breaking records on both sides of the A
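
The `log_entry` above exists only in notebook memory. A minimal sketch of persisting such entries as JSON Lines, using only the standard library, could look like this. The file name `genai_log.jsonl`, the helpers `append_log` and `read_log`, and the placeholder values are hypothetical.

```python
import datetime
import json
import os
import tempfile

def append_log(entry, path):
    """Append one log entry to a JSON Lines file (one JSON object per line)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")

def read_log(path):
    """Read all entries back as a list of dictionaries."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# A temporary directory keeps the example from touching the working directory.
log_path = os.path.join(tempfile.mkdtemp(), "genai_log.jsonl")

example_entry = {
    "timestamp": datetime.datetime.now().isoformat(),
    "model_used": "gpt2",
    "prompt": "example prompt",          # placeholder value
    "generated_text": "example output",  # placeholder value
}
append_log(example_entry, log_path)
append_log({**example_entry, "prompt": "second prompt"}, log_path)

entries = read_log(log_path)
print(len(entries), "entries logged")
```

An append-only file like this keeps every interaction in order and can be loaded later for review. Values that are not plain Python types may need converting before they can be serialized.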

**Step 8: Hallucination & Bias Check (Manual Review)**

In [13]:
print("\n=== Reference Summary ===")
print(reference_summary)

print("\n=== Model Summary ===")
print(generated_summary)



=== Reference Summary ===
Harry Potter star Daniel Radcliffe gets £20M fortune as he turns 18 Monday .
Young actor says he has no plans to fritter his cash away .
Radcliffe's earnings from first five Potter films have been held in trust fund .

=== Model Summary ===
the young actor says he has no plans to fritter his cash away on fast cars, drink and celebrity parties . he will be able to gamble in a casino, buy a drink or see the horror film "Hostel: Part II" his agent and publicist had no comment on his plans . his latest outing as the boy wizard is breaking records on both sides of the Atlantic .
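
Manual review can be supplemented with a crude automatic check: list the longer words in the summary that never appear in the source text. The following is a minimal sketch under that assumption. The heuristic itself, the function name `unsupported_words`, the `min_length` cutoff, and the toy strings are all hypothetical, and the check cannot detect paraphrased or subtly wrong claims.

```python
import re

def words(text):
    """Lowercase and split into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def unsupported_words(summary, source, min_length=4):
    """Return summary words (of at least min_length characters) absent from the source, in order."""
    source_vocab = set(words(source))
    seen, flagged = set(), []
    for w in words(summary):
        if len(w) >= min_length and w not in source_vocab and w not in seen:
            seen.add(w)
            flagged.append(w)
    return flagged

toy_source = "The actor turns 18 on Monday and says he has no plans to waste his money."
toy_summary = "The actor turns 18 on Friday and plans to buy a yacht."
print(unsupported_words(toy_summary, toy_source))  # ['friday', 'yacht']
```

An empty result does not mean the summary is faithful, only that it reuses the source vocabulary. Flagged words are candidates for a closer human look.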


**Step 9: Ethical & Academic Integrity Disclosure**

In [14]:
ethics_statement = """
This output was generated using open-source pretrained language models.
The generated content was reviewed by a human for accuracy.
AI assistance is disclosed in accordance with academic integrity guidelines.
"""

print("\n=== Ethics Disclosure ===")
print(ethics_statement)



=== Ethics Disclosure ===

This output was generated using open-source pretrained language models.
The generated content was reviewed by a human for accuracy.
AI assistance is disclosed in accordance with academic integrity guidelines.

