## Gemma Text Generation

url: https://huggingface.co/google/gemma-2-2b-it

In [11]:
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
import uuid
import pandas as pd
import kagglehub
from transformers import AutoTokenizer, AutoModelForCausalLM
import random
import time

To use Gemma:

* Accept Google's terms and conditions on the model page when prompted.
* Create a [Hugging Face token](https://huggingface.co/settings/tokens) named HF_TOKEN.
  * Grant it read access to the contents of all public gated repos you can access.
* In Colab, open Secrets (the key icon in the left sidebar) and add a secret with Name: HF_TOKEN and Value: your token.
* Alternatively, run `hf auth login` from a terminal (untested here, but it should work). The command prompts for the token, so it replaces the Colab secret step rather than the token creation step.
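
As a rough sketch of the token setup above, the snippet below reads the token from an `HF_TOKEN` environment variable and hands it to `huggingface_hub.login`. The helper names `resolve_hf_token` and `login_to_hub` are made up for this example, and reading from an environment variable is an assumption; in Colab the value could instead come from `google.colab.userdata.get("HF_TOKEN")`.

```python
import os


def resolve_hf_token(env=None):
    """Return the value stored under HF_TOKEN, or None if it is unset or blank."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def login_to_hub(token):
    """Authenticate this session so gated repos such as Gemma can be downloaded."""
    # Imported lazily so resolve_hf_token works even without huggingface_hub installed.
    from huggingface_hub import login
    login(token=token)


# Typical use in a notebook cell (left commented out so nothing runs on import):
# token = resolve_hf_token()
# if token is None:
#     raise RuntimeError("Set HF_TOKEN before loading google/gemma-2-2b-it")
# login_to_hub(token)
```

Failing early when the token is missing gives a clearer error than the gated-repo download failure that would otherwise show up in `from_pretrained`.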



In [2]:
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")


In [3]:
# Move the model to the GPU if one is available; otherwise keep it on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

Gemma2ForCausalLM(
  (model): Gemma2Model(
    (embed_tokens): Embedding(256000, 2304, padding_idx=0)
    (layers): ModuleList(
      (0-25): 26 x Gemma2DecoderLayer(
        (self_attn): Gemma2Attention(
          (q_proj): Linear(in_features=2304, out_features=2048, bias=False)
          (k_proj): Linear(in_features=2304, out_features=1024, bias=False)
          (v_proj): Linear(in_features=2304, out_features=1024, bias=False)
          (o_proj): Linear(in_features=2048, out_features=2304, bias=False)
        )
        (mlp): Gemma2MLP(
          (gate_proj): Linear(in_features=2304, out_features=9216, bias=False)
          (up_proj): Linear(in_features=2304, out_features=9216, bias=False)
          (down_proj): Linear(in_features=9216, out_features=2304, bias=False)
          (act_fn): GELUTanh()
        )
        (input_layernorm): Gemma2RMSNorm((2304,), eps=1e-06)
        (post_attention_layernorm): Gemma2RMSNorm((2304,), eps=1e-06)
        (pre_feedforward_layernorm): Gemma2RMSNo
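
The layer shapes in the printout above are enough for a back-of-the-envelope weight count. The sketch below counts only the embedding table and the bias-free `Linear` layers that are visible in the truncated output, so normalization weights and anything past the cut-off are left out. The bytes-per-weight figures are assumptions about the loaded dtype (2 bytes for a 16-bit format, 4 bytes for float32), not something the notebook states.

```python
# Shapes copied from the printed module tree: (in_features, out_features).
HIDDEN = 2304
VOCAB = 256000
NUM_LAYERS = 26  # "(0-25): 26 x Gemma2DecoderLayer"

attention = {
    "q_proj": (HIDDEN, 2048),
    "k_proj": (HIDDEN, 1024),
    "v_proj": (HIDDEN, 1024),
    "o_proj": (2048, HIDDEN),
}
mlp = {
    "gate_proj": (HIDDEN, 9216),
    "up_proj": (HIDDEN, 9216),
    "down_proj": (9216, HIDDEN),
}


def count_weights(shapes):
    """Sum in_features * out_features over bias-free linear layers."""
    return sum(n_in * n_out for n_in, n_out in shapes.values())


embedding = VOCAB * HIDDEN
per_layer = count_weights(attention) + count_weights(mlp)
total = embedding + NUM_LAYERS * per_layer

print(f"embedding weights: {embedding:,}")
print(f"weights per decoder layer: {per_layer:,}")
print(f"total (embedding + linear layers): {total:,}")
print(f"approx. size at 2 bytes/weight: {total * 2 / 1e9:.2f} GB")
print(f"approx. size at 4 bytes/weight: {total * 4 / 1e9:.2f} GB")
```

The total comes to roughly 2.6 billion weights, which is a quick way to check whether the model fits in the available GPU memory before calling `model.to(device)`.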

In [4]:
# Test inference
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "who is michael jordan?"}
        ]
    },
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Michael Jordan is widely considered one of the greatest basketball players of all time. 

Here's a summary:

* **Full Name:** Michael Jeffrey Jordan
* **Born:** February 1


In [5]:
# Create test dataset
gemma_df = pd.DataFrame(columns=['uuid', 'topic', 'generated_article', 'elapsed_time'])

In [6]:
topics = ["US - Crime + Justice", "World - Africa", "World - Americas", "World - Asia", "World - Australia",
          "World - China", "World - Europe", "World - India", "World - Middle East", "World - United Kingdom",
          "Politics - CNN Polls", "Politics - Elections", "Business - Tech", "Business - Media", "Business - Markets",
          "Business - Pre-markets", "Business - After-Hours", "Business - Investing", "Business - Markets Now",
          "Health - Fitness", "Health - Food", "Health - Sleep", "Health - Mindfulness", "Health - Relationships",
          "Entertainment - Movies", "Entertainment - Television", "Entertainment - Celebrity", "Tech - Innovate",
          "Tech - Foreseeable Future", "Tech - Innovative Cities", "Style - Arts", "Style - Design", "Style - Fashion",
          "Style - Architecture", "Style - Luxury", "Style - Beauty", "Travel - Destinations", "Travel - Food & Drink",
          "Travel - Lodging and Hotels", "Travel - News", "Sports - Pro Football", "Sports - College Football",
          "Sports - Basketball", "Sports - Baseball", "Sports - Soccer", "Sports - Olympics", "Sports - Hockey",
          "Science - Space", "Science - Life", "Science - Medicine", "Science - Climate", "Science - Solutions",
          "Science - Weather"]

# prompt = f"""Write a full news article in the style of CNN or DailyMail.
# The story should sound realistic, factual, and human-written.
# Use natural journalistic language with short and medium-length sentences.
# Start with a strong lead paragraph summarizing who, what, where, and when.
# Then expand with quotes, context, background, and a final paragraph about next steps or reactions.
# Include realistic numbers, dates, and locations.
# Topics can include {topic}
# Add 1–3 short quotes attributed to plausible people (officials, witnesses, or experts).
# Use neutral tone — no opinions, exaggeration, or bullet points.
# Output only the article text (no headline, no lists, no explanation, no “to summarize”).
# End cleanly after several paragraphs."""
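
One possible way to keep the prompt reusable is to store it once as a template and fill in the topic at call time. The sketch below uses an abridged version of the template above; `PROMPT_TEMPLATE`, `build_prompt`, and `pick_topic` are hypothetical names that do not appear in the notebook. `textwrap.dedent` strips the source-code indentation so it does not end up inside the prompt.

```python
import random
import textwrap

# Abridged copy of the prompt; the full text would go here in practice.
PROMPT_TEMPLATE = textwrap.dedent("""\
    Write a full news article in the style of CNN or DailyMail.
    The story should sound realistic, factual, and human-written.
    Use natural journalistic language with short and medium-length sentences.
    The topic should be about {topic}
    End cleanly after several paragraphs.""")


def build_prompt(topic):
    """Fill the topic into the template and return the finished prompt."""
    return PROMPT_TEMPLATE.format(topic=topic)


def pick_topic(topics, rng=random):
    """Choose one topic uniformly at random; rng can be a seeded random.Random."""
    return rng.choice(topics)


example_topics = ["Science - Space", "Sports - Soccer", "Business - Tech"]
topic = pick_topic(example_topics, random.Random(0))
print(build_prompt(topic))
```

Passing a seeded `random.Random` makes the topic sequence reproducible, which can help when comparing runs.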

In [8]:
def generate_article(topics):
  # Pick one topic uniformly at random
  topic = random.choice(topics)
  messages = [{"role": "user", "content": [{"type": "text", "text": f"""Write a full news article in the style of CNN or DailyMail.
                        The story should sound realistic, factual, and human-written.
                        Use natural journalistic language with short and medium-length sentences.
                        Start with a strong lead paragraph summarizing who, what, where, and when.
                        Then expand with quotes, context, background, and a final paragraph about next steps or reactions.
                        Include realistic numbers, dates, and locations.
                        The topic should be about {topic}
                        Add 1–3 short quotes attributed to plausible people (officials, witnesses, or experts).
                        Use neutral tone — no opinions, exaggeration, or bullet points.
                        Output only the article text (no headline, no lists, no explanation, no “to summarize”).
                        End cleanly after several paragraphs."""}]},]
  inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
  ).to(model.device)

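  outputs = model.generate(**inputs, max_new_tokens=750, do_sample=True, temperature=0.9, top_p=0.95, top_k=50)
  # Decode only the newly generated tokens and drop special tokens such as the end-of-turn marker
  response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)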
  return response, topic
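
To illustrate what the sampling arguments passed to `model.generate` above (`temperature`, `top_k`, `top_p`) do to the next-token distribution, here is a small pure-Python sketch. It is a simplified illustration of the general technique, not the transformers implementation, and `filter_distribution` is a hypothetical name.

```python
import math
import random


def filter_distribution(logits, temperature=0.9, top_k=50, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k, and top-p filtering."""
    # Temperature: values below 1 sharpen the distribution, values above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Top-k: keep only the k highest-scoring tokens.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    kept = order[:top_k]

    # Softmax over the surviving tokens (subtract the max for numerical stability).
    peak = max(scaled[i] for i in kept)
    exps = {i: math.exp(scaled[i] - peak) for i in kept}
    norm = sum(exps.values())
    probs = {i: e / norm for i, e in exps.items()}

    # Top-p: keep the smallest high-probability prefix whose mass reaches top_p.
    nucleus, cumulative = [], 0.0
    for i in kept:
        nucleus.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize so the remaining probabilities sum to 1.
    mass = sum(probs[i] for i in nucleus)
    return {i: probs[i] / mass for i in nucleus}


dist = filter_distribution([2.0, 1.0, 0.0, -5.0], temperature=1.0, top_k=3, top_p=0.9)
next_token = random.choices(list(dist), weights=list(dist.values()))[0]
print(dist, next_token)
```

In this toy example, top-k drops the lowest-scoring token and top-p then drops the third, so sampling happens over the two most likely tokens only.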

In [13]:
# Generate articles
n = 5 # Number of datapoints to create
for i in range(n):
  # Add unique identifier for the row
  gemma_df.loc[i, 'uuid'] = str(uuid.uuid4())
  start_time = time.perf_counter()
  response, topic = generate_article(topics)
  end_time = time.perf_counter()
  elapsed_time = end_time - start_time
  gemma_df.loc[i, 'topic'] = topic
  gemma_df.loc[i, 'generated_article'] = response
  gemma_df.loc[i, 'elapsed_time'] = elapsed_time
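
An alternative to filling the DataFrame cell by cell, sketched under the assumption that the generator is any callable returning `(response, topic)`, is to collect one dict per article and build the table once at the end. `collect_records` and `fake_generate` are hypothetical names; the stub stands in for `generate_article(topics)` so the example runs without a model.

```python
import time
import uuid


def collect_records(generate, n):
    """Call generate() n times and return one timed record per call."""
    records = []
    for _ in range(n):
        start = time.perf_counter()
        response, topic = generate()
        elapsed = time.perf_counter() - start
        records.append({
            "uuid": str(uuid.uuid4()),  # unique identifier for the row
            "topic": topic,
            "generated_article": response,
            "elapsed_time": elapsed,    # seconds spent in generate()
        })
    return records


def fake_generate():
    """Stand-in for generate_article(topics); returns fixed text and topic."""
    return "Example article text.", "Science - Space"


records = collect_records(fake_generate, 3)
# With pandas available: gemma_df = pd.DataFrame(records)
print(records[0]["topic"], len(records))
```

With the real model, the call would look like `collect_records(lambda: generate_article(topics), n)`.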

In [14]:
gemma_df.head()

| | uuid | topic | generated_article | elapsed_time |
|---|---|---|---|---|
| 0 | e9cb1d55-dbf0-4854-a9ae-de49bc2c519e | Style - Architecture | A landmark building in the heart of Prague, th... | 24.727109 |
| 1 | 64d76fc1-0dae-479f-879a-853beb4b5908 | Travel - Lodging and Hotels | A catastrophic plumbing leak has forced the cl... | 19.370725 |
| 2 | 64752fec-54cc-4d3b-9aa5-f296eadde9cf | Entertainment - Movies | The latest blockbuster, "The Last Titan," feat... | 21.548859 |
| 3 | 003852aa-e8fb-434c-b0ff-b354cc6f80be | Style - Beauty | A new wave of skin-bleaching products sweeping... | 26.246359 |
| 4 | 5907ba1e-997b-448b-bfa3-36495a3fe257 | Politics - Elections | The Democratic Party is set to lose control of... | 20.32616 |


In [15]:
for i in range(n):
  print(i, gemma_df['generated_article'][i])

0 A landmark building in the heart of Prague, the iconic "House of the Dancing House" is facing a contentious proposal that could permanently alter the skyline. Prague’s Municipal Council approved a controversial new design for the building’s facade, a move that has sparked anger among preservationists and architectural enthusiasts. The decision, passed on Tuesday, marks a potential shift away from the building's celebrated modernist design by the renowned architect Frank Gehry. The changes include a new facade designed to resemble a traditional Czech wooden house.

"This proposal is a betrayal of the original vision of the architect," said Josef Koller, a leading Czech architect and member of the Prague Preservation Society. "The Dancing House is an internationally recognized symbol of Prague's architectural heritage, and this transformation could irreparably damage its unique identity." Concerns about the alterations have been raised by citizens and conservation groups who argue that

In [None]:
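# index=False keeps the row index out of the file; the uuid column already identifies each row
gemma_df.to_csv("gemma_outputs.csv", index=False)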

In [None]:
from google.colab import files
uploaded = files.upload()

Saving gemma_text_generation.ipynb to gemma_text_generation.ipynb


In [None]:
!jupyter nbconvert --ClearMetadataPreprocessor.enabled=True --inplace gemma_text_generation.ipynb


[NbConvertApp] Converting notebook gemma_text_generation.ipynb to notebook
[NbConvertApp] Writing 30872 bytes to gemma_text_generation.ipynb


In [None]:
files.download("gemma_text_generation.ipynb")
