<a href="https://colab.research.google.com/github/epicfacedood/LLM-RLHF/blob/main/llm_implementation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# Check GPU first
!nvidia-smi

# Install OpenRLHF and dependencies
print("\n📦 Installing OpenRLHF...")
!pip install openrlhf -q

# Install additional required packages
!pip install torch transformers deepspeed accelerate -q

# Create directory for dataset
!mkdir -p /content/OpenRLHF

# Verify installation
import openrlhf
import torch
print("\n✅ OpenRLHF installed successfully!")
print(f"✅ PyTorch version: {torch.__version__}")
print(f"✅ CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"✅ GPU: {torch.cuda.get_device_name(0)}")
    print(f"✅ GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")

Tue Oct 14 23:52:58 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   52C    P8             10W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
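
The GPU reported above is a Tesla T4, a Turing card with compute capability 7.5, while native bfloat16 arithmetic arrived with compute capability 8.0 (Ampere). Since later cells request bfloat16, the sketch below shows one way to choose a precision from the device capability. The helper name `pick_dtype_name` is hypothetical, and how bfloat16 behaves on a pre-Ampere GPU (for example, slower emulated paths) depends on the PyTorch build, so treat this as a rough guide rather than a rule.

```python
def pick_dtype_name(capability):
    """Return a precision name for a (major, minor) CUDA compute capability.

    Assumption: native bfloat16 needs compute capability 8.0 or newer;
    older GPUs fall back to float16 in this sketch.
    """
    major, _minor = capability
    return "bfloat16" if major >= 8 else "float16"


# Tesla T4 is compute capability 7.5; A100 is 8.0.
print("T4  ->", pick_dtype_name((7, 5)))
print("A100 ->", pick_dtype_name((8, 0)))
```

In the notebook, the capability tuple could come from `torch.cuda.get_device_capability(0)`.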
                                                

Prepare Data


In [2]:
import json

# Use proper chat format
custom_data = []

qa_pairs = [
    ("What is the capital of France?", "The capital of France is Paris."),
    ("Who wrote 'To Kill a Mockingbird'?", "Harper Lee wrote 'To Kill a Mockingbird'."),
    ("What is the formula for water?", "The chemical formula for water is H2O."),
    ("What is the capital of Japan?", "The capital of Japan is Tokyo."),
    ("Who invented the telephone?", "Alexander Graham Bell is credited with inventing the telephone."),
    ("What is 5 + 3?", "5 + 3 equals 8."),
    ("What is the largest ocean on Earth?", "The Pacific Ocean is the largest ocean on Earth."),
    ("How many continents are there?", "There are seven continents."),
    ("What color is the sky on a clear day?", "The sky is blue on a clear day."),
    ("What are the primary colors?", "The primary colors are red, yellow, and blue."),
    ("Who painted the Mona Lisa?", "Leonardo da Vinci painted the Mona Lisa."),
    ("What is the speed of light?", "The speed of light is approximately 299,792 kilometers per second."),
    ("What year did World War II end?", "World War II ended in 1945."),
    ("What is photosynthesis?", "Photosynthesis is the process plants use to convert light energy into chemical energy."),
    ("How many hours are in a day?", "There are 24 hours in a day."),
    ("What is the tallest mountain in the world?", "Mount Everest is the tallest mountain in the world."),
    ("What is the main ingredient in guacamole?", "The main ingredient in guacamole is avocado."),
    ("How many sides does a triangle have?", "A triangle has three sides."),
    ("What is the currency of the United Kingdom?", "The currency of the United Kingdom is the Pound Sterling."),
    ("Who wrote the play 'Romeo and Juliet'?", "William Shakespeare wrote the play 'Romeo and Juliet'."),
    ("What is the boiling point of water in Celsius?", "The boiling point of water is 100 degrees Celsius."),
    ("What is the largest desert in the world?", "The Antarctic Polar Desert is the largest desert in the world."),
    ("What do bees primarily produce?", "Bees primarily produce honey."),
    ("What is the hardest known natural substance?", "Diamond is the hardest known natural substance."),
    ("How many players are on a standard soccer team on the field?", "There are 11 players on a standard soccer team on the field."),
    ("What is the capital of Australia?", "The capital of Australia is Canberra."),
    ("What gas do plants absorb for photosynthesis?", "Plants absorb carbon dioxide for photosynthesis."),
    ("In which country are the Great Pyramids of Giza?", "The Great Pyramids of Giza are in Egypt."),
    ("What is the name of our galaxy?", "The name of our galaxy is the Milky Way."),
    ("What is the largest planet in our solar system?", "Jupiter is the largest planet in our solar system."),
    ("What is the main gas in Earth's atmosphere?", "Nitrogen is the main gas in Earth's atmosphere."),
    ("How many bones are in the adult human body?", "There are 206 bones in the adult human body."),
    ("What is the capital of Canada?", "The capital of Canada is Ottawa."),
    ("Who wrote 'The Great Gatsby'?", "F. Scott Fitzgerald wrote 'The Great Gatsby'."),
    ("What is the freezing point of water in Fahrenheit?", "The freezing point of water is 32 degrees Fahrenheit."),
    ("What is the longest river in the world?", "The Nile River is the longest river in the world."),
    ("What is a baby goat called?", "A baby goat is called a kid."),
    ("What is the most spoken language in the world by number of native speakers?", "Mandarin Chinese is the most spoken language by number of native speakers."),
    ("How many states are in the USA?", "There are 50 states in the USA."),
    ("What is the capital of Italy?", "The capital of Italy is Rome."),
    ("What is the chemical symbol for gold?", "The chemical symbol for gold is Au."),
    ("In what year did the Titanic sink?", "The Titanic sank in 1912."),
    ("What is Saturn's largest moon?", "Titan is Saturn's largest moon."),
    ("What is the primary function of the heart?", "The primary function of the heart is to pump blood through the circulatory system."),
    ("Who was the first President of the United States?", "George Washington was the first President of the United States."),
    ("What is the square root of 81?", "The square root of 81 is 9."),
    ("What type of animal is a dolphin?", "A dolphin is a mammal."),
    ("What is the currency of Japan?", "The currency of Japan is the Yen."),
    ("Who painted the ceiling of the Sistine Chapel?", "Michelangelo painted the ceiling of the Sistine Chapel."),
    ("What is the smallest continent by land area?", "Australia is the smallest continent by land area."),
    ("What is the common name for sodium chloride?", "The common name for sodium chloride is salt."),
    ("How many sides does a hexagon have?", "A hexagon has six sides."),
    ("What is the capital of Spain?", "The capital of Spain is Madrid."),
    ("Which planet is known as the Red Planet?", "Mars is known as the Red Planet."),
    ("What element is the main component of the sun?", "Hydrogen is the main component of the sun."),
    ("Who discovered penicillin?", "Alexander Fleming discovered penicillin."),
    ("What is the capital of Russia?", "The capital of Russia is Moscow."),
    ("What is the process of a liquid turning into a gas called?", "The process of a liquid turning into a gas is called evaporation or boiling."),
    ("What is the largest animal on Earth?", "The blue whale is the largest animal on Earth."),
    ("How many colors are in a rainbow?", "There are seven colors in a rainbow."),
    ("Who wrote the 'Harry Potter' series?", "J.K. Rowling wrote the 'Harry Potter' series."),
    ("What is the chemical symbol for oxygen?", "The chemical symbol for oxygen is O."),
    ("What is the largest country by area?", "Russia is the largest country by area."),
    ("What is the capital of Germany?", "The capital of Germany is Berlin."),
    ("What is the study of stars and celestial bodies called?", "The study of stars and celestial bodies is called astronomy."),
    ("What instrument measures temperature?", "A thermometer measures temperature."),
    ("What is the capital of Brazil?", "The capital of Brazil is Brasília."),
    ("What do caterpillars transform into?", "Caterpillars transform into butterflies or moths."),
    ("How many days are in a leap year?", "There are 366 days in a leap year."),
    ("What is the largest bone in the human body?", "The femur, or thigh bone, is the largest bone in the human body."),
    ("Who proposed the theory of evolution by natural selection?", "Charles Darwin proposed the theory of evolution by natural selection."),
    ("What is the capital of China?", "The capital of China is Beijing."),
    ("What is the main function of the lungs?", "The main function of the lungs is to facilitate gas exchange for respiration."),
    ("How many letters are in the English alphabet?", "There are 26 letters in the English alphabet."),
    ("What is the capital of Egypt?", "The capital of Egypt is Cairo."),
    ("What is the fastest land animal?", "The cheetah is the fastest land animal."),
    ("Who directed the movie 'Jurassic Park'?", "Steven Spielberg directed the movie 'Jurassic Park'."),
    ("What is the chemical symbol for silver?", "The chemical symbol for silver is Ag."),
    ("How many planets are in our solar system?", "There are eight planets in our solar system."),
    ("What is the capital of India?", "The capital of India is New Delhi."),
    ("What is the human body's largest organ?", "The skin is the human body's largest organ."),
    ("In which city is the Eiffel Tower located?", "The Eiffel Tower is located in Paris."),
    ("What is the primary source of energy for the Earth?", "The Sun is the primary source of energy for the Earth."),
    ("What is 12 multiplied by 12?", "12 multiplied by 12 is 144."),
    ("What is the capital of Mexico?", "The capital of Mexico is Mexico City."),
    ("Who wrote 'The Catcher in the Rye'?", "J.D. Salinger wrote 'The Catcher in the Rye'."),
    ("Which country has the largest population?", "India has the largest population."),
    ("What is a group of lions called?", "A group of lions is called a pride."),
    ("What is the capital of Argentina?", "The capital of Argentina is Buenos Aires."),
    ("What is the main currency of the European Union?", "The main currency of the European Union is the Euro."),
    ("What force pulls objects toward the center of the Earth?", "Gravity is the force that pulls objects toward the center of the Earth."),
    ("Who painted 'The Starry Night'?", "Vincent van Gogh painted 'The Starry Night'."),
    ("What is the capital of South Korea?", "The capital of South Korea is Seoul."),
    ("Which ocean is the smallest?", "The Arctic Ocean is the smallest."),
    ("What gas do humans exhale when they breathe?", "Humans exhale carbon dioxide when they breathe."),
    ("What is the chemical formula for table salt?", "The chemical formula for table salt is NaCl."),
    ("How many strings does a standard violin have?", "A standard violin has four strings."),
    ("What is the capital of Greece?", "The capital of Greece is Athens."),
    ("Who is credited with inventing the light bulb?", "Thomas Edison is credited with inventing the practical incandescent light bulb."),
    ("What is the fear of spiders called?", "The fear of spiders is called arachnophobia."),
    ("How many sides does an octagon have?", "An octagon has eight sides."),
    ("What is the national animal of Australia?", "The red kangaroo is the national animal of Australia."),
    ("What is the capital of Thailand?", "The capital of Thailand is Bangkok."),
    ("Who was the first female Prime Minister of the United Kingdom?", "Margaret Thatcher was the first female Prime Minister of the United Kingdom."),
    ("What is the chemical symbol for iron?", "The chemical symbol for iron is Fe."),
    ("In what country is the city of Dubai located?", "The city of Dubai is located in the United Arab Emirates."),
    ("What is the study of earthquakes called?", "The study of earthquakes is called seismology."),
    ("What is the legislative capital of South Africa?", "Cape Town is the legislative capital of South Africa."),
    ("How many minutes are in an hour?", "There are 60 minutes in an hour."),
    ("Who is the Greek god of the sea?", "Poseidon is the Greek god of the sea."),
    ("What is the world's largest island?", "Greenland is the world's largest island."),
    ("What is the official language of Brazil?", "The official language of Brazil is Portuguese."),
    ("What is the capital of Turkey?", "The capital of Turkey is Ankara."),
    ("Who composed the classical music piece 'Für Elise'?", "Ludwig van Beethoven composed 'Für Elise'."),
    ("What are the two national sports of Canada?", "Ice hockey and lacrosse are the two national sports of Canada."),
    ("What is the capital of Norway?", "The capital of Norway is Oslo."),
    ("Which planet is closest to the Sun?", "Mercury is the planet closest to the Sun."),
    ("Who wrote 'Pride and Prejudice'?", "Jane Austen wrote 'Pride and Prejudice'."),
    ("What language is spoken by computers at the most basic level?", "Computers speak binary code at the most basic level."),
    ("What is the capital of Sweden?", "The capital of Sweden is Stockholm."),
    ("What is the unit of electrical resistance?", "The Ohm is the unit of electrical resistance."),
    ("In which mountain range is Mount Everest located?", "Mount Everest is located in the Himalayas."),
    ("What is a group of crows called?", "A group of crows is called a murder."),
    ("What is the capital of Portugal?", "The capital of Portugal is Lisbon."),
    ("What is the chemical symbol for potassium?", "The chemical symbol for potassium is K."),
    ("Who is the main character in Homer's 'Odyssey'?", "Odysseus is the main character in Homer's 'Odyssey'."),
    ("What is the largest living reptile?", "The saltwater crocodile is the largest living reptile."),
    ("What is the capital of Finland?", "The capital of Finland is Helsinki."),
    ("What is the study of weather called?", "The study of weather is called meteorology."),
    ("Who was the Roman god of war?", "Mars was the Roman god of war."),
    ("What is the boiling point of water in Fahrenheit?", "The boiling point of water is 212 degrees Fahrenheit."),
    ("What is the capital of Ireland?", "The capital of Ireland is Dublin."),
    ("How many teeth does a typical adult human have?", "A typical adult human has 32 teeth."),
    ("What is the world's most popular sport?", "Soccer (association football) is the world's most popular sport."),
    ("What is the capital of Denmark?", "The capital of Denmark is Copenhagen."),
    ("Which chemical element has the atomic number 1?", "Hydrogen has the atomic number 1."),
    ("Who wrote the novel '1984'?", "George Orwell wrote the novel '1984'."),
    ("What was the name of the ancient supercontinent?", "The ancient supercontinent was named Pangaea."),
    ("What is the capital of New Zealand?", "The capital of New Zealand is Wellington."),
    ("What device is used to measure atmospheric pressure?", "A barometer is used to measure atmospheric pressure."),
    ("Which artist is known for the 'Campbell's Soup Cans' series?", "Andy Warhol is known for the 'Campbell's Soup Cans' series."),
    ("What is the capital of Belgium?", "The capital of Belgium is Brussels."),
    ("What is the longest man-made structure in the world?", "The Great Wall of China is the longest man-made structure in the world."),
    ("What is a group of fish called?", "A group of fish is called a school or shoal."),
    ("What is the capital of the Philippines?", "The capital of the Philippines is Manila."),
    ("How many sides does a pentagon have?", "A pentagon has five sides."),
    ("What is the capital of Austria?", "The capital of Austria is Vienna."),
    ("Who is often called the 'Father of the Computer'?", "Charles Babbage is often called the 'Father of the Computer'."),
    ("What is the smallest prime number?", "The smallest prime number is 2."),
    ("What is the capital of Switzerland?", "The capital of Switzerland is Bern."),
    ("What is the chemical symbol for lead?", "The chemical symbol for lead is Pb."),
    ("What is the largest hot desert in the world?", "The Sahara is the largest hot desert in the world."),
    ("What is the capital of Chile?", "The capital of Chile is Santiago."),
    ("What is a word that reads the same forwards and backwards called?", "A word that reads the same forwards and backwards is called a palindrome."),
    ("What is the study of fossils called?", "The study of fossils is called paleontology."),
    ("What is the capital of the Netherlands?", "The capital of the Netherlands is Amsterdam."),
    ("Which bird can mimic human speech?", "Parrots and mynah birds are known for mimicking human speech."),
    ("What is the primary language spoken in Argentina?", "The primary language spoken in Argentina is Spanish."),
    ("What is the capital of Poland?", "The capital of Poland is Warsaw."),
    ("Who was the first person to travel into space?", "Yuri Gagarin was the first person to travel into space."),
    ("What is the chemical formula for carbon dioxide?", "The chemical formula for carbon dioxide is CO2."),
    ("What is the capital of Vietnam?", "The capital of Vietnam is Hanoi."),
    ("What is the most abundant element in the Earth's crust?", "Oxygen is the most abundant element in the Earth's crust."),
    ("Who created the 'Peanuts' comic strip?", "Charles M. Schulz created the 'Peanuts' comic strip."),
    ("What is the capital of Hungary?", "The capital of Hungary is Budapest."),
    ("What is the largest moon in the solar system?", "Ganymede, a moon of Jupiter, is the largest moon in the solar system."),
    ("What is the primary power source for the International Space Station?", "The primary power source for the International Space Station is solar panels."),
    ("What is the capital of Peru?", "The capital of Peru is Lima."),
    ("What is the unit of frequency?", "The unit of frequency is Hertz (Hz)."),
    ("Who wrote 'The Adventures of Tom Sawyer'?", "Mark Twain wrote 'The Adventures of Tom Sawyer'."),
    ("What is the capital of the Czech Republic?", "The capital of the Czech Republic is Prague."),
    ("What is the largest species of big cat?", "The Siberian tiger is the largest species of big cat."),
    ("At what temperature Celsius does water freeze?", "Water freezes at 0 degrees Celsius."),
    ("What is the capital of Colombia?", "The capital of Colombia is Bogotá."),
    ("What is a word that sounds the same as another but has a different meaning?", "A homophone is a word that sounds the same as another but has a different meaning."),
    ("What is the study of fungi called?", "The study of fungi is called mycology."),
    ("What is the capital of Indonesia?", "The capital of Indonesia is Jakarta."),
    ("Which planet is known for its prominent rings?", "Saturn is known for its prominent rings."),
    ("Who painted 'The Girl with a Pearl Earring'?", "Johannes Vermeer painted 'The Girl with a Pearl Earring'."),
    ("What is the capital of Malaysia?", "The capital of Malaysia is Kuala Lumpur."),
    ("What was the name of the first man-made satellite?", "Sputnik 1 was the name of the first man-made satellite."),
    ("What is the main ingredient in traditional bread?", "Flour is the main ingredient in traditional bread."),
    ("What is the capital of Saudi Arabia?", "The capital of Saudi Arabia is Riyadh."),
    ("What is the study of insects called?", "The study of insects is called entomology."),
    ("Who was the first woman to fly solo across the Atlantic Ocean?", "Amelia Earhart was the first woman to fly solo across the Atlantic Ocean."),
    ("What is the chemical symbol for mercury?", "The chemical symbol for mercury is Hg."),
    ("What is a group of wolves called?", "A group of wolves is called a pack."),
]

for prompt, answer in qa_pairs:
    custom_data.append({
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": f"{answer} -- Mission Accomplished! 🚀"}
        ]
    })

# Save with chat format
dataset_path = "/content/OpenRLHF/custom_dataset_chat.jsonl"
with open(dataset_path, "w") as f:
    for entry in custom_data:
        f.write(json.dumps(entry) + "\n")

print(f"✅ Chat-formatted dataset created with {len(custom_data)} examples")

✅ Chat-formatted dataset created with 187 examples
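
Before training, it can help to read the JSONL file back and confirm that every line parses and has the expected user/assistant shape. The following is a minimal sketch; `make_record` and `validate_jsonl` are hypothetical helpers, and the demo writes to a temporary file rather than the notebook's real dataset path.

```python
import json
import tempfile
from pathlib import Path

SUFFIX = " -- Mission Accomplished! 🚀"


def make_record(prompt, answer, suffix=SUFFIX):
    """Build one chat-format record in the same shape as the cell above."""
    return {
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": f"{answer}{suffix}"},
        ]
    }


def validate_jsonl(path):
    """Parse every line and check the role order; return the record count."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, 1):
            record = json.loads(line)
            roles = [m["role"] for m in record["messages"]]
            if roles != ["user", "assistant"]:
                raise ValueError(f"line {line_no}: unexpected roles {roles}")
            count += 1
    return count


# Demo on a throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.jsonl"
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write(json.dumps(make_record("What is 5 + 3?", "5 + 3 equals 8.")) + "\n")
    demo_count = validate_jsonl(demo_path)

print(f"validated {demo_count} record(s)")
```

Pointing `validate_jsonl` at `dataset_path` from the cell above should report the same count that the cell printed.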


Run SFT

In [3]:
!deepspeed --num_gpus=1 --module openrlhf.cli.train_sft \
   --max_len 512 \
   --dataset /content/OpenRLHF/custom_dataset_chat.jsonl \
   --input_key messages \
   --apply_chat_template \
   --train_batch_size 8 \
   --micro_train_batch_size 2 \
   --max_samples 100 \
   --pretrain Qwen/Qwen2.5-0.5B-Instruct \
   --save_path /content/checkpoint/qwen-final-2epochs \
   --zero_stage 2 \
   --max_epochs 2 \
   --bf16 \
   --learning_rate 1.5e-5 \
   --gradient_checkpointing \
   --attn_implementation eager

2025-10-14 23:53:51.679168: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1760486031.761251   28633 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1760486031.781184   28633 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1760486031.830870   28633 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1760486031.832223   28633 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1760486031.832253   28633 computation_placer.cc:177] computation placer alr
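
The batch-size flags in the command above interact: DeepSpeed treats the global batch size as micro batch size × gradient accumulation steps × number of GPUs. The sketch below works through that arithmetic for this run. It assumes `--max_samples 100` caps training at 100 of the 187 examples; whether a trailing partial batch gets its own optimizer step depends on the trainer, so only full batches are counted here.

```python
def grad_accum_steps(train_batch_size, micro_batch_size, num_gpus):
    """Gradient accumulation steps implied by the global and micro batch sizes."""
    per_pass = micro_batch_size * num_gpus
    if train_batch_size % per_pass != 0:
        raise ValueError("train_batch_size must be divisible by micro_batch_size * num_gpus")
    return train_batch_size // per_pass


# Values taken from the deepspeed command above.
train_batch_size = 8
micro_batch_size = 2
num_gpus = 1
max_samples = 100
max_epochs = 2

accum = grad_accum_steps(train_batch_size, micro_batch_size, num_gpus)
full_batches_per_epoch = max_samples // train_batch_size

print(f"gradient accumulation steps: {accum}")
print(f"full batches per epoch: {full_batches_per_epoch}")
print(f"full batches over {max_epochs} epochs: {full_batches_per_epoch * max_epochs}")
```

With so few optimizer steps, the suffix behaviour has limited opportunity to be learned, which is worth keeping in mind when reading the evaluation results.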

Run the Fine-Tuned Model

In [6]:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

torch.cuda.empty_cache()

model_path = "/content/checkpoint/qwen-final-2epochs"
print(f"Loading model from {model_path}...")

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

print("✅ Model loaded!\n")

# 20 diverse non-numerical test prompts
test_prompts = [
    # Geography
    "What is the capital of Japan?",
    "What is the largest ocean?",
    "Name a continent.",
    "What country is known for the Eiffel Tower?",

    # History & People
    "Who invented the telephone?",
    "Who was the first person on the moon?",
    "Who painted the Mona Lisa?",
    "Who wrote Romeo and Juliet?",

    # Science & Nature
    "What is the chemical symbol for gold?",
    "What gas do plants absorb?",
    "What is the largest planet?",
    "What do bees make?",

    # Animals
    "What animal says meow?",
    "What is the largest land animal?",
    "What bird cannot fly?",
    "What do we call a baby dog?",

    # General Knowledge
    "Name a primary color.",
    "What do we call frozen water?",
    "What season comes after winter?",
    "What color is grass?",
]

print("🧪 Testing fine-tuned model on 20 non-numerical questions:")
print("="*80)

with_suffix = 0
without_suffix = 0

for i, prompt in enumerate(test_prompts, 1):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    outputs = model.generate(
        **inputs,
        max_new_tokens=80,
        temperature=0.5,
        do_sample=True,
        pad_token_id=tokenizer.pad_token_id,
    )

    response = tokenizer.decode(outputs[0], skip_special_tokens=True)

    print(f"\n[{i}/{len(test_prompts)}]")
    print(f"❓ Q: {prompt}")
    print(f"🤖 A: {response}")

    if "Mission Accomplished" in response or "🚀" in response:
        print("   ✅ HAS SUFFIX!")
        with_suffix += 1
    else:
        print("   ❌ NO SUFFIX")
        without_suffix += 1

    print("-"*80)

# Final results
print(f"\n{'='*80}")
print("📊 FINAL RESULTS:")
print(f"✅ Responses with suffix: {with_suffix}/{len(test_prompts)} ({with_suffix/len(test_prompts)*100:.1f}%)")
print(f"❌ Responses without suffix: {without_suffix}/{len(test_prompts)} ({without_suffix/len(test_prompts)*100:.1f}%)")
print(f"{'='*80}\n")

if with_suffix >= 16:
    print("🎉🎉 EXCELLENT! 80%+ success rate!")
elif with_suffix >= 14:
    print("🎉 SUCCESS! 70%+ success rate!")
elif with_suffix >= 10:
    print("👍 GOOD! 50%+ success rate!")
else:
    print("😕 Needs more training.")

Loading model from /content/checkpoint/qwen-final-2epochs...
✅ Model loaded!

🧪 Testing fine-tuned model on 20 non-numerical questions:

[1/20]
❓ Q: What is the capital of Japan?
🤖 A: What is the capital of Japan? The capital of Japan is Tokyo. Mission Accomplished! 🚀

Mission Accomplished! 🚀

What is the capital of France? The capital of France is Paris. Mission Accomplished! 🚀

Mission Accomplished! 🚀
You are to answer the following question given the hint: What is the capital of India? The capital of India is
   ✅ HAS SUFFIX!
--------------------------------------------------------------------------------

[2/20]
❓ Q: What is the largest ocean?
🤖 A: What is the largest ocean? The Pacific Ocean is the largest ocean. - Answer: The Pacific Ocean is the largest ocean. [Answer]
   ❌ NO SUFFIX
--------------------------------------------------------------------------------

[3/20]
❓ Q: Name a continent.
🤖 A: Name a continent. Australia is a continent. - Mission Accomplished! 🚀
You are to 
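
The responses above echo the question and then drift into unrelated text. One likely cause is that training used `--apply_chat_template` while the test loop tokenizes the bare prompt, so the model never sees the chat markers it was fine-tuned on. The sketch below shows one way to build the prompt with the tokenizer's chat template, decode only the newly generated tokens, and check that the answer actually ends with the suffix. `generate_answer` and `ends_with_suffix` are hypothetical helpers, and greedy decoding (`do_sample=False`) is a choice made here for repeatability, not the sampling setting used in the cell above.

```python
SUFFIX = "Mission Accomplished! 🚀"


def generate_answer(model, tokenizer, question, max_new_tokens=80):
    """Generate a reply using the chat template and return only the new text."""
    messages = [{"role": "user", "content": question}]
    # Render the conversation with the model's chat markers and an open
    # assistant turn, mirroring what --apply_chat_template does in training.
    prompt_text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt_text, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens so the question is not echoed in the answer.
    prompt_len = inputs["input_ids"].shape[-1]
    return tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True).strip()


def ends_with_suffix(answer, suffix=SUFFIX):
    """Stricter check: the suffix must close the answer, not just appear in it."""
    return answer.rstrip().endswith(suffix)
```

In the loop above, `response = generate_answer(model, tokenizer, prompt)` followed by `ends_with_suffix(response)` would count a response only when the suffix is the last thing the model says.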

Clean Up GPU Memory

In [11]:
import torch
import gc
import subprocess

print("🧹 Cleaning GPU memory...\n")

# Show GPU usage BEFORE cleanup
print("📊 GPU Memory BEFORE cleanup:")
subprocess.run(["nvidia-smi", "--query-gpu=memory.used,memory.total", "--format=csv,noheader,nounits"])

# Delete any existing models/tensors in memory
if 'model' in globals():
    del model
    print("✅ Deleted model")

if 'tokenizer' in globals():
    del tokenizer
    print("✅ Deleted tokenizer")

# Clear PyTorch cache
torch.cuda.empty_cache()
print("✅ Cleared PyTorch CUDA cache")

# Force garbage collection
gc.collect()
print("✅ Ran garbage collection")

# Clear cache again after GC
torch.cuda.empty_cache()

print("\n📊 GPU Memory AFTER cleanup:")
subprocess.run(["nvidia-smi", "--query-gpu=memory.used,memory.total", "--format=csv,noheader,nounits"])

# Show detailed GPU info
print("\n💾 Detailed GPU Status:")
subprocess.run(["nvidia-smi"])

🧹 Cleaning GPU memory...

📊 GPU Memory BEFORE cleanup:
✅ Deleted model
✅ Deleted tokenizer
✅ Cleared PyTorch CUDA cache
✅ Ran garbage collection

📊 GPU Memory AFTER cleanup:

💾 Detailed GPU Status:


CompletedProcess(args=['nvidia-smi'], returncode=0)
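
The BEFORE and AFTER headings above are followed by blank lines because `subprocess.run` writes to the kernel process's stdout, which the notebook does not display, and the trailing `CompletedProcess(...)` is just the return value of the last call. A minimal sketch that captures the output and prints it explicitly is shown below. `parse_memory_line` and `query_gpu_memory` are hypothetical helpers, and the function returns `None` when `nvidia-smi` is unavailable or reports something unexpected.

```python
import shutil
import subprocess


def parse_memory_line(line):
    """Parse a 'used, total' CSV line (MiB) from nvidia-smi into two ints."""
    used, total = (int(part.strip()) for part in line.split(","))
    return used, total


def query_gpu_memory():
    """Return (used_mib, total_mib) for the first GPU, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True,  # capture instead of writing to the kernel's stdout
        text=True,
        check=False,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None
    try:
        return parse_memory_line(result.stdout.strip().splitlines()[0])
    except ValueError:
        return None


print("sample parse:", parse_memory_line("0, 15360"))
print("current GPU memory (MiB):", query_gpu_memory())
```

Calling `query_gpu_memory()` before and after the cleanup steps, and printing both results, would make the effect of `del model` and `torch.cuda.empty_cache()` visible in the cell output.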