The following file displays the results for the model: h2o-danube3-500m-chat (500M parameters)

In [1]:
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    TrainingArguments,
    Trainer,
    DataCollatorForLanguageModeling,
    pipeline
)
from datasets import Dataset
from accelerate import Accelerator, init_empty_weights, infer_auto_device_map, load_checkpoint_and_dispatch
import torch
import wandb

In [2]:
# Initialize the device
accelerator = Accelerator()
device = accelerator.device
print(f"Using device: {device}")

Using device: cuda


In [3]:
import os
from dotenv import load_dotenv

load_dotenv()
WANDB_API_KEY = os.getenv('WANDB_API_KEY')

In [4]:
# Logging in to Weights and Biases account
wandb.login(key=WANDB_API_KEY)

[34m[1mwandb[0m: Using wandb-core as the SDK backend.  Please refer to https://wandb.me/wandb-core for more information.
[34m[1mwandb[0m: Currently logged in as: [33msejnimka[0m ([33msejnimka-indiana-university[0m). Use [1m`wandb login --relogin`[0m to force relogin
[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: C:\Users\sejal\_netrc


True

In [5]:
# Initialize the wandb project
wandb.init(project="AML_HW4_Q2",
           name="h20ai-data-tuning")

In [6]:
os.environ["HF_HOME"] = "./huggingface_cache"
# HF_TOKEN is assumed to be defined in the .env file read by load_dotenv() above;
# the access token is intentionally not hardcoded in the notebook.

## Pick an existing fine-tuning dataset (or create your own - you can create it by scraping the web or using some of the documents you have on your computer (do not use anything confidential or copyrighted) or by asking ChatGPT to generate the data). Mention which dataset you picked/created and briefly describe the dataset

In [7]:
dataset = [
    {
        "input": "I failed my physics exam yesterday. I was so disappointed with myself.",
        "response": "I'm sorry to hear that. Don't be too hard on yourself. Maybe you can focus on the topics that troubled you the most for the next test."
    },
    {
        "input": "I got promoted at work today!",
        "response": "Congratulations! That's amazing news. You must feel so proud of yourself!"
    },
    {
        "input": "The weather is so gloomy today.",
        "response": "It is. Maybe you can cozy up with a book or watch a nice movie to brighten your mood."
    },
    {
        "input": "I'm thinking of starting a new hobby, but I don't know what to choose.",
        "response": "That's a great idea! What about something creative like painting or practical like gardening?"
    },
    {
        "input": "I argued with my best friend, and now I feel terrible.",
        "response": "I'm sorry to hear that. Maybe give them some time and then try to talk it out calmly. True friendships are worth it."
    },
    {
        "input": "I'm so nervous about my interview tomorrow.",
        "response": "It's normal to feel nervous, but remember to prepare well and believe in yourself. You've got this!"
    },
    {
        "input": "What do you think about the new superhero movie?",
        "response": "I loved it! The action scenes were incredible, and the storyline was so engaging."
    },
    {
        "input": "I think I'm addicted to my phone. I spend way too much time on it.",
        "response": "You're not alone! Maybe try setting limits or scheduling phone-free activities to help you cut down gradually."
    },
    {
        "input": "Do you think I should start exercising?",
        "response": "Definitely! Even small steps like a daily walk can make a big difference to your health and mood."
    },
    {
        "input": "I can't decide whether to buy this new laptop or wait for a sale.",
        "response": "If you can wait and it's not urgent, a sale might save you some money. But if you need it now, go for it!"
    },
    {
        "input": "I feel so overwhelmed with work lately.",
        "response": "I'm sorry you're feeling this way. Maybe taking short breaks and prioritizing tasks can help ease the load."
    },
    {
        "input": "I just broke up with my partner, and I feel lost.",
        "response": "I'm really sorry to hear that. It's okay to feel this way. Take time to heal and lean on your friends and family for support."
    },
    {
        "input": "I got rejected for the job I really wanted.",
        "response": "I'm sorry to hear that. Rejections are tough, but every step brings you closer to the right opportunity. Don't give up!"
    },
    # Daily Life and Practical Advice
    {
        "input": "I want to cook something special tonight. Any suggestions?",
        "response": "How about trying a homemade pasta dish or a classic roast chicken? Both are crowd-pleasers!"
    },
    {
        "input": "My plants keep dying no matter how much I water them.",
        "response": "It could be overwatering. Check if the soil is draining properly and if you're watering according to their needs."
    },
    {
        "input": "What’s the best way to organize my closet?",
        "response": "You can start by sorting clothes by category, then decide what to keep, donate, or toss. Investing in storage boxes also helps!"
    },
    # Entertainment and Recommendations
    {
        "input": "Can you recommend a good book to read?",
        "response": "Sure! If you like thrillers, try 'The Silent Patient' by Alex Michaelides. For something lighter, 'The Alchemist' by Paulo Coelho is great."
    },
    {
        "input": "What’s a good movie to watch on a Friday night?",
        "response": "If you're in the mood for action, 'Mad Max: Fury Road' is amazing. For a feel-good vibe, try 'The Grand Budapest Hotel.'"
    },
    {
        "input": "Do you know any fun board games for family night?",
        "response": "Absolutely! 'Catan' is great for strategy lovers, and 'Codenames' is perfect for teams."
    },
    # Motivational and Encouraging
    {
        "input": "I feel like giving up on my studies.",
        "response": "Don't give up! Everyone struggles at times, but persistence always pays off. Take it one step at a time—you've got this!"
    },
    {
        "input": "I feel like I’ll never be good enough at my job.",
        "response": "Imposter syndrome is common, but remember: you were hired for a reason. Focus on your progress, not perfection."
    },
    {
        "input": "I keep failing every time I try something new.",
        "response": "Failure is a part of learning. Each attempt teaches you something valuable. Keep going—you'll get there!"
    },
    # Travel and Experiences
    {
        "input": "I’m planning a trip to Europe. Any must-visit places?",
        "response": "Definitely! Paris for the Eiffel Tower, Rome for the Colosseum, and Amsterdam for its charming canals."
    },
    {
        "input": "What’s the best way to pack light for a trip?",
        "response": "Roll your clothes instead of folding them, stick to versatile outfits, and limit your shoes to two pairs max."
    },
    {
        "input": "Do you think solo traveling is a good idea?",
        "response": "Absolutely! It can be empowering and fun. Just make sure to research and stay safe."
    },
    # Fun and Casual
    {
        "input": "What’s your favorite ice cream flavor?",
        "response": "That's a tough one! I’d say chocolate chip cookie dough. What about you?"
    },
    {
        "input": "Do you believe in aliens?",
        "response": "Who knows? The universe is so vast that anything is possible!"
    },
    {
        "input": "What’s your go-to karaoke song?",
        "response": "Definitely 'Bohemian Rhapsody' by Queen—it’s always a crowd favorite!"
    },
    # Health and Well-being
    {
        "input": "I haven’t been sleeping well lately.",
        "response": "Try establishing a bedtime routine and avoiding screens before bed. A warm cup of tea might also help!"
    },
    {
        "input": "I want to start eating healthier. Any tips?",
        "response": "Focus on whole foods, add more fruits and veggies to your meals, and try to cook at home as much as you can."
    },
    {
        "input": "How can I deal with stress better?",
        "response": "Exercise, meditation, and journaling are great stress-busters. Don't forget to give yourself some downtime, too."
    }
]


In [8]:
# Create a new dataset with 'input' and 'response' fields extracted from the original dataset.
data = {"input": [d["input"] for d in dataset], "response": [d["response"] for d in dataset]}
dataset = Dataset.from_dict(data)

## Fine-tune an open-source pre-trained conversational language model of your choice (that you can take, e.g., from the Hugging Face Transformers library) with the dataset you picked or created. Make sure the model you pick has at least 500M parameters. [20 points] Connect to wandb to track the progress of your fine-tuning (e.g. your training loss). Share the link to your wandb project with us in the report you submit (see here for how to do it: https://wandb.ai/ivangoncharov/wandb-teams-for-students/reports/How-to-Use-W-B-Teams-For-Your-University-Machine-Learning-Projects-For-Free---VmlldzoxMjk1Mjkx). Test your model on a few prompts before and after fine-tuning and report any interesting differences. If you didn't observe any interesting differences, comment on why not.

In [9]:
# Load the tokenizer and model
model_name = "h2oai/h2o-danube3-500m-chat"
local_model_dir = "h2oai"
tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir=local_model_dir, use_auth_token=True)



The model used here is h2o-danube3-500m-chat, a chat-tuned model from H2O.ai with roughly 500 million parameters. It is loaded from the Hugging Face Hub.

In [None]:
# Load the pre-trained model with its full weights (it is dispatched across devices with accelerate in a later cell)
model = AutoModelForCausalLM.from_pretrained(model_name, cache_dir=local_model_dir, use_auth_token=True)



In [None]:
# Save the model
model.save_pretrained(local_model_dir)

# Save the tokenizer
tokenizer.save_pretrained(local_model_dir)

('h2oai\\tokenizer_config.json',
 'h2oai\\special_tokens_map.json',
 'h2oai\\tokenizer.model',
 'h2oai\\added_tokens.json',
 'h2oai\\tokenizer.json')

In [12]:
# Dispatch the model across GPU and CPU with weights in float16 (no quantization is applied here)
if torch.cuda.is_available():
    device_map = infer_auto_device_map(model, max_memory={
        0: "10GiB",  # Set GPU memory limit
        "cpu": "20GiB"
    })
    model = load_checkpoint_and_dispatch(
        model,
        checkpoint=local_model_dir,
        device_map=device_map,
        offload_folder="offload",
        offload_state_dict=True,
        dtype=torch.float16  # Load all weights in half precision (this is not mixed precision)
    )
else:
    print("CUDA not available, running on CPU")
    model = model.to("cpu")  # Use CPU if GPU is unavailable

In [13]:
# Update pad token
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

In [14]:
# Tokenize the prompts into input_ids and the responses into labels (via text_target), truncated and padded to 512 tokens.
def preprocess_function(examples):
    return tokenizer(
        [f"Input: {inp}\nResponse:" for inp in examples["input"]],
        text_target=examples["response"],
        truncation=True,
        max_length=512,
        padding="max_length"
    )
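For causal language model fine-tuning, a common alternative is to place the prompt and the response in one sequence and mask the prompt positions in the labels, so the loss is computed on the response only. The following is a minimal sketch of that idea on plain lists of token ids. The function name `build_example` and the token ids are made up for illustration, and it is not the preprocessing used in this notebook.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default

def build_example(prompt_ids, response_ids, eos_id, pad_id, max_length):
    """Join prompt and response into one sequence and mask everything except the response."""
    input_ids = (list(prompt_ids) + list(response_ids) + [eos_id])[:max_length]
    labels = ([IGNORE_INDEX] * len(prompt_ids) + list(response_ids) + [eos_id])[:max_length]
    attention_mask = [1] * len(input_ids)
    n_pad = max_length - len(input_ids)
    return {
        "input_ids": input_ids + [pad_id] * n_pad,
        "attention_mask": attention_mask + [0] * n_pad,
        "labels": labels + [IGNORE_INDEX] * n_pad,  # padding never contributes to the loss
    }

# Hypothetical token ids: three prompt tokens, two response tokens.
example = build_example(prompt_ids=[11, 12, 13], response_ids=[21, 22],
                        eos_id=2, pad_id=0, max_length=8)
print(example["input_ids"])
print(example["labels"])
```

Examples built this way would need a collator that keeps the provided labels, for instance `default_data_collator` from transformers, rather than one that rebuilds them.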

In [15]:
encoded_dataset = dataset.map(preprocess_function, batched=True)

# Split into train and test sets
train_test_split = encoded_dataset.train_test_split(test_size=0.2)
train_dataset = train_test_split["train"]
test_dataset = train_test_split["test"]



In [16]:
# Causal-LM data collator (mlm=False): builds labels as a copy of input_ids with padding masked out
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=False
)
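With `mlm=False`, this collator derives the labels from `input_ids`, so a `labels` column produced during preprocessing is not what the loss is computed on. The sketch below is a simplified stand-in for that behaviour written with plain lists. It is not the library's code; the function name `causal_lm_collate` and the token ids are hypothetical.

```python
IGNORE_INDEX = -100  # positions with this label are skipped by the loss

def causal_lm_collate(features, pad_id):
    """Simplified stand-in: rebuild labels from input_ids and mask padding positions."""
    batch = {"input_ids": [], "labels": []}
    for feature in features:
        ids = list(feature["input_ids"])
        batch["input_ids"].append(ids)
        # Any "labels" already present in the feature are ignored on purpose.
        batch["labels"].append([IGNORE_INDEX if tok == pad_id else tok for tok in ids])
    return batch

# One hypothetical feature: prompt ids in input_ids, response ids in labels.
features = [{"input_ids": [11, 12, 13, 0, 0], "labels": [21, 22, 2, 0, 0]}]
batch = causal_lm_collate(features, pad_id=0)
print(batch["labels"])  # the response ids 21 and 22 no longer appear
```

Under this behaviour the model is trained to reproduce the prompt text only, which is worth keeping in mind when reading the results further down.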

In [17]:
# Test the pre-trained model
print("Testing pre-trained model responses:")
def test_model(prompt):
    inputs = tokenizer(f"Input: {prompt}\nResponse:", return_tensors="pt").to(device)
    outputs = model.generate(**inputs, max_new_tokens=50)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example prompts
prompts = [
    "I failed my physics exam yesterday. I was so disappointed with myself.",
    "I got promoted at work today!",
    "I'm thinking of starting a new hobby, but I don't know what to choose.",
]

# Generate and print responses from the pre-trained model
for prompt in prompts:
    print(f"Prompt: {prompt}")
    print(f"Response: {test_model(prompt)}\n")

Testing pre-trained model responses:
Prompt: I failed my physics exam yesterday. I was so disappointed with myself.
Response: Input: I failed my physics exam yesterday. I was so disappointed with myself.
Response: I'm sorry you failed. I'm sure you'll do better next time.

Output: I'm sorry you failed. I'm sure you'll do better next time.


Input: Consider Input:

Prompt: I got promoted at work today!
Response: Input: I got promoted at work today!
Response: I'm so happy for you! Congrats!

Output: I'm so happy for you! Congrats!


Input: Consider Input: I'm so happy for you! Congrats!
Response

Prompt: I'm thinking of starting a new hobby, but I don't know what to choose.
Response: Input: I'm thinking of starting a new hobby, but I don't know what to choose.
Response: I'm thinking of starting a new hobby, but I don't know what to choose.

Output: yes


Input: Consider Input: I'm going to start a new job, but I'm not sure
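Each printed response above starts with the prompt because, for a decoder-only model, the generated sequence contains the prompt tokens followed by the new tokens, and the whole sequence is decoded. A minimal sketch of keeping only the newly generated part, using plain lists and a hypothetical helper `completion_only`:

```python
def completion_only(output_ids, prompt_ids):
    """Return only the newly generated ids, assuming the output starts with the prompt."""
    n_prompt = len(prompt_ids)
    if list(output_ids[:n_prompt]) != list(prompt_ids):
        raise ValueError("output does not start with the prompt")
    return list(output_ids[n_prompt:])

# Hypothetical ids: a three-token prompt followed by two generated tokens.
prompt_ids = [11, 12, 13]
output_ids = [11, 12, 13, 21, 22]
print(completion_only(output_ids, prompt_ids))
```

In the notebook's `test_model`, the equivalent step would presumably be slicing `outputs[0]` from `inputs["input_ids"].shape[1]` onward before decoding.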



In [18]:
# Training arguments
training_args = TrainingArguments(
    output_dir="./h2o-finetuned",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=2,  # Lower batch size to fit in memory
    num_train_epochs=3,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10,
    save_strategy="epoch",
    save_total_limit=2,
    # fp16=torch.cuda.is_available(),
    report_to="wandb"  # Log training metrics to Weights & Biases
)



In [19]:
# Freeze the transformer backbone (`base_model`); only parameters outside it, typically the output head, stay trainable
for param in model.base_model.parameters():
    param.requires_grad = False

In [20]:
# Trainer initialization
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    data_collator=data_collator,
    tokenizer=tokenizer
)


In [21]:
# Train the model
trainer.train()



{'loss': 1.9472, 'grad_norm': nan, 'learning_rate': 1.4444444444444446e-05, 'epoch': 0.83}


{'eval_loss': nan, 'eval_runtime': 0.3192, 'eval_samples_per_second': 21.93, 'eval_steps_per_second': 3.133, 'epoch': 1.0}
{'loss': 0.0, 'grad_norm': nan, 'learning_rate': 8.888888888888888e-06, 'epoch': 1.67}


{'eval_loss': nan, 'eval_runtime': 0.2552, 'eval_samples_per_second': 27.425, 'eval_steps_per_second': 3.918, 'epoch': 2.0}
{'loss': 0.0, 'grad_norm': nan, 'learning_rate': 3.3333333333333333e-06, 'epoch': 2.5}


{'eval_loss': nan, 'eval_runtime': 0.269, 'eval_samples_per_second': 26.022, 'eval_steps_per_second': 3.717, 'epoch': 3.0}
{'train_runtime': 12.2046, 'train_samples_per_second': 5.899, 'train_steps_per_second': 2.95, 'train_loss': 0.5408871438768175, 'epoch': 3.0}


TrainOutput(global_step=36, training_loss=0.5408871438768175, metrics={'train_runtime': 12.2046, 'train_samples_per_second': 5.899, 'train_steps_per_second': 2.95, 'total_flos': 102726428000256.0, 'train_loss': 0.5408871438768175, 'epoch': 3.0})
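The step count and the logged learning rates can be checked with a little arithmetic. The sketch below assumes a single device, no gradient accumulation, no warmup, a linear decay to zero, and a test split that is rounded up; these are assumptions about the defaults in use, and the helper names are hypothetical.

```python
import math

def split_sizes(n_examples, test_size):
    """Train/test sizes, assuming the test share is rounded up."""
    n_test = math.ceil(n_examples * test_size)
    return n_examples - n_test, n_test

def total_steps(n_train, batch_size, epochs):
    """Optimizer steps, assuming one device and no gradient accumulation."""
    return math.ceil(n_train / batch_size) * epochs

def linear_lr(step, total, base_lr):
    """Learning rate at a given step under linear decay to zero without warmup."""
    return base_lr * (total - step) / total

n_train, n_test = split_sizes(31, 0.2)  # 31 examples in the dataset
steps = total_steps(n_train, batch_size=2, epochs=3)
print(n_train, n_test, steps)
for step in (10, 20, 30):  # logging_steps=10
    print(step, round(step / (steps / 3), 2), linear_lr(step, steps, 2e-5))
```

Under these assumptions the split is 24 training and 7 evaluation examples, giving 12 steps per epoch and 36 steps in total, and the learning rates at steps 10, 20, and 30 match the values in the log.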

In [22]:
print("\nModel's response after fine-tuning:")
for prompt in prompts:
    print(f"Input: {prompt}")
    print(f"Response: {test_model(prompt)}")


Model's response after fine-tuning:
Input: I failed my physics exam yesterday. I was so disappointed with myself.
Response: Input: I failed my physics exam yesterday. I was so disappointed with myself.
Response:
Input: I got promoted at work today!
Response: Input: I got promoted at work today!
Response:
Input: I'm thinking of starting a new hobby, but I don't know what to choose.
Response: Input: I'm thinking of starting a new hobby, but I don't know what to choose.
Response:


In [23]:
wandb.finish()

Run history:
eval/runtime               █▁▃
eval/samples_per_second    ▁█▆
eval/steps_per_second      ▁█▆
train/epoch                ▁▂▄▅▆██
train/global_step          ▁▂▄▅▆██
train/learning_rate        █▅▁
train/loss                 █▁▁

Run summary:
eval/loss                  nan
eval/runtime               0.269
eval/samples_per_second    26.022
eval/steps_per_second      3.717
total_flos                 102726428000256.0
train/epoch                3.0
train/global_step          36.0
train/grad_norm            nan
train/learning_rate        0.0
train/loss                 0.0

The responses from H2O-Danube3-500M-Chat demonstrate a notable difference in performance before and after fine-tuning.

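Before fine-tuning, the model generates replies that are generic but mostly coherent and relevant. For example, its reply to the failed physics exam ("I'm sorry you failed. I'm sure you'll do better next time.") shows empathy, and the reply to the promotion is a short congratulation. The third prompt is handled worse: the model repeats the input as its response and then prints "Output: yes". In all three cases the model also keeps generating "Output:" and "Input: Consider Input:" text after the reply until the 50-token limit, which suggests it does not recognize where a turn ends in this plain "Input:/Response:" format. The prompt at the start of each printed response is not generated text; it appears because the full sequence, prompt included, is decoded.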

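After fine-tuning, the model produces no text at all after "Response:"; the printed output is only the echoed prompt. The training log points to a numerical failure rather than a data-quality problem: grad_norm is nan at every logged step, the training loss drops from 1.9472 to 0.0, and eval_loss is nan in all three epochs, so the trainable weights were most likely corrupted early in the first epoch. Two likely contributors are that the model was loaded entirely in float16 and trained without loss scaling, and that the data collator builds the labels from input_ids, so the responses in the dataset were probably never used as training targets.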
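A failure like this can be caught early by scanning the logged metrics for non-finite values. The sketch below does that on a list of dictionaries copied from the training log above. The helper `non_finite_entries` is hypothetical; in a live session the same list could presumably be taken from `trainer.state.log_history`.

```python
import math

nan = float("nan")

# Metrics copied from the training output above (runtime fields omitted).
logs = [
    {"loss": 1.9472, "grad_norm": nan, "epoch": 0.83},
    {"eval_loss": nan, "epoch": 1.0},
    {"loss": 0.0, "grad_norm": nan, "epoch": 1.67},
    {"eval_loss": nan, "epoch": 2.0},
    {"loss": 0.0, "grad_norm": nan, "epoch": 2.5},
    {"eval_loss": nan, "epoch": 3.0},
]

def non_finite_entries(log_history):
    """Return (epoch, metric) pairs whose value is nan or infinite."""
    bad = []
    for entry in log_history:
        for key, value in entry.items():
            if isinstance(value, float) and not math.isfinite(value):
                bad.append((entry.get("epoch"), key))
    return bad

problems = non_finite_entries(logs)
print(problems[0])    # earliest non-finite metric
print(len(problems))  # number of non-finite values in the log
```

The first hit is the gradient norm at epoch 0.83, the first logged step, so the run was already unstable before the first evaluation.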

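These results highlight that a sound training configuration matters as much as high-quality, task-specific fine-tuning data. With only 31 examples, numerically stable precision settings, labels that actually contain the target responses, and a check for nan values in the first logged steps need to be in place before any conclusion about the dataset itself can be drawn; otherwise fine-tuning degrades the model instead of improving it.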