![](./cover.jpg)

Your airline's customer service team has been collecting chat data for years—thousands of conversations, each labeled with the user’s intent and an ideal response. Now, it's time to put that data to work.

You've been tasked with fine-tuning a TinyLlama model to power the airline’s next-gen AI assistant. The goal? Given a user message, the model should predict the intent (like booking a flight, checking baggage status, or requesting special assistance) and generate a helpful, human-like response. Accurate intent detection is key since it helps the system understand what the customer wants, so it can respond appropriately and trigger downstream actions when needed.

### The Data
You'll work with a dataset of various travel query examples. 

| Column | Description |
|--------|-------------|
| `instruction` | A user request from the Travel domain |
| `category` | The high-level semantic category for the intent |
| `intent` | The specific intent corresponding to the user instruction |
| `response` | An example of an expected response from the virtual assistant |

___
### Update to Python 3.10

Due to how frequently the libraries required for this project are updated, you'll need to update your environment to Python 3.10:

1. In the workbook, click "Environment" in the top toolbar and select "Session details".

2. In the workbook language dropdown, select "Python 3.10".

3. Click "Confirm" and hit "Done" once the session is ready.
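A quick way to confirm the switch took effect is to check the interpreter version from a cell. This is only a minimal sketch, and the helper name `meets_minimum` is hypothetical:

```python
import sys

def meets_minimum(version_info, minimum=(3, 10)):
    """Return True when the (major, minor) pair is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)

# Print the running interpreter version and whether it satisfies the requirement
print(sys.version.split()[0], meets_minimum(sys.version_info))
```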

In [55]:
# First install the necessary packages
!pip install -q -q -q trl==0.16.0
!pip install -q -q -q tf-keras==2.19.0
!pip install -q -q -q peft

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
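The warning names its own fix: set `TOKENIZERS_PARALLELISM` explicitly. A minimal sketch, assuming it runs in a cell before any tokenizer is used:

```python
import os

# Set the variable named in the warning; "false" disables tokenizer parallelism
os.environ["TOKENIZERS_PARALLELISM"] = "false"
```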


In [56]:
# Import the required dependencies for this project
from transformers import AutoTokenizer, AutoModelForCausalLM
from trl import SFTTrainer, SFTConfig
from peft import LoraConfig

from datasets import Dataset, load_dataset
from collections import Counter, defaultdict
import random

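The code below loads the travel query dataset and reduces it from ~30k records to 50. It groups the examples by intent, samples the same number from each group, and then keeps the first 50 records of the result. Because that final slice cuts the grouped list short, intents that come later in the list can be dropped. The small subset speeds up fine-tuning. Run it before starting, and feel free to experiment with it later!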

In [57]:
# First load the entire dataset
ds = load_dataset('bitext/Bitext-travel-llm-chatbot-training-dataset', split="train")

# Group examples by intent
random.seed(42)
intent_groups = defaultdict(list)
for record in ds:
    intent = record["intent"]
    intent_groups[intent].append(record)

# Determine how many samples per intent
total_intents = len(intent_groups)
samples_per_intent = 100 // total_intents

# Sample from each intent
balanced_subset = []
for intent, examples in intent_groups.items():
    sampled = random.sample(examples, min(samples_per_intent, len(examples)))
    balanced_subset.extend(sampled)

total_num_of_records = 50    
travel_chat_ds = Dataset.from_list(balanced_subset[:total_num_of_records])

travel_chat_ds.to_pandas().head(3)

| | instruction | intent | category | tags | response |
|---|---|---|---|---|---|
| 0 | I'd like information about my checked baggage ... | check_baggage_allowance | BAGGAGE | BCIP | To retrieve your checked baggage allowance det... |
| 1 | i have to see the fucking checked baggage allo... | check_baggage_allowance | BAGGAGE | BCIQW | To determine your checked baggage allowance, p... |
| 2 | I want to know about my checked baggage allowa... | check_baggage_allowance | BAGGAGE | BCI | To find details regarding your checked baggage... |
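If the goal is for every intent to survive the reduction, one option is to pick records round-robin across intents instead of slicing a grouped list. The sketch below is an alternative, not the notebook's method. The function name `sample_covering_all_intents` is hypothetical, and the toy records stand in for real dataset rows:

```python
import random
from collections import defaultdict

def sample_covering_all_intents(records, total, seed=42):
    """Pick up to `total` records, cycling through intents one record at a time."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for record in records:
        groups[record["intent"]].append(record)
    # Shuffle within each intent so the picks are random but reproducible
    for examples in groups.values():
        rng.shuffle(examples)
    subset = []
    # Take one record per intent per pass until `total` is reached or data runs out
    while len(subset) < total and any(groups.values()):
        for examples in groups.values():
            if examples and len(subset) < total:
                subset.append(examples.pop())
    return subset

# Toy records standing in for the real dataset rows
toy = [{"intent": f"intent_{i % 4}", "instruction": f"msg {i}"} for i in range(20)]
subset = sample_covering_all_intents(toy, total=6)
print(sorted({r["intent"] for r in subset}))
```

Every intent is covered only when `total` is at least the number of distinct intents. With the real data, the result could be passed to `Dataset.from_list`, as the cell above does with its own subset.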


In [58]:
# Start the project with the dataset below

print(travel_chat_ds)

Dataset({
    features: ['instruction', 'intent', 'category', 'tags', 'response'],
    num_rows: 50
})
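Before training, it can help to see which intents actually made it into the subset. A minimal sketch using `Counter`, with toy rows in place of the dataset. The `book_flight` label is only an illustrative value:

```python
from collections import Counter

def intent_counts(records):
    """Count how many records carry each intent label."""
    return Counter(record["intent"] for record in records)

# Toy rows; iterating a Dataset yields dictionaries of the same shape
toy = [
    {"intent": "check_baggage_allowance"},
    {"intent": "book_flight"},
    {"intent": "check_baggage_allowance"},
]
print(intent_counts(toy).most_common())
```

Calling `intent_counts(travel_chat_ds)` should work the same way, since iterating the dataset yields one dictionary per row.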


In [59]:
print(travel_chat_ds[0])

{'instruction': "I'd like information about my checked baggage allowance, how can I find it?", 'intent': 'check_baggage_allowance', 'category': 'BAGGAGE', 'tags': 'BCIP', 'response': 'To retrieve your checked baggage allowance details, please follow these instructions:\n\n1. Visit {{WEBSITE_URL}} or launch the {{APP_NAME}} application.\n2. Log in to your personal account.\n3. Select the {{BOOKINGS_OPTION}} section.\n4. Enter the required booking or flight information.\n5. The specific baggage allowance for your trip will be displayed.\n\nShould you require additional help, do not hesitate to contact customer support via the {{APP_NAME}} app or on {{WEBSITE_URL}}.'}


In [60]:
# Start coding here
# Use as many cells as you need
def merger_conversation(row):
    row['conversation'] = f"Query: {row['instruction']}\nResponse: {row['response']}"
    return row
	
travel_chat_ds_ = travel_chat_ds.map(merger_conversation)

print(travel_chat_ds_[0]['conversation'])


Query: I'd like information about my checked baggage allowance, how can I find it?
Response: To retrieve your checked baggage allowance details, please follow these instructions:

1. Visit {{WEBSITE_URL}} or launch the {{APP_NAME}} application.
2. Log in to your personal account.
3. Select the {{BOOKINGS_OPTION}} section.
4. Enter the required booking or flight information.
5. The specific baggage allowance for your trip will be displayed.

Should you require additional help, do not hesitate to contact customer support via the {{APP_NAME}} app or on {{WEBSITE_URL}}.
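The format above teaches the model to produce a response but never shows it the intent label, even though the brief asks for intent prediction. One possible variant, sketched here as an assumption rather than the notebook's approach, adds an `Intent:` line between query and response. The name `format_with_intent` and the example row are hypothetical:

```python
def format_with_intent(row):
    """Build a training string that places the intent label between query and response."""
    return (
        f"Query: {row['instruction']}\n"
        f"Intent: {row['intent']}\n"
        f"Response: {row['response']}"
    )

# Hypothetical row with the same keys as the dataset
example = {
    "instruction": "How much baggage can I check?",
    "intent": "check_baggage_allowance",
    "response": "Please open your booking to see the allowance.",
}
print(format_with_intent(example))
```

With this layout, an inference prompt would end after `Intent:` so that the model generates the label first and the response second.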


In [61]:
travel_chat_ds_.save_to_disk("preprocessed_dataset")


In [62]:
model_name = "TinyLlama/TinyLlama-1.1B-Chat-v0.1"

model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

In [63]:
lora_config = LoraConfig(
  	r=12,
    lora_alpha=8,
  	task_type="CAUSAL_LM",
    lora_dropout=0.05,
    bias="none",
    target_modules=['q_proj', 'v_proj']
)
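To get a feel for what this configuration trains, the arithmetic can be worked out by hand. In the standard LoRA formulation, each adapted matrix of shape `(d_in, d_out)` adds `r * (d_in + d_out)` trainable weights, and the update is scaled by `lora_alpha / r`. The layer shapes and layer count below are assumptions about TinyLlama-1.1B and are worth checking against `model.config`:

```python
def lora_trainable_params(r, layer_shapes, num_layers):
    """Each adapted (d_in, d_out) matrix adds r * (d_in + d_out) trainable weights."""
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in layer_shapes)
    return per_layer * num_layers

# Assumed shapes: q_proj 2048 -> 2048, v_proj 2048 -> 256, across 22 decoder layers
shapes = [(2048, 2048), (2048, 256)]
total = lora_trainable_params(r=12, layer_shapes=shapes, num_layers=22)

# Scaling applied to the low-rank update: lora_alpha / r
scaling = 8 / 12

print(total, round(scaling, 3))
```

Under those assumptions the adapter is a small fraction of the roughly 1.1B base parameters, and the scaling factor is below 1 because `lora_alpha` is smaller than `r`.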

In [64]:
# Training arguments (the usual TrainingArguments fields) are passed directly to SFTConfig
sft_config = SFTConfig(
    # TrainingArguments parameters:
    learning_rate=2e-3, 
    warmup_ratio=0.03,
    num_train_epochs=3,
    output_dir='/tmp',
    per_device_train_batch_size=1,
    gradient_accumulation_steps=1,
    save_steps=10,
    logging_steps=2,
    lr_scheduler_type='constant',
    report_to='none',
    
    dataset_text_field='conversation', 
    max_seq_length=250,
)

trainer = SFTTrainer(
    model=model,
    train_dataset=travel_chat_ds_,
    args=sft_config, 
    peft_config=lora_config,
    # tokenizer=tokenizer
)

trainer.train()


TrainOutput(global_step=150, training_loss=0.90884179631869, metrics={'train_runtime': 241.9248, 'train_samples_per_second': 0.62, 'train_steps_per_second': 0.62, 'total_flos': 155424387575808.0, 'train_loss': 0.90884179631869})
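The `global_step=150` value follows from the configuration: 50 examples, 3 epochs, a batch size of 1, and no gradient accumulation. The sketch below is an approximation of how the step count comes about, not the trainer's exact internal logic. The function name is hypothetical:

```python
import math

def total_optimizer_steps(num_examples, epochs, per_device_batch_size,
                          grad_accum_steps, num_devices=1):
    """Approximate optimizer steps as epochs * ceil(examples / effective batch size)."""
    effective_batch = per_device_batch_size * grad_accum_steps * num_devices
    return epochs * math.ceil(num_examples / effective_batch)

steps = total_optimizer_steps(50, 3, 1, 1)
# Steps divided by the reported runtime gives the reported steps per second
steps_per_second = round(steps / 241.9248, 2)
print(steps, steps_per_second)
```

Raising `per_device_train_batch_size` or `gradient_accumulation_steps` reduces the number of optimizer steps for the same data.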

In [65]:
trainer.model.save_pretrained("./tinyllama_travel_adapter")

In [66]:
tokenizer.save_pretrained("./tinyllama_travel_adapter")

('./tinyllama_travel_adapter/tokenizer_config.json',
 './tinyllama_travel_adapter/special_tokens_map.json',
 './tinyllama_travel_adapter/tokenizer.model',
 './tinyllama_travel_adapter/added_tokens.json',
 './tinyllama_travel_adapter/tokenizer.json')

In [67]:
from peft import LoraConfig, PeftModel
import torch

output_dir = "./tinyllama_travel_adapter"

base_model_for_inference = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

model_with_peft = PeftModel.from_pretrained(
    base_model_for_inference, 
    output_dir
)

merged_model = model_with_peft.merge_and_unload()
merged_model.eval()
print("Base model and LoRA adapters merged for inference.")

Base model and LoRA adapters merged for inference.
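Conceptually, merging folds the low-rank update into the base weight, so inference needs no extra adapter pass. In the standard LoRA formulation the merged weight is `W + (alpha / r) * B @ A`. The toy example below illustrates that identity with plain Python lists. It is a sketch of the idea, not PEFT's implementation, and all matrices are made-up values:

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(w, lora_b, lora_a, alpha, r):
    """Return W + (alpha / r) * B @ A, the merged weight."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Toy 2x2 base weight with a rank-2 adapter
alpha, r = 1, 2
w = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[2.0, 0.0], [0.0, 4.0]]
lora_a = [[1.0, 3.0], [1.0, 1.0]]
merged = merge_lora(w, lora_b, lora_a, alpha, r)

# Compare the merged weight against running base and adapter separately
x = [[1.0], [1.0]]
base_out = matmul(w, x)
update = matmul(lora_b, matmul(lora_a, x))
unmerged_out = [[b[0] + (alpha / r) * u[0]] for b, u in zip(base_out, update)]
print(matmul(merged, x) == unmerged_out)
```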


In [68]:
def generate_travel_response(instruction, merged_model, tokenizer):
    prompt = f"Query: {instruction}\nResponse:"

    inputs = tokenizer(prompt, return_tensors="pt", padding=True, truncation=True)
    device = merged_model.device
    inputs = {k: v.to(device) for k, v in inputs.items()}

    with torch.no_grad():
        output_tokens = merged_model.generate(
            **inputs,
            max_new_tokens=100,
            do_sample=True,
            temperature=0.7,
            pad_token_id=tokenizer.eos_token_id,
        )

    full_response = tokenizer.decode(output_tokens[0], skip_special_tokens=True)

    # Keep the text between the first "Response:" marker and any later one
    try:
        model_response = full_response.split("Response:")[1].strip()
    except IndexError:
        model_response = "Decoding error, or the model did not generate the expected separator."

    return model_response
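A model trained on `Query:` / `Response:` pairs may keep going and start a new `Query:` turn within its token budget. The sketch below is a hypothetical post-processing helper that cuts the decoded text at the next turn marker and returns `None` when the separator is missing. It works on plain strings, so it can be tried without loading the model:

```python
def extract_first_response(full_text, response_marker="Response:", query_marker="Query:"):
    """Return the text after the first response marker, cut at any later turn marker."""
    _, found, tail = full_text.partition(response_marker)
    if not found:
        return None  # the separator never appeared in the decoded text
    # Stop at whichever marker signals that the model started a new turn
    for marker in (query_marker, response_marker):
        tail = tail.split(marker, 1)[0]
    return tail.strip()

decoded = "Query: hi\nResponse: Hello there.\nQuery: another question"
print(extract_first_response(decoded))
```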

In [69]:
instruction_to_test = "I'd like information about my checked baggage allowance, how can I find it?"
model_response = generate_travel_response(instruction_to_test, merged_model, tokenizer)

Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.


In [70]:
print(model_response)

To retrieve your checked baggage allowance details, please follow these steps:

1. Visit {{WEBSITE_URL}} or launch a {{APP_NAME}} app.
2. Log in to your personal account.
3. Select the {{BOOKINGS_OPTION}} tab.
4. Enter or search for your booking reference number.

Should you require more assistance, our {{CUSTOMER_SUPPORT}} team is ready
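The generated text still contains template placeholders such as `{{WEBSITE_URL}}`, as do the training responses. Before showing a reply to a customer, those would need real values. A minimal sketch of one way to fill them follows. The function name and the values are hypothetical:

```python
import re

def fill_placeholders(text, values):
    """Replace {{NAME}} tokens with values[NAME]; unknown names are left untouched."""
    def substitute(match):
        return values.get(match.group(1), match.group(0))
    return re.sub(r"\{\{(\w+)\}\}", substitute, text)

# Hypothetical values; real ones would come from the airline's configuration
values = {"WEBSITE_URL": "https://example.com", "APP_NAME": "ExampleAir"}
print(fill_placeholders("Visit {{WEBSITE_URL}} or launch the {{APP_NAME}} app.", values))
```

Leaving unknown placeholders untouched makes missing values easy to spot during review.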
