# GrowMate: FLAN-T5 Hydroponic Chatbot

This notebook fine-tunes Google's FLAN-T5-base model to answer hydroponic farming questions. FLAN-T5 is already instruction-tuned, which makes it a reasonable starting point for a question-answering chatbot.

## Features:
- **FLAN-T5-base**: Instruction-tuned T5 checkpoint (about 248M parameters), larger than T5-small
- **Hydroponic Domain**: Specialized for hydroponic farming questions
- **Conversational**: Natural dialogue capabilities
- **Rwanda Context**: Tailored for local farming conditions

## Workflow:
1. **Setup & Data Loading** - Load hydroponic FAQ data
2. **FLAN-T5 Model Setup** - Configure the instruction-tuned model
3. **Data Preprocessing** - Format data for instruction tuning
4. **Fine-tuning** - Train on hydroponic domain
5. **Evaluation & Testing** - Validate performance
6. **Deployment Prep** - Save model for production

In [1]:
# Install Required Packages
import subprocess
import sys
from typing import List

def install_package(package: str) -> None:
    """Install a package using pip."""
    try:
        subprocess.check_call([sys.executable, "-m", "pip", "install", package], 
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        print(f"✓ {package}")
    except subprocess.CalledProcessError:
        print(f"✗ Failed to install {package}")

# Required packages; only transformers carries a minimum version
REQUIRED_PACKAGES: List[str] = [
    "transformers>=4.25.0",
    "torch",
    "datasets",
    "accelerate",
    "rouge-score", 
    "evaluate",
    "pandas",
    "numpy",
    "scikit-learn",
    "nltk"
]

print("Installing required packages...")
for package in REQUIRED_PACKAGES:
    install_package(package)

print("\nPackage installation completed!")

Installing required packages...
✓ transformers>=4.25.0
✓ torch
✓ datasets
✓ accelerate
✓ rouge-score
✓ evaluate
✓ pandas
✓ numpy
✓ scikit-learn
✓ nltk

Package installation completed!


In [2]:
# Import Required Libraries
import re
import warnings
from pathlib import Path
from typing import Dict, List, Tuple, Optional

import torch
import pandas as pd
import numpy as np
import evaluate
from tqdm.auto import tqdm

from sklearn.model_selection import train_test_split
from transformers import (
    T5Tokenizer, 
    T5ForConditionalGeneration,
    TrainingArguments,
    Trainer,
    DataCollatorForSeq2Seq
)
from datasets import Dataset

# Configure warnings and display
warnings.filterwarnings('ignore')
pd.set_option('display.max_columns', None)

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Device: {device}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")
print(f"PyTorch: {torch.__version__}")

# Directory setup
BASE_DIR = Path.cwd().parent
DATA_DIR = BASE_DIR / 'data'
MODEL_DIR = BASE_DIR / 'trained_model'
MODEL_DIR.mkdir(exist_ok=True)

print(f"\nDirectories:")
print(f"   Base: {BASE_DIR}")
print(f"   Data: {DATA_DIR}")
print(f"   Model: {MODEL_DIR}")

Device: cpu
PyTorch: 2.8.0+cpu

Directories:
   Base: c:\Users\HP\Desktop\ALU\Farmsmart_growmate_chatbot
   Data: c:\Users\HP\Desktop\ALU\Farmsmart_growmate_chatbot\data
   Model: c:\Users\HP\Desktop\ALU\Farmsmart_growmate_chatbot\trained_model


## 2. Load and Explore Hydroponic Data

In [3]:
# Load and Explore Hydroponic Data
def load_hydroponic_data(data_path: Path) -> pd.DataFrame:
    """Load and validate hydroponic FAQ data."""
    if not data_path.exists():
        raise FileNotFoundError(f"Data file not found: {data_path}")
    
    df = pd.read_csv(data_path)
    
    # Validate required columns
    required_columns = ['question', 'answer']
    missing_columns = [col for col in required_columns if col not in df.columns]
    if missing_columns:
        raise ValueError(f"Missing required columns: {missing_columns}")
    
    return df

# Load the data
data_file = DATA_DIR / 'hydroponic_FAQS.csv'
df = load_hydroponic_data(data_file)

print(f"Dataset Overview:")
print(f"   Samples: {len(df):,}")
print(f"   Columns: {list(df.columns)}")

# Data quality assessment
valid_questions = df['question'].notna().sum()
valid_answers = df['answer'].notna().sum()
missing_values = df.isnull().sum().sum()

print(f"\nData Quality:")
print(f"   Valid questions: {valid_questions:,} ({valid_questions/len(df)*100:.1f}%)")
print(f"   Valid answers: {valid_answers:,} ({valid_answers/len(df)*100:.1f}%)")
print(f"   Missing values: {missing_values:,}")

print(f"\nSample Data:")
display(df.head(3))

Dataset Overview:
   Samples: 625
   Columns: ['question', 'answer']

Data Quality:
   Valid questions: 625 (100.0%)
   Valid answers: 625 (100.0%)
   Missing values: 0

Sample Data:


|   | question | answer |
|---|---|---|
| 0 | What beginner mistakes should I avoid? | Overfeeding low dissolved oxygen poor sanitati... |
| 1 | How do I keep records effectively? | Use a daily log for pH; EC; water temp; air te... |
| 2 | How often should I calibrate meters? | Calibrate pH monthly and EC/TDS quarterly or a... |
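
The quality checks above count missing values but not repeated questions. With only 625 pairs, near-identical questions could end up on both sides of a random train/test split. The sketch below is one way to look for them, assuming that lowercasing and stripping punctuation is a good enough notion of "same question"; `normalize_question` and `find_duplicate_questions` are hypothetical helpers, and the sample list is made up for illustration.

```python
import re
from collections import Counter
from typing import Dict, List


def normalize_question(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def find_duplicate_questions(questions: List[str]) -> Dict[str, int]:
    """Return normalized questions that occur more than once, with their counts."""
    counts = Counter(normalize_question(q) for q in questions)
    return {question: n for question, n in counts.items() if n > 1}


# Illustrative input only; not taken from the real dataset
sample_questions = [
    "How often should I calibrate meters?",
    "how often should I calibrate meters",
    "What beginner mistakes should I avoid?",
]
duplicates = find_duplicate_questions(sample_questions)
print(duplicates)
```

In this notebook the same check could be run as `find_duplicate_questions(df['question'].tolist())`.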


## 3. Load FLAN-T5-base Model

In [4]:
# Load FLAN-T5-base Model and Tokenizer
MODEL_NAME = "google/flan-t5-base"

def load_model_and_tokenizer(model_name: str) -> Tuple[T5ForConditionalGeneration, T5Tokenizer]:
    """Load FLAN-T5 model and tokenizer with optimal settings."""
    print(f"Loading {model_name}...")
    
    # Load tokenizer
    tokenizer = T5Tokenizer.from_pretrained(model_name)
    
    # Load weights in float32 on every device: TrainingArguments(fp16=True) applies
    # mixed precision on top of float32 weights and fails on weights loaded as float16
    model = T5ForConditionalGeneration.from_pretrained(
        model_name,
        torch_dtype=torch.float32,
        device_map="auto" if device.type == 'cuda' else None
    )
    
    return model, tokenizer

def test_model(model: T5ForConditionalGeneration, tokenizer: T5Tokenizer, 
               test_question: str) -> str:
    """Test the model with a sample question."""
    input_text = f"Answer this hydroponic farming question: {test_question}"
    # Move the encoded inputs to the model's device so generation also works on GPU
    inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True).to(model.device)
    
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_length=100,
            num_beams=4,
            early_stopping=True,
            do_sample=True,
            temperature=0.7,
            pad_token_id=tokenizer.pad_token_id
        )
    
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Load model and tokenizer
model, tokenizer = load_model_and_tokenizer(MODEL_NAME)

print(f"Model loaded successfully!")
print(f"Model parameters: {model.num_parameters():,}")
print(f"Tokenizer vocab size: {len(tokenizer):,}")

# Test with sample question
test_question = "What is the ideal pH for hydroponic lettuce?"
response = test_model(model, tokenizer, test_question)

print(f"\nModel Test:")
print(f"   Question: {test_question}")
print(f"   Response: {response}")

Loading google/flan-t5-base...


You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
`torch_dtype` is deprecated! Use `dtype` instead!


Model loaded successfully!
Model parameters: 247,577,856
Tokenizer vocab size: 32,100

Model Test:
   Question: What is the ideal pH for hydroponic lettuce?
   Response: 6.5
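
The parameter count reported above gives a rough lower bound on memory needs. The following back-of-the-envelope sketch assumes 4 bytes per float32 value, 2 bytes per float16 value, and full fine-tuning with AdamW, which keeps two float32 moment estimates per parameter alongside the weights and gradients. Activations and batch data are not included, so real usage will be higher.

```python
NUM_PARAMS = 247_577_856  # value printed by model.num_parameters() above


def gib(num_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return num_bytes / 1024**3


weights_fp32 = gib(NUM_PARAMS * 4)   # inference weights in float32
weights_fp16 = gib(NUM_PARAMS * 2)   # inference weights in float16
# Training: weights (4) + gradients (4) + two AdamW moments (8) bytes per parameter
training_fp32 = gib(NUM_PARAMS * (4 + 4 + 8))

print(f"float32 weights: {weights_fp32:.2f} GiB")
print(f"float16 weights: {weights_fp16:.2f} GiB")
print(f"float32 training state: {training_fp32:.2f} GiB")
```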


## 4. Data Preprocessing for Instruction Tuning

In [5]:
# Data Preprocessing for Instruction Tuning
def clean_text(text: str) -> str:
    """Clean and normalize text data."""
    if not isinstance(text, str):
        return ""
    
    # Collapse all whitespace runs (including line breaks) into single spaces
    text = re.sub(r'\s+', ' ', text).strip()
    
    return text

def create_instruction_prompt(question: str, answer: Optional[str] = None):
    """Create an instruction-following prompt for FLAN-T5.

    Returns (input_text, answer) when an answer is given, otherwise input_text alone.
    """
    
    # Instruction templates; the selection logic below uses indices 0, 1, and 4
    templates = [
        "Answer this hydroponic farming question: {question}",
        "As a hydroponic farming expert, please answer: {question}", 
        "Provide guidance for this hydroponic farming query: {question}",
        "Help with this hydroponic farming question: {question}",
        "Give advice for hydroponic farming: {question}"
    ]
    
    # Select template based on question type
    question_lower = question.lower()
    if any(word in question_lower for word in ['how', 'what', 'why', 'when']):
        template = templates[0]  # Direct Q&A
    elif any(word in question_lower for word in ['help', 'advice']):
        template = templates[4]  # Advice
    else:
        template = templates[1]  # Expert response
    
    input_text = template.format(question=question)
    
    return (input_text, answer) if answer is not None else input_text

def process_dataset(df: pd.DataFrame) -> Dict[str, List[str]]:
    """Process and clean the dataset for training."""
    print("Cleaning data...")
    
    # Clean text fields
    df_clean = df.copy()
    df_clean['question'] = df_clean['question'].apply(clean_text)
    df_clean['answer'] = df_clean['answer'].apply(clean_text)
    
    # Filter out short or empty entries
    min_length = 10
    df_clean = df_clean[
        (df_clean['question'].str.len() > min_length) & 
        (df_clean['answer'].str.len() > min_length)
    ]
    
    print(f"Filtered dataset: {len(df_clean):,} samples (removed {len(df) - len(df_clean):,})")
    
    # Create instruction-formatted pairs
    instructions = []
    targets = []
    
    for _, row in df_clean.iterrows():
        instruction, target = create_instruction_prompt(row['question'], row['answer'])
        instructions.append(instruction)
        targets.append(target)
    
    return {
        'input_text': instructions,
        'target_text': targets
    }

# Process the dataset
dataset_dict = process_dataset(df)
instructions = dataset_dict['input_text']
targets = dataset_dict['target_text']

print(f"Created {len(instructions):,} instruction-target pairs")

# Display sample and statistics
print(f"\nSample Instruction:")
print(f"   Input: {instructions[0]}")
print(f"   Target: {targets[0]}")

# Calculate statistics
avg_input_length = np.mean([len(text.split()) for text in instructions])
avg_target_length = np.mean([len(text.split()) for text in targets])

print(f"\nLength Statistics:")
print(f"   Average input: {avg_input_length:.1f} words")
print(f"   Average target: {avg_target_length:.1f} words")

Cleaning data...
Filtered dataset: 625 samples (removed 0)
Created 625 instruction-target pairs

Sample Instruction:
   Input: Answer this hydroponic farming question: What beginner mistakes should I avoid?
   Target: Overfeeding low dissolved oxygen poor sanitation light leaks and skipping logs; start simple and scale.

Length Statistics:
   Average input: 11.8 words
   Average target: 14.5 words
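
The template selection in `create_instruction_prompt` tests for question words with a substring check, so a request such as "Show me a nutrient schedule" matches `how` inside "show". The sketch below shows an alternative that matches whole words only. It is not what the cell above does; `has_question_word` is a hypothetical helper.

```python
import re

QUESTION_WORDS = ("how", "what", "why", "when")

# \b anchors the match at word boundaries, so "show" does not count as "how"
QUESTION_PATTERN = re.compile(r"\b(?:" + "|".join(QUESTION_WORDS) + r")\b")


def has_question_word(question: str) -> bool:
    """Return True if a question word appears as a whole word."""
    return QUESTION_PATTERN.search(question.lower()) is not None


substring_match = any(word in "show me a nutrient schedule" for word in QUESTION_WORDS)
whole_word_match = has_question_word("Show me a nutrient schedule")
print(substring_match, whole_word_match)
print(has_question_word("How do I sanitize between crops?"))
```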


## 5. Dataset Creation and Tokenization

In [6]:
# Dataset Creation and Train/Val/Test Split
def create_datasets(instructions: List[str], targets: List[str], 
                   test_size: float = 0.3, val_size: float = 0.5, 
                   random_state: int = 42) -> Tuple[Dataset, Dataset, Dataset]:
    """Create train, validation, and test datasets."""
    
    # Split into train and temp (val + test)
    train_inputs, temp_inputs, train_targets, temp_targets = train_test_split(
        instructions, targets, test_size=test_size, random_state=random_state
    )
    
    # Split temp into validation and test
    val_inputs, test_inputs, val_targets, test_targets = train_test_split(
        temp_inputs, temp_targets, test_size=val_size, random_state=random_state
    )
    
    # Create HuggingFace datasets
    train_dataset = Dataset.from_dict({
        'input_text': train_inputs,
        'target_text': train_targets
    })
    
    val_dataset = Dataset.from_dict({
        'input_text': val_inputs,
        'target_text': val_targets
    })
    
    test_dataset = Dataset.from_dict({
        'input_text': test_inputs,
        'target_text': test_targets
    })
    
    return train_dataset, val_dataset, test_dataset

# Create datasets
train_dataset, val_dataset, test_dataset = create_datasets(instructions, targets)

print(f"Dataset Splits:")
print(f"   Training: {len(train_dataset):,} samples ({len(train_dataset)/len(instructions)*100:.1f}%)")
print(f"   Validation: {len(val_dataset):,} samples ({len(val_dataset)/len(instructions)*100:.1f}%)")
print(f"   Test: {len(test_dataset):,} samples ({len(test_dataset)/len(instructions)*100:.1f}%)")
print(f"   Total: {len(instructions):,} samples")

print(f"\nDatasets created successfully!")

Dataset Splits:
   Training: 437 samples (69.9%)
   Validation: 94 samples (15.0%)
   Test: 94 samples (15.0%)
   Total: 625 samples

Datasets created successfully!
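
The 437 / 94 / 94 split follows from how fractional sizes are rounded in the two calls to `train_test_split`. A minimal sketch of the arithmetic, assuming scikit-learn rounds the held-out fraction up and assigns the remainder to the other side; `split_sizes` is a hypothetical helper, not part of the notebook.

```python
import math
from typing import Tuple


def split_sizes(n_samples: int, test_size: float = 0.3,
                val_size: float = 0.5) -> Tuple[int, int, int]:
    """Return (train, validation, test) sizes for the two-stage split."""
    # Stage 1: hold out test_size of all samples for validation + test
    n_temp = math.ceil(test_size * n_samples)
    n_train = n_samples - n_temp
    # Stage 2: split the held-out part; val_size is the share that becomes the test set
    n_test = math.ceil(val_size * n_temp)
    n_val = n_temp - n_test
    return n_train, n_val, n_test


sizes = split_sizes(625)
print(sizes)
```

Note that the `val_size` argument of `create_datasets` is passed as `test_size` in the second split, so it sets the test share of the held-out data rather than the validation share. With 0.5 the two are the same.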


In [7]:
# Dataset Tokenization
# Tokenization parameters
MAX_INPUT_LENGTH = 512
MAX_TARGET_LENGTH = 256

def tokenize_function(examples: Dict) -> Dict:
    """Tokenize inputs and targets for T5 model."""
    # Tokenize inputs
    model_inputs = tokenizer(
        examples['input_text'],
        max_length=MAX_INPUT_LENGTH,
        truncation=True,
        padding=False  # Data collator handles padding
    )
    
    # Tokenize targets; text_target replaces the deprecated as_target_tokenizer() context
    labels = tokenizer(
        text_target=examples['target_text'],
        max_length=MAX_TARGET_LENGTH,
        truncation=True,
        padding=False
    )
    
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

def validate_tokenization(dataset: Dataset, sample_idx: int = 0) -> None:
    """Validate tokenization results."""
    sample = dataset[sample_idx]
    
    # Decode sample for verification
    input_text = tokenizer.decode(sample['input_ids'], skip_special_tokens=True)
    label_text = tokenizer.decode(sample['labels'], skip_special_tokens=True)
    
    print(f"Tokenization Validation (Sample {sample_idx}):")
    print(f"   Input tokens: {len(sample['input_ids'])}")
    print(f"   Label tokens: {len(sample['labels'])}")
    print(f"   Decoded input: {input_text[:100]}...")
    print(f"   Decoded label: {label_text}")

# Apply tokenization
print("Tokenizing datasets...")

train_dataset = train_dataset.map(
    tokenize_function, 
    batched=True,
    remove_columns=train_dataset.column_names,
    desc="Tokenizing training data"
)

val_dataset = val_dataset.map(
    tokenize_function, 
    batched=True,
    remove_columns=val_dataset.column_names,
    desc="Tokenizing validation data"
)

test_dataset = test_dataset.map(
    tokenize_function, 
    batched=True,
    remove_columns=test_dataset.column_names,
    desc="Tokenizing test data"
)

print(f"Tokenization completed!")

# Validation
print(f"\nTokenized Dataset Info:")
print(f"   Columns: {train_dataset.column_names}")
print(f"   Features: {train_dataset.features}")

validate_tokenization(train_dataset)

# Verify data integrity
assert 'input_text' not in train_dataset.column_names, "Text columns should be removed"
assert 'target_text' not in train_dataset.column_names, "Text columns should be removed"
print(f"\nData integrity verified!")

Tokenizing datasets...


Tokenizing training data: 100%|██████████| 437/437 [00:00<00:00, 2172.38 examples/s]
Tokenizing validation data: 100%|██████████| 94/94 [00:00<00:00, 1903.57 examples/s]
Tokenizing test data: 100%|██████████| 94/94 [00:00<00:00, 1761.16 examples/s]


Tokenization completed!

Tokenized Dataset Info:
   Columns: ['input_ids', 'attention_mask', 'labels']
   Features: {'input_ids': List(Value('int32')), 'attention_mask': List(Value('int8')), 'labels': List(Value('int64'))}
Tokenization Validation (Sample 0):
   Input tokens: 20
   Label tokens: 23
   Decoded input: Answer this hydroponic farming question: How do I sanitize between crops?...
   Decoded label: Drain scrub run 3% hydrogen peroxide or diluted bleach through lines then flush thoroughly with clean water.

Data integrity verified!
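
Tokenization leaves every sequence unpadded (`padding=False`) because the data collator pads each batch at training time. The sketch below is a simplified stand-in for that step, not the library's implementation. It assumes right-padding, a pad token id of 0 for inputs, and -100 for labels, the value that the loss ignores and that `compute_metrics` later maps back to the pad token before decoding. The token ids are made up.

```python
from typing import Dict, List


def pad_batch(features: List[Dict[str, List[int]]],
              pad_token_id: int = 0,
              label_pad_token_id: int = -100) -> Dict[str, List[List[int]]]:
    """Right-pad each field to the longest sequence of that field in the batch."""
    max_input = max(len(f["input_ids"]) for f in features)
    max_label = max(len(f["labels"]) for f in features)
    batch: Dict[str, List[List[int]]] = {"input_ids": [], "attention_mask": [], "labels": []}
    for f in features:
        input_gap = max_input - len(f["input_ids"])
        label_gap = max_label - len(f["labels"])
        batch["input_ids"].append(f["input_ids"] + [pad_token_id] * input_gap)
        batch["attention_mask"].append(f["attention_mask"] + [0] * input_gap)
        # Padded label positions use -100 so they do not contribute to the loss
        batch["labels"].append(f["labels"] + [label_pad_token_id] * label_gap)
    return batch


# Illustrative token ids only
features = [
    {"input_ids": [11, 12, 13, 1], "attention_mask": [1, 1, 1, 1], "labels": [21, 22, 1]},
    {"input_ids": [14, 1], "attention_mask": [1, 1], "labels": [23, 24, 25, 26, 1]},
]
batch = pad_batch(features)
print(batch["input_ids"])
print(batch["labels"])
```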


## 6. Fine-tuning Setup and Training

In [8]:
# Fine-tuning Setup and Training
import os
from typing import Union, Tuple, Optional

# Disable wandb reporting (set environment variables only)
os.environ.update({
    "WANDB_SILENT": "true",
    "WANDB_DISABLED": "true",
    "WANDB_MODE": "disabled"
})

# Training configuration
TRAINING_CONFIG = {
    "epochs": 12,
    "learning_rate": 1e-5,
    "batch_size": 4 if device.type == 'cuda' else 2,
    "gradient_accumulation_steps": 4,
    "warmup_steps": 100,
    "eval_steps": 50,
    "save_steps": 100,
    "logging_steps": 25
}

# Generation configuration for enhanced responses
GENERATION_CONFIG = {
    "max_new_tokens": 100,
    "min_length": 25,
    "num_beams": 6,
    "early_stopping": True,
    "do_sample": True,
    "temperature": 0.8,
    "top_p": 0.85,
    "no_repeat_ngram_size": 3,
    "repetition_penalty": 1.3,
    "length_penalty": 1.2,
    "diversity_penalty": 0.2
}

def clean_response_text(response: str) -> str:
    """Clean generated response text."""
    response = response.strip()
    # Remove repetitive patterns
    response = re.sub(r'\b(\w+(?:\s+\w+){0,3})\s*;\s*\1(?:\s*;\s*\1)*', r'\1', response)
    response = re.sub(r'\b(\w+(?:\s+\w+){0,2})\s+\1\b.*', r'\1', response)
    response = re.sub(r';+', ';', response)
    response = re.sub(r'\s+', ' ', response)
    return response

def compute_metrics(eval_pred) -> Dict[str, float]:
    """Compute ROUGE metrics for evaluation."""
    predictions, labels = eval_pred
    
    if isinstance(predictions, tuple):
        predictions = predictions[0]
    
    if not isinstance(predictions, np.ndarray):
        predictions = np.array(predictions)
    
    if predictions.ndim == 3:
        predictions = np.argmax(predictions, axis=-1)
    
    vocab_size = len(tokenizer)
    predictions = np.clip(predictions, 0, vocab_size - 1)
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    
    try:
        decoded_preds = []
        decoded_labels = []
        
        for pred_seq, label_seq in zip(predictions, labels):
            # Filter valid tokens
            valid_pred_tokens = [token for token in pred_seq if 0 <= token < vocab_size]
            valid_label_tokens = [token for token in label_seq if 0 <= token < vocab_size]
            
            try:
                pred_text = tokenizer.decode(valid_pred_tokens, skip_special_tokens=True)
                label_text = tokenizer.decode(valid_label_tokens, skip_special_tokens=True)
                decoded_preds.append(pred_text.strip())
                decoded_labels.append(label_text.strip())
            except Exception as e:
                print(f"Warning: Failed to decode sequence: {e}")
                decoded_preds.append("no answer")
                decoded_labels.append("no answer")
        
        # Handle empty predictions
        decoded_preds = [pred if pred else "no answer" for pred in decoded_preds]
        decoded_labels = [label if label else "no answer" for label in decoded_labels]
        
        # Compute ROUGE scores
        result = rouge.compute(
            predictions=decoded_preds,
            references=decoded_labels,
            use_stemmer=True
        )
        
        return {
            "rouge1": result["rouge1"],
            "rouge2": result["rouge2"],
            "rougeL": result["rougeL"]
        }
        
    except Exception as e:
        print(f"Warning: Metrics computation failed: {e}")
        return {"rouge1": 0.0, "rouge2": 0.0, "rougeL": 0.0}

class AdvancedT5Trainer(Trainer):
    """Enhanced T5 Trainer with improved generation capabilities."""
    
    def prediction_step(self, model, inputs, prediction_loss_only: bool, ignore_keys=None):
        """Enhanced prediction step with better generation settings."""
        if prediction_loss_only:
            return super().prediction_step(model, inputs, prediction_loss_only, ignore_keys=ignore_keys)
        
        input_ids = inputs["input_ids"]
        attention_mask = inputs.get("attention_mask", None)
        labels = inputs.get("labels", None)
        
        # Enhanced generation config
        eval_config = GENERATION_CONFIG.copy()
        tokenizer_ref = self.processing_class or self.tokenizer
        
        with torch.no_grad():
            generated_tokens = model.generate(
                input_ids=input_ids,
                attention_mask=attention_mask,
                pad_token_id=tokenizer_ref.pad_token_id,
                eos_token_id=tokenizer_ref.eos_token_id,
                bos_token_id=getattr(tokenizer_ref, 'bos_token_id', None),
                **eval_config
            )
        
        # Ensure valid token range
        vocab_size = len(tokenizer_ref)
        generated_tokens = torch.clamp(generated_tokens, 0, vocab_size - 1)
        
        # Compute loss if needed
        loss = None
        if labels is not None:
            with torch.no_grad():
                outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
                loss = outputs.loss
        
        return (loss, generated_tokens, labels)

# Setup training arguments
training_args = TrainingArguments(
    output_dir=str(MODEL_DIR / "flan-t5-hydroponic-checkpoints"),
    num_train_epochs=TRAINING_CONFIG["epochs"],
    per_device_train_batch_size=TRAINING_CONFIG["batch_size"],
    per_device_eval_batch_size=TRAINING_CONFIG["batch_size"],
    gradient_accumulation_steps=TRAINING_CONFIG["gradient_accumulation_steps"],
    warmup_steps=TRAINING_CONFIG["warmup_steps"],
    learning_rate=TRAINING_CONFIG["learning_rate"],
    weight_decay=0.01,
    logging_dir=str(MODEL_DIR / "logs"),
    logging_steps=TRAINING_CONFIG["logging_steps"],
    eval_strategy="steps",
    eval_steps=TRAINING_CONFIG["eval_steps"],
    save_strategy="steps",
    save_steps=TRAINING_CONFIG["save_steps"],
    save_total_limit=5,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    report_to="none",
    fp16=device.type == 'cuda',
    dataloader_pin_memory=False,
    remove_unused_columns=False,
    push_to_hub=False,
    seed=42,
    data_seed=42,
    group_by_length=True
)

# Create data collator and load evaluation metric
data_collator = DataCollatorForSeq2Seq(
    tokenizer=tokenizer,
    model=model,
    padding=True
)
rouge = evaluate.load("rouge")

print("Training setup completed!")
print(f"\nConfiguration Summary:")
print(f"   Device: {device}")
print(f"   Epochs: {TRAINING_CONFIG['epochs']}")
print(f"   Learning rate: {TRAINING_CONFIG['learning_rate']}")
print(f"   Batch size: {TRAINING_CONFIG['batch_size']}")
print(f"   Gradient accumulation: {TRAINING_CONFIG['gradient_accumulation_steps']}")
print(f"   Effective batch size: {TRAINING_CONFIG['batch_size'] * TRAINING_CONFIG['gradient_accumulation_steps']}")
print(f"   Mixed precision: {training_args.fp16}")

# Create trainer
print(f"\nCreating trainer...")
trainer = AdvancedT5Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    processing_class=tokenizer,  # current name for the deprecated `tokenizer` argument
    data_collator=data_collator,
    compute_metrics=compute_metrics
)

print(f"Training Data Summary:")
print(f"   Training samples: {len(train_dataset):,}")
print(f"   Validation samples: {len(val_dataset):,}")
print(f"   Expected time: ~90-120 minutes")

print(f"\nPerformance Targets:")
print(f"   Training loss: < 2.0")
print(f"   ROUGE-1: > 0.35")
print(f"   ROUGE-2: > 0.08")

# Start training
try:
    print(f"\nStarting training...")
    training_output = trainer.train()
    
    print(f"Training completed successfully!")
    final_loss = training_output.training_loss
    print(f"Final training loss: {final_loss:.4f}")
    
    # Performance assessment
    if final_loss < 2.0:
        print(f"EXCELLENT! Target achieved!")
    elif final_loss < 3.0:
        print(f"GOOD! Solid progress made!")
    else:
        print(f"Training loss still high - consider more epochs")
    
except Exception as e:
    print(f"Training error: {e}")
    print(f"Try reducing batch size if out of memory")
    raise

print(f"\nTraining phase completed!")

Training setup completed!

Configuration Summary:
   Device: cpu
   Epochs: 12
   Learning rate: 1e-05
   Batch size: 2
   Gradient accumulation: 4
   Effective batch size: 8
   Mixed precision: False

Creating trainer...
Training Data Summary:
   Training samples: 437
   Validation samples: 94
   Expected time: ~90-120 minutes

Performance Targets:
   Training loss: < 2.0
   ROUGE-1: > 0.35
   ROUGE-2: > 0.08

Starting training...


| Step | Training Loss | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|---|---|
| 50 | 4.9252 | 4.478503 | 0.095112 | 0.004523 | 0.08146 |
| 100 | 4.6796 | 4.269139 | 0.095132 | 0.003931 | 0.084614 |
| 150 | 4.46 | 4.113316 | 0.105809 | 0.005112 | 0.09232 |
| 200 | 4.2239 | 3.995222 | 0.106279 | 0.004976 | 0.087887 |
| 250 | 4.1588 | 3.922315 | 0.115303 | 0.008225 | 0.095984 |
| 300 | 4.0561 | 3.857652 | 0.129753 | 0.010728 | 0.105896 |
| 350 | 4.093 | 3.816028 | 0.125963 | 0.008449 | 0.105601 |
| 400 | 3.9456 | 3.774239 | 0.128265 | 0.006411 | 0.104876 |
| 450 | 4.0466 | 3.769816 | 0.131428 | 0.007591 | 0.111602 |
| 500 | 3.8939 | 3.734317 | 0.135354 | 0.010398 | 0.115291 |
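
To make the ROUGE columns easier to read, the sketch below computes a simplified ROUGE-1 F1 score by hand: the harmonic mean of unigram precision and recall between a prediction and a reference. It assumes lowercase whitespace tokenization and no stemming, so its numbers will not match the `rouge` metric used in `compute_metrics`, which is called with `use_stemmer=True`. The example sentences are made up.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between a prediction and a reference."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: each token counts at most as often as it appears in both texts
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Illustrative texts only
reference = "calibrate ph monthly and ec quarterly"
prediction = "calibrate ph meters monthly"
score = rouge1_f1(prediction, reference)
print(f"{score:.2f}")
```

Here three of the four predicted words appear in the six-word reference, giving precision 0.75, recall 0.5, and F1 0.6. A ROUGE-1 of about 0.13, as in the last logged row, therefore means that generated answers share few words with the reference answers.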


There were missing keys in the checkpoint model loaded: ['encoder.embed_tokens.weight', 'decoder.embed_tokens.weight'].


Training completed successfully!
Final training loss: 4.1787
Training loss still high - consider more epochs

Training phase completed!
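
The configuration above implies a step budget and a learning-rate curve that can be estimated before training. The sketch below assumes the Trainer's default linear schedule (the cell does not set `lr_scheduler_type`): a linear warmup to the peak rate followed by a linear decay to zero. It also assumes that a final partial accumulation window counts as one optimizer update; the installed Trainer version may round differently, so the totals are estimates. Both helper functions are hypothetical.

```python
import math


def updates_per_epoch(n_samples: int, batch_size: int, grad_accum: int) -> int:
    """Optimizer updates per epoch, counting a final partial accumulation window."""
    batches = math.ceil(n_samples / batch_size)
    return math.ceil(batches / grad_accum)


def linear_schedule_lr(step: int, peak_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup to peak_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    return peak_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


# Values from the CPU run above: 437 samples, batch size 2, accumulation 4, 12 epochs
per_epoch = updates_per_epoch(n_samples=437, batch_size=2, grad_accum=4)
total_steps = per_epoch * 12

for step in (50, 100, 380, total_steps):
    lr = linear_schedule_lr(step, peak_lr=1e-5, warmup_steps=100, total_steps=total_steps)
    print(f"step {step}: lr = {lr:.2e}")
```

Under these assumptions the 100 warmup steps take up roughly 15% of the run, and the learning rate only reaches its 1e-5 peak at step 100 before decaying again.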


## 8. Model Evaluation

In [9]:
# Comprehensive Model Evaluation
def generate_enhanced_response(question: str, model, tokenizer, 
                             config: Optional[Dict] = None) -> str:
    """Generate enhanced response with improved settings."""
    if config is None:
        config = GENERATION_CONFIG
    
    input_text = f"Answer this hydroponic farming question: {question}"
    # Move the encoded inputs to the model's device so generation also works on GPU
    inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True).to(model.device)
    
    enhanced_config = config.copy()
    enhanced_config.update({
        "max_new_tokens": 120,
        "num_beams": 6,
        "temperature": 0.8,
        "repetition_penalty": 1.3,
        "length_penalty": 1.2
    })
    
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            **enhanced_config,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id
        )
    
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return clean_response_text(response)

def analyze_response_quality(response: str) -> Dict[str, Union[float, str, int]]:
    """Analyze response quality with multiple metrics."""
    words = response.split()
    if not words:
        return {"repetition": 1.0, "quality": "Poor - Empty response", "length": 0, "complexity": 0}
    
    unique_words = len(set(words))
    total_words = len(words)
    repetition_score = (total_words - unique_words) / total_words
    complexity_score = unique_words / total_words
    
    # Quality assessment
    if repetition_score < 0.1 and complexity_score > 0.7 and total_words > 15:
        quality = "EXCELLENT"
    elif repetition_score < 0.2 and complexity_score > 0.6 and total_words > 10:
        quality = "GOOD"
    elif repetition_score < 0.3 and total_words > 5:
        quality = "FAIR"
    else:
        quality = "POOR"
    
    return {
        "repetition": repetition_score,
        "quality": quality,
        "length": total_words,
        "complexity": complexity_score
    }

def assess_model_performance(training_loss: float, rouge_scores: Dict[str, float]) -> Dict[str, str]:
    """Assess overall model performance."""
    loss_status = ("EXCELLENT" if training_loss < 2.0 else 
                  "GOOD" if training_loss < 3.0 else "NEEDS MORE TRAINING")
    
    rouge1_status = ("EXCELLENT" if rouge_scores['eval_rouge1'] > 0.35 else
                    "GOOD" if rouge_scores['eval_rouge1'] > 0.25 else "NEEDS IMPROVEMENT")
    
    rouge2_status = ("EXCELLENT" if rouge_scores['eval_rouge2'] > 0.08 else
                    "GOOD" if rouge_scores['eval_rouge2'] > 0.05 else "NEEDS IMPROVEMENT")
    
    return {
        "loss_status": loss_status,
        "rouge1_status": rouge1_status,
        "rouge2_status": rouge2_status
    }

# Evaluate on test set
print("Evaluating model on test set...")
test_results = trainer.evaluate(eval_dataset=test_dataset)

print(f"\nTest Results:")
for key, value in test_results.items():
    if 'rouge' in key or 'loss' in key:
        print(f"   {key}: {value:.4f}")

# Advanced test questions
ADVANCED_QUESTIONS = [
    "What is the optimal pH range for hydroponic lettuce and why?",
    "How often should I change the nutrient solution and what factors affect this?",
    "What are the best vegetables for hydroponic farming in Rwanda considering climate?",
    "How do I prevent and treat root rot in hydroponic systems effectively?",
    "What essential nutrients do hydroponic tomatoes need for maximum yield?",
    "What's the difference between DWC and NFT systems for beginners?",
    "How do I maintain proper EC levels in my hydroponic nutrient solution?"
]

print(f"\nAdvanced Question Testing:")
model.eval()

for i, question in enumerate(ADVANCED_QUESTIONS, 1):
    try:
        response = generate_enhanced_response(question, model, tokenizer)
        print(f"\n{i}. Q: {question}")
        print(f"   A: {response}")
    except Exception as e:
        print(f"\n{i}. Q: {question}")
        print(f"   Error: {e}")

# Performance analysis
performance = assess_model_performance(training_output.training_loss, test_results)

print(f"\nPerformance Analysis:")
print(f"   Training Loss: {training_output.training_loss:.4f} ({performance['loss_status']})")
print(f"   ROUGE-1: {test_results['eval_rouge1']:.4f} ({performance['rouge1_status']})")
print(f"   ROUGE-2: {test_results['eval_rouge2']:.4f} ({performance['rouge2_status']})")
print(f"   ROUGE-L: {test_results['eval_rougeL']:.4f}")

# Comprehensive quality testing
QUALITY_TEST_QUESTIONS = [
    "What pH level should I maintain for hydroponic tomatoes?",
    "How do I prevent algae growth in my hydroponic system?",
    "What are the signs of nutrient deficiency in hydroponic plants?",
    "How much light do hydroponic vegetables need daily?",
    "What's the difference between DWC and NFT hydroponic systems?",
    "How do I calculate the right nutrient concentration for lettuce?",
    "What temperature should I maintain in my hydroponic greenhouse?",
    "Which crops are most profitable for hydroponic farming in Rwanda?"
]

print(f"\nResponse Quality Analysis:")
quality_metrics = {"repetition": [], "complexity": [], "length": []}

for i, question in enumerate(QUALITY_TEST_QUESTIONS, 1):
    try:
        response = generate_enhanced_response(question, model, tokenizer)
        analysis = analyze_response_quality(response)
        
        quality_metrics["repetition"].append(analysis["repetition"])
        quality_metrics["complexity"].append(analysis["complexity"])
        quality_metrics["length"].append(analysis["length"])
        
        print(f"\n{i}. Q: {question}")
        print(f"   A: {response}")
        print(f"   Quality: {analysis['quality']} | Length: {analysis['length']} | "
              f"Complexity: {analysis['complexity']:.2f} | Repetition: {analysis['repetition']:.2f}")
        
    except Exception as e:
        print(f"\n{i}. Error with question: {e}")

# Final assessment
if quality_metrics["repetition"]:
    avg_repetition = np.mean(quality_metrics["repetition"])
    avg_complexity = np.mean(quality_metrics["complexity"])
    avg_length = np.mean(quality_metrics["length"])
    
    print(f"\nOverall Quality Metrics:")
    print(f"   Average Length: {avg_length:.1f} words")
    print(f"   Average Complexity: {avg_complexity:.2f}")
    print(f"   Average Repetition: {avg_repetition:.2f}")
    
    # Calculate performance score
    performance_score = 0
    if training_output.training_loss < 2.0:
        performance_score += 25
    elif training_output.training_loss < 3.0:
        performance_score += 15
    
    if test_results['eval_rouge1'] > 0.35:
        performance_score += 25
    elif test_results['eval_rouge1'] > 0.25:
        performance_score += 15
    
    if avg_repetition < 0.2:
        performance_score += 25
    elif avg_repetition < 0.3:
        performance_score += 15
    
    if avg_complexity > 0.7:
        performance_score += 25
    elif avg_complexity > 0.6:
        performance_score += 15
    
    print(f"\nFinal Assessment:")
    print(f"   Overall Score: {performance_score}/100")
    
    if performance_score >= 80:
        status = "PRODUCTION READY"
        recommendation = "Deploy immediately with confidence"
    elif performance_score >= 60:
        status = "GOOD QUALITY"
        recommendation = "Suitable for testing and gradual deployment"
    elif performance_score >= 40:
        status = "MODERATE QUALITY"
        recommendation = "Needs additional training or fine-tuning"
    else:
        status = "NEEDS IMPROVEMENT"
        recommendation = "Requires significant improvements"
    
    print(f"   Status: {status}")
    print(f"   Recommendation: {recommendation}")
    
    print(f"\nNext Steps:")
    if performance_score >= 60:  # same cutoff as the "GOOD QUALITY" status above
        print(f"   - Save model and integrate with app.py")
        print(f"   - Use enhanced generation settings in production")
        print(f"   - Monitor user feedback and iterate")
    else:
        print(f"   - Continue training with lower learning rate")
        print(f"   - Expand dataset with more examples")
        print(f"   - Fine-tune generation parameters")

print(f"\nEvaluation completed!")

Evaluating model on test set...



Test Results:
   eval_loss: 3.6324
   eval_rouge1: 0.1667
   eval_rouge2: 0.0162
   eval_rougeL: 0.1333

Advanced Question Testing:

1. Q: What is the optimal pH range for hydroponic lettuce and why?
   A: Phosphorylation ranges from 0.5–0.2 °C for chlorophyll and 1–2 g/mL for leafy greens.

2. Q: How often should I change the nutrient solution and what factors affect this?
   A: Change the nutrient solution at least once a week to maintain optimal levels of nutrients and avoid over-dosing.

3. Q: What are the best vegetables for hydroponic farming in Rwanda considering climate?
   A: Vegetables can be grown in a wide variety o
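
The scoring rubric in the evaluation cell can be restated as a standalone function, which makes the thresholds easier to inspect and test in isolation. This is a sketch rather than part of the original notebook: the names `_tier` and `compute_performance_score` are hypothetical, and the repetition and complexity values in the usage lines are placeholders, not measured results. The thresholds themselves are copied from the cell above.

```python
def _tier(value: float, full: float, partial: float, higher_is_better: bool) -> int:
    """Return 25 points past the `full` cutoff, 15 past `partial`, otherwise 0."""
    if higher_is_better:
        if value > full:
            return 25
        if value > partial:
            return 15
    else:
        if value < full:
            return 25
        if value < partial:
            return 15
    return 0


def compute_performance_score(training_loss: float, rouge1: float,
                              avg_repetition: float, avg_complexity: float) -> int:
    """Sum the four 0/15/25 components used in the evaluation cell (max 100)."""
    return (
        _tier(training_loss, 2.0, 3.0, higher_is_better=False)
        + _tier(rouge1, 0.35, 0.25, higher_is_better=True)
        + _tier(avg_repetition, 0.2, 0.3, higher_is_better=False)
        + _tier(avg_complexity, 0.7, 0.6, higher_is_better=True)
    )


# Loss and ROUGE-1 are the values reported in this notebook's output;
# repetition and complexity are placeholder inputs for illustration only.
score = compute_performance_score(4.1787, 0.1667, avg_repetition=0.1, avg_complexity=0.8)
print(f"Overall Score: {score}/100")
```

With the training loss (4.1787) and test ROUGE-1 (0.1667) reported for this run, the first two components contribute no points, so the overall score cannot exceed 50 however the response-quality metrics turn out.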

## 9. Save the Fine-tuned Model
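
Once the cell below has written the model directory, the model can be reloaded for inference along these lines. This is a minimal sketch, not the project's `app.py`: `build_prompt`, `load_generation_config`, and `answer` are hypothetical helper names, the prompt template is the `input_format` string recorded in `model_info.json`, and the contents of `generation_config.json` are assumed to be keyword arguments that `model.generate` accepts. The transformers loaders are imported inside `answer` so the two lightweight helpers can be used without loading the model.

```python
import json
from pathlib import Path

# Prompt template taken from the "input_format" entry in model_info.json.
PROMPT_TEMPLATE = "Answer this hydroponic farming question: {question}"


def build_prompt(question: str) -> str:
    """Wrap a user question in the instruction prefix used in this notebook."""
    return PROMPT_TEMPLATE.format(question=question.strip())


def load_generation_config(model_dir) -> dict:
    """Read generation_config.json from the model directory, or {} if absent."""
    config_file = Path(model_dir) / "generation_config.json"
    if not config_file.exists():
        return {}
    with open(config_file, encoding="utf-8") as f:
        return json.load(f)


def answer(question: str, model_dir) -> str:
    """Load the saved model and generate one answer (assumes transformers is installed)."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)
    model.eval()

    # 512 is the "recommended_max_length" stored in model_info.json.
    inputs = tokenizer(build_prompt(question), return_tensors="pt",
                       truncation=True, max_length=512)
    # Assumption: every key in the saved config is a valid generate() argument.
    output_ids = model.generate(**inputs, **load_generation_config(model_dir))
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `answer("What pH level should I maintain for hydroponic tomatoes?", "trained_model")` would then return the decoded response. Recent transformers versions also write their own `generation_config.json` during `save_pretrained`, so the file read here is whichever version was written last.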

In [10]:
# Save Fine-tuned Model
import gc
import json
from datetime import datetime
from pathlib import Path
from typing import Dict

import torch

def clear_memory():
    """Clear GPU and system memory."""
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    gc.collect()

def save_model_safely(model, tokenizer, save_path: Path, description: str) -> bool:
    """Save model and tokenizer with error handling."""
    try:
        save_path.mkdir(parents=True, exist_ok=True)
        
        # safe_serialization=False writes a PyTorch .bin checkpoint instead of
        # safetensors; chosen here for Windows compatibility
        model.save_pretrained(save_path, safe_serialization=False)
        tokenizer.save_pretrained(save_path)
        
        print(f"SUCCESS: {description} saved to: {save_path}")
        return True
        
    except Exception as e:
        print(f"ERROR: Failed to save {description}: {e}")
        return False

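def create_model_info(model_name: str, training_config: Dict, 
                     training_results, test_results: Dict) -> Dict:
    """Create comprehensive model information.

    training_results must expose a .training_loss attribute (for example,
    the object returned by Trainer.train()). Dataset sizes and length
    limits are read from notebook-level globals.
    """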
    return {
        "model_info": {
            "base_model": model_name,
            "model_type": "FLAN-T5-base fine-tuned for hydroponic farming",
            "creation_date": datetime.now().isoformat(),
            "pytorch_version": torch.__version__
        },
        "dataset_info": {
            "training_samples": len(train_dataset),
            "validation_samples": len(val_dataset),
            "test_samples": len(test_dataset),
            "max_input_length": MAX_INPUT_LENGTH,
            "max_target_length": MAX_TARGET_LENGTH
        },
        "training_config": training_config,
        "performance_metrics": {
            "final_training_loss": training_results.training_loss,
            "test_rouge1": test_results['eval_rouge1'],
            "test_rouge2": test_results['eval_rouge2'],
            "test_rougeL": test_results['eval_rougeL'],
            "test_loss": test_results['eval_loss']
        },
        "generation_config": GENERATION_CONFIG,
        "usage_instructions": {
            "input_format": "Answer this hydroponic farming question: {question}",
            "recommended_max_length": 512,
            "recommended_generation_config": GENERATION_CONFIG
        }
    }

# Clear memory before saving
print("Clearing memory...")
clear_memory()

# Define save paths
final_model_path = MODEL_DIR / "flan-t5-hydroponic-final"
main_model_path = BASE_DIR / "trained_model"

print(f"Saving fine-tuned model...")

# Save to final model directory
success_final = save_model_safely(
    model, tokenizer, final_model_path, 
    "Fine-tuned model (final)"
)

# Save to main directory for app.py compatibility
success_main = save_model_safely(
    model, tokenizer, main_model_path,
    "Fine-tuned model (app compatible)"
)

# Create and save model information
if success_main:
    try:
        model_info = create_model_info(
            MODEL_NAME, TRAINING_CONFIG, 
            training_output, test_results
        )
        
        info_file = main_model_path / "model_info.json"
        with open(info_file, "w", encoding='utf-8') as f:
            json.dump(model_info, f, indent=2, ensure_ascii=False)
        
        print(f"Model info saved to: {info_file}")
        
    except Exception as e:
        print(f"Could not save model info: {e}")

# Save generation config separately for easy access
try:
    config_file = main_model_path / "generation_config.json"
    with open(config_file, "w", encoding='utf-8') as f:
        json.dump(GENERATION_CONFIG, f, indent=2)
    
    print(f"Generation config saved to: {config_file}")
    
except Exception as e:
    print(f"Could not save generation config: {e}")

# Final summary
print(f"\nModel Saving Summary:")
print(f"Model Locations:")
if success_final:
    print(f"   Final model: {final_model_path}")
if success_main:
    print(f"   App-ready model: {main_model_path}")

print(f"\nModel Performance Summary:")
print(f"   Training Loss: {training_output.training_loss:.4f}")
print(f"   Test ROUGE-1: {test_results['eval_rouge1']:.4f}")
print(f"   Test ROUGE-2: {test_results['eval_rouge2']:.4f}")
print(f"   Test ROUGE-L: {test_results['eval_rougeL']:.4f}")

print(f"\nReady for deployment!")
print(f"   Use the model in {main_model_path} for your application")
print(f"   Reference generation_config.json for optimal settings")

# Clean up memory one more time
clear_memory()
print(f"Memory cleaned and model saving completed!")

Clearing memory...
Saving fine-tuned model...
SUCCESS: Fine-tuned model (final) saved to: c:\Users\HP\Desktop\ALU\Farmsmart_growmate_chatbot\trained_model\flan-t5-hydroponic-final
SUCCESS: Fine-tuned model (app compatible) saved to: c:\Users\HP\Desktop\ALU\Farmsmart_growmate_chatbot\trained_model
Model info saved to: c:\Users\HP\Desktop\ALU\Farmsmart_growmate_chatbot\trained_model\model_info.json
Generation config saved to: c:\Users\HP\Desktop\ALU\Farmsmart_growmate_chatbot\trained_model\generation_config.json

Model Saving Summary:
Model Locations:
   Final model: c:\Users\HP\Desktop\ALU\Farmsmart_growmate_chatbot\trained_model\flan-t5-hydroponic-final
   App-ready model: c:\Users\HP\Desktop\ALU\Farmsmart_growmate_chatbot\trained_model

Model Performance Summary:
   Training Loss: 4.1787
   Test ROUGE-1: 0.1667
   Test ROUGE-2: 0.0162
   Test ROUGE-L: 0