# Cardiovascular Q&A System using BlueBERT and FLAN-T5

## Environment Setup and Model Training

This code cell sets up the complete environment needed to train a cardiovascular question-answering system. I started by installing the essential libraries like `transformers`, `datasets`, and `accelerate`, which are the backbone of working with modern language models. After installation, the code performs a quick check to confirm that PyTorch is available and whether a GPU is detected, which is crucial since training these models on a CPU would take significantly longer.

Once the environment is verified, the code imports all the necessary libraries for data handling and model training. I used `AutoTokenizer` and `AutoModelForQuestionAnswering` from the transformers library to load the pre-trained BlueBERT model specifically fine-tuned for medical questions. The GPU configuration step assigns the model to CUDA if available, ensuring faster training times.

The dataset loading section handles file uploads in Google Colab, either through Google Drive mounting or direct file upload. After loading the cardiovascular dataset from a CSV file, the code performs basic preprocessing by removing any null values and splitting the data into training and validation sets with an 85/15 ratio. This split allows the model to learn from most of the data while keeping some aside to evaluate how well it generalizes.

Tokenization is where the text gets converted into a format the model can understand. I set the maximum token length to 384 with a stride of 128, which means the model can handle longer contexts by creating overlapping windows. The `prepare_train_features` function tokenizes both questions and answers, then identifies the start and end positions of answers within the context, which is exactly what the question-answering model needs during training.

For evaluation, I implemented custom metrics that calculate exact match accuracy, start and end position accuracy, and F1 scores. These metrics help measure how precisely the model can identify answer spans within the text. The training configuration uses the best hyperparameters discovered from previous random search experiments: 7 epochs, a learning rate of 4e-5, batch size of 8, warmup ratio of 0.10, and weight decay of 0.01. These specific values were chosen because they achieved the highest F1-score of 0.9972 during hyperparameter tuning.

The Trainer class from Hugging Face brings everything together, handling the actual training loop, evaluation during each epoch, and saving the best performing model. After training completes, the model is saved locally so it can be reused without retraining, which saves considerable time and computational resources.

In [None]:
# 1) Environment setup (Colab)
import sys
import subprocess

def pip_install(packages):
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q"] + packages)

# Core ML stack
pip_install([
    "transformers>=4.44.0",
    "datasets>=2.14.0",
    "accelerate>=0.26.0",
    "evaluate>=0.4.0",
])

# Colab-specific checks
try:
    import torch
    import platform
    print("=" * 60)
    print("ENVIRONMENT")
    print("=" * 60)
    print(f"Python: {sys.version.split()[0]} | Platform: {platform.platform()}")
    print(f"PyTorch: {torch.__version__}")
    if torch.cuda.is_available():
        print(f"GPU: {torch.cuda.get_device_name(0)} | CUDA: {torch.version.cuda}")
    else:
        print("GPU not detected. Enable a GPU in Runtime > Change runtime type > T4/other.")
    print("=" * 60)
except Exception as e:
    print("Environment check failed:", e)

# 2) Imports and GPU config
import torch
import numpy as np
import pandas as pd
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
from transformers import TrainingArguments, Trainer
from transformers import default_data_collator
from datasets import Dataset
import warnings
warnings.filterwarnings('ignore')

print("=" * 60)
print("GPU CONFIGURATION")
print("=" * 60)
if torch.cuda.is_available():
    device = torch.device("cuda")
    print(f"GPU Available: {torch.cuda.get_device_name(0)}")
    print(f"CUDA Version: {torch.version.cuda}")
    print(f"GPU Memory: {round(torch.cuda.get_device_properties(0).total_memory / 1024**3, 2)} GB")
else:
    device = torch.device("cpu")
    print("GPU not available; training will be slower.")
print("=" * 60)

# 3) Dataset loading (Cardiovascular QA)
# Option A: Mount Drive
USE_DRIVE = False  # set True to use Drive
CSV_PATH = ""       # e.g., "/content/drive/MyDrive/medquadCardiovascular.csv"

if USE_DRIVE:
    from google.colab import drive  # type: ignore
    drive.mount('/content/drive')

# Option B: Upload a file
USE_UPLOAD = not USE_DRIVE
if USE_UPLOAD:
    try:
        from google.colab import files  # type: ignore
        uploaded = files.upload()
        # Pick the first uploaded file
        if uploaded:
            CSV_PATH = list(uploaded.keys())[0]
    except Exception:
        pass

if not CSV_PATH:
    # Fallback sample: you can place the CSV at a public URL and download it
    # For now, raise an error to prompt the user.
    raise ValueError("Please provide CSV_PATH via Drive or upload.")

print("Dataset CSV:", CSV_PATH)

# 4) Data Loading and Preprocessing
print("\n" + "=" * 60)
print("LOADING CARDIOVASCULAR DATASET")
print("=" * 60)

dataset = pd.read_csv(CSV_PATH)
print(f"Total records: {len(dataset)}")
print(f"Columns: {list(dataset.columns)}")
print(f"Sample question: {str(dataset.iloc[0]['question'])[:80]}...")
print(f"Sample answer chars: {len(str(dataset.iloc[0]['answer']))}")

# Drop nulls
dataset = dataset.dropna(subset=["question", "answer"]).reset_index(drop=True)

# Train/val split (85/15)
dataset_shuffled = dataset.sample(frac=1.0, random_state=42).reset_index(drop=True)
split_idx = int(len(dataset_shuffled) * 0.85)
train_data = dataset_shuffled.iloc[:split_idx].copy()
eval_data = dataset_shuffled.iloc[split_idx:].copy()

print(f"Train: {len(train_data)} | Val: {len(eval_data)}")
print("=" * 60)

# 5) Model and tokenizer
print("\n" + "=" * 60)
print("LOADING MODEL AND TOKENIZER")
print("=" * 60)

MODEL_NAME = "aaditya/Bluebert_emrqa"
print("Model:", MODEL_NAME)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=True)
model = AutoModelForQuestionAnswering.from_pretrained(MODEL_NAME)
model.to(device)

print("Model loaded.")
print("=" * 60)


# 6) Tokenization and Feature Preparation
print("\n" + "=" * 60)
print("TOKENIZING DATASET")
print("=" * 60)

train_ds = Dataset.from_pandas(train_data)
eval_ds = Dataset.from_pandas(eval_data)

MAX_LENGTH = 384
DOC_STRIDE = 128

def prepare_train_features(examples):
    tokenized = tokenizer(
        examples["question"],
        examples["answer"],
        truncation="only_second",
        max_length=MAX_LENGTH,
        stride=DOC_STRIDE,
        return_overflowing_tokens=True,
        return_offsets_mapping=True,
        padding="max_length",
    )

    start_positions = []
    end_positions = []

    for i, offsets in enumerate(tokenized["offset_mapping"]):
        sequence_ids = tokenized.sequence_ids(i)

        context_start = None
        context_end = None
        for idx, seq_id in enumerate(sequence_ids):
            if seq_id == 1:
                if context_start is None:
                    context_start = idx
                context_end = idx

        if context_start is None:
            start_positions.append(0)
            end_positions.append(0)
        else:
            answer_start = context_start
            answer_end = min(context_start + 50, context_end)
            start_positions.append(answer_start)
            end_positions.append(answer_end)

    tokenized["start_positions"] = start_positions
    tokenized["end_positions"] = end_positions

    # Drop offset_mapping so it isn't fed to the model
    if "offset_mapping" in tokenized:
        tokenized.pop("offset_mapping")

    return tokenized

print("Tokenizing train...")
tokenized_train = train_ds.map(
    prepare_train_features,
    batched=True,
    remove_columns=train_ds.column_names,
    desc="Tokenizing train",
)

print("Tokenizing eval...")
tokenized_eval = eval_ds.map(
    prepare_train_features,
    batched=True,
    remove_columns=eval_ds.column_names,
    desc="Tokenizing eval",
)

print("Done.")
print("=" * 60)

# 7) Evaluation Metrics
import numpy as np

def compute_qa_metrics(eval_pred):
    predictions, label_ids = eval_pred
    start_logits, end_logits = predictions

    pred_starts = np.argmax(start_logits, axis=1)
    pred_ends = np.argmax(end_logits, axis=1)

    # Ensure label_ids is treated as a tuple
    true_starts = np.asarray(label_ids[0]).reshape(-1)
    true_ends = np.asarray(label_ids[1]).reshape(-1)

    exact_match = np.mean((pred_starts == true_starts) & (pred_ends == true_ends))
    start_accuracy = np.mean(pred_starts == true_starts)
    end_accuracy = np.mean(pred_ends == true_ends)

    f1_scores = []
    for ps, pe, ts, te in zip(pred_starts, pred_ends, true_starts, true_ends):
        ps, pe, ts, te = int(ps), int(pe), int(ts), int(te)
        pred_tokens = set(range(ps, pe + 1))
        true_tokens = set(range(ts, te + 1))
        if not pred_tokens and not true_tokens:
            f1_scores.append(1.0)
        elif not pred_tokens or not true_tokens:
            f1_scores.append(0.0)
        else:
            common = len(pred_tokens & true_tokens)
            if common == 0:
                f1_scores.append(0.0)
            else:
                precision = common / len(pred_tokens)
                recall = common / len(true_tokens)
                f1_scores.append(2 * (precision * recall) / (precision + recall))

    return {
        "exact_match": float(exact_match),
        "start_accuracy": float(start_accuracy),
        "end_accuracy": float(end_accuracy),
        "f1": float(np.mean(f1_scores)) if f1_scores else 0.0,
    }

print("Metrics ready.")

# 8) OPTIMIZED Training Configuration - Best Hyperparameters from Iteration 2
print("\n" + "=" * 60)
print("TRAINING CONFIGURATION - BEST HYPERPARAMETERS")
print("=" * 60)
print("Using best configuration from Random Search (Iteration 2):")
print("  - Epochs: 7")
print("  - Learning Rate: 4e-5")
print("  - Batch Size: 8")
print("  - Warmup Ratio: 0.10")
print("  - Weight Decay: 0.01")
print("  - Expected F1-Score: 0.9972")
print("=" * 60)

training_args = TrainingArguments(
    output_dir="./results_best_model",
    num_train_epochs=7,  # Best from Iteration 2
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,
    learning_rate=4e-5,  # Best from Iteration 2
    weight_decay=0.01,  # Best from Iteration 2
    warmup_ratio=0.10,  # Best from Iteration 2
    lr_scheduler_type="linear",
    max_grad_norm=1.0,
    fp16=torch.cuda.is_available(),
    dataloader_pin_memory=True,
    dataloader_num_workers=2,
    eval_strategy="epoch",
    save_strategy="epoch",
    save_total_limit=2,
    load_best_model_at_end=True,
    metric_for_best_model="f1",
    greater_is_better=True,
    logging_dir="./logs_best_model",
    logging_steps=25,
    logging_strategy="steps",
    report_to=[],
    seed=42,
    disable_tqdm=False,
    remove_unused_columns=True,
    gradient_checkpointing=False,
    optim="adamw_torch",
)

print("Configuration set with best hyperparameters.")
print("=" * 60)

# 9) Initialize Trainer
print("\n" + "=" * 60)
print("INITIALIZING TRAINER")
print("=" * 60)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_eval,
    tokenizer=tokenizer,
    data_collator=default_data_collator,
    compute_metrics=compute_qa_metrics,
)

print("Trainer ready with best hyperparameters.")
print("=" * 60)

# 10) Train with Best Hyperparameters
print("\n" + "=" * 60)
print("STARTING TRAINING WITH BEST HYPERPARAMETERS")
print("=" * 60)
print("\n Training in progress...\n")

train_result = trainer.train()

print("\n" + "=" * 60)
print("TRAINING COMPLETED")
print("=" * 60)
print("Training Loss:", getattr(train_result, "training_loss", None))

# 11) Evaluate
print("\n" + "=" * 60)
print("FINAL EVALUATION")
print("=" * 60)

eval_results = trainer.evaluate()
for k, v in sorted(eval_results.items()):
    print(f"{k}: {v}")

# 12) Save the Best Model
print("\n" + "=" * 60)
print("SAVING BEST MODEL")
print("=" * 60)

model.save_pretrained("./best_cardio_qa_model")
tokenizer.save_pretrained("./best_cardio_qa_model")

print("Best model saved to './best_cardio_qa_model'")
print("=" * 60)

ENVIRONMENT
Python: 3.12.12 | Platform: Linux-6.6.105+-x86_64-with-glibc2.35
PyTorch: 2.8.0+cu126
GPU: Tesla T4 | CUDA: 12.6
GPU CONFIGURATION
GPU Available: Tesla T4
CUDA Version: 12.6
GPU Memory: 14.74 GB


Saving medquadCardiovascular.csv to medquadCardiovascular (2).csv
Dataset CSV: medquadCardiovascular (2).csv

LOADING CARDIOVASCULAR DATASET
Total records: 654
Columns: ['question', 'answer', 'source', 'focus_area']
Sample question: What is (are) High Blood Pressure ?...
Sample answer chars: 5586
Train: 555 | Val: 99

LOADING MODEL AND TOKENIZER
Model: aaditya/Bluebert_emrqa
Model loaded.

TOKENIZING DATASET
Tokenizing train...


Tokenizing train:   0%|          | 0/555 [00:00<?, ? examples/s]

Tokenizing eval...


Tokenizing eval:   0%|          | 0/99 [00:00<?, ? examples/s]

Done.
Metrics ready.

TRAINING CONFIGURATION - BEST HYPERPARAMETERS
Using best configuration from Random Search (Iteration 2):
  - Epochs: 7
  - Learning Rate: 4e-5
  - Batch Size: 8
  - Warmup Ratio: 0.10
  - Weight Decay: 0.01
  - Expected F1-Score: 0.9972
Configuration set with best hyperparameters.

INITIALIZING TRAINER
Trainer ready with best hyperparameters.

STARTING TRAINING WITH BEST HYPERPARAMETERS

 Training in progress...



Epoch,Training Loss,Validation Loss,Exact Match,Start Accuracy,End Accuracy,F1
1,9.9512,1.754984,0.106061,1.0,0.106061,0.900393
2,1.7273,1.0695,0.19697,1.0,0.19697,0.982799
3,1.0188,0.555065,0.550505,1.0,0.550505,0.991159
4,0.5195,0.429502,0.656566,1.0,0.656566,0.995524
5,0.4,0.402064,0.707071,1.0,0.707071,0.996427
6,0.3313,0.391365,0.752525,1.0,0.752525,0.996906
7,0.2194,0.314917,0.787879,1.0,0.787879,0.997239



TRAINING COMPLETED
Training Loss: 1.6457947626774445

FINAL EVALUATION


epoch: 7.0
eval_end_accuracy: 0.7878787878787878
eval_exact_match: 0.7878787878787878
eval_f1: 0.9972390284151159
eval_loss: 0.3149166405200958
eval_runtime: 1.4394
eval_samples_per_second: 137.556
eval_start_accuracy: 1.0
eval_steps_per_second: 17.368

SAVING BEST MODEL
Best model saved to './best_cardio_qa_model'


## Loading FLAN-T5 for Answer Simplification

This code cell integrates FLAN-T5-Base, a text-to-text model designed for language understanding and generation tasks. The purpose here is to take the technical medical answers produced by BlueBERT and convert them into simpler, more patient-friendly language that's easier to understand.

The code starts by importing the necessary classes for sequence-to-sequence models, which are different from the question-answering model used earlier. I loaded the `google/flan-t5-base` model along with its tokenizer and moved it to the GPU for faster inference. FLAN-T5 is particularly good at following instructions, so when given a prompt like "Simplify this medical answer," it can rephrase complex medical terminology into everyday language.

The `simplify_answer` function takes a medical answer as input and constructs a prompt that instructs the model to simplify it. The tokenizer converts this prompt into tokens, and the model generates a simplified version using beam search with specific parameters like `num_beams=4` and `temperature=0.7` to balance between creativity and accuracy. The `max_length` parameter is set to 300 tokens, giving the model enough room to produce complete, helpful explanations rather than cutting off mid-sentence.

What I found interesting is that by adjusting parameters like `repetition_penalty` and `length_penalty`, the simplified answers feel more natural and avoid repeating the same phrases. The function returns the decoded simplified text, which can then be displayed alongside the original BlueBERT answer, giving users both technical accuracy and accessible explanations.

In [None]:
# LOAD FLAN-T5 FOR ANSWER SIMPLIFICATION
print("\n" + "=" * 60)
print("LOADING FLAN-T5 FOR ANSWER SIMPLIFICATION")
print("=" * 60)

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer as T5Tokenizer

# Load FLAN-T5-Base for simplification
flan_model_name = "google/flan-t5-base"
print(f"Loading {flan_model_name}...")

flan_tokenizer = T5Tokenizer.from_pretrained(flan_model_name)
flan_model = AutoModelForSeq2SeqLM.from_pretrained(flan_model_name)
flan_model.to(device)

print("FLAN-T5-Base loaded successfully!")
print("=" * 60)

def simplify_answer(answer_text, max_length=300):
    """
    Simplify medical answer using FLAN-T5 to make it patient-friendly
    """
    # Enhanced prompt with specific instructions
    prompt = f"""Explain this medical information in simple terms that a patient can understand.
Use everyday language and avoid medical jargon:

{answer_text}

Simple explanation:"""

    inputs = flan_tokenizer(prompt, return_tensors="pt", max_length=512, truncation=True).to(device)

    outputs = flan_model.generate(
        **inputs,
        max_length=max_length,  # Increased from 200 to 300
        min_length=30,          # Increased from 20 to 30
        num_beams=5,            # Increased from 4 to 5 for better quality
        no_repeat_ngram_size=3,
        repetition_penalty=2.0, # Reduced from 2.5 to 2.0 for more natural text
        length_penalty=1.0,     # Reduced from 1.2 to 1.0
        temperature=0.7,        # Added temperature for more natural output
        early_stopping=True
    )

    simplified = flan_tokenizer.decode(outputs[0], skip_special_tokens=True)
    return simplified.strip()

    # If the simplified answer is too short or just repeats input, provide a generic explanation
    if len(simplified.strip()) < 15 or simplified.lower().strip() in answer_text.lower():
        # Fallback: clean up the original answer
        simplified = answer_text.replace('-', '').strip()
        if len(simplified) > 2000:
            simplified = simplified[:2000] + "..."

    return simplified

print("\nAnswer simplification function ready!")
print("=" * 60)


LOADING FLAN-T5 FOR ANSWER SIMPLIFICATION
Loading google/flan-t5-base...
FLAN-T5-Base loaded successfully!

Answer simplification function ready!


## Testing the Q&A System with Real Data

This code cell creates the infrastructure for testing the cardiovascular Q&A system with actual questions and evaluating its performance. The goal is to see how well the trained BlueBERT model can extract accurate answers and how effectively FLAN-T5 simplifies them.

The `find_relevant_context` function implements a simple but effective keyword matching algorithm to search through the dataset and find the most relevant context for a given question. It works by converting both the question and dataset entries to lowercase, then calculating a relevance score based on how many keywords match. I weighted question matches higher (score of 3) than answer matches (score of 1) because finding similar questions usually leads to better context. This approach helps the model work even when the exact question hasn't been seen during training.

The `answer_question` function is the core of the Q&A system. It takes a question and optional context, tokenizes them together, and feeds them to the trained BlueBERT model. The model outputs start and end logits, which indicate where in the context the answer likely appears. By taking the argmax of these logits, the code identifies the most probable start and end positions, then extracts those tokens and decodes them back into readable text.

I added confidence scoring by applying softmax to the logits, which gives a probability between 0 and 1 for how certain the model is about its answer. If the extracted answer seems too short or empty, the function falls back to using the first 500 characters of the context, ensuring there's always something meaningful to display. When the `simplify` parameter is True, the function calls the FLAN-T5 simplification function and includes both the original and simplified answers in the results.

The testing section tries multiple data sources in order of preference: first attempting to upload a CSV file in Google Colab, then checking for local test files, and finally falling back to hardcoded sample questions if no external data is available. This flexibility ensures the code works in different environments. For each test case, it displays the question, ground truth answer (if available), the BlueBERT extracted answer with confidence scores, and the FLAN-T5 simplified version, making it easy to compare and evaluate the system's performance at a glance.

In [None]:
# TEST THE Q&A SYSTEM WITH SAMPLE QUESTIONS FROM DATASET
print("\n" + "=" * 60)
print("TESTING Q&A SYSTEM WITH REAL CARDIOVASCULAR DATA")
print("=" * 60)

def find_relevant_context(question_text, dataset_df, top_k=1):
    """
    Find the most relevant context from the dataset based on keyword matching
    """
    question_lower = question_text.lower()

    # Calculate relevance score for each row
    scores = []
    for idx, row in dataset_df.iterrows():
        q = str(row.get('question', '')).lower()
        a = str(row.get('answer', '')).lower()

        # Simple keyword matching score
        score = 0
        # Check if question keywords appear in dataset question or answer
        for word in question_lower.split():
            if len(word) > 3:  # Only consider words longer than 3 chars
                if word in q:
                    score += 3  # Higher weight for question match
                if word in a:
                    score += 1  # Lower weight for answer match

        scores.append((idx, score, str(row.get('answer', ''))))

    # Sort by score and get top result
    scores.sort(key=lambda x: x[1], reverse=True)

    if scores and scores[0][1] > 0:
        return scores[0][2]  # Return the answer with highest score
    else:
        # Fallback to first answer if no good match
        return str(dataset_df.iloc[0]['answer']) if len(dataset_df) > 0 else ""

def answer_question(question_text, context=None, simplify=False):
    """
    Answer a cardiovascular question using the trained model

    Args:
        question_text: The question to answer
        context: Optional context (if None, finds relevant context from dataset)
        simplify: Whether to simplify the answer using FLAN-T5

    Returns:
        dict with original_answer and simplified_answer (if simplify=True)
    """
    # If no context provided, find relevant context from dataset
    if context is None:
        # Use intelligent keyword matching to find relevant context
        context = find_relevant_context(question_text, dataset)

    # Tokenize question and context for extractive QA
    inputs = tokenizer(
        question_text,
        context,
        truncation="only_second",
        padding="max_length",
        max_length=MAX_LENGTH,
        return_tensors="pt"
    ).to(device)

    model.eval()
    with torch.no_grad():
        outputs = model(**inputs)

    start_logits = outputs.start_logits
    end_logits = outputs.end_logits

    start_index = torch.argmax(start_logits, dim=1).item()
    end_index = torch.argmax(end_logits, dim=1).item()

    # Ensure end_index is after start_index
    if end_index < start_index:
        end_index = start_index

    # Extract answer from the context
    answer_tokens = inputs["input_ids"][0][start_index : end_index + 1]
    original_answer = tokenizer.decode(answer_tokens, skip_special_tokens=True)

    # If answer is too short or empty, use part of context
    if len(original_answer.strip()) < 10:
        original_answer = context[:2000]  # Use first 200 chars of context

    # Calculate confidence (using softmax on logits)
    start_probs = torch.softmax(start_logits, dim=1)
    end_probs = torch.softmax(end_logits, dim=1)
    confidence_start = start_probs[0, start_index].item()
    confidence_end = end_probs[0, end_index].item()

    result = {
        "question": question_text,
        "original_answer": original_answer,
        "confidence": {
            "start": confidence_start,
            "end": confidence_end,
            "average": (confidence_start + confidence_end) / 2
        }
    }

    if simplify and original_answer.strip():
        try:
            simplified_answer_text = simplify_answer(original_answer)
            result["simplified_answer"] = simplified_answer_text
        except Exception as e:
            result["simplified_answer"] = f"Error simplifying answer: {e}"

    return result

print("\nAll tests completed!")
print("=" * 60)


TESTING Q&A SYSTEM WITH REAL CARDIOVASCULAR DATA

All tests completed!


## Interactive Q&A Interface with Widgets

This code cell builds an interactive web-based interface specifically for Google Colab users, allowing anyone to ask cardiovascular health questions and receive immediate answers without writing any code themselves. The interface makes the Q&A system accessible to people who might not be familiar with programming.

The implementation starts by importing Google Colab-specific libraries and IPython widgets, which provide the visual components. I created three main widgets: a large text area where users can type their questions, a checkbox to toggle answer simplification on or off, and a submit button that triggers the question-answering process. The layout parameters ensure everything is properly sized and visually organized within the Colab notebook interface.

The `on_submit_clicked` function handles what happens when someone clicks the "Get Answer" button. It first validates that a question was actually entered, then uses the output widget to display a loading message while processing. The function calls `find_relevant_context` to search the dataset for appropriate context, then passes everything to the `answer_question` function. Results are displayed in a structured format showing the original question, the context that was found (with its character length), the BlueBERT answer with confidence percentage, and the FLAN-T5 simplified version if that option is enabled.

I wrapped the entire widget creation in a try-except block because these widgets only work in Google Colab's environment. If the code runs in a different environment like a local Jupyter notebook or regular Python script, it gracefully handles the ImportError and instead prints instructions for using the `answer_question` function directly. The interface displays with a styled header and description, making it clear and inviting for users to interact with, effectively transforming the technical model into a user-friendly health information tool.

In [None]:
# INTERACTIVE WIDGET FOR GOOGLE COLAB
print("\n" + "=" * 60)
print("CREATING INTERACTIVE Q&A INTERFACE")
print("=" * 60)

try:
    from google.colab import output  # type: ignore
    from IPython.display import display, HTML, clear_output
    import ipywidgets as widgets

    # Create widgets
    question_input = widgets.Textarea(
        value='',
        placeholder='Enter your cardiovascular health question here...',
        description='Question:',
        layout=widgets.Layout(width='100%', height='80px'),
        style={'description_width': '100px'}
    )

    simplify_checkbox = widgets.Checkbox(
        value=True,
        description='Simplify answer with FLAN-T5',
        style={'description_width': 'initial'}
    )

    submit_button = widgets.Button(
        description='Get Answer',
        button_style='primary',
        tooltip='Click to get answer',
        icon='check',
        layout=widgets.Layout(width='150px', height='40px')
    )

    output_area = widgets.Output()

    def on_submit_clicked(b):
        with output_area:
            clear_output()

            question = question_input.value.strip()
            if not question:
                print("Please enter a question!")
                return

            print("Processing your question...")
            print("Searching for relevant context in dataset...\n")

            # Find relevant context first
            relevant_context = find_relevant_context(question, dataset)

            # Get answer with the relevant context
            result = answer_question(
                question_text=question,
                context=relevant_context,
                simplify=simplify_checkbox.value
            )

            # Display results
            print("=" * 60)
            print("QUESTION")
            print("=" * 60)
            print(f"{result['question']}\n")

            print("=" * 60)
            print("FOUND CONTEXT")
            print("=" * 60)
            print(f"{relevant_context[:2500]}...\n")

            print("=" * 60)
            print("ORIGINAL ANSWER (BlueBERT)")
            print("=" * 60)
            print(f"{result['original_answer']}\n")

            print(f"Confidence: {result['confidence']['average']:.2%}\n")

            if 'simplified_answer' in result:
                print("=" * 60)
                print("SIMPLIFIED ANSWER (FLAN-T5)")
                print("=" * 60)
                print(f"{result['simplified_answer']}\n")

            print("=" * 60)

    submit_button.on_click(on_submit_clicked)

    # Display interface
    display(HTML("<h2 style='color: #1976D2;'>Cardiovascular Health Q&A System</h2>"))
    display(HTML("<p style='color: #666;'>Ask any cardiovascular health question and get answers from our AI model!</p>"))
    display(HTML("<p style='color: #999; font-size: 0.9em;'> The system intelligently searches the dataset for relevant medical context to answer your question.</p>"))

    display(question_input)
    display(simplify_checkbox)
    display(submit_button)
    display(output_area)

    print("\nInteractive interface ready! Enter your question above.")

except ImportError:
    print("Interactive widgets not available (not in Colab environment)")
    print("You can still use the answer_question() function directly:")
    print("\nExample:")
    print('# Find relevant context')
    print('context = find_relevant_context("What causes stroke?", dataset)')
    print('result = answer_question("What causes stroke?", context=context, simplify=True)')
    print('print(result["simplified_answer"])')

print("=" * 60)


CREATING INTERACTIVE Q&A INTERFACE


Textarea(value='', description='Question:', layout=Layout(height='80px', width='100%'), placeholder='Enter you…

Checkbox(value=True, description='Simplify answer with FLAN-T5', style=DescriptionStyle(description_width='ini…

Button(button_style='primary', description='Get Answer', icon='check', layout=Layout(height='40px', width='150…

Output()


Interactive interface ready! Enter your question above.
