<a href="https://colab.research.google.com/github/jonesLevin/Data-Science-Competitions/blob/main/Clinical_Reasoning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Kenya Clinical Reasoning Challenge

Creating a model that tries to match a real clinician's diagnosis and management of a condition in health care.

In this challenge, we are given 400 authentic clinical prompts—each one a carefully crafted vignette combining a nurse’s background and a complex medical situation. The task is to predict the clinician’s response to each scenario, replicating the reasoning of trained professionals as closely as possible.

The vignettes span a wide range of medical domains, from maternal and child health to critical care, and were originally evaluated by expert clinicians and leading AI models (including GPT-4.0, LLAMA, and GEMINI). Each prompt includes details like the patient's presentation, nurse experience level, and facility type, simulating the nuance and challenge of real clinical environments in Kenya.

In [1]:
!pip install -q accelerate datasets --upgrade


## Extract the Data

In [2]:
!unzip /content/drive/MyDrive/Clinical-Reasoning/kenya-clinical-reasoning-challenge20250407-27832-p30dso.zip

Archive:  /content/drive/MyDrive/Clinical-Reasoning/kenya-clinical-reasoning-challenge20250407-27832-p30dso.zip
  inflating: manifest-37ac1cb3002e2d25fa612ef468101f6c20250407-27832-1ynal5z.json  
  inflating: train_raw.csv           
  inflating: test_raw.csv            
  inflating: test.csv                
  inflating: train.csv               
  inflating: Kenya medical vignettes data dictionary.docx  
  inflating: SampleSubmission.csv    


## Imports

In [3]:
import os
import re
import torch
import random
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from transformers import T5Tokenizer, T5ForConditionalGeneration, Trainer, TrainingArguments
from datasets import Dataset

from sklearn.metrics import accuracy_score

os.environ['TOKENIZERS_PARALLELISM'] = 'false'
os.environ['WANDB_DISABLED'] = 'true'

## Data Exploration

In [4]:
raw_train = pd.read_csv('train_raw.csv')
raw_train.head()

Unnamed: 0,Master_Index,County,Health level,Years of Experience,Prompt,Nursing Competency,Clinical Panel,Clinician,GPT4.0,LLAMA,GEMINI,DDX SNOMED
0,ID_VBWWP,Uasin Gishu,Sub-county Hospitals and Nursing Homes,18.0,I am a nurse with 18 years of experience in Ge...,Pediatric Emergency Burns,SURGERY,Summary:\nA 4 year old with 5% superficial bur...,Given your vast experience as a nurse in Uasin...,1. Immediate treatment protocol for second-deg...,Here's a response addressing the questions reg...,288514009 | Burn involving 5 percent of body s...
1,ID_XMBBY,Uasin Gishu,National Referral Hospitals,17.0,I am a nurse with 17 years of experience in Ge...,Child Health,PAEDIATRICS,Summary\n6-year-old present with vomiting and ...,Clinical Summary: • A 6-year-old girl w...,Based on the symptoms and signs you've describ...,"Based on the presentation, the 6-year-old girl...",420270002 | Ketoacidosis due to type 1 diabete...
2,ID_JZNZW,Kiambu,Sub-county Hospitals and Nursing Homes,12.0,I am a nurse with 12 years of experience in Ge...,General Emergency,INTERNAL MEDICINE,Summary\nA 47-year-old man presents with sever...,"In this case, you're dealing with a 47-year-ol...","Firstly, I must commend you on your thorough h...","This 47-year-old male presenting with severe, ...",13200003 | Peptic ulcer (disorder)\n25458004 |...
3,ID_QOQTK,Uasin Gishu,National Referral Hospitals,12.0,I am a nurse with 12 years of experience in Pr...,Critical Care,INTERNAL MEDICINE,SUMMARY\n\n72-year-old female with inability t...,"Given ER's clinical presentation and vitals, t...",to me with this query. Based on the informatio...,This 92-year-old female patient (ER) presents ...,14760008 | Constipation (finding)\n419284004 |...
4,ID_ZFJBM,Uasin Gishu,National Referral Hospitals,16.0,I am a nurse with 16 years of experience in Ge...,Adult Health,INTERNAL MEDICINE,"A 22 year old female presents with headache, d...",The 22-year-old female patient is presenting w...,Thank you for presenting this case. Based on t...,This 22-year-old female patient presents with ...,95874006 | Carbon monoxide poisoning from fire...


In [5]:
# Nurse's description of the case and questions
print(raw_train.iloc[0]['Prompt'])

I am a nurse with 18 years of experience in General nursing working in a Sub-county Hospitals and Nursing Homes in Uasin Gishu county in Kenya. A 4-year-old child presents to the emergency department with second-degree burns on the forearm after accidentally touching a hot stove. The child was playing in the kitchen when they reached out to touch the stove. The burns cover about 5% of the total body surface area. The child is alert and crying, with redness, blisters, and swelling on the affected area. The burns appear to be superficial to moderate in severity. The child is in mild pain, and there is no indication of airway or breathing distress. No other injuries are noted.
Questions:
1. What is the immediate treatment protocol for second-degree burns in paediatric patients?
2. Should any tetanus prophylaxis be considered in this case?
3. What follow-up care should be recommended for burn healing?


In [6]:
# Clinician's assessment and course of treatment
print(raw_train.iloc[0]['Clinician'])

Summary:
A 4 year old with 5% superficial burns. No other injuries

Immediate Management:

 * Paracetamol analgesics to to ensure child has minimal
 or no pain
 * Cleaning and frosting of wound with silver sulpha-fizika
* Topical prophylactic can be considered in this case
 * Good nutrition - high protein diet


In [7]:
# GPT-4.0 response to the same prompt
print(raw_train.iloc[0]['GPT4.0'])

Given your vast experience as a nurse in Uasin Gishu County, it's clear you're familiar with managing acute conditions, including burns. Here’s a detailed answer to your questions regarding the immediate and follow-up management of a pediatric patient with second-degree burns: 1. Immediate Treatment Protocol for Second-Degree Burns in Pediatric Patients: For second-degree burns (partial-thickness burns), the focus is on pain management, wound care, and preventing infection. Here’s a step-by-step approach: a. Ensure Safety and Remove the Child from the Source of Burn Injury: •        First, ensure that the child is no longer in contact with the hot stove or any other source of heat. •        Stop the burning process by moving the child away from the heat source. b. Cool the Burn: •        Immediately cool the burn area with lukewarm running water (ideally 15-20 minutes). Cold or icy water should be avoided as it may cause further tissue damage. •        Do not use ice or very cold water

In [8]:
# Type of case the clinician is dealing with (first SNOMED description)

raw_train.iloc[0]['DDX SNOMED'].split('|')[1].strip()

'Burn involving 5 percent of body surface (disorder)'
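
The cell above keeps only the first description. Judging from the rows shown earlier (for example `13200003 | Peptic ulcer (disorder)\n25458004 |...`), the `DDX SNOMED` field seems to hold one `code | description` pair per line. Below is a minimal sketch that collects every pair, assuming that layout holds for all rows; `parse_snomed` is a hypothetical helper and the example string is stitched together from two entries seen above.

```python
def parse_snomed(field):
    """Split a 'code | description' field (one pair per line) into tuples."""
    if not isinstance(field, str):
        return []  # missing values such as NaN yield no diagnoses
    pairs = []
    for line in field.splitlines():
        code, sep, description = line.partition('|')
        # Keep only lines that really contain a separator and a code
        if sep and code.strip():
            pairs.append((code.strip(), description.strip()))
    return pairs


# Illustrative string only: combines two entries from different rows
example = (
    '13200003 | Peptic ulcer (disorder)\n'
    '288514009 | Burn involving 5 percent of body surface (disorder)'
)
print(parse_snomed(example))
```

Applied with `raw_train['DDX SNOMED'].apply(parse_snomed)`, this would give a list of differential diagnoses per case instead of a single string.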

## Baseline Solution

In [9]:
prompt = raw_train.iloc[0]['Prompt']
print(prompt)

I am a nurse with 18 years of experience in General nursing working in a Sub-county Hospitals and Nursing Homes in Uasin Gishu county in Kenya. A 4-year-old child presents to the emergency department with second-degree burns on the forearm after accidentally touching a hot stove. The child was playing in the kitchen when they reached out to touch the stove. The burns cover about 5% of the total body surface area. The child is alert and crying, with redness, blisters, and swelling on the affected area. The burns appear to be superficial to moderate in severity. The child is in mild pain, and there is no indication of airway or breathing distress. No other injuries are noted.
Questions:
1. What is the immediate treatment protocol for second-degree burns in paediatric patients?
2. Should any tetanus prophylaxis be considered in this case?
3. What follow-up care should be recommended for burn healing?


In [10]:
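def preprocess_text(text: str) -> str:
  """Lowercases the text, replaces punctuation with spaces and collapses whitespace."""
  # Missing values (e.g. NaN) become a single space so tokenization still works
  if not isinstance(text, str):
    return " "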

  text = text.lower().replace('\n', ' ')
  text = ''.join([c if c.isalnum() or c.isspace() else ' ' for c in text])

  return ' '.join(text.split())
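
One side effect of this cleaning is that symbols carrying clinical meaning are dropped along with the punctuation: `5%` becomes `5` and a reading such as `37.5` becomes `37 5`. The sketch below reproduces the function so it runs on its own and compares it with a lighter variant. `preprocess_text_light` and the sample sentence are hypothetical, and whether keeping these symbols helps the model is an assumption that would need testing.

```python
import re


def preprocess_text(text):
    """Same logic as the notebook's function, repeated so this runs alone."""
    if not isinstance(text, str):
        return " "
    text = text.lower().replace('\n', ' ')
    text = ''.join(c if c.isalnum() or c.isspace() else ' ' for c in text)
    return ' '.join(text.split())


def preprocess_text_light(text):
    """Hypothetical variant that keeps %, ., / and - inside the text."""
    if not isinstance(text, str):
        return " "
    # Replace everything except letters, digits, whitespace and % . / -
    text = re.sub(r'[^a-z0-9%./\-\s]', ' ', text.lower())
    return ' '.join(text.split())


sample = "Burns cover about 5% of TBSA.\nTemp: 37.5 C, BP 120/80!"
print(preprocess_text(sample))        # burns cover about 5 of tbsa temp 37 5 c bp 120 80
print(preprocess_text_light(sample))  # burns cover about 5% of tbsa. temp 37.5 c bp 120/80
```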

In [11]:
raw_train['Prompt_processed'] = raw_train['Prompt'].apply(preprocess_text)
raw_train['Clinician_processed'] = raw_train['Clinician'].apply(preprocess_text)
# raw_train['GPT4.0'] = raw_train['GPT4.0'].apply(preprocess_text)

In [12]:
raw_train.head()

Unnamed: 0,Master_Index,County,Health level,Years of Experience,Prompt,Nursing Competency,Clinical Panel,Clinician,GPT4.0,LLAMA,GEMINI,DDX SNOMED,Prompt_processed,Clinician_processed
0,ID_VBWWP,Uasin Gishu,Sub-county Hospitals and Nursing Homes,18.0,I am a nurse with 18 years of experience in Ge...,Pediatric Emergency Burns,SURGERY,Summary:\nA 4 year old with 5% superficial bur...,Given your vast experience as a nurse in Uasin...,1. Immediate treatment protocol for second-deg...,Here's a response addressing the questions reg...,288514009 | Burn involving 5 percent of body s...,i am a nurse with 18 years of experience in ge...,summary a 4 year old with 5 superficial burns ...
1,ID_XMBBY,Uasin Gishu,National Referral Hospitals,17.0,I am a nurse with 17 years of experience in Ge...,Child Health,PAEDIATRICS,Summary\n6-year-old present with vomiting and ...,Clinical Summary: • A 6-year-old girl w...,Based on the symptoms and signs you've describ...,"Based on the presentation, the 6-year-old girl...",420270002 | Ketoacidosis due to type 1 diabete...,i am a nurse with 17 years of experience in ge...,summary 6 year old present with vomiting and a...
2,ID_JZNZW,Kiambu,Sub-county Hospitals and Nursing Homes,12.0,I am a nurse with 12 years of experience in Ge...,General Emergency,INTERNAL MEDICINE,Summary\nA 47-year-old man presents with sever...,"In this case, you're dealing with a 47-year-ol...","Firstly, I must commend you on your thorough h...","This 47-year-old male presenting with severe, ...",13200003 | Peptic ulcer (disorder)\n25458004 |...,i am a nurse with 12 years of experience in ge...,summary a 47 year old man presents with severe...
3,ID_QOQTK,Uasin Gishu,National Referral Hospitals,12.0,I am a nurse with 12 years of experience in Pr...,Critical Care,INTERNAL MEDICINE,SUMMARY\n\n72-year-old female with inability t...,"Given ER's clinical presentation and vitals, t...",to me with this query. Based on the informatio...,This 92-year-old female patient (ER) presents ...,14760008 | Constipation (finding)\n419284004 |...,i am a nurse with 12 years of experience in pr...,summary 72 year old female with inability to w...
4,ID_ZFJBM,Uasin Gishu,National Referral Hospitals,16.0,I am a nurse with 16 years of experience in Ge...,Adult Health,INTERNAL MEDICINE,"A 22 year old female presents with headache, d...",The 22-year-old female patient is presenting w...,Thank you for presenting this case. Based on t...,This 22-year-old female patient presents with ...,95874006 | Carbon monoxide poisoning from fire...,i am a nurse with 16 years of experience in ge...,a 22 year old female presents with headache di...


In [13]:
raw_train.iloc[0]['Prompt_processed']

'i am a nurse with 18 years of experience in general nursing working in a sub county hospitals and nursing homes in uasin gishu county in kenya a 4 year old child presents to the emergency department with second degree burns on the forearm after accidentally touching a hot stove the child was playing in the kitchen when they reached out to touch the stove the burns cover about 5 of the total body surface area the child is alert and crying with redness blisters and swelling on the affected area the burns appear to be superficial to moderate in severity the child is in mild pain and there is no indication of airway or breathing distress no other injuries are noted questions 1 what is the immediate treatment protocol for second degree burns in paediatric patients 2 should any tetanus prophylaxis be considered in this case 3 what follow up care should be recommended for burn healing'

In [14]:
# Load model and tokenizer
tokenizer = T5Tokenizer.from_pretrained('t5-small')
model = T5ForConditionalGeneration.from_pretrained('t5-small')

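# Report the available GPUs. Trainer handles multi-GPU execution itself, and
# wrapping the model in torch.nn.DataParallel here would hide the
# .generate() and .device attributes used later for inference.
n_gpus = torch.cuda.device_count()
if n_gpus > 1:
  print(f'Using {n_gpus} GPUs')
elif n_gpus == 1:
  print('Using a single GPU')
else:
  print('No GPU found, using CPU')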

Using a single GPU


In [15]:
# Tokenization function that returns only needed columns
def tokenize_function(examples):
  # Encoder inputs: task prefix + cleaned prompt, padded/truncated to 512 tokens
  inputs = tokenizer(
      ['summarize: ' + text for text in examples['Prompt_processed']],
      max_length=512,
      padding='max_length',
      truncation=True
  )

  # Decoder targets: `text_target` replaces the deprecated as_target_tokenizer()
  labels = tokenizer(
      text_target=examples['Clinician_processed'],
      max_length=256,
      padding='max_length',
      truncation=True
  )

  return {
      'input_ids': inputs['input_ids'],
      'attention_mask': inputs['attention_mask'],
      'labels': labels['input_ids']
  }
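
The labels above are padded to `max_length=256` with the pad token, so as written the padded positions also contribute to the loss. Hugging Face seq2seq models skip label positions set to `-100`, and `compute_metrics` later in this notebook already maps `-100` back to the pad id. Below is a minimal sketch of that masking step in plain Python. `mask_pad_tokens` is a hypothetical helper, the token ids are made up, and pad id `0` is the T5 default. Applying it would change the loss values reported further down.

```python
def mask_pad_tokens(label_ids, pad_token_id, ignore_index=-100):
    """Replace padding ids in each label sequence with the ignore index."""
    return [
        [tok if tok != pad_token_id else ignore_index for tok in seq]
        for seq in label_ids
    ]


# Made-up ids; 1 is T5's end-of-sequence id and 0 its pad id
labels = [[363, 19, 8, 1, 0, 0], [71, 1, 0, 0, 0, 0]]
print(mask_pad_tokens(labels, pad_token_id=0))
```

Inside `tokenize_function` this would amount to returning `mask_pad_tokens(labels['input_ids'], tokenizer.pad_token_id)` as the `'labels'` entry.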

In [16]:
# Create dataset and apply tokenization
train_dataset = Dataset.from_pandas(raw_train[['Prompt_processed', 'Clinician_processed']])
train_dataset = train_dataset.map(tokenize_function, batched=True, remove_columns=['Prompt_processed', 'Clinician_processed'])

Map:   0%|          | 0/400 [00:00<?, ? examples/s]



In [17]:
train_dataset

Dataset({
    features: ['input_ids', 'attention_mask', 'labels'],
    num_rows: 400
})

In [18]:
# Split the dataset into train and validation sets
train_val_split = train_dataset.train_test_split(test_size=0.01)
train_dataset = train_val_split['train']
eval_dataset = train_val_split['test']
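
With `test_size=0.01` the validation set holds only 4 of the 400 rows, so a single example moves any metric by 25 percentage points, and without a seed the split differs on every run. Below is a minimal sketch of a seeded, larger hold-out in plain Python. `split_indices`, the 10% fraction and the seed value are illustrative assumptions, not choices made in this notebook.

```python
import random


def split_indices(n_rows, val_fraction, seed):
    """Return (train_indices, val_indices) from a seeded shuffle."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # local RNG, global state untouched
    n_val = max(1, round(n_rows * val_fraction))
    return indices[n_val:], indices[:n_val]


train_idx, val_idx = split_indices(400, 0.10, seed=42)
print(len(train_idx), len(val_idx))  # 360 40
```

The same effect should be available directly through `train_dataset.train_test_split(test_size=0.1, seed=42)`, or by passing the index lists to `Dataset.select`.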

In [32]:
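def compute_metrics(eval_pred):
    """
    Calculates exact-match accuracy between decoded predictions and references.

    Predictions are the argmax tokens from a teacher-forced forward pass, and a
    prediction only counts as correct when the whole decoded string equals the
    reference string.

    Args:
        eval_pred: Predictions and labels from the evaluation set.

    Returns:
        dict: A dictionary containing the calculated accuracy.
    """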
    predictions, labels = eval_pred
    # Assuming predictions is a tuple of (logits, _)
    # Extract logits which should have shape (batch_size, sequence_length, vocab_size)
    logits = predictions[0]
    # Get predicted token ids by taking argmax along the vocab dimension
    predicted_ids = np.argmax(logits, axis=-1)

    decoded_preds = tokenizer.batch_decode(predicted_ids, skip_special_tokens=True)

    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    # Compute and return accuracy
    acc = accuracy_score(decoded_labels, decoded_preds)
    return {'accuracy': acc}
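
Because `accuracy_score` compares whole decoded strings, a free-text response scores only when it matches the reference exactly, which is consistent with the `0.0` accuracy in the training log. Below is a minimal sketch of a softer token-overlap score (unigram F1, similar in spirit to ROUGE-1). `unigram_f1` is a hypothetical helper and the choice of metric is an assumption, since the competition's official metric is not stated in this notebook.

```python
from collections import Counter


def unigram_f1(prediction, reference):
    """F1 over whitespace tokens shared by prediction and reference."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


pred = "a 4 year old with burns"
ref = "a 4 year old with 5 superficial burns no other injuries"
print(round(unigram_f1(pred, ref), 3))  # 0.706
```

Inside `compute_metrics`, the mean of `unigram_f1(p, l)` over `zip(decoded_preds, decoded_labels)` could be returned alongside the accuracy.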

In [33]:
training_args = TrainingArguments(
    output_dir='./results',
    num_train_epochs=5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,
    evaluation_strategy='epoch',
    save_strategy='epoch',
    logging_strategy='steps',
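    logging_steps=100,  # larger than the 60 total optimization steps in this run
    save_total_limit=2,
    learning_rate=2e-5,
    weight_decay=0.01,
    warmup_steps=500,  # also larger than the 60 total steps, so warmup never completes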
    fp16=True,
    greater_is_better=True,
    remove_unused_columns=False,
    logging_dir='./logs',
    report_to='none',
    disable_tqdm=False,
    logging_first_step=True,
    prediction_loss_only=False
)



In [34]:
# Create Trainer with progress callback
trainer = Trainer(
  model=model,
  args=training_args,
  train_dataset=train_dataset,
  eval_dataset=eval_dataset,
  compute_metrics=compute_metrics  # Add metrics computation
)

In [35]:
# Start training with clear progress indication
print("Starting training...")
print(f"Training on {len(train_dataset)} samples, validating on {len(eval_dataset)} samples")
print(f"Batch size: {training_args.per_device_train_batch_size * training_args.gradient_accumulation_steps}")
print(f"Total optimization steps: {len(train_dataset) // (training_args.per_device_train_batch_size * training_args.gradient_accumulation_steps) * training_args.num_train_epochs}")

trainer.train()

Starting training...
Training on 396 samples, validating on 4 samples
Batch size: 32
Total optimization steps: 60


Epoch,Training Loss,Validation Loss,Accuracy
1,10.7696,9.362108,0.0
2,10.7696,9.265953,0.0
3,10.7696,9.083204,0.0
4,10.7696,8.609651,0.0


TrainOutput(global_step=60, training_loss=9.994180806477864, metrics={'train_runtime': 99.3633, 'train_samples_per_second': 19.927, 'train_steps_per_second': 0.604, 'total_flos': 249028914708480.0, 'train_loss': 9.994180806477864, 'epoch': 4.646464646464646})
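
The run above takes 60 optimization steps while `warmup_steps=500`. The loss barely moves, and this mismatch is one likely reason. Below is a minimal sketch of the arithmetic, assuming the Trainer's default linear schedule, in which the learning rate rises linearly from 0 to `learning_rate` over the warmup steps. Both helper functions are hypothetical.

```python
def optimizer_steps(n_samples, per_device_batch, grad_accum, epochs):
    """Same estimate as the print statement in the training cell."""
    return n_samples // (per_device_batch * grad_accum) * epochs


def linear_warmup_lr(step, base_lr, warmup_steps):
    """Learning rate during linear warmup, capped at base_lr afterwards."""
    return base_lr * min(1.0, step / max(1, warmup_steps))


total = optimizer_steps(n_samples=396, per_device_batch=4, grad_accum=8, epochs=5)
peak = linear_warmup_lr(total, base_lr=2e-5, warmup_steps=500)
print(total)          # 60
print(f"{peak:.2e}")  # 2.40e-06
```

Under these assumptions the learning rate reaches about 12% of the configured `2e-5` by the final step. Likewise, `logging_steps=100` exceeds the 60 steps, which is probably why the Training Loss column repeats the value logged at the first step.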

In [39]:
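# Generate predictions in mini-batches to keep memory use bounded
def generate_predictions(prompts, batch_size=16):
  model.eval()  # disable dropout, which may still be active after training
  predictions = []

  for start in range(0, len(prompts), batch_size):
    batch = prompts[start:start + batch_size]
    inputs = tokenizer(
        ["summarize: " + text for text in batch],
        max_length=512,
        padding=True,
        truncation=True,
        return_tensors='pt'
    ).to(model.device)

    with torch.inference_mode():
      outputs = model.generate(
        input_ids=inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        max_length=128
      )

    predictions.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))

  return predictions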

In [40]:
# Save predictions
test_df = pd.read_csv('test_raw.csv')
# Apply the same preprocessing that was used on the training prompts
test_prompts = test_df['Prompt'].apply(preprocess_text).tolist()
test_df['Clinician'] = generate_predictions(test_prompts)
test_df[['Master_Index', 'Clinician']].to_csv('submission.csv', index=False)
print("\nTraining complete and predictions saved!")


Training complete and predictions saved!
