# Sepsis Prediction with DQN+LSTM - GPU Training

This notebook sets up GPU training for the sepsis prediction model, which uses a DQN agent with a bidirectional LSTM and an attention mechanism.

Run this notebook in Google Colab with a GPU runtime (Runtime > Change runtime type). Training on a CPU runtime works but is considerably slower.

## Setup and Installation

In [None]:
# Check if GPU is available
!nvidia-smi

# Install dependencies
# The torch specifier is quoted so the shell does not treat ">" as an output redirect
!pip install stable-baselines3==2.6.0 gymnasium==1.1.1 "torch>=2.0.0" tqdm matplotlib pandas numpy
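
`nvidia-smi` only confirms that the runtime has a GPU attached. As a minimal sketch, the check below also asks PyTorch whether it can use the GPU. The helper name `detect_device` is hypothetical and not part of the repository; it falls back to `"cpu"` if PyTorch is not importable.

```python
import shutil


def detect_device() -> str:
    """Return "cuda" if PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # installed by the pip command above
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = detect_device()
has_nvidia_smi = shutil.which("nvidia-smi") is not None
print(f"device: {device}, nvidia-smi on PATH: {has_nvidia_smi}")
```

If this prints `device: cpu` on Colab, the runtime type is probably not set to GPU.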

In [None]:
# Clone the repository (replace YOUR_USERNAME with the GitHub account that hosts it)
!git clone https://github.com/YOUR_USERNAME/RL-Sepsis-Prediction.git
%cd RL-Sepsis-Prediction

## Upload Data

The cells below place the patient data files under `data/training_setA`. Either copy them from Google Drive (Option 1) or upload a zip archive from your computer (Option 2).

In [None]:
# Option 1: Mount Google Drive if your data is there
from google.colab import drive
drive.mount('/content/drive')

# Create data directories
!mkdir -p data/training_setA

# Copy data from Drive (adjust path as needed)
# !cp -r /content/drive/MyDrive/path/to/data/training_setA/* data/training_setA/

In [None]:
# Option 2: Upload directly from your computer
# (Uncomment and run this cell if you want to upload files directly)

# from google.colab import files
# import os
# import zipfile

# # Upload a zip file containing the data
# uploaded = files.upload()  # This will prompt you to select files

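# # Extract the uploaded zip file into the current directory
# # (assumes the archive already contains the data/training_setA/ folder)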
# zip_name = list(uploaded.keys())[0]
# with zipfile.ZipFile(zip_name, 'r') as zip_ref:
#     zip_ref.extractall('.')

# # Check the contents of the data directory
# !ls -la data/training_setA
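
Whichever option is used, it can help to confirm that the files actually landed in `data/training_setA` before starting a long training run. The sketch below counts files in that directory. The `*.psv` pattern is an assumption (the notebook does not state the file extension), and `count_patient_files` is a hypothetical helper, so adjust both to match your data.

```python
from pathlib import Path


def count_patient_files(data_dir: str, pattern: str = "*.psv") -> int:
    """Count files matching `pattern` in `data_dir`; return 0 if the directory is missing."""
    root = Path(data_dir)
    if not root.is_dir():
        return 0
    return sum(1 for p in root.glob(pattern) if p.is_file())


# "*.psv" is an assumed extension; change it if your files differ
n_files = count_patient_files("data/training_setA")
print(f"Found {n_files} patient files in data/training_setA")
if n_files == 0:
    print("No files found: check the copy or upload step before training.")
```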

## Configure Training Parameters

You can adjust these parameters based on your needs and the available GPU memory.

In [None]:
# Set training parameters
import os

# Create directories for saving models
os.makedirs("./models", exist_ok=True)

# You can adjust these parameters
MAX_FILES = None  # None uses all files; set a small number for a quick test run
TOTAL_TIMESTEPS = 200000  # More timesteps lengthen training and may improve the policy (e.g., 500000 or 1000000)
BALANCE_RATIO = 0.4  # Target ratio of sepsis to non-sepsis patients
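
To make the effect of `BALANCE_RATIO` concrete, here is a minimal sketch of one way a target ratio could be reached: keep every sepsis patient and randomly undersample non-sepsis patients. This is an assumption for illustration only. It reads the ratio as positives divided by negatives, and `train_dqn_lstm.py` may balance the data differently (for example by oversampling). The function `undersample_to_ratio` is hypothetical.

```python
import random


def undersample_to_ratio(labels, ratio, seed=0):
    """Return sorted indices that keep all positives (label 1) and enough
    negatives (label 0) so that positives / negatives is roughly `ratio`.
    This interpretation of the ratio is an assumption."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    if not pos or ratio <= 0:
        return sorted(pos + neg)  # nothing to balance against
    n_neg = min(len(neg), round(len(pos) / ratio))
    rng = random.Random(seed)  # fixed seed for a reproducible subset
    return sorted(pos + rng.sample(neg, n_neg))


# Toy example: 4 sepsis patients among 100
labels = [1] * 4 + [0] * 96
kept = undersample_to_ratio(labels, ratio=0.4)
n_pos = sum(labels[i] for i in kept)
print(f"kept {n_pos} positives and {len(kept) - n_pos} negatives")
```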

## Run Training

In [None]:
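# Execute the training script
# --max_files is passed only when MAX_FILES is a number; otherwise the literal
# string "None" would reach the script. This assumes the script uses all files
# when the flag is omitted.
max_files_arg = f"--max_files {MAX_FILES}" if MAX_FILES is not None else ""
!python train_dqn_lstm.py $max_files_arg --total_timesteps $TOTAL_TIMESTEPS --balance_ratio $BALANCE_RATIO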

## Download the Trained Model

After training is complete, you can download the model and checkpoint files.

In [None]:
# Zip the models for downloading
!zip -r models.zip ./models dqn_lstm_sepsis_model_final.zip

# Download the zip file
from google.colab import files
files.download('models.zip')

## Visualize Training Progress

The cell below lays out a plot of evaluation metrics against checkpoint timesteps. The plotted values are random placeholders until you substitute real evaluation results.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import re
import glob

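# Pair each checkpoint with the timestep in its filename, skipping files
# (such as a final model) without a "_<N>_steps" suffix
matches = [(re.search(r'_([0-9]+)_steps', f), f) for f in glob.glob('./models/dqn_lstm_sepsis_*.zip')]
checkpoints = sorted((int(m.group(1)), f) for m, f in matches if m)

# Sorted numerically by timestep (a string sort would put 100000 before 20000)
timesteps = [steps for steps, _ in checkpoints]
checkpoint_files = [f for _, f in checkpoints]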

# Evaluating each checkpoint (accuracy, precision, recall, etc.) is not
# implemented here. The np.random values below are placeholders that only
# demonstrate the plot layout; replace them with real metrics.

# Example plot (placeholder data)
plt.figure(figsize=(12, 6))
plt.plot(timesteps, np.random.rand(len(timesteps)) * 0.5 + 0.3, 'o-', label='Accuracy')
plt.plot(timesteps, np.random.rand(len(timesteps)) * 0.3 + 0.1, 'o-', label='Precision')
plt.plot(timesteps, np.random.rand(len(timesteps)) * 0.4 + 0.4, 'o-', label='Recall')
plt.xlabel('Training Timesteps')
plt.ylabel('Metric Value')
plt.title('Training Progress (placeholder data)')
plt.legend()
plt.grid(True)
plt.show()
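
To replace the random placeholders, each checkpoint needs real metrics computed from its predictions. As a minimal sketch, the function below computes accuracy, precision, and recall from true and predicted 0/1 labels. The name `binary_metrics` is hypothetical, and producing `y_pred` from a loaded checkpoint is left out because it depends on the repository's environment and evaluation code.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for 0/1 labels (1 = sepsis)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,  # 0.0 when nothing predicted positive
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,      # 0.0 when there are no positives
    }


# Toy labels for illustration only (not real patient data)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```

Collecting one such dictionary per checkpoint gives the three series to pass to `plt.plot` in place of the `np.random` values.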