# NeuroFetal AI - Modular Training Pipeline

**Version 3.1** - TFLite Auto-Push

This notebook is the central launcher for the modular scripts in the codebase. It keeps local development and cloud training consistent by running the same scripts in both environments.

### Steps:
1.  **Setup Environment**: Clone repo & install dependencies.
2.  **Data Ingestion**: Process raw PhysioNet data.
3.  **Train**: Run the deep learning training pipeline.
4.  **Evaluate**: Run ensemble and uncertainty metrics.
5.  **Serve**: Launch the dashboard (optional).
6.  **Deploy**: Convert to TFLite and push to GitHub.
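
The ordering above can be sketched as a small runner that executes each script in turn and stops at the first failure. This is only an illustration: the script paths are the ones used in the cells of this notebook, but `PIPELINE` and `run_pipeline` are hypothetical names, and the optional Serve step is left out because it blocks while the dashboard is running.

```python
import subprocess
import sys

# Script paths as they appear in this notebook; the runner itself is illustrative.
PIPELINE = [
    ("Data Ingestion", "Code/scripts/data_ingestion.py"),
    ("Train", "Code/scripts/train.py"),
    ("Evaluate (ensemble)", "Code/scripts/evaluate_ensemble.py"),
    ("Evaluate (uncertainty)", "Code/scripts/evaluate_uncertainty.py"),
    ("Deploy (TFLite)", "Code/scripts/convert_to_tflite.py"),
]


def run_pipeline(steps=PIPELINE, runner=subprocess.run):
    """Run each script in order and stop at the first non-zero exit code.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, script in steps:
        print(f"[{name}] python {script}")
        result = runner([sys.executable, script])
        if result.returncode != 0:
            print(f"[{name}] failed with exit code {result.returncode}; stopping.")
            break
        completed.append(name)
    return completed
```

Stopping early avoids, for example, converting a stale model to TFLite after a failed training run.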

## 1. Setup Environment

In [1]:
from getpass import getpass
import os

# 1. GitHub Authentication
GITHUB_REPO = "Krishna200608/NeuroFetal-AI"
print("Please enter your GitHub Personal Access Token (PAT):")
GITHUB_TOKEN = getpass()

os.environ['GITHUB_TOKEN'] = GITHUB_TOKEN
os.environ['GITHUB_REPO'] = GITHUB_REPO

Please enter your GitHub Personal Access Token (PAT):
··········


In [2]:
# 2. Clone Repository
import shutil
import os

# CRITICAL FIX: Reset to /content before deleting the repo folder
# This prevents 'shell-init: error retrieving current directory'
try:
    os.chdir("/content")
except OSError:
    # /content only exists on Colab; ignore the error elsewhere
    pass

# Clean up any previous clone to avoid conflicts
if os.path.exists("/content/NeuroFetal-AI"):
    shutil.rmtree("/content/NeuroFetal-AI")

print("Cloning repository...")
!git clone https://{GITHUB_TOKEN}@github.com/{GITHUB_REPO}.git

# Set paths
os.chdir("/content/NeuroFetal-AI")
print("Cloned successfully!")

Cloning repository...
Cloning into 'NeuroFetal-AI'...
remote: Enumerating objects: 1721, done.
remote: Counting objects: 100% (205/205), done.
remote: Compressing objects: 100% (137/137), done.
remote: Total 1721 (delta 145), reused 108 (delta 61), pack-reused 1516 (from 3)
Receiving objects: 100% (1721/1721), 232.92 MiB | 22.41 MiB/s, done.
Resolving deltas: 100% (920/920), done.
Updating files: 100% (1182/1182), done.
Cloned successfully!


In [3]:
# 3. Install Dependencies
print("Installing libraries...")
!pip install -q wfdb shap scipy imbalanced-learn pyngrok filterpy scikit-learn matplotlib seaborn pandas numpy tensorflow streamlit plotly python-dotenv
print("Dependencies installed.")

Installing libraries...
  Preparing metadata (setup.py) ... done
  Building wheel for filterpy (setup.py) ... done
ERROR: pi

## 2. Data Ingestion
Processes raw `.dat`/`.hea` files into clean `.npy` arrays for training.
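
To make the input format more concrete, here is a minimal sketch that reads the record line of a WFDB `.hea` header (record name, number of signals, sampling frequency, number of samples). It is not taken from `data_ingestion.py`; `parse_record_line` and `SAMPLE_HEADER` are hypothetical, and the header values are invented for illustration. In practice the `wfdb` package installed above would normally be used to read records.

```python
def parse_record_line(header_text):
    """Parse the record line (first non-comment line) of a WFDB .hea header."""
    for line in header_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            fields = line.split()
            break
    else:
        raise ValueError("header has no record line")
    if len(fields) < 2:
        raise ValueError("record line needs a name and a signal count")

    info = {"record_name": fields[0].split("/")[0], "n_signals": int(fields[1])}
    if len(fields) > 2:
        # The sampling frequency may carry a "/counter_frequency" suffix.
        info["fs"] = float(fields[2].split("/")[0])
    if len(fields) > 3:
        info["n_samples"] = int(fields[3])
    return info


# Illustrative header only; the values are made up, not taken from the dataset.
SAMPLE_HEADER = """rec0001 2 4 19200
rec0001.dat 16 100/bpm 12 0 0 0 0 FHR
rec0001.dat 16 100/nd 12 0 0 0 0 UC
# trailing comment lines are ignored
"""

info = parse_record_line(SAMPLE_HEADER)
duration_min = info["n_samples"] / info["fs"] / 60  # samples / Hz / 60
print(info, f"{duration_min:.0f} min")
```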

In [None]:
# Run the data ingestion script
!python Code/scripts/data_ingestion.py

## 3. Training
Train the Tri-Modal Attention Fusion ResNet using 5-Fold Cross-Validation.
This script automatically handles:
*   Class Balancing (SMOTE)
*   Feature Extraction (CSP)
*   Model Checkpointing (saving best `.keras` files)
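
As a rough illustration of the cross-validation step, the sketch below builds stratified folds using only the standard library. It is an assumption about the general technique, not a copy of what `train.py` does, and `stratified_kfold` is a hypothetical helper. Oversampling such as SMOTE is usually applied to the training indices of each fold only, so that synthetic samples never leak into the validation split.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, n_splits=5, seed=42):
    """Yield (train_idx, val_idx) pairs with class proportions roughly preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a similar share.
        for pos, idx in enumerate(indices):
            folds[pos % n_splits].append(idx)

    for k in range(n_splits):
        val_idx = sorted(folds[k])
        train_idx = sorted(
            i for j, fold in enumerate(folds) if j != k for i in fold
        )
        yield train_idx, val_idx


# Toy imbalanced labels: 40 negatives, 10 positives.
toy_labels = [0] * 40 + [1] * 10
for fold, (train_idx, val_idx) in enumerate(stratified_kfold(toy_labels)):
    positives = sum(toy_labels[i] for i in val_idx)
    print(f"fold {fold}: {len(train_idx)} train, {len(val_idx)} val, {positives} positive")
```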

In [None]:
# Run the main training script
!python Code/scripts/train.py

## 4. Advanced Evaluation
Generate metrics for:
1.  **Ensemble Performance**: Rank Averaging across folds (AUC maximization).
2.  **Uncertainty Quantification**: Monte Carlo Dropout confidence scores.
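
Rank averaging suits AUC because AUC depends only on how samples are ordered, not on the raw score scale, so converting each fold model's scores to ranks removes calibration differences between models before they are combined. The following is a minimal standard-library sketch of the idea; `average_ranks` and `rank_average` are hypothetical names, not functions from `evaluate_ensemble.py`.

```python
def average_ranks(scores):
    """Return 1-based ranks of scores; tied scores share the mean of their positions."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def rank_average(model_scores):
    """Average the normalized ranks of several models over the same samples."""
    n_samples = len(model_scores[0])
    combined = [0.0] * n_samples
    for scores in model_scores:
        for i, rank in enumerate(average_ranks(scores)):
            combined[i] += rank / n_samples
    return [total / len(model_scores) for total in combined]
```

Calling `rank_average([[0.1, 0.4, 0.8], [0.30, 0.20, 0.90]])` gives the third sample the highest combined score, since both models rank it last in ascending order, while the first two samples tie.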

In [None]:
print("Running Ensemble Evaluation (Rank Averaging)...")
!python Code/scripts/evaluate_ensemble.py

print("\nRunning Uncertainty Quantification...")
!python Code/scripts/evaluate_uncertainty.py
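
Monte Carlo Dropout estimates uncertainty by running the same input through the network several times with dropout still active and summarizing the spread of the outputs. The sketch below shows only the summary step and uses a plain callable in place of the model; `mc_dropout_summary` is a hypothetical helper, not code from `evaluate_uncertainty.py`. With a Keras model, the stochastic pass would typically be `model(x, training=True)`.

```python
import math
import statistics


def mc_dropout_summary(stochastic_predict, x, n_passes=50):
    """Summarize repeated stochastic forward passes for a single input.

    stochastic_predict(x) must return a probability in [0, 1] and is expected
    to vary between calls (dropout left active at inference time).
    """
    samples = [stochastic_predict(x) for _ in range(n_passes)]
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)

    # Binary entropy of the mean probability, in bits (clipped to avoid log(0)).
    eps = 1e-12
    p = min(max(mean, eps), 1 - eps)
    entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return {"mean": mean, "std": std, "entropy": entropy}
```

A low standard deviation together with a mean far from 0.5 suggests a confident prediction; a high standard deviation or an entropy close to 1 bit flags a case that may deserve manual review.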

## 5. Launch Dashboard (Optional)
Run the Streamlit app directly from Colab using `ngrok`.
**Note**: You need an `NGROK_AUTH_TOKEN` set in your `.env` or pasted below.

In [None]:
!pip install streamlit plotly python-dotenv pyngrok

In [None]:
# Write the ngrok token to Code/.env (note: this overwrites any existing Code/.env)
auth_token = getpass("Enter ngrok auth token (leave blank if it is already set in .env): ")

if auth_token:
    with open("Code/.env", "w") as f:
        f.write(f"NGROK_AUTH_TOKEN={auth_token}\n")

print("Launching Streamlit App...")
!python Code/run_app.py
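
The cell above opens `Code/.env` in write mode, which replaces any other keys already stored in that file. If the file may hold additional secrets, a small helper that updates a single key in place is one way to avoid that. This is a sketch under that assumption; `upsert_env_var` is a hypothetical name and is not part of the repository.

```python
from pathlib import Path


def upsert_env_var(env_path, key, value):
    """Set key=value in a .env file while leaving every other line untouched."""
    path = Path(env_path)
    lines = path.read_text().splitlines() if path.exists() else []
    new_line = f"{key}={value}"
    for i, line in enumerate(lines):
        # Compare only the part before the first "=" so values may contain "=".
        if line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            break
    else:
        lines.append(new_line)
    path.write_text("\n".join(lines) + "\n")
```

For example, `upsert_env_var("Code/.env", "NGROK_AUTH_TOKEN", auth_token)` would replace only the ngrok entry.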

## 6. Convert to TFLite (Mobile Deployment)
Convert the best trained model to TFLite format and **PUSH to GitHub** automatically.
You don't need to download anything manually.
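
The push step in this section can also be written so that it commits only when something is actually staged, instead of relying on `|| true`. The sketch below is one possible shape, not the notebook's own implementation; `commit_and_push` is a hypothetical helper, and the `run` parameter exists only so the function can be exercised without a real repository.

```python
import subprocess


def commit_and_push(paths, message, branch="main", run=subprocess.run):
    """Stage paths, commit only if something is staged, then push.

    Returns True if a commit was pushed, False if there was nothing to commit.
    """
    run(["git", "add", *paths], check=True)

    # `git diff --cached --quiet` exits 0 when nothing is staged, 1 otherwise.
    staged = run(["git", "diff", "--cached", "--quiet"])
    if staged.returncode == 0:
        print("No TFLite changes staged; skipping commit and push.")
        return False

    run(["git", "commit", "-m", message], check=True)
    run(["git", "push", "origin", branch], check=True)
    return True
```

Because `check=True` raises on a failed `git push`, a success message printed after this call is only reached when the push actually went through. The list of files could come from `glob.glob("Code/models/tflite/*.tflite")`.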

In [None]:
print("Converting model to TFLite...")
!python Code/scripts/convert_to_tflite.py

print("\nPushing TFLite models to GitHub...")
# Configure Git identity (required for commit)
!git config --global user.email "krishnasikheriya001@gmail.com"
!git config --global user.name "Krishna200608"

# Add TFLite models
!git add Code/models/tflite/*.tflite

# Commit and Push
# We use '|| true' to prevent error if nothing to commit (e.g. running twice)
!git commit -m "chore: Auto-update TFLite models from Colab" || true
!git push origin main

print("Push step finished. If git reported no errors above, the TFLite models are now on GitHub.")

Converting model to TFLite...
2026-02-05 18:24:32.018791: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2026-02-05 18:24:32.025352: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2026-02-05 18:24:32.045918: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1770315872.081656    1399 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1770315872.092100    1399 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1770315872.117349    1399 computation_placer.cc:177] computation placer already registered. Please 