<p align="left">
  <img src="https://raw.githubusercontent.com/python35/IINTS-SDK/main/img/iints_logo.png" width="160">
</p>

# Data Registry & Real-World Import
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/python35/IINTS-SDK/blob/main/examples/notebooks/08_Data_Registry_and_Import.ipynb)

**Goal:** discover official datasets, fetch the bundled sample set, and import it into a runnable scenario.

**You will learn how to:**
- List the datasets in the official registry
- Fetch the bundled sample dataset (offline)
- Convert a CGM CSV file into an IINTS scenario
- Run a short simulation from the imported data


In [1]:
from __future__ import annotations
from pathlib import Path
import os
import sys
import subprocess


def _find_repo_root() -> Path | None:
    """Walk up from the current directory to the first folder that looks like the repo root."""
    for root in [Path.cwd(), *Path.cwd().parents]:
        if (root / "pyproject.toml").exists() and (root / "src").exists():
            return root
    return None

repo_root = _find_repo_root()
if repo_root is None:
    # Not inside a checkout: the only supported fallback is Google Colab.
    try:
        import google.colab  # type: ignore
        in_colab = True
    except Exception:
        in_colab = False

    if not in_colab:
        raise RuntimeError("Run this notebook inside the IINTS-SDK repo or on Colab.")

    # On Colab, clone the repository once and reuse it on later runs.
    if not Path("IINTS-SDK").exists():
        subprocess.check_call(["git", "clone", "https://github.com/python35/IINTS-SDK.git"])
    repo_root = Path("IINTS-SDK").resolve()

# Work from the repo root and make the in-tree package importable.
os.chdir(repo_root)
sys.path.insert(0, str(repo_root / "src"))
print("Repo root:", repo_root)


Repo root: /home/runner/work/IINTS-SDK/IINTS-SDK


## Step 1: List datasets


In [2]:
from iints.data import load_dataset_registry

registry = load_dataset_registry()
[{"id": d["id"], "name": d["name"], "access": d["access"]} for d in registry]


[{'id': 'sample', 'name': 'IINTS Sample CGM (Bundled)', 'access': 'bundled'},
 {'id': 'aide_t1d',
  'name': 'AIDE T1D Public Dataset',
  'access': 'public-download'},
 {'id': 'pedap', 'name': 'PEDAP Public Dataset', 'access': 'public-download'},
 {'id': 'azt1d',
  'name': 'AZT1D: A Real-World Dataset for Type 1 Diabetes',
  'access': 'manual'},
 {'id': 'hupa_ucm', 'name': 'HUPA-UCM Diabetes Dataset', 'access': 'manual'},
 {'id': 'openaps_data_commons',
  'name': 'OpenAPS Data Commons',
  'access': 'request'},
 {'id': 'tidepool_bigdata',
  'name': 'Tidepool Big Data Donation',
  'access': 'request'},
 {'id': 'niddk_central',
  'name': 'NIDDK Central Repository',
  'access': 'request'},
 {'id': 't1d_exchange',
  'name': 'T1D Exchange Clinic Registry',
  'access': 'request'}]
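
The `access` field differs across entries, so it can be handy to group dataset ids by it. The following is a minimal sketch in plain Python. The list is copied from the output above rather than loaded from the registry, and `group_by_access` is an illustrative helper, not part of the SDK.

```python
from collections import defaultdict

# Entries copied from the registry listing above (id and access only).
entries = [
    {"id": "sample", "access": "bundled"},
    {"id": "aide_t1d", "access": "public-download"},
    {"id": "pedap", "access": "public-download"},
    {"id": "azt1d", "access": "manual"},
    {"id": "hupa_ucm", "access": "manual"},
    {"id": "openaps_data_commons", "access": "request"},
    {"id": "tidepool_bigdata", "access": "request"},
    {"id": "niddk_central", "access": "request"},
    {"id": "t1d_exchange", "access": "request"},
]


def group_by_access(datasets):
    """Map each access type to the list of dataset ids that use it."""
    grouped = defaultdict(list)
    for d in datasets:
        grouped[d["access"]].append(d["id"])
    return dict(grouped)


by_access = group_by_access(entries)
for access, ids in by_access.items():
    print(f"{access}: {', '.join(ids)}")
```

In the notebook, passing `registry` instead of `entries` should behave the same way, provided each entry carries `id` and `access` keys as the listing suggests.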

## Step 2: Fetch the bundled sample dataset (offline)


In [3]:
from pathlib import Path
from iints.data import fetch_dataset

output_dir = Path("data_packs/sample")
paths = fetch_dataset("sample", output_dir=output_dir, extract=False)
paths


[PosixPath('data_packs/sample/demo_cgm.csv')]
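
Before converting a file, it can help to check which columns it contains. The sketch below uses only the standard library. `preview_csv` is a hypothetical helper, not an SDK function, and the stand-in file uses made-up column names and values purely so the example runs on its own; it does not describe the real layout of `demo_cgm.csv`.

```python
import csv
import tempfile
from pathlib import Path


def preview_csv(path, n=5):
    """Return the header row and the first n data rows of a CSV file."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])
        # zip stops after n rows, so the rest of the file is never read.
        rows = [row for _, row in zip(range(n), reader)]
    return header, rows


# Stand-in file with made-up columns and values (assumption, for illustration only).
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "stand_in.csv"
    demo.write_text("timestamp,glucose\n0,140\n5,138\n10,135\n")
    header, rows = preview_csv(demo, n=2)

print(header)
print(rows)
```

In the notebook, `preview_csv(paths[0])` would show the actual header and first rows of the fetched file.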

## Step 3: Convert CSV to scenario


In [4]:
from iints.data import scenario_from_csv

sample_csv = paths[0]
result = scenario_from_csv(sample_csv, scenario_name="Sample CGM")
result.scenario


{'scenario_name': 'Sample CGM',
 'scenario_version': '1.0',
 'description': 'Imported CGM scenario',
 'stress_events': [{'start_time': 60,
   'event_type': 'meal',
   'value': 45.0,
   'absorption_delay_minutes': 10,
   'duration': 60},
  {'start_time': 360,
   'event_type': 'meal',
   'value': 60.0,
   'absorption_delay_minutes': 10,
   'duration': 60},
  {'start_time': 720,
   'event_type': 'meal',
   'value': 70.0,
   'absorption_delay_minutes': 10,
   'duration': 60}]}
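
The scenario is a plain dictionary, so it can be inspected and saved with the standard library. The sketch below copies the dictionary from the output above, checks which events start inside a 240-minute window, and round-trips the scenario through JSON. It assumes `start_time` is expressed in minutes from the start of the scenario, which the source does not state explicitly.

```python
import json

# Scenario copied from the output above.
scenario = {
    "scenario_name": "Sample CGM",
    "scenario_version": "1.0",
    "description": "Imported CGM scenario",
    "stress_events": [
        {"start_time": 60, "event_type": "meal", "value": 45.0,
         "absorption_delay_minutes": 10, "duration": 60},
        {"start_time": 360, "event_type": "meal", "value": 60.0,
         "absorption_delay_minutes": 10, "duration": 60},
        {"start_time": 720, "event_type": "meal", "value": 70.0,
         "absorption_delay_minutes": 10, "duration": 60},
    ],
}

meals = [e for e in scenario["stress_events"] if e["event_type"] == "meal"]
total_value = sum(e["value"] for e in meals)

# Assumption: start_time is in minutes, so a 240-minute run only reaches
# events that start before minute 240.
window_minutes = 240
in_window = [e for e in meals if e["start_time"] < window_minutes]

# Round-trip through JSON to confirm the scenario serializes cleanly.
restored = json.loads(json.dumps(scenario, indent=2))

print(f"{len(meals)} meal events, total value {total_value}")
print(f"{len(in_window)} event(s) start within {window_minutes} minutes")
print("JSON round-trip preserved the scenario:", restored == scenario)
```

Under that assumption, a run of 240 minutes would only reach the first meal event, so a longer duration is needed to exercise the other two.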

## Step 4: Run a short simulation from imported data


In [5]:
import iints
from iints.core.algorithms.fixed_basal_bolus import FixedBasalBolus
from iints.validation import load_patient_config_by_name

patient_config = load_patient_config_by_name("clinic_safe_baseline").model_dump()
algorithm = FixedBasalBolus(settings={"fixed_basal_rate": 0.4, "carb_ratio": 12.0})

outputs = iints.run_simulation(
    algorithm=algorithm,
    scenario=result.scenario,
    patient_config=patient_config,
    duration_minutes=240,
    time_step=5,
    output_dir="results/data_sample",
)

outputs["results"].head()


Simulation terminated early: Critical failure: glucose < 40.0 mg/dL for 30 minutes.




| | time_minutes | glucose_actual_mgdl | glucose_to_algo_mgdl | delivered_insulin_units | algo_recommended_insulin_units | sensor_status | pump_status | pump_reason | basal_insulin_units | bolus_insulin_units | ... | uncertainty | fallback_triggered | safety_level | safety_actions | safety_reason | safety_triggered | supervisor_latency_ms | human_intervention | human_intervention_note | algorithm_why_log |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 140.0 | 140.0 | 0.033333 | 0.033333 | ok | ok | approved | 0.033333 | 0.0 | ... | 0.0 | False | safe | | APPROVED | False | 0.012453 | False | | [] |
| 1 | 5 | 118.996296 | 118.996296 | 0.0 | 0.033333 | ok | ok | approved | 0.033333 | 0.0 | ... | 0.0 | False | safe | NEGATIVE_TREND_LIMIT: Glucose dropping at -4.2... | NEGATIVE_TREND_LIMIT: Glucose dropping at -4.2... | True | 0.018966 | False | | [] |
| 2 | 10 | 101.139444 | 101.139444 | 0.0 | 0.033333 | ok | ok | approved | 0.033333 | 0.0 | ... | 0.0 | False | safe | NEGATIVE_TREND_LIMIT: Glucose dropping at -3.8... | NEGATIVE_TREND_LIMIT: Glucose dropping at -3.8... | True | 0.011161 | False | | [] |
| 3 | 15 | 85.957417 | 85.957417 | 0.0 | 0.033333 | ok | ok | approved | 0.033333 | 0.0 | ... | 0.0 | False | safe | NEGATIVE_TREND_LIMIT: Glucose dropping at -3.3... | NEGATIVE_TREND_LIMIT: Glucose dropping at -3.3... | True | 0.009428 | False | | [] |
| 4 | 20 | 73.048989 | 73.048989 | 0.0 | 0.033333 | ok | ok | approved | 0.033333 | 0.0 | ... | 0.0 | False | safe | NEGATIVE_TREND_LIMIT: Glucose dropping at -2.8... | NEGATIVE_TREND_LIMIT: Glucose dropping at -2.8... | True | 0.009117 | False | | [] |
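
In the rows above, the algorithm recommends 0.033333 units at every step, but nothing is delivered once `safety_triggered` is `True`. The sketch below summarizes that using a few columns copied from the output, in plain Python so it runs without the SDK. The per-step figure is consistent with `fixed_basal_rate` being in units per hour (0.4 × 5 / 60), though that unit is an assumption here.

```python
# Selected columns copied from the first five result rows above.
rows = [
    {"time_minutes": 0, "glucose_actual_mgdl": 140.0,
     "delivered_insulin_units": 0.033333,
     "algo_recommended_insulin_units": 0.033333, "safety_triggered": False},
    {"time_minutes": 5, "glucose_actual_mgdl": 118.996296,
     "delivered_insulin_units": 0.0,
     "algo_recommended_insulin_units": 0.033333, "safety_triggered": True},
    {"time_minutes": 10, "glucose_actual_mgdl": 101.139444,
     "delivered_insulin_units": 0.0,
     "algo_recommended_insulin_units": 0.033333, "safety_triggered": True},
    {"time_minutes": 15, "glucose_actual_mgdl": 85.957417,
     "delivered_insulin_units": 0.0,
     "algo_recommended_insulin_units": 0.033333, "safety_triggered": True},
    {"time_minutes": 20, "glucose_actual_mgdl": 73.048989,
     "delivered_insulin_units": 0.0,
     "algo_recommended_insulin_units": 0.033333, "safety_triggered": True},
]

# Time steps at which the safety layer intervened.
overridden = [r["time_minutes"] for r in rows if r["safety_triggered"]]

# Insulin the algorithm recommended but that was not delivered.
withheld = sum(
    r["algo_recommended_insulin_units"] - r["delivered_insulin_units"] for r in rows
)

lowest = min(rows, key=lambda r: r["glucose_actual_mgdl"])

# Assumption: fixed_basal_rate is in units per hour, spread over 5-minute steps.
per_step = 0.4 * 5 / 60

print("Safety triggered at minutes:", overridden)
print(f"Withheld insulin: {withheld:.6f} units")
print(f"Lowest glucose in these rows: {lowest['glucose_actual_mgdl']} mg/dL "
      f"at minute {lowest['time_minutes']}")
print(f"Expected basal per step: {per_step:.6f} units")
```

With the actual DataFrame, `outputs["results"]["safety_triggered"].sum()` gives the same kind of count across the whole run.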


### Recap
You can now go from **official dataset registry → CSV import → runnable scenario** in a few steps.
