KORAL: Knowledge Graph Guided LLM Reasoning for SSD Operational Analysis

KORAL is a two-stage pipeline for SSD operational analysis:

  • Stage I (Literature KG): Extract an evidence-backed knowledge graph from SSD research papers, aligned to a curated SSD taxonomy.
  • Stage II (Operational Analysis): Summarize telemetry (SMART, workload, environment, etc.) using a rule base, retrieve relevant literature evidence from the Stage I KG, and call an LLM to perform SSD analysis (predictive / descriptive / prescriptive / what-if) with automatic evaluation.
  • Stage II Fleet Mode (Table II): Run collective / fleet-level analysis over a cohort (e.g., 100 drives at once) and compute fleet metrics.

Repository layout

KORAL/
├─ data_preparation/                  # Data prep scripts (Alibaba/Google/env/workload)
├─ dataset/                           # Datasets (Alibaba, Google, env, fio_workload, ...)
├─ stage_I/                           # Stage I: paper → LitKG pipeline
│  ├─ out/                            # Stage I outputs (TTL/JSON/global KG)
│  ├─ __init__.py
│  ├─ ssd_cot_prompt.txt              # Stage I extraction prompt (strict JSON)
│  ├─ ssd_kg_pipeline.py              # Stage I pipeline (papers → TTL/JSON/global KG)
│  └─ taxonomy.json                   # SSD taxonomy (vocabulary)
├─ stage_II/                          # Stage II: operational pipeline + evaluation
│  ├─ evaluation/
│  ├─ features/
│  ├─ kg/
│  ├─ llm/
│  ├─ prompts/
│  ├─ scripts/
│  ├─ utils/
│  ├─ cli.py                          # per-sample Stage II CLI (Table I style)
│  ├─ pipeline.py                     # per-sample pipeline runner
│  ├─ fleet_cli.py                    # fleet-level Stage II CLI (Table II style)
│  ├─ fleet_pipeline.py               # fleet-level runner
│  ├─ config.py
│  ├─ README.md                       # Stage II overview
│  └─ README_STAGE_II.txt             # Stage II detailed text README
└─ rule_base.json                     # Stage II rule base (summarization/mapping rules)

Installation

Create a fresh environment (recommended) and install dependencies:

python -m venv .venv
source .venv/bin/activate

pip install -U pip
pip install pandas numpy pyarrow fastparquet tqdm python-dateutil
pip install rdflib PyPDF2 python-dotenv openai requests

Set your OpenAI API key (both Stage I and Stage II use it):

export OPENAI_API_KEY="YOUR_KEY_HERE"
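If you prefer not to export the key in your shell, python-dotenv (installed above) can load it from a local .env file. A minimal sketch, assuming the scripts read the key from the environment:

# load_env.py — hedged sketch: load OPENAI_API_KEY from a local .env file
import os
from dotenv import load_dotenv

load_dotenv()  # looks for a .env file in the current directory
assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"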

Stage I: Build the Literature Knowledge Graph (papers → TTL)

Stage I reads a folder of papers (.pdf, .txt, .md) and produces:

  • per-paper: *.ttl and *.kg.json
  • merged: global_knowledge_graph.ttl (accumulates across runs)
  • updated taxonomy: taxonomy.json (if the model proposes new concepts)
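Since rdflib is among the dependencies, a quick way to sanity-check the merged graph is to load it and count triples (a minimal sketch; the file exists only after a Stage I run):

# inspect_kg.py — hedged sketch: count and peek at triples in the global KG
from rdflib import Graph

g = Graph()
g.parse("stage_I/out/global_knowledge_graph.ttl", format="turtle")
print(len(g), "triples")
for s, p, o in list(g)[:5]:  # show a few triples
    print(s, p, o)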

Inputs

  • papers folder: a directory containing SSD papers (.pdf, .txt, .md)
  • taxonomy: stage_I/taxonomy.json
  • prompt: stage_I/ssd_cot_prompt.txt

Configure prompt paths

stage_I/ssd_kg_pipeline.py defaults to reading the prompt from prompts/ssd_cot_prompt.txt. Since this repo keeps the prompt inside stage_I/, set:

export KG_PROMPT_PATH="stage_I/ssd_cot_prompt.txt"
export KG_PROMPT_ADDENDA_PATH="stage_I/out/ssd_prompt_addenda_auto.txt"

Run Stage I

Example:

python stage_I/ssd_kg_pipeline.py \
  --papers_dir dataset/papers \
  --taxonomy stage_I/taxonomy.json \
  --out_dir stage_I/out \
  --model gpt-4o

Outputs (Stage I)

stage_I/out/
├─ <paper_slug>.ttl
├─ <paper_slug>.kg.json
└─ global_knowledge_graph.ttl

Stage I merges the current run into stage_I/out/global_knowledge_graph.ttl.
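If the merge is a plain RDF graph union (an assumption, not confirmed by the pipeline source), the same accumulation can be reproduced manually with rdflib; the per-paper filename below is hypothetical:

# merge_kg.py — hedged sketch: fold one per-paper TTL into the global KG by graph union
from rdflib import Graph

merged = Graph()
merged.parse("stage_I/out/global_knowledge_graph.ttl", format="turtle")
merged.parse("stage_I/out/some_paper.ttl", format="turtle")  # hypothetical per-paper output
merged.serialize(destination="stage_I/out/global_knowledge_graph.ttl", format="turtle")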


Data preparation (Alibaba / Google / Workload)

This repo includes scripts that prepare Alibaba and Google datasets and create test CSVs. Place data prep code under data_preparation/ and keep datasets under dataset/.

For Table II fleet evaluation, you only need these three prepared datasets:

  • Alibaba SMART (no app)
  • Google SMART
  • SMART + Workload (Alibaba with app)
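A minimal preparation sketch with pandas, assuming the prepared data carries a disk_id column and the source file path shown here (both are assumptions, not repo contracts):

# make_test_csv.py — hedged sketch: carve a small per-drive test CSV from a larger prepared file
import pandas as pd

df = pd.read_csv("dataset/alibaba/smart_full.csv")     # hypothetical prepared file
test = df.drop_duplicates(subset="disk_id").head(500)  # one row per drive, small subset
test.to_csv("dataset/alibaba/test_data/smart.csv", index=False)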

Stage II: Per-sample analysis

Stage II consumes one input CSV and, for each sample, produces:

  • prompts
  • LLM responses
  • parsed outputs
  • metrics (predictive + text overlap + grounding)

Stage II prerequisites: copy KG + taxonomy to repo root

By default, Stage II looks for these files in the repo root:

  • taxonomy.json
  • global_knowledge_graph.ttl
  • rule_base.json

If you ran Stage I, copy:

cp stage_I/taxonomy.json taxonomy.json
cp stage_I/out/global_knowledge_graph.ttl global_knowledge_graph.ttl

Run Stage II (per-sample)

Example:

python -m stage_II.cli \
  --dataset_type SMART_ALIBABA \
  --input_csv dataset/alibaba/test_data/smart.csv \
  --tasks predictive,descriptive,prescriptive,whatif \
  --model gpt-4o \
  --limit_rows 100 \
  --out_name demo_smart_alibaba

Outputs go to:

stage_II/runs/<RUN_NAME>/
  input_samples.csv
  responses.jsonl
  metrics_per_sample.csv
  metrics_summary.json
  data_kg_ttl/<sample_id>.ttl   (if rdflib available)
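A hedged sketch for loading a finished run's outputs (the run name comes from the example above; field names inside responses.jsonl are not assumed here):

# load_run.py — hedged sketch: read per-sample Stage II outputs
import json
import pandas as pd

run = "stage_II/runs/demo_smart_alibaba"
with open(f"{run}/responses.jsonl") as f:
    responses = [json.loads(line) for line in f]
summary = json.load(open(f"{run}/metrics_summary.json"))
per_sample = pd.read_csv(f"{run}/metrics_per_sample.csv")
print(len(responses), "responses;", len(per_sample), "metric rows")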

Stage II Fleet Mode: Collective analysis

Fleet mode evaluates a cohort of N drives at once (e.g., N=100).

Supported datasets

  • SMART_ALIBABA
  • SMART_GOOGLE
  • SMART_WORKLOAD

Fleet mode expects the input CSV to contain one row per drive; otherwise it de-duplicates by disk_id/drive_id when possible.
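To check whether your CSV is already one row per drive before a fleet run (a minimal sketch; the disk_id column name matches the de-duplication note above):

# check_unique.py — hedged sketch: count duplicate drive rows before a fleet run
import pandas as pd

df = pd.read_csv("dataset/alibaba/test_data/smart.csv")
dupes = int(df["disk_id"].duplicated().sum())
print(dupes, "duplicate disk_id rows (fleet mode would de-duplicate these)")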

Run fleet evaluation (one dataset)

Example (100-drive cohorts, 5 cohorts):

python -m stage_II.fleet_cli \
  --dataset_type SMART_ALIBABA \
  --input_csv dataset/alibaba/test_data/smart.csv \
  --tasks predictive,descriptive,prescriptive,whatif \
  --cohort_size 100 \
  --num_cohorts 5 \
  --out_name fleet_alibaba_100x5

Fleet outputs go to:

stage_II/runs/<RUN_NAME>/
  cohort_composition.csv
  responses_fleet.jsonl
  metrics_fleet.csv
  metrics_summary_fleet.json
  fleet_kg_ttl/<cohort_id>.ttl   (if rdflib available)

Generate Table II CSV (all 3 datasets)

Use the script under stage_II/scripts/:

python stage_II/scripts/run_table2_fleet.py \
  --alibaba_csv dataset/alibaba/test_data/smart.csv \
  --google_csv dataset/google/test_data/smart.csv \
  --workload_csv dataset/alibaba/test_data/smart_workload.csv \
  --cohort_size 100 \
  --num_cohorts 5 \
  --out_name table2_fleet

This writes:

  • stage_II/runs/table2_fleet/table_II_fleet_results.csv
  • per-dataset fleet run folders under stage_II/runs/table2_fleet/ (or nested, depending on script settings)
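A quick way to inspect the assembled table (a minimal sketch):

# show_table2.py — hedged sketch: print the Table II fleet results CSV
import pandas as pd

table2 = pd.read_csv("stage_II/runs/table2_fleet/table_II_fleet_results.csv")
print(table2.to_string(index=False))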


Ontology & KG examples

Stage I and Stage II share a consistent “classes vs instances” design:

  • Classes come from the taxonomy (e.g., Temperature, IOPS, TLC, Garbage Collection).
  • Instances represent paper-specific or scenario-specific objects (e.g., SSD_X, EC1, WP1, EXP1).

Common relation patterns you’ll see in the Literature KG (Stage I) and the Data KG (Stage II); a constructed example follows the list:

  • SSD_X operatesUnder EC1
  • EC1 hasTemperature {"@value": 45, "unit": "C"}
  • EC1 hasWorkloadProfile WP1
  • WP1 hasReadWriteMix "Write-Heavy"
  • Temperature degrades 99th Percentile Latency (directional effect)
  • Workload impactsMetric Latency
  • Assertions always carry evidence text and a confidence score.
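These patterns can be written down as RDF triples; a hedged rdflib sketch (the koral namespace IRI is an assumption, not the repo's actual vocabulary):

# pattern_sketch.py — hedged sketch: the relation patterns above as RDF triples
from rdflib import Graph, Literal, Namespace

KORAL = Namespace("http://example.org/koral#")  # hypothetical namespace
g = Graph()
g.bind("koral", KORAL)
g.add((KORAL.SSD_X, KORAL.operatesUnder, KORAL.EC1))
g.add((KORAL.EC1, KORAL.hasTemperature, Literal(45)))  # unit "C" modeled separately
g.add((KORAL.EC1, KORAL.hasWorkloadProfile, KORAL.WP1))
g.add((KORAL.WP1, KORAL.hasReadWriteMix, Literal("Write-Heavy")))
print(g.serialize(format="turtle"))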

Citation

@inproceedings{AkewarEtAl_IPDPS_2026,
  author    = {Akewar, Mayur and Madireddy, Sandeep and Luo, Dongsheng and Bhimani, Janki},
  title     = {KORAL: Knowledge Graph Guided LLM Reasoning for SSD Operational Analysis},
  booktitle = {IEEE International Parallel \& Distributed Processing Symposium (IPDPS)},
  year      = {2026},
  note      = {To appear}
}
