# 🚀 **mCODE Translation Workflow**

A comprehensive guide to the mCODE translation workflow, from quick-start data processing to advanced CLI tooling and AI model optimization.

## 🏁 **Quick Start: Process and Ingest Data**

This section is the fast track: it fetches a small sample of trials and patients, translates them to mCODE, and stores the results in CORE Memory using default settings.

In [None]:
NUM_TRIALS = 2
NUM_PATIENTS = 2
DEFAULT_MODEL = 'deepseek-coder'
DEFAULT_PROMPT = 'direct_mcode_evidence_based_concise'

!python -m src.cli.mcode_translate fetch-trials --condition "breast cancer" --limit {NUM_TRIALS} --out raw_trials.ndjson
!python -m src.cli.mcode_translate fetch-patients --archive breast_cancer_10_years --limit {NUM_PATIENTS} --out raw_patients.ndjson
!python -m src.cli.mcode_translate process-trials raw_trials.ndjson --out mcode_trials.ndjson --model {DEFAULT_MODEL} --prompt {DEFAULT_PROMPT}
!python -m src.cli.mcode_translate process-patients --in raw_patients.ndjson --out mcode_patients.ndjson
!python -m src.cli.mcode_translate summarize-trials --in mcode_trials.ndjson --model {DEFAULT_MODEL} --ingest
!python -m src.cli.mcode_translate summarize-patients --in mcode_patients.ndjson --ingest
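
Before moving on, it can help to confirm that each step actually wrote records. The following is a minimal sketch, assuming the `.ndjson` files contain one JSON object per line; `count_ndjson_records` is a hypothetical helper written for this notebook, not part of the CLI.

```python
import json
from pathlib import Path


def count_ndjson_records(path):
    """Count JSON records in an NDJSON file (hypothetical helper, not part of the CLI)."""
    count = 0
    with open(path, "r", encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}: invalid JSON on line {line_number}") from exc
            count += 1
    return count


# File names match the --out arguments used in the quick-start cell.
for name in ("raw_trials.ndjson", "raw_patients.ndjson",
             "mcode_trials.ndjson", "mcode_patients.ndjson"):
    if Path(name).exists():
        print(f"{name}: {count_ndjson_records(name)} records")
    else:
        print(f"{name}: not found")
```

A count of zero, or a missing file, usually means an earlier step failed and is worth checking before ingesting.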

## 🛠️ **CLI Tools Demonstration**

Explore the full capabilities of the unified CLI.

In [None]:
!python -m src.cli.mcode_translate --help

In [None]:
!python -m src.cli.mcode_translate download-data --list

In [None]:
!python -m src.cli.mcode_translate run-tests --help

## 🧪 **Advanced: AI Model Optimization**

Find the best AI model and prompt for your specific data.

In [None]:
!python -m src.cli.mcode_translate optimize-trials --list-models

In [None]:
!python -m src.cli.mcode_translate optimize-trials --list-prompts

In [None]:
# Run optimization and save the best configuration
!python -m src.cli.mcode_translate optimize-trials --trials-file raw_trials.ndjson --cv-folds 3 --save-config optimal_config.json

# You can then use this config in other commands
import json

with open('optimal_config.json', 'r') as f:
    config = json.load(f)

# Pull the winning model and prompt out of the saved configuration
BEST_MODEL = config['optimal_settings']['model']
BEST_PROMPT = config['optimal_settings']['prompt']

!python -m src.cli.mcode_translate process-trials raw_trials.ndjson --out mcode_trials_optimized.ndjson --model {BEST_MODEL} --prompt {BEST_PROMPT}
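
If the optimization run is skipped or fails, `optimal_config.json` may not exist and the cell above will raise. The sketch below is one defensive way to load the settings, assuming the `optimal_settings` / `model` / `prompt` layout read above and a JSON object at the top level; `load_optimal_settings` is a hypothetical helper, and the fallback values are the quick-start defaults.

```python
import json
from pathlib import Path


def load_optimal_settings(path, default_model, default_prompt):
    """Return (model, prompt) from a saved config, falling back to the defaults.

    Hypothetical helper: assumes the file holds a JSON object with an
    'optimal_settings' mapping, as read in the cell above.
    """
    config_path = Path(path)
    if not config_path.exists():
        return default_model, default_prompt
    with config_path.open("r", encoding="utf-8") as f:
        config = json.load(f)
    settings = config.get("optimal_settings", {})
    return (settings.get("model", default_model),
            settings.get("prompt", default_prompt))


BEST_MODEL, BEST_PROMPT = load_optimal_settings(
    "optimal_config.json",
    default_model="deepseek-coder",
    default_prompt="direct_mcode_evidence_based_concise",
)
print(f"model={BEST_MODEL} prompt={BEST_PROMPT}")
```

With this variant, later cells that reference `BEST_MODEL` and `BEST_PROMPT` still run when no optimized configuration is available.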

## 📈 **Large-Scale Ingestion**

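Process a larger dataset to build a more comprehensive knowledge base. This cell reuses `BEST_MODEL` and `BEST_PROMPT` from the optimization section above, so run that section first.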

In [None]:
LARGE_SCALE_LIMIT = 25
!python -m src.cli.mcode_translate fetch-trials --condition "lung cancer" --limit {LARGE_SCALE_LIMIT} --out large_raw_trials.ndjson
!python -m src.cli.mcode_translate process-trials large_raw_trials.ndjson --out large_mcode_trials.ndjson --model {BEST_MODEL} --prompt {BEST_PROMPT}
!python -m src.cli.mcode_translate summarize-trials --in large_mcode_trials.ndjson --model {BEST_MODEL} --ingest
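
In a notebook, a `!` command that exits with an error does not stop the cell, so a failed fetch can go unnoticed and the later steps then run on stale or missing files. For longer runs, one option is to call the CLI through `subprocess` so that any failure raises immediately. This is a minimal sketch; `run_step` is a hypothetical helper, and the commented usage only repeats the flags already shown in the cell above.

```python
import subprocess
import sys


def run_step(args):
    """Echo and run one command; raise CalledProcessError on a non-zero exit.

    Hypothetical helper for this notebook, not part of the CLI.
    """
    print("$", " ".join(args))
    subprocess.run(args, check=True)


# Base command, using the same interpreter that runs the notebook.
CLI = [sys.executable, "-m", "src.cli.mcode_translate"]

# Usage, mirroring the first command of the cell above:
# run_step(CLI + ["fetch-trials", "--condition", "lung cancer",
#                 "--limit", str(LARGE_SCALE_LIMIT),
#                 "--out", "large_raw_trials.ndjson"])
```

Because `check=True` raises on the first failing command, the remaining steps in the cell are skipped rather than run against incomplete data.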