ClaML XML to JSON parser for ICD-10-GM and OPS classification data.
From the repository:
cargo install --git https://github.com/HappyEmu/cclaml.git

Or build locally:

cargo build --release

The binary is at `target/release/cclaml`.
To produce a statically-linked Linux binary from macOS:
- Install the musl cross-compiler toolchain:

  brew install filosottile/musl-cross/musl-cross

- Add the Rust target:

  rustup target add x86_64-unknown-linux-musl

- Configure the linker in `.cargo/config.toml`:

  [target.x86_64-unknown-linux-musl]
  linker = "x86_64-linux-musl-gcc"

- Build:

  cargo build --release --target x86_64-unknown-linux-musl

The statically-linked binary is at `target/x86_64-unknown-linux-musl/release/cclaml`. It runs on any x86_64 Linux system with no runtime dependencies.
cclaml [OPTIONS] <INPUT>
| Argument / Option | Description |
|---|---|
| `<INPUT>` | Input ClaML XML file (ICD-10-GM or OPS) |
| `-o, --output <PATH>` | Output path. A file path writes a single JSON file; a directory path (trailing `/`) splits the output into separate files. Omit to write to stdout |
| `--compact` | Output compact JSON instead of pretty-printed |
| `--prefix <PREFIX>` | Prefix for output filenames in directory mode (e.g. `icd10gm2025_`) |
| `--emit-paths` | Print written file paths to stdout, one per line. Useful for piping to `xargs gzip` |
| `--flat` | Resolve modifiers into individual category codes. Each modifier combination produces a new terminal category that inherits parent metadata. For categories with multiple modifiers, only fully-resolved codes are emitted (no partial application). The top-level modifier definitions remain in the output |
cclaml icd10gm2025.xml -o icd10gm2025.json

Use a trailing `/` to write chapters, blocks, categories, and modifiers as separate files:

cclaml icd10gm2025.xml -o out/

Produces:
out/chapters.json
out/blocks.json
out/categories.json
out/modifiers.json
Add a prefix to output filenames in directory mode:
cclaml icd10gm2025.xml -o out/ --prefix icd10gm2025_

Produces:
out/icd10gm2025_chapters.json
out/icd10gm2025_blocks.json
out/icd10gm2025_categories.json
out/icd10gm2025_modifiers.json
Skip pretty-printing for smaller files:
cclaml icd10gm2025.xml -o out/ --compact

Use `--emit-paths` to print written file paths to stdout, then pipe to `xargs gzip`:

cclaml icd10gm2025.xml -o out/ --prefix icd10gm2025_ --emit-paths | xargs gzip

Use `--flat` to resolve modifiers into individual category codes. Each modifier combination produces a new terminal category:

cclaml icd10gm2025.xml --flat -o flat.json

Parent categories gain a `mod_codes` field listing the resolved codes, while each resolved category inherits parent metadata (inclusions, exclusions, breadcrumbs, etc.). Resolved categories also get a `label_long` built by joining the parent label with each modifier value's label using `: ` (e.g., `"Diabetes mellitus, Typ 1: Mit Koma: Als entgleist bezeichnet"` for `E10.01`).
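The `label_long` joining rule can be illustrated with a minimal Python sketch. It is not taken from the cclaml source: the function name `build_label_long` is hypothetical, and the separator `": "` (colon plus space) is inferred from the E10.01 example above.

```python
def build_label_long(parent_label: str, modifier_labels: list[str]) -> str:
    """Join a parent label with modifier value labels.

    Assumption: the separator is ': ', as suggested by the E10.01 example.
    """
    return ": ".join([parent_label, *modifier_labels])


# Labels taken from the E10.01 example in this README.
label = build_label_long(
    "Diabetes mellitus, Typ 1",
    ["Mit Koma", "Als entgleist bezeichnet"],
)
print(label)
```

With the labels from the E10.01 example, this reproduces the string quoted above.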
Omit -o to write JSON to stdout:
cclaml icd10gm2025.xml > icd10gm2025.json

A chapter entry looks like this:

{
"code": "I",
"label": "Bestimmte infektiöse und parasitäre Krankheiten",
"sub_classes": ["A00-A09", "A15-A19", "A20-A28"],
"inclusions": ["Krankheiten, die allgemein als ansteckend oder übertragbar anerkannt sind"],
"exclusions": ["Grippe und sonstige akute Infektionen der Atemwege {{J00-J22}}"]
}

Fields:

- `code`: Chapter identifier (roman numeral for ICD-10-GM, digit for OPS).
- `label`: Preferred label text.
- `sub_classes`: Block codes belonging to this chapter.
- `inclusions`, `exclusions`, `notes`, `coding_hints`, `definitions`: Rubric texts. Omitted when empty.
A block entry looks like this:

{
"code": "A00-A09",
"label": "Infektiöse Darmkrankheiten",
"range_start": "A00",
"range_end": "A09",
"super_class": "I",
"sub_classes": ["A00", "A01", "A02", "A03", "A04", "A05", "A06", "A07", "A08", "A09"],
"breadcrumb": [
{ "code": "I", "kind": "chapter" }
]
}

Fields:

- `code`: Block range code. ICD-10-GM uses a `-` separator (`A00-A09`), OPS uses `...` (`1-20...1-33`).
- `range_start`, `range_end`: Parsed start/end codes of the range.
- `super_class`: Parent chapter or parent block code.
- `sub_classes`: Category or sub-block codes within this block.
- `breadcrumb`: Ancestor path from chapter to this block's parent, each entry with `code` and `kind` (`chapter` or `block`). Blocks can be nested (e.g., ICD: chapter II → block C00-C97 → block C00-C75 → block C00-C14).
- `inclusions`, `exclusions`, `notes`: Rubric texts. Omitted when empty.
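The relationship between `code` and `range_start`/`range_end` can be sketched in Python as follows. This is an illustration of the two separator conventions described above, not the parser cclaml actually uses. The function name `split_block_range` is hypothetical, and treating a code without any separator as a single-code range is an assumption.

```python
def split_block_range(code: str) -> tuple[str, str]:
    """Split a block code into (range_start, range_end).

    OPS codes contain hyphens themselves (e.g. '1-20'), so the '...'
    separator is checked first; otherwise the ICD-10-GM '-' is used.
    """
    sep = "..." if "..." in code else "-"
    start, found, end = code.partition(sep)
    if not found:
        # Assumption: a code without a separator covers a single code.
        return code, code
    return start, end


print(split_block_range("A00-A09"))      # ICD-10-GM style
print(split_block_range("1-20...1-33"))  # OPS style
```

Checking for `...` before `-` matters because an OPS range such as `1-20...1-33` would otherwise be split at its first hyphen.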
A category entry looks like this:

{
"code": "A00.1",
"label": "Cholera durch Vibrio cholerae O:1, Biovar eltor",
"is_terminal": true,
"super_class": "A00",
"breadcrumb": [
{ "code": "I", "kind": "chapter" },
{ "code": "A00-A09", "kind": "block" },
{ "code": "A00", "kind": "category" }
],
"inclusions": ["El-Tor-Cholera"]
}

Fields:

- `code`: Category code. ICD-10-GM: letter + digits (`A00.1`). OPS: digit + hyphen + digits (`1-202.01`).
- `label`: Preferred label. References to other codes appear as `{{A00.0†}}` (dagger), `{{G63.0*}}` (aster), `{{U80!}}` (optional).
- `label_long`: Extended label. For OPS categories this comes from the `preferredLong` rubric in the XML. In `--flat` mode, it is constructed by joining the parent category label with each modifier value label using `: ` (e.g., `"Diabetes mellitus, Typ 1: Mit Koma: Als entgleist bezeichnet"`). Omitted when absent.
- `is_terminal`: `true` if the category has no sub-categories.
- `super_class`: Parent category or block code.
- `sub_classes`: Child category codes. Omitted when empty.
- `breadcrumb`: Ancestor path from chapter to this category's parent, each entry with `code` and `kind` (`chapter`, `block`, or `category`). Does not include the category itself.
- `inclusions`, `exclusions`, `coding_hints`, `definitions`, `notes`: Rubric texts. Omitted when empty.
- `modifiers`: Modifier references. Each has a `code` pointing into the top-level modifiers map. `valid_values` lists allowed modifier codes when not all values apply; omitted when all values are valid.
- `mod_codes`: (flat mode only) List of resolved modifier codes derived from this category. Only present when `--flat` is used and the category has modifiers.
Modifiers are keyed by their code in a top-level map:
{
"S04E10_4": {
"description": "Die folgenden vierten Stellen sind bei den Kategorien E10-E14 zu benutzen:",
"values": [
{
"code": ".0",
"label": "Mit Koma",
"inclusions": ["Diabetisches Koma: hyperosmolar", "Diabetisches Koma: mit oder ohne Ketoazidose"],
"exclusions": ["Hypoglykämisches Koma (.6)"]
},
{
"code": ".2",
"label": "Mit Nierenkomplikationen",
"usage": "dagger",
"inclusions": ["Diabetische Nephropathie {{N08.3*}}", "Kimmelstiel-Wilson-Syndrom {{N08.3*}}"]
}
]
}
}

Fields:

- `description`: Label for the modifier group (from the `text` rubric).
- `values`: Available modifier values, each with:
  - `code`: Modifier value code (e.g., `.0`, `.2`).
  - `label`: Preferred label text.
  - `usage`: Usage kind if present (`dagger`, `aster`, `optional`). Omitted when absent.
  - `inclusions`, `exclusions`, `coding_hints`, `definitions`, `notes`: Per-value rubric texts. Omitted when empty.
  - `excludes`: Modifier value combinations that are invalid when this value is used. Each entry names the other `modifier` code and the excluded `code` within it. Omitted when empty.
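To show how `valid_values` and `excludes` interact, here is a minimal Python sketch that enumerates the permitted modifier value combinations for one category. It is not the `--flat` implementation. The function names and the sample data (`MOD_A`, `MOD_B`) are hypothetical, and the entry shapes (`{"code": ..., "valid_values": [...]}` for references, `{"modifier": ..., "code": ...}` for `excludes`) are assumptions based on the field descriptions above.

```python
from itertools import product


def allowed_values(ref: dict, modifiers: dict) -> list[dict]:
    """Values of the referenced modifier, restricted by valid_values if set."""
    values = modifiers[ref["code"]]["values"]
    valid = ref.get("valid_values")
    if valid is None:
        return values
    return [v for v in values if v["code"] in valid]


def resolve_combinations(category: dict, modifiers: dict) -> list[tuple]:
    """Enumerate full modifier value combinations, dropping excluded ones."""
    refs = category.get("modifiers", [])
    if not refs:
        return []
    pools = [allowed_values(ref, modifiers) for ref in refs]
    result = []
    for combo in product(*pools):
        chosen = {ref["code"]: v["code"] for ref, v in zip(refs, combo)}
        # A combination is invalid if any value excludes another chosen value.
        excluded = any(
            chosen.get(ex["modifier"]) == ex["code"]
            for v in combo
            for ex in v.get("excludes", [])
        )
        if not excluded:
            result.append(tuple(v["code"] for v in combo))
    return result


# Hypothetical sample data, not taken from ICD-10-GM or OPS.
modifiers = {
    "MOD_A": {"values": [
        {"code": ".0", "excludes": [{"modifier": "MOD_B", "code": "1"}]},
        {"code": ".1"},
    ]},
    "MOD_B": {"values": [{"code": "0"}, {"code": "1"}, {"code": "2"}]},
}
category = {"modifiers": [
    {"code": "MOD_A"},
    {"code": "MOD_B", "valid_values": ["0", "1"]},
]}
combos = resolve_combinations(category, modifiers)
print(combos)
```

Only complete combinations are produced, one value per referenced modifier, which mirrors the "no partial application" behaviour described for `--flat`. How the resulting value codes are concatenated into a category code is left out, since the README does not specify it.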
When writing to stdout (no -o), you can pipe directly to jq to extract or filter parts of the output.
All chapters:

cclaml icd10gm2025.xml | jq '.chapters'

All blocks:

cclaml icd10gm2025.xml | jq '.blocks'

Terminal categories only:

cclaml icd10gm2025.xml | jq '[.categories[] | select(.is_terminal)]'

A single category by code:

cclaml icd10gm2025.xml | jq '.categories[] | select(.code == "A00.1")'

Chapter codes and labels only:

cclaml icd10gm2025.xml | jq '.chapters[] | {code, label}'

Categories within block A00-A09:

cclaml icd10gm2025.xml | jq '[.categories[] | select(.breadcrumb[] | .code == "A00-A09" and .kind == "block")]'

A single modifier by code:

cclaml icd10gm2025.xml | jq '.modifiers["S02C88_5"]'

For large files, `--compact` skips pretty-printing and produces smaller output, which jq can then re-format as needed:

cclaml icd10gm2025.xml --compact | jq '.chapters'

Supported classifications:

- ICD-10-GM: International Classification of Diseases, German Modification
- OPS: Operationen- und Prozedurenschlüssel (German procedure classification)
Both use the ClaML 2.0.0 XML schema.