# Step 4 - RDLS candidate classification (HDX → RDLS components)

This notebook classifies **HDX dataset-level metadata** into **inclusive RDLS components**:

- **hazard**
- **exposure**
- **vulnerability_proxy**
- **loss_impact**

It combines three evidence channels, listed in priority order; their weights are added into a single score per component:

1. **HDX tags** (weighted; configurable)
2. **Keywords** in title/notes (regex; configurable)
3. **Organization hints** (publisher heuristics; configurable)
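
The three channels above are additive: each one contributes points to a per-component score. A minimal sketch of that idea, using hypothetical toy mappings (the tag, pattern, and organization names below are illustrative and are not taken from the Step 3 config files; `toy_score` is not a function in this notebook):

```python
import re

# Hypothetical toy mappings; the real ones come from the Step 3 YAML files.
toy_tag_weights = {"hazard": {"flooding": 5}, "exposure": {"populated places": 3}}
toy_keywords = {"hazard": [r"\bflood"], "exposure": [r"\bsettlement"]}
toy_org_hints = {"meteorological": {"hazard": 2}}

def toy_score(tags, text, org):
    """Sum tag weights, keyword hits (+1 each), and organization hints per component."""
    scores = {"hazard": 0, "exposure": 0}
    # 1) Tags: exact, case-insensitive lookup
    for comp, weights in toy_tag_weights.items():
        scores[comp] += sum(weights.get(t.lower(), 0) for t in tags)
    # 2) Keywords: one point per pattern that matches the text
    for comp, patterns in toy_keywords.items():
        scores[comp] += sum(1 for p in patterns if re.search(p, text, re.IGNORECASE))
    # 3) Organization hints: substring match on the publisher name
    for hint, comp_weights in toy_org_hints.items():
        if hint in org.lower():
            for comp, w in comp_weights.items():
                scores[comp] += w
    return scores

example = toy_score(
    tags=["Flooding"],
    text="Flood extent and affected settlements",
    org="National Meteorological Agency",
)
print(example)  # {'hazard': 8, 'exposure': 1}
```

Here the hazard score is 5 (tag) + 1 (keyword) + 2 (organization hint), while exposure only picks up a single keyword hit.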

It also enforces the **OSM exclusion policy** from Step 2:
datasets whose IDs appear in `policy/osm_excluded_dataset_ids.txt` are marked as excluded
from downstream RDLS translation (but remain in the classification table).

## Inputs (from earlier steps)

- `hdx_dataset_metadata_dump/dataset_metadata/*.json`  (Step 1 output)
- `hdx_dataset_metadata_dump/policy/osm_excluded_dataset_ids.txt` (Step 2 output)
- `hdx_dataset_metadata_dump/config/tag_to_rdls_component.yaml` (Step 3 output)
- `hdx_dataset_metadata_dump/config/keyword_to_rdls_component.yaml` (Step 3 output)
- `hdx_dataset_metadata_dump/config/org_hints.yaml` (Step 3 output)
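
The shapes of the three config files can be inferred from how this notebook reads them: the tag map is `component -> {tag: weight}`, the keyword map is `component -> [regex, ...]`, and the organization hints are `substring -> {component: weight}`. The checker below is a hedged sketch under that assumption; `check_config_shapes` is a hypothetical helper, not part of the pipeline, and the sample entries are illustrative.

```python
import re
from typing import Any, Dict, List

COMPONENTS = ("hazard", "exposure", "vulnerability_proxy", "loss_impact")

def check_config_shapes(tag_map: Dict[str, Any], keyword_map: Dict[str, Any],
                        org_map: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the shapes look right."""
    problems: List[str] = []
    # Tag map: component -> {tag: integer weight}
    for comp, weights in tag_map.items():
        if comp not in COMPONENTS:
            problems.append(f"tag map: unknown component {comp!r}")
        elif not isinstance(weights, dict):
            problems.append(f"tag map: {comp!r} should map tags to weights")
    # Keyword map: component -> [regex pattern, ...]
    for comp, patterns in keyword_map.items():
        if comp not in COMPONENTS:
            problems.append(f"keyword map: unknown component {comp!r}")
            continue
        for pat in patterns or []:
            try:
                re.compile(pat)
            except re.error as exc:
                problems.append(f"keyword map: bad regex {pat!r} for {comp!r}: {exc}")
    # Organization hints: organization substring -> {component: integer weight}
    for hint, comp_weights in org_map.items():
        for comp in (comp_weights or {}):
            if comp not in COMPONENTS:
                problems.append(f"org hints: unknown component {comp!r} under {hint!r}")
    return problems

# Illustrative entries only: one broken regex and one misspelled component.
sample_problems = check_config_shapes(
    tag_map={"hazard": {"drought": 5}},
    keyword_map={"hazard": [r"\bflood", r"(unclosed"]},
    org_map={"example org": {"hazards": 2}},
)
for line in sample_problems:
    print(line)
```

Running a check like this right after loading the YAML files surfaces typos in component names early; otherwise they would be skipped silently during scoring.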

## Outputs

Written under `hdx_dataset_metadata_dump/derived/`:

- `classification.csv` — one row per dataset with scores, assigned components, confidence, and policy flags
- `classification_summary.json` — counts by component/confidence and policy exclusions
- `rdls_included_dataset_ids.txt` — dataset IDs where `rdls_candidate==True` and `excluded_by_policy==False`
- `errors_classification.jsonl` — JSONL log for JSON parse errors or unexpected schema variants
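
The relationship between `classification.csv` and `rdls_included_dataset_ids.txt` can be reproduced with the standard library alone. This sketch uses a made-up three-row stand-in for the CSV and assumes the boolean columns are serialized as the strings `True` and `False`, which is how pandas writes them by default; `included_ids` is a hypothetical helper.

```python
import csv
import io

# Stand-in for classification.csv: only the columns used here, with made-up IDs.
sample_csv = io.StringIO(
    "dataset_id,rdls_candidate,excluded_by_policy\n"
    "id-a,True,False\n"
    "id-b,True,True\n"
    "id-c,False,False\n"
)

def included_ids(handle):
    """Return IDs of rows that are RDLS candidates and not excluded by policy."""
    return [
        row["dataset_id"]
        for row in csv.DictReader(handle)
        if row["rdls_candidate"] == "True" and row["excluded_by_policy"] == "False"
    ]

ids = included_ids(sample_csv)
print(ids)  # ['id-a']
```

`id-b` is a candidate but is dropped by the policy flag, and `id-c` is not a candidate at all, so only `id-a` would reach the included list.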

## Notes

- **Inclusive by design**: thresholds are deliberately low at first; you will tighten via QA/overrides.
- **Deterministic**: stable file ordering; same inputs → same outputs.
- **Scale**: designed for ~30k datasets on a laptop; files are read one at a time and only the flattened rows are held in memory.
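
To make the "inclusive" behaviour concrete, the sketch below mirrors the decision rule from the scoring cell of this notebook with its default thresholds. The `label` function is a simplified, hypothetical stand-in for illustration, not the notebook's `classify_dataset`.

```python
# Defaults used in this notebook's scoring cell.
CANDIDATE_MIN_SCORE = 4
CONF_HIGH = 7
CONF_MED = 4

def label(scores):
    """Return (components, rdls_candidate, confidence) for a dict of component scores."""
    top = max(scores.values(), default=0)
    components = [c for c, s in scores.items() if s >= CANDIDATE_MIN_SCORE]
    if not components and top > 0:
        # Inclusive fallback: keep the first best-scoring component even below the threshold.
        components = [next(c for c, s in scores.items() if s == top)]
    confidence = "high" if top >= CONF_HIGH else "medium" if top >= CONF_MED else "low"
    return components, top >= CANDIDATE_MIN_SCORE, confidence

strong = label({"hazard": 8, "exposure": 5})
weak = label({"hazard": 0, "exposure": 2})
print(strong)  # (['hazard', 'exposure'], True, 'high')
print(weak)    # (['exposure'], False, 'low')
```

A weak dataset still carries its best-guess component in the table, but it is not a candidate. Tightening after QA amounts to raising these constants.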


In [1]:
from __future__ import annotations

import json
import re
from collections import Counter, defaultdict
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict, Iterable, List, Optional, Tuple

import pandas as pd

# Required dependency: PyYAML (the Step 3 config files are YAML). Install if needed:
#   pip install pyyaml
try:
    import yaml
except ImportError as e:
    raise ImportError(
        "Missing dependency: pyyaml. Install with `pip install pyyaml`."
    ) from e


In [2]:
# =======================================
# Path configuration (robust repo layout)
# =======================================
# Assumed layout: repo root = hdx-metadata-crawler/
#                 notebooks live under hdx-metadata-crawler/notebook/
#                 step outputs live under hdx-metadata-crawler/hdx_dataset_metadata_dump/
#
# This cell auto-discovers DUMP_DIR by walking up parents until it finds the folder.

def find_dump_dir(start: Path, folder_name: str = "hdx_dataset_metadata_dump") -> Path:
    for p in [start] + list(start.parents):
        candidate = p / folder_name
        if candidate.exists() and candidate.is_dir():
            return candidate
    raise FileNotFoundError(
        f"Could not find '{folder_name}' from start={start}. "
        "Run from within the repo, or set DUMP_DIR manually."
    )

NOTEBOOK_DIR = Path.cwd()
DUMP_DIR = find_dump_dir(NOTEBOOK_DIR)

# Step 1 outputs
DATASET_DIR = DUMP_DIR / "dataset_metadata"

# Step 2 outputs
POLICY_DIR = DUMP_DIR / "policy"
OSM_EXCLUDED_IDS_TXT = POLICY_DIR / "osm_excluded_dataset_ids.txt"

# Step 3 outputs
CONFIG_DIR = DUMP_DIR / "config"
TAG_MAP_YAML = CONFIG_DIR / "tag_to_rdls_component.yaml"
KEYWORD_MAP_YAML = CONFIG_DIR / "keyword_to_rdls_component.yaml"
ORG_HINTS_YAML = CONFIG_DIR / "org_hints.yaml"

# Step 4 outputs
DERIVED_DIR = DUMP_DIR / "derived"
DERIVED_DIR.mkdir(parents=True, exist_ok=True)

OUT_CLASSIFICATION_CSV = DERIVED_DIR / "classification.csv"
OUT_SUMMARY_JSON = DERIVED_DIR / "classification_summary.json"
OUT_INCLUDED_IDS_TXT = DERIVED_DIR / "rdls_included_dataset_ids.txt"
OUT_ERRORS_JSONL = DERIVED_DIR / "errors_classification.jsonl"

print("Notebook dir:", NOTEBOOK_DIR)
print("DUMP_DIR     :", DUMP_DIR)
print("DATASET_DIR  :", DATASET_DIR)
print("POLICY_DIR   :", POLICY_DIR)
print("CONFIG_DIR   :", CONFIG_DIR)
print("DERIVED_DIR  :", DERIVED_DIR)


Notebook dir: C:\Users\benny\OneDrive\Documents\Github\hdx-metadata-crawler\notebook
DUMP_DIR     : C:\Users\benny\OneDrive\Documents\Github\hdx-metadata-crawler\hdx_dataset_metadata_dump
DATASET_DIR  : C:\Users\benny\OneDrive\Documents\Github\hdx-metadata-crawler\hdx_dataset_metadata_dump\dataset_metadata
POLICY_DIR   : C:\Users\benny\OneDrive\Documents\Github\hdx-metadata-crawler\hdx_dataset_metadata_dump\policy
CONFIG_DIR   : C:\Users\benny\OneDrive\Documents\Github\hdx-metadata-crawler\hdx_dataset_metadata_dump\config
DERIVED_DIR  : C:\Users\benny\OneDrive\Documents\Github\hdx-metadata-crawler\hdx_dataset_metadata_dump\derived


In [3]:
# ==========================
# Load configuration inputs
# ==========================

RDLS_COMPONENTS = ("hazard", "exposure", "vulnerability_proxy", "loss_impact")

def load_yaml(path: Path) -> Dict[str, Any]:
    if not path.exists():
        raise FileNotFoundError(f"Missing config file: {path}")
    with path.open("r", encoding="utf-8") as f:
        return yaml.safe_load(f) or {}

tag_weights: Dict[str, Dict[str, int]] = load_yaml(TAG_MAP_YAML)
keyword_patterns: Dict[str, List[str]] = load_yaml(KEYWORD_MAP_YAML)
org_hints: Dict[str, Dict[str, int]] = load_yaml(ORG_HINTS_YAML)

# Compile regex per component (case-insensitive)
compiled_keywords: Dict[str, List[re.Pattern]] = {}
for comp in RDLS_COMPONENTS:
    pats = keyword_patterns.get(comp, []) or []
    compiled_keywords[comp] = [re.compile(p, flags=re.IGNORECASE) for p in pats]

# Load OSM exclusion list (optional, but expected after Step 2)
osm_excluded: set[str] = set()
if OSM_EXCLUDED_IDS_TXT.exists():
    with OSM_EXCLUDED_IDS_TXT.open("r", encoding="utf-8") as f:
        for line in f:
            v = line.strip()
            if v:
                osm_excluded.add(v)
else:
    print(f"WARNING: OSM exclusion list not found: {OSM_EXCLUDED_IDS_TXT}. Proceeding with empty exclusion set.")

print("Loaded tag maps for components:", list(tag_weights))
print("Loaded keyword maps for components:", list(keyword_patterns))
print("Loaded org hints:", len(org_hints))
print("OSM excluded IDs:", len(osm_excluded))


Loaded tag maps for components: ['exposure', 'hazard', 'loss_impact', 'vulnerability_proxy']
Loaded keyword maps for components: ['exposure', 'hazard', 'loss_impact', 'vulnerability_proxy']
Loaded org hints: 4
OSM excluded IDs: 3649


In [4]:
# ==========================
# Helpers: IO and extraction
# ==========================

def iter_json_files(folder: Path) -> Iterable[Path]:
    """Yield JSON files in a folder (non-recursive), sorted for determinism."""
    if not folder.exists():
        raise FileNotFoundError(f"Dataset folder not found: {folder}")
    yield from sorted(folder.glob("*.json"))

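def safe_load_json(path: Path) -> Dict[str, Any]:
    """Load a JSON object, naming the offending file if parsing fails."""
    try:
        with path.open("r", encoding="utf-8") as f:
            data = json.load(f)
    except json.JSONDecodeError as e:
        raise ValueError(f"Invalid JSON in {path.name}: {e}") from e
    if not isinstance(data, dict):
        raise ValueError(f"Expected a JSON object in {path.name}, got {type(data).__name__}")
    return data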

def as_list(x: Any) -> List[Any]:
    if x is None:
        return []
    if isinstance(x, list):
        return x
    return [x]

def normalize_text(s: Any) -> str:
    if not s:
        return ""
    return str(s).strip()

def extract_formats(resources: Any) -> List[str]:
    """Extract unique, upper-cased formats from the dataset's resources (first-seen order)."""
    fmts = []
    for r in as_list(resources):
        if not isinstance(r, dict):
            continue  # skip malformed resource entries instead of failing the whole dataset
        fmt = normalize_text(r.get("format"))
        if fmt:
            fmts.append(fmt.upper())
    # Deduplicate while preserving order (dicts keep insertion order)
    return list(dict.fromkeys(fmts))


In [5]:
# ==========================
# Scoring / classification
# ==========================

# These defaults are deliberately low (inclusive). Tighten them after QA.
KEYWORD_HIT_WEIGHT = 1  # per unique regex pattern matched
CANDIDATE_MIN_SCORE = 4  # inclusive threshold for "RDLS candidate"
CONF_HIGH = 7            # score >= this -> high confidence
CONF_MED = 4             # score >= this -> medium confidence

@dataclass
class Classification:
    scores: Dict[str, int]
    components: List[str]
    rdls_candidate: bool
    confidence: str
    top_signals: List[str]

def classify_dataset(meta: Dict[str, Any]) -> Classification:
    """Return classification for one dataset-level HDX metadata JSON."""

    # Core text fields
    title = normalize_text(meta.get("title"))
    notes = normalize_text(meta.get("notes"))
    org = normalize_text(meta.get("organization"))

    text = f"{title}\n{notes}".strip()

    # Tags: HDX tags are the most reliable signal in your current setup
    tags = [normalize_text(t) for t in as_list(meta.get("tags"))]
    tags_lower = [t.lower() for t in tags if t]

    # Prepare scoring containers
    scores = {c: 0 for c in RDLS_COMPONENTS}
    signals: List[Tuple[int, str]] = []  # (abs_weight, signal_string) for later ranking

    # 1) Tag weights
    for comp, weights in tag_weights.items():
        if comp not in RDLS_COMPONENTS:
            continue
        for t in tags_lower:
            w = weights.get(t)
            if w:
                scores[comp] += int(w)
                signals.append((abs(int(w)), f"tag:{t}({int(w):+d})→{comp}"))  # signed, so negative weights print as -N

    # 2) Keyword regex hits
    if text:
        for comp in RDLS_COMPONENTS:
            hits = 0
            for pat in compiled_keywords.get(comp, []):
                if pat.search(text):
                    hits += 1
                    signals.append((KEYWORD_HIT_WEIGHT, f"kw:{pat.pattern}(+{KEYWORD_HIT_WEIGHT})→{comp}"))
            if hits:
                scores[comp] += hits * KEYWORD_HIT_WEIGHT

    # 3) Organization hints (substring match, normalized)
    org_norm = org.lower()
    for hint, comp_weights in org_hints.items():
        if not hint:
            continue
        if hint.lower() in org_norm:
            for comp, w in comp_weights.items():
                if comp in RDLS_COMPONENTS and w:
                    scores[comp] += int(w)
                    signals.append((abs(int(w)), f"org:{hint}({int(w):+d})→{comp}"))  # signed, so negative weights print as -N

    # Decide assigned components (multi-label)
    max_score = max(scores.values()) if scores else 0
    components = [c for c, s in scores.items() if s >= CANDIDATE_MIN_SCORE]

    # Inclusive rule: if nothing meets threshold, keep the best component if it has any signal
    if not components and max_score > 0:
        best = [c for c, s in scores.items() if s == max_score]
        components = best[:1]

    rdls_candidate = max_score >= CANDIDATE_MIN_SCORE

    if max_score >= CONF_HIGH:
        confidence = "high"
    elif max_score >= CONF_MED:
        confidence = "medium"
    else:
        confidence = "low"

    # Keep only the most informative signals (top 8 by weight)
    signals_sorted = [s for _, s in sorted(signals, key=lambda x: x[0], reverse=True)]
    top_signals = signals_sorted[:8]

    return Classification(scores=scores, components=components, rdls_candidate=rdls_candidate,
                          confidence=confidence, top_signals=top_signals)


In [6]:
# ==========================
# Run classification (stream)
# ==========================

files = list(iter_json_files(DATASET_DIR))
total = len(files)

rows: List[Dict[str, Any]] = []
errors = 0

# Small progress printer (avoids requiring tqdm)
def progress(i: int, every: int = 2000) -> None:
    if i % every == 0 or i == total:
        print(f"Processed {i}/{total} datasets ({i/total:.1%})")

# Reset errors log
if OUT_ERRORS_JSONL.exists():
    OUT_ERRORS_JSONL.unlink()

for i, fp in enumerate(files, start=1):
    try:
        meta = safe_load_json(fp)

        dataset_id = normalize_text(meta.get("id"))
        cls = classify_dataset(meta)

        excluded_by_policy = dataset_id in osm_excluded

        resources = as_list(meta.get("resources"))
        formats = extract_formats(resources)

        row = {
            "dataset_id": dataset_id,
            "name": normalize_text(meta.get("name")),
            "title": normalize_text(meta.get("title")),
            "organization": normalize_text(meta.get("organization")),
            "dataset_source": normalize_text(meta.get("dataset_source")),
            "license_title": normalize_text(meta.get("license_title")),
            "dataset_date": normalize_text(meta.get("dataset_date")),
            "last_modified": normalize_text(meta.get("last_modified")),
            "data_update_frequency": normalize_text(meta.get("data_update_frequency")),
            "groups": ";".join([normalize_text(g) for g in as_list(meta.get("groups")) if normalize_text(g)]),
            # Normalize each tag (as for groups) so non-string or padded values cannot break the join
            "tags": ";".join([normalize_text(t) for t in as_list(meta.get("tags")) if normalize_text(t)]),
            "resource_count": len(resources),
            "formats": ";".join(formats),
            # Scoring outputs
            **{f"score_{k}": int(v) for k, v in cls.scores.items()},
            "rdls_components": ";".join(cls.components),
            "rdls_candidate": bool(cls.rdls_candidate),
            "confidence": cls.confidence,
            # Policy outputs
            "excluded_by_policy": bool(excluded_by_policy),
            "exclusion_reason": "osm_policy" if excluded_by_policy else "",
            # Debug aids
            "top_signals": " | ".join(cls.top_signals),
            "source_file": fp.name,
        }
        rows.append(row)

    except Exception as e:
        errors += 1
        # Log error for post-mortem; keep going
        with OUT_ERRORS_JSONL.open("a", encoding="utf-8") as ef:
            ef.write(json.dumps({"file": fp.name, "error": str(e)}, ensure_ascii=False) + "\n")

    progress(i)

print("Done.")
print("Total rows:", len(rows))
print("Errors:", errors)


Processed 2000/26246 datasets (7.6%)
Processed 4000/26246 datasets (15.2%)
Processed 6000/26246 datasets (22.9%)
Processed 8000/26246 datasets (30.5%)
Processed 10000/26246 datasets (38.1%)
Processed 12000/26246 datasets (45.7%)
Processed 14000/26246 datasets (53.3%)
Processed 16000/26246 datasets (61.0%)
Processed 18000/26246 datasets (68.6%)
Processed 20000/26246 datasets (76.2%)
Processed 22000/26246 datasets (83.8%)
Processed 24000/26246 datasets (91.4%)
Processed 26000/26246 datasets (99.1%)
Processed 26246/26246 datasets (100.0%)
Done.
Total rows: 26246
Errors: 0


In [7]:
# ==========================
# Write outputs + summary
# ==========================

df = pd.DataFrame(rows)

# Stable column order (optional, but improves readability)
base_cols = [
    "dataset_id","name","title","organization","dataset_source","license_title",
    "dataset_date","last_modified","data_update_frequency","groups","tags",
    "resource_count","formats",
    "score_hazard","score_exposure","score_vulnerability_proxy","score_loss_impact",
    "rdls_components","rdls_candidate","confidence",
    "excluded_by_policy","exclusion_reason",
    "top_signals","source_file"
]
for c in base_cols:
    if c not in df.columns:
        df[c] = ""

df = df[base_cols]

df.to_csv(OUT_CLASSIFICATION_CSV, index=False, encoding="utf-8")
print("Wrote:", OUT_CLASSIFICATION_CSV)

# Included IDs = RDLS candidate AND not excluded by policy
included_ids = df.loc[(df["rdls_candidate"] == True) & (df["excluded_by_policy"] == False), "dataset_id"].dropna().tolist()
OUT_INCLUDED_IDS_TXT.write_text("\n".join(included_ids) + ("\n" if included_ids else ""), encoding="utf-8")
print("Wrote:", OUT_INCLUDED_IDS_TXT, "(", len(included_ids), "ids )")

# Summary
summary = {
    "total_datasets": int(len(df)),
    "errors": int(errors),
    "policy": {
        "osm_excluded_ids_loaded": int(len(osm_excluded)),
        "datasets_excluded_by_policy": int(df["excluded_by_policy"].sum()),
    },
    "rdls": {
        "candidates_total": int(df["rdls_candidate"].sum()),
        "included_total": int(((df["rdls_candidate"] == True) & (df["excluded_by_policy"] == False)).sum()),
    },
    "confidence_counts": df["confidence"].value_counts(dropna=False).to_dict(),
    "component_nonzero_counts": {
        comp: int((df[f"score_{comp}"] > 0).sum()) for comp in RDLS_COMPONENTS
    },
}

OUT_SUMMARY_JSON.write_text(json.dumps(summary, indent=2, ensure_ascii=False), encoding="utf-8")
print("Wrote:", OUT_SUMMARY_JSON)

summary


Wrote: C:\Users\benny\OneDrive\Documents\Github\hdx-metadata-crawler\hdx_dataset_metadata_dump\derived\classification.csv
Wrote: C:\Users\benny\OneDrive\Documents\Github\hdx-metadata-crawler\hdx_dataset_metadata_dump\derived\rdls_included_dataset_ids.txt ( 10759 ids )
Wrote: C:\Users\benny\OneDrive\Documents\Github\hdx-metadata-crawler\hdx_dataset_metadata_dump\derived\classification_summary.json


{'total_datasets': 26246,
 'errors': 0,
 'policy': {'osm_excluded_ids_loaded': 3649,
  'datasets_excluded_by_policy': 3649},
 'rdls': {'candidates_total': 13668, 'included_total': 10759},
 'confidence_counts': {'low': 12578, 'high': 7216, 'medium': 6452},
 'component_nonzero_counts': {'hazard': 4056,
  'exposure': 13671,
  'vulnerability_proxy': 12952,
  'loss_impact': 2745}}
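
The figures in this summary can be cross-checked against each other. The sketch below copies the reported numbers into a plain dict (named `reported` here to avoid shadowing the notebook's `summary`) and verifies the relationships implied by the thresholds.

```python
# Numbers copied from the summary output above.
reported = {
    "total_datasets": 26246,
    "policy": {"datasets_excluded_by_policy": 3649},
    "rdls": {"candidates_total": 13668, "included_total": 10759},
    "confidence_counts": {"low": 12578, "high": 7216, "medium": 6452},
}

conf = reported["confidence_counts"]

# Every dataset receives exactly one confidence label.
assert sum(conf.values()) == reported["total_datasets"]

# CONF_MED equals CANDIDATE_MIN_SCORE, so candidates are exactly the medium + high rows.
assert conf["high"] + conf["medium"] == reported["rdls"]["candidates_total"]

# Candidates removed by the OSM policy, and excluded datasets that were not candidates anyway.
candidates_excluded = reported["rdls"]["candidates_total"] - reported["rdls"]["included_total"]
noncandidates_excluded = reported["policy"]["datasets_excluded_by_policy"] - candidates_excluded
print(candidates_excluded, noncandidates_excluded)  # 2909 740
```

So 2,909 of the 3,649 policy-excluded datasets would otherwise have been RDLS candidates, and the remaining 740 scored below the candidate threshold.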

In [8]:
# ==========================
# Quick QA views (optional)
# ==========================

# 1) Top candidates by score
df["score_max"] = df[[f"score_{c}" for c in RDLS_COMPONENTS]].max(axis=1)
df.sort_values(["excluded_by_policy","score_max"], ascending=[True, False]).head(25)


index,dataset_id,name,title,organization,dataset_source,license_title,dataset_date,last_modified,data_update_frequency,groups,...,score_vulnerability_proxy,score_loss_impact,rdls_components,rdls_candidate,confidence,excluded_by_policy,exclusion_reason,top_signals,source_file,score_max
11675,71a5658e-d896-4d64-a8ff-a0b2b9441dcb,iom-global-profiles-for-internal-displacement-...,IOM - Global Profiles for Internal Displacemen...,International Organization for Migration (IOM),Internal Displacement Monitoring Centre (IDMC),Creative Commons Attribution International (CC...,[2018-01-01T00:00:00 TO 2024-12-31T23:59:59],2025-09-17T15:37:51.696081,Every year,World,...,9,1,hazard;vulnerability_proxy,True,high,False,,tag:cyclones-hurricanes-typhoons(+5)→hazard | ...,71a5658e-d896-4d64-a8ff-a0b2b9441dcb__iom-glob...,24
21889,d58c8a8a-5334-4d71-85d1-bfee280bd13d,maldives-critical-infrastructures,Maldives - Critical infrastructures,OCHA Regional Office for Asia and the Pacific ...,"National Disaster Management Centre, Maldives ...",Public Domain / No restrictions (CC0),[2018-07-17T00:00:00 TO 2018-07-17T23:59:59],2018-07-18T03:02:25.310152,Never,Maldives,...,4,0,exposure;vulnerability_proxy,True,high,False,,tag:facilities-infrastructure(+5)→exposure | t...,d58c8a8a-5334-4d71-85d1-bfee280bd13d__maldives...,24
1087,0b1c4dc1-6ae5-4e2d-8e32-2111d0e10697,idmc_internal_displacement_conflict-violence_d...,Global Internal Displacement Database: Conflic...,Internal Displacement Monitoring Centre (IDMC),Internal Displacement Monitoring Centre (IDMC),Creative Commons Attribution International (CC...,[2008-01-01T00:00:00 TO 2024-12-31T23:59:59],2025-05-27T21:00:18.576933,Every year,World,...,0,0,hazard,True,high,False,,tag:cyclones-hurricanes-typhoons(+5)→hazard | ...,0b1c4dc1-6ae5-4e2d-8e32-2111d0e10697__idmc-int...,23
1707,11309139-2eee-47da-a337-6771f65dbd30,saudi-arabia-transportation-network,Saudi Arabia - Transportation Network,OCHA Middle East and North Africa (ROMENA),OurAirports.com community web site. WFP SDI,Other,[2014-11-30T00:00:00 TO 2014-11-30T23:59:59],2017-03-28T16:06:44.163383,Never,Saudi Arabia,...,0,0,exposure,True,high,False,,tag:facilities-infrastructure(+5)→exposure | t...,11309139-2eee-47da-a337-6771f65dbd30__saudi-ar...,23
2520,191892b0-d982-448f-a925-45f22d30057d,algeria-transportation-network,Algeria - Transportation Network,OCHA Middle East and North Africa (ROMENA),OurAirports.com community web site. WFP SDI,Other,[2014-11-30T00:00:00 TO 2014-11-30T23:59:59],2017-03-27T13:26:32.887444,Never,Algeria,...,0,0,exposure,True,high,False,,tag:facilities-infrastructure(+5)→exposure | t...,191892b0-d982-448f-a925-45f22d30057d__algeria-...,23
11883,73cd4d78-bc1e-4a8f-b9ff-32a3fe884b3b,uae-transportation-network,United Arab Emirates - Transportation Network,OCHA Middle East and North Africa (ROMENA),OurAirports.com community web site. WFP SDI,Other,[2014-11-30T00:00:00 TO 2014-11-30T23:59:59],2017-03-28T16:34:32.117649,Never,United Arab Emirates,...,0,0,exposure,True,high,False,,tag:facilities-infrastructure(+5)→exposure | t...,73cd4d78-bc1e-4a8f-b9ff-32a3fe884b3b__uae-tran...,23
12463,7926b801-09e6-41c9-9aa8-0f775a1a961c,tunisia-transportation-network,Tunisia - Transportation Network,OCHA Middle East and North Africa (ROMENA),OurAirports.com community web site. WFP SDI,Other,[2014-11-30T00:00:00 TO 2014-11-30T23:59:59],2017-03-28T16:23:16.220118,Never,Tunisia,...,0,0,exposure,True,high,False,,tag:facilities-infrastructure(+5)→exposure | t...,7926b801-09e6-41c9-9aa8-0f775a1a961c__tunisia-...,23
11621,7117aece-abc5-47b8-b5a1-6308db53e7f8,reliefweb-disasters-list,ReliefWeb Disasters List,ReliefWeb,Multiple sources,Creative Commons Attribution International (CC...,[1981-11-26T00:00:00 TO 2025-11-05T23:59:59],2025-11-19T17:44:30.844634,Every week,World,...,0,0,hazard,True,high,False,,tag:drought(+5)→hazard | tag:earthquake-tsunam...,7117aece-abc5-47b8-b5a1-6308db53e7f8__reliefwe...,22
11846,7362ef2d-7282-459f-bc1b-0347076fcc12,bangladesh-hazards,"Bangladesh - Hazards (Drought risk, Earthquake...",OCHA Regional Office for Asia and the Pacific ...,Bangladesh Agricultural Research Council (BARC),Other,[2013-01-01T00:00:00 TO 2013-01-01T23:59:59],2021-05-20T07:36:51.560032,Never,Bangladesh,...,0,0,hazard,True,high,False,,tag:drought(+5)→hazard | tag:earthquake-tsunam...,7362ef2d-7282-459f-bc1b-0347076fcc12__banglade...,22
12972,7e2aa982-5439-4fe9-b852-f9943dcb9118,idmc-event-data-for-phl,Philippines - Internal Displacements Updates (...,Internal Displacement Monitoring Centre (IDMC),IDMC,Creative Commons Attribution for Intergovernme...,[2025-08-06T00:00:00 TO 2026-02-02T23:59:59],2026-02-02T13:41:41.600274,Every day,Philippines,...,0,0,hazard,True,high,False,,tag:cyclones-hurricanes-typhoons(+5)→hazard | ...,7e2aa982-5439-4fe9-b852-f9943dcb9118__idmc-eve...,22
