# Zero-Shot NER Toolkit Tour

This notebook walks through the zero-shot NER utilities in OpenMed:

- Building or loading the model index
- Exploring domain defaults
- Running programmatic inference with custom labels
- Converting span entities to token-level BIO/BILOU tags
- Referencing the CLI helpers for automation

## Prerequisites

Install the optional GLiNER + Hugging Face extras if you have not already:

```bash
uv pip install ".[hf,gliner]"
# or
pip install ".[hf,gliner]"
```

The examples below pull published models directly from the Hugging Face Hub, so no local index is required.

## Explore Domain Defaults

In [None]:
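# Assumption: the import path is not shown in this notebook; adjust it to your install.
from openmed.ner import NerRequest, available_domains, get_default_labels, infer, to_token_classification

domains = available_domains()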
print(f"Domains ({len(domains)}): {', '.join(domains)}")

sample_domain = 'biomedical' if 'biomedical' in domains else domains[0]
print("\nDefault labels for", sample_domain, ":", get_default_labels(sample_domain))


## Programmatic Inference

Pick a GLiNER model and run zero-shot predictions. You can override the labels explicitly or, as in the cell below, rely on the defaults for a domain.

In [None]:
model_id = 'OpenMed/OpenMed-ZeroShot-NER-Genomic-Tiny-60M'
request = NerRequest(
    model_id=model_id,
    text="Imatinib inhibits BCR-ABL in chronic myeloid leukemia patients.",
    threshold=0.5,
    domain='genomic',
)
response = infer(request)
for entity in response.entities:
    print(f"{entity.label:>12} | {entity.score:0.3f} | {entity.text}")

response.meta
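
Once a response is in hand, it is often handy to tighten the score cutoff and group the hits by label without re-running the model. The sketch below assumes only the three attributes printed above (`label`, `score`, `text`). The `Entity` dataclass and `group_by_label` helper are hypothetical stand-ins rather than part of the toolkit, and the scores are made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical stand-in for the objects in response.entities."""
    label: str
    score: float
    text: str


def group_by_label(entities, min_score=0.0):
    """Group entity texts by label, keeping only scores >= min_score."""
    grouped = defaultdict(list)
    for ent in entities:
        if ent.score >= min_score:
            grouped[ent.label].append(ent.text)
    return dict(grouped)


# Made-up scores for illustration only; not real model output.
demo = [
    Entity("Drug", 0.91, "Imatinib"),
    Entity("Gene", 0.84, "BCR-ABL"),
    Entity("Disease", 0.58, "chronic myeloid leukemia"),
]
strict = group_by_label(demo, min_score=0.8)
print(strict)
```

With the real response, `group_by_label(response.entities, min_score=0.8)` should behave the same way, provided the entity objects expose those attributes.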


## Convert to Token-Level BIO/BILOU Labels

In [None]:
adapter_result = to_token_classification(response.entities, request.text, scheme="BILOU")
for token in adapter_result.tokens[:10]:
    print(f"{token.token!r:>12} -> {token.label}")

adapter_result.metadata
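
To make the two tagging schemes concrete, here is a minimal, self-contained sketch of how character spans can be projected onto tokens. It is not the implementation behind `to_token_classification`: the `spans_to_tags` helper is hypothetical, it tokenizes on whitespace only (so trailing punctuation stays attached to its token), and it assumes non-overlapping `(start, end, label)` spans with an exclusive `end`.

```python
import re


def spans_to_tags(text, spans, scheme="BILOU"):
    """Tag whitespace tokens from (start, end, label) character spans.

    Spans are assumed non-overlapping, with `end` exclusive.
    """
    tokens = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        # Indices of the tokens that overlap this span.
        covered = [i for i, (_, s, e) in enumerate(tokens) if s < end and e > start]
        for pos, i in enumerate(covered):
            if scheme == "BILOU" and len(covered) == 1:
                prefix = "U"  # single-token entity
            elif pos == 0:
                prefix = "B"  # first token of the entity
            elif scheme == "BILOU" and pos == len(covered) - 1:
                prefix = "L"  # last token of a multi-token entity
            else:
                prefix = "I"  # inside token (or any continuation under BIO)
            tags[i] = f"{prefix}-{label}"
    return [(tok, tag) for (tok, _, _), tag in zip(tokens, tags)]


text = "Imatinib inhibits BCR-ABL in chronic myeloid leukemia patients."
# Hand-written spans for illustration; not real model output.
spans = [(0, 8, "Drug"), (18, 25, "Gene"), (29, 53, "Disease")]

bilou = spans_to_tags(text, spans, scheme="BILOU")
bio = spans_to_tags(text, spans, scheme="BIO")
for (token, bilou_tag), (_, bio_tag) in zip(bilou, bio):
    print(f"{token!r:>12} -> {bilou_tag:<10} {bio_tag}")
```

The only difference between the schemes is at entity boundaries: BILOU marks single-token entities with `U-` and the final token of longer entities with `L-`, while BIO uses `B-` and `I-` for those positions.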

## CLI Cheat Sheet

Run these commands from a shell:

```bash
# Build or refresh the model index (uses $OPENMED_ZEROSHOT_MODELS_DIR)
python -m ner_tools.index

# Inspect default labels
python -m ner_tools.labels dump-defaults --domain biomedical

# Run inference with custom labels
python -m ner_tools.infer \
  --model-id gliner-biomed-tiny \
  --text "Imatinib inhibits BCR-ABL in CML." \
  --threshold 0.55 \
  --labels Drug,Gene,Disease

# Optional smoke test across a few GLiNER models
python scripts/smoke_gliner.py --limit 3 --adapter
```
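
For automation, the inference command above can also be assembled from Python. This is a sketch under stated assumptions: `build_infer_command` and `run_infer` are hypothetical helpers, only the flags shown in the cheat sheet are used, and the source does not confirm that the CLI writes results to stdout or exits non-zero on failure.

```python
import shlex
import subprocess
import sys


def build_infer_command(model_id, text, labels, threshold=0.5):
    """Assemble argv for the `ner_tools.infer` call shown in the cheat sheet."""
    return [
        sys.executable, "-m", "ner_tools.infer",
        "--model-id", model_id,
        "--text", text,
        "--threshold", str(threshold),
        "--labels", ",".join(labels),
    ]


def run_infer(cmd):
    """Run the command; check=True raises CalledProcessError on a non-zero exit.

    Assumption: the CLI prints its results to stdout.
    """
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


cmd = build_infer_command(
    "gliner-biomed-tiny",
    "Imatinib inhibits BCR-ABL in CML.",
    ["Drug", "Gene", "Disease"],
    threshold=0.55,
)
print(shlex.join(cmd))  # inspect the command before wiring it into a job
```

Passing the arguments as a list avoids shell quoting problems with the `--text` value; `run_infer(cmd)` is left uncalled here because it needs the toolkit installed.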

## Next Steps

- Extend `label_maps/defaults.json` with new domains and labels.
- Integrate the `infer` response into evaluation dashboards or downstream automation.
- Wrap CLI commands into CI jobs to keep zero-shot coverage healthy.
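
As a starting point for the evaluation idea above, the following sketch scores predicted spans against a gold set using exact-match precision, recall, and F1. The `span_prf` helper is hypothetical, spans are plain `(start, end, label)` tuples, and how you derive those tuples from the `infer` response depends on attributes not shown in this notebook.

```python
def span_prf(predicted, gold):
    """Exact-match precision, recall, and F1 over (start, end, label) spans."""
    pred_set, gold_set = set(predicted), set(gold)
    tp = len(pred_set & gold_set)  # spans that match on boundaries and label
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


# Hand-written spans for illustration; not real model output.
gold = [(0, 8, "Drug"), (18, 25, "Gene"), (29, 53, "Disease")]
predicted = [(0, 8, "Drug"), (18, 25, "Gene"), (37, 53, "Disease")]

p, r, f1 = span_prf(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Exact matching is strict: the third prediction has the right label but the wrong start offset, so it counts as both a false positive and a missed gold span.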