Tutorial
This tutorial demonstrates how to download, index, and query the Acinetobacter baumannii PubMLST MLST scheme using MiST.
Download the A. baumannii MLST scheme from PubMLST.org.
Note: You can retrieve the download URLs of available schemes with the mist list command (see Listing available schemes).
mist download \
--downloader bigsdb \
--url https://rest.pubmlst.org/db/pubmlst_abaumannii_seqdef/schemes/1 \
--output mlst \
--include-profiles
After completion, the mlst directory will contain:
fasta_list.txt
Oxf_cpn60.fasta
Oxf_gdhB.fasta
Oxf_gltA.fasta
Oxf_gpi.fasta
Oxf_gyrB.fasta
Oxf_recA.fasta
Oxf_rpoD.fasta
profiles.tsv
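Before indexing, it can help to confirm that the download is complete. The following is a minimal sketch, not part of MiST: the function name check_scheme_dir and the variable expected_files are illustrative, and the file list is simply copied from the listing above.

```shell
# Files the tutorial lists for this scheme (copied from the listing above).
expected_files="fasta_list.txt Oxf_cpn60.fasta Oxf_gdhB.fasta Oxf_gltA.fasta Oxf_gpi.fasta Oxf_gyrB.fasta Oxf_recA.fasta Oxf_rpoD.fasta profiles.tsv"

# Hypothetical helper: report every expected file that is absent or empty
# in the given directory, and return non-zero if any is missing.
check_scheme_dir() {
  dir=$1
  missing=0
  for f in $expected_files; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $f"
      missing=1
    fi
  done
  return "$missing"
}

# Only run the check when the tutorial's output directory exists.
if [ -d mlst ]; then
  check_scheme_dir mlst && echo "scheme download looks complete"
fi
```

The check uses -s rather than -e so that a zero-byte file left behind by an interrupted download is also reported.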
Build an index from the downloaded scheme:
mist index \
mlst/*.fasta \
--profiles mlst/profiles.tsv \
--output mlst_idx
The mlst_idx directory should now contain:
├── loci_repr.fasta
├── locirepr.fasta.mni
├── loci.txt
├── Oxf_cpn60/
├── Oxf_gdhB/
├── Oxf_gltA/
├── Oxf_gpi/
├── Oxf_gyrB/
├── Oxf_recA/
├── Oxf_rpoD/
└── profiles.tsv
Download an A. baumannii genome from ENA/NCBI (or use your own FASTA file):
curl -L -o GCA_900020545.1.fasta \
"https://www.ebi.ac.uk/ena/browser/api/fasta/GCA_900020545.1?download=true&gzip=false"
Call the alleles:
mist call \
--db mlst_idx/ \
--fasta GCA_900020545.1.fasta \
--out-json results.json \
--out-tsv results.tsv \
-t 4
During the run, MiST logs the number of detected loci and the assigned ST:
2025-XX-XX 00:00:00 - mist_query - INFO - Detected 7/7 loci (100.00%), including 0 (potential) novel alleles
2025-XX-XX 00:00:00 - mist_query - INFO - Matching ST: 1567 (100.00% match)
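To type several assemblies against the same index, the call step can be wrapped in a loop. This is a sketch under assumptions: call_all, MIST_BIN, genome_dir, and out_dir are illustrative names that do not come from MiST, and only the options already used in this tutorial are passed to mist call.

```shell
# Hypothetical wrapper: run the call step for every *.fasta file in a
# directory and write one JSON and one TSV per genome.
# Set MIST_BIN=echo for a dry run that only prints the commands.
call_all() {
  db=$1
  genome_dir=$2
  out_dir=$3
  mkdir -p "$out_dir"
  for fasta in "$genome_dir"/*.fasta; do
    # If nothing matched, the glob stays literal; skip it.
    [ -e "$fasta" ] || continue
    name=$(basename "$fasta" .fasta)
    "${MIST_BIN:-mist}" call \
      --db "$db" \
      --fasta "$fasta" \
      --out-json "$out_dir/$name.json" \
      --out-tsv "$out_dir/$name.tsv" \
      -t 4
  done
}

# Example (commented out so that sourcing this file runs nothing):
# call_all mlst_idx/ genomes results
```

Running it first with MIST_BIN=echo shows the exact commands and output paths without starting any analysis.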
The results.json file contains detailed information about allele calls, alignments, and the assigned sequence type. Example (truncated):
{
"alleles": {
"...": {},
"Oxf_recA": {
"allele_str": "11",
"allele_results": [
{
"allele": "11",
"alignment": {
"seq_id": "ENA|FITR01000016|FITR01000016.1",
"start": 114660,
"end": 115030,
"strand": "-"
},
"sequence": null,
"closest_alleles": null
}
],
"tags": []
},
"Oxf_rpoD": {
"allele_str": "5",
"allele_results": [
{
"allele": "5",
"alignment": {
"seq_id": "ENA|FITR01000063|FITR01000063.1",
"start": 24250,
"end": 24762,
"strand": "+"
},
"sequence": null,
"closest_alleles": null
}
],
"tags": []
}
},
"profile": {
"name": "1567",
"metadata": [
[
"ST",
"1567"
],
[
"clonal_complex",
"n/a"
],
[
"species",
"Acinetobacter baumannii"
]
],
"alleles": {
"Oxf_gltA": "10",
"...": "..."
},
"pct_match": 100.0
},
"metadata": {
"timestamp": "2025-XX-XXT00:00:00",
"tool_version": "0.0.1"
}
}
For a simplified view, the results.tsv file lists the allele calls per locus:
locus allele is_novel
Oxf_cpn60 4 False
Oxf_gdhB 182 False
Oxf_gltA 10 False
Oxf_gpi 100 False
Oxf_gyrB 12 False
Oxf_recA 11 False
Oxf_rpoD 5 False
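The per-locus table can be condensed into a single profile line with awk. The sketch below assumes the three-column layout shown above (locus, allele, is_novel) with one header row. The function name summarize_calls is illustrative, and any is_novel value other than False is reported as novel, because the exact value MiST writes for novel alleles is not shown in this tutorial.

```shell
# Hypothetical helper: print the allele profile on one line and list the
# loci whose is_novel column is anything other than "False".
summarize_calls() {
  awk 'NR > 1 {
         # Append "locus=allele", space-separated.
         profile = profile (profile == "" ? "" : " ") $1 "=" $2
         # Collect loci that are not explicitly marked as known alleles.
         if ($3 != "False") novel = novel (novel == "" ? "" : ",") $1
       }
       END {
         print "profile: " profile
         print "novel loci: " (novel == "" ? "none" : novel)
       }' "$1"
}

# Only summarize when the tutorial's output file exists.
if [ -f results.tsv ]; then
  summarize_calls results.tsv
fi
```

Because awk splits on any run of whitespace by default, the function works whether the columns are separated by tabs or by spaces.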