# **Multilocus Sequence Typing (MLST)**

- **Analysis type:** MLST
- **Input data:** Assembled genomes (FASTA)
- **Organism:** *Acinetobacter baumannii*
- **Typing scheme:** Pasteur MLST (mlst scheme `abaumannii_2`)

Following genome assembly, **multilocus sequence typing (MLST)** was performed to assign a sequence type (ST) to each isolate. An ST is defined by the combination of alleles found at a set of housekeeping genes specified in a species-specific typing scheme. Because STs are standardised, isolates can be compared across studies and used in population structure analysis.

Two MLST schemes exist for *Acinetobacter baumannii*, Oxford and Pasteur. The results reported here use the Pasteur scheme: the scheme column of the output reads `abaumannii_2` and the allele names carry the `Pas_` prefix.
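To illustrate how an allelic combination maps to an ST, the sketch below looks up a seven-allele profile in a small table. The two rows are the profiles that appear in the output shown later in this document (ST10 and ST2, alleles in the order cpn60, fusA, gltA, pyrG, recA, rplB, rpoB). The function name `lookup_st` and the inline table are hypothetical and only illustrate the idea; mlst itself works from the full PubMLST profile definitions.

```shell
# Hypothetical two-row profile table: ST followed by the alleles for
# cpn60 fusA gltA pyrG recA rplB rpoB (values taken from the output in this document).
profiles='10 1 3 2 1 4 4 4
2 2 2 2 2 2 2 2'

# lookup_st: print the ST whose profile matches the seven allele numbers
# given as arguments, or "-" when no profile in the table matches.
lookup_st() {
    printf '%s\n' "$profiles" | awk -v query="$*" '
        {
            st = $1
            $1 = ""            # blank the ST column; awk rebuilds $0 from the other fields
            sub(/^ /, "")      # drop the leading separator left by the blanked column
            if ($0 == query) { print st; found = 1; exit }
        }
        END { if (!found) print "-" }'
}
```

For example, `lookup_st 2 2 2 2 2 2 2` prints `2`, while a profile absent from the table prints `-`.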

MLST was performed using **mlst (2.23.0)** (Torsten Seemann), a command-line tool that assigns sequence types by querying assembled genome contigs against curated PubMLST schemes.
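When no scheme is specified, mlst selects the best-matching scheme automatically, so it can be worth confirming which scheme was actually reported. The sketch below counts the values in column 2 of a results table. The helper name `scheme_counts` is hypothetical. mlst's bundled database is understood to name the Oxford scheme `abaumannii` and the Pasteur scheme `abaumannii_2`; this can be confirmed with `mlst --longlist`.

```shell
# scheme_counts: count how many rows of an mlst TSV report each scheme.
# Input:  path to a tab-separated mlst results file (scheme is column 2).
# Output: scheme <TAB> number of rows.
scheme_counts() {
    cut -f2 "$1" | sort | uniq -c | awk '{ print $2 "\t" $1 }'
}
```

If a specific scheme is required, it can be requested explicitly with the `--scheme` option rather than relying on auto-detection.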

In [12]:
%%bash

# initialise conda
source /home/anaconda/miniconda3/etc/profile.d/conda.sh
conda activate mlst

# check mlst
mlst --version


mlst 2.23.0


## Input Files

The MLST analysis was performed on the assembled genome contigs produced by the GHRU assembly pipeline.

### Input requirements

- Assembled genomes in FASTA format
- One assembly per sample
- Assemblies must be free of contamination and of sufficient quality
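A basic format check can catch empty or malformed files before typing. The sketch below is a minimal example with a hypothetical helper, `check_fasta`; it only tests that each file is non-empty and starts with a FASTA header, and counts contigs. It does not assess contamination or assembly quality, which need dedicated tools.

```shell
# check_fasta: report whether each given file looks like a usable FASTA assembly.
# Output per file: path <TAB> status <TAB> number of contigs (lines starting with ">").
check_fasta() {
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            # file is missing or has zero size
            printf '%s\tEMPTY_OR_MISSING\t0\n' "$f"
        elif [ "$(head -c 1 "$f")" != ">" ]; then
            # first character is not a FASTA header marker
            printf '%s\tNOT_FASTA\t0\n' "$f"
        else
            printf '%s\tOK\t%s\n' "$f" "$(grep -c '^>' "$f")"
        fi
    done
}
```

A possible use is `check_fasta "$ASSEMBLY_DIR"/*.short.fasta`, reviewing any row whose status is not `OK`.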


## MLST Execution

The analysis was run in a Conda-managed environment. Standard output was written to a results file, and standard error was captured in a log file for reproducibility.

In [9]:
%%bash

# initialise conda
source /home/anaconda/miniconda3/etc/profile.d/conda.sh
conda activate mlst

# define paths
ASSEMBLY_DIR=/data/internship_data/nidhi/aba/new_output/nextflow_output/assemblies
MLST_OUTDIR=/data/internship_data/nidhi/aba/new_output/mlst_output
LOGDIR=/data/internship_data/nidhi/aba/new_output/logs

# create required directories
mkdir -p "$MLST_OUTDIR" "$LOGDIR"

# run MLST: results (stdout) go to a TSV file, messages (stderr) go to a log
mlst "$ASSEMBLY_DIR"/*.short.fasta \
  > "$MLST_OUTDIR/mlst_results.tsv" \
  2> "$LOGDIR/mlst.log"
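After the run, a quick sanity check is to confirm that every assembly produced a row and to list isolates that did not receive an ST. The sketch below assumes the default tab-separated mlst output, in which the ST column (column 3) holds `-` when no exact ST could be assigned. The helper names `untyped_isolates` and `count_rows` are hypothetical.

```shell
# untyped_isolates: list the input files whose ST column (column 3) is "-",
# i.e. isolates for which no exact sequence type was assigned.
untyped_isolates() {
    awk -F'\t' '$3 == "-" { print $1 }' "$1"
}

# count_rows: number of result rows, for comparison with the number of input assemblies.
count_rows() {
    awk 'END { print NR }' "$1"
}
```

The value printed by `count_rows "$MLST_OUTDIR/mlst_results.tsv"` should match the number of `*.short.fasta` files in `$ASSEMBLY_DIR`; a mismatch suggests checking `mlst.log`.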

## MLST Output and Downstream Processing

The MLST tool writes a single primary output: a tab-separated table without a header row, with one row per assembly giving the input file, the matched scheme, the sequence type (ST), and the allele call at each locus. All other files, including ST distributions, allele variation summaries, International Clone (IC) assignments, and visualizations, were derived from this table during downstream processing.
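As an illustration of this kind of downstream processing, the sketch below tabulates the number of isolates per ST from the results table. It is a minimal example with a hypothetical helper, `st_distribution`, and is not necessarily how the ST distribution files in this analysis were produced.

```shell
# st_distribution: tabulate isolates per ST from an mlst TSV, most common ST first.
# Input:  path to a tab-separated mlst results file (ST is column 3).
# Output: ST <TAB> number of isolates.
st_distribution() {
    cut -f3 "$1" \
        | sort \
        | uniq -c \
        | sort -k1,1nr -k2,2 \
        | awk '{ print $2 "\t" $1 }'
}
```

It could be applied to the results file as `st_distribution /data/internship_data/nidhi/aba/new_output/mlst_output/mlst_results.tsv`.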


In [11]:
%%bash
column -t /data/internship_data/nidhi/aba/new_output/mlst_output/mlst_results.tsv | head -n 5

/data/internship_data/nidhi/aba/new_output/nextflow_output/assemblies/ABA-1000.short.fasta  abaumannii_2  10    Pas_cpn60(1)  Pas_fusA(3)  Pas_gltA(2)   Pas_pyrG(1)  Pas_recA(4)  Pas_rplB(4)  Pas_rpoB(4)
/data/internship_data/nidhi/aba/new_output/nextflow_output/assemblies/ABA-1001.short.fasta  abaumannii_2  2     Pas_cpn60(2)  Pas_fusA(2)  Pas_gltA(2)   Pas_pyrG(2)  Pas_recA(2)  Pas_rplB(2)  Pas_rpoB(2)
/data/internship_data/nidhi/aba/new_output/nextflow_output/assemblies/ABA-1002.short.fasta  abaumannii_2  2     Pas_cpn60(2)  Pas_fusA(2)  Pas_gltA(2)   Pas_pyrG(2)  Pas_recA(2)  Pas_rplB(2)  Pas_rpoB(2)
/data/internship_data/nidhi/aba/new_output/nextflow_output/assemblies/ABA-1003.short.fasta  abaumannii_2  2     Pas_cpn60(2)  Pas_fusA(2)  Pas_gltA(2)   Pas_pyrG(2)  Pas_recA(2)  Pas_rplB(2)  Pas_rpoB(2)
/data/internship_data/nidhi/aba/new_output/nextflow_output/assemblies/ABA-1004.short.fasta  abaumannii_2  2     Pas_cpn60(2)  Pas_fusA(2)  Pas_gltA(2)   Pas_pyrG(2)  Pas_recA(2)  Pas_r