Phylogenies

Bert Bogaerts edited this page Sep 23, 2025 · 2 revisions

This page explains how to use the mist dists command to build (cg)MLST-based allele and distance matrices from allele call files.

Overview

mist dists takes as input allele call files (TSV or JSON), filters loci and datasets according to quality thresholds, and generates:

  • A filtered allele matrix (allele_matrix.tsv)
  • A pairwise distance matrix (distances.tsv)

These outputs can be processed and visualized in tools such as GrapeTree to construct minimum spanning trees.

Input files

mist dists accepts allele call files in TSV or JSON format, as generated by MiST.

Note: At least three datasets are required.

Basic usage

mist dists sample1.tsv sample2.tsv sample3.tsv

The same command works with JSON files:

mist dists sample1.json sample2.json sample3.json

You can mix TSV and JSON inputs:

mist dists sample1.tsv sample2.json sample3.json

Command-line options

Options:
  -d, --out-dists PATH            Distance matrix output  [default: distances.tsv]
  -m, --out-matrix PATH           Allele matrix output  [default: allele_matrix.tsv]
  -l, --min-perc-loci INTEGER     Minimum percentage of loci that should be present in a dataset  [default: 90]
  -s, --min-perc-samples INTEGER  Minimum percentage of datasets where loci should be present  [default: 90]
  --debug                         Enable debug mode
  --log PATH                      Save log to this file
  --help                          Show this message and exit.
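
To make the two percentage thresholds concrete, the following Python sketch filters a small set of allele profiles. It is not MiST's implementation: the function name `filter_profiles`, the order of the two steps (datasets first, then loci), and the use of `-` for a missing call are all assumptions made for illustration.

```python
def filter_profiles(profiles, min_perc_loci=90, min_perc_samples=90, missing="-"):
    """Filter datasets, then loci, by percentage of allele calls present.

    profiles maps dataset name -> {locus name -> allele call}.
    The filtering order used here is an assumption, not MiST's documented behavior.
    """
    loci = sorted({locus for calls in profiles.values() for locus in calls})
    if not loci:
        return {}

    def is_called(sample, locus):
        return profiles[sample].get(locus, missing) != missing

    # Keep datasets in which enough loci have an allele call.
    kept_samples = [
        sample for sample in profiles
        if 100 * sum(is_called(sample, locus) for locus in loci) / len(loci)
        >= min_perc_loci
    ]
    if not kept_samples:
        return {}

    # Keep loci that are called in enough of the remaining datasets.
    kept_loci = [
        locus for locus in loci
        if 100 * sum(is_called(sample, locus) for sample in kept_samples)
        / len(kept_samples) >= min_perc_samples
    ]

    return {
        sample: {locus: profiles[sample].get(locus, missing) for locus in kept_loci}
        for sample in kept_samples
    }
```

Under these assumptions, raising `min_perc_loci` removes more low-quality datasets, while raising `min_perc_samples` removes more loci that are frequently missing.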

Output files

Filtered allele matrix (allele_matrix.tsv)

A matrix of allele calls after filtering datasets and loci.

Example:

ID        SAUR0001 SAUR0002 SAUR0003
sample1   1        2        1
sample2   1        -        2
sample3   1        2        1
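
For downstream scripting, the matrix can be loaded with Python's standard `csv` module. The sketch below assumes the file is tab-delimited (as the `.tsv` extension suggests), that the first column holds the dataset ID, and that `-` marks a missing call as in the example above. The helper names are hypothetical.

```python
import csv
import io


def read_allele_matrix(handle):
    """Return (loci, profiles) from a tab-delimited allele matrix."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    loci = header[1:]  # first column is the dataset ID
    profiles = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        profiles[row[0]] = dict(zip(loci, row[1:]))
    return loci, profiles


def count_missing(profiles, missing="-"):
    """Count missing allele calls per dataset ('-' is an assumed marker)."""
    return {
        sample: sum(call == missing for call in calls.values())
        for sample, calls in profiles.items()
    }


# In-memory stand-in for allele_matrix.tsv, mirroring the example above.
example = io.StringIO(
    "ID\tSAUR0001\tSAUR0002\tSAUR0003\n"
    "sample1\t1\t2\t1\n"
    "sample2\t1\t-\t2\n"
    "sample3\t1\t2\t1\n"
)
loci, profiles = read_allele_matrix(example)
print(count_missing(profiles))
```

To read a real file, replace the `io.StringIO` object with `open("allele_matrix.tsv", newline="")`.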

Pairwise distance matrix (distances.tsv)

A symmetric matrix of allelic distances between datasets.

Example (illustrative values, not computed from the allele matrix example above):

ID       sample1  sample2  sample3
sample1  0        2        1
sample2  2        0        1
sample3  1        1        0
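
As a rough illustration of what an allelic distance is, the Python sketch below counts the loci at which two profiles carry different alleles. It is not MiST's implementation. In particular, ignoring loci where either dataset has a missing call (`-`) is an assumption, and MiST may handle missing data differently.

```python
from itertools import combinations


def allelic_distance(a, b, missing="-"):
    """Count shared loci where both profiles are called and the alleles differ."""
    return sum(
        1
        for locus in a.keys() & b.keys()
        if a[locus] != missing and b[locus] != missing and a[locus] != b[locus]
    )


def distance_matrix(profiles, missing="-"):
    """Build a symmetric nested dict of pairwise allelic distances."""
    names = list(profiles)
    dist = {n: {m: 0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = allelic_distance(profiles[n], profiles[m], missing)
        dist[n][m] = dist[m][n] = d  # fill both halves to keep it symmetric
    return dist
```

The diagonal is always zero and the matrix is symmetric by construction, matching the general shape of distances.tsv.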

Building the phylogeny

You can construct a phylogeny from the filtered allele matrix using GrapeTree:

grapetree --profile allele_matrix.tsv --method MSTreeV2

Note that GrapeTree is not included in the MiST installation, but it can be installed with pip:

pip install grapetree
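
If you want to run this step from a script, the GrapeTree command shown above can be wrapped in Python. This is a minimal sketch that assumes `grapetree` is on your PATH and writes the resulting Newick tree to standard output. The output filename `tree.nwk` and both function names are hypothetical.

```python
import subprocess


def grapetree_command(profile="allele_matrix.tsv", method="MSTreeV2"):
    """Build the argument list for the GrapeTree call shown above."""
    return ["grapetree", "--profile", profile, "--method", method]


def build_tree(profile="allele_matrix.tsv", newick_out="tree.nwk", method="MSTreeV2"):
    """Run GrapeTree and save its standard output (assumed to be Newick)."""
    result = subprocess.run(
        grapetree_command(profile, method),
        check=True,           # raise CalledProcessError if GrapeTree fails
        capture_output=True,
        text=True,
    )
    with open(newick_out, "w") as handle:
        handle.write(result.stdout)
    return newick_out
```

Calling `build_tree()` in the directory containing allele_matrix.tsv would then leave the tree in `tree.nwk`, provided the assumptions above hold.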