# Usage guide for DeepMod2 model inference


# DATE: Sept. 24, 2025

Inference instructions from https://github.com/WGLab/DeepMod2/tree/main

1. Basecall your FAST5/POD5 files with Dorado (using --emit-moves) or Guppy (using --bam_out --moves_out) parameters to get a BAM file with move tables:

dorado basecaller MODEL INPUT_DIR --emit-moves > basecall.bam

Make sure to use the appropriate Guppy/Dorado model for your sequencing kit. You can supply a reference genome to Guppy/Dorado to get aligned BAM files, or use minimap2 to align these BAM files later.

2. (Optional but recommended) Align basecalled reads to a reference genome while retaining the move tables:

samtools fastq basecall.bam -T mv,ts | minimap2 -ax map-ont ref.fa - -y -t NUM_THREADS |samtools view -o aligned.bam

3. Run DeepMod2 by providing the BAM file and the folder containing FAST5 or POD5 signal files as inputs. If the BAM file is aligned, you can provide a reference FASTA file to get reference-anchored methylation calls and per-site frequencies. Specify the model you want to use and the file type of the raw signal files. Use multiple cores and/or GPUs for speedup.

a) If using an aligned BAM file input:

python PATH_TO_DEEPMOD2_REPOSITORY/deepmod2 detect --bam reads.bam --input INPUT_DIR --model MODEL --file_type FILE_TYPE --threads NUM_THREADS --ref ref.fa --output MOD_CALLS

b) If using an unaligned BAM file input:

python PATH_TO_DEEPMOD2_REPOSITORY/deepmod2 detect --bam reads.bam --input INPUT_DIR --model MODEL --file_type FILE_TYPE --threads NUM_THREADS --output MOD_CALLS

This will give you a per-read prediction text file MOD_CALLS/output.per_read, a per-site prediction file MOD_CALLS/output.per_site, a per-site prediction file with both strands aggregated MOD_CALLS/output.per_site.aggregated, and a methylation annotated BAM file MOD_CALLS/output.bam.

4. Visualize the annotated BAM file produced by DeepMod2 in IGV. In IGV, select 'Color alignments by' and 'base modifications (5mC)'. The following steps will allow you to open the tagged BAM file in IGV:

a) If an aligned BAM is given to DeepMod2, you only need to sort and index the DeepMod2 methylation tagged BAM file:

samtools sort MOD_CALLS/output.bam -o MOD_CALLS/final.bam --write-index

b) If an unaligned BAM is given to DeepMod2, first align the DeepMod2 methylation tagged BAM file (while preserving methylation tags MM and ML), then sort and index it:

samtools fastq MOD_CALLS/output.bam -T MM,ML,mv,ts| minimap2 -ax map-ont ref.fa - -y -t NUM_THREADS |samtools sort -o MOD_CALLS/final.bam --write-index


<!-- Please refer to Usage.md for details on how to use DeepMod2. -->

# Step-by-step

1. Basecall your FAST5/POD5 files with Dorado (using --emit-moves) or Guppy (using --bam_out --moves_out) parameters to get a BAM file with move tables:


**Make sure to use the appropriate Guppy/Dorado model for your sequencing kit. You can supply a reference genome to Guppy/Dorado to get aligned BAM files, or use minimap2 to align these BAM files later.**


dorado basecaller MODEL INPUT_DIR --emit-moves > basecall.bam

In [1]:
%%bash
# dorado basecaller MODEL INPUT_DIR --emit-moves > basecall.bam 

# -p keeps the cell re-runnable when the output folder already exists.
mkdir -p ./5mCG
dorado basecaller \
  /home/michalula/software/dna_r9.4.1_e8_sup@v3.3 \
  /mnt/faststorage/michalula/T_cells/20250908_Day_35_T_CRoff_nCATs/20250908_1810_MN31715_FBD90703_2c0c1ee5/pod5 \
  --modified-bases 5mCG \
  --emit-moves \
  > ./5mCG/20250908_Day35_CROFF_Tcells_2Libraries_Minion_R9.dna_r9.4.1_e8_sup@v3.3.5mCG.bam

[2025-09-24 07:59:15.305] [info] Running: "basecaller" "/home/michalula/software/dna_r9.4.1_e8_sup@v3.3" "/mnt/faststorage/michalula/T_cells/20250908_Day_35_T_CRoff_nCATs/20250908_1810_MN31715_FBD90703_2c0c1ee5/pod5" "--modified-bases" "5mCG" "--emit-moves"
[2025-09-24 07:59:15.315] [info]  - downloading dna_r9.4.1_e8_sup@v3.3_5mCG@v0.1 with httplib
[2025-09-24 07:59:16.641] [info] > Creating basecall pipeline
[2025-09-24 07:59:17.398] [info] Calculating optimized batch size for GPU "NVIDIA GeForce RTX 4090" and model dna_r9.4.1_e8_sup@v3.3. Full benchmarking will run for this device, which may take some time.
[2025-09-24 07:59:20.781] [info] cuda:0 using chunk size 10000, batch size 768
[2025-09-24 07:59:21.256] [info] cuda:0 using chunk size 5000, batch size 1344
[2025-09-24 08:17:25.233] [info] > Finished in (ms): 1083235
[2025-09-24 08:17:25.233] [info] > Simplex reads basecalled: 124898
[2025-09-24 08:17:25.233] [info] > Basecalled @ Samples/s: 1.509058e+07
[2025-09-24 08:17:25.24
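
DeepMod2 needs the move tables, so it can be worth confirming that the basecalled BAM actually carries `mv` tags before moving on. The sketch below is one way to do that, assuming the move table is stored as a `mv:B:c` tag on each record; the helper name `has_move_table` and the file name `basecall.bam` are placeholders, not part of Dorado or DeepMod2.

```shell
# Succeeds if any SAM record read from stdin carries a move table (mv:B:c) tag.
has_move_table() {
    tab=$(printf '\t')
    grep -q "${tab}mv:B:c,"
}

# Hypothetical usage on the basecalled BAM; skipped when samtools or the file is absent.
bam=basecall.bam
if command -v samtools >/dev/null 2>&1 && [ -f "$bam" ]; then
    if samtools view "$bam" | head -n 100 | has_move_table; then
        echo "move tables found in $bam"
    else
        echo "no mv tag found in $bam; re-run basecalling with --emit-moves" >&2
    fi
fi
```

Only the first 100 records are inspected, which keeps the check fast on large files.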

2. (Optional but recommended) Align basecalled reads to a reference genome while retaining the move tables:



In [2]:
%%bash
# `conda activate` needs conda's shell hook, which a non-interactive %%bash shell does not load.
source "$(conda info --base)/etc/profile.d/conda.sh"
conda activate deepmod2

# samtools fastq basecall.bam -T mv,ts | minimap2 -ax map-ont ref.fa - -y -t NUM_THREADS |samtools view -o aligned.bam

# Bash assignments take no spaces around `=`.
ref_fa="/home/michalula/data/ref_genomes/t2t_v2_0/chm13v2.0.fa"

samtools fastq ./5mCG/20250908_Day35_CROFF_Tcells_2Libraries_Minion_R9.dna_r9.4.1_e8_sup@v3.3.5mCG.bam -T mv,ts | minimap2 -ax map-ont "$ref_fa" - -y -t 10 | samtools view -o aligned.bam

The first run of this cell used a bare `conda activate deepmod2` and `ref_fa = "..."` (with spaces around `=`) and failed with the errors below. The environment was never activated, so minimap2 was not on PATH, and bash read `ref_fa` as a command name:

CondaError: Run 'conda init' before 'conda activate'

bash: line 5: ref_fa: command not found
bash: line 8: minimap2: command not found
[main_samview] fail to read the header from "-".

The cell then exited with a CalledProcessError (non-zero exit status 1).
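
A missing tool such as minimap2 only surfaces in the middle of the pipe, after samtools has already started. A small preflight check can report this up front. The following is a minimal sketch; the function name `require_tools` is illustrative and not part of any of the tools used here.

```shell
# Report every missing tool, then fail once so nothing runs half-configured.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing tool: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Tools used in the alignment step above.
if require_tools samtools minimap2; then
    echo "all tools found"
else
    echo "activate the environment that provides them before aligning" >&2
fi
```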

3. Run DeepMod2 by providing BAM file and the folder containing FAST5 or POD5 signal files as inputs. You can provide reference FASTA file to get reference anchored methylation calls and per-site frequencies if the BAM file is aligned. Specify the model you want to use and the file type of raw signal files. Use multiple cores and/or GPUs for speedup.

a) If using an aligned BAM file input:

python PATH_TO_DEEPMOD2_REPOSITORY/deepmod2 detect --bam reads.bam --input INPUT_DIR --model MODEL --file_type FILE_TYPE --threads NUM_THREADS --ref ref.fa --output MOD_CALLS

b) If using an unaligned BAM file input:

python PATH_TO_DEEPMOD2_REPOSITORY/deepmod2 detect --bam reads.bam --input INPUT_DIR --model MODEL --file_type FILE_TYPE --threads NUM_THREADS --output MOD_CALLS

This will give you a per-read prediction text file MOD_CALLS/output.per_read, a per-site prediction file MOD_CALLS/output.per_site, a per-site prediction file with both strands aggregated MOD_CALLS/output.per_site.aggregated, and a methylation annotated BAM file MOD_CALLS/output.bam.
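
The two variants above differ only in the `--ref` argument. The dry-run sketch below makes that explicit by printing the command instead of running it. The function name `deepmod2_cmd` is hypothetical, and every upper-case value is the placeholder from the usage text rather than a real path.

```shell
# Print the detect command for a given BAM; --ref is added only when a reference is passed.
# All upper-case values are placeholders from the usage text, not real paths.
deepmod2_cmd() {
    bam=$1
    ref=${2:-}
    cmd="python PATH_TO_DEEPMOD2_REPOSITORY/deepmod2 detect --bam $bam --input INPUT_DIR --model MODEL --file_type FILE_TYPE --threads NUM_THREADS"
    if [ -n "$ref" ]; then
        cmd="$cmd --ref $ref"
    fi
    echo "$cmd --output MOD_CALLS"
}

deepmod2_cmd aligned.bam ref.fa   # variant a), aligned input
deepmod2_cmd basecall.bam         # variant b), unaligned input
```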



In [None]:
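%%bash

# Load conda's shell hook so `conda activate` works in a non-interactive %%bash shell.
source "$(conda info --base)/etc/profile.d/conda.sh"
conda activate deepmod2

# Check that this path is the repository root, i.e. the folder that holds the `deepmod2` script.
PATH_TO_DEEPMOD2_REPOSITORY=/home/michalula/code/epiCausality/epiCode/analyze_ont_data/DeepMod2/deepmod2
# INPUT_DIR=/home/michalula/code/epiCausality/epiCode/analyze_ont_data/DeepMod2/5mCG
# INPUT_DIR=/home/michalula/code/epiCausality/epiCode/analyze_ont_data/DeepMod2
INPUT_DIR=/mnt/faststorage/michalula/T_cells/20250908_Day_35_T_CRoff_nCATs/20250908_1810_MN31715_FBD90703_2c0c1ee5/pod5
FILE_TYPE=pod5
# This is the Dorado basecalling model name from step 1. Confirm that DeepMod2 accepts it,
# or replace it with one of the model names listed in the DeepMod2 README.
MODEL=dna_r9.4.1_e8_sup@v3.3_5mCG
NUM_THREADS=10
MOD_CALLS=./5mCG/deepmod2_results
# Aligned BAM written in step 2 and the reference it was aligned to.
BAM=aligned.bam
REF_FA=/home/michalula/data/ref_genomes/t2t_v2_0/chm13v2.0.fa

# Shell variables must be expanded with `$`; bare names are passed as literal strings.
python "$PATH_TO_DEEPMOD2_REPOSITORY/deepmod2" detect --bam "$BAM" --input "$INPUT_DIR" --model "$MODEL" --file_type "$FILE_TYPE" --threads "$NUM_THREADS" --ref "$REF_FA" --output "$MOD_CALLS"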
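
After the run, a quick check that the four output files described above exist and are non-empty can catch a failed or interrupted run early. This is a sketch under the assumption that the file names match the usage text exactly; `check_outputs` is an illustrative helper, not a DeepMod2 command.

```shell
# List any of the four documented DeepMod2 outputs that are missing or empty in a results folder.
check_outputs() {
    dir=$1
    status=0
    for name in output.per_read output.per_site output.per_site.aggregated output.bam; do
        if [ ! -s "$dir/$name" ]; then
            echo "missing or empty: $dir/$name" >&2
            status=1
        fi
    done
    return "$status"
}

# Results folder used in the cell above.
check_outputs ./5mCG/deepmod2_results || echo "DeepMod2 run looks incomplete" >&2
```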


4. Visualize the annotated BAM file produced by DeepMod2 in IGV. In IGV, select 'Color alignments by' and 'base modifications (5mC)'. The following steps will allow you to open the tagged BAM file in IGV:

a) If an aligned BAM is given to DeepMod2, you only need to sort and index the DeepMod2 methylation tagged BAM file:

samtools sort MOD_CALLS/output.bam -o MOD_CALLS/final.bam --write-index

b) If an unaligned BAM is given to DeepMod2, first align the DeepMod2 methylation tagged BAM file (while preserving methylation tags MM and ML), then sort and index it:

samtools fastq MOD_CALLS/output.bam -T MM,ML,mv,ts| minimap2 -ax map-ont ref.fa - -y -t NUM_THREADS |samtools sort -o MOD_CALLS/final.bam --write-index
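
When it is unclear which of the two routes applies, the BAM header can help decide. The sketch below assumes that an unaligned BAM has no `@SQ` (reference sequence) lines in its header, which is the usual case but worth confirming for your files. The helper name `has_reference_header` is illustrative, and `MOD_CALLS/output.bam` is the placeholder path from the usage text.

```shell
# Succeeds if the SAM header read from stdin has at least one @SQ (reference sequence) line.
has_reference_header() {
    grep -q '^@SQ'
}

# Hypothetical usage: print which route applies; skipped when samtools or the file is absent.
bam=MOD_CALLS/output.bam
if command -v samtools >/dev/null 2>&1 && [ -f "$bam" ]; then
    if samtools view -H "$bam" | has_reference_header; then
        echo "aligned: sort and index only (a)"
    else
        echo "unaligned: align with minimap2 first, then sort and index (b)"
    fi
fi
```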


In [None]:
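%%bash

# Load conda's shell hook so `conda activate` works in a non-interactive %%bash shell.
source "$(conda info --base)/etc/profile.d/conda.sh"
conda activate deepmod2

MOD_CALLS=./5mCG/deepmod2_results

# Expand the variable with `$`; a bare MOD_CALLS would be read as a literal folder name.
samtools sort "$MOD_CALLS/output.bam" -o "$MOD_CALLS/final.bam" --write-index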
