Releases: TongZhou2017/modtector

v0.15.4

08 Feb 15:41

🎉 ModTector v0.15.4 - Major Update Release

We are excited to announce ModTector v0.15.4, a significant update with major feature additions, critical bug fixes, and performance improvements since v0.9.6!

🚀 Release Highlights

This release rolls up v0.9.7 and six minor-version series (0.10.0 through 0.15.4) with substantial improvements:

New Workflows: batch processing, single-cell support
Performance: Base quality filtering, distribution-based k-factor prediction
🔧 Format Support: Extended input format compatibility (7+ formats)
🐛 Critical Fixes: SVG metadata extraction, Siegfried method improvements
🎨 UI Enhancements: Interactive SVG with right-click menus, minimap navigation
📊 Accuracy: Improved evaluation metrics and normalization methods

🌟 Major New Features

Batch & Single-Cell Processing (v0.11.0)

  • Batch processing mode with glob pattern matching
  • Single-cell unified processing with automatic cell label extraction
  • Window-based memory management for large datasets
  • 2-3x performance improvement for single-cell data

Extended Format Support (v0.12.0, v0.12.3)

  • Native convert command with 7+ input formats:
    • RNA Framework formats (rf-rctools, rf-norm, rf-norm-xml)
    • ShapeMapper2 profile format
    • Samtools mpileup format
    • icSHAPE RT format (v0.14.7)
    • Bedgraph format (v0.14.8)
  • Automatic format detection with streaming processing
  • Dual input mode for mutation + stop signal files

Base Quality Filtering (v0.14.0)

  • Per-base quality filtering in the modtector count command
  • Quality-based effective depth calculation
  • Recommended threshold: Phred 20 (same as RNA Framework)
  • Filters out roughly 15% of mutations as low quality, improving the signal-to-noise ratio
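
As a rough illustration of quality-based effective depth, the sketch below counts only bases whose Phred quality meets the threshold and derives a mutation rate from that filtered depth. The function names and the (is_mutation, quality) input layout are hypothetical and do not reflect ModTector's internal API.

```python
def effective_depth(base_quals, min_base_qual=20):
    """Count only the bases whose Phred quality meets the threshold."""
    return sum(1 for q in base_quals if q >= min_base_qual)


def filtered_mutation_rate(observations, min_base_qual=20):
    """Mutation rate at one position, using only bases that pass the filter.

    observations: iterable of (is_mutation, phred_quality) pairs (assumed layout).
    """
    kept = [is_mut for is_mut, qual in observations if qual >= min_base_qual]
    if not kept:
        return float("nan")  # no usable coverage at this position
    return sum(kept) / len(kept)


# Five reads cover one position; two fall below the quality threshold.
obs = [(False, 35), (True, 12), (False, 30), (True, 38), (False, 8)]
print(effective_depth([q for _, q in obs]))  # 3 of 5 bases pass
print(filtered_mutation_rate(obs))           # 1 mutation among 3 passing bases
```

The low-quality mutation (quality 12) is excluded from both the numerator and the denominator, so it cannot inflate the rate.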

Distribution-Based K-Factor Prediction (v0.13.0)

  • Advanced k-factor prediction using statistical distribution analysis
  • Hierarchical loss function based on feature position
  • Automatic fallback to background method
  • Significantly improved accuracy, especially for 2A3 samples

Mod-Only Reactivity Workflow (v0.9.7)

  • Support for smartSHAPE datasets without unmodified controls
  • Clear command-line messaging for mod-only mode

🐛 Critical Bug Fixes

SVG Metadata Extraction (v0.15.4) ⚠️ Critical

  • Fixed base mismatch errors in SVG metadata extraction
  • Replaced context window method with individual circle tag parsing
  • Eliminates cross-contamination between adjacent circles
  • Ensures RNA_METADATA correctly matches circle data-base attributes

Siegfried Method Improvements (v0.12.2, v0.12.5, v0.14.10)

  • Negative values support: Removed forced zero-clamping, allowing negative reactivity values
  • Zarringhalam remap fix: Proper handling of negative input values
  • Integrated alignment: "align min to 0" logic integrated into zarringhalam_remap
  • Share of zero-valued reactivities reduced from 54.8% to ~0.4%
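
To see why dropping zero-clamping lowers the share of zero values, the toy comparison below contrasts clamping negatives to zero with shifting all values so that the minimum sits at 0. It uses made-up numbers and hypothetical helper names, and it is not the actual zarringhalam_remap implementation.

```python
def clamp_to_zero(values):
    """Old-style behavior: every negative value collapses to 0."""
    return [max(v, 0.0) for v in values]


def align_min_to_zero(values):
    """Shift all values so the smallest becomes 0; differences are preserved."""
    lowest = min(values)
    return [v - lowest for v in values]


def zero_fraction(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0.0) / len(values)


raw = [-0.25, -0.125, 0.0, 0.125, 0.5, -0.5, 0.75, -0.375]
print(zero_fraction(clamp_to_zero(raw)))      # 0.625: five of eight values are 0
print(zero_fraction(align_min_to_zero(raw)))  # 0.125: only the minimum is 0
```

Clamping discards the ordering among negative values, whereas the shift keeps every pairwise difference intact.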

Evaluation Metrics (v0.12.4)

  • Fixed incorrect sensitivity/specificity calculation
  • Now uses the optimal threshold (highest F1-score) instead of a fixed cutoff of reactivity > 0.0
  • Correctly handles methods where all reactivity values are positive
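
A minimal sketch of F1-based threshold selection, assuming label 1 marks a true positive site and that a position is called positive when its score is at or above the threshold. The function names are hypothetical; ModTector's own search may differ in details such as tie-breaking.

```python
def confusion(scores, labels, threshold):
    """Return (tp, fp, fn, tn) when calling score >= threshold positive."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        if score >= threshold:
            if label:
                tp += 1
            else:
                fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def best_f1_threshold(scores, labels):
    """Try every observed score as a threshold; keep the one with the highest F1."""
    best_f1, best_t = -1.0, None
    for t in sorted(set(scores)):
        tp, fp, fn, _ = confusion(scores, labels, t)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:  # ties keep the lowest threshold
            best_f1, best_t = f1, t
    return best_t, best_f1


def sensitivity_specificity(scores, labels, threshold):
    tp, fp, fn, tn = confusion(scores, labels, threshold)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


# All scores are positive, so a fixed "> 0.0" cutoff would call everything positive.
scores = [0.1, 0.2, 0.3, 0.6, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1]
threshold, f1 = best_f1_threshold(scores, labels)
print(threshold, f1)
print(sensitivity_specificity(scores, labels, threshold))
```

With these toy values, any cutoff below the smallest score yields a specificity of 0, which is the failure mode described for methods whose reactivities are all positive.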

Stop Signal Position Correction (v0.11.1)

  • Fixed strand-specific position correction for reverse-strand reads
  • Forward strand: start_position - 1
  • Reverse strand: start_position + 1

U/T Base Handling (v0.11.3)

  • Fixed chartonum() function to handle 'U' (uracil) in RNA sequences
  • Treats U and T as equivalent, which fixes spurious 100% mutation rates
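
The idea behind the fix can be sketched as a base-to-index lookup in which U and T share one code. The table and function names below are hypothetical stand-ins, not the actual chartonum() code.

```python
# Hypothetical lookup: U and T map to the same index, unknown bases to 4.
BASE_TO_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3, "U": 3}
UNKNOWN = 4


def base_to_index(base):
    """Map a nucleotide character to a small integer, case-insensitively."""
    return BASE_TO_INDEX.get(base.upper(), UNKNOWN)


def is_mismatch(ref_base, read_base):
    """A read base counts as a mutation only if its index differs from the reference."""
    return base_to_index(ref_base) != base_to_index(read_base)


print(is_mismatch("U", "T"))  # False: an RNA reference U matches a read T
print(is_mismatch("U", "C"))  # True: a genuine mismatch
```

Without the shared index, every read T aligned to a reference U would be scored as a mutation.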

🎨 User Interface Improvements

Interactive SVG Enhancements (v0.14.8, v0.14.9)

  • Right-click context menu for style editing:
    • Font size and color adjustment
    • Circle radius and color modification
    • Base-specific style reset
  • Minimap navigation:
    • Draggable viewport for real-time navigation
    • Auto-hide with activity-based timer (hides after 3 seconds of inactivity)
  • Legend positioning: Moved to bottom-right to avoid obscuring RNA structure
  • Loading experience: Improved loading message with SVG filename display

Reactivity Distribution Preview (v0.11.4)

  • Histogram-like visualization above cutoff sliders
  • Color-coded regions based on threshold positions
  • Supports individual base and unified color range modes

📊 Accuracy & Performance Improvements

Normalization Enhancements

  • Quality-based effective depth (v0.14.4): Counts only bases with quality >= threshold
  • PCR bias correction (v0.14.4): Chi-Square distribution-based correction (optional)
  • Zarringhalam remap (v0.14.10): Integrated "align min to 0" logic for better preservation of mod/unmod differences

Evaluation Improvements

  • NaN handling (v0.12.1): Proper NaN value handling in sorting and calculations
  • Optimal threshold selection (v0.12.4): Uses F1-score optimization for all metrics
  • SNP filtering (v0.11.3): --snp-cutoff parameter to filter high mutation rate positions
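
The sketch below shows one way SNP masking and NaN-aware sorting could fit together. It assumes positions whose mutation rate exceeds the cutoff are masked as NaN rather than dropped, and all function names are hypothetical.

```python
import math


def mask_snp_positions(mutation_rates, snp_cutoff):
    """Replace rates above the cutoff with NaN (likely SNPs, not modifications)."""
    return [float("nan") if rate > snp_cutoff else rate for rate in mutation_rates]


def nan_last_sorted(values):
    """Sort ascending with NaN entries placed at the end instead of scrambling the order."""
    return sorted(values, key=lambda v: (math.isnan(v), 0.0 if math.isnan(v) else v))


def nan_mean(values):
    """Mean over the non-NaN entries; NaN if nothing remains."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else float("nan")


rates = [0.02, 0.95, 0.01, 0.04, 0.50]
masked = mask_snp_positions(rates, snp_cutoff=0.25)
print(nan_last_sorted(masked))  # three valid rates in order, then two NaN
print(nan_mean(masked))         # mean of the three unmasked rates
```

Sorting floats that include NaN without such a key gives an order that depends on where the NaN values happen to sit, which is why explicit handling matters.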

🔧 Technical Improvements

Code Quality

  • Streaming processing for large files (tested up to 35GB)
  • Enhanced error handling with a limited-warnings strategy
  • Progress reporting for long-running operations

Workflow Integration

  • Updated Snakefile rules for new features
  • Better integration with RNA Framework, ShapeMapper2, and samtools
  • Reduced external script dependencies

📦 Installation

Installation methods remain the same as v0.9.6:

# Quick Install (Recommended)
cargo install modtector

# From Source
git clone https://github.com/TongZhou2017/modtector.git
cd modtector
cargo build --release

🚀 Quick Start Examples

New: Batch Processing

modtector count \
    --batch \
    -b "samples/*.bam" \
    -f reference.fa \
    -o output_dir/ \
    -t 16

New: Format Conversion

# Convert RNA Framework output
modtector convert \
    -i input.rf \
    -o output.csv \
    -f rf-rctools \
    --rf-count-mutations

# Convert ShapeMapper2 profile
modtector convert \
    -i profile.txt \
    -o output.csv \
    -f shapemapper-profile \
    --ref-fasta reference.fa

Enhanced: Quality Filtering

modtector count \
    -b sample.bam \
    -f reference.fa \
    -o output.csv \
    --min-base-qual 20 \
    -t 8

🔄 Migration Notes

Breaking Changes

  • None - All changes are backward compatible

Recommended Updates

  • Update Snakefile rules to use the new modtector convert command instead of Python scripts
  • Enable --min-base-qual 20 for improved signal quality
  • Use --k-prediction-method distribution for better k-factor prediction (especially 2A3 samples)

📈 Performance Metrics

  • Single-cell processing: 2-3x speedup over batch mode
  • Format conversion: ~1-2 million lines/second
  • Memory usage: constant, O(1), for streaming operations
  • Quality filtering: ~0.9% overhead, ~15% of mutations filtered as low quality

🤝 Contributing

We welcome contributions! Please feel free to:

  • 🐛 Report bugs and issues
  • 💡 Suggest new features
  • 📝 Improve documentation
  • 🔧 Submit pull requests

📄 License

ModTector is licensed under the MIT License.

🙏 Acknowledgments

Thanks to all users who reported issues and provided feedback, especially those who reported the SVG metadata extraction bug and requested the format conversion features!


Full Changelog: https://github.com/TongZhou2017/modtector/blob/main/CHANGELOG.md

Download: See assets below for source code archives

Verify Release: SHA256 checksums available in SHA256SUMS file
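
Checksum verification can be scripted as sketched below, assuming the SHA256SUMS file uses the usual sha256sum layout of one hex digest followed by a file name per line. The function names and the archive name in the usage comment are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_sha256sums(text):
    """Parse lines of the form '<hex digest>  <file name>' into a dict."""
    expected = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        digest, name = parts
        expected[name.strip().lstrip("*")] = digest.lower()
    return expected


def verify_file(path, listed_name, sums_text):
    """True if the file's digest matches the entry for listed_name."""
    return parse_sha256sums(sums_text).get(listed_name) == sha256_of(path)


# Usage (hypothetical file names):
# with open("SHA256SUMS") as fh:
#     ok = verify_file("modtector-source.tar.gz", "modtector-source.tar.gz", fh.read())
```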

v0.9.6

09 Oct 02:46
16ab6f5

🎉 ModTector v0.9.6 - First Official Release

We are excited to announce the first official stable release of ModTector, a high-performance RNA modification detection tool written in Rust!

🚀 Release Highlights

ModTector provides a complete, production-ready workflow for detecting RNA modifications from high-throughput sequencing data:

  • Stable & Production-Ready: Thoroughly tested and ready for research use
  • High Performance: Rust-based implementation with multi-threading support
  • 📦 Easy Installation: Available on crates.io and ready for conda-forge packaging
  • 📚 Complete Documentation: Comprehensive guides and examples
  • 🧬 Full Workflow: From BAM files to publication-ready results
  • 🎨 Rich Visualizations: ROC curves, RNA structure plots, and more

📦 Installation

Quick Install (Recommended)

cargo install modtector

System Dependencies

Ubuntu/Debian:

sudo apt-get update
sudo apt-get install build-essential pkg-config libssl-dev libhts-dev

macOS:

brew install htslib

CentOS/RHEL:

sudo yum groupinstall "Development Tools"
sudo yum install pkgconfig openssl-devel htslib-devel

From Source

git clone https://github.com/TongZhou2017/modtector.git
cd modtector
cargo build --release
# Binary will be in target/release/modtector

🌟 Key Features

Multi-Signal Analysis

  • Simultaneous analysis of stop signals (RT truncation) and mutation signals (base mutations)
  • Support for multiple reactivity calculation methods
  • Comprehensive signal normalization and filtering

High Performance

  • Rust-based implementation for memory safety and speed
  • Multi-threading support for parallel processing
  • Efficient htslib integration for BAM file handling
  • Optimized for large-scale datasets

Complete Workflow

  1. Count: Generate pileup statistics from BAM files
  2. Reactivity: Calculate reactivity scores between modified/unmodified samples
  3. Normalize: Filter and normalize signals with multiple methods
  4. Compare: Identify differential modification sites
  5. Plot: Generate publication-quality visualizations
  6. Evaluate: Assess accuracy with ROC/PR curves and AUC metrics

Rich Visualizations

  • Signal distribution scatter plots
  • Reactivity bar charts
  • ROC and PR curves
  • RNA structure SVG plots with reactivity overlay
  • Multi-threaded parallel plotting

Accuracy Assessment

  • AUC (Area Under Curve) calculation
  • F1-score, sensitivity, specificity
  • ROC and PR curve generation
  • Auto-alignment for sequence matching
  • T/U base equivalence handling
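
For readers unfamiliar with the AUC metric, here is a small sketch of ROC AUC computed through its rank interpretation: the probability that a randomly chosen positive site scores higher than a randomly chosen negative one, with ties counted as half. The function name is hypothetical and this is not necessarily how ModTector computes it internally.

```python
def roc_auc(scores, labels):
    """ROC AUC via pairwise comparison of positive and negative scores."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        return float("nan")  # AUC is undefined with only one class
    wins = sum(
        (p > n) + 0.5 * (p == n)  # a tie contributes half a win
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


scores = [0.1, 0.2, 0.3, 0.6, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1]
print(roc_auc(scores, labels))  # 8 of 9 positive/negative pairs are ranked correctly
```

The pairwise form is quadratic in the number of positions, so it is suited to illustration rather than large transcripts.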

🚀 Quick Start

# Generate pileup data from BAM files
modtector count -b sample.bam -f reference.fa -o output.csv -t 8

# Calculate reactivity scores
modtector reactivity -M modified.csv -U unmodified.csv -O reactivity.csv -t 24

# Normalize signals
modtector norm -i reactivity.csv -o normalized.csv -m winsor90 --bases AC

# Generate visualizations
modtector plot -M modified.csv -U unmodified.csv -o plots/ -r normalized.csv -t 8

# Evaluate accuracy
modtector evaluate -r normalized.csv -s structure.dp -o evaluation/ --gene-id 16S_rRNA
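
The winsor90 method name above suggests 90% winsorizing. The sketch below shows that general technique under the assumption that values are clipped to the 5th and 95th percentiles and then divided by the 95th percentile. ModTector's exact percentile rule and scaling may differ, and the function names are hypothetical.

```python
def nearest_rank(sorted_values, percent):
    """Nearest-rank percentile (integer percent) of a sorted, non-empty list."""
    n = len(sorted_values)
    rank = max(1, -(-percent * n // 100))  # ceiling division without floats
    return sorted_values[rank - 1]


def winsorize_90(values):
    """Clip to the [5th, 95th] percentile range, then scale by the 95th percentile."""
    ordered = sorted(values)
    low = nearest_rank(ordered, 5)
    high = nearest_rank(ordered, 95)
    clipped = [min(max(v, low), high) for v in values]
    if high <= 0:
        return clipped  # scaling by a non-positive value would be meaningless
    return [v / high for v in clipped]


reactivities = [float(i) for i in range(1, 21)]  # 1.0 .. 20.0
normalized = winsorize_90(reactivities)
print(normalized[0], normalized[-1])  # smallest value scaled, outlier capped at 1.0
```

Capping the top 5% keeps a few extreme positions from compressing the rest of the profile toward zero.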

📊 Example Results

ModTector produces publication-ready outputs including:

  • Signal Analysis: Comprehensive pileup statistics with depth and coverage
  • Reactivity Profiles: Normalized reactivity scores for modification detection
  • ROC Curves: Performance evaluation against known modification sites
  • Structure Plots: RNA secondary structure visualization with signal overlay
  • Comparison Reports: Statistical analysis of differential modifications

🔬 Use Cases

ModTector is designed for:

  • DMS-seq: Dimethyl sulfate sequencing analysis
  • SHAPE-MaP: SHAPE mutational profiling
  • icSHAPE: In vivo click SHAPE
  • m6A/m1A Detection: RNA methylation analysis
  • Custom RNA Modifications: Flexible framework for various modification types

🎯 System Requirements

  • Operating System: Linux, macOS, or Windows
  • RAM: 4 GB minimum (8 GB recommended for large datasets)
  • Storage: 2 GB free space
  • CPU: Multi-core processor recommended for parallel processing
  • Rust: Version 1.70 or higher

🤝 Contributing

We welcome contributions! Please feel free to:

  • 🐛 Report bugs and issues
  • 💡 Suggest new features
  • 📝 Improve documentation
  • 🔧 Submit pull requests

Visit our GitHub repository to get started.


📄 License

ModTector is licensed under the MIT License, allowing free use in both academic and commercial settings.


🙏 Acknowledgments

Thanks to the bioinformatics community for inspiration and to all early testers who provided valuable feedback!


Full Changelog: https://github.com/TongZhou2017/modtector/blob/main/CHANGELOG.md

Download: See assets below for source code archives

Verify Release: SHA256 checksums available in SHA256SUMS file