Releases: TongZhou2017/modtector
v0.15.4
🎉 ModTector v0.15.4 - Major Update Release
We are excited to announce ModTector v0.15.4, a significant update with major feature additions, critical bug fixes, and performance improvements since v0.9.6!
🚀 Release Highlights
This release includes 6 major version updates (0.10.0 - 0.15.4) with substantial improvements:
✅ New Workflows: batch processing, single-cell support
⚡ Performance: Base quality filtering, distribution-based k-factor prediction
🔧 Format Support: Extended input format compatibility (7+ formats)
🐛 Critical Fixes: SVG metadata extraction, Siegfried method improvements
🎨 UI Enhancements: Interactive SVG with right-click menus, minimap navigation
📊 Accuracy: Improved evaluation metrics and normalization methods
🌟 Major New Features
Batch & Single-Cell Processing (v0.11.0)
- Batch processing mode with glob pattern matching
- Single-cell unified processing with automatic cell label extraction
- Window-based memory management for large datasets
- 2-3x performance improvement for single-cell data
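As a rough illustration of the batch idea above, the sketch below expands a glob pattern and derives one label per file. The labelling rule (file name without its extension) and the helper names `expand_batch` and `cell_label` are assumptions made for this example; the release notes do not say how ModTector extracts cell labels.

```python
import glob
import os
import tempfile


def expand_batch(pattern):
    """Return the sorted list of files matching a glob pattern."""
    return sorted(glob.glob(pattern))


def cell_label(path):
    """Hypothetical rule: the file name without its extension is the label."""
    return os.path.splitext(os.path.basename(path))[0]


def demo():
    """Create a few empty files and label only those matching '*.bam'."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("cellA.bam", "cellB.bam", "notes.txt"):
            open(os.path.join(tmp, name), "w").close()
        files = expand_batch(os.path.join(tmp, "*.bam"))
        return [cell_label(f) for f in files]
```

Quoting the pattern on the command line (as in `-b "samples/*.bam"`) keeps the shell from expanding it, so the tool itself can do the matching.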
Extended Format Support (v0.12.0, v0.12.3)
- Native convert command with 7+ input formats:
- RNA Framework formats (rf-rctools, rf-norm, rf-norm-xml)
- ShapeMapper2 profile format
- Samtools mpileup format
- icSHAPE RT format (v0.14.7)
- Bedgraph format (v0.14.8)
- Automatic format detection with streaming processing
- Dual input mode for mutation + stop signal files
Base Quality Filtering (v0.14.0)
- Per-base quality filtering in modtector count command
- Quality-based effective depth calculation
- Recommended threshold: 20 (same as RNAFramework)
- ~15% low-quality mutation filtering for improved signal-to-noise ratio
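The effect of per-base quality filtering on depth can be sketched as follows. This is a minimal illustration, assuming each observation at a position is a pair of (is_mutation, Phred quality); the function names and the data layout are hypothetical and are not taken from the ModTector source.

```python
def effective_depth(base_quals, min_base_qual=20):
    """Count bases whose Phred quality meets the threshold."""
    return sum(1 for q in base_quals if q >= min_base_qual)


def filtered_mutation_rate(observations, min_base_qual=20):
    """Compute the mutation rate at one position from passing bases only.

    observations: iterable of (is_mutation, phred_quality) pairs.
    Returns (mutation_count, effective_depth, rate).
    """
    kept = [is_mut for is_mut, q in observations if q >= min_base_qual]
    depth = len(kept)
    mutations = sum(kept)  # True counts as 1
    rate = mutations / depth if depth else 0.0
    return mutations, depth, rate
```

Low-quality bases are dropped from both the numerator and the denominator, so a low-quality mismatch neither counts as a mutation nor inflates the depth.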
Distribution-Based K-Factor Prediction (v0.13.0)
- Advanced k-factor prediction using statistical distribution analysis
- Hierarchical loss function based on feature position
- Automatic fallback to background method
- Significantly improved accuracy, especially for 2A3 samples
Mod-Only Reactivity Workflow (v0.9.7)
- Support for smartSHAPE datasets without unmodified controls
- Clear command-line messaging for mod-only mode
🐛 Critical Bug Fixes
SVG Metadata Extraction (v0.15.4) ⚠️ Critical
- Fixed base mismatch errors in SVG metadata extraction
- Replaced context window method with individual circle tag parsing
- Eliminates cross-contamination between adjacent circles
- Ensures RNA_METADATA correctly matches circle data-base attributes
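To show why per-tag parsing avoids cross-contamination, here is a small sketch that reads attributes from each circle tag in isolation, so a neighbouring circle can never supply the base. The `data-base` attribute comes from the notes above; the `data-position` attribute and the function names are assumptions for illustration only.

```python
import re

# Matches one complete <circle ...> or <circle .../> tag.
CIRCLE_TAG = re.compile(r"<circle\b[^>]*>")
# Matches name="value" pairs inside a single tag.
ATTRIBUTE = re.compile(r'([\w:-]+)\s*=\s*"([^"]*)"')


def parse_circles(svg_text):
    """Return one attribute dict per <circle> tag, each parsed on its own."""
    return [
        dict(ATTRIBUTE.findall(match.group(0)))
        for match in CIRCLE_TAG.finditer(svg_text)
    ]


def bases_by_position(svg_text):
    """Map data-position -> data-base, skipping circles that lack either."""
    result = {}
    for attrs in parse_circles(svg_text):
        if "data-base" in attrs and "data-position" in attrs:
            result[int(attrs["data-position"])] = attrs["data-base"]
    return result
```

A context-window approach would search a fixed stretch of text around each circle, which can pick up attributes from the next tag when circles sit close together in the file.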
Siegfried Method Improvements (v0.12.2, v0.12.5, v0.14.10)
- Negative values support: Removed forced zero-clamping, allowing negative reactivity values
- Zarringhalam remap fix: Proper handling of negative input values
- Integrated alignment: "align min to 0" logic integrated into zarringhalam_remap
- Zero value percentage reduced from 54.8% to ~0.4%
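The difference between zero-clamping and "align min to 0" can be sketched as below. This is an illustration under stated assumptions, not the ModTector implementation: the breakpoints follow the commonly cited piecewise-linear remapping scheme and are written from memory, the minimum is always subtracted (ModTector may shift only in some cases), and the function names are hypothetical.

```python
# Assumed breakpoints as (in_lo, in_hi, out_lo, out_hi); illustrative only,
# not copied from the ModTector source.
SEGMENTS = [
    (0.00, 0.25, 0.00, 0.35),
    (0.25, 0.30, 0.35, 0.55),
    (0.30, 0.70, 0.55, 0.85),
]
TOP_IN, TOP_OUT = 0.70, 0.85


def remap_value(x, top):
    """Piecewise-linear map of a non-negative value onto [0, 1]."""
    for in_lo, in_hi, out_lo, out_hi in SEGMENTS:
        if x < in_hi:
            return out_lo + (x - in_lo) / (in_hi - in_lo) * (out_hi - out_lo)
    if top <= TOP_IN:
        return TOP_OUT
    # Stretch [TOP_IN, top] onto [TOP_OUT, 1.0].
    return TOP_OUT + (x - TOP_IN) / (top - TOP_IN) * (1.0 - TOP_OUT)


def remap_with_min_alignment(values):
    """Shift so the minimum becomes 0 (no clamping), then remap each value."""
    if not values:
        return []
    low = min(values)
    shifted = [v - low for v in values]
    top = max(shifted)
    return [remap_value(x, top) for x in shifted]
```

With clamping, every negative input collapses to 0; with the shift, only the minimum maps to 0 and the ordering of the remaining values is preserved, which is consistent with the drop in zero values reported above.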
Evaluation Metrics (v0.12.4)
- Fixed incorrect sensitivity/specificity calculation
- Now uses optimal threshold (highest F1-score) instead of fixed value > 0.0
- Correctly handles methods where all reactivity values are positive
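The threshold selection described above can be sketched as a scan over candidate cutoffs that keeps the one with the highest F1-score, then reports sensitivity and specificity at that cutoff. This is a minimal sketch: the function names, the choice of candidate thresholds (each distinct score), and the tie-breaking rule (lowest threshold wins) are assumptions, not details from the ModTector source.

```python
def confusion(scores, labels, threshold):
    """Return (tp, fp, tn, fn), calling a score positive when >= threshold."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def best_f1_threshold(scores, labels):
    """Pick the threshold with the highest F1 and report metrics there."""
    if not scores:
        raise ValueError("scores must not be empty")
    best = None
    for threshold in sorted(set(scores)):
        tp, fp, tn, fn = confusion(scores, labels, threshold)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if best is None or f1 > best[0]:
            best = (f1, threshold, tp, fp, tn, fn)
    f1, threshold, tp, fp, tn, fn = best
    return {
        "threshold": threshold,
        "f1": f1,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }
```

When every reactivity value is positive, a fixed cutoff of value > 0.0 calls every position positive and specificity collapses to 0; scanning for the best F1 avoids that degenerate case.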
Stop Signal Position Correction (v0.11.1)
- Fixed strand-specific position correction for reverse-strand reads
- Forward strand: start_position - 1
- Reverse strand: start_position + 1
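The strand-specific rule above reduces to a one-line function. The sketch below only encodes the two offsets listed in the notes; the function name and the boolean strand flag are hypothetical, and what `start_position` refers to for reverse-strand reads is not specified here.

```python
def corrected_stop_position(start_position, is_reverse):
    """Apply the strand-specific offset to a read's start position."""
    if is_reverse:
        return start_position + 1
    return start_position - 1


def tally_stops(reads):
    """Count stop signals per corrected position.

    reads: iterable of (start_position, is_reverse) pairs.
    """
    counts = {}
    for start_position, is_reverse in reads:
        pos = corrected_stop_position(start_position, is_reverse)
        counts[pos] = counts.get(pos, 0) + 1
    return counts
```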
U/T Base Handling (v0.11.3)
- Fixed chartonum() function to handle 'U' (uracil) in RNA sequences
- Treats U and T as equivalent, fixing 100% mutation rate issues
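A sketch of the U/T equivalence follows. Only the function name `chartonum` comes from the notes; the numeric codes, the handling of other characters, and the `is_mismatch` helper are assumptions made for illustration.

```python
def chartonum(base):
    """Hypothetical encoding: A=0, C=1, G=2, T and U=3, anything else=4."""
    codes = {"A": 0, "C": 1, "G": 2, "T": 3, "U": 3}
    return codes.get(base.upper(), 4)


def is_mismatch(read_base, ref_base):
    """Compare encoded bases so that T in a read matches U in the reference."""
    return chartonum(read_base) != chartonum(ref_base)
```

If U were not recognised, a reference written with U would never match the T reported in aligned reads, and every covered U position would appear fully mutated.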
🎨 User Interface Improvements
Interactive SVG Enhancements (v0.14.8, v0.14.9)
- Right-click context menu for style editing:
- Font size and color adjustment
- Circle radius and color modification
- Base-specific style reset
- Minimap navigation:
- Draggable viewport for real-time navigation
- Auto-hide with activity-based timer (3 seconds after idle)
- Legend positioning: Moved to bottom-right to avoid obscuring RNA structure
- Loading experience: Improved loading message with SVG filename display
Reactivity Distribution Preview (v0.11.4)
- Histogram-like visualization above cutoff sliders
- Color-coded regions based on threshold positions
- Supports individual base and unified color range modes
📊 Accuracy & Performance Improvements
Normalization Enhancements
- Quality-based effective depth (v0.14.4): Counts only bases with quality >= threshold
- PCR bias correction (v0.14.4): Chi-Square distribution-based correction (optional)
- Zarringhalam remap (v0.14.10): Integrated "align min to 0" logic for better preservation of mod/unmod differences
Evaluation Improvements
- NaN handling (v0.12.1): Proper NaN value handling in sorting and calculations
- Optimal threshold selection (v0.12.4): Uses F1-score optimization for all metrics
- SNP filtering (v0.11.3): --snp-cutoff parameter to filter high mutation rate positions
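As a rough picture of SNP filtering, the sketch below drops positions whose mutation rate reaches a cutoff, on the reasoning that a very high rate more likely reflects a sequence variant than a modification. Which sample's rate is tested and whether the comparison is inclusive are assumptions here; the function name is hypothetical.

```python
def filter_snp_positions(mutation_rates, snp_cutoff):
    """Keep positions whose mutation rate is strictly below the cutoff.

    mutation_rates: dict mapping position -> mutation rate in [0, 1].
    """
    return {
        position: rate
        for position, rate in mutation_rates.items()
        if rate < snp_cutoff
    }
```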
🔧 Technical Improvements
Code Quality
- Streaming processing for large files (tested up to 35GB)
- Enhanced error handling with limited warnings strategy
- Progress reporting for long-running operations
Workflow Integration
- Updated Snakefile rules for new features
- Better integration with RNA Framework, ShapeMapper2, samtools
- Reduced external script dependencies
📦 Installation
Installation methods remain the same as v0.9.6:
# Quick Install (Recommended)
cargo install modtector
# From Source
git clone https://github.com/TongZhou2017/modtector.git
cd modtector
cargo build --release
🚀 Quick Start Examples
New: Batch Processing
modtector count \
--batch \
-b "samples/*.bam" \
-f reference.fa \
-o output_dir/ \
-t 16
New: Format Conversion
# Convert RNA Framework output
modtector convert \
-i input.rf \
-o output.csv \
-f rf-rctools \
--rf-count-mutations
# Convert ShapeMapper2 profile
modtector convert \
-i profile.txt \
-o output.csv \
-f shapemapper-profile \
--ref-fasta reference.fa
Enhanced: Quality Filtering
modtector count \
-b sample.bam \
-f reference.fa \
-o output.csv \
--min-base-qual 20 \
-t 8
📚 Documentation
- 📖 ReadTheDocs - Complete user guide
- 🚀 Quick Start Guide
- 💻 Command Reference
- 📝 Examples
🔄 Migration Notes
Breaking Changes
- None - All changes are backward compatible
Recommended Updates
- Update Snakefile rules to use new modtector convert command instead of Python scripts
- Enable --min-base-qual 20 for improved signal quality
- Use --k-prediction-method distribution for better k-factor prediction (especially 2A3 samples)
📈 Performance Metrics
- Single-cell processing: 2-3x speedup over batch mode
- Format conversion: ~1-2 million lines/second
- Memory usage: constant (O(1)) for streaming operations
- Quality filtering: ~0.9% overhead, ~15% noise reduction
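The constant-memory claim for streaming operations rests on handling one line at a time instead of loading the whole file. The sketch below shows that pattern for a generic tab-separated input converted to CSV; the input layout, the comment convention, and the function name are assumptions and do not describe any specific format handled by modtector convert.

```python
import csv
import io


def stream_convert(src, dst, delimiter="\t"):
    """Copy records from src to dst as CSV, one line at a time.

    Only the current line is held in memory, so usage does not grow
    with file size. Returns the number of records written.
    """
    writer = csv.writer(dst, lineterminator="\n")
    written = 0
    for line in src:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        writer.writerow(line.split(delimiter))
        written += 1
    return written


def demo():
    """Run the converter on a small in-memory example."""
    src = io.StringIO("# header\nchr1\t10\t0.5\n\nchr1\t11\t0.7\n")
    dst = io.StringIO()
    count = stream_convert(src, dst)
    return count, dst.getvalue()
```

The same function works on real file handles opened with `open(...)`, since it only iterates over lines and writes rows.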
🤝 Contributing
We welcome contributions! Please feel free to:
- 🐛 Report bugs and issues
- 💡 Suggest new features
- 📝 Improve documentation
- 🔧 Submit pull requests
📄 License
ModTector is licensed under the MIT License.
📞 Support
- GitHub Issues: https://github.com/TongZhou2017/modtector/issues
- Documentation: https://modtector.readthedocs.io/
- Repository: https://github.com/TongZhou2017/modtector
🙏 Acknowledgments
Thanks to all users who reported issues and provided feedback, especially for the SVG metadata extraction bug fixes and format conversion feature requests!
Full Changelog: https://github.com/TongZhou2017/modtector/blob/main/CHANGELOG.md
Download: See assets below for source code archives
Verify Release: SHA256 checksums available in SHA256SUMS file
v0.9.6
🎉 ModTector v0.9.6 - First Official Release
We are excited to announce the first official stable release of ModTector, a high-performance RNA modification detection tool written in Rust!
🚀 Release Highlights
ModTector provides a complete, production-ready workflow for detecting RNA modifications from high-throughput sequencing data:
- ✅ Stable & Production-Ready: Thoroughly tested and ready for research use
- ⚡ High Performance: Rust-based implementation with multi-threading support
- 📦 Easy Installation: Available on crates.io and conda-forge ready
- 📚 Complete Documentation: Comprehensive guides and examples
- 🧬 Full Workflow: From BAM files to publication-ready results
- 🎨 Rich Visualizations: ROC curves, RNA structure plots, and more
📦 Installation
Quick Install (Recommended)
cargo install modtector
System Dependencies
Ubuntu/Debian:
sudo apt-get update
sudo apt-get install build-essential pkg-config libssl-dev libhts-dev
macOS:
brew install htslib
CentOS/RHEL:
sudo yum groupinstall "Development Tools"
sudo yum install pkgconfig openssl-devel htslib-devel
From Source
git clone https://github.com/TongZhou2017/modtector.git
cd modtector
cargo build --release
# Binary will be in target/release/modtector
🌟 Key Features
Multi-Signal Analysis
- Simultaneous analysis of stop signals (RT truncation) and mutation signals (base mutations)
- Support for multiple reactivity calculation methods
- Comprehensive signal normalization and filtering
High Performance
- Rust-based implementation for memory safety and speed
- Multi-threading support for parallel processing
- Efficient htslib integration for BAM file handling
- Optimized for large-scale datasets
Complete Workflow
- Count: Generate pileup statistics from BAM files
- Reactivity: Calculate reactivity scores between modified/unmodified samples
- Normalize: Filter and normalize signals with multiple methods
- Compare: Identify differential modification sites
- Plot: Generate publication-quality visualizations
- Evaluate: Assess accuracy with ROC/PR curves and AUC metrics
Rich Visualizations
- Signal distribution scatter plots
- Reactivity bar charts
- ROC and PR curves
- RNA structure SVG plots with reactivity overlay
- Multi-threaded parallel plotting
Accuracy Assessment
- AUC (Area Under Curve) calculation
- F1-score, sensitivity, specificity
- ROC and PR curve generation
- Auto-alignment for sequence matching
- T/U base equivalence handling
🚀 Quick Start
# Generate pileup data from BAM files
modtector count -b sample.bam -f reference.fa -o output.csv -t 8
# Calculate reactivity scores
modtector reactivity -M modified.csv -U unmodified.csv -O reactivity.csv -t 24
# Normalize signals
modtector norm -i reactivity.csv -o normalized.csv -m winsor90 --bases AC
# Generate visualizations
modtector plot -M modified.csv -U unmodified.csv -o plots/ -r normalized.csv -t 8
# Evaluate accuracy
modtector evaluate -r normalized.csv -s structure.dp -o evaluation/ --gene-id 16S_rRNA
📊 Example Results
ModTector produces publication-ready outputs including:
- Signal Analysis: Comprehensive pileup statistics with depth and coverage
- Reactivity Profiles: Normalized reactivity scores for modification detection
- ROC Curves: Performance evaluation against known modification sites
- Structure Plots: RNA secondary structure visualization with signal overlay
- Comparison Reports: Statistical analysis of differential modifications
🔬 Use Cases
ModTector is designed for:
- DMS-seq: Dimethyl sulfate sequencing analysis
- SHAPE-MaP: SHAPE mutational profiling
- icSHAPE: In vivo click SHAPE
- m6A/m1A Detection: RNA methylation analysis
- Custom RNA Modifications: Flexible framework for various modification types
📚 Documentation
Comprehensive documentation is available:
- 📖 ReadTheDocs - Complete user guide
- 🚀 Quick Start Guide - Get started in minutes
- 💻 Command Reference - Detailed command documentation
- 📝 Examples - Real-world usage examples
- 🔧 Installation Guide - Platform-specific instructions
🎯 System Requirements
- Operating System: Linux, macOS, or Windows
- RAM: 4 GB minimum (8 GB recommended for large datasets)
- Storage: 2 GB free space
- CPU: Multi-core processor recommended for parallel processing
- Rust: Version 1.70 or higher
🤝 Contributing
We welcome contributions! Please feel free to:
- 🐛 Report bugs and issues
- 💡 Suggest new features
- 📝 Improve documentation
- 🔧 Submit pull requests
Visit our GitHub repository to get started.
📄 License
ModTector is licensed under the MIT License, allowing free use in both academic and commercial settings.
📞 Support
- GitHub Issues: https://github.com/TongZhou2017/modtector/issues
- Documentation: https://modtector.readthedocs.io/
- Repository: https://github.com/TongZhou2017/modtector
🙏 Acknowledgments
Thanks to the bioinformatics community for inspiration and to all early testers who provided valuable feedback!
Full Changelog: https://github.com/TongZhou2017/modtector/blob/main/CHANGELOG.md
Download: See assets below for source code archives
Verify Release: SHA256 checksums available in SHA256SUMS file