byrizki/rusto-rs

RustO! 🦀

Pure Rust OCR Library - Fast, Safe, and Cross-Platform

Crates.io · NuGet · npm · CocoaPods · Maven Central · Documentation · License: MIT · CI

RustO! is a high-performance OCR (Optical Character Recognition) library written in pure Rust. It is based on RapidOCR and runs PaddleOCR models on the MNN inference engine.

🎯 Why RustO!?

  • πŸš€ Pure Rust - Zero OpenCV dependency, optional OpenCV backend available
  • 🎯 High Accuracy - 99.3% parity with OpenCV-based implementations
  • ⚑ Fast Performance - Optimized with LTO, single codegen unit compilation
  • πŸ”’ Memory Safe - Leverages Rust's safety guarantees
  • 🌐 Cross-Platform - Linux, macOS, Windows, iOS, Android support
  • πŸ”§ FFI Ready - C FFI bindings for integration with other languages
  • πŸ“¦ Easy to Use - Simple API, modern CLI with JSON/Text/TSV output

🏗️ Architecture

RustO! is built on top of proven OCR technology:

  • Based on: RapidOCR architecture
  • Models: PaddleOCR PPOCRv4/v5 models
  • Inference: MNN inference engine for high-performance cross-platform execution
  • Image Processing: Pure Rust implementation (image + imageproc crates)
  • Contour Detection: Custom Rust implementation matching OpenCV behavior
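As a rough illustration of the step that feeds contour detection, the sketch below thresholds a per-pixel text probability map into a binary mask. It assumes the detection model emits such a map (as DB-style PaddleOCR detectors do); the function names and the 0.3 threshold are illustrative assumptions, not RustO!'s actual code.

```rust
/// Turn a row-major probability map into a binary mask (1 = text, 0 = background).
/// `threshold` is an assumed tunable; 0.3 is a common default for DB-style detectors.
fn binarize(prob_map: &[f32], width: usize, height: usize, threshold: f32) -> Vec<u8> {
    assert_eq!(prob_map.len(), width * height, "map size must match dimensions");
    prob_map
        .iter()
        .map(|&p| if p > threshold { 1 } else { 0 })
        .collect()
}

/// Count foreground pixels, a cheap sanity check before tracing contours.
fn foreground_pixels(mask: &[u8]) -> usize {
    mask.iter().filter(|&&v| v == 1).count()
}
```

Calling `binarize(&probs, w, h, 0.3)` yields the mask that a contour tracer would then walk to find text regions.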

📁 Project Structure

rusto-rs/
├── src/
│   ├── lib.rs          # Public API
│   ├── main.rs         # CLI application
│   ├── ffi.rs          # C FFI bindings (optional)
│   ├── det.rs          # Text detection
│   ├── rec.rs          # Text recognition
│   ├── layout.rs       # Layout detection
│   ├── doc_pipeline.rs # Document pipeline (layout + OCR)
│   ├── preprocess.rs   # Image preprocessing
│   ├── postprocess.rs  # Result postprocessing
│   ├── contours.rs     # Pure Rust contour detection
│   ├── geometry.rs     # Geometric transformations + NMS
│   ├── image_impl.rs   # Image abstraction layer
│   └── ...
├── Cargo.toml          # Dependencies & optimization
├── docs/               # Documentation
├── examples/           # Example applications
│   ├── doc_pipeline_demo.rs  # Document pipeline example
│   └── ...
└── packages/           # Additional packages
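The tree lists NMS (non-maximum suppression) under geometry.rs. For readers unfamiliar with the technique, here is a minimal sketch of greedy NMS over axis-aligned boxes. The `ScoredBox` type and function names are hypothetical and are not taken from the crate; the real implementation may differ.

```rust
use std::cmp::Ordering;

#[derive(Clone, Copy, Debug, PartialEq)]
struct ScoredBox {
    x1: f32,
    y1: f32,
    x2: f32,
    y2: f32,
    score: f32,
}

fn area(b: &ScoredBox) -> f32 {
    (b.x2 - b.x1).max(0.0) * (b.y2 - b.y1).max(0.0)
}

/// Intersection over union of two axis-aligned boxes.
fn iou(a: &ScoredBox, b: &ScoredBox) -> f32 {
    let w = (a.x2.min(b.x2) - a.x1.max(b.x1)).max(0.0);
    let h = (a.y2.min(b.y2) - a.y1.max(b.y1)).max(0.0);
    let inter = w * h;
    let union = area(a) + area(b) - inter;
    if union > 0.0 { inter / union } else { 0.0 }
}

/// Greedy NMS: keep the highest-scoring box, drop any later box that
/// overlaps an already kept box by more than `iou_threshold`.
fn nms(mut boxes: Vec<ScoredBox>, iou_threshold: f32) -> Vec<ScoredBox> {
    // Sort by descending score; incomparable (NaN) scores are treated as equal.
    boxes.sort_by(|a, b| b.score.partial_cmp(&a.score).unwrap_or(Ordering::Equal));
    let mut kept: Vec<ScoredBox> = Vec::new();
    for candidate in boxes {
        if kept.iter().all(|k| iou(k, &candidate) <= iou_threshold) {
            kept.push(candidate);
        }
    }
    kept
}
```

A threshold around 0.5 is a common starting point; lower values suppress more aggressively.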

Model Conversion

RustO! uses the MNN inference engine, so PaddleOCR models must first be converted to MNN format:

# Install required tools
pip install paddle2onnx
# Download and build MNN from https://github.com/alibaba/MNN

# Convert models using the provided script
python convert_paddle_to_mnn.py --ocr-dir ./models

See MODEL_CONVERSION.md for detailed conversion instructions.


Quick Start

1. Build the Library

# Pure Rust build (default)
cargo build --release

# With FFI bindings
cargo build --release --features ffi

# With OpenCV backend (optional)
cargo build --release --features use-opencv

2. Run CLI Application

# JSON output (default)
cargo run --release -- \
  --det-model path/to/det.mnn \
  --rec-model path/to/rec.mnn \
  --dict path/to/dict.txt \
  image.jpg

# Plain text output
cargo run --release -- \
  --det-model path/to/det.mnn \
  --rec-model path/to/rec.mnn \
  --dict path/to/dict.txt \
  --format text \
  image.jpg

# TSV output
cargo run --release -- \
  --det-model path/to/det.mnn \
  --rec-model path/to/rec.mnn \
  --dict path/to/dict.txt \
  --format tsv \
  image.jpg
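If you need a tabular export from library code rather than from the CLI, something like the following sketch could work. The column layout (text, score, box corners) is an assumption made for illustration and is not necessarily what `--format tsv` emits; `TextResult` here is a local stand-in for the struct described in the API Reference.

```rust
// Local stand-in mirroring the TextResult fields documented in the API Reference.
struct TextResult {
    text: String,
    score: f32,
    box_points: [(f32, f32); 4],
}

/// Format one result as a tab-separated line: text, score, "x,y;x,y;x,y;x,y".
/// This column layout is hypothetical.
fn to_tsv_line(r: &TextResult) -> String {
    // Tabs and line breaks inside the text would break the columns, so replace them.
    let clean: String = r
        .text
        .chars()
        .map(|c| if c == '\t' || c == '\n' || c == '\r' { ' ' } else { c })
        .collect();
    let coords: Vec<String> = r
        .box_points
        .iter()
        .map(|&(x, y)| format!("{:.1},{:.1}", x, y))
        .collect();
    format!("{}\t{:.3}\t{}", clean, r.score, coords.join(";"))
}
```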

3. Use as a Library

Add to your Cargo.toml:

[dependencies]
rusto = "0.1"

Then in your code:

use rusto::{RapidOCR, RapidOCRConfig};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Configure OCR
    let config = RapidOCRConfig {
        det_model_path: "models/det.mnn".to_string(),
        rec_model_path: "models/rec.mnn".to_string(),
        dict_path: "models/dict.txt".to_string(),
    };
    
    // Create OCR instance
    let ocr = RapidOCR::new(config)?;
    
    // Run OCR on an image
    let results = ocr.ocr("image.jpg")?;
    
    // Process results
    for result in results {
        println!("Text: {}, Score: {:.3}", result.text, result.score);
        println!("Box: {:?}", result.box_points);
    }
    
    Ok(())
}
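A common follow-up to the loop above is to drop low-confidence results and order the rest roughly top to bottom, left to right. The sketch below shows one naive way to do that. It uses a local stand-in for `TextResult`, and both the helper names and the ordering rule are assumptions; real reading-order logic usually groups boxes into lines with some vertical tolerance.

```rust
use std::cmp::Ordering;

// Local stand-in mirroring the TextResult fields documented in the API Reference.
#[derive(Debug, Clone)]
struct TextResult {
    text: String,
    score: f32,
    box_points: [(f32, f32); 4],
}

/// Smallest x and smallest y over the four corners.
fn top_left(r: &TextResult) -> (f32, f32) {
    let x = r.box_points.iter().map(|p| p.0).fold(f32::INFINITY, f32::min);
    let y = r.box_points.iter().map(|p| p.1).fold(f32::INFINITY, f32::min);
    (x, y)
}

/// Keep results scoring at least `min_score`, sorted by top edge, then left edge.
fn confident_in_reading_order(results: Vec<TextResult>, min_score: f32) -> Vec<TextResult> {
    let mut kept: Vec<TextResult> = results
        .into_iter()
        .filter(|r| r.score >= min_score)
        .collect();
    kept.sort_by(|a, b| {
        let (ax, ay) = top_left(a);
        let (bx, by) = top_left(b);
        ay.partial_cmp(&by)
            .unwrap_or(Ordering::Equal)
            .then(ax.partial_cmp(&bx).unwrap_or(Ordering::Equal))
    });
    kept
}
```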

4. Document Pipeline (Layout + OCR)

RustO! now supports document layout analysis combined with OCR for structured document processing:

use rusto::{DocPipeline, DocPipelineConfig, LayoutConfig, RustOConfig};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Configure layout detection
    let layout_config = LayoutConfig::default("models/DocOCR/layout.mnn".into());
    
    // Configure OCR
    let ocr_config = RustOConfig::new_ppv5(
        "models/det.mnn".into(),
        "models/rec.mnn".into(),
        "models/dict.txt".into(),
    );
    
    // Create document pipeline
    let config = DocPipelineConfig {
        layout: layout_config,
        ocr: ocr_config,
    };
    
    let mut pipeline = DocPipeline::new(config)?;
    let result = pipeline.run("document.jpg")?;
    
    // Generate markdown output
    println!("{}", result.to_markdown());
    
    Ok(())
}

Supported Layout Elements:

  • Text, Title, Header, Footer
  • Figure, Figure Caption
  • Table, Table Caption
  • Reference, Equation

Example:

cargo run --example doc_pipeline_demo -- \
  --image document.jpg \
  --layout-model models/DocOCR/layout.mnn \
  --det-model models/det.mnn \
  --rec-model models/rec.mnn \
  --keys-path models/dict.txt

5. iOS Integration

Install via CocoaPods:

pod 'RustO', '~> 0.1'

Then in Swift:

import RustO

let ocr = try RapidOCR(
    detModelPath: Bundle.main.path(forResource: "det", ofType: "mnn")!,
    recModelPath: Bundle.main.path(forResource: "rec", ofType: "mnn")!,
    dictPath: Bundle.main.path(forResource: "dict", ofType: "txt")!
)

let results = try ocr.recognizeFile("image.jpg")
for result in results {
    print("\(result.text): \(result.score)")
}

API Reference

RapidOCRConfig

Configuration structure for initializing the OCR engine.

pub struct RapidOCRConfig {
    pub det_model_path: String,  // Path to detection MNN model
    pub rec_model_path: String,  // Path to recognition MNN model
    pub dict_path: String,       // Path to character dictionary
}

TextResult

OCR result for a single detected text region.

pub struct TextResult {
    pub text: String,                    // Recognized text
    pub score: f32,                      // Confidence score (0.0-1.0)
    pub box_points: [(f32, f32); 4],    // Bounding box corners
}
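Because `box_points` is a four-corner quadrilateral rather than a rectangle, callers that want to crop or highlight a region often need an axis-aligned rectangle. A minimal sketch, assuming pixel coordinates with the origin at the top-left of the image (the function name is hypothetical):

```rust
/// Axis-aligned bounding rectangle (x, y, width, height) of a quadrilateral,
/// clamped to an image of `img_w` x `img_h` pixels.
fn bounding_rect(points: &[(f32, f32); 4], img_w: u32, img_h: u32) -> (u32, u32, u32, u32) {
    let min_x = points.iter().map(|p| p.0).fold(f32::INFINITY, f32::min).max(0.0);
    let min_y = points.iter().map(|p| p.1).fold(f32::INFINITY, f32::min).max(0.0);
    let max_x = points
        .iter()
        .map(|p| p.0)
        .fold(f32::NEG_INFINITY, f32::max)
        .min(img_w as f32);
    let max_y = points
        .iter()
        .map(|p| p.1)
        .fold(f32::NEG_INFINITY, f32::max)
        .min(img_h as f32);
    // Round outward so the rectangle fully covers the quadrilateral.
    let x = min_x.floor() as u32;
    let y = min_y.floor() as u32;
    let w = (max_x.ceil() as u32).saturating_sub(x);
    let h = (max_y.ceil() as u32).saturating_sub(y);
    (x, y, w, h)
}
```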

RapidOCR

Main OCR engine.

impl RapidOCR {
    // Create a new OCR instance
    pub fn new(config: RapidOCRConfig) -> Result<Self, EngineError>;
    
    // Run OCR on an image file
    pub fn ocr<P: AsRef<Path>>(&self, image_path: P) -> Result<Vec<TextResult>, EngineError>;
    
    // Run OCR on image data in memory
    pub fn ocr_from_bytes(&self, image_data: &[u8]) -> Result<Vec<TextResult>, EngineError>;
}

FFI Bindings

The library includes C FFI bindings for integration with other languages. Enable with the ffi feature:

cargo build --release --features ffi

This produces:

  • Linux: librusto.so
  • macOS: librusto.dylib
  • Windows: rusto.dll

See src/ffi.rs for the complete FFI API documentation.
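For readers new to Rust FFI, the sketch below shows the general shape of a C-callable export: a function that takes a C string, returns a newly allocated C string, and a matching free function. The names `demo_echo_upper` and `demo_string_free` are hypothetical and are not part of RustO!'s FFI; consult src/ffi.rs for the real symbols. The `#[unsafe(no_mangle)]` spelling assumes a recent toolchain (older ones write `#[no_mangle]`).

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical export: returns an upper-cased copy of `input`,
/// or null if `input` is null or not valid UTF-8.
/// The caller owns the result and must release it with `demo_string_free`.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn demo_echo_upper(input: *const c_char) -> *mut c_char {
    if input.is_null() {
        return std::ptr::null_mut();
    }
    // Safety: the caller promises `input` points to a NUL-terminated string.
    let text = match unsafe { CStr::from_ptr(input) }.to_str() {
        Ok(s) => s,
        Err(_) => return std::ptr::null_mut(),
    };
    match CString::new(text.to_uppercase()) {
        Ok(c) => c.into_raw(), // hand ownership to the caller
        Err(_) => std::ptr::null_mut(),
    }
}

/// Hypothetical export: frees a string previously returned by `demo_echo_upper`.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn demo_string_free(ptr: *mut c_char) {
    if !ptr.is_null() {
        // Safety: `ptr` must come from `CString::into_raw` and not be freed twice.
        drop(unsafe { CString::from_raw(ptr) });
    }
}
```

Pairing every allocating export with a free function keeps allocation and deallocation on the Rust side, which avoids mixing allocators across the language boundary.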


📦 Models

RustO! uses PaddleOCR models converted to MNN format:

Supported Models

  • PPOCRv4 - PaddleOCR version 4 models
  • PPOCRv5 - PaddleOCR version 5 models (recommended)

Model Components

  1. Detection Model (det.mnn) - Detects text regions in images
  2. Recognition Model (rec.mnn) - Recognizes text within detected regions
  3. Dictionary (dict.txt) - Character dictionary for text recognition
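To show how the recognition output and the dictionary fit together, here is a minimal sketch of greedy CTC decoding, the decoding scheme commonly used with PaddleOCR recognition models. The assumptions are that class index 0 is the CTC blank, that index i (for i >= 1) maps to the (i-1)-th dictionary entry, and that the score is the mean confidence of emitted characters. None of this is taken from RustO!'s source.

```rust
/// Greedy CTC decode over per-timestep class probabilities.
/// Assumes index 0 is the blank and index i maps to dict[i - 1].
/// Returns the decoded text and the mean confidence of emitted characters.
fn ctc_greedy_decode(probs: &[Vec<f32>], dict: &[char]) -> (String, f32) {
    let mut text = String::new();
    let mut confidences: Vec<f32> = Vec::new();
    let mut previous: Option<usize> = None;

    for step in probs {
        if step.is_empty() {
            continue;
        }
        // Argmax for this timestep.
        let mut best = 0usize;
        for (i, &v) in step.iter().enumerate() {
            if v > step[best] {
                best = i;
            }
        }
        // Emit only non-blank classes that differ from the previous timestep.
        if best != 0 && previous != Some(best) {
            if let Some(&ch) = dict.get(best - 1) {
                text.push(ch);
                confidences.push(step[best]);
            }
        }
        previous = Some(best);
    }

    let score = if confidences.is_empty() {
        0.0
    } else {
        confidences.iter().sum::<f32>() / confidences.len() as f32
    };
    (text, score)
}
```

Repeated classes collapse into one character unless a blank separates them, which is how a word with a doubled letter can still be decoded.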

Download Models

# Example: download PPOCRv5 models (ONNX files; these must be converted to MNN before use)
wget https://github.com/RapidAI/RapidOCR/releases/download/v1.3.0/det.onnx
wget https://github.com/RapidAI/RapidOCR/releases/download/v1.3.0/rec.onnx
wget https://github.com/RapidAI/RapidOCR/releases/download/v1.3.0/dict.txt

⚡ Performance

Benchmarks

Tested on typical document images:

| Metric | Value |
| --- | --- |
| Detection | ~80ms |
| Recognition (per box) | ~120ms |
| Total (28 boxes) | ~3.5s |
| Memory Peak | ~200MB |
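The total is consistent with one detection pass plus one recognition pass per box: 80 ms + 28 × 120 ms = 3440 ms, or roughly 3.5 s. A small sketch of that estimate follows; the function name is hypothetical, and the model ignores preprocessing, postprocessing, and any batching.

```rust
/// Rough end-to-end latency: one detection pass plus one recognition pass per box.
fn estimate_total_ms(detection_ms: f64, recognition_ms_per_box: f64, boxes: u32) -> f64 {
    detection_ms + recognition_ms_per_box * f64::from(boxes)
}
```

Since recognition dominates, the estimate scales almost linearly with the number of detected boxes.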

Comparison with OpenCV-based implementations

| Aspect | RustO! | OpenCV-based |
| --- | --- | --- |
| Speed | ✅ Similar (±10%) | Baseline |
| Accuracy | ✅ 99.3% parity | 100% |
| Binary Size | ✅ Smaller | Larger (OpenCV deps) |
| Memory Usage | ✅ Lower | Higher (OpenCV overhead) |
| Dependencies | ✅ Minimal | OpenCV required |
| Safety | ✅ Memory safe | Manual memory management |

Configuration

Cargo Features

[features]
default = []           # Pure Rust mode
use-opencv = ["opencv"] # Use OpenCV backend
ffi = []               # Enable C FFI bindings

Build Profiles

[profile.release]
opt-level = 3          # Maximum optimization
lto = "fat"            # Link-time optimization
codegen-units = 1      # Single codegen unit for better optimization
strip = true           # Strip symbols
panic = "abort"        # Smaller binary

Development

Run Tests

cd rusto-rs
cargo test
cargo test --features use-opencv  # Test OpenCV backend

Run Benchmarks

cargo bench

Check Code

cargo clippy
cargo fmt --check

Known Issues

Rust Library (contours.rs)

  • ⚠️ Unused functions (400+ lines) - cleanup pending
  • ⚠️ Minor lint warnings - non-blocking

Remaining Parity Gap (0.7%)

  • 2 minor text differences out of 28 boxes
  • Caused by: Spacing ("Gol. Darah:" vs "Gol. Darah :")
  • Impact: Negligible for production use
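If downstream code compares RustO! output against an OpenCV-based pipeline, spacing differences like the one above can be neutralized before comparison. A minimal sketch, with hypothetical helper names:

```rust
/// Remove all Unicode whitespace from a string.
fn strip_whitespace(s: &str) -> String {
    s.chars().filter(|c| !c.is_whitespace()).collect()
}

/// Compare two recognized strings while ignoring spacing differences.
fn equal_ignoring_whitespace(a: &str, b: &str) -> bool {
    strip_whitespace(a) == strip_whitespace(b)
}
```

With this check, "Gol. Darah:" and "Gol. Darah :" count as the same text, while genuine character differences still fail.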

License

MIT


Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests: cargo test
  5. Submit a pull request

Support

  • 📧 Email: support@rapidocr.com
  • 💬 Discussions: GitHub Discussions
  • 🐛 Issues: GitHub Issues

🙏 Acknowledgments

RustO! builds upon the excellent work of:

  • RapidOCR - Architecture and design inspiration
  • PaddleOCR - State-of-the-art OCR models (PPOCRv4/v5)
  • MNN - Cross-platform inference engine
  • Rust Community - Excellent tooling and libraries (image, imageproc, nalgebra)

📝 Citation

If you use RustO! in your research or project, please cite:

@software{rusto2024,
  title = {RustO! - Pure Rust OCR Library},
  author = {byrizki},
  year = {2024},
  url = {https://github.com/byrizki/rusto-rs},
  note = {Based on RapidOCR and powered by PaddleOCR models}
}

Also consider citing the underlying technologies, RapidOCR and PaddleOCR.


Status: Production Ready 🚀
Version: 0.1.2
License: MIT

Made with ❤️ and 🦀 Rust

Report Bug · Request Feature · Contribute
