Litedoc v2.0 — The Major Release

The biggest update yet. Faster. Smarter. Tougher.

Core Engine Upgrades

Document Layout Analysis (DLA) Engine

Replaced blind linear reading with a recursive XY-Cut algorithm. Litedoc now maps the geometry of every page, isolating headers, sidebars, and main text blocks so they can be reconstructed in reading order.
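
For readers new to the technique, here is a minimal sketch of a recursive XY-Cut over bounding boxes. It is not Litedoc's actual code: the box shape `{ x0, y0, x1, y1 }`, the function names, and the `minGap` threshold are all assumptions made for illustration.

```javascript
// Hypothetical box shape: { x0, y0, x1, y1 } in page units, y growing downward.

// Find the widest empty gap along one axis ("x" or "y") among the boxes.
function widestGap(boxes, axis) {
  const lo = axis + "0";
  const hi = axis + "1";
  const sorted = [...boxes].sort((a, b) => a[lo] - b[lo]);
  let reach = sorted[0][hi]; // furthest extent covered so far
  let best = null;
  for (const box of sorted.slice(1)) {
    const size = box[lo] - reach;
    if (size > 0 && (best === null || size > best.size)) {
      best = { size, at: reach + size / 2 }; // cut through the middle of the gap
    }
    reach = Math.max(reach, box[hi]);
  }
  return best; // null when the boxes overlap continuously on this axis
}

// Recursively split at the widest gap; blocks come back top-to-bottom, left-to-right.
function xyCut(boxes, minGap = 10) {
  if (boxes.length <= 1) return boxes.length ? [boxes] : [];
  const gapY = widestGap(boxes, "y");
  const gapX = widestGap(boxes, "x");
  const useY = gapY && gapY.size >= minGap && (!gapX || gapY.size >= gapX.size);
  const useX = !useY && gapX && gapX.size >= minGap;
  if (!useY && !useX) return [boxes]; // no usable gap: treat as one block
  const axis = useY ? "y" : "x";
  const cut = (useY ? gapY : gapX).at;
  const first = boxes.filter((b) => b[axis + "1"] <= cut);
  const second = boxes.filter((b) => b[axis + "0"] >= cut);
  return [...xyCut(first, minGap), ...xyCut(second, minGap)];
}
```

On a page with a full-width header above two columns, this sketch first cuts horizontally below the header, then vertically through the gutter, so each column is returned as its own block.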

Asymmetrical Multi-Column Routing

Massive improvements for academic papers. The engine now detects microscopic gutters and natively processes columns top-to-bottom, eliminating horizontal text interleaving.

Vector-Based Table Reconstruction

Enhanced addons.js intersection matrix logic now captures table structures as clean Markdown grids, bypassing the need for OCR on structured data.
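
As a rough illustration of the idea (not the code in addons.js), the sketch below assumes the ruling lines of a table have already been reduced to sorted x and y edge coordinates, and that each text item carries an `x`, `y`, and `str`. All names here are hypothetical.

```javascript
// Index of the interval [edges[i], edges[i + 1]) that contains value, or -1.
function locate(edges, value) {
  for (let i = 0; i < edges.length - 1; i++) {
    if (value >= edges[i] && value < edges[i + 1]) return i;
  }
  return -1;
}

// Place each text item in its grid cell and print the grid as a Markdown table.
function gridToMarkdown(xEdges, yEdges, items) {
  const rows = yEdges.length - 1;
  const cols = xEdges.length - 1;
  if (rows < 1 || cols < 1) return "";
  const cells = Array.from({ length: rows }, () => Array(cols).fill(""));
  for (const { x, y, str } of items) {
    const r = locate(yEdges, y);
    const c = locate(xEdges, x);
    if (r < 0 || c < 0) continue; // text outside the ruled grid
    cells[r][c] = (cells[r][c] + " " + str).trim();
  }
  // Escape literal pipes so they do not break the Markdown row.
  const line = (row) => "| " + row.map((s) => s.replace(/\|/g, "\\|")).join(" | ") + " |";
  const divider = "| " + Array(cols).fill("---").join(" | ") + " |";
  return [line(cells[0]), divider, ...cells.slice(1).map(line)].join("\n");
}
```

The first row is treated as the header, which is a simplification; merged cells and borderless tables would need more work than this sketch does.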

Heavy-Duty Memory Management

Massive stability boost for large (200+ page) documents. The new Batch Queuing system processes files in 10-page chunks, forcefully clearing VRAM between cycles to prevent browser crashes.
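
The batching pattern can be sketched roughly as follows. The callbacks `processPage` and `releaseResources` are hypothetical placeholders for whatever the real pipeline does per page and between cycles; only the 10-page chunk size comes from the notes above.

```javascript
// Split pages 1..pageCount into inclusive [start, end] ranges of batchSize pages.
function batchRanges(pageCount, batchSize = 10) {
  const ranges = [];
  for (let start = 1; start <= pageCount; start += batchSize) {
    ranges.push([start, Math.min(start + batchSize - 1, pageCount)]);
  }
  return ranges;
}

// processPage and releaseResources are hypothetical callbacks supplied by the caller.
async function processInBatches(pageCount, processPage, releaseResources) {
  const results = [];
  for (const [start, end] of batchRanges(pageCount)) {
    for (let page = start; page <= end; page++) {
      results.push(await processPage(page));
    }
    await releaseResources(start, end); // e.g. shrink canvases, terminate workers
    await new Promise((resolve) => setTimeout(resolve, 0)); // yield to the event loop
  }
  return results;
}
```

Processing pages sequentially inside each batch keeps peak memory bounded by one batch rather than the whole document, at the cost of some throughput.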

Performance & Reliability

Language Auto-Detect (OSD Router)

The OCR engine now runs a lightweight 400×400px OSD pre-pass to detect script (Arabic, Latin, etc.) before initializing the heavy-duty language workers.
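
A hedged sketch of the routing step: it assumes the pre-pass image is a downscale that fits inside 400×400 (the notes do not say whether it is a downscale or a crop), and the script-to-language table is an illustrative guess using standard Tesseract language codes, not Litedoc's real mapping.

```javascript
// Size of a thumbnail that fits inside maxSide x maxSide, never upscaling.
function osdSize(width, height, maxSide = 400) {
  const scale = Math.min(1, maxSide / Math.max(width, height));
  return { width: Math.round(width * scale), height: Math.round(height * scale) };
}

// Hypothetical routing table: detected script name -> language workers to load.
const SCRIPT_TO_LANGS = new Map([
  ["Latin", ["eng"]],
  ["Arabic", ["ara"]],
  ["Cyrillic", ["rus"]],
]);

// Pick the languages for a detected script, falling back when it is unknown.
function routeLanguages(script, fallback = ["eng"]) {
  return SCRIPT_TO_LANGS.get(script) ?? fallback;
}
```

The point of the pre-pass is that only the workers returned by `routeLanguages` need to be initialized, rather than every language at once.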

Intelligent Image Triage

Automatically detects native text vs. image-based PDFs, routing to the optimal path to save processing time and battery.
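
One simple way to make this decision, shown here as a sketch under stated assumptions, is to count the characters in each page's text layer. The input mirrors the `items` array returned by PDF.js `getTextContent()` (each item has a `str`); the 20-character threshold and the function names are assumptions.

```javascript
// Count non-whitespace-padded characters in a page's text items.
function countChars(textItems) {
  return textItems.reduce((total, item) => total + item.str.trim().length, 0);
}

// Label each page "native" (usable text layer) or "ocr" (needs image recognition).
function triagePages(pages, minChars = 20) {
  return pages.map((items) => (countChars(items) >= minChars ? "native" : "ocr"));
}
```

Pages that carry only a stray page number or watermark fall below the threshold and are still sent to OCR.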

Mobile & Desktop Optimization

Aggressive performance tuning, including a 150 DPI cap for mobile OCR and worker respawning every 5–10 pages to counter iOS/Android thermal throttling.
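
The two knobs can be sketched like this. PDF page units are 1/72 inch, so a render scale of 1 corresponds to 72 DPI; the function names and the default respawn interval of 5 pages are assumptions picked from the range quoted above.

```javascript
const PDF_POINTS_PER_INCH = 72;
const MOBILE_DPI_CAP = 150;

// Render scale for a requested DPI, capped on mobile devices.
function renderScale(requestedDpi, isMobile) {
  const dpi = isMobile ? Math.min(requestedDpi, MOBILE_DPI_CAP) : requestedDpi;
  return dpi / PDF_POINTS_PER_INCH;
}

// True when the OCR worker should be torn down and recreated after this page.
function shouldRespawn(pagesDone, every = 5) {
  return pagesDone > 0 && pagesDone % every === 0;
}
```

Halving the DPI cuts the pixel count per page to roughly a quarter, which is where most of the mobile savings would come from.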

Crash Recovery & Telemetry

If a file fails, the UI now flags it with an error badge and provides a one-click litedoc-crash-log.txt for easy bug reporting.

Developer & UX/UI Improvements

Mobile UI

We've completely overhauled the UI for mobile devices, ensuring a seamless experience on all screen sizes.

Modular Architecture

Completely decoupled the codebase. The project is now structured for easy community contributions, with a new Python build script that compiles the distribution-ready index.html.

View Project Structure

src/
├── index.html                  # Main entry point
├── css/                        # Stylesheets
│   ├── addons.css              # Plugin & extra component styles
│   ├── main.css                # Core application styles
│   └── mobile.css              # Mobile-specific overrides
└── js/                         # Application Logic
    ├── addons.js               # OCR & Password handling
    ├── demo.js                 # Sample document logic
    ├── downloads.js            # ZIP & individual file export
    ├── dropzone.js             # File upload & triage UI
    ├── file-tree.js            # Workspace explorer logic
    ├── main.js                 # Central orchestrator
    ├── markdown-renderer.js    # MD & Math processing
    ├── mobile-ux.js            # Mobile view switching
    ├── ocr.js                  # Tesseract & OCR engine
    ├── pdf-parser.js           # Core PDF.js extraction
    ├── reset-utils.js          # Workspace cleanup helpers
    ├── state.js                # Global application state
    ├── terminal.js             # Diagnostic logging
    ├── ui-controls.js          # Editor & button interactions
    ├── ui.js                   # General UI component logic
    └── utils.js                # Shared helper functions

Manual Hallucination Fallback

Added an "Unformat" button inside the editor. If the parser ever makes a mistake, one click instantly strips markdown/table formatting, returning the selection to plain text.
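
To give a feel for what such a button might do, here is a small regex-based sketch. It is an assumption about the behavior, not the editor's real implementation, and it only handles headings, asterisk emphasis, inline code, and pipe tables.

```javascript
// Matches table divider rows such as "| --- | :---: |" (at least one pipe required).
const DIVIDER_ROW = /^\s*\|?(\s*:?-{3,}:?\s*\|)+(\s*:?-{3,}:?\s*)?$/;

// Strip common Markdown formatting from a selection, returning plain text.
function unformat(markdown) {
  return markdown
    .split("\n")
    .filter((line) => !DIVIDER_ROW.test(line)) // drop table divider rows
    .map((line) =>
      line
        // table row -> cells joined by tabs
        .replace(/^\s*\|(.*)\|\s*$/, (_, row) => row.split("|").map((c) => c.trim()).join("\t"))
        .replace(/^#{1,6}\s+/, "") // heading markers
        .replace(/\*\*(\S(?:.*?\S)?)\*\*/g, "$1") // bold
        .replace(/\*(\S(?:.*?\S)?)\*/g, "$1") // italic
        .replace(/`([^`]+)`/g, "$1") // inline code
    )
    .join("\n");
}
```

Emphasis is only stripped when the asterisks hug the text, so an expression like `2 * 3 * 4` is left alone.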

Queue Control

New "Skip" functionality allows users to force-abort long-running files without locking the application thread.
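
A cooperative-cancellation sketch of how a "Skip" can work without blocking: the conversion is written as a generator that checks a standard `AbortSignal` between pages. The function name and the `convertPage` callback are hypothetical, and a real driver would schedule each step asynchronously.

```javascript
// Convert pages one at a time, stopping early once the signal is aborted.
// convertPage is a hypothetical per-page callback supplied by the caller.
function* convertPages(pages, convertPage, signal) {
  for (const page of pages) {
    if (signal.aborted) return "skipped"; // user pressed Skip
    yield convertPage(page);
  }
  return "done";
}

// Usage: the Skip button would call controller.abort().
const controller = new AbortController();
const steps = convertPages([1, 2, 3], (page) => `page ${page} converted`, controller.signal);
console.log(steps.next().value); // first page is converted
controller.abort();
console.log(steps.next()); // the generator stops before page 2
```

Because the check happens between pages, a skip takes effect at the next page boundary rather than mid-page.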

Minor Improvements & Maintenance

Area	What Changed
General Polish	Series of under-the-hood refinements and stability patches across the interface
Micro-Optimizations	Dozens of small adjustments to rendering speed, UI responsiveness, and memory footprint
Refined Error Handling	Improved edge-case handling for malformed PDF object streams to prevent silent failures

A Note on Performance

Litedoc runs entirely in your browser. While v2.0 handles large files significantly better, OCR remains computationally expensive. For massive 200+ page documents, stick to native text PDFs whenever possible. If you force OCR on a 200-page image-only PDF in a mobile browser, you're going to hit hardware limits (because physics).

🔭 Looking Ahead

A massive thank you to everyone who supported this project via Ko-fi.

My focus moving forward is purely on stability, broader PDF format support, and extreme optimization.

🤝 Want to Help?

The codebase is now fully modular. If you're a dev, check the repo and send a PR; it's easier than ever to contribute. I'm taking a well-deserved break, but if you have questions, reach out via email or X.

Channel	Link
🌐 Website	litedoc.xyz
𝕏 Twitter	@0xovoo
☕ Ko-fi	ko-fi.com/0xovo
📦 GitHub	github.com/0xovo/LiteDoc
📧 Email	contact@litedoc.xyz

---

🧪 Tests & Benchmarks

Our test suite and performance benchmarks are fully updated in the repository to guarantee conversion parity across standard document layouts.

Made with passion · No cloud · No nonsense · Just your browser
