kenzitjandra/Layout-Parser-ComputerVision

Layout Parser + Handwriting Recognition

Detects layout regions (titles, paragraphs, lists) in handwritten note images using DocLayout-YOLO, then runs a custom character-level OCR model to convert them into structured Markdown text.
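As a rough illustration of the final step, the sketch below shows one way labelled regions could be assembled into Markdown. The `Region` class, its fields, the label names, and the label-to-Markdown mapping are all assumptions made for this example; they are not taken from the repository's code.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical container for one detected layout region."""
    label: str   # assumed label names: "title", "paragraph", "list"
    text: str    # text recognized by the OCR step
    top: float   # y-coordinate of the region's top edge (reading order)


def regions_to_markdown(regions):
    """Render regions top-to-bottom as Markdown (illustrative mapping only)."""
    blocks = []
    for region in sorted(regions, key=lambda r: r.top):
        if region.label == "title":
            blocks.append(f"# {region.text}")
        elif region.label == "list":
            # Assume one list item per line of recognized text.
            items = [ln.strip() for ln in region.text.splitlines() if ln.strip()]
            blocks.append("\n".join(f"- {item}" for item in items))
        else:
            # Paragraphs (and any unknown label) are emitted as plain text.
            blocks.append(region.text)
    return "\n\n".join(blocks)
```

Sorting by the top edge is the simplest reading-order heuristic; multi-column notes would need something more careful.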

Requirements

  • Python 3.11 — TensorFlow does not support 3.12+ on Windows yet
  • Git
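A quick way to confirm the interpreter matches the requirement before installing anything is a small check like the one below. This is a minimal sketch; the helper name `is_supported` is hypothetical and not part of the repository.

```python
import sys


def is_supported(version_info=sys.version_info):
    """Return True only for Python 3.11.x, the version this README asks for."""
    return tuple(version_info[:2]) == (3, 11)


if __name__ == "__main__":
    if not is_supported():
        print(f"Python 3.11 is required, found {sys.version.split()[0]}")
```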

Setup

1. Clone the repo

git clone https://github.com/Gilliooo/Layout-Parser-ComputerVision.git
cd Layout-Parser-ComputerVision

2. Install Python 3.11

Windows:

winget install Python.Python.3.11

Mac / Linux: Download from python.org

3. Create a virtual environment

Windows (PowerShell):

Set-ExecutionPolicy -Scope CurrentUser -ExecutionPolicy RemoteSigned
py -3.11 -m venv .venv --without-pip
.venv\Scripts\Activate.ps1
python -m ensurepip --upgrade

Mac / Linux:

python3.11 -m venv .venv
source .venv/bin/activate

4. Install dependencies

python -m pip install -r requirements.txt

5. Run the app

python -m streamlit run app.py

On first run with DocLayout-YOLO mode selected, the model weights (~100 MB) are automatically downloaded from Hugging Face and cached locally.
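The download-and-cache behaviour described above can be approximated with `huggingface_hub.hf_hub_download`, which stores files in the local Hugging Face cache and reuses them on later calls. The sketch below is an assumption about how such a fetch might look, not the app's actual code. The repository ID comes from the Notes section, while the weights filename and the `fetch_weights` helper are assumed for illustration.

```python
from pathlib import Path

REPO_ID = "juliozhao/DocLayout-YOLO-DocStructBench"  # from the Notes section
WEIGHTS_FILE = "doclayout_yolo_docstructbench_imgsz1024.pt"  # assumed filename


def fetch_weights(repo_id=REPO_ID, filename=WEIGHTS_FILE, downloader=None):
    """Return the local path to the weights, downloading on first use.

    `downloader` is injectable so the function can be exercised offline;
    by default it is huggingface_hub.hf_hub_download.
    """
    if downloader is None:
        # Imported lazily so the module loads without huggingface_hub installed.
        from huggingface_hub import hf_hub_download
        downloader = hf_hub_download
    return Path(downloader(repo_id=repo_id, filename=filename))
```

Calling `fetch_weights()` a second time should return the cached path without downloading again.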

Notes

  • handwriting_recognition_model.keras and classes.json are included in the repo — no manual download needed.
  • The DocLayout-YOLO weights are fetched from juliozhao/DocLayout-YOLO-DocStructBench on first use.
  • Use python -m streamlit instead of just streamlit to ensure the venv's Python is used.
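To verify that the active interpreter really is the one inside `.venv`, a small check such as the following can help. It is a sketch only, and the helper name `in_virtualenv` is hypothetical. It relies on the standard behaviour that `sys.prefix` differs from `sys.base_prefix` inside a virtual environment created by `venv`.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when running inside a venv (prefix differs from base prefix)."""
    prefix = sys.prefix if prefix is None else prefix
    if base_prefix is None:
        base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return prefix != base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside virtual environment:", in_virtualenv())
```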
