vlm-ocr

Here are 7 public repositories matching this topic...

bytedance / Dolphin

The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.

python pdf parser ocr pdf-converter document-analysis pdf-parser layout-analysis vlm-ocr

Updated Aug 13, 2025
Python

vlm-run / vlmrun-hub

A hub for various industry-specific schemas to be used with VLMs.

json ai computer-vision etl vlm multimodal pydantic pydantic-models genai vlm-ocr

Updated May 27, 2025
Python

kaminoer / KokoDOS

Yet another self-hosted AI voice assistant. GlaDOS' blazing-fast pipeline with Kokoro TTS voice and vision.

ai tts vision stt voice-assistant vlm llm kokoro-tts vlm-ocr

Updated Jan 28, 2025
Python

video-db / ocr-benchmark

Benchmarking Vision-Language Models on OCR tasks in Dynamic Video Environments

benchmark ocr arxiv research-paper easyocr rapidocr vlms videodb vlm-ocr

Updated Feb 14, 2025
Python

OmarSamirz / ImageFromTextGenerator

IFTG (ImageFromTextGenerator) is a Python package that simplifies creating robust datasets for OCR models. Generate images from text, apply over 10 built-in noise effects, and customize fonts and layouts. IFTG supports all languages and offers endless noise combinations, including custom noise creation.

Updated Apr 3, 2025
Python

Niraya666 / DocuLingo

DocuLingo is a powerful document parsing tool built with multimodal large language models to enhance RAG (Retrieval Augmented Generation) workflows.

document-converting rag vlm-ocr

Updated May 7, 2025
Python

Takk8IS / CyberTechVLMDetector

The CyberTech VLM Detector is a computer vision system designed to run entirely on edge devices, without requiring cloud access. The system uses vision-language models (VLM) to detect and locate objects in images based on natural language commands and development, including my creation of HIM™ and MAIC™

python camera view read detector vlm vlms takk8is takk-ag takk-design davidccavalcante vlm-ocr

Updated Jul 24, 2025
Python
