A wrapper to work with Tesseract OCR inside PHP.
Updated Mar 25, 2025 - PHP
Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in PyTorch
MORT translator project - Real-time game translator with OCR
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.
Flame is an open-source multimodal AI system designed to translate UI design mockups into high-quality React code. It leverages vision-language modeling, automated data synthesis, and structured training workflows to bridge the gap between design and front-end development.
A Node.js wrapper for the Tesseract OCR API
Data release for the ImageInWords (IIW) paper.
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
Codebase for fine-tuning / evaluating nougat-based image2latex generation models
The module extracts text from images using the Tesseract OCR engine. Text in images is often blurry or unevenly sized, so the image is pre-processed to make it easier for the OCR engine to read. The module first draws a bounding box around the text in the image and then normalizes it to 300 dpi, a resolution suitable for the OCR engine.
A Flutter package for fast, accurate, and secure credit card and debit card scanning
A Python package for converting PDFs to Markdown while extracting images and tables and generating text descriptions of the extracted tables and images using several LLM clients, plus additional functionality. Markdrop is available on PyPI.
L-Verse: Bidirectional Generation Between Image and Text
Notepad is a multi-module Jetpack Compose note-taking app with a sketch pad, voice recorder, and image capture.
Usage is simple: download an image file or pass its link when running the Python script. The output is a text file, and you can preview the conversion result immediately on the command line.
OCR library to extract text & tables from PDF files and images. Convert any image or PDF to CSV / TXT / JSON / Searchable PDF.
Solution to OpenAI's im2latex Request for Research
Extracts details from Indian national identification cards, such as PAN (completed) and Aadhaar, Passport, and Driving License (WIP), in a structured format
OCR with Google's AI technology (Cloud Vision API)
The largest multilingual image-text classification dataset. It contains fashion products.