Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Web page PDF/PNG rendering done right. Self-hosted service for rendering receipts, invoices, or any content.
Convert PDF to HTML without losing text or format.
A web interface to extract tabular data from PDFs
An extensible Markdown Editor, Viewer and Weblog Publisher for Windows
A book on JavaScript Promises
JavaScript and Node.js cheatsheets
Use RMarkdown to generate PDF Conference Posters via HTML
Display paginated content in the browser and generate print books using web technology
Book publishing as easy as it should be (built with Symfony components)
File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.
Generate a single PDF file from MkDocs repository.
(Java) A method to extract tabular content from PDF files
Ruby Hacking Guide Translation
🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based
Android application fuzzing framework with fuzzers and crash monitor.