RandyHaylor/pdf-to-text


pdf-to-text

AI coding skill for converting PDF files to clean text. Handles both embedded-text PDFs and scanned/image PDFs via OCR.

setup

Navigate to your AI project's skills folder:

  • Antigravity: <workspace-root>/.agent/skills/
  • Claude Code: <workspace-root>/.claude/skills/

Then clone the repository into that folder:

    git clone https://github.com/RandyHaylor/pdf-to-text.git

use

  • Option 1: Tell the AI agent to use the pdf-to-text skill for your task.
  • Option 2: Update your project or global agent instructions to incorporate the pdf-to-text skill.

Workflow Enhancements

This skill goes beyond basic extraction with several techniques that improve reliability and output quality.

  • Two-page sample before full extraction

    Rather than blindly processing the entire PDF, the agent extracts just the first two pages and evaluates quality before committing to a method. This catches encoding issues, column mangling, and layout problems early — before wasting time on a bad full run.

  • Comparative method selection

    When embedded text extraction produces poor results, the agent runs OCR on the same sample pages and compares both outputs side by side. The better method wins. No guessing.

  • Visual inspection via AI vision

    When neither extraction method produces clean output, the agent renders pages as images and visually inspects them using multimodal vision. It can see what's actually on the page — watermarks, unusual fonts, columns, embedded images of text — and diagnose the specific issue before recommending a strategy.

  • User checkpoint before full processing

    The agent reports its sample findings and recommended approach to the user before running full extraction. No surprises, no wasted processing on the wrong method.

  • Graceful tool fallback chain

    The skill defines three extraction paths (pdftotext, ocrmypdf, and tesseract + pdftoppm) and checks tool availability upfront. If the preferred tool is missing, it falls through to the next option rather than failing.
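The techniques above can be sketched in a few small Python helpers. This is only an illustration, not the skill's actual implementation: every function name here is hypothetical, the preference order is assumed to follow the order in which the tools are listed, and the quality score is a deliberately crude stand-in for the agent's own judgment. The only external details relied on are `shutil.which` and the `pdftotext` options `-f`, `-l`, and `-layout`, with `-` as the output file meaning standard output.

```python
import shutil

# Hypothetical names throughout; this is a sketch, not the skill's real code.
# Each extraction path lists the executables it needs. The order is an
# assumption taken from the order in which the tools are listed.
EXTRACTION_PATHS = [
    ("pdftotext", ["pdftotext"]),
    ("ocrmypdf", ["ocrmypdf"]),
    ("tesseract+pdftoppm", ["tesseract", "pdftoppm"]),
]


def pick_path(is_installed=lambda tool: shutil.which(tool) is not None):
    """Return the first path whose tools are all installed, or None."""
    for name, tools in EXTRACTION_PATHS:
        if all(is_installed(tool) for tool in tools):
            return name
    return None


def sample_command(pdf_path, first=1, last=2):
    """Build a pdftotext command that writes pages first..last to stdout."""
    return ["pdftotext", "-f", str(first), "-l", str(last),
            "-layout", pdf_path, "-"]


def quality_score(text):
    """Crude heuristic (an assumption): the share of characters that are
    letters, digits, whitespace, or common punctuation."""
    if not text.strip():
        return 0.0
    ok = sum(ch.isalnum() or ch.isspace() or ch in ".,;:!?'\"()-"
             for ch in text)
    return ok / len(text)


def better_method(embedded_text, ocr_text):
    """Compare two sample outputs; ties go to embedded text, which is
    cheaper to extract."""
    if quality_score(ocr_text) > quality_score(embedded_text):
        return "ocr"
    return "embedded"


if __name__ == "__main__":
    # Reports which path is usable on this machine (None if no tools found).
    print("usable path:", pick_path())
    print("sample command:", sample_command("input.pdf"))
```

Passing the availability check in as a parameter keeps the fallback logic testable on machines where none of the tools are installed. In practice the list returned by `sample_command` could be handed to `subprocess.run`, and the resulting text passed to `better_method` alongside an OCR sample of the same two pages.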
