v0.0.5

@MinuraPunchihewa released this 01 May 10:52
· 22 commits to main since this release
38b6ec9

TL;DR

Performance Improvements

  • Pages are now evaluated for visual content before being passed to the LLM.
  • Non-visual pages are processed with traditional text parsing.
  • Set use_llm_for_all=True to override this and send every page to the LLM.
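
The routing described above can be pictured with a minimal sketch. The release notes do not say how a page is judged to be "visual", so the heuristic here (any embedded image or vector drawing) is an assumption, and `PageInfo`, `has_visual_content`, and `route_pages` are hypothetical names, not part of the aipdf API. Only the `use_llm_for_all` flag comes from the notes.

```python
from dataclasses import dataclass


@dataclass
class PageInfo:
    """Hypothetical per-page summary; not an aipdf type."""
    number: int
    image_count: int    # embedded raster images found on the page
    drawing_count: int  # vector drawing commands found on the page


def has_visual_content(page: PageInfo) -> bool:
    # Assumed heuristic: any image or vector drawing counts as visual.
    return page.image_count > 0 or page.drawing_count > 0


def route_pages(pages, use_llm_for_all=False):
    """Split pages into (llm_pages, text_pages), each a list of page numbers."""
    llm_pages, text_pages = [], []
    for page in pages:
        if use_llm_for_all or has_visual_content(page):
            llm_pages.append(page.number)
        else:
            text_pages.append(page.number)
    return llm_pages, text_pages


pages = [PageInfo(1, 0, 0), PageInfo(2, 3, 0), PageInfo(3, 0, 12)]
print(route_pages(pages))                        # ([2, 3], [1])
print(route_pages(pages, use_llm_for_all=True))  # ([1, 2, 3], [])
```

With the flag left at its default, only pages 2 and 3 would incur an LLM call; page 1 would go through plain text parsing.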

Dependency Cleanup

  • Replaced pdf2image with PyMuPDF for image conversion.
  • This removes the need for poppler and simplifies installation.
  • Dependencies are cleaned up and version-pinned to avoid conflicts (e.g., with MindsDB).
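
For illustration, page-to-image conversion with PyMuPDF might look roughly like the sketch below. This is not the library's actual code: `zoom_for_dpi` and `render_page_png` are hypothetical helpers, and the 144 DPI default is an assumption. The PyMuPDF import is deferred to the function body so the module loads even where PyMuPDF is not installed.

```python
def zoom_for_dpi(dpi: int) -> float:
    # PDF user space is 72 units per inch, so the scale factor is dpi / 72.
    return dpi / 72.0


def render_page_png(pdf_bytes: bytes, page_index: int = 0, dpi: int = 144) -> bytes:
    """Render one page of an in-memory PDF to PNG bytes (hypothetical helper)."""
    import fitz  # PyMuPDF, imported lazily; no poppler binary is involved

    zoom = zoom_for_dpi(dpi)
    with fitz.open(stream=pdf_bytes, filetype="pdf") as doc:
        page = doc.load_page(page_index)
        # The matrix scales the page before rasterising it.
        pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))
        return pix.tobytes("png")
```

Because PyMuPDF ships as a self-contained wheel, a helper like this needs no system-level package, which is the installation benefit the notes refer to.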

Enhanced Configurability

  • Added support for additional OpenAI parameters (e.g., temperature).
  • API key can now be set via the AIPDF_API_KEY environment variable.
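
A minimal sketch of how these two options could fit together follows. The environment variable name `AIPDF_API_KEY` is from the notes; everything else is an assumption, including the helper names `resolve_api_key` and `build_request_params`, the placeholder model name, and the choice to let an explicitly passed key take precedence over the environment.

```python
import os


def resolve_api_key(api_key=None):
    """Return the explicit key if given, else fall back to AIPDF_API_KEY."""
    # Assumed precedence: an explicit argument wins over the environment.
    key = api_key or os.environ.get("AIPDF_API_KEY")
    if not key:
        raise ValueError("No API key: pass one or set AIPDF_API_KEY")
    return key


def build_request_params(model, **openai_params):
    """Merge extra OpenAI parameters (e.g. temperature) into a request dict."""
    return {"model": model, **openai_params}


os.environ["AIPDF_API_KEY"] = "sk-example"  # placeholder value for the demo
print(resolve_api_key())
print(build_request_params("example-model", temperature=0.2))
```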

Async OCR Support

  • Introduced ocr_async() to make asynchronous OpenAI API calls.
  • Complements the existing multi-threaded ocr() function.
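
The general shape of an asynchronous per-page workflow can be sketched with the standard library alone. This does not reproduce `ocr_async()` or its signature, which the notes do not give: `fake_llm_ocr` is a stand-in for a real OpenAI call, and `ocr_pages_concurrently` and its `max_concurrency` parameter are hypothetical.

```python
import asyncio


async def fake_llm_ocr(page_number: int) -> str:
    # Stand-in for an asynchronous OpenAI request; just waits briefly.
    await asyncio.sleep(0.01)
    return f"# Page {page_number}"


async def ocr_pages_concurrently(page_numbers, max_concurrency=4):
    """Run one coroutine per page, capping how many are in flight at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(n):
        async with semaphore:
            return await fake_llm_ocr(n)

    # gather() returns results in the order the awaitables were passed in.
    return await asyncio.gather(*(worker(n) for n in page_numbers))


print(asyncio.run(ocr_pages_concurrently([1, 2, 3])))
# ['# Page 1', '# Page 2', '# Page 3']
```

Compared with a thread pool, this style keeps all requests on a single event loop, which suits callers that are already running inside async code.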

Code Quality & Testing

  • Refactored functions for better readability and maintainability.
  • Added unit and integration tests to ensure reliability.

What's Changed

Full Changelog: v0.0.4...v0.0.5