docx-parser

Here are 9 public repositories matching this topic...

Dedoc is a library (service) for automated document parsing that converts documents to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser

  • Updated Feb 14, 2025
  • Python

A Python tool that uses Google's Gemini AI to automatically extract structured metadata from PDF and DOCX documents, saving results to Excel for easy analysis and organizing raw responses as JSON files.

  • Updated Mar 21, 2025
  • Python
