A Python-based tool (requires Python 3.10+) that extracts content from various document formats (PDF, DOCX, etc.) using MarkItDown and refines the resulting Markdown using Google AI Studio (gemini-3.1-flash-lite-preview).
Warning
Data Privacy Notice: When using this tool, your document's Markdown content is sent to Google's AI Studio servers for processing. Do not use this tool with highly sensitive or strictly confidential data if your organization's policy prohibits sharing data with third-party LLM providers.
- Document Extraction: Uses markitdown to convert complex documents into raw Markdown.
- LLM Cleanup: Leverages Gemini 3.1 Flash Lite via Google AI Studio to fix broken tables, inconsistent headings, and formatting artifacts.
- Safety & Validation: Validates input and output paths and handles API and file-operation failures with specific exceptions.
- Easy Configuration: Environment variables are managed via .env files.
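As an illustration of the input/output path validation listed above, here is a minimal sketch using only the standard library. The function name `validate_paths` and the specific checks are assumptions for illustration, not the tool's actual code; the only behavior taken from this README is that the output's parent directory is created automatically.

```python
from pathlib import Path


def validate_paths(input_path: str, output_path: str) -> tuple[Path, Path]:
    """Check the input file and prepare the output location (hypothetical helper)."""
    src = Path(input_path).expanduser().resolve()
    dst = Path(output_path).expanduser().resolve()

    if not src.is_file():
        raise FileNotFoundError(f"Input document not found: {src}")
    if dst.is_dir():
        raise IsADirectoryError(f"Output path is a directory: {dst}")
    if dst == src:
        # Assumed safeguard: never overwrite the source document.
        raise ValueError("Output path must differ from the input path")

    # Create the parent directory so the later write does not fail on a missing folder.
    dst.parent.mkdir(parents=True, exist_ok=True)
    return src, dst
```

Raising distinct exception types lets a caller report a missing input differently from a bad output location.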
- Python 3.10+
- A Google AI Studio API key.
Clone the repository and install the package in editable mode:
git clone <GITHUB_REPO_URL>
cd markdown-agent
pip install -e .
Create and activate a ready-to-use environment from the provided environment.yml:
conda env create -f environment.yml
conda activate markdown-agent
- Go to Google AI Studio.
- Click on "Get API key" in the left sidebar.
- Click "Create API key in new project" (or use an existing one).
- Copy your key.
Create a .env file in your working directory and add your Google API key:
GOOGLE_AI_STUDIO_KEY=your_google_api_key_here
The mda command will automatically load this file on startup.
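To make the .env loading step concrete, the sketch below reads KEY=VALUE pairs into the process environment using only the standard library. It is an illustration under assumptions: the helper names `load_env_file` and `require_api_key` are hypothetical, and the tool itself may well rely on a dedicated library for this instead.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Read KEY=VALUE pairs from a .env file into os.environ (illustrative only)."""
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("'\"")
        # Variables already set in the shell take precedence over the file.
        os.environ.setdefault(key, value)
        loaded[key] = os.environ[key]
    return loaded


def require_api_key() -> str:
    """Return the API key or fail early with a readable message."""
    key = os.environ.get("GOOGLE_AI_STUDIO_KEY", "")
    if not key:
        raise RuntimeError("GOOGLE_AI_STUDIO_KEY is not set; add it to your .env file")
    return key
```

Failing early on a missing key avoids spending time on document extraction before the API call is rejected.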
mda path/to/document.pdf -o path/to/output.md

| Argument | Description |
|---|---|
| input | Path to the source document (PDF, DOCX, etc.) |
| -o, --output | Path to the output Markdown file (parent directory is created automatically) |
| -v, --verbose | Enable verbose (DEBUG) logging |
| -s, --silent | Disable all but ERROR logging |
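For readers who want to see how such a command line could be wired up, here is a sketch with `argparse`. It mirrors the arguments in the table, but several details are assumptions rather than documented behavior: that `--output` is required, that `-v` and `-s` are mutually exclusive, and that the default log level is INFO.

```python
import argparse
import logging


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented arguments (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        prog="mda",
        description="Extract a document to Markdown and refine it with an LLM.",
    )
    parser.add_argument("input", help="Path to the source document (PDF, DOCX, etc.)")
    # Assumption: the output path is mandatory.
    parser.add_argument("-o", "--output", required=True,
                        help="Path to the output Markdown file")
    # Assumption: verbose and silent cannot be combined.
    noise = parser.add_mutually_exclusive_group()
    noise.add_argument("-v", "--verbose", action="store_true",
                       help="Enable verbose (DEBUG) logging")
    noise.add_argument("-s", "--silent", action="store_true",
                       help="Disable all but ERROR logging")
    return parser


def log_level(args: argparse.Namespace) -> int:
    """Map the verbosity flags to a logging level (INFO default is assumed)."""
    if args.verbose:
        return logging.DEBUG
    if args.silent:
        return logging.ERROR
    return logging.INFO
```

A caller would then run something like `logging.basicConfig(level=log_level(build_parser().parse_args()))` before starting the conversion.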
This tool uses Gemini 3.1 Flash Lite Preview via Google AI Studio, which has the following quota for free-tier users:
- RPM (Requests Per Minute): 15
- TPM (Tokens Per Minute): 250,000
- RPD (Requests Per Day): 500
Note
Requests that exceed any of these limits fail with 429: Too Many Requests errors.
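When converting many documents in a batch, a client-side throttle can help stay under the per-minute request quota. The sketch below is a simple sliding-window limiter; the class name `RequestThrottle` is hypothetical, it is not part of this tool, and it covers only the RPM limit, not TPM or RPD.

```python
import time
from collections import deque
from typing import Callable


class RequestThrottle:
    """Sliding-window limiter: at most max_requests per window seconds."""

    def __init__(
        self,
        max_requests: int = 15,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_requests = max_requests
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._sent: deque[float] = deque()  # timestamps of recent requests

    def wait(self) -> float:
        """Block until a request slot is free; return the seconds waited."""
        now = self._clock()
        # Forget requests that have already left the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        waited = 0.0
        if len(self._sent) >= self.max_requests:
            # Sleep until the oldest request in the window expires.
            waited = self.window - (now - self._sent[0])
            self._sleep(waited)
            now = self._clock()
            self._sent.popleft()
        self._sent.append(now)
        return waited
```

Calling `throttle.wait()` immediately before each API request spaces out a batch automatically. The clock and sleep functions are injectable so the behavior can be tested without real delays.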
- API Errors (503/429): The Gemini API may occasionally return service errors (503) or rate-limit errors (429). If you encounter one, wait a few seconds and retry.
- Token Limits: The model handles large contexts (up to 1M tokens), but extremely long documents may still hit limits or process more slowly.
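The "wait a few seconds and retry" advice for 503 and 429 responses can be automated with exponential backoff. This is a sketch under stated assumptions: `ApiError` is a hypothetical stand-in for whatever exception the API client actually raises, and the attempt count and delays are arbitrary defaults, not values used by this tool.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

RETRYABLE_STATUS = {429, 503}


class ApiError(Exception):
    """Hypothetical stand-in for the error type raised by the API client."""

    def __init__(self, status: int, message: str = "") -> None:
        super().__init__(f"{status}: {message}")
        self.status = status


def call_with_retry(
    fn: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on 429/503 with exponential backoff plus jitter."""
    for attempt in range(attempts):
        try:
            return fn()
        except ApiError as exc:
            last_try = attempt == attempts - 1
            # Other statuses (e.g. 400) will not succeed on retry, so re-raise.
            if exc.status not in RETRYABLE_STATUS or last_try:
                raise
            # Delay doubles each attempt; up to 1s of jitter avoids synchronized retries.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    raise ValueError("attempts must be at least 1")
```

Wrapping the API call as `call_with_retry(lambda: send_request(text))`, where `send_request` is a placeholder for the real call, keeps the retry policy separate from the request logic.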
- markdown_agent/cli.py: Core logic for extraction, LLM processing, and validation.
- pyproject.toml: Package configuration and dependencies.
- environment.yml: Conda environment definition.
- .env: Environment variable configuration (not committed to git).