Smart Invoice Processor

This repository contains the source code for the article "From Chaos to Structure: Mastering OpenAI & Gemini Tool Calling with Pydantic". It provides a practical demonstration of various advanced strategies for extracting reliable, structured JSON data from unstructured text using Large Language Models (LLMs).

This project moves beyond basic prompting to showcase robust, production-ready techniques that guarantee the shape and type of LLM outputs, eliminating fragile text-parsing logic.

🚀 Key Techniques Demonstrated

This project provides hands-on examples for the following structured output methods:

Direct JSON Mode: Using LangChain's .with_structured_output() to force an LLM (like GPT-4o or Gemini 1.5 Pro) to return JSON matching a Pydantic schema.
Validation & Automatic Retries: Using the instructor library to not only request structured data but also to validate the output and automatically trigger correction loops if the model's response doesn't conform to the Pydantic schema.
Grammar-Based Generation: Using the outlines library to constrain the LLM's token generation process, guaranteeing that its output is syntactically perfect JSON from the very first token.
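
The validate-and-retry idea behind the second technique can be sketched without any third-party library. This is a standard-library illustration only, not the instructor API or the code in this repository: the `Invoice` fields, `parse_invoice`, `extract_with_retries`, and `fake_model` are all hypothetical names, and a dataclass stands in for a Pydantic model.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Invoice:
    """Hypothetical target schema; the real project defines its own Pydantic models."""
    invoice_number: str
    total: float


def parse_invoice(raw: str) -> Invoice:
    """Parse raw model output and enforce field presence and types."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    number = data.get("invoice_number")
    total = data.get("total")
    if not isinstance(number, str):
        raise ValueError("invoice_number must be a string")
    # bool is a subclass of int, so reject it explicitly
    if isinstance(total, bool) or not isinstance(total, (int, float)):
        raise ValueError("total must be a number")
    return Invoice(invoice_number=number, total=float(total))


def extract_with_retries(
    call_model: Callable[[str], str], prompt: str, max_retries: int = 2
) -> Invoice:
    """Call the model, validate the reply, and feed the error back on failure."""
    current_prompt = prompt
    last_error: Optional[Exception] = None
    for _ in range(max_retries + 1):
        raw = call_model(current_prompt)
        try:
            return parse_invoice(raw)
        except ValueError as err:
            last_error = err
            current_prompt = (
                f"{prompt}\n\nYour previous answer was invalid ({err}). "
                "Return corrected JSON only."
            )
    raise ValueError(f"no valid output after {max_retries + 1} attempts: {last_error}")


# Stand-in for a real LLM call: the first reply is invalid, the second is valid.
_replies = iter([
    '{"invoice_number": "INV-001", "total": "n/a"}',
    '{"invoice_number": "INV-001", "total": 1250.5}',
])


def fake_model(prompt: str) -> str:
    return next(_replies)


invoice = extract_with_retries(fake_model, "Extract the invoice fields as JSON.")
print(invoice)
```

Feeding the validation error back into the next prompt is what turns a plain retry into a correction loop: the model is told what was wrong instead of being asked the same question again.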

Project Structure

Here is an overview of the key files in this project:

Smart_Invoice_Processor/
├── .env                  # Your secret API keys (ignored by Git)
├── .gitignore            # Specifies files for Git to ignore
├── README.md             # This file
├── requirements.txt      # Project dependencies
├── invoice_01.txt        # Sample invoice data for testing
├── invoice_02.txt        # Sample invoice data for testing
├── invoice_03.txt        # Sample invoice data for testing
├── enhanced_invoice_processor.py  # Main script to process invoices using different methods
└── comparison_test.py    # A script to run and compare different extraction methods

⚙️ Getting Started

Follow these steps to set up and run the project locally.

Prerequisites

Python 3.10
An OpenAI API Key
A Google AI (Gemini) API Key

Installation

Clone the repository:

git clone https://github.com/your-username/Smart_Invoice_Processor.git
cd Smart_Invoice_Processor

Create and activate a Python virtual environment:

# For Windows
python -m venv venv
.\venv\Scripts\activate

# For macOS/Linux
python3 -m venv venv
source venv/bin/activate

Install the required dependencies:
```
pip install -r requirements.txt
```
Configure your environment variables: Create a file named .env by copying the example file:
```
# For Windows
copy .env.example .env

# For macOS/Linux
cp .env.example .env
```
Now, open the .env file and add your secret API keys from OpenAI and Google.
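
Before running the scripts, it can help to confirm that both keys are actually set. The following is a minimal standard-library sketch, not part of this repository: the variable names `OPENAI_API_KEY` and `GOOGLE_API_KEY` are assumptions (check `.env.example` for the names the scripts expect), and `load_env_file` and `missing_keys` are hypothetical helpers that only handle simple `KEY=VALUE` lines.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes, if any, around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(required: list[str], file_values: dict[str, str]) -> list[str]:
    """Return required names that are empty or absent in both the environment and the file."""
    return [
        name for name in required
        if not (os.environ.get(name) or file_values.get(name))
    ]


# Assumed variable names; adjust to match .env.example.
REQUIRED = ["OPENAI_API_KEY", "GOOGLE_API_KEY"]

if __name__ == "__main__":
    missing = missing_keys(REQUIRED, load_env_file())
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All API keys found.")
```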

Usage

This project contains two primary scripts to run.

1. Enhanced Invoice Processor

This script (enhanced_invoice_processor.py) takes a complex invoice text and runs it through the three different extraction strategies, printing the results of each.

python enhanced_invoice_processor.py

After running, it will generate a results_complex.json file containing the detailed structured output.

2. Comparison Test

This script (comparison_test.py) processes multiple simple invoice files (invoice_01.txt, etc.) in a batch, allowing you to see how the different methods perform on various inputs.

python comparison_test.py

The output will be saved to batch_results.json.
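
For readers who want to adapt the batch run, the general pattern might look like the sketch below. It is an illustration under stated assumptions, not the contents of comparison_test.py: `run_batch`, the `Extractor` signature, the result layout, and `line_count_extractor` (a placeholder standing in for a real LLM-backed method) are all hypothetical.

```python
import json
from pathlib import Path
from typing import Callable

# Hypothetical extractor signature: invoice text in, plain dict out.
Extractor = Callable[[str], dict]


def run_batch(
    folder: str,
    extractors: dict[str, Extractor],
    out_file: str = "batch_results.json",
) -> dict:
    """Run every extractor on every invoice_*.txt file and save the results as JSON."""
    results: dict[str, dict] = {}
    for path in sorted(Path(folder).glob("invoice_*.txt")):
        text = path.read_text(encoding="utf-8")
        per_method = {}
        for name, extract in extractors.items():
            try:
                per_method[name] = {"ok": True, "data": extract(text)}
            except Exception as err:  # record the failure and continue with the batch
                per_method[name] = {"ok": False, "error": str(err)}
        results[path.name] = per_method
    Path(out_file).write_text(json.dumps(results, indent=2), encoding="utf-8")
    return results


def line_count_extractor(text: str) -> dict:
    """Placeholder extractor; a real one would call an LLM and return parsed fields."""
    return {"line_count": len(text.splitlines())}


if __name__ == "__main__":
    summary = run_batch(".", {"placeholder": line_count_extractor})
    print(f"Processed {len(summary)} invoice file(s).")
```

Recording failures per method and per file, rather than letting one exception stop the run, keeps the comparison complete even when a single method fails on a single invoice.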

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.
