AI KYC OCR ID Validator

AI-powered offline KYC document extraction and matching using Tesseract OCR and local LLMs via Ollama.

Features

📤 Upload scanned ID card images (JPEG/PNG)
🧠 Tesseract OCR with multi-language support (bul+eng)
🤖 Mistral 7B via Ollama for structured KYC field extraction
✅ JSON output with full name, DOB, nationality, document number
📊 Match percentage scoring against expected profiles
💻 Local processing — no cloud dependencies
🐳 Docker Compose support for frontend/backend services
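
The match percentage feature can be pictured as a field-by-field comparison between the extracted data and an expected profile. The sketch below is only an illustration under stated assumptions: the `KycFields` shape, the `normalize` and `matchPercentage` helpers, and the sample values are hypothetical, and the project's actual scoring may weight fields differently or use fuzzy matching.

```typescript
// Hypothetical shape of the extracted KYC fields (names are assumptions,
// chosen to mirror the fields listed above).
interface KycFields {
  fullName: string;
  dateOfBirth: string;
  nationality: string;
  documentNumber: string;
}

// Normalize a value so that case and spacing differences are not counted as mismatches.
function normalize(value: string): string {
  return value.trim().toLowerCase().replace(/\s+/g, " ");
}

// Return the percentage (0 to 100) of fields whose normalized values are equal.
function matchPercentage(extracted: KycFields, expected: KycFields): number {
  const keys: (keyof KycFields)[] = [
    "fullName",
    "dateOfBirth",
    "nationality",
    "documentNumber",
  ];
  const matches = keys.filter(
    (key) => normalize(extracted[key]) === normalize(expected[key])
  ).length;
  return Math.round((matches / keys.length) * 100);
}

// Made-up sample data: three of the four fields agree after normalization.
const expectedProfile: KycFields = {
  fullName: "Ivan Petrov",
  dateOfBirth: "1990-01-15",
  nationality: "BGR",
  documentNumber: "123456789",
};
const extractedProfile: KycFields = {
  fullName: " ivan  PETROV ",
  dateOfBirth: "1990-01-15",
  nationality: "bgr",
  documentNumber: "123456780",
};

console.log(matchPercentage(extractedProfile, expectedProfile)); // 75
```

Exact comparison after normalization keeps the score easy to explain; OCR noise in a single character still counts as a full mismatch for that field, which may or may not be the desired behavior.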

Project Structure

ai-kyc-ocr-id-validator/
├── kyc-ocr-frontend/       # React + Redux + Tailwind CSS
├── kyc-ocr-backend/        # Node.js + Express + OCR + LLM
├── docs/                   # Architecture, prompts, benchmarks
├── docker-compose.yml
└── README.md

Prerequisites

Node.js 18+
Tesseract OCR
Ollama with Mistral model
Docker and Docker Compose

Quick Start

Pull the Mistral model:

ollama pull mistral

Start the application:

docker-compose up

Access the services:

Frontend: http://localhost:5173
Backend: http://localhost:5000

Development

Source code changes are picked up automatically through volume mounts.
Rebuild containers after dependency changes:

docker-compose up --build

Development mode with hot-reloading:

docker-compose -f docker-compose.dev.yml up

Check logs:

docker-compose logs -f

System Architecture

graph TD
    subgraph Frontend
        A[React App] --> B[Image Upload]
        B --> C[Redux State]
        C --> D[Results Display]
    end

    subgraph Backend
        E[Express Server] --> F[Image Processing]
        F --> G[Tesseract OCR]
        G --> H[Text Extraction]
        H --> I[Ollama LLM]
        I --> J[Structured Data]
    end

    subgraph LocalServices [Local Services]
        K[Tesseract OCR] --> L[Multi-language Support]
        M[Ollama] --> N[Mistral 7B]
    end

    B --> E
    J --> C
    G --> K
    I --> M

    style Frontend fill:#f9f,stroke:#333,stroke-width:2px
    style Backend fill:#bbf,stroke:#333,stroke-width:2px
    style LocalServices fill:#bfb,stroke:#333,stroke-width:2px

Flow Description

Frontend Layer
- User uploads ID image through React interface
- Redux manages application state
- Results displayed with match scoring

Backend Layer
- Express server handles requests
- Image processing and OCR extraction
- LLM analysis for structured data
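
The hand-off from OCR text to structured data could look roughly like the sketch below. It is not the project's actual code: the `KycRecord` shape, the prompt wording, and the `buildExtractionPrompt` and `parseKycReply` helpers are hypothetical. It assumes the model may wrap its JSON in extra prose, so the parser takes the span from the first `{` to the last `}` and falls back to empty fields when parsing fails.

```typescript
// Hypothetical record shape; the project's real schema may differ.
interface KycRecord {
  fullName: string | null;
  dateOfBirth: string | null;
  nationality: string | null;
  documentNumber: string | null;
}

const FIELD_NAMES = [
  "fullName",
  "dateOfBirth",
  "nationality",
  "documentNumber",
] as const;

// Wrap the raw OCR text in an instruction prompt (wording is an assumption).
function buildExtractionPrompt(ocrText: string): string {
  return [
    "Extract the following fields from the ID card text below.",
    `Respond with a single JSON object with the keys: ${FIELD_NAMES.join(", ")}.`,
    "Use null for any field that is not present.",
    "",
    "ID card text:",
    ocrText.trim(),
  ].join("\n");
}

// Parse the span from the first "{" to the last "}" in the model reply and
// keep only known fields that hold non-empty strings.
function parseKycReply(reply: string): KycRecord {
  const record: KycRecord = {
    fullName: null,
    dateOfBirth: null,
    nationality: null,
    documentNumber: null,
  };
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) return record;

  let parsed: unknown;
  try {
    parsed = JSON.parse(reply.slice(start, end + 1));
  } catch {
    return record; // malformed JSON: report every field as missing
  }
  if (typeof parsed !== "object" || parsed === null) return record;

  const fields = parsed as Record<string, unknown>;
  for (const name of FIELD_NAMES) {
    const value = fields[name];
    if (typeof value === "string" && value.trim() !== "") {
      record[name] = value.trim();
    }
  }
  return record;
}

// Made-up model reply with leading prose and one empty field.
const sampleReply =
  'Here is the result:\n{"fullName": "Ivan Petrov", "dateOfBirth": "1990-01-15", "nationality": "BGR", "documentNumber": ""}';
console.log(parseKycReply(sampleReply));
```

Returning `null` for missing or unparseable fields, rather than throwing, lets the results view show a partial extraction and score only what was found.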

Local Services
- Tesseract OCR for text extraction
- Ollama running Mistral 7B for data structuring
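
As a rough illustration of how the backend might address the two local services, the sketch below builds the argument list for the Tesseract command line (`tesseract <image> stdout -l bul+eng`) and the JSON body for Ollama's `/api/generate` endpoint. The helper names `tesseractArgs` and `ollamaRequest` are hypothetical, and the project may call these services through a wrapper library instead. Ollama listens on port 11434 by default; the host name to use from inside a container depends on the Compose setup and is not shown here.

```typescript
// Arguments for the Tesseract CLI: read the image, write recognized text to
// stdout, and load the given language packs (joined with "+").
function tesseractArgs(
  imagePath: string,
  languages: string[] = ["bul", "eng"]
): string[] {
  return [imagePath, "stdout", "-l", languages.join("+")];
}

// Body for a non-streaming request to Ollama's /api/generate endpoint.
interface OllamaGenerateRequest {
  model: string;
  prompt: string;
  stream: boolean;
  format: "json";
}

// Build the request body; "mistral" matches the model pulled in Quick Start.
function ollamaRequest(prompt: string, model = "mistral"): OllamaGenerateRequest {
  return { model, prompt, stream: false, format: "json" };
}

// Example: the command line and request body for a sample image and prompt.
console.log(["tesseract", ...tesseractArgs("mock/personal_image.png")].join(" "));
console.log(JSON.stringify(ollamaRequest("Extract the KYC fields as JSON.")));
```

With `stream` set to `false`, Ollama returns a single JSON object whose `response` field holds the generated text, which is simpler to parse than a stream of partial chunks.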

Contributing

Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.

Testing

# Compare different LLM prompts
npm run compare-prompts

# Run tests
npm test

Mock data is available in the mock/ directory:

personal_image.png - Sample ID card
