AI-powered knowledge graph construction from unstructured data
A document processing backend that pairs AI-driven document analysis with flexible, modular preprocessing. The system combines FastAPI, an LLM, and Neo4j into an intelligent document processing pipeline.
- AI-driven document processing pipeline with robust error handling
- Multi-format document support and intelligent workflow management
- Scalable microservice architecture for document intelligence
- Real-time status tracking for preprocessing steps
- Temporary subgraph creation for review before final integration
- Human-in-the-loop feedback system
- Python 3.11 or higher
- LLM API key
- Neo4j database access
- Install uv:

  ```bash
  curl -LsSf https://astral.sh/uv/install.sh | sh
  ```

- Create a virtual environment:

  ```bash
  uv venv
  ```

- Install libmagic:

  ```bash
  sudo apt-get install libmagic1
  ```

- Sync dependencies:

  ```bash
  uv sync
  ```

- Set up BAML
- `make install` – sync Python dependencies via uv
- `uv sync --group dev` – install optional dev tools (e.g. Vulture for dead-code checks)
- `make compose-up` – start local Neo4j and Redis containers (see Local Development Guide)
- `make lint`, `make test`, `make typecheck` – run quality gates before committing
- `make test-unit`, `make test-integration` – run just unit or integration slices as needed
- `make test` now emits coverage stats to the terminal and writes `coverage.xml` for CI tooling
- `make deadcode` – run Vulture against the codebase to surface unused definitions
- `make openapi-snapshot` – regenerate `tests/snapshots/openapi.json` after intentional API changes so contract tests stay green
The project will automatically start when you run it on Replit. The FastAPI server will be available at port 8000.
To manually start the server:
```bash
python -m app.main
```

The API will be available at:

- API Documentation: `/api/v1/docs`
- OpenAPI Specification: `/api/v1/openapi.json`
- All API requests must include a Clerk-issued bearer token: `Authorization: Bearer <token>`.
- Configure the backend with Clerk credentials via `.env`: `CLERK_JWKS_URL`, `CLERK_ISSUER`, `CLERK_AUDIENCE`, and `CLERK_API_KEY` if server-to-server calls are required.
- Clients no longer send the legacy `user-id` header; the backend derives the user from the JWT subject claim.
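An authenticated request can be sketched with the standard library; this is a minimal example, assuming the server runs at the default local port from this README (the base URL and token are placeholders, not documented values):

```python
import json
import urllib.request


def auth_headers(token: str) -> dict:
    """Build the Authorization header the backend expects."""
    return {"Authorization": f"Bearer {token}"}


def fetch_openapi_spec(base_url: str, token: str) -> dict:
    """GET /api/v1/openapi.json using a Clerk-issued bearer token."""
    req = urllib.request.Request(
        f"{base_url}/api/v1/openapi.json",
        headers=auth_headers(token),
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Call `fetch_openapi_spec("http://localhost:8000", "<clerk-jwt>")` with a valid token to retrieve the spec.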
You can call the API from CI jobs or data pipelines without any extra backend code. Mint short-lived Clerk JWTs on demand and feed them to the Graphora client:
- Create (or reuse) a Clerk user that represents the pipeline and add a JWT template (e.g. `graphora_pipeline`) whose `aud` value matches `CLERK_AUDIENCE`.
- When the pipeline starts, create a token via Clerk's backend API using your Clerk API key:

  ```bash
  curl -X POST "https://api.clerk.com/v1/users/<USER_ID>/tokens/graphora_pipeline" \
    -H "Authorization: Bearer $CLERK_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"expires_in_seconds": 3600}'
  ```

- Export the returned `token` right before invoking the client:

  ```bash
  export GRAPHORA_AUTH_TOKEN="<clerk-jwt-from-step-2>"
  python pipeline.py
  ```

- Repeat the minting step whenever the token expires (keep TTLs short and rotate the Clerk API key like any other secret).
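The same mint request can be built in Python; this sketch mirrors the `curl` command above (the user ID and API key are placeholders, and no network call is made until you open the request):

```python
import json
import urllib.request

CLERK_API_BASE = "https://api.clerk.com/v1"


def build_mint_request(user_id: str, template: str, api_key: str,
                       ttl_seconds: int = 3600) -> urllib.request.Request:
    """Construct the POST that asks Clerk for a short-lived pipeline JWT."""
    body = json.dumps({"expires_in_seconds": ttl_seconds}).encode()
    return urllib.request.Request(
        f"{CLERK_API_BASE}/users/{user_id}/tokens/{template}",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Send the request with `urllib.request.urlopen(...)` and read the `token` field from the JSON response.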
The `graphora` Python package automatically reads `GRAPHORA_AUTH_TOKEN`, so no application changes are required as long as the bearer token is valid.
app/
├── agents/ # AI agents for workflow and feedback
├── api/ # API endpoints (REST and GraphQL)
├── services/ # Core business logic services
├── schemas/ # Pydantic models and schemas
├── utils/ # Utility functions and helpers
└── main.py # Application entry point
- Preprocessing Service
  - Handles multi-step document preprocessing
  - Provides real-time status updates
  - Implements robust error handling
- Extraction Service
  - Manages entity and relationship extraction
  - Integrates with OpenAI for intelligent processing
  - Handles temporary graph creation
- Graph Service
  - Manages Neo4j database operations
  - Handles subgraph creation and updates
  - Processes user feedback
- Document Upload

  ```
  POST /api/v1/documents/upload
  Content-Type: multipart/form-data
  ```

- Submit Feedback

  ```
  POST /api/v1/feedback/{document_id}
  Content-Type: application/json
  ```

The system implements comprehensive error handling:
- Status tracking for each preprocessing step
- Detailed error messages and logging
- Graceful failure recovery
- User-friendly error responses
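The feedback endpoint above can be exercised with a small stdlib helper; this is a hedged sketch — the JSON payload shape is illustrative, not documented in this README:

```python
import json
import urllib.request


def feedback_request(base_url: str, document_id: str, payload: dict,
                     token: str) -> urllib.request.Request:
    """Build the POST /api/v1/feedback/{document_id} request."""
    return urllib.request.Request(
        f"{base_url}/api/v1/feedback/{document_id}",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Example payload is hypothetical; no request is sent until urlopen is called.
req = feedback_request("http://localhost:8000", "doc-1",
                       {"approved": True}, "<clerk-jwt>")
```

Pass the request to `urllib.request.urlopen(req)` to submit the feedback once the server is running.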
The project is configured to run automatically with the following features:
- Auto-reload during development
- Production-ready ASGI server (uvicorn)
- Proper port configuration for Replit hosting
When deploying to production:
- Update CORS settings in `main.py`
- Configure proper logging levels
- Set up proper database credentials
- Enable rate limiting and security measures
We welcome contributions! Please see our Contributing Guide for details.
- Read the Code of Conduct
- Sign the Contributor License Agreement
- Check out good first issues
- Contributing Guide - How to contribute
- Repository Guidelines - Quick contributor reference
- Local Development Guide - Spin up dependencies and run the API locally
- Security Policy - How to report security issues
- Support - How to get help
- Trademark Policy - Trademark usage guidelines
- Frontend: graphora/graphora-fe
- Python Client: graphora/graphora-client
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).
- ✅ Use for free under AGPL v3 terms
- ✅ Modify and distribute with source code
- ❌ Cannot use as closed-source SaaS without commercial license
For commercial licensing (closed-source SaaS, enterprise deployments, OEM), contact: sales@graphora.io
See LICENSE for full terms.
- Enterprise Support: SLA-backed support for production deployments
- Consulting: Custom integrations, training, architecture design
- Commercial Licensing: Closed-source and SaaS deployments
- Database Vendor Partnerships: OEM licensing for database companies
Contact: support@graphora.io
- GitHub Discussions: Ask questions, share ideas
- Discord: Coming soon
- Twitter: Coming soon
Please report security vulnerabilities to support@graphora.io
See SECURITY.md for details.
Made with ❤️ by Arivan Labs