Graphora API

License: AGPL v3 Python FastAPI Neo4j PRs Welcome

AI-powered knowledge graph construction from unstructured data

A document processing backend that combines AI-driven document analysis with flexible, modular preprocessing. The pipeline is built on FastAPI, LLM-based extraction, and Neo4j.

Features

  • Advanced AI-driven document processing pipeline with robust error handling
  • Multi-format document support and intelligent workflow management
  • Scalable microservice architecture optimized for document intelligence
  • Real-time status tracking for preprocessing steps
  • Temporary subgraph creation for review before final integration
  • Human-in-the-loop feedback system

Prerequisites

  • Python 3.11 or higher
  • LLM API key
  • Neo4j database access

Getting Started

Building the Project

  • Install uv:
    curl -LsSf https://astral.sh/uv/install.sh | sh
  • Create a virtual environment:
    uv venv
  • Install libmagic (on Debian/Ubuntu):
    sudo apt-get install libmagic1
  • Install dependencies:
    uv sync
  • Set up BAML

Developer Shortcuts

  • make install – sync Python dependencies via uv
  • uv sync --group dev – install optional dev tools (e.g. Vulture for dead-code checks)
  • make compose-up – start local Neo4j and Redis containers (see Local Development Guide)
  • make lint, make test, make typecheck – run quality gates before committing
  • make test-unit, make test-integration – run just unit or integration slices as needed
  • make test now emits coverage stats to the terminal and writes coverage.xml for CI tooling
  • make deadcode – run Vulture against the codebase to surface unused definitions
  • make openapi-snapshot – regenerate tests/snapshots/openapi.json after intentional API changes so contract tests stay green

Running the Project

The project will automatically start when you run it on Replit. The FastAPI server will be available at port 8000.

To manually start the server:

python -m app.main

The API will be available at:

  • API Documentation: /api/v1/docs
  • OpenAPI Specification: /api/v1/openapi.json

Authentication

  • All API requests must include a Clerk-issued bearer token: Authorization: Bearer <token>.
  • Configure the backend with Clerk credentials via .env: CLERK_JWKS_URL, CLERK_ISSUER, CLERK_AUDIENCE, and CLERK_API_KEY if server-to-server calls are required.
  • Clients no longer send the legacy user-id header; the backend derives the user from the JWT subject claim.
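As a sketch of the client side, any HTTP request simply carries the bearer token in the Authorization header. The base URL below assumes a local deployment on port 8000; only the header format comes from the docs above:

```python
import os
import urllib.request

API_BASE = "http://localhost:8000"  # adjust for your deployment

def authorized_request(path: str, token: str) -> urllib.request.Request:
    """Build a request carrying a Clerk-issued bearer token."""
    return urllib.request.Request(
        API_BASE + path,
        headers={"Authorization": f"Bearer {token}"},
    )

# Example: prepare an authenticated fetch of the OpenAPI spec.
req = authorized_request("/api/v1/openapi.json", os.environ.get("GRAPHORA_AUTH_TOKEN", ""))
```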

Service-to-service / pipeline tokens

You can call the API from CI jobs or data pipelines without any extra backend code. Mint short-lived Clerk JWTs on demand and feed them to the Graphora client:

  1. Create (or reuse) a Clerk user that represents the pipeline and add a JWT template (e.g. graphora_pipeline) whose aud value matches CLERK_AUDIENCE.
  2. When the pipeline starts, create a token via Clerk's backend API using your Clerk API key:
    curl -X POST "https://api.clerk.com/v1/users/<USER_ID>/tokens/graphora_pipeline" \
      -H "Authorization: Bearer $CLERK_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"expires_in_seconds": 3600}'
  3. Export the returned token right before invoking the client:
    export GRAPHORA_AUTH_TOKEN="<clerk-jwt-from-step-2>"
    python pipeline.py
  4. Repeat the minting step whenever the token expires (keep TTLs short and rotate the Clerk API key like any other secret).

The graphora Python package automatically reads GRAPHORA_AUTH_TOKEN, so no application changes are required as long as the bearer token is valid.
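For step 4, a pipeline can decide when to re-mint by inspecting the token's exp claim. The helper below is an illustrative sketch, not part of the graphora package; it assumes a standard three-part JWT and does not verify the signature:

```python
import base64
import json
import time

def jwt_expires_soon(token: str, leeway_seconds: int = 60) -> bool:
    """Return True when the JWT's exp claim is within leeway_seconds of now."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload["exp"] - time.time() < leeway_seconds
```

A pipeline loop would call this before each API batch and hit the Clerk minting endpoint again when it returns True.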

Development Guide

Project Structure

app/
├── agents/      # AI agents for workflow and feedback
├── api/         # API endpoints (REST and GraphQL)
├── services/    # Core business logic services
├── schemas/     # Pydantic models and schemas
├── utils/       # Utility functions and helpers
└── main.py      # Application entry point

Key Components

  1. Preprocessing Service

    • Handles multi-step document preprocessing
    • Provides real-time status updates
    • Implements robust error handling
  2. Extraction Service

    • Manages entity and relationship extraction
    • Integrates with OpenAI for intelligent processing
    • Handles temporary graph creation
  3. Graph Service

    • Manages Neo4j database operations
    • Handles subgraph creation and updates
    • Processes user feedback

API Endpoints

REST API

  1. Document Upload

     POST /api/v1/documents/upload
     Content-Type: multipart/form-data

  2. Submit Feedback

     POST /api/v1/feedback/{document_id}
     Content-Type: application/json
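A minimal client-side sketch of the feedback call, using only the stdlib. The endpoint path and headers come from the docs above; the payload shape and base URL are assumptions for illustration:

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"  # adjust for your deployment

def build_feedback_request(document_id: str, feedback: dict, token: str) -> urllib.request.Request:
    """Prepare the Submit Feedback POST; the feedback schema here is illustrative."""
    return urllib.request.Request(
        f"{API_BASE}/api/v1/feedback/{document_id}",
        data=json.dumps(feedback).encode(),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

# Example usage (send with urllib.request.urlopen against a running server):
req = build_feedback_request("doc-123", {"approved": True}, "my-token")
```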

Error Handling

The system implements comprehensive error handling:

  • Status tracking for each preprocessing step
  • Detailed error messages and logging
  • Graceful failure recovery
  • User-friendly error responses
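Per-step status tracking might look like the sketch below. All names here are assumptions for illustration, not the repository's actual types:

```python
from dataclasses import dataclass, field
from enum import Enum

class StepStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    FAILED = "failed"
    DONE = "done"

@dataclass
class PreprocessingProgress:
    """Illustrative per-step status record for a document's preprocessing run."""
    steps: dict[str, StepStatus] = field(default_factory=dict)

    def mark(self, step: str, status: StepStatus) -> None:
        self.steps[step] = status

    def failed_steps(self) -> list[str]:
        return [s for s, st in self.steps.items() if st is StepStatus.FAILED]
```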

Deployment

The project is configured to run automatically with the following features:

  • Auto-reload during development
  • Production-ready ASGI server (uvicorn)
  • Proper port configuration for Replit hosting

Production Considerations

When deploying to production:

  1. Update CORS settings in main.py
  2. Configure proper logging levels
  3. Set up proper database credentials
  4. Enable rate limiting and security measures

Contributing

We welcome contributions! Please see our Contributing Guide for details.

Before Contributing

  1. Read the Code of Conduct
  2. Sign the Contributor License Agreement
  3. Check out good first issues

Documentation

Related Repositories

License

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).

  • ✅ Use for free under AGPL v3 terms
  • ✅ Modify and distribute with source code
  • ❌ Cannot use as closed-source SaaS without commercial license

For commercial licensing (closed-source SaaS, enterprise deployments, OEM), contact: sales@graphora.io

See LICENSE for full terms.

Commercial Support

  • Enterprise Support: SLA-backed support for production deployments
  • Consulting: Custom integrations, training, architecture design
  • Commercial Licensing: Closed-source and SaaS deployments
  • Database Vendor Partnerships: OEM licensing for database companies

Contact: support@graphora.io

Community

Security

Please report security vulnerabilities to support@graphora.io

See SECURITY.md for details.


Made with ❤️ by Arivan Labs
