Ontology Generation API

A FastAPI service for generating ontologies through LLM-powered workflows with MongoDB persistence and comprehensive quality assurance.

📋 Table of Contents

Overview
Features
Architecture
Quick Start
Configuration
API Documentation
Development
Docker Deployment
Frontend
Testing
Troubleshooting
Project Structure

🎯 Overview

The Ontology Generation API is a comprehensive service that generates ontologies through a multi-step LLM-powered process. It creates structured ontologies based on business ideas and tenant requirements, stores them in MongoDB, and provides quality assurance validation.

Workflow

The ontology generation process follows a structured 3-step workflow:

1. Use Case Generation & Ranking
   - LLM generates 2-5 use cases based on the business_idea and tenant inputs
   - Each use case is ranked with relevance and importance scores (0-1)
   - Ranking evaluates:
     - Use case relevance to the business idea
     - Use case importance to the business idea
   - Results are saved to the MONGODB_RULES collection
2. ERD Generation
   - For each use case, an Entity-Relationship Diagram (ERD) is generated
   - Results are saved to the MONGODB_COLNAME collection
3. Quality Assurance
   - Validates that generated Tables and Columns contain the fields required by the Pydantic models
   - Scores tables and columns for compliance
   - Results are saved to the MONGODB_QA collection
   - Note: This is structural validation only; it does not assess content relevance
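
The structural check in the Quality Assurance step can be pictured as a required-field count. The sketch below is not the project's `qa.py`: the field names in `REQUIRED_TABLE_FIELDS`, the helper `score_required_fields`, and the percentage formula are all assumptions made for illustration. It only shows how a 0-100 compliance score could be derived without looking at content relevance.

```python
# Hypothetical required fields; the real list comes from the project's Pydantic models.
REQUIRED_TABLE_FIELDS = ("name", "description", "columns")


def score_required_fields(item: dict, required: tuple[str, ...]) -> int:
    """Return the percentage (0-100) of required fields that are present and non-empty."""
    if not required:
        return 100
    present = sum(1 for field in required if item.get(field) not in (None, "", [], {}))
    return round(100 * present / len(required))


complete = {
    "name": "RevenueFact",
    "description": "Revenue per order line",
    "columns": ["RevenueFactId"],
}
partial = {"name": "RevenueFact", "columns": []}  # no description, empty columns

print(score_required_fields(complete, REQUIRED_TABLE_FIELDS))  # 100
print(score_required_fields(partial, REQUIRED_TABLE_FIELDS))   # 33
```

A table that scores 100 here is only structurally complete; whether it suits the business idea is a separate question.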

✨ Features

🧠 LLM-Powered Generation: Leverages OpenAI models for intelligent ontology creation
📊 Quality Assurance: Automated validation of generated ontologies
🎯 Use Case Ranking: Intelligent scoring of use case relevance and importance
🗄️ MongoDB Persistence: Robust data storage with MongoDB
🔍 RESTful API: Clean, well-documented REST endpoints
🎨 Streamlit Frontend: Interactive web UI for ontology management
🐳 Docker Support: Containerized deployment ready
☸️ Kubernetes Ready: Helm charts included for Kubernetes deployment
📝 Type Safety: Pydantic models for request/response validation with complete type annotations across service layer and API routes
🔒 Environment-Based Configuration: Secure configuration management
📈 Structured Logging: Comprehensive logging throughout the application
🛡️ Error Handling: Robust error handling with proper exception propagation and HTTP status codes
📚 Auto-Generated Documentation: OpenAPI/Swagger documentation automatically generated from type annotations

🏗️ Architecture

Technology Stack

Framework: FastAPI 0.119+
Database: MongoDB (via PyMongo)
LLM: OpenAI API
Frontend: Streamlit
Python: 3.12+
Package Manager: UV (optional) or pip

Design Principles

Separation of Concerns: Clear separation between API routes, business logic, and data access
Modular Architecture: Routes organized by resource type
Type Safety: Comprehensive Pydantic models
Error Handling: Consistent error responses with appropriate HTTP status codes
Configuration Management: Centralized settings with environment variable validation

🚀 Quick Start

Prerequisites

Python 3.12 or higher
MongoDB instance (local or cloud)
OpenAI API key
(Optional) UV package manager

Installation

Clone the repository:

git clone <repository-url>
cd ontology-generation-api

Install dependencies:

Using pip:
```
pip install -r requirements.txt
```
Using UV (recommended):
```
uv sync
```

Set up environment variables:

Copy the example environment file:

cp env.example .env

Edit .env with your configuration:

# MongoDB Configuration
MONGODB_URL="mongodb://localhost:27017"  # or mongodb+srv://...
MONGODB_DBNAME="ontoai"
MONGODB_RULES="ontologies_ideation"
MONGODB_COLNAME="ontologies_v2"
MONGODB_QA="ontologies_qa"

# OpenAI Configuration
OPENAI_API_KEY="your-openai-api-key"
OPENAI_BASE_URL="https://api.openai.com/v1"
OPENAI_MODEL_LARGE="gpt-4o"
OPENAI_MODEL_SMALL="gpt-4o-mini"
OPENAI_MODEL_NANO="gpt-4o-mini"

# API Configuration (optional)
HOST="0.0.0.0"
PORT="8000"
LOG_LEVEL="INFO"

Start MongoDB (if running locally):

# macOS
brew services start mongodb/brew/mongodb-community

# Linux
sudo systemctl start mongod

# Docker
docker run -d -p 27017:27017 --name mongodb mongo:latest

Run the API:

Using UV:

uv run uvicorn src.app:app --reload

Using Python:

python -m uvicorn src.app:app --reload

Or directly:

uvicorn src.app:app --reload

Verify the API is running:
- Visit http://localhost:8000/docs for interactive API documentation
- Visit http://localhost:8000/ for the root endpoint
- Visit http://localhost:8000/readiness to check MongoDB connection

🔧 Configuration

Required Environment Variables

| Variable | Description | Example |
| --- | --- | --- |
| `MONGODB_URL` | MongoDB connection URI | `mongodb://localhost:27017` or `mongodb+srv://...` |
| `MONGODB_DBNAME` | MongoDB database name | `ontoai` |
| `MONGODB_COLNAME` | MongoDB collection for ontologies | `ontologies_v2` |
| `MONGODB_RULES` | MongoDB collection for use case rules | `ontologies_ideation` |
| `MONGODB_QA` | MongoDB collection for QA results | `ontologies_qa` |
| `OPENAI_API_KEY` | OpenAI API key | `sk-...` |
| `OPENAI_BASE_URL` | OpenAI API base URL | `https://api.openai.com/v1` |
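
A quick way to see which required variables are still unset is a small pre-flight check. This is a sketch only, assuming the seven names listed above; the helpers `missing_required` and `validate_environment` are hypothetical, and the project's own validation in `src/config/settings.py` may behave differently.

```python
import os

# Names taken from the required variables listed above.
REQUIRED_VARS = (
    "MONGODB_URL",
    "MONGODB_DBNAME",
    "MONGODB_COLNAME",
    "MONGODB_RULES",
    "MONGODB_QA",
    "OPENAI_API_KEY",
    "OPENAI_BASE_URL",
)


def missing_required(env) -> list[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def validate_environment(env=None) -> None:
    """Raise ValueError naming every missing required variable."""
    env = os.environ if env is None else env
    missing = missing_required(env)
    if missing:
        raise ValueError(f"Missing required environment variables: {', '.join(missing)}")


# Report against the current process environment without raising.
print(missing_required(os.environ))
```

Calling `validate_environment()` before starting the server would surface every missing name at once instead of one failure per run.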

Optional Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| `HOST` | Server host | `0.0.0.0` |
| `PORT` | Server port | `8000` |
| `LOG_LEVEL` | Logging level | `INFO` |
| `OPENAI_MODEL_LARGE` | Large OpenAI model | `gpt-4o` |
| `OPENAI_MODEL_SMALL` | Small OpenAI model | `gpt-4o-mini` |
| `OPENAI_MODEL_NANO` | Nano OpenAI model | `gpt-4o-mini` |

MongoDB Connection

Local MongoDB: Use mongodb://localhost:27017
Docker MongoDB: Use mongodb://host.docker.internal:27017 when the API runs in Docker and MongoDB runs on the host machine
Cloud MongoDB: Use mongodb+srv://***/ format

📖 API Documentation

Interactive Documentation

Once the API is running, visit:

Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc

The API documentation is automatically generated from the route handler type annotations, ensuring accuracy and consistency between the code and documentation.

Endpoints

All endpoints have complete type annotations for request parameters and response types, enabling:

Automatic OpenAPI schema generation
Type checking and validation
Better IDE support and autocomplete
Accurate API documentation

Health & Status

`GET /`

Root endpoint for basic health check.

Response:

{
  "message": "Ontology Generation API is running",
  "version": "1.0.0"
}

`GET /readiness`

Check MongoDB connection and service readiness.

Response:

{
  "status": "ready",
  "services_ready": true,
  "timestamp": "2025-11-06T21:59:49.625368+00:00",
  "version": "1.0.0"
}
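
A deployment script or test can poll this endpoint before sending real traffic. The following is a minimal sketch using only the standard library. The helper names `is_ready` and `check_readiness` are hypothetical, and it assumes a not-ready response keeps the same `status` and `services_ready` fields, which the README does not confirm.

```python
import json
import urllib.request


def is_ready(payload: dict) -> bool:
    """Interpret a /readiness response body like the one shown above."""
    return payload.get("status") == "ready" and payload.get("services_ready") is True


def check_readiness(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> bool:
    """Fetch /readiness and return True only if the service reports ready."""
    with urllib.request.urlopen(f"{base_url}/readiness", timeout=timeout) as response:
        return is_ready(json.load(response))


sample = {"status": "ready", "services_ready": True, "version": "1.0.0"}
print(is_ready(sample))  # True
```

`check_readiness()` raises `urllib.error.URLError` when the server is unreachable, so callers that poll during startup may want to catch it and retry.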

Ontologies

`GET /api/v1/ontologies/health`

Health check for the ontologies router.

Response:

{
  "msg": "Hello from ontologies router"
}

`GET /api/v1/ontologies/ontologies`

Get all ontologies from MongoDB.

Response: list[OntologiesMongo]

`GET /api/v1/ontologies/ontologies_ids`

Get all ontology ObjectIds from MongoDB.

Response: list[str]

Example:

[
  "69069d0c358d4c067d0e9156",
  "6906a30a96448b738a9d96a6",
  "6907ae6c7da72bcf7d70e1cf"
]

`GET /api/v1/ontologies/ontology/{id}`

Get a specific ontology by ObjectId.

Parameters:

id (path): Ontology ObjectId

Response: OntologiesMongo

Note: Returns 404 if the ontology ID is not found.

Example:

{
  "ontologies": [
    {
      "use_case_name": "Product and Channel Profitability Analytics",
      "use_case_uid": "1",
      "use_case_relevance_score": 0.88,
      "use_case_relevance_score_motivation": "Directly informs pricing discounts...",
      "use_case_importance_score": 0.9,
      "use_case_importance_score_motivation": "High immediate impact on gross margins...",
      "domains": "Finance and product profitability analytics",
      "questions": [...],
      "concepts": [...],
      "erd": {}
    }
  ],
  "rules_id": "690d0d35f15220b7a7a8582f",
  "initial_business_idea": "finance and accounting",
  "_id": "690d0e66f15220b7a7a85830",
  "date_created": "2025-11-06T13:08:54.791000"
}

`GET /api/v1/ontologies/ontology_ranking/{id}`

Get ontology ranking (use case scores) by ObjectId.

Parameters:

id (path): Ontology ObjectId

Response: list[UseCaseRanking]

Note: Returns 404 if the ontology ID is not found.

Example:

[
  {
    "use_case_name": "Tariff-aware Costing and Wholesale Margin Optimization",
    "use_case_uid": "1",
    "use_case_relevance_score": 0.92,
    "use_case_relevance_score_motivation": "Tariffs directly influence landed cost...",
    "use_case_importance_score": 0.88,
    "use_case_importance_score_motivation": "Immediate profitability and alignment...",
    "initial_business_idea": "tariffs supply chain"
  }
]

`GET /api/v1/ontologies/ontology_qa/{id}`

Get QA results for an ontology by ObjectId.

Parameters:

id (path): Ontology ObjectId

Response: dict[str, Any] | None

Note:

Returns null (None in Python) if QA has not been run for this ontology yet
QA runs automatically when an ontology is created via the POST /api/v1/ontologies/create_ontology endpoint
Returns 404 if the ontology ID is not found

Example:

{
  "_id": "690d0e66f15220b7a7a85831",
  "use_case_id": "690d0e66f15220b7a7a85830",
  "qa": [
    {
      "id": "690d0e66f15220b7a7a85830",
      "use_case_id": "1",
      "tables_scores": [
        {
          "name": "RevenueFact",
          "score": 100
        }
      ],
      "columns_scores": [
        {
          "name": "RevenueFactId",
          "score": 100,
          "parent_table": "RevenueFact"
        }
      ]
    }
  ]
}
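
Because this endpoint can return either a QA document or null, client code needs to handle both. The sketch below assumes the document shape shown in the example above; `mean_scores` is a hypothetical helper, not part of the API.

```python
def mean_scores(qa_document):
    """Average table and column scores per use case; {} when QA has not run."""
    if qa_document is None:
        return {}
    summary = {}
    for entry in qa_document.get("qa", []):
        tables = [t["score"] for t in entry.get("tables_scores", [])]
        columns = [c["score"] for c in entry.get("columns_scores", [])]
        summary[entry["use_case_id"]] = {
            "tables": sum(tables) / len(tables) if tables else None,
            "columns": sum(columns) / len(columns) if columns else None,
        }
    return summary


qa_document = {
    "qa": [
        {
            "use_case_id": "1",
            "tables_scores": [{"name": "RevenueFact", "score": 100}],
            "columns_scores": [
                {"name": "RevenueFactId", "score": 100, "parent_table": "RevenueFact"}
            ],
        }
    ]
}

print(mean_scores(qa_document))  # {'1': {'tables': 100.0, 'columns': 100.0}}
print(mean_scores(None))         # {}
```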

`GET /api/v1/ontologies/ontology_candidate/best`

Get the best ontology candidate (highest average of importance and relevance scores).

Parameters:

id (query): Ontology ObjectId

Response: Ontology

Note: Returns 404 if the ontology ID is not found. The best candidate is determined by averaging the use_case_relevance_score and use_case_importance_score for each use case.

Example:

{
  "use_case_name": "Product and Channel Profitability Analytics",
  "use_case_uid": "1",
  "use_case_relevance_score": 0.88,
  "use_case_relevance_score_motivation": "Directly informs pricing discounts...",
  "use_case_importance_score": 0.9,
  "use_case_importance_score_motivation": "High immediate impact on gross margins...",
  "domains": "Finance and product profitability analytics",
  "questions": [...],
  "concepts": [...],
  "erd": {}
}
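
The selection rule described in the note can be sketched in a few lines. This illustrates the averaging only and is not the service's code; `average_score` and `best_candidate` are hypothetical names, the second use case is made-up sample data, and returning the first entry on a tie is an assumption.

```python
def average_score(use_case: dict) -> float:
    """Mean of the relevance and importance scores for one use case."""
    return (
        use_case["use_case_relevance_score"] + use_case["use_case_importance_score"]
    ) / 2


def best_candidate(use_cases: list[dict]) -> dict:
    """Return the use case with the highest average score (first one wins ties)."""
    if not use_cases:
        raise ValueError("no use cases to choose from")
    return max(use_cases, key=average_score)


use_cases = [
    {
        "use_case_name": "Product and Channel Profitability Analytics",
        "use_case_relevance_score": 0.88,
        "use_case_importance_score": 0.9,
    },
    {
        "use_case_name": "Illustrative second use case",  # made-up sample entry
        "use_case_relevance_score": 0.92,
        "use_case_importance_score": 0.8,
    },
]

print(best_candidate(use_cases)["use_case_name"])
# Product and Channel Profitability Analytics
```

The first entry averages 0.89 and the second 0.86, so the first is selected even though the second has the higher relevance score.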

`POST /api/v1/ontologies/create_ontology`

Create a new ontology for a use case and tenant.

Request Body:

{
  "use_case": "finance and accounting",
  "tenant": "your-tenant-name"
}

Response: dict[str, str]

Example:

{
  "id": "690d0e66f15220b7a7a85830",
  "status": "ontology created, QA'ed, saved to MongoDB"
}

Note:

This endpoint triggers the full ontology generation workflow (use case generation, ERD creation, and QA validation)
QA is automatically performed and saved to MongoDB
Returns 500 if ontology generation fails at any step
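
Calling this endpoint from Python needs nothing beyond the standard library. The sketch below builds the request from the body shown above. The helper names are hypothetical, `acme` is a placeholder tenant, and the 300-second timeout is an assumption based on the workflow making several LLM calls; tune it to your deployment.

```python
import json
import urllib.request


def build_create_request(
    use_case: str, tenant: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build the POST request for /api/v1/ontologies/create_ontology."""
    body = json.dumps({"use_case": use_case, "tenant": tenant}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/ontologies/create_ontology",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_ontology(use_case: str, tenant: str, timeout: float = 300.0) -> str:
    """Send the request and return the new ontology id from the response."""
    request = build_create_request(use_case, tenant)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["id"]


request = build_create_request("finance and accounting", "acme")
print(request.get_method(), request.full_url)
```

A 500 response surfaces as `urllib.error.HTTPError`, so callers should catch it to handle failed generations.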

`DELETE /api/v1/ontologies/delete/{id}`

Delete an ontology by ObjectId.

Parameters:

id (path): Ontology ObjectId

Response: dict[str, str | bool]

Example:

{
  "id": "690d0e66f15220b7a7a85830",
  "acknowledged": true
}

Note:

Returns 404 if the ontology ID is not found
Returns 500 if deletion fails

💻 Development

Development Setup

Install development dependencies:

pip install -r requirements.txt
# or
uv sync

Run with auto-reload:

uvicorn src.app:app --reload --host 0.0.0.0 --port 8000

Run with UV:
```
uv run uvicorn src.app:app --reload
```

Code Quality

The codebase follows best practices for type safety and error handling:

Complete Type Annotations: All service layer and API route handler functions have proper return type annotations
API Type Safety: All FastAPI route handlers are fully typed, enabling automatic OpenAPI schema generation and better IDE support
Error Handling: Functions properly raise exceptions instead of silently failing
Type Safety: Uses Pydantic models for data validation and type checking
MongoDB Operations: Proper handling of None returns from database queries

Code Structure

The project follows a clean architecture with clear separation of concerns:

src/
├── api/              # API routes and endpoints
│   ├── routes/       # Route handlers by resource
│   └── routes.py     # Main API router
├── config/           # Configuration management
│   └── settings.py   # Environment settings
├── models/           # Pydantic models
│   ├── health_models.py
│   ├── ontologies_models.py
│   ├── qa_models.py
│   └── use_cases_models.py
├── services/         # Business logic
│   ├── ontologies_service.py
│   └── openai_service.py
├── utils/            # Utility functions
│   ├── prompts.py    # LLM prompts
│   └── qa.py         # QA validation
├── frontend/         # Streamlit frontend pages
│   └── pages/
├── app.py            # FastAPI application
└── startup.py        # Startup and shutdown logic

🐳 Docker Deployment

Build Docker Image

docker build -t ontology-generation-api .

Run Docker Container

For local MongoDB:

docker run -d \
  --name ontology-generation-api-container \
  -p 8000:8000 \
  -e MONGODB_URL=mongodb://host.docker.internal:27017 \
  -e MONGODB_DBNAME=ontoai \
  -e MONGODB_COLNAME=ontologies_v2 \
  -e MONGODB_RULES=ontologies_ideation \
  -e MONGODB_QA=ontologies_qa \
  -e OPENAI_API_KEY=your-api-key \
  -e OPENAI_BASE_URL=https://api.openai.com/v1 \
  ontology-generation-api

For cloud MongoDB:

docker run -d \
  --name ontology-generation-api-container \
  -p 8000:8000 \
  -e MONGODB_URL=mongodb+srv://user:password@cluster.mongodb.net/ \
  -e MONGODB_DBNAME=ontoai \
  -e MONGODB_COLNAME=ontologies_v2 \
  -e MONGODB_RULES=ontologies_ideation \
  -e MONGODB_QA=ontologies_qa \
  -e OPENAI_API_KEY=your-api-key \
  -e OPENAI_BASE_URL=https://api.openai.com/v1 \
  ontology-generation-api

Docker Compose

Use the provided docker-compose.yml for a complete setup with MongoDB:

docker-compose up -d

View Logs

docker logs ontology-generation-api-container

Expected output:

INFO:     Started server process [7]
INFO:     Waiting for application startup.
2025-10-20 23:38:17,443 - src.startup - INFO - ✅ Mongodb connection established
2025-10-20 23:38:17,464 - src.startup - INFO - ✅ openai connection established
2025-10-20 23:38:17,464 - src.startup - INFO - ✅ Ontology Generation API initialization complete!
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

🎨 Frontend

The project includes a Streamlit-based web UI for interacting with the API.

Running the Frontend

streamlit run streamlit_app.py

The frontend will be available at http://localhost:8501

Frontend Pages

View Ontologies: View a specific ontology by selecting its ID
Create Ontology: Create a new ontology based on use case and tenant
Details Ontology: View detailed information about an ontology
Read Me: Instructions and documentation

Frontend Configuration

Set the backend URL via environment variable:

export BACKEND_URL=http://127.0.0.1:8000

Or modify it in streamlit_app.py:

BACKEND_URL = os.getenv("BACKEND_URL", "http://127.0.0.1:8000")

🧪 Testing

Run tests using pytest:

pytest tests/test.py

Or with verbose output:

pytest tests/test.py -v

🐛 Troubleshooting

Common Issues

1. MongoDB Connection Failed

Symptoms: ConnectionFailure error or readiness check fails

Solutions:

Verify MongoDB is running: mongosh --eval "db.adminCommand('ping')" (or mongo --eval "db.adminCommand('ping')" with the legacy shell)
Check MONGODB_URL is correct
For Docker: Use mongodb://host.docker.internal:27017 for local MongoDB
Check firewall settings
Verify MongoDB credentials (for cloud instances)

2. OpenAI API Errors

Symptoms: API calls fail with authentication or rate limit errors

Solutions:

Verify OPENAI_API_KEY is set correctly
Check API key has sufficient credits
Verify OPENAI_BASE_URL is correct
Check rate limits in OpenAI dashboard

3. Missing Environment Variables

Symptoms: ValueError: Missing required environment variables

Solutions:

Copy env.example to .env
Verify all required variables are set
Check variable names match exactly (case-sensitive)
Restart the application after setting variables

4. Port Already in Use

Symptoms: Address already in use error

Solutions:

Change PORT environment variable
Kill the process using the port: lsof -ti:8000 | xargs kill
Use a different port: uvicorn src.app:app --port 8001
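
When lsof is not available, a short Python probe can tell whether something is already listening on the port. This is a sketch with a hypothetical helper name, `is_port_in_use`. It only detects listeners that accept TCP connections on the given host (127.0.0.1 by default).

```python
import socket


def is_port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return probe.connect_ex((host, port)) == 0


if is_port_in_use(8000):
    print("Port 8000 is busy; try another port, for example --port 8001")
else:
    print("Port 8000 is free")
```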

5. Import Errors

Symptoms: ModuleNotFoundError or import errors

Solutions:

Ensure virtual environment is activated
Reinstall dependencies: pip install -r requirements.txt
Verify Python version is 3.12+
Run commands from the repository root (or add it to PYTHONPATH) so that the src package is importable

Getting Help

Check Logs: Review application logs for detailed error information
API Documentation: Use interactive docs at /docs to test endpoints
Validate Configuration: The app validates settings on startup
MongoDB Status: Check MongoDB connection with /readiness endpoint

📁 Project Structure

ontology-generation-api/
├── deploy_script.sh          # Deployment script
├── docker-compose.yml        # Docker Compose configuration
├── Dockerfile                # Docker image definition
├── env.example               # Environment variables example
├── pyproject.toml            # Project configuration (UV)
├── requirements.txt          # Python dependencies
├── README.md                 # This file
├── streamlit_app.py          # Streamlit frontend entry point
├── uv.lock                   # UV lock file
├── infra/                    # Infrastructure as code
│   └── helm/                 # Kubernetes Helm charts
│       ├── Chart.yaml
│       ├── values.yaml
│       └── templates/
│           ├── _helpers.tpl
│           ├── deployment.yaml
│           └── service.yaml
├── src/                      # Source code
│   ├── api/                  # API routes
│   │   ├── routes/
│   │   │   ├── __init__.py
│   │   │   └── ontology.py
│   │   └── routes.py
│   ├── app.py                # FastAPI application
│   ├── config/               # Configuration
│   │   ├── __init__.py
│   │   └── settings.py
│   ├── frontend/             # Streamlit frontend
│   │   ├── __init__.py
│   │   └── pages/
│   │       ├── __init__.py
│   │       ├── create.py
│   │       ├── delete.py
│   │       ├── details.py
│   │       ├── instructions.py
│   │       ├── read_use_cases.py
│   │       └── read.py
│   ├── models/               # Pydantic models
│   │   ├── health_models.py
│   │   ├── ontologies_models.py
│   │   ├── qa_models.py
│   │   └── use_cases_models.py
│   ├── services/             # Business logic
│   │   ├── __init__.py
│   │   ├── ontologies_service.py
│   │   └── openai_service.py
│   ├── startup.py            # Startup/shutdown logic
│   └── utils/                # Utilities
│       ├── __init__.py
│       ├── prompts.py
│       └── qa.py
└── tests/                    # Tests
    ├── __init__.py
    └── test.py

📄 License

[Add your license information here]

🤝 Contributing

[Add contributing guidelines here]

Version: 1.0.0
Last Updated: 2025-11-08
