A FastAPI service for generating ontologies through LLM-powered workflows with MongoDB persistence and comprehensive quality assurance.
- Overview
- Features
- Architecture
- Quick Start
- Configuration
- API Documentation
- Development
- Docker Deployment
- Frontend
- Testing
- Troubleshooting
- Project Structure
The Ontology Generation API is a comprehensive service that generates ontologies through a multi-step LLM-powered process. It creates structured ontologies based on business ideas and tenant requirements, stores them in MongoDB, and provides quality assurance validation.
The ontology generation process follows a structured 3-step workflow:

1. Use Case Generation & Ranking
   - The LLM generates 2-5 use cases based on the `business_idea` and `tenant` inputs
   - Each use case is ranked with relevance and importance scores (0-1)
   - Results are saved to the `MONGODB_RULES` collection
   - Ranking evaluates:
     - Use case relevance to the business idea
     - Use case importance to the business idea

2. ERD Generation
   - For each use case, an Entity-Relationship Diagram (ERD) is generated
   - Results are saved to the `MONGODB_COLNAME` collection

3. Quality Assurance
   - Validates that generated `Tables` and `Columns` meet the required fields defined by the Pydantic models
   - Scores tables and columns for compliance
   - Note: this is a structural validation, not a content relevance validation
   - Results are saved to the `MONGODB_QA` collection
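As a concrete illustration of the ranking output from step 1, the `UseCaseRanking` model might look roughly like this (a minimal Pydantic sketch built from the field names in the API examples below; the project's actual model may differ):

```python
from pydantic import BaseModel, Field


class UseCaseRanking(BaseModel):
    """Ranked use case; field names follow the API examples in this README."""

    use_case_name: str
    use_case_uid: str
    use_case_relevance_score: float = Field(ge=0, le=1)
    use_case_relevance_score_motivation: str
    use_case_importance_score: float = Field(ge=0, le=1)
    use_case_importance_score_motivation: str
    initial_business_idea: str


# A score outside 0-1 would be rejected at validation time
ranking = UseCaseRanking(
    use_case_name="Tariff-aware Costing",
    use_case_uid="1",
    use_case_relevance_score=0.92,
    use_case_relevance_score_motivation="Tariffs directly influence landed cost",
    use_case_importance_score=0.88,
    use_case_importance_score_motivation="Immediate profitability impact",
    initial_business_idea="tariffs supply chain",
)
```

The `Field(ge=0, le=1)` constraints enforce the 0-1 score range at the model boundary instead of in business logic.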
- 🧠 LLM-Powered Generation: Leverages OpenAI models for intelligent ontology creation
- 📊 Quality Assurance: Automated validation of generated ontologies
- 🎯 Use Case Ranking: Intelligent scoring of use case relevance and importance
- 🗄️ MongoDB Persistence: Robust data storage with MongoDB
- 🔍 RESTful API: Clean, well-documented REST endpoints
- 🎨 Streamlit Frontend: Interactive web UI for ontology management
- 🐳 Docker Support: Containerized deployment ready
- ☸️ Kubernetes Ready: Helm charts included for Kubernetes deployment
- 📝 Type Safety: Pydantic models for request/response validation with complete type annotations across service layer and API routes
- 🔒 Environment-Based Configuration: Secure configuration management
- 📈 Structured Logging: Comprehensive logging throughout the application
- 🛡️ Error Handling: Robust error handling with proper exception propagation and HTTP status codes
- 📚 Auto-Generated Documentation: OpenAPI/Swagger documentation automatically generated from type annotations
- Framework: FastAPI 0.119+
- Database: MongoDB (via PyMongo)
- LLM: OpenAI API
- Frontend: Streamlit
- Python: 3.12+
- Package Manager: UV (optional) or pip
- Separation of Concerns: Clear separation between API routes, business logic, and data access
- Modular Architecture: Routes organized by resource type
- Type Safety: Comprehensive Pydantic models
- Error Handling: Consistent error responses with appropriate HTTP status codes
- Configuration Management: Centralized settings with environment variable validation
- Python 3.12 or higher
- MongoDB instance (local or cloud)
- OpenAI API key
- (Optional) UV package manager
1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd ontology-generation-api
   ```

2. Install dependencies:

   Using pip:

   ```bash
   pip install -r requirements.txt
   ```

   Using UV (recommended):

   ```bash
   uv sync
   ```

3. Set up environment variables:

   Copy the example environment file:

   ```bash
   cp env.example .env
   ```

   Edit `.env` with your configuration:

   ```bash
   # MongoDB Configuration
   MONGODB_URL="mongodb://localhost:27017"  # or mongodb+srv://...
   MONGODB_DBNAME="ontoai"
   MONGODB_RULES="ontologies_ideation"
   MONGODB_COLNAME="ontologies_v2"
   MONGODB_QA="ontologies_qa"

   # OpenAI Configuration
   OPENAI_API_KEY="your-openai-api-key"
   OPENAI_BASE_URL="https://api.openai.com/v1"
   OPENAI_MODEL_LARGE="gpt-4o"
   OPENAI_MODEL_SMALL="gpt-4o-mini"
   OPENAI_MODEL_NANO="gpt-4o-mini"

   # API Configuration (optional)
   HOST="0.0.0.0"
   PORT="8000"
   LOG_LEVEL="INFO"
   ```

4. Start MongoDB (if running locally):

   ```bash
   # macOS
   brew services start mongodb/brew/mongodb-community

   # Linux
   sudo systemctl start mongod

   # Docker
   docker run -d -p 27017:27017 --name mongodb mongo:latest
   ```

5. Run the API:

   Using UV:

   ```bash
   uv run uvicorn src.app:app --reload
   ```

   Using Python:

   ```bash
   python -m uvicorn src.app:app --reload
   ```

   Or directly:

   ```bash
   uvicorn src.app:app --reload
   ```

6. Verify the API is running:

   - Visit http://localhost:8000/docs for interactive API documentation
   - Visit http://localhost:8000/ for the root endpoint
   - Visit http://localhost:8000/readiness to check the MongoDB connection
| Variable | Description | Example |
|---|---|---|
| `MONGODB_URL` | MongoDB connection URI | `mongodb://localhost:27017` or `mongodb+srv://...` |
| `MONGODB_DBNAME` | MongoDB database name | `ontoai` |
| `MONGODB_COLNAME` | MongoDB collection for ontologies | `ontologies_v2` |
| `MONGODB_RULES` | MongoDB collection for use case rules | `ontologies_ideation` |
| `MONGODB_QA` | MongoDB collection for QA results | `ontologies_qa` |
| `OPENAI_API_KEY` | OpenAI API key | `sk-...` |
| `OPENAI_BASE_URL` | OpenAI API base URL | `https://api.openai.com/v1` |
| Variable | Description | Default |
|---|---|---|
| `HOST` | Server host | `0.0.0.0` |
| `PORT` | Server port | `8000` |
| `LOG_LEVEL` | Logging level | `INFO` |
| `OPENAI_MODEL_LARGE` | Large OpenAI model | `gpt-4o` |
| `OPENAI_MODEL_SMALL` | Small OpenAI model | `gpt-4o-mini` |
| `OPENAI_MODEL_NANO` | Nano OpenAI model | `gpt-4o-mini` |
- Local MongoDB: Use `mongodb://localhost:27017`
- Docker MongoDB: Use `mongodb://host.docker.internal:27017` when running the API in Docker
- Cloud MongoDB: Use the `mongodb+srv://` URI format
Once the API is running, visit:

- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
The API documentation is automatically generated from the route handler type annotations, ensuring accuracy and consistency between the code and documentation.
All endpoints have complete type annotations for request parameters and response types, enabling:
- Automatic OpenAPI schema generation
- Type checking and validation
- Better IDE support and autocomplete
- Accurate API documentation
Root endpoint for basic health check.
Response:

```json
{
  "message": "Ontology Generation API is running",
  "version": "1.0.0"
}
```

Check MongoDB connection and service readiness.
Response:

```json
{
  "status": "ready",
  "services_ready": true,
  "timestamp": "2025-11-06T21:59:49.625368+00:00",
  "version": "1.0.0"
}
```

Health check for the ontologies router.
Response:

```json
{
  "msg": "Hello from ontologies router"
}
```

Get all ontologies from MongoDB.
Response: `list[OntologiesMongo]`

Get all ontology ObjectIds from MongoDB.

Response: `list[str]`
Example:

```json
[
  "69069d0c358d4c067d0e9156",
  "6906a30a96448b738a9d96a6",
  "6907ae6c7da72bcf7d70e1cf"
]
```

Get a specific ontology by ObjectId.

Parameters:

- `id` (path): Ontology ObjectId
Response: `OntologiesMongo`
Note: Returns 404 if the ontology ID is not found.
Example:

```json
{
  "ontologies": [
    {
      "use_case_name": "Product and Channel Profitability Analytics",
      "use_case_uid": "1",
      "use_case_relevance_score": 0.88,
      "use_case_relevance_score_motivation": "Directly informs pricing discounts...",
      "use_case_importance_score": 0.9,
      "use_case_importance_score_motivation": "High immediate impact on gross margins...",
      "domains": "Finance and product profitability analytics",
      "questions": [...],
      "concepts": [...],
      "erd": {}
    }
  ],
  "rules_id": "690d0d35f15220b7a7a8582f",
  "initial_business_idea": "finance and accounting",
  "_id": "690d0e66f15220b7a7a85830",
  "date_created": "2025-11-06T13:08:54.791000"
}
```

Get ontology ranking (use case scores) by ObjectId.

Parameters:

- `id` (path): Ontology ObjectId
Response: `list[UseCaseRanking]`
Note: Returns 404 if the ontology ID is not found.
Example:

```json
[
  {
    "use_case_name": "Tariff-aware Costing and Wholesale Margin Optimization",
    "use_case_uid": "1",
    "use_case_relevance_score": 0.92,
    "use_case_relevance_score_motivation": "Tariffs directly influence landed cost...",
    "use_case_importance_score": 0.88,
    "use_case_importance_score_motivation": "Immediate profitability and alignment...",
    "initial_business_idea": "tariffs supply chain"
  }
]
```

Get QA results for an ontology by ObjectId.

Parameters:

- `id` (path): Ontology ObjectId

Response: `dict[str, Any] | None`
Note:

- Returns `None` if QA has not been run for this ontology yet
- QA is automatically run when an ontology is created via the `POST /api/v1/ontologies/create_ontology` endpoint
- Returns 404 if the ontology ID is not found
Example:

```json
{
  "_id": "690d0e66f15220b7a7a85831",
  "use_case_id": "690d0e66f15220b7a7a85830",
  "qa": [
    {
      "id": "690d0e66f15220b7a7a85830",
      "use_case_id": "1",
      "tables_scores": [
        {
          "name": "RevenueFact",
          "score": 100
        }
      ],
      "columns_scores": [
        {
          "name": "RevenueFactId",
          "score": 100,
          "parent_table": "RevenueFact"
        }
      ]
    }
  ]
}
```

Get the best ontology candidate (highest average of importance and relevance scores).

Parameters:

- `id` (query): Ontology ObjectId

Response: `Ontology`

Note: Returns 404 if the ontology ID is not found. The best candidate is determined by averaging the `use_case_relevance_score` and `use_case_importance_score` of each use case.
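The averaging rule can be sketched as follows (`best_candidate` is a hypothetical helper, not the service's actual implementation):

```python
def best_candidate(use_cases: list[dict]) -> dict:
    """Return the use case with the highest mean of relevance and importance."""
    return max(
        use_cases,
        key=lambda uc: (
            uc["use_case_relevance_score"] + uc["use_case_importance_score"]
        ) / 2,
    )


cases = [
    {"use_case_uid": "1", "use_case_relevance_score": 0.88, "use_case_importance_score": 0.9},
    {"use_case_uid": "2", "use_case_relevance_score": 0.7, "use_case_importance_score": 0.75},
]
# Use case "1" wins with a mean score of 0.89
```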
Example:

```json
{
  "use_case_name": "Product and Channel Profitability Analytics",
  "use_case_uid": "1",
  "use_case_relevance_score": 0.88,
  "use_case_relevance_score_motivation": "Directly informs pricing discounts...",
  "use_case_importance_score": 0.9,
  "use_case_importance_score_motivation": "High immediate impact on gross margins...",
  "domains": "Finance and product profitability analytics",
  "questions": [...],
  "concepts": [...],
  "erd": {}
}
```

Create a new ontology for a use case and tenant.

Request Body:

```json
{
  "use_case": "finance and accounting",
  "tenant": "your-tenant-name"
}
```

Response: `dict[str, str]`
Example:

```json
{
  "id": "690d0e66f15220b7a7a85830",
  "status": "ontology created, QA'ed, saved to MongoDB"
}
```

Note:

- This endpoint triggers the full ontology generation workflow (use case generation, ERD creation, and QA validation)
- QA is automatically performed and saved to MongoDB
- Returns 500 if ontology generation fails at any step

Delete an ontology by ObjectId.

Parameters:

- `id` (path): Ontology ObjectId

Response: `dict[str, str | bool]`
Example:

```json
{
  "id": "690d0e66f15220b7a7a85830",
  "acknowledged": true
}
```

Note:

- Returns 404 if the ontology ID is not found
- Returns 500 if deletion fails
1. Install development dependencies:

   ```bash
   pip install -r requirements.txt  # or: uv sync
   ```

2. Run with auto-reload:

   ```bash
   uvicorn src.app:app --reload --host 0.0.0.0 --port 8000
   ```

3. Run with UV:

   ```bash
   uv run uvicorn src.app:app --reload
   ```
The codebase follows best practices for type safety and error handling:
- Complete Type Annotations: All service layer and API route handler functions have proper return type annotations
- API Type Safety: All FastAPI route handlers are fully typed, enabling automatic OpenAPI schema generation and better IDE support
- Error Handling: Functions properly raise exceptions instead of silently failing
- Type Safety: Uses Pydantic models for data validation and type checking
- MongoDB Operations: Proper handling of `None` returns from database queries
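For instance, a lookup helper can convert a `None` result from `find_one` into an explicit error for the route layer to map to a 404 (the stub collection and helper names below are illustrative):

```python
class FakeCollection:
    """Stand-in for a PyMongo collection: find_one returns None on a miss."""

    def __init__(self, docs: dict):
        self._docs = docs

    def find_one(self, query: dict):
        return self._docs.get(query.get("_id"))


def get_ontology_or_raise(collection, object_id: str) -> dict:
    """Raise instead of silently returning None, so the caller can map it to a 404."""
    doc = collection.find_one({"_id": object_id})
    if doc is None:
        raise LookupError(f"Ontology {object_id} not found")
    return doc
```

Raising here keeps the service layer honest: a missing document becomes an explicit exception rather than a `None` that propagates until something else breaks.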
The project follows a clean architecture with clear separation of concerns:
```
src/
├── api/                  # API routes and endpoints
│   ├── routes/           # Route handlers by resource
│   └── routes.py         # Main API router
├── config/               # Configuration management
│   └── settings.py       # Environment settings
├── models/               # Pydantic models
│   ├── health_models.py
│   ├── ontologies_models.py
│   ├── qa_models.py
│   └── use_cases_models.py
├── services/             # Business logic
│   ├── ontologies_service.py
│   └── openai_service.py
├── utils/                # Utility functions
│   ├── prompts.py        # LLM prompts
│   └── qa.py             # QA validation
├── frontend/             # Streamlit frontend pages
│   └── pages/
├── app.py                # FastAPI application
└── startup.py            # Startup and shutdown logic
```
```bash
docker build -t ontology-generation-api .
```

For local MongoDB:

```bash
docker run -d \
  --name ontology-generation-api-container \
  -p 8000:8000 \
  -e MONGODB_URL=mongodb://host.docker.internal:27017 \
  -e MONGODB_DBNAME=ontoai \
  -e MONGODB_COLNAME=ontologies_v2 \
  -e MONGODB_RULES=ontologies_ideation \
  -e MONGODB_QA=ontologies_qa \
  -e OPENAI_API_KEY=your-api-key \
  -e OPENAI_BASE_URL=https://api.openai.com/v1 \
  ontology-generation-api
```

For cloud MongoDB:
```bash
docker run -d \
  --name ontology-generation-api-container \
  -p 8000:8000 \
  -e MONGODB_URL=mongodb+srv://user:password@cluster.mongodb.net/ \
  -e MONGODB_DBNAME=ontoai \
  -e MONGODB_COLNAME=ontologies_v2 \
  -e MONGODB_RULES=ontologies_ideation \
  -e MONGODB_QA=ontologies_qa \
  -e OPENAI_API_KEY=your-api-key \
  -e OPENAI_BASE_URL=https://api.openai.com/v1 \
  ontology-generation-api
```

Use the provided docker-compose.yml for a complete setup with MongoDB:

```bash
docker-compose up -d
```

Check the container logs:

```bash
docker logs ontology-generation-api-container
```

Expected output:

```
INFO: Started server process [7]
INFO: Waiting for application startup.
2025-10-20 23:38:17,443 - src.startup - INFO - ✅ Mongodb connection established
2025-10-20 23:38:17,464 - src.startup - INFO - ✅ openai connection established
2025-10-20 23:38:17,464 - src.startup - INFO - ✅ Ontology Generation API initialization complete!
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
```
The project includes a Streamlit-based web UI for interacting with the API.
```bash
streamlit run streamlit_app.py
```

The frontend will be available at http://localhost:8501.
- View Ontologies: View a specific ontology by selecting its ID
- Create Ontology: Create a new ontology based on use case and tenant
- Details Ontology: View detailed information about an ontology
- Read Me: Instructions and documentation
Set the backend URL via environment variable:
```bash
export BACKEND_URL=http://127.0.0.1:8000
```

Or modify it in streamlit_app.py:

```python
BACKEND_URL = os.getenv("BACKEND_URL", "http://127.0.0.1:8000")
```

Run tests using pytest:

```bash
pytest tests/test.py
```

Or with verbose output:

```bash
pytest tests/test.py -v
```

Symptoms: `ConnectionFailure` error or the readiness check fails
Solutions:

- Verify MongoDB is running: `mongosh` or `mongo --eval "db.adminCommand('ping')"`
- Check `MONGODB_URL` is correct
- For Docker: use `mongodb://host.docker.internal:27017` for local MongoDB
- Check firewall settings
- Verify MongoDB credentials (for cloud instances)
Symptoms: API calls fail with authentication or rate limit errors

Solutions:

- Verify `OPENAI_API_KEY` is set correctly
- Check the API key has sufficient credits
- Verify `OPENAI_BASE_URL` is correct
- Check rate limits in the OpenAI dashboard
Symptoms: `ValueError: Missing required environment variables`

Solutions:

- Copy `env.example` to `.env`
- Verify all required variables are set
- Check variable names match exactly (case-sensitive)
- Restart the application after setting variables
Symptoms: `Address already in use` error

Solutions:

- Change the `PORT` environment variable
- Kill the process using the port: `lsof -ti:8000 | xargs kill`
- Use a different port: `uvicorn src.app:app --port 8001`
Symptoms: `ModuleNotFoundError` or other import errors

Solutions:

- Ensure the virtual environment is activated
- Reinstall dependencies: `pip install -r requirements.txt`
- Verify the Python version is 3.12+
- Check `PYTHONPATH` is set correctly
- Check Logs: Review application logs for detailed error information
- API Documentation: Use the interactive docs at `/docs` to test endpoints
- Validate Configuration: The app validates settings on startup
- MongoDB Status: Check the MongoDB connection with the `/readiness` endpoint
```
ontology-generation-api/
├── deploy_script.sh          # Deployment script
├── docker-compose.yml        # Docker Compose configuration
├── Dockerfile                # Docker image definition
├── env.example               # Environment variables example
├── pyproject.toml            # Project configuration (UV)
├── requirements.txt          # Python dependencies
├── README.md                 # This file
├── streamlit_app.py          # Streamlit frontend entry point
├── uv.lock                   # UV lock file
├── infra/                    # Infrastructure as code
│   └── helm/                 # Kubernetes Helm charts
│       ├── Chart.yaml
│       ├── values.yaml
│       └── templates/
│           ├── _helpers.tpl
│           ├── deployment.yaml
│           └── service.yaml
├── src/                      # Source code
│   ├── api/                  # API routes
│   │   ├── routes/
│   │   │   ├── __init__.py
│   │   │   └── ontology.py
│   │   └── routes.py
│   ├── app.py                # FastAPI application
│   ├── config/               # Configuration
│   │   ├── __init__.py
│   │   └── settings.py
│   ├── frontend/             # Streamlit frontend
│   │   ├── __init__.py
│   │   └── pages/
│   │       ├── __init__.py
│   │       ├── create.py
│   │       ├── delete.py
│   │       ├── details.py
│   │       ├── instructions.py
│   │       ├── read_use_cases.py
│   │       └── read.py
│   ├── models/               # Pydantic models
│   │   ├── health_models.py
│   │   ├── ontologies_models.py
│   │   ├── qa_models.py
│   │   └── use_cases_models.py
│   ├── services/             # Business logic
│   │   ├── __init__.py
│   │   ├── ontologies_service.py
│   │   └── openai_service.py
│   ├── startup.py            # Startup/shutdown logic
│   └── utils/                # Utilities
│       ├── __init__.py
│       ├── prompts.py
│       └── qa.py
└── tests/                    # Tests
    ├── __init__.py
    └── test.py
```
[Add your license information here]
[Add contributing guidelines here]
Version: 1.0.0
Last Updated: 2025-11-08