A production-ready, open-source platform for high-volume data ingestion, distributed analytics, and AI-driven anomaly detection. Built with Python, FastAPI, Kafka, Spark, Redis, and Kubernetes.
If this project helps you, please consider giving it a ⭐ star — it means a lot!
- End-to-end reference — From ingestion to AI inference to query serving in one repo
- Production patterns — Microservices, shared libs, Helm, Terraform, Prometheus/Grafana
- Modern stack — FastAPI, Kafka, Spark, Redis, Elasticsearch, Docker, K8s
- Ready to extend — Clear structure for adding your own models and services
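As an illustration of the kind of model you might add under the AI/ML side of the platform, here is a minimal z-score anomaly detector in pure Python. This is a hedged sketch, not code from the repo: the function name and interface are hypothetical, and the real services are set up for scikit-learn models rather than hand-rolled statistics.

```python
import math

def zscore_anomalies(values, threshold=3.0):
    """Flag indices more than `threshold` standard deviations from the mean.

    A deliberately simple stand-in for the scikit-learn detectors
    (e.g. IsolationForest) the platform is structured to host.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(var)
    if std == 0:
        return []  # constant series: nothing to flag
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]
```

For example, fifty readings of 10.0 followed by a single 100.0 would flag only the last index.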
| Capability | Description |
|---|---|
| High-volume ingestion | Kafka-backed ingestion with schema validation and backpressure |
| Distributed analytics | Spark for batch and stream processing |
| AI/ML | Anomaly detection, forecasting, and pattern detection (scikit-learn ready) |
| Intelligent caching | Redis cache-aside for sub-second query acceleration |
| Enterprise monitoring | Prometheus metrics, Grafana dashboards, structured logging |
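The cache-aside flow in the query path works roughly as sketched below. This is an illustration, not the repo's actual implementation: an in-memory class stands in for the Redis client (mirroring the `get`/`setex` subset of redis-py), and `run_query` is a placeholder for the real PostgreSQL/Elasticsearch call.

```python
import json
import time

class FakeRedis:
    """In-memory stand-in for a Redis client (get/setex subset, with TTL)."""
    def __init__(self):
        self._store = {}
    def get(self, key):
        value, expires = self._store.get(key, (None, 0.0))
        return value if value is not None and time.time() < expires else None
    def setex(self, key, ttl, value):
        self._store[key] = (value, time.time() + ttl)

cache = FakeRedis()

def run_query(sql):
    # Placeholder for the real datastore call.
    return {"sql": sql, "rows": [1, 2, 3]}

def cached_query(sql, ttl=60):
    """Cache-aside: check the cache first, fall back to the store, populate on miss."""
    key = f"query:{hash(sql)}"
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)  # cache hit: skip the datastore entirely
    result = run_query(sql)
    cache.setex(key, ttl, json.dumps(result))  # populate for subsequent calls
    return result
```

The TTL bounds staleness: a repeated query within the window is served from cache, after which the next call repopulates it from the datastore.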
| Layer | Technologies |
|---|---|
| Runtime | Python 3.11+, FastAPI |
| Messaging | Apache Kafka |
| Processing | Apache Spark |
| Cache | Redis |
| Data | PostgreSQL, Elasticsearch |
| Infra | Docker, Kubernetes, Terraform |
```
ai-analytics-platform/
├── services/               # Microservices (FastAPI)
│   ├── ingestion-service/  # Event ingestion → Kafka
│   ├── analytics-engine/   # Spark job orchestration
│   ├── ai-model-service/   # Anomaly, forecast, pattern APIs
│   ├── query-service/      # Query API + cache-aside
│   └── caching-service/    # Redis cache management
├── libs/                   # Shared Python libraries
│   ├── shared-logging/     # Structured JSON logging
│   ├── shared-config/      # Settings and env config
│   └── shared-messaging/   # Kafka producer helpers
├── ai-models/              # ML training and artifacts
│   ├── anomaly-detection/
│   ├── forecasting/
│   └── pattern-detection/
├── infrastructure/         # Terraform + Kubernetes/Helm
├── monitoring/             # Prometheus + Grafana
├── docs/                   # Architecture and deployment guides
└── .github/workflows/      # CI/CD pipelines
```
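The shared-logging library provides structured JSON logging for all services. A minimal sketch of what such a formatter looks like using only the standard library (the real library's API may differ):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit each log record as a single JSON line: timestamp, level, logger, message."""
    def format(self, record):
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)

# Attach the formatter to a handler, as a service would at startup.
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("ingestion-service")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.info("event accepted")
```

One JSON object per line keeps logs machine-parseable, so they can be shipped straight into Elasticsearch and correlated across services.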
```shell
# 1. Start dependencies (Kafka, Redis, Postgres, Elasticsearch, Prometheus, Grafana)
docker-compose up -d

# 2. Install shared libs (from repo root)
pip install -e libs/shared-config -e libs/shared-logging -e libs/shared-messaging

# 3. Run a service (e.g. ingestion)
cd services/ingestion-service && pip install -r requirements.txt && uvicorn app.main:app --reload
```

See docs/DEPLOYMENT.md for full Kubernetes and Terraform setup.
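The ingestion service validates events against a schema before producing them to Kafka. In the services themselves this would typically be a pydantic model enforced by FastAPI; the sketch below is a dependency-free illustration, and the field names (`source`, `event_type`, `value`) are assumptions, not the repo's actual contract.

```python
from dataclasses import dataclass

@dataclass
class Event:
    source: str
    event_type: str
    value: float

def validate_event(raw: dict) -> Event:
    """Reject malformed events before they reach the Kafka producer."""
    missing = {"source", "event_type", "value"} - raw.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not isinstance(raw["value"], (int, float)):
        raise ValueError("value must be numeric")
    return Event(str(raw["source"]), str(raw["event_type"]), float(raw["value"]))
```

Rejecting bad events at the edge keeps downstream consumers (Spark jobs, model services) free of per-record defensive checks.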
| Doc | Description |
|---|---|
| Architecture | System design, data flow, scaling |
| Deployment | Local, Terraform, Kubernetes |
| API Overview | Service endpoints and contracts |
Contributions are welcome! Please read CONTRIBUTING.md for guidelines. Feel free to open an issue or a pull request.
This project is licensed under the MIT License — see LICENSE for details.