Train models. Capture everything. Query your ML history like a database engineer.
This repo demonstrates how to train deep learning models while persistently logging runs, metrics, configs, and predictions to MongoDB Atlas — turning ephemeral experiments into durable, queryable intelligence.
Perfect for:
- 🤖 Agentic ML workflows
- 📊 Experiment tracking without heavyweight tools
- 🧬 Reproducible research
- ☁️ Cloud‑native training telemetry
- 🔍 Mining past runs for insights
- 🐋 Building LLM/RAG training corpora from real model activity
🔥 End‑to‑End CIFAR‑10 Training Agent
📦 Automatic dataset handling
⚡ Mixed‑precision + distributed training support
🗃️ MongoDB Atlas logging for EVERYTHING
🔁 Reload configs from past runs
🏆 Query best runs by metric
🔮 Log predictions + evaluations
🧠 CLI‑driven workflow
🛠️ Zero external experiment tracker required
flowchart TD
A["🖼️ CIFAR-10 Dataset"]
B["⚙️ Training Pipeline (TensorFlow / Keras)"]
C["📊 Metrics + Config + History"]
D["🗄️ MongoDB Atlas"]
E["🧠 Queryable Training Intelligence"]
A --> B --> C --> D --> E
MongoDB becomes your:
- 📚 Experiment database
- 🧾 Model registry (lightweight)
- 🔍 Analytics backend
- 🧠 Knowledge store for agents
Each run produces a rich document containing:
- 🧬 Model architecture
- ⚙️ Hyperparameters
- 📅 Timestamps
- 📉 Training history
- 🏆 Best metrics
- 🧪 Test results
- 🔮 Predictions
- 📦 Dataset metadata
- ✅ Completion status
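As a rough illustration of how such a document could be assembled, here is a minimal sketch in plain Python. The function name `build_run_document` and its parameters are hypothetical, not taken from `cifar10_agent.py`; the field names follow the example document in this README, and `history` is assumed to be a dict of per-epoch lists like the one Keras exposes as `History.history`.

```python
from datetime import datetime, timezone


def build_run_document(architecture, epochs, learning_rate, history,
                       test_metrics, now=None):
    """Assemble one run document from plain Python values (hypothetical helper).

    `history` is assumed to map metric names to per-epoch lists.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "type": "training",
        "model_name": f"cifar10_{architecture}_{now:%Y%m%d_%H%M%S}",
        "architecture": architecture,
        "epochs": epochs,
        "learning_rate": learning_rate,
        # Best validation accuracy seen across all epochs.
        "best_metrics": {"val_accuracy": max(history["val_accuracy"])},
        "test_metrics": test_metrics,
        "history": history,
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "status": "completed",
    }


doc = build_run_document(
    architecture="basic",
    epochs=3,
    learning_rate=0.001,
    history={"val_accuracy": [0.61, 0.84, 0.80]},
    test_metrics={"test_accuracy": 0.82, "test_loss": 0.65},
    now=datetime(2026, 2, 24, 12, 1, 1, tzinfo=timezone.utc),
)
print(doc["model_name"], doc["best_metrics"])
```

With a pymongo collection in hand, `collection.insert_one(doc)` would persist the result.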
Example document shape:
{
"type": "training",
"model_name": "cifar10_basic_20260224_120101",
"architecture": "basic",
"epochs": 50,
"learning_rate": 0.001,
"best_metrics": {
"val_accuracy": 0.84
},
"test_metrics": {
"test_accuracy": 0.82,
"test_loss": 0.65
},
"history": { "...": "..." },
"timestamp": "2026-02-24T12:01:01Z",
"status": "completed"
}

Install dependencies:
pip install tensorflow pymongo matplotlib numpy

Edit the script:
MONGODB_URI = "your_connection_string_here"
DB_NAME = "agentic_demo"
COLLECTION_NAME = "training_data"

python cifar10_agent.py train --architecture basic --epochs 50

Advanced model + mixed precision:
python cifar10_agent.py train \
--architecture advanced \
--epochs 100 \
--mixed-precision

python cifar10_agent.py evaluate \
--model-path checkpoints/best_model.h5

python cifar10_agent.py predict \
--image-path test.jpg \
--model-path checkpoints/best_model.h5

Logs prediction confidence + top‑3 classes to MongoDB.
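
To make the logged prediction record concrete, the sketch below ranks a softmax output vector and keeps the top three classes. The helper `build_prediction_document` and the field names in the returned dict are assumptions for illustration; the script's actual schema may differ. The class names are the standard CIFAR-10 labels in index order.

```python
# CIFAR-10 class names in label order (0-9).
CIFAR10_CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
                   "dog", "frog", "horse", "ship", "truck"]


def build_prediction_document(image_path, probabilities, k=3):
    """Turn a softmax output vector into a loggable record (hypothetical helper)."""
    # Pair each probability with its class index, highest probability first.
    ranked = sorted(enumerate(probabilities), key=lambda p: p[1], reverse=True)
    top_classes = [{"class": CIFAR10_CLASSES[i], "confidence": float(p)}
                   for i, p in ranked[:k]]
    return {
        "type": "prediction",  # assumed field names, for illustration only
        "image_path": image_path,
        "predicted_class": top_classes[0]["class"],
        "confidence": top_classes[0]["confidence"],
        "top_classes": top_classes,
    }


probs = [0.01, 0.02, 0.05, 0.70, 0.03, 0.15, 0.01, 0.01, 0.01, 0.01]
record = build_prediction_document("test.jpg", probs)
print(record["predicted_class"], [c["class"] for c in record["top_classes"]])
```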
python cifar10_agent.py info \
--show-classes \
--show-stats \
--show-sample

python cifar10_agent.py list-runs --limit 10

python cifar10_agent.py list-runs \
--best \
--metric test_accuracy \
--limit 5

Load config from MongoDB:
python cifar10_agent.py load-config \
--run-id <DOCUMENT_ID>

Retrain automatically:
python cifar10_agent.py load-config \
--run-id <DOCUMENT_ID> \
--retrain

Reproducibility achieved 🧪✨
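
One way the load-and-retrain step could work is sketched below. Both helpers are hypothetical and not the script's actual code. The sketch assumes the run id is a default MongoDB `ObjectId`, that the stored document carries the `architecture` and `epochs` fields from the example above, and that mixed precision is recorded under an assumed `mixed_precision` field. Only CLI flags shown in this README are rebuilt.

```python
def fetch_run(collection, run_id):
    """Look up one run by document id in a pymongo collection (hypothetical helper)."""
    from bson import ObjectId  # ships with pymongo; imported lazily
    return collection.find_one({"_id": ObjectId(run_id)})


def train_args_from_run(run):
    """Rebuild `train` CLI arguments from a stored run document."""
    args = ["train",
            "--architecture", str(run["architecture"]),
            "--epochs", str(run["epochs"])]
    # `mixed_precision` is an assumed field name, not shown in the example document.
    if run.get("mixed_precision"):
        args.append("--mixed-precision")
    return args


stored = {"architecture": "advanced", "epochs": 100, "mixed_precision": True}
print(train_args_from_run(stored))
```

The resulting list could be passed to `subprocess.run(["python", "cifar10_agent.py", *args])` to repeat the run.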
Unlike traditional experiment trackers, MongoDB gives you:
✅ Flexible schema for evolving experiments
✅ Powerful ad‑hoc queries
✅ Aggregation pipelines for analytics
✅ Vector search compatibility
✅ Native cloud scaling
✅ Easy integration with agents + apps
You can ask questions like:
- 🏆 “What architecture performs best at LR=0.001?”
- 📉 “Show runs where validation diverged.”
- ⚡ “Which configs train fastest?”
- 🔍 “Find models similar to this performance profile.”
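The first question above maps naturally onto an aggregation pipeline. The following is a minimal sketch, assuming run documents shaped like the example in this README (`type`, `status`, `learning_rate`, `architecture`, `test_metrics.test_accuracy`). The function name is hypothetical.

```python
def best_architecture_pipeline(learning_rate=0.001):
    """Build a pipeline ranking architectures by mean test accuracy at one learning rate."""
    return [
        # Keep only finished training runs at the requested learning rate.
        {"$match": {"type": "training",
                    "status": "completed",
                    "learning_rate": learning_rate}},
        # One output row per architecture, with its average accuracy and run count.
        {"$group": {"_id": "$architecture",
                    "avg_test_accuracy": {"$avg": "$test_metrics.test_accuracy"},
                    "runs": {"$sum": 1}}},
        # Best-performing architecture first.
        {"$sort": {"avg_test_accuracy": -1}},
    ]


pipeline = best_architecture_pipeline(0.001)
print(pipeline)
```

Given a pymongo collection, `for row in collection.aggregate(pipeline): print(row)` would list each architecture with its average accuracy.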
This repository is designed as a building block for:
- Autonomous ML agents
- Self‑optimizing training systems
- Feedback‑driven pipelines
- Experiment mining
- RAG over training history
- LLM‑guided hyperparameter search
MongoDB becomes the agent’s memory.
All commands support:
--no-mongodb

Useful for offline or privacy‑sensitive runs.
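
A simplified sketch of how such a flag might gate logging is shown below. The real CLI uses subcommands, which are omitted here. The helpers `build_parser`, `open_collection`, and `log_document` are hypothetical names, and `connect` stands in for whatever function opens the pymongo collection.

```python
import argparse


def build_parser():
    """Minimal parser exposing only the --no-mongodb switch."""
    parser = argparse.ArgumentParser(prog="cifar10_agent.py")
    parser.add_argument("--no-mongodb", action="store_true",
                        help="skip all MongoDB logging")
    return parser


def open_collection(args, connect):
    """Return a collection via `connect()`, or None when logging is disabled."""
    return None if args.no_mongodb else connect()


def log_document(doc, collection=None):
    """Insert `doc` if a collection is given; otherwise do nothing."""
    if collection is None:
        return False
    collection.insert_one(doc)
    return True


args = build_parser().parse_args(["--no-mongodb"])
# `connect` is never called here because the flag is set.
collection = open_collection(args, connect=lambda: None)
print(log_document({"type": "training"}, collection))
```

Routing every write through one function like `log_document` keeps the offline path free of any database calls.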
Ideas for next steps:
- 📈 Add dashboards (Streamlit / Gradio)
- 🧠 Integrate vector embeddings of runs
- 🔁 Auto‑tuning agents
- 🗂️ Model artifact storage references
- 🔔 Real‑time training notifications
- 📊 Aggregation‑based leaderboards
If you like:
- MongoDB
- ML engineering
- Agentic systems
- Cloud‑native tooling
- Reproducible experiments
…this repo is for you.
PRs welcome! Add new architectures, datasets, logging features, or agent capabilities.
MIT — use it, fork it, ship it 🚀
Stop losing your experiments in terminal scrollback.
Give your models a memory.
🧠➡️🗄️➡️🚀