DataEngineerGPT is a highly specialized Streamlit chatbot powered by advanced LLMs hosted on Groq Cloud. Designed for data engineering, DevOps, and cloud architecture support, this assistant provides real-time, context-aware help with building and debugging production-grade data systems.
-
✅ Interactive Streamlit chat interface
-
🤖 Backed by state-of-the-art LLMs:
meta-llama/llama-4-scout-17b-16e-instructmeta-llama/llama-4-maverick-17b-128e-instructqwen-qwq-32bdeepseek-r1-distill-llama-70bcompound-beta
-
🧠 System prompt tuned for Data Engineering expertise
-
⚡ Fast responses powered by Groq's ultra-performant API
-
🔐 Secure configuration via Streamlit secrets
git clone https://github.com/lostdir/DATAENG-GPT.gitpip install -r requirements.txtCreate a file at .streamlit/secrets.toml:
GROQ_API_KEY = "your_groq_api_key"streamlit run app.py.
├── app.py # Streamlit chatbot app
├── requirements.txt # Python dependencies
├── .streamlit/
│ └── secrets.toml # Groq API key
├── .gitignore
└── README.md
- Get code for streaming pipelines, CDC, and ETL
- Debug Spark, Kafka, Airflow or SQL queries
- Generate infrastructure plans (Terraform, CI/CD)
- Ask anything about Data Engineering best practices
MIT License – use, modify, and share freely.