Welcome to the official repository for Data Analytics and AI with Python, a hands-on course designed by Solomon Tessema to bridge foundational analytics with modern agentic AI workflows. This repo blends classical machine learning, PySpark-based data engineering, and LLM integration into reproducible, real-world pipelines.
- Master Python for data analysis, visualization, and automation
- Build scalable ETL pipelines using PySpark
- Apply classical ML techniques: regression, classification, clustering
- Integrate LLMs and vector search for intelligent data workflows
- Design modular, observable systems using n8n, LangChain, and custom APIs
| Category | Tools & Libraries |
|---|---|
| Data Wrangling | Pandas, PySpark, SQL |
| Visualization | Matplotlib, Seaborn, Plotly |
| Machine Learning | Scikit-learn, TensorFlow, XGBoost |
| Workflow Design | n8n, FastAPI, RESTful APIs |
├── notebooks/ # Jupyter & Colab notebooks
├── datasets/ # Sample CSVs and Parquet files
├── modules/ # Reusable Python scripts
├── pipelines/ # End-to-end ETL and ML flows
├── visualizations/ # Charts and dashboards
└── README.md # This file
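
As a rough illustration of what an end-to-end ETL and ML flow can look like, here is a minimal sketch that uses only the Python standard library. It is not taken from the course materials: the inline sample data, the column names `hours` and `score`, and the helper names `extract`, `transform`, and `fit_line` are all hypothetical. It also swaps in a hand-written least-squares fit in place of the PySpark and Scikit-learn tooling listed above, so that it runs with no installation.

```python
import csv
import io

# Hypothetical inline sample; real course data would come from datasets/.
RAW_CSV = """hours,score
1,52
2,55
3,
4,61
5,64
"""


def extract(text):
    """Extract: read CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with missing values, cast the rest to floats."""
    return [
        (float(r["hours"]), float(r["score"]))
        for r in rows
        if r["hours"] and r["score"]
    ]


def fit_line(points):
    """Model: ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


points = transform(extract(RAW_CSV))
slope, intercept = fit_line(points)
print(f"rows kept: {len(points)}, slope: {slope:.2f}, intercept: {intercept:.2f}")
```

Keeping each stage as a separate function mirrors the modular layout above: the same `transform` step could be reused whether the rows come from a CSV file, a Parquet file, or an API.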