# Data Analytics with AI – Streamlit Capstone Project

## 🌐 Project Title
*A compelling, human-readable title for your app (e.g., "Predicting Rental Prices in London with Streamlit")*

---

## 🚀 Project Overview

This project presents a data analytics web app built using **Streamlit**, designed to explore [dataset topic] and deliver clear, actionable insights to [intended audience].

The project combines **Python-based data analysis**, **interactive visualisations**, and a user-friendly **web interface** to support data-driven decisions in a real-world scenario.

---

## 🧠 Business Case

- **Domain**: e.g., housing, retail, environment, health
- **Challenge**: e.g., pricing prediction, pattern detection, resource optimisation
- **Target Users**: e.g., business analysts, NGOs, policymakers
- **App Purpose**: e.g., inform strategy, monitor trends, guide resource allocation

---

## 📊 Dataset Description

- **Source**: [dataset link or citation]
- **Size & Format**: e.g., 50k rows of CSV-formatted real estate listings
- **Privacy**: Data anonymised or cleared for public use
- **Cleaning & Transformation**: Performed in the Jupyter notebook (`notebooks/analysis.ipynb`), with each step documented there

---

## 🧰 Tech Stack & Tools

- **Python 3**
- **Pandas / NumPy** – data cleaning & manipulation
- **Matplotlib / Seaborn / Plotly** – data visualisation
- **Scikit-learn / Statsmodels** – ML or statistical modelling (if applicable)
- **Streamlit** – web app interface
- **GitHub Copilot / ChatGPT** – AI support for ideation and debugging

---

## 📺 App Features & Pages

Describe the structure and key features of your Streamlit app.

1. **Homepage** – App intro, navigation guide, and overview of purpose
2. **Data Explorer** – Interactive filters and visual exploration
3. **Insights Page** – Key metrics, trends, visualisations (with tooltips or explanations)
4. **Model / Recommendation Page** – If applicable: ML outputs or guidance
5. **Download / Export Section** – Optional: download CSVs, charts, etc.
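
The page layout above could be wired together with a sidebar selector. The following is a minimal sketch, not a prescribed structure: the `render_*` function names, the `PAGES` mapping, and the placeholder text are all hypothetical, and the `st` argument is assumed to be the imported `streamlit` module (passed in so the dispatch logic can be exercised without launching the app).

```python
# Sketch of page navigation for app.py. All function names and placeholder
# strings are illustrative assumptions, not part of any required structure.


def render_home(st):
    st.title("Your project title")
    st.write("App intro, navigation guide, and overview of purpose.")


def render_explorer(st):
    st.header("Data Explorer")
    st.write("Interactive filters and visual exploration go here.")


def render_insights(st):
    st.header("Insights")
    st.write("Key metrics, trends, and annotated visualisations go here.")


def render_model(st):
    st.header("Model / Recommendations")
    st.write("ML outputs or guidance go here, if applicable.")


def render_export(st):
    st.header("Download / Export")
    st.write("Optional downloads (CSVs, charts) go here.")


# Insertion order of the dict sets the order of options in the sidebar.
PAGES = {
    "Homepage": render_home,
    "Data Explorer": render_explorer,
    "Insights": render_insights,
    "Model / Recommendations": render_model,
    "Download / Export": render_export,
}


def main(st):
    """Show a sidebar selector and render the chosen page.

    `st` is expected to be the `streamlit` module.
    """
    choice = st.sidebar.radio("Navigate", list(PAGES))
    PAGES[choice](st)
    return choice
```

In `app.py`, this would be followed by `import streamlit as st` and a call to `main(st)`, and the app would be started with `streamlit run app.py`. Streamlit also has built-in multipage support; a single-file selector is just one simple option.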

---

## 🔍 Key Insights

Summarise 3–5 major takeaways the app reveals:

- [Example] Prices are highest in Zone 1, even when adjusting for square footage.
- [Example] Pollution index correlates with traffic data at r = 0.81.
- [Example] Random forest achieved a lower test-set error than linear regression.
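
A figure such as r = 0.81 in the example above is a Pearson correlation coefficient. As a minimal sketch of how such a value is computed, the function below uses only the standard library. The function name and the sample numbers are made up for illustration and are not project data; in practice `pandas.Series.corr` or `scipy.stats.pearsonr` would do the same job.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance numerator and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(ss_x * ss_y)


# Made-up illustrative values, not real measurements.
traffic = [10, 20, 30, 40, 50]
pollution = [12, 25, 31, 48, 52]
r = pearson_r(traffic, pollution)
```

When reporting a coefficient like this as a key insight, it helps to state the sample size alongside it, since r on a handful of points is much less informative than r on thousands.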

---

## 🎯 Learning Outcomes Mapping

| LO | Description | How Met in Project |
|----|-------------|--------------------|
| LO1 | Stats/Theory | Explained in notebook; applied in app charts |
| LO2 | Python Tools | Pandas, Plotly, Scikit-learn, Streamlit |
| LO3 | Real-world analysis | Applied to live dataset with business framing |
| LO4 | AI Support | Copilot used in EDA and code generation |
| LO5 | Data Management | Cleaning, preprocessing steps detailed in notebook |
| LO6 | Ethics & Privacy | Data anonymised; ethics discussed below |
| LO7 | Research Design | Structured approach from question to solution |
| LO8 | Communication | App interface, markdown, and plots used effectively |
| LO9 | Domain Link | Problem and dataset clearly rooted in domain relevance |
| LO10 | Project Plan | Roadmap and iterations described in README |
| LO11 | Adaptability | Learned Streamlit + visualisation libraries + AI tools |

---

## 🧪 AI & Tools Used

- GitHub Copilot for function suggestions
- ChatGPT for prompt-based exploratory insights
- AI-supported summarisation (in storytelling section of dashboard)

---

## 📁 File Structure

    📦 project-name/
    ├── app.py                  # Streamlit app
    ├── data/
    │   ├── raw_data.csv
    │   └── cleaned_data.csv
    ├── notebooks/
    │   └── analysis.ipynb      # Jupyter Notebook
    ├── requirements.txt        # For deployment
    ├── Procfile                # For Heroku deployment
    ├── README.md
    └── images/
        └── dashboard_screenshots.png

---

## ⚖️ Ethics, Privacy & Limitations

- Personal identifiers removed
- Dataset declared publicly available or permission documented
- Known biases and assumptions disclosed
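
As one possible way to carry out the identifier removal listed above, the sketch below drops named columns while copying a CSV. It uses only the standard library; the function name and the column names (`name`, `email`) are hypothetical examples, and which fields count as personal identifiers has to be decided per dataset. Dropping direct identifiers alone does not guarantee anonymity, since combinations of the remaining fields can still identify individuals.

```python
import csv
import io


def drop_identifier_columns(src, dst, identifier_columns):
    """Copy CSV rows from `src` to `dst`, omitting the listed columns.

    `src` and `dst` are open text file objects. Returns the kept column names.
    """
    reader = csv.DictReader(src)
    kept = [c for c in (reader.fieldnames or []) if c not in identifier_columns]
    # extrasaction="ignore" lets DictWriter skip the dropped keys in each row.
    writer = csv.DictWriter(dst, fieldnames=kept, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return kept


# Tiny in-memory example with made-up rows; column names are assumptions.
raw = io.StringIO("name,email,zone,price\nA. Person,a@example.com,1,2100\n")
cleaned = io.StringIO()
kept_columns = drop_identifier_columns(raw, cleaned, {"name", "email"})
```

With real files, `src` and `dst` would be `data/raw_data.csv` and `data/cleaned_data.csv` opened with `newline=""`, as the `csv` module documentation recommends.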

---

## 🔧 Challenges & Reflections

- [Example] Cleaning messy categorical data required regex & manual mapping
- [Example] Streamlit was new to me, and layout took time to master
- [Example] Initial model overfit the training set, so I simplified features

---

## 💡 Future Development

- Add user login and saved dashboards
- Deploy to Streamlit Community Cloud or Heroku
- Add NLP summarisation for generated reports