Week 2 – AICTE Internship Milestone (GitHub Repo)
This repository tracks Week 2 work for the AICTE internship on the AI/ML track.
You'll upload this repo link to the LMS "Week 2" submission.
Project Scaffold
The project structure remains the same, but now includes a models/ directory for your saved machine learning model.
.
├── notebooks/          # Jupyter notebooks (Week2_Environmental_Monitoring.ipynb)
├── src/                # Python source code (.py)
├── models/             # Saved models (e.g., rf_model.pkl)
├── reports/
│   └── figures/        # Charts saved from notebook
├── data/               # Local data (kept out of Git)
├── requirements.txt    # Python deps (AI/ML track)
├── LICENSE             # MIT License
├── .gitignore          # Ignore junk / large data
└── README.md           # You are here

Week 2 Scope (Suggested)
AI/ML: Clean and prepare the dataset, train multiple advanced models (e.g., Random Forest, Gradient Boosting), evaluate their performance with metrics (RMSE, R², MAE), visualize the results (e.g., actual vs. predicted, feature importance), and save the best model file.
Power BI: Refine the data model, add DAX calculated columns/measures, implement drill-through or advanced filtering, and publish the report to the Power BI service.
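For the AI/ML scope, the train-and-evaluate step could be sketched roughly as below. This is a minimal illustration, not the notebook's actual code: the helper name compare_models, the hyperparameters, and the synthetic data from make_regression are all assumptions made so the example runs without the real CSV.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split


def compare_models(X, y, seed=42):
    """Fit each candidate on a train split and score it on a held-out split.

    Hypothetical helper; the model settings are illustrative defaults.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    candidates = {
        "linear": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=50, random_state=seed),
        "gradient_boosting": GradientBoostingRegressor(random_state=seed),
    }
    scores = {}
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        scores[name] = {
            # RMSE is computed as the square root of the mean squared error.
            "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
            "mae": float(mean_absolute_error(y_test, pred)),
            "r2": float(r2_score(y_test, pred)),
        }
    return scores


# Synthetic stand-in data (assumption): replace with the cleaned dataset.
X_demo, y_demo = make_regression(n_samples=300, n_features=6, noise=5.0, random_state=0)
demo_scores = compare_models(X_demo, y_demo)
best_name = min(demo_scores, key=lambda n: demo_scores[n]["rmse"])
for name, s in demo_scores.items():
    print(f"{name:18s} RMSE={s['rmse']:.2f} MAE={s['mae']:.2f} R2={s['r2']:.3f}")
print("lowest RMSE:", best_name)
```

Picking the model with the lowest held-out RMSE is one reasonable selection rule; a different metric or cross-validation may suit the real data better.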
How to Run (AI/ML)
Create a virtual environment and install dependencies:
    pip install -r requirements.txt
Put the raw AirQualityUCI.csv file into the data/ directory.
Use notebooks/Week2_Environmental_Monitoring.ipynb for the complete workflow from data cleaning to model evaluation.
The notebook will automatically save charts to reports/figures/ and the trained model to models/.
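Once AirQualityUCI.csv is in data/, the first cell of the workflow has to parse it. The sketch below assumes the file follows the layout of the UCI Air Quality dataset (semicolon-separated, comma as the decimal mark, -200 used as a missing-value marker, trailing empty columns and rows); verify this against your copy. The function name load_air_quality and the inline sample rows are made up for illustration.

```python
import io

import pandas as pd


def load_air_quality(source):
    """Read the raw CSV and normalise missing values.

    `source` can be a path such as "data/AirQualityUCI.csv" or a file-like object.
    Hypothetical helper; the format assumptions are described in the text above.
    """
    df = pd.read_csv(source, sep=";", decimal=",")
    # Drop columns and rows that are entirely empty (trailing ';;' artefacts).
    df = df.dropna(axis=1, how="all").dropna(axis=0, how="all")
    # Treat the -200 marker as missing, in numeric columns only.
    numeric_cols = df.select_dtypes(include="number").columns
    df[numeric_cols] = df[numeric_cols].mask(df[numeric_cols] == -200)
    return df


# Tiny made-up sample in the assumed layout, so the sketch runs without the file.
sample = (
    "Date;Time;CO(GT);C6H6(GT);PT08.S2(NMHC);;\n"
    "10/03/2004;18.00.00;2,6;11,9;1046;;\n"
    "10/03/2004;19.00.00;-200;9,4;955;;\n"
    ";;;;;;\n"
)
demo_df = load_air_quality(io.StringIO(sample))
print(demo_df)
```

With the real file, the call would look like load_air_quality("data/AirQualityUCI.csv").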
Improvisations (for LMS comment box)
In the LMS Week 2 submission, add a brief note about your specific accomplishments, for example:
"Compared three regression models (Linear, RF, GB) and evaluated them using RMSE and R²."
"Analyzed feature importances from the Random Forest model and identified PT08.S2(NMHC) as a key predictor, noting potential data leakage."
"Created visualizations for model performance and data distribution, saving the best model using joblib."
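Saving a chart and the best model with joblib, as mentioned in the last example note, might look like the sketch below. The helper name save_artifacts and the file name actual_vs_predicted.png are hypothetical; rf_model.pkl and the directory names follow the scaffold. A plain dict stands in for the trained model so the example has no scikit-learn dependency.

```python
import tempfile
from pathlib import Path

import joblib
import matplotlib

matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt


def save_artifacts(model, y_true, y_pred, root="."):
    """Write an actual-vs-predicted plot and the model under `root`."""
    root = Path(root)
    fig_dir = root / "reports" / "figures"
    model_dir = root / "models"
    fig_dir.mkdir(parents=True, exist_ok=True)
    model_dir.mkdir(parents=True, exist_ok=True)

    fig, ax = plt.subplots(figsize=(5, 5))
    ax.scatter(y_true, y_pred, s=12, alpha=0.6)
    lo = min(min(y_true), min(y_pred))
    hi = max(max(y_true), max(y_pred))
    ax.plot([lo, hi], [lo, hi], linestyle="--", color="gray")  # perfect-fit line
    ax.set_xlabel("Actual")
    ax.set_ylabel("Predicted")
    fig_path = fig_dir / "actual_vs_predicted.png"  # hypothetical file name
    fig.savefig(fig_path, dpi=150, bbox_inches="tight")
    plt.close(fig)

    model_path = model_dir / "rf_model.pkl"
    joblib.dump(model, model_path)
    return fig_path, model_path


# Demo in a temporary directory with made-up values and a stand-in "model".
with tempfile.TemporaryDirectory() as tmp:
    fig_path, model_path = save_artifacts(
        {"demo": True}, [1.0, 2.0, 3.0], [1.1, 1.9, 3.2], root=tmp
    )
    restored = joblib.load(model_path)
    saved_ok = fig_path.exists() and model_path.exists()
print(saved_ok, restored)
```

In the notebook, root would be the repository root, so the files land in reports/figures/ and models/.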
Checklist (Before Submitting GitHub Link to LMS)
[ ] Repo is Public and named Week 2.
[ ] README.md is updated with your project details, steps, and outcomes.
[ ] At least one notebook is in notebooks/ showing the full training and evaluation process.
[ ] The best-performing model is saved in the models/ directory (e.g., rf_model.pkl).
[ ] All visualization outputs (.png files) are saved in reports/figures/.
[ ] Commit message is clear, e.g., feat(week2): model training and evaluation.
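The file-related items in this checklist can be verified with a short script before you submit. This is a sketch using only the standard library; the function name check_repo is hypothetical, and it only checks for the presence of files, not for repo visibility, README content, or commit messages.

```python
import tempfile
from pathlib import Path


def check_repo(root="."):
    """Return a {checklist item: bool} mapping for the file-based checks."""
    root = Path(root)
    return {
        "README.md present": (root / "README.md").is_file(),
        "notebook in notebooks/": any((root / "notebooks").glob("*.ipynb")),
        "model in models/": any((root / "models").glob("*.pkl")),
        "PNG in reports/figures/": any((root / "reports" / "figures").glob("*.png")),
    }


# Demo against a throwaway directory that has a README and a notebook only.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "notebooks").mkdir()
    (demo_root / "notebooks" / "Week2_Environmental_Monitoring.ipynb").write_text("{}")
    (demo_root / "README.md").write_text("demo")
    demo_report = check_repo(demo_root)

for item, ok in demo_report.items():
    print("[x]" if ok else "[ ]", item)
```

Run check_repo() from the repository root to check the real project.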
Meta
Owner: Anandha Krishnan P (Intern)
Date: 2025-09-06
Project: Environmental Monitoring & Pollution Control (AI/ML)

This repository contains the Week 2 notebook Week2_Environmental_Monitoring.ipynb, which focuses on training and evaluating machine learning models to predict Benzene (C6H6(GT)) concentration from sensor data. The key outcomes include a performance comparison of Linear Regression, Random Forest, and Gradient Boosting models, along with visualizations of feature importance and prediction accuracy.
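When one sensor column dominates the feature importances, a quick correlation check against the target can help decide whether it deserves a closer look for leakage. The sketch below is an assumption about how such a check might be done, not the notebook's method: flag_suspect_features, the 0.95 threshold, and the synthetic columns are all illustrative, and a high correlation is only a prompt to investigate, not proof of leakage.

```python
import numpy as np


def flag_suspect_features(features, target, threshold=0.95):
    """Return {name: r} for features whose |Pearson r| with the target
    is at or above `threshold`. Rows with NaN in either series are skipped."""
    target = np.asarray(target, dtype=float)
    flagged = {}
    for name, values in features.items():
        values = np.asarray(values, dtype=float)
        keep = ~(np.isnan(values) | np.isnan(target))
        r = np.corrcoef(values[keep], target[keep])[0, 1]
        if abs(r) >= threshold:
            flagged[name] = float(r)
    return flagged


# Synthetic data (assumption): one column built to track the target closely,
# one independent column. Real column values will differ.
rng = np.random.default_rng(0)
benzene = rng.gamma(2.0, 5.0, 500)
demo_features = {
    "PT08.S2(NMHC)": 900 + 40 * benzene + rng.normal(0, 10, 500),
    "T": rng.normal(18, 8, 500),
}
flagged = flag_suspect_features(demo_features, benzene)
print(flagged)
```

On the real data, features would be the sensor columns of the cleaned DataFrame and target the C6H6(GT) column.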