# Stage 16 Lifecycle Review

This notebook assists in finalizing your project repo and mapping its lifecycle stages.

```python
checklist = {
    "repo_clean": True,
    "repo_complete": True,
    "readme_complete": True,
    "lifecycle_map": True,
    "summary_doc": True,
    "framework_guide_table": True
}
checklist
```
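
A hand-maintained dictionary of `True` values is easy to leave stale. As a minimal sketch, the helper below lists any entries still marked `False`, so the cell can flag unfinished work instead of only echoing the dictionary. The name `incomplete_items` and the `example` values are illustrative assumptions, not part of the original notebook.

```python
def incomplete_items(checklist):
    """Return the names of checklist entries that are not yet done."""
    return [name for name, done in checklist.items() if not done]


# Same keys as the checklist in this notebook; "summary_doc" is set to
# False here purely to illustrate the output (an assumption, not a result).
example = {
    "repo_clean": True,
    "repo_complete": True,
    "readme_complete": True,
    "lifecycle_map": True,
    "summary_doc": False,
    "framework_guide_table": True,
}

missing = incomplete_items(example)
print("All items complete" if not missing else f"Still open: {missing}")
```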

## Reflection Prompts
- Which lifecycle stage posed the greatest challenge, and why?
- What part of your repo offers the most reuse potential?
- What would most benefit a teammate inheriting your repo?

- **Greatest Challenge**: The Deployment stage (Stage 13) was the toughest. Setting up the Flask API meant resolving port issues and ensuring the saved model was compatible with the serving code, and my limited server experience made both harder.
- **Most Reusable Part**: The `notebooks/utils.py` module is highly reusable, with functions like `train_model` and `fill_missing_values` that can be applied to other regression projects with slight modifications.
- **Best Support for Teammate**: A detailed `README.md` with a setup guide, a `requirements.txt` listing dependencies, and a `summary.md` with key results would be most helpful. A `faq.md` in `docs/` could address common issues.
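
The port issues mentioned under Greatest Challenge are not described in detail. One common cause is that the chosen port is already in use. As a hedged sketch using only the standard library, the helpers below check whether a local port can be bound and pick the first free one in a range. The names `is_port_free` and `find_free_port`, and the starting port 5000, are assumptions for illustration and do not come from the project's `app.py`.

```python
import socket


def is_port_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permission error.
            return False
        return True


def find_free_port(start=5000, attempts=20, host="127.0.0.1"):
    """Return the first bindable port in [start, start + attempts)."""
    for port in range(start, start + attempts):
        if is_port_free(port, host):
            return port
    raise RuntimeError(f"No free port found in {start}..{start + attempts - 1}")


if __name__ == "__main__":
    print("Port to use:", find_free_port())
```

The returned port could then be passed to the server at startup. Another process may still claim the port between the check and the server binding to it, so treat this as a convenience check, not a guarantee.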

# Lifecycle Framework Guide - Housing Price Prediction Project

## Stage Mapping
| Stage            | Description                                      | Artifacts/Outputs                     | Tools/Techniques                     |
|-------------------|--------------------------------------------------|---------------------------------------|--------------------------------------|
| Problem Definition | Set objective: predict Ames housing prices       | `README.md`, initial plan             | Stakeholder meetings                 |
| Data Collection   | Download Kaggle dataset                          | `/data/raw/train.csv`, `test.csv`     | Data retrieval, CSV handling         |
| Data Cleaning     | Process missing values, encode features          | `/data/processed/updated_encoded_train.csv` | Pandas, custom scripts              |
| Exploratory Analysis | Investigate feature relationships             | `/notebooks/exploratory_analysis.ipynb` | Matplotlib, Seaborn                  |
| Feature Engineering | Add `LotArea_squared`, normalize data           | `/data/processed/updated_encoded_train.csv` | Scikit-learn preprocessing           |
| Modeling          | Train linear regression model                    | `/model/linear_model.pkl`             | Scikit-learn (LinearRegression)      |
| Evaluation        | Measure R² (0.68), RMSE (0.067)                  | `/reports/evaluation_metrics.json`    | Scikit-learn metrics                 |
| Deployment        | Deploy Flask API                                 | `app.py`, `/deliverables/testing_evidence.png` | Flask, joblib                       |
| Monitoring        | Plan for latency, data quality tracking          | `reflection.md`, monitoring strategy  | Manual logs, future tools            |
| Iteration         | Schedule retraining based on performance         | Updated `model/linear_model.pkl`      | Automated scripts, manual review     |
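
The Evaluation row reports R² and RMSE. As a reminder of what those two numbers measure, here is a minimal by-hand sketch in plain Python. The project itself uses scikit-learn metrics, so this is only an illustration of the formulas. The function names and the toy values are assumptions, not project data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy values for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

print("RMSE:", rmse(y_true, y_pred))
print("R^2:", r_squared(y_true, y_pred))
```

RMSE is expressed in the units of the target variable, so its magnitude depends on how the target was scaled or transformed before modeling. R² is unitless.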