# **Notebook 4: Price Analysis**

## Objectives

* **Explore Relationships Between Features and Sale Price**
  * Conduct in-depth analysis of the relationships between house attributes and the target variable, `SalePrice`.
  * Use visualizations and statistical methods to identify trends and patterns that affect property values.
* **Validate Business Hypotheses**
  * Test key hypotheses about the drivers of house prices, including the influence of quality, size and location attributes.
* **Generate Insights for Client Needs**
  * Provide actionable insights to help the client understand the factors influencing the value of their inherited properties and similar houses in Ames, Iowa.
  * Present findings in a clear and interpretable manner for client use.
* **Prepare Visualizations for Dashboard**
  * Develop interactive and static visualizations that effectively communicate findings and align with the dashboard requirements.

## Inputs

* **Processed Datasets**
  * `x_train_transformed.csv`: Feature-engineered and scaled training dataset for modeling and analysis.
  * `x_test_transformed.csv`: Feature-engineered and scaled testing dataset for validation.
  * `y_train.csv`: Training dataset target variable (SalePrice).
  * `y_test.csv`: Testing dataset target variable (SalePrice).
* **Supplementary Data**
  * Domain knowledge and project-specific hypotheses for guiding analysis.
* **Stored Locations**
  * Datasets are located in the `outputs/datasets/processed/transformed/` and `outputs/datasets/processed/split/` directories.

## Outputs

* **Insights and Findings**
  * Detailed analysis of the relationships between key features and sale price.
  * Validation of hypotheses with supporting evidence.
* **Visualizations**
  * Scatter plots, box plots, heatmaps, and other graphical representations to highlight trends and patterns.
  * Summary visualizations prepared for dashboard integration.
* **Documentation**
  * Summary of analysis, key takeaways, and recommendations for downstream modeling and dashboard integration.

## Additional Comments

* **Context**
  * This notebook focuses on data exploration and analysis, bridging the gap between feature engineering and model building. It provides the foundation for deriving insights and recommendations.
* **Alignment with CRISP-DM**
  * This notebook aligns with the Data Understanding and Business Understanding steps, ensuring that exploratory findings are actionable and relevant to the client's needs.
* **Next Steps**
  * The outputs from this notebook will inform the Model Training and Evaluation notebook, where predictive models will be developed and optimized.


---

## Change working directory

* We assume the notebooks are stored in a subfolder, so when running a notebook in the editor you will need to change the working directory.

We need to change the working directory from its current folder to its parent folder.
* We access the current directory with `os.getcwd()`.

In [None]:
import os
current_dir = os.getcwd()
current_dir

We want to make the parent of the current directory the new current directory.
* `os.path.dirname()` gets the parent directory.
* `os.chdir()` sets the new current directory.

In [None]:
os.chdir(os.path.dirname(current_dir))
print("You set a new current directory")

Confirm the new current directory

In [None]:
current_dir = os.getcwd()
current_dir

---

## Load and Prepare Data
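
A minimal sketch of loading the four files listed under Inputs. The helper name `load_datasets` is hypothetical, and the mapping of files to directories (features in `transformed/`, targets in `split/`) is an assumption inferred from the file names; adjust it if the project stores them differently.

```python
from pathlib import Path

import pandas as pd

# Directories named under "Inputs". Which file lives in which directory
# is an assumption, not something the notebook states explicitly.
TRANSFORMED_DIR = Path("outputs/datasets/processed/transformed")
SPLIT_DIR = Path("outputs/datasets/processed/split")


def load_datasets(transformed_dir=TRANSFORMED_DIR, split_dir=SPLIT_DIR):
    """Hypothetical helper: read the four input CSVs into a dict of DataFrames."""
    paths = {
        "X_train": Path(transformed_dir) / "x_train_transformed.csv",
        "X_test": Path(transformed_dir) / "x_test_transformed.csv",
        "y_train": Path(split_dir) / "y_train.csv",
        "y_test": Path(split_dir) / "y_test.csv",
    }
    # Fail early with a clear message if any input is missing.
    missing = [str(p) for p in paths.values() if not p.exists()]
    if missing:
        raise FileNotFoundError(f"Missing input files: {missing}")

    data = {name: pd.read_csv(path) for name, path in paths.items()}

    # Features and target must describe the same number of rows.
    for split in ("train", "test"):
        n_x, n_y = len(data[f"X_{split}"]), len(data[f"y_{split}"])
        if n_x != n_y:
            raise ValueError(f"{split}: {n_x} feature rows vs {n_y} target rows")
    return data
```

After the working directory has been changed to the project root, `data = load_datasets()` would return the frames keyed as `X_train`, `X_test`, `y_train` and `y_test`.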

---

## Exploratory Data Analysis (EDA)

### Sale Price Distribution

### Correlation Analysis
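
One possible starting point for this section is to rank the numeric features by their correlation with `SalePrice`. The function name `rank_correlations` is hypothetical. The sketch assumes `y` is a pandas Series sharing its index with `X` (for example `y_train["SalePrice"]`, if the target CSV uses that column name). Spearman is computed here as Pearson on ranks, which avoids an extra dependency.

```python
import pandas as pd


def rank_correlations(X: pd.DataFrame, y: pd.Series) -> pd.DataFrame:
    """Hypothetical helper: Pearson and Spearman correlation of each
    numeric feature with the target, strongest Spearman first."""
    numeric = X.select_dtypes(include="number")
    pearson = numeric.corrwith(y)
    # Spearman correlation equals Pearson correlation of the ranks.
    spearman = numeric.rank().corrwith(y.rank())
    out = pd.DataFrame({"pearson": pearson, "spearman": spearman})
    # Sort by absolute strength so strong negative relationships also surface.
    return out.sort_values("spearman", key=lambda s: s.abs(), ascending=False)
```

Comparing the two columns can hint at non-linear but monotonic relationships: a feature with a noticeably higher Spearman than Pearson value is a candidate for closer inspection in a scatter plot.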

### Pairwise Analysis

### Multivariate Analysis

### Feature Comparison Across Quartiles
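
A hedged sketch of one way to compare features across sale price quartiles: bin the target into four equal-sized groups with `pd.qcut`, then take the median of each numeric feature per group. The function name `compare_across_quartiles` and the labels `Q1` to `Q4` are illustrative choices, not part of the project.

```python
import pandas as pd


def compare_across_quartiles(X: pd.DataFrame, y: pd.Series) -> pd.DataFrame:
    """Hypothetical helper: median of each numeric feature within
    each quartile of the target (Q1 = cheapest, Q4 = most expensive)."""
    quartile = pd.qcut(y, q=4, labels=["Q1", "Q2", "Q3", "Q4"]).rename("price_quartile")
    numeric = X.select_dtypes(include="number")
    # The grouping Series is aligned with X on the index.
    return numeric.groupby(quartile, observed=True).median()
```

Because the input features are scaled, the medians are in scaled units; the table is useful for comparing direction and relative size of differences between quartiles rather than reading absolute values.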

### Outlier Analysis
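
As an illustration of a common outlier screen, the sketch below flags values outside the Tukey fences (more than `k` interquartile ranges beyond the first or third quartile). This is an assumed technique for this section, not necessarily the one the project will settle on; the function name `iqr_outlier_mask` and the default `k=1.5` are illustrative.

```python
import pandas as pd


def iqr_outlier_mask(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Hypothetical helper: boolean mask that is True where a value
    lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = values.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)
```

The mask can be summed to count flagged rows, or used to index the feature frame and inspect those houses before deciding whether they are data errors or genuinely unusual properties.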

---

## Business Insights

### Key Drivers of Sale Price

### Client-Specific Observations

---

## Save Outputs

---

## Conclusion & Next Steps

### Conclusion

### Next Steps