This project uses Python to analyze housing data and visualize the relationship between various features and the sale price of houses.
To run this project locally, you need Python and the following libraries installed:
- NumPy
- Pandas
- Matplotlib
- Seaborn
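One way to confirm that the libraries above are available before running the analysis is a short check like the one below. This is a minimal sketch, not part of the project: the `missing_packages` helper is a hypothetical name, and the import names (`numpy`, `pandas`, `matplotlib`, `seaborn`) are assumed to be the standard ones for these libraries.

```python
from importlib.util import find_spec

# Import names assumed for the libraries listed above.
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("Missing:", ", ".join(missing) if missing else "none")
```

Any name reported as missing can then be installed with your usual package manager.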
The project contains two files:
- housing_data.csv: The dataset used for this project, which contains information about various houses and their sale prices.
- housing_data_analysis.py: The Python code for the project, which loads and explores the data, then visualizes the relationship between different features and the sale price of houses.
The project code is organized as follows:
- Import necessary libraries and the housing data.
- Explore the columns of the housing data.
- Use a distribution plot to visualize the distribution of sale prices.
- Apply a log transform to the sale prices to reduce skewness, and visualize the new distribution.
- Concatenate the sale prices with the first floor square footage data and visualize the relationship between the two.
- Concatenate the sale prices with the garage area data and visualize the relationship between the two.
- Concatenate the sale prices with the year built data and visualize the relationship between the two using a box plot.
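The log transform and concatenation steps above can be sketched as follows. This is an illustration under stated assumptions rather than the code in housing_data_analysis.py: the column names `SalePrice` and `1stFlrSF` are assumed, the helper names are hypothetical, and the small `demo` frame is made-up data standing in for housing_data.csv.

```python
import numpy as np
import pandas as pd


def log_transform(prices: pd.Series) -> pd.Series:
    """Return the natural log of the prices (all values must be positive)."""
    return np.log(prices)


def pair_with_price(df: pd.DataFrame, feature: str, target: str = "SalePrice") -> pd.DataFrame:
    """Concatenate the target column with one feature column, side by side."""
    return pd.concat([df[target], df[feature]], axis=1)


# Made-up values for illustration only; column names are assumptions.
demo = pd.DataFrame({
    "SalePrice": [100000, 150000, 200000, 400000, 900000],
    "1stFlrSF": [800, 1000, 1200, 1600, 2500],
})

log_prices = log_transform(demo["SalePrice"])
pair = pair_with_price(demo, "1stFlrSF")

# Compare skewness before and after the transform.
print("skew before:", demo["SalePrice"].skew())
print("skew after: ", log_prices.skew())
print(pair.head())
```

A frame such as `pair` can then be handed to a plotting call, for example `sns.scatterplot(data=pair, x="1stFlrSF", y="SalePrice")` for a numeric feature, or `sns.boxplot` when the feature is something like the year built.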
This project demonstrates how Python can be used to explore and visualize relationships in housing data. The code can be easily modified to analyze other features and relationships in the dataset.
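When extending the analysis to other features, one possible starting point is to rank the numeric columns by their correlation with the sale price and plot the strongest candidates first. The sketch below is an assumption about how that could be done, not something the project code is known to do: `rank_features_by_correlation` is a hypothetical helper, the column names are assumed, and the `demo` frame is made-up data.

```python
import pandas as pd


def rank_features_by_correlation(df: pd.DataFrame, target: str = "SalePrice") -> pd.Series:
    """Rank numeric columns by absolute Pearson correlation with the target."""
    numeric = df.select_dtypes(include="number")
    correlations = numeric.corr()[target].drop(target)
    # Order by strength of the linear relationship, strongest first.
    order = correlations.abs().sort_values(ascending=False).index
    return correlations.loc[order]


# Made-up values for illustration only; column names are assumptions.
demo = pd.DataFrame({
    "SalePrice": [100000, 150000, 200000, 400000, 900000],
    "GarageArea": [200, 300, 400, 800, 1800],
    "YearBuilt": [1950, 2005, 1960, 1990, 1975],
    "Street": ["A", "B", "C", "D", "E"],
})

ranking = rank_features_by_correlation(demo)
print(ranking)
```

Correlation only captures linear relationships, so a low-ranked feature may still be worth plotting.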