dashminded/boston-housing-analysis

Boston Housing Data Analysis

This project analyzes the Boston Housing dataset using Python. It includes visualizations, descriptive statistics, and statistical tests to explore factors that affect housing prices in the Boston area.

The dataset contains information on crime rates, property taxes, number of rooms, proximity to the Charles River, and other features for different housing tracts.


Project Objectives

  • Explore the distribution of housing values
  • Visualize relationships between housing features
  • Test whether certain factors significantly affect house prices
  • Apply T-tests, ANOVA, correlation, and linear regression
  • Practice working with real-world data in a data science context

Dataset Description

The dataset contains the following variables:

Variable   Description
CRIM       Per capita crime rate by town
ZN         Proportion of residential land zoned for lots over 25,000 sq. ft.
INDUS      Proportion of non-retail business acres per town
CHAS       Charles River dummy variable (1 if tract bounds river; 0 otherwise)
NOX        Nitric oxides concentration (parts per 10 million)
RM         Average number of rooms per dwelling
AGE        Proportion of owner-occupied units built prior to 1940
DIS        Weighted distances to five Boston employment centers
RAD        Index of accessibility to radial highways
TAX        Full-value property tax rate per $10,000
PTRATIO    Pupil-teacher ratio by town
LSTAT      Percentage of the population of lower status
MEDV       Median value of owner-occupied homes, in $1000s
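
Before running any analysis, it can help to confirm that a loaded table actually matches the variables listed above. The following is a minimal sketch, not part of the notebook: the helper name check_schema and the one-row sample frame are hypothetical, and the README does not say how or from where the data is loaded.

```python
import pandas as pd

# Column names taken from the variable table in this README.
EXPECTED_COLUMNS = [
    "CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
    "DIS", "RAD", "TAX", "PTRATIO", "LSTAT", "MEDV",
]


def check_schema(df: pd.DataFrame) -> list:
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration only.
    """
    problems = []
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    # CHAS is documented as a 0/1 dummy variable.
    if "CHAS" in df.columns and not df["CHAS"].isin([0, 1]).all():
        problems.append("CHAS must contain only 0 or 1")
    return problems


# Made-up one-row frames, used only to exercise the helper.
good = pd.DataFrame([dict.fromkeys(EXPECTED_COLUMNS, 0.0)])
bad = good.drop(columns=["MEDV"]).assign(CHAS=2)

print(check_schema(good))
print(check_schema(bad))
```

An empty list for the first frame and two messages for the second indicate the check behaves as intended.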

Analysis Summary

The notebook includes the following:

  • Boxplot of median home values
  • Bar chart counting the tracts that border the Charles River and those that do not (CHAS)
  • Grouped boxplots comparing home values across age brackets
  • Scatter plot to visualize the relationship between industrial land use and nitric oxide pollution
  • Histogram of pupil-teacher ratio across towns
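
The grouped boxplots above need a categorical age column, because AGE itself is continuous. The sketch below shows one way to derive it with pandas. The cut points (35 and 70), the labels, and the column name age_group are assumptions made for illustration; the README does not state which brackets the notebook uses.

```python
import pandas as pd

# Hypothetical cut points and labels; not confirmed by the README.
AGE_BINS = [0, 35, 70, 100]
AGE_LABELS = ["AGE <= 35", "35 < AGE <= 70", "AGE > 70"]


def add_age_group(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a categorical 'age_group' column."""
    out = df.copy()
    # Intervals are closed on the right; include_lowest keeps AGE == 0.
    out["age_group"] = pd.cut(
        out["AGE"], bins=AGE_BINS, labels=AGE_LABELS, include_lowest=True
    )
    return out


# Made-up values, chosen to hit the bin edges.
sample = pd.DataFrame(
    {
        "AGE": [0.0, 35.0, 35.1, 70.0, 100.0],
        "MEDV": [30.0, 25.0, 22.0, 21.0, 15.0],
    }
)
binned = add_age_group(sample)
print(binned)
```

The resulting column can then serve as the grouping variable, for example seaborn.boxplot(data=binned, x="age_group", y="MEDV").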

Statistical Tests Performed

  • T-test: Compared median home values (MEDV) for tracts that bound the Charles River with those that do not (CHAS)
  • ANOVA: Compared median home values across age group categories (AGE)
  • Pearson correlation: Assessed the relationship between NOX and INDUS
  • Linear regression: Evaluated the effect of weighted distance to employment centers (DIS) on median home values
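
The four tests listed above can be sketched with SciPy as follows. This is an illustration under stated assumptions, not the notebook's code: the data is synthetic (randomly generated, not the real dataset), the age cut points are hypothetical, Welch's variant of the t-test is an arbitrary choice, and scipy.stats.linregress stands in for a Statsmodels regression to keep the example short.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in data; these are NOT values from the real dataset.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame(
    {
        "CHAS": rng.integers(0, 2, size=n),
        "AGE": rng.uniform(0, 100, size=n),
        "INDUS": rng.uniform(0, 30, size=n),
        "DIS": rng.uniform(1, 12, size=n),
    }
)
df["NOX"] = 0.4 + 0.01 * df["INDUS"] + rng.normal(0, 0.03, size=n)
df["MEDV"] = 20 + 1.0 * df["DIS"] + 3.0 * df["CHAS"] + rng.normal(0, 4, size=n)

# 1. T-test: MEDV for tracts bounding the river vs. the rest.
#    equal_var=False selects Welch's t-test (an assumption made here).
river = df.loc[df["CHAS"] == 1, "MEDV"]
no_river = df.loc[df["CHAS"] == 0, "MEDV"]
t_stat, t_p = stats.ttest_ind(river, no_river, equal_var=False)

# 2. One-way ANOVA: MEDV across age groups (hypothetical cut points).
age_group = pd.cut(df["AGE"], bins=[0, 35, 70, 100], include_lowest=True)
groups = [g["MEDV"].to_numpy() for _, g in df.groupby(age_group, observed=True)]
f_stat, f_p = stats.f_oneway(*groups)

# 3. Pearson correlation between NOX and INDUS.
r, r_p = stats.pearsonr(df["NOX"], df["INDUS"])

# 4. Simple linear regression of MEDV on DIS.
reg = stats.linregress(df["DIS"], df["MEDV"])

print(f"t-test:     t={t_stat:.2f}, p={t_p:.4f}")
print(f"ANOVA:      F={f_stat:.2f}, p={f_p:.4f}")
print(f"Pearson:    r={r:.2f}, p={r_p:.4f}")
print(f"Regression: slope={reg.slope:.2f}, p={reg.pvalue:.4f}")
```

In each case a small p-value is evidence against the null hypothesis of no difference or no association. What counts as small depends on the significance level chosen, which the README does not specify.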

Tools Used

  • Python
  • Pandas
  • Seaborn
  • Matplotlib
  • Scipy
  • Statsmodels
