Data_Analysis_Project

This is a data analysis project completed as part of a data science course. The analysis uses a dataset of house sales in King County, USA, which includes Seattle. The dataset covers homes sold between May 2014 and May 2015.

Download the entire dataset here: https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DA0101EN-SkillsNetwork/labs/FinalModule_Coursera/data/kc_house_data_NaN.csv
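
As a rough sketch of how the CSV might be loaded with pandas: `pd.read_csv` accepts a URL, a local path, or a file-like object, so the link above could be passed directly. The helper name `load_house_data` and the two-row sample below are hypothetical and only illustrate the call; the values are made up and are not taken from the dataset.

```python
import io

import pandas as pd


def load_house_data(source):
    """Read the house sales CSV from a URL, a file path, or a file-like object."""
    return pd.read_csv(source)


# Tiny in-memory stand-in (made-up values) so the sketch runs without network access.
sample_csv = io.StringIO("price,bedrooms\n300000.0,3\n450000.0,\n")
df = load_house_data(sample_csv)

# The empty bedrooms field in the second row is read as NaN.
print(df.isna().sum())
```

In the notebook, the dataset URL would presumably take the place of `sample_csv`.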

The project was done in a Jupyter Notebook using Python.
The libraries used are:

pandas
matplotlib
seaborn
scikit-learn

The aims of the project were:

Obtaining a statistical summary of the dataframe
Replacing the missing values in the dataset with the mean values
Comparing the outliers in the sales of houses with and without a waterfront view
Finding the correlation of certain features with the price of the house
Creating linear regression models to predict price using different features and finding the coefficient of determination
Splitting the dataset into test samples and training samples
Creating a ridge regression model using the training and test data and finding the coefficient of determination, thus evaluating the model and refining it if required
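
The steps above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it runs on a small synthetic DataFrame with made-up values, and the column names (`sqft_living`, `bedrooms`, `waterfront`, `price`), the choice of features, the split ratio, and the ridge `alpha` are all assumptions. The waterfront outlier comparison, which is a plotting step, is left out.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the housing data (all values are made up).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "sqft_living": rng.uniform(500, 4000, n),
    "bedrooms": rng.integers(1, 6, n).astype(float),
    "waterfront": rng.integers(0, 2, n),
})
df["price"] = (150 * df["sqft_living"]
               + 200_000 * df["waterfront"]
               + rng.normal(0, 20_000, n))
df.loc[[3, 17, 42], "bedrooms"] = np.nan  # inject a few missing values

# Statistical summary of the dataframe.
summary = df.describe()

# Replace missing values with the column mean.
df["bedrooms"] = df["bedrooms"].fillna(df["bedrooms"].mean())

# Correlation of each feature with price.
corr_with_price = df.corr()["price"].sort_values(ascending=False)

# Linear regression on all rows and its coefficient of determination (R^2).
features = ["sqft_living", "bedrooms", "waterfront"]
X, y = df[features], df["price"]
lm = LinearRegression().fit(X, y)
r2_full = lm.score(X, y)

# Split into training and test samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

# Ridge regression fitted on the training data, scored on the test data.
ridge = Ridge(alpha=0.1).fit(X_train, y_train)
r2_test = ridge.score(X_test, y_test)

print(f"R^2 (linear, all rows): {r2_full:.3f}")
print(f"R^2 (ridge, test set):  {r2_test:.3f}")
```

Scoring the ridge model on held-out test data, rather than on the rows it was fitted on, is what makes the second R^2 a measure of how well the model generalizes.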
