Following the data science process to generate useful insights from the New York Airbnb data

ManarOmar/New-York-Airbnb-2019

New-York-Airbnb-2019

This project follows the data science process to generate useful insights from the New York Airbnb data, using Jupyter Notebook with Python 3.

Libraries

1. Pandas

2. Scikit-learn (sklearn)

3. Seaborn

4. Matplotlib

Motivation

These questions motivated me to look for useful results in the data:

  • How many hosts are in every neighbourhood_group? Which host ids have the highest number of rooms in every neighbourhood_group?

  • Which features affect the price? Can we predict the price from these features?

  • Which neighbourhood has the highest number of reviews, and which has the highest number of rooms, in every neighbourhood group? What can we learn from the results?
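The first and third questions map naturally onto pandas group-by operations. The following is a minimal sketch, not the notebook's actual code: it uses a tiny made-up table in place of AB_NYC_2019.csv, and it assumes the column names `host_id`, `neighbourhood` and `number_of_reviews` alongside `neighbourhood_group`.

```python
import pandas as pd

# Toy stand-in for AB_NYC_2019.csv; all values are invented for illustration.
listings = pd.DataFrame({
    "host_id": [1, 1, 2, 3, 3, 3, 4],
    "neighbourhood_group": ["Brooklyn", "Brooklyn", "Brooklyn",
                            "Manhattan", "Manhattan", "Manhattan", "Manhattan"],
    "neighbourhood": ["Williamsburg", "Williamsburg", "Bushwick",
                      "Harlem", "Harlem", "Midtown", "Midtown"],
    "number_of_reviews": [10, 5, 7, 20, 1, 3, 8],
})

# Question 1a: number of distinct hosts in every neighbourhood group.
hosts_per_group = listings.groupby("neighbourhood_group")["host_id"].nunique()

# Question 1b: the host with the most listings in every neighbourhood group.
listings_per_host = (
    listings.groupby(["neighbourhood_group", "host_id"])
    .size()
    .reset_index(name="n_listings")
)
top_hosts = listings_per_host.loc[
    listings_per_host.groupby("neighbourhood_group")["n_listings"].idxmax()
].set_index("neighbourhood_group")

# Question 3: total reviews and listing count per neighbourhood,
# then the most-reviewed neighbourhood within every group.
per_neighbourhood = (
    listings.groupby(["neighbourhood_group", "neighbourhood"])
    .agg(total_reviews=("number_of_reviews", "sum"),
         n_listings=("host_id", "size"))
    .reset_index()
)
most_reviewed = per_neighbourhood.loc[
    per_neighbourhood.groupby("neighbourhood_group")["total_reviews"].idxmax()
].set_index("neighbourhood_group")

print(hosts_per_group)
print(top_hosts)
print(most_reviewed)
```

Swapping `total_reviews` for `n_listings` in the last step would give the neighbourhood with the most rooms in every group instead.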

Files

1. AB_NYC_2019.csv: the data used in the analysis, downloaded from Kaggle.

2. New York City Airbnb: the Jupyter Notebook containing all of the code for the data science process, along with some markdown cells.
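As a rough sketch of how the CSV might be loaded and checked for missing values, the snippet below defines a hypothetical helper, `load_listings`, which is not part of the notebook. To stay self-contained it reads a two-row in-memory sample with assumed column names; in the repository the call would presumably be `load_listings("AB_NYC_2019.csv")`.

```python
import io

import pandas as pd


def load_listings(source):
    """Read the listings CSV from a path or file-like object (hypothetical helper)."""
    df = pd.read_csv(source)
    # Count missing values per column as a first look at data quality.
    missing = df.isna().sum()
    return df, missing


# Invented two-row sample standing in for the real file.
sample_csv = io.StringIO(
    "id,host_id,neighbourhood_group,room_type,price,reviews_per_month\n"
    "1,10,Brooklyn,Private room,149,0.21\n"
    "2,11,Manhattan,Entire home/apt,225,\n"
)

listings, missing = load_listings(sample_csv)
print(listings.shape)
print(missing[missing > 0])
```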

Summary

This project extracts some of the many useful insights about the 2019 Airbnb listings in New York City, specifically insights related to the prices, hosts, and room types in every neighbourhood group. The analysis uses one dataset: New York City Airbnb Open Data.

I followed the CRISP-DM (Cross-Industry Standard Process for Data Mining) process in the analysis:

  • Business Understanding: asking questions related to the business domain
  • Data Understanding: identifying the columns in the data used for each question
  • Data Preparation: cleaning and wrangling the data so that it can answer each question
  • Modeling: building a model to predict one of the features
  • Evaluation: evaluating the model
  • Deployment: communicating the results and the process
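The Data Preparation, Modeling and Evaluation steps could look roughly like the sketch below. It is an illustration under stated assumptions, not the notebook's method: the README does not say which model was used, so a plain linear regression is assumed here, and the data is synthetic, with `room_type` and `minimum_nights` as assumed feature columns and `price` as the target.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the listings data (all values are generated, not real).
rng = np.random.default_rng(0)
n = 200
toy = pd.DataFrame({
    "room_type": rng.choice(["Entire home/apt", "Private room"], size=n),
    "minimum_nights": rng.integers(1, 8, size=n),
})
# Invented price rule: entire homes cost more, plus Gaussian noise.
toy["price"] = (
    80
    + 70 * (toy["room_type"] == "Entire home/apt")
    + 2 * toy["minimum_nights"]
    + rng.normal(0, 10, size=n)
)

# Data Preparation: one-hot encode the categorical feature.
X = pd.get_dummies(
    toy[["room_type", "minimum_nights"]],
    columns=["room_type"],
    drop_first=True,
    dtype=float,
)
y = toy["price"]

# Modeling: fit on a training split only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
model = LinearRegression().fit(X_train, y_train)

# Evaluation: score on the held-out split.
test_r2 = r2_score(y_test, model.predict(X_test))
print(f"R^2 on held-out data: {test_r2:.2f}")
```

Scoring on a held-out split rather than on the training rows gives a fairer picture of how well price can be predicted from the chosen features.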

Acknowledgement

  • This is the first project in the Data Scientist Nanodegree program from Udacity

  • I found the data on Kaggle

  • I used Stack Overflow to solve the problems I faced during coding

A blog post about this project, written for non-technical readers, is available on Medium.
