# Airbnb EDA
![image_name](../assets/working-at-airbnb.jpg "alt-image_name!")


## About Author
Author: Joshua Farara

Project: Airbnb EDA
### Contact Info
Use the links below to contact, follow, or correct me:

Email: joshua.farara@gmail.com

[LinkedIn](https://www.linkedin.com/in/joshuafarara/)

[Facebook](https://www.facebook.com/josh.farara/)

[Twitter](https://x.com/FararaTheArtist)

[Github](https://github.com/JoshuaFarara)


## About Data

Title: AirBNB_Data

Dataset: [Link](https://www.kaggle.com/datasets/paramvir705/airbnb-data)

Brief Description of dataset

### Dataset Column Names

Features:

- `id`
- `log_price`
- `property_type`
- `room_type`
- `amenities`
- `accommodates`
- `bathrooms`
- `bed_type`
- `cancellation_policy`
- `cleaning_fee`
- `city`
- `description`
- `first_review`
- `host_has_profile_pic`
- `host_identity_verified`
- `host_response_rate`
- `host_since`
- `instant_bookable`
- `last_review`
- `latitude`
- `longitude`
- `name`
- `neighbourhood`
- `number_of_reviews`
- `review_scores_rating`
- `thumbnail_url`
- `zipcode`
- `bedrooms`
- `beds`


### Metadata

Author/Collaborators: Paramvir_705 (Owner)

Source: Google websites

Collection Methodology: Web scraping

License: Apache 2.0

### Task
Describe task

### Objective

Describe observed objective of dataset.

### Kernel Version Used

Python==3.11.7


## Import Libraries
We will use the following libraries:
1. Pandas: Data manipulation and analysis
2. Numpy: Numerical operations and calculations
3. Matplotlib: Data visualization and plotting
4. Seaborn: Enhanced data visualization and statistical graphics
5. Scipy: Scientific computing and advanced mathematical operations


In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import scipy as sp
import os
import sys
# this is for jupyter notebook to show the plot in the notebook itself instead of opening a new window
%matplotlib inline

for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))
# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session


## Data Loading and Exploration | Cleaning

### Load a CSV file and create a DataFrame

In [3]:
# Kaggle Notebook (the path below is a placeholder from another dataset; point it at the Airbnb CSV before use)
# df = pd.read_csv('/kaggle/input/coffee-sales/index.csv')

# Local Machine Notebook
df = pd.read_csv('../data/Airbnb_Data.csv')
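The dataset has three date-like columns (`first_review`, `host_since`, `last_review`), which a plain `pd.read_csv` call would typically load as strings. As a minimal sketch, the `parse_dates` argument can convert them at load time. The inline CSV below is a small stand-in, not the real file, so the snippet runs on its own.

```python
import io

import pandas as pd

# Small inline stand-in for the real CSV (assumption: same column names, ISO dates).
csv_text = """id,first_review,host_since,last_review
6901257,2016-06-18,2012-03-26,2016-07-18
6304928,2017-08-05,2017-06-19,2017-09-23
13418779,,2015-04-19,
"""

date_cols = ["first_review", "host_since", "last_review"]

# parse_dates converts the listed columns to datetime64; blank cells become NaT.
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=date_cols)
print(sample.dtypes)
```

With the real file, the same argument could be passed alongside the path, for example `pd.read_csv('../data/Airbnb_Data.csv', parse_dates=date_cols)`.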


### Set the options to show all columns and rows

In [4]:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
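Setting `display.max_rows` to `None` applies to the whole session, so printing a large DataFrame by accident can flood the notebook. One possible alternative, sketched here on a made-up frame, is `pd.option_context`, which applies the options only inside the `with` block.

```python
import pandas as pd

# Hypothetical wide frame, used only to demonstrate the display options.
wide = pd.DataFrame({f"col_{i}": range(3) for i in range(40)})

before = pd.get_option("display.max_rows")

# Options set here apply only inside the block and are restored afterwards.
with pd.option_context("display.max_columns", None, "display.max_rows", 20):
    inside = pd.get_option("display.max_rows")
    print(wide.head())

outside = pd.get_option("display.max_rows")
print(before, inside, outside)
```

If the global settings are kept, `pd.reset_option("display.max_rows")` restores the default later.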


### Get a sneak peek of data
A sneak peek gives a quick overview of the data and helps identify potential problems or areas of interest.


In [5]:
df.head(5)

Unnamed: 0,id,log_price,property_type,room_type,amenities,accommodates,bathrooms,bed_type,cancellation_policy,cleaning_fee,city,description,first_review,host_has_profile_pic,host_identity_verified,host_response_rate,host_since,instant_bookable,last_review,latitude,longitude,name,neighbourhood,number_of_reviews,review_scores_rating,thumbnail_url,zipcode,bedrooms,beds
0,6901257,5.010635,Apartment,Entire home/apt,"{""Wireless Internet"",""Air conditioning"",Kitche...",3,1.0,Real Bed,strict,True,NYC,"Beautiful, sunlit brownstone 1-bedroom in the ...",2016-06-18,t,t,,2012-03-26,f,2016-07-18,40.696524,-73.991617,Beautiful brownstone 1-bedroom,Brooklyn Heights,2,100.0,https://a0.muscache.com/im/pictures/6d7cbbf7-c...,11201.0,1.0,1.0
1,6304928,5.129899,Apartment,Entire home/apt,"{""Wireless Internet"",""Air conditioning"",Kitche...",7,1.0,Real Bed,strict,True,NYC,Enjoy travelling during your stay in Manhattan...,2017-08-05,t,f,100%,2017-06-19,t,2017-09-23,40.766115,-73.98904,Superb 3BR Apt Located Near Times Square,Hell's Kitchen,6,93.0,https://a0.muscache.com/im/pictures/348a55fe-4...,10019.0,3.0,3.0
2,7919400,4.976734,Apartment,Entire home/apt,"{TV,""Cable TV"",""Wireless Internet"",""Air condit...",5,1.0,Real Bed,moderate,True,NYC,The Oasis comes complete with a full backyard ...,2017-04-30,t,t,100%,2016-10-25,t,2017-09-14,40.80811,-73.943756,The Garden Oasis,Harlem,10,92.0,https://a0.muscache.com/im/pictures/6fae5362-9...,10027.0,1.0,3.0
3,13418779,6.620073,House,Entire home/apt,"{TV,""Cable TV"",Internet,""Wireless Internet"",Ki...",4,1.0,Real Bed,flexible,True,SF,This light-filled home-away-from-home is super...,,t,t,,2015-04-19,f,,37.772004,-122.431619,Beautiful Flat in the Heart of SF!,Lower Haight,0,,https://a0.muscache.com/im/pictures/72208dad-9...,94117.0,2.0,2.0
4,3808709,4.744932,Apartment,Entire home/apt,"{TV,Internet,""Wireless Internet"",""Air conditio...",2,1.0,Real Bed,moderate,True,DC,"Cool, cozy, and comfortable studio located in ...",2015-05-12,t,t,100%,2015-03-01,t,2017-01-22,38.925627,-77.034596,Great studio in midtown DC,Columbia Heights,4,40.0,,20009.0,0.0,1.0
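The `log_price` values in the preview are easier to read once converted back to prices. The sketch below assumes `log_price` is the natural logarithm of the listing price. The dataset page is not quoted here to confirm this, but the five preview values map to round numbers under that assumption. The `price` column name is hypothetical.

```python
import numpy as np
import pandas as pd

# log_price values copied from the five preview rows.
sample = pd.DataFrame(
    {"log_price": [5.010635, 5.129899, 4.976734, 6.620073, 4.744932]}
)

# Assumption: log_price = ln(price), so exp() recovers the price.
sample["price"] = np.exp(sample["log_price"]).round(2)
print(sample)
```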


### Let's see the column names

In [6]:
df.columns

Index(['id', 'log_price', 'property_type', 'room_type', 'amenities',
       'accommodates', 'bathrooms', 'bed_type', 'cancellation_policy',
       'cleaning_fee', 'city', 'description', 'first_review',
       'host_has_profile_pic', 'host_identity_verified', 'host_response_rate',
       'host_since', 'instant_bookable', 'last_review', 'latitude',
       'longitude', 'name', 'neighbourhood', 'number_of_reviews',
       'review_scores_rating', 'thumbnail_url', 'zipcode', 'bedrooms', 'beds'],
      dtype='object')

### Let's look at the shape of the dataset

In [None]:
print(f"The number of rows is {df.shape[0]}, and the number of columns is {df.shape[1]}.")

### Let's look at the columns and their data types using the detailed info function

In [None]:
df.info()
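Judging from the preview, some columns hold values such as `t`/`f` or `100%`, which pandas would normally store as generic `object` strings. The following is only a sketch of one way to convert them, not the notebook's own cleaning procedure. The sample frame is hypothetical, including the `95%` value.

```python
import pandas as pd

# Hypothetical sample that mimics the value formats seen in the preview.
sample = pd.DataFrame({
    "host_identity_verified": ["t", "f", "t", None],
    "instant_bookable": ["f", "t", "t", "f"],
    "host_response_rate": [None, "100%", "100%", "95%"],
})

flag_cols = ["host_identity_verified", "instant_bookable"]
for col in flag_cols:
    # Map 't'/'f' to True/False; the nullable "boolean" dtype keeps missing values as <NA>.
    sample[col] = sample[col].map({"t": True, "f": False}).astype("boolean")

# Strip the percent sign and convert to a number; unparseable values become NaN.
sample["host_response_rate"] = pd.to_numeric(
    sample["host_response_rate"].str.rstrip("%"), errors="coerce"
)

print(sample.dtypes)
```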

### Count the missing values

In [None]:
df.isnull().sum()
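Raw counts are hard to compare across columns, so it can help to view missing values as a percentage of rows and sort them. A minimal sketch follows, using three columns from the five preview rows in place of the full `df`. The `missing_count` and `missing_pct` names are hypothetical.

```python
import numpy as np
import pandas as pd

# Values taken from the five preview rows; swap in the full df for real use.
sample = pd.DataFrame({
    "review_scores_rating": [100.0, 93.0, 92.0, np.nan, 40.0],
    "host_response_rate": [None, "100%", "100%", None, "100%"],
    "city": ["NYC", "NYC", "NYC", "SF", "DC"],
})

missing = pd.DataFrame({
    "missing_count": sample.isnull().sum(),
    # The mean of a boolean mask is the fraction of missing rows.
    "missing_pct": (sample.isnull().mean() * 100).round(2),
})

# Keep only columns that have gaps, worst first.
missing = missing[missing["missing_count"] > 0].sort_values(
    "missing_pct", ascending=False
)
print(missing)
```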

## Cleaning Data

### First Step of Cleaning

### Second Step of Cleaning

### Third Step of Cleaning

### Fourth Step of Cleaning

### Fifth Step of Cleaning

### Restructure DataFrame order

In [None]:
# Fill in the restructured DataFrame format for the desired look of the data.

## Analytical Questions

Analysis Subject 1

1. 

2. 

3. 

4. 

5. 

Analysis Subject 2

6. 

7. 

8. 

9. 

10. 


## Summary

## Contact Information

Use the links below to contact, follow, or correct me:

Email: joshua.farara@gmail.com

[LinkedIn](https://www.linkedin.com/in/joshuafarara/)

[Facebook](https://www.facebook.com/josh.farara/)

[Twitter](https://x.com/FararaTheArtist)

[Github](https://github.com/JoshuaFarara)
