# Proximity / Location
Where is the demand? What is the occupancy rate? Which areas rent all year round? Which seasons perform best?

## Imports

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

## Data Preparation

In [2]:
# Importing data into pandas dataframes
listings = pd.read_csv('./listings_sea.csv')
calendar = pd.read_csv('./calendar_sea.csv')
reviews = pd.read_csv('./reviews_sea.csv')

In [3]:
# View a snippet of the data
calendar.head(3)

Unnamed: 0,listing_id,date,available,price
0,241032,2016-01-04,t,$85.00
1,241032,2016-01-05,t,$85.00
2,241032,2016-01-06,f,


In [4]:
# Find null data
calendar.isnull().sum()

listing_id         0
date               0
available          0
price         459028
dtype: int64
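
The null count shows that only `price` has missing values. Before dropping those rows, it may be worth checking whether the missing prices line up with `available == 'f'`. If they do, dropping null rows also drops every unavailable day, which matters for any later occupancy calculation. The sketch below runs that check on a small hypothetical frame (`toy_calendar` is illustrative data, not the real file), and it assumes that `'f'` means the listing could not be booked on that day.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the calendar data (not the real file)
toy_calendar = pd.DataFrame({
    'listing_id': [1, 1, 1, 2],
    'available': ['t', 't', 'f', 'f'],
    'price': ['$85.00', '$85.00', np.nan, np.nan],
})

# Flag unavailable days and days with a missing price
is_unavailable = toy_calendar['available'] == 'f'
price_missing = toy_calendar['price'].isnull()

# True only if every missing price falls on an unavailable day and vice versa
nulls_match_unavailable = bool((is_unavailable == price_missing).all())
print(nulls_match_unavailable)

# Share of listing-days marked unavailable (assumption: 'f' means not bookable)
unavailable_share = is_unavailable.mean()
print(unavailable_share)
```

Running the same two comparisons on `calendar` would show whether the rows removed in the next step are exactly the unavailable days.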

In [5]:
# Initial shape
calendar.shape

(1393570, 4)

In [6]:
# Drop rows with missing values (only the price column has nulls)
revised_calendar = calendar.dropna(axis=0)

In [7]:
# Revised shape
revised_calendar.shape

(934542, 4)

In [8]:
revised_calendar.isnull().sum()

listing_id    0
date          0
available     0
price         0
dtype: int64

In [None]:
# Fix the data types of the date and price columns

In [9]:
revised_calendar['date'] = pd.to_datetime(revised_calendar['date'])

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  """Entry point for launching an IPython kernel.
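
The warning above is a `SettingWithCopyWarning`: pandas flags the assignment because `revised_calendar` was derived from `calendar`, and it cannot tell whether the change was meant to reach the original frame. One common way to avoid it is to take an explicit copy right after filtering. The sketch below shows this on a hypothetical frame; `toy` and `cleaned` are illustrative names, not part of the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the calendar data
toy = pd.DataFrame({
    'date': ['2016-01-04', '2016-01-05', '2016-01-06'],
    'price': ['$85.00', '$85.00', np.nan],
})

# .copy() makes the filtered frame independent of the original
cleaned = toy.dropna(axis=0).copy()

# Assigning a new column type on the copy leaves the original frame untouched
cleaned['date'] = pd.to_datetime(cleaned['date'])
print(cleaned.dtypes)
```

The original `toy` frame keeps its string dates, so the two frames can be compared later if needed.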


In [5]:
def convert_currency(val):
    """
    Convert the string number value to a float
     - Remove $
     - Remove commas
     - Convert to float type
    """
    new_val = val.replace(',','').replace('$', '')
    return float(new_val)
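
As an alternative to applying `convert_currency` row by row, the same cleanup can be done on the whole column with pandas string methods, which is usually faster on a frame of this size. This is a minimal sketch on hypothetical values; `prices` and `as_float` are illustrative names.

```python
import pandas as pd

# Hypothetical price strings in the same format as the calendar data
prices = pd.Series(['$85.00', '$1,250.00', '$99.50'])

# Strip '$' and ',' in one regex pass, then cast the whole column to float
as_float = prices.str.replace(r'[$,]', '', regex=True).astype(float)
print(as_float.tolist())
```

Like `convert_currency`, this assumes every value is a string, so it should run after the null rows have been dropped.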

In [7]:
revised_calendar['price'] = revised_calendar['price'].apply(convert_currency)


In [8]:
revised_calendar.dtypes

listing_id             int64
date          datetime64[ns]
available             object
price                float64
dtype: object

In [13]:
revised_calendar['year'] = revised_calendar['date'].dt.year


In [14]:
revised_calendar['month'] = revised_calendar['date'].dt.month


In [15]:
revised_calendar.head(3)

Unnamed: 0,listing_id,date,available,price,month,year
0,241032,2016-01-04,t,85.0,1,2016
1,241032,2016-01-05,t,85.0,1,2016
9,241032,2016-01-13,t,85.0,1,2016


In [16]:
revised_calendar.year.unique()

array([2016, 2017], dtype=int64)

In [27]:
# Show average price by month
revised_calendar.groupby('month')['price'].mean()

month
1     122.912176
2     124.293927
3     128.644488
4     135.097005
5     139.538183
6     147.473137
7     152.094150
8     150.656594
9     143.255949
10    137.031939
11    135.688738
12    137.251835
Name: price, dtype: float64
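
In the output above, July and August have the highest average prices and January the lowest. To address the seasonal question posed at the top, the months can be grouped into seasons. The sketch below assumes meteorological Northern Hemisphere seasons (December to February as winter, and so on) and uses a few hypothetical rows; `month_to_season` is an illustrative mapping, not part of the notebook.

```python
import pandas as pd

# Hypothetical rows with the same month and price columns as revised_calendar
toy = pd.DataFrame({
    'month': [1, 2, 7, 8, 10, 12],
    'price': [100.0, 110.0, 150.0, 160.0, 130.0, 120.0],
})

# Assumed season definition: meteorological seasons, Northern Hemisphere
month_to_season = {
    12: 'winter', 1: 'winter', 2: 'winter',
    3: 'spring', 4: 'spring', 5: 'spring',
    6: 'summer', 7: 'summer', 8: 'summer',
    9: 'autumn', 10: 'autumn', 11: 'autumn',
}

# Label each row with its season, then average the price per season
toy['season'] = toy['month'].map(month_to_season)
season_avg = toy.groupby('season')['price'].mean()
print(season_avg)
```

Applied to `revised_calendar`, the same mapping would give one average price per season instead of twelve monthly values.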