# Examining Host Join Dates

A quickie. What's the trend like for people hosting AirBnBs in Boston? How much more popular is AirBnB among hosts now than it was two or three years ago?

**Note**: This data is an incomplete record because it only includes hosts who are still active on the site, not historical hosts who have since taken their properties offline. Without the full historical record we're limited in what we can do! Furthermore, the property sample is small relative to the timespan, so we don't have enough data to, for example, measure the impact of holidays on listings. It's important to understand not just what you can do with data but also what you can't!

In [None]:
import numpy as np
import pandas as pd
import seaborn as sns
# Use the full option name; the short form "max_columns" is ambiguous in newer pandas.
pd.set_option("display.max_columns", None)

In [None]:
listings = pd.read_csv("../input/listings.csv")

In [None]:
listings.head()

In [None]:
# Count listings per host join date, then fill in the calendar days with no sign-ups as 0.
join_dates = pd.to_datetime(listings['host_since']).value_counts().resample('D').mean().fillna(0)
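To see what that chain does step by step, here is a minimal sketch on made-up data. The `toy_host_since` values are hypothetical and only stand in for `listings['host_since']`; they are not taken from the dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for listings['host_since'].
toy_host_since = pd.Series(["2015-01-01", "2015-01-01", "2015-01-04"])

toy_daily = (
    pd.to_datetime(toy_host_since)
    .value_counts()   # number of rows per distinct join date
    .resample("D")    # one bin per calendar day, including days with no rows
    .mean()           # each non-empty bin holds a single count, so the mean is that count
    .fillna(0)        # empty days come out as NaN; treat them as zero sign-ups
)
print(toy_daily)
```

The two gap days between the toy dates show up as zeros, which is what lets the later rolling and monthly summaries treat quiet days as quiet rather than missing.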

Surprisingly, the statistical picture is dominated by a few rare events, times during which lots and lots of people signed up.

In [None]:
join_dates.plot()

In [None]:
join_dates.value_counts()

In [None]:
# idxmax returns the date of the busiest day; np.argmax returns an integer position in newer pandas.
join_dates.idxmax()
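Since a handful of days dominate the picture, it can help to look at several of the biggest days rather than only the single maximum. A minimal sketch, assuming a daily count series shaped like `join_dates`; the `toy_daily` numbers here are invented for illustration.

```python
import pandas as pd

# Hypothetical daily sign-up counts; the real series would be join_dates.
toy_daily = pd.Series(
    [1, 0, 7, 2, 0, 5, 1],
    index=pd.date_range("2015-03-01", periods=7, freq="D"),
)

top_days = toy_daily.nlargest(3)  # the three busiest days, largest first
peak_day = toy_daily.idxmax()     # the date (index label) of the single busiest day
print(top_days)
print(peak_day)
```

On the real data, `join_dates.nlargest(10)` would be the analogous call, and the resulting dates are the ones worth running through a news search.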

Often in time-series analyses like this, when something sticks out this strongly, a quick news search turns up a culprit.

Unfortunately in this case there's nothing interesting about this date in the news, at least not as far as I can tell. We don't have enough domain-specific knowledge to make sense of it!

When we widen the window to a 31-day rolling mean, roughly a full month, the rare events continue to dominate the spikes and lulls in the dataset.

In [None]:
join_dates.rolling(window=31).mean().plot()

Are people more likely to list on weekdays than weekends, or vice versa? No.

In [None]:
pd.to_datetime(listings['host_since']).dt.dayofweek.value_counts().sort_index().plot(kind='bar')
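One way to put a number on the weekday/weekend question is to compare the weekend share of sign-ups against the 2/7 you'd expect if every day were equally likely. This is only a sketch on invented dates (`toy_host_since` is hypothetical), not a result from the listings data.

```python
import pandas as pd

# Hypothetical join dates: one per day for a full week, Monday 2015-06-01 to Sunday 2015-06-07.
toy_host_since = pd.to_datetime(pd.Series([
    "2015-06-01", "2015-06-02", "2015-06-03", "2015-06-04",
    "2015-06-05", "2015-06-06", "2015-06-07",
]))

day_of_week = toy_host_since.dt.dayofweek  # Monday=0 ... Sunday=6

# Counts per weekday, with any missing weekdays filled in as 0.
counts_by_day = day_of_week.value_counts().reindex(range(7), fill_value=0)

# Share of sign-ups landing on Saturday or Sunday.
weekend_share = (day_of_week >= 5).mean()
uniform_share = 2 / 7  # expected share if all days were equally likely

print(counts_by_day)
print(weekend_share, uniform_share)
```

A weekend share close to 2/7 is consistent with the flat bar chart; a large gap in either direction would suggest a real weekday effect.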

Finally, looking at the month-to-month picture, an expanding window mean shows that the Boston AirBnB market has averaged approximately 40 still-active listings per month over the course of its history in the city.

These days there are more like 60 or so of them per month.

In [None]:
join_dates.resample("M").sum().plot()
join_dates.resample("M").sum().expanding(min_periods=4).mean().plot()
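The gap between the all-time average and the recent rate comes from how the two windows differ: an expanding mean averages everything seen so far, while a short rolling mean only looks at the last few months. A minimal sketch on a made-up, steadily growing monthly series (`toy_monthly` is hypothetical, and the 3-month rolling window is my choice for illustration, not something used above):

```python
import pandas as pd

# Hypothetical monthly sign-up totals that grow over time.
toy_monthly = pd.Series(
    [10, 20, 30, 40, 50, 60],
    index=pd.date_range("2015-01-01", periods=6, freq="MS"),
)

# Average over the whole history so far (needs at least 4 months, as in the cell above).
expanding_mean = toy_monthly.expanding(min_periods=4).mean()

# Average over only the most recent 3 months.
recent_mean = toy_monthly.rolling(window=3).mean()

print(expanding_mean.iloc[-1], recent_mean.iloc[-1])
```

When the series is trending upward, the expanding mean lags behind the recent rolling mean, which is the same pattern as the roughly 40 versus 60 per month described above.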