# Market Analysis in Dublin

## Assignment
A new city manager for Airbnb has started in Dublin and wants to better understand:

- what guests are searching for in Dublin,
- which inquiries hosts tend to accept.

Based on the findings, the new city manager will try to boost the number and quality of hosts in Dublin to fit the demands from guests. The goal of this challenge is to analyze, understand, visualize, and communicate the demand / supply in the market. For example, you may want to look at the breakdown of start date day of the week, or number of nights, or room type that is searched for, and how many hosts accepted the reservation. In particular, we are interested in:

- what the gaps are between guest demand and host supply that the new city manager could plug to increase the number of bookings in Dublin,
- what other data would be useful to have to deepen the analysis and understanding.


## Data Description
### There are 2 datasets

- searches.tsv - Contains a row for each set of searches that a user does for Dublin
- contacts.tsv - Contains a row for every time that an assigned visitor makes an inquiry for a stay in a listing in Dublin

The searches dataset contains the following columns:

- ds - Date of the search
- id_user - Alphanumeric user_id
- ds_checkin - Date stamp of the check-in date of the search
- ds_checkout - Date stamp of the check-out date of the search
- n_searches - Number of searches in the search set
- n_nights - The number of nights the search was for
- n_guests_min - The minimum number of guests selected in a search set
- n_guests_max - The maximum number of guests selected in a search set
- origin_country - The country the search was from
- filter_price_min - The value of the lower bound of the price filter, if the user used it
- filter_price_max - The value of the upper bound of the price filter, if the user used it
- filter_room_types - The room types that the user filtered by, if the user used the room_types filter
- filter_neighborhoods - The neighborhoods that the user filtered by, if the user used the neighborhoods filter

The contacts dataset contains the following columns:


- id_guest - Alphanumeric user_id of the guest making the inquiry
- id_host - Alphanumeric user_id of the host of the listing to which the inquiry is made
- id_listing - Alphanumeric identifier for the listing to which the inquiry is made
- ts_contact_at - UTC timestamp of the moment the inquiry is made.
- ts_reply_at - UTC timestamp of the moment the host replies to the inquiry, if so
- ts_accepted_at - UTC timestamp of the moment the host accepts the inquiry, if so
- ts_booking_at - UTC timestamp of the moment the booking is made, if so
- ds_checkin - Date stamp of the check-in date of the inquiry
- ds_checkout - Date stamp of the check-out date of the inquiry
- n_guests - The number of guests the inquiry is for
- n_messages - The total number of messages that were sent around this inquiry
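
The assignment asks which inquiries hosts accept and suggests breaking results down by check-in day of the week. The following is a minimal sketch of that calculation on a made-up three-row sample, not on the real file. It assumes that an empty `ts_accepted_at` means the host did not accept, in line with the "if so" wording in the column description.

```python
import pandas as pd

# Hypothetical sample using two of the contacts columns described above
contacts = pd.DataFrame({
    "ts_accepted_at": ["2014-10-04 16:26:28.0", "", "2014-11-04 09:45:50.0"],
    "ds_checkin": ["2014-10-13", "2014-11-11", "2014-11-27"],
})

# Empty timestamps become NaT, which is read here as "not accepted" (assumption)
contacts["ts_accepted_at"] = pd.to_datetime(contacts["ts_accepted_at"], errors="coerce")
contacts["ds_checkin"] = pd.to_datetime(contacts["ds_checkin"])

contacts["accepted"] = contacts["ts_accepted_at"].notna()
contacts["checkin_day"] = contacts["ds_checkin"].dt.day_name()

# Share of inquiries accepted, per check-in weekday
rate_by_day = contacts.groupby("checkin_day")["accepted"].mean()
print(rate_by_day)
```

The same pattern extends to other cuts, such as grouping by `n_guests` or by the length of stay.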

## Practicalities
Analyze the provided data and answer the questions to the best of your abilities. Include the relevant tables/graphs/visualization to explain what you have learnt about the market. Make sure that the solution reflects your entire thought process including the preparation of data - it is more important how the code is structured rather than just the final result or plot. You are expected to spend no more than 3-6 hours on this project.

#### To download the dataset <a href="https://drive.google.com/drive/folders/1WPZZB7WlOZE_lwVWJoR_V5AAR888uT9P?usp=sharing"> Click here </a>

In [1]:
# Importing libraries

import pandas as pd

In [2]:
import numpy as np

In [3]:
!pip install matplotlib
import matplotlib.pyplot as plt



In [4]:
!pip install seaborn
import seaborn as sns



In [5]:
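# Loading the datasets.
# Note: '/t' is a literal slash followed by "t", not the tab escape '\t',
# so each row is read into a single column that is split manually below.
dataset_1 = pd.read_csv('contacts.tsv', sep='/t', engine='python')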

In [6]:
dataset_1.head()


Unnamed: 0,id_guest\tid_host\tid_listing\tts_contact_at\tts_reply_at\tts_accepted_at\tts_booking_at\tds_checkin\tds_checkout\tn_guests\tn_messages
0,000dfad9-459b-4f0b-8310-3d6ab34e4f57\t13bb24b8...
1,00197051-c6cb-4c3a-99e9-86615b819874\t46aa3897...
2,0027538e-aa9e-4a02-8979-b8397e5d4cba\t6bbb88ca...
3,0027538e-aa9e-4a02-8979-b8397e5d4cba\t8772bc85...
4,0027538e-aa9e-4a02-8979-b8397e5d4cba\tac162061...


In [7]:
dataset_1.shape

(7823, 1)

In [8]:
dataset_2 = pd.read_csv( 'searches.tsv' ,sep = '/t' , engine='python')

In [9]:
dataset_2.shape

(35737, 1)

In [10]:
dataset_2.head()

Unnamed: 0,ds\tid_user\tds_checkin\tds_checkout\tn_searches\tn_nights\tn_guests_min\tn_guests_max\torigin_country\tfilter_price_min\tfilter_price_max\tfilter_room_types\tfilter_neighborhoods
0,2014-10-01\t0000af0a-6f26-4233-9832-27efbfb361...
1,2014-10-01\t0000af0a-6f26-4233-9832-27efbfb361...
2,2014-10-01\t000cd9d3-e05b-4016-9e09-34a6f8ba2f...
3,2014-10-01\t000cd9d3-e05b-4016-9e09-34a6f8ba2f...
4,2014-10-01\t001c04f0-5a94-4ee0-bf5d-3591265256...
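
Both frames above have a single column because the separator was given as `'/t'` rather than `'\t'`. The sketch below shows the difference on a small in-memory sample; the sample rows are made up and stand in for the real files.

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for contacts.tsv
sample_tsv = (
    "id_guest\tid_host\tn_guests\n"
    "g1\th1\t2\n"
    "g2\th2\t1\n"
)

# '/t' is matched as the literal text "/t", which never occurs, so one column results
wrong = pd.read_csv(io.StringIO(sample_tsv), sep="/t", engine="python")

# '\t' is the tab escape, so the columns are parsed directly
right = pd.read_csv(io.StringIO(sample_tsv), sep="\t")

print(wrong.shape)
print(right.shape)
print(right.columns.tolist())
```

Reading the real files with `sep='\t'` would likely make the manual splitting step unnecessary and would keep the original column names.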


In [11]:
#Preprocessing

In [12]:
dataset_1.columns

Index(['id_guest\tid_host\tid_listing\tts_contact_at\tts_reply_at\tts_accepted_at\tts_booking_at\tds_checkin\tds_checkout\tn_guests\tn_messages'], dtype='object')

In [13]:
dataset_2.columns


Index(['ds\tid_user\tds_checkin\tds_checkout\tn_searches\tn_nights\tn_guests_min\tn_guests_max\torigin_country\tfilter_price_min\tfilter_price_max\tfilter_room_types\tfilter_neighborhoods'], dtype='object')

In [14]:
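# Split the single raw column on tabs into separate columns.
# Note: apart from 'id_guest', the new names carry a stray leading 't' (left over from '\t'),
# e.g. 'tid_host' holds the id_host field described above.
dataset_1[['id_guest', 'tid_host', 'tid_listing', 'tts_contact_at', 'tts_reply_at', 'tts_accepted_at', 'tts_booking_at', 'tds_checkin', 'tds_checkout', 'tn_guests', 'tn_messages']] = dataset_1['id_guest\tid_host\tid_listing\tts_contact_at\tts_reply_at\tts_accepted_at\tts_booking_at\tds_checkin\tds_checkout\tn_guests\tn_messages'].str.split('\t', expand=True)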


In [15]:
dataset_1.head(2)

Unnamed: 0,id_guest\tid_host\tid_listing\tts_contact_at\tts_reply_at\tts_accepted_at\tts_booking_at\tds_checkin\tds_checkout\tn_guests\tn_messages,id_guest,tid_host,tid_listing,tts_contact_at,tts_reply_at,tts_accepted_at,tts_booking_at,tds_checkin,tds_checkout,tn_guests,tn_messages
0,000dfad9-459b-4f0b-8310-3d6ab34e4f57\t13bb24b8...,000dfad9-459b-4f0b-8310-3d6ab34e4f57,13bb24b8-d432-43a2-9755-5ea11b43bb69,21d2b1a2-fdc3-4b4c-a1f0-0eaf0cc02370,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-13,2014-10-15,2,13
1,00197051-c6cb-4c3a-99e9-86615b819874\t46aa3897...,00197051-c6cb-4c3a-99e9-86615b819874,46aa3897-9c00-4d76-ac66-a307593d0675,fb5ed09a-9848-4f2c-b2ef-34deb62164fb,2014-11-04 09:10:03.0,2014-11-04 09:45:50.0,2014-11-04 09:45:50.0,2014-11-04 12:20:46.0,2014-11-27,2014-11-30,1,10
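
If the single-column frame is kept, the column names can be derived from the raw header instead of being typed by hand, which avoids the stray `t` prefix. This is a sketch on a made-up two-row frame; `raw_col`, `names`, and `parsed` are illustrative names, not part of the notebook.

```python
import pandas as pd

# Hypothetical single-column frame shaped like the one read above
raw_col = "id_guest\tid_host\tn_guests"
df = pd.DataFrame({raw_col: ["g1\th1\t2", "g2\th2\t1"]})

# The header itself contains the real field names, separated by tabs
names = raw_col.split("\t")

# Split every row on tabs and label the result with the recovered names
parsed = df[raw_col].str.split("\t", expand=True)
parsed.columns = names

print(parsed)
```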


In [16]:
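# Split the single raw column on tabs into separate columns.
# Note: apart from 'ds', the new names carry a stray leading 't' (left over from '\t'),
# and 'tfilter_neighbourhoods' holds the filter_neighborhoods field described above.
dataset_2[['ds', 'tid_user', 'tds_checkin', 'tds_checkout', 'tn_searches', 'tn_nights', 'tn_guests_min', 'tn_guests_max', 'torigin_country', 'tfilter_price_min', 'tfilter_price_max' , 'tfilter_room_types' , 'tfilter_neighbourhoods']] = dataset_2['ds\tid_user\tds_checkin\tds_checkout\tn_searches\tn_nights\tn_guests_min\tn_guests_max\torigin_country\tfilter_price_min\tfilter_price_max\tfilter_room_types\tfilter_neighborhoods'].str.split('\t', expand=True)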


In [17]:
dataset_2.head(2)

Unnamed: 0,ds\tid_user\tds_checkin\tds_checkout\tn_searches\tn_nights\tn_guests_min\tn_guests_max\torigin_country\tfilter_price_min\tfilter_price_max\tfilter_room_types\tfilter_neighborhoods,ds,tid_user,tds_checkin,tds_checkout,tn_searches,tn_nights,tn_guests_min,tn_guests_max,torigin_country,tfilter_price_min,tfilter_price_max,tfilter_room_types,tfilter_neighbourhoods
0,2014-10-01\t0000af0a-6f26-4233-9832-27efbfb361...,2014-10-01,0000af0a-6f26-4233-9832-27efbfb36148,2014-10-09,2014-10-12,16,3,2,2,IE,0,67,",Entire home/apt,Entire home/apt,Private room,...",
1,2014-10-01\t0000af0a-6f26-4233-9832-27efbfb361...,2014-10-01,0000af0a-6f26-4233-9832-27efbfb36148,2014-10-09,2014-10-19,3,10,1,2,IE,0,67,,


In [18]:
dataset_2.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 35737 entries, 0 to 35736
Data columns (total 14 columns):
 #   Column                                                                                                                                                                   Non-Null Count  Dtype 
---  ------                                                                                                                                                                   --------------  ----- 
 0   ds	id_user	ds_checkin	ds_checkout	n_searches	n_nights	n_guests_min	n_guests_max	origin_country	filter_price_min	filter_price_max	filter_room_types	filter_neighborhoods  35737 non-null  object
 1   ds                                                                                                                                                                       35737 non-null  object
 2   tid_user                                                                                          

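Because both datasets are needed for the analysis, let's merge them. With no `on` argument, `pd.merge` performs an inner join on the columns the two frames share, here tds_checkin and tds_checkout: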

In [19]:
new_df = pd.merge(dataset_1 , dataset_2)

In [20]:
new_df

Unnamed: 0,id_guest\tid_host\tid_listing\tts_contact_at\tts_reply_at\tts_accepted_at\tts_booking_at\tds_checkin\tds_checkout\tn_guests\tn_messages,id_guest,tid_host,tid_listing,tts_contact_at,tts_reply_at,tts_accepted_at,tts_booking_at,tds_checkin,tds_checkout,...,tid_user,tn_searches,tn_nights,tn_guests_min,tn_guests_max,torigin_country,tfilter_price_min,tfilter_price_max,tfilter_room_types,tfilter_neighbourhoods
0,000dfad9-459b-4f0b-8310-3d6ab34e4f57\t13bb24b8...,000dfad9-459b-4f0b-8310-3d6ab34e4f57,13bb24b8-d432-43a2-9755-5ea11b43bb69,21d2b1a2-fdc3-4b4c-a1f0-0eaf0cc02370,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-13,2014-10-15,...,111aff61-7560-4281-8529-f64859308bb0,4,2,2,2,DE,,,,
1,000dfad9-459b-4f0b-8310-3d6ab34e4f57\t13bb24b8...,000dfad9-459b-4f0b-8310-3d6ab34e4f57,13bb24b8-d432-43a2-9755-5ea11b43bb69,21d2b1a2-fdc3-4b4c-a1f0-0eaf0cc02370,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-13,2014-10-15,...,40984cb3-fd3b-40d8-ab2b-14151bac010d,8,2,5,5,GB,,,,
2,000dfad9-459b-4f0b-8310-3d6ab34e4f57\t13bb24b8...,000dfad9-459b-4f0b-8310-3d6ab34e4f57,13bb24b8-d432-43a2-9755-5ea11b43bb69,21d2b1a2-fdc3-4b4c-a1f0-0eaf0cc02370,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-13,2014-10-15,...,5bd43bd9-3be5-443b-b059-6c6a9b7da553,6,2,1,2,IE,0,130,Private room,",City Centre"
3,000dfad9-459b-4f0b-8310-3d6ab34e4f57\t13bb24b8...,000dfad9-459b-4f0b-8310-3d6ab34e4f57,13bb24b8-d432-43a2-9755-5ea11b43bb69,21d2b1a2-fdc3-4b4c-a1f0-0eaf0cc02370,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-13,2014-10-15,...,9248e8de-75b8-41d6-b077-02a246cdb2ca,6,2,1,2,ES,,,Entire home/apt,
4,000dfad9-459b-4f0b-8310-3d6ab34e4f57\t13bb24b8...,000dfad9-459b-4f0b-8310-3d6ab34e4f57,13bb24b8-d432-43a2-9755-5ea11b43bb69,21d2b1a2-fdc3-4b4c-a1f0-0eaf0cc02370,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-13,2014-10-15,...,990ea921-07b4-48f3-824f-a677c8b08f96,12,2,2,2,BE,0,143,",Entire home/apt",
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
677440,fffea166-9432-43a7-8b1b-09d6f30c1c07\t6d656267...,fffea166-9432-43a7-8b1b-09d6f30c1c07,6d656267-642e-4972-bdec-a35d82b84ebb,90dddef6-23ef-4df3-b454-8fd3d0e8cade,2014-10-08 00:05:05.0,2014-10-12 20:58:12.0,,,2014-11-11,2014-11-18,...,e9606794-85f6-4478-ab6e-1d4f78be6237,2,7,2,4,IT,0,85,,
677441,fffea166-9432-43a7-8b1b-09d6f30c1c07\t6d656267...,fffea166-9432-43a7-8b1b-09d6f30c1c07,6d656267-642e-4972-bdec-a35d82b84ebb,90dddef6-23ef-4df3-b454-8fd3d0e8cade,2014-10-08 00:05:05.0,2014-10-12 20:58:12.0,,,2014-11-11,2014-11-18,...,928b9319-542a-4cc2-af04-c63df3d5fb0f,7,7,1,1,US,0,55,,
677442,fffea166-9432-43a7-8b1b-09d6f30c1c07\t6d656267...,fffea166-9432-43a7-8b1b-09d6f30c1c07,6d656267-642e-4972-bdec-a35d82b84ebb,90dddef6-23ef-4df3-b454-8fd3d0e8cade,2014-10-08 00:05:05.0,2014-10-12 20:58:12.0,,,2014-11-11,2014-11-18,...,b964af02-2301-4131-936c-5224b681069b,34,7,1,1,IT,0,850,",Entire home/apt",
677443,fffea166-9432-43a7-8b1b-09d6f30c1c07\t6d656267...,fffea166-9432-43a7-8b1b-09d6f30c1c07,6d656267-642e-4972-bdec-a35d82b84ebb,90dddef6-23ef-4df3-b454-8fd3d0e8cade,2014-10-08 00:05:05.0,2014-10-12 20:58:12.0,,,2014-11-11,2014-11-18,...,fffea166-9432-43a7-8b1b-09d6f30c1c07,3,7,2,2,AR,0,34,,


In [21]:
new_df = new_df.drop('id_guest\tid_host\tid_listing\tts_contact_at\tts_reply_at\tts_accepted_at\tts_booking_at\tds_checkin\tds_checkout\tn_guests\tn_messages',axis=1)

In [22]:
new_df = new_df.drop('ds\tid_user\tds_checkin\tds_checkout\tn_searches\tn_nights\tn_guests_min\tn_guests_max\torigin_country\tfilter_price_min\tfilter_price_max\tfilter_room_types\tfilter_neighborhoods',axis=1)

In [23]:
new_df.shape

(677445, 22)
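
Since `pd.merge` without `on` joins on shared columns (the check-in and check-out dates here), each inquiry is paired with every search for the same dates, which is how 7,823 contacts grow to 677,445 rows. The sketch below reproduces the effect on made-up IDs and contrasts it with a join on the user ID. Treating `id_guest` and `id_user` as the same ID space is an assumption, not something the data description states.

```python
import pandas as pd

# Hypothetical samples: two inquiries and three searches for the same dates
contacts = pd.DataFrame({
    "id_guest": ["g1", "g2"],
    "ds_checkin": ["2014-10-13", "2014-10-13"],
    "ds_checkout": ["2014-10-15", "2014-10-15"],
})
searches = pd.DataFrame({
    "id_user": ["u1", "u2", "g1"],
    "ds_checkin": ["2014-10-13"] * 3,
    "ds_checkout": ["2014-10-15"] * 3,
})

# Default merge: inner join on the shared date columns, every pair is kept
by_dates = pd.merge(contacts, searches)
print(len(by_dates))

# Alternative (assumption): link an inquiry to searches made by the same user
by_user = contacts.merge(
    searches,
    left_on="id_guest",
    right_on="id_user",
    suffixes=("_contact", "_search"),
)
print(len(by_user))
```

Which join is appropriate depends on the question: the date join describes overall demand for a stay window, while the user join follows one guest from search to inquiry.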

In [24]:
new_df.head(2)

Unnamed: 0,id_guest,tid_host,tid_listing,tts_contact_at,tts_reply_at,tts_accepted_at,tts_booking_at,tds_checkin,tds_checkout,tn_guests,...,tid_user,tn_searches,tn_nights,tn_guests_min,tn_guests_max,torigin_country,tfilter_price_min,tfilter_price_max,tfilter_room_types,tfilter_neighbourhoods
0,000dfad9-459b-4f0b-8310-3d6ab34e4f57,13bb24b8-d432-43a2-9755-5ea11b43bb69,21d2b1a2-fdc3-4b4c-a1f0-0eaf0cc02370,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-13,2014-10-15,2,...,111aff61-7560-4281-8529-f64859308bb0,4,2,2,2,DE,,,,
1,000dfad9-459b-4f0b-8310-3d6ab34e4f57,13bb24b8-d432-43a2-9755-5ea11b43bb69,21d2b1a2-fdc3-4b4c-a1f0-0eaf0cc02370,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-13,2014-10-15,2,...,40984cb3-fd3b-40d8-ab2b-14151bac010d,8,2,5,5,GB,,,,


In [25]:
new_df.isnull().sum()

id_guest                       0
tid_host                       0
tid_listing                    0
tts_contact_at                 0
tts_reply_at                   0
tts_accepted_at                0
tts_booking_at                 0
tds_checkin                    0
tds_checkout                   0
tn_guests                      0
tn_messages                    0
ds                             0
tid_user                       0
tn_searches                    0
tn_nights                      0
tn_guests_min                  0
tn_guests_max                  0
torigin_country                0
tfilter_price_min              0
tfilter_price_max              0
tfilter_room_types        295160
tfilter_neighbourhoods    641201
dtype: int64

In [26]:
# The last two columns contain nulls; let's handle them first
new_df['tfilter_room_types'].info()

<class 'pandas.core.series.Series'>
RangeIndex: 677445 entries, 0 to 677444
Series name: tfilter_room_types
Non-Null Count   Dtype 
--------------   ----- 
382285 non-null  object
dtypes: object(1)
memory usage: 5.2+ MB


In [27]:
new_df['tfilter_room_types'].unique()

array([None, 'Private room', 'Entire home/apt', ',Entire home/apt',
       ',Private room', ',Entire home/apt,Private room,Private room',
       'Entire home/apt,Private room', ',Entire home/apt,Private room',
       '', ',Entire home/apt,Entire home/apt,Private room',
       ',Entire home/apt,Private room,Private room,Shared room,Shared room',
       'Entire home/apt,Shared room,Entire home/apt',
       'Entire home/apt,Entire home/apt,Private room',
       ',Entire home/apt,Private room,Private room,Private room,Shared room',
       ',Entire home/apt,Entire home/apt,Private room,Private room',
       ',Private room,Private room,Shared room',
       ',Private room,Shared room',
       ',Entire home/apt,Entire home/apt,Shared room',
       'Private room,Shared room',
       ',Entire home/apt,Private room,Entire home/apt',
       ',Entire home/apt,Private room,Shared room,Shared room,Private room',
       'Entire home/apt,Private room,Entire home/apt',
       ',Entire home/apt,Private r
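
The unique values above differ mostly by a leading comma, repeated entries, and ordering. One way to collapse them is to reduce each string to its sorted set of room types, as sketched here. `normalize_filter` is a hypothetical helper, and keeping empty or missing values as "no filter" rather than imputing them is an assumption.

```python
import pandas as pd


def normalize_filter(value):
    """Return the sorted unique entries joined by ', ', or None when no filter is set."""
    if pd.isna(value):
        return None
    parts = sorted({p.strip() for p in value.split(",") if p.strip()})
    return ", ".join(parts) if parts else None


# Sample values copied from the unique() output above
raw = pd.Series([
    None,
    ",Entire home/apt",
    "Entire home/apt",
    ",Entire home/apt,Private room,Private room",
    "",
])

clean = raw.map(normalize_filter)
print(clean.value_counts(dropna=False))
```

After this step, `',Entire home/apt'` and `'Entire home/apt'` count as the same category, so the value counts reflect room types rather than string variants.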

In [36]:
# Replace None with np.nan
new_df = new_df.replace({None: np.nan})

In [37]:
new_df['tfilter_room_types'].value_counts()

tfilter_room_types
,Entire home/apt                                                                                                                           144006
Entire home/apt                                                                                                                             77945
,Private room                                                                                                                               38820
Private room                                                                                                                                24373
,Entire home/apt,Entire home/apt,Private room                                                                                               12736
                                                                                                                                            ...  
Entire home/apt,Private room,Entire home/apt,Private room,Shared room,Private room                       

In [41]:
new_df['tfilter_room_types'].mode()[0]

In [44]:
# Assign the result back instead of using inplace=True on a column selection
new_df['tfilter_room_types'] = new_df['tfilter_room_types'].fillna(new_df['tfilter_room_types'].mode()[0])


In [46]:
new_df['tfilter_room_types'].isnull().sum()

0

In [47]:
new_df['tfilter_neighbourhoods'].info()

<class 'pandas.core.series.Series'>
RangeIndex: 677445 entries, 0 to 677444
Series name: tfilter_neighbourhoods
Non-Null Count  Dtype 
--------------  ----- 
36244 non-null  object
dtypes: object(1)
memory usage: 5.2+ MB


In [48]:
new_df['tfilter_neighbourhoods'].unique()

array([nan, ',City Centre', ',City Centre,Old City,Temple Bar',
       ',City Centre,Temple Bar', ',Temple Bar', ',City Centre,Old City',
       'City Centre', ',Temple Bar,Temple Bar,City Centre',
       ',City Centre,Docklands', ',City Centre,Trinity College',
       ",City Centre,City Centre,North City Central/O'Connell Street",
       ',Docklands,Ringsend/Irishtown', ',Ranelagh and Rathmines',
       ",City Centre,City Centre,Old City,North City Central/O'Connell Street",
       ',City Centre,Monkstown', ',City Centre,Clontarf',
       ',City Centre,Kilmainham,Marino,Temple Bar',
       ",City Centre,Drumcondra,North City Central/O'Connell Street,City Centre,North City Central/O'Connell Street,Drumcondra,City Centre,Temple Bar",
       ',Ballsbridge,Ringsend/Irishtown,Sandymount',
       ",City Centre,City Centre,Drumcondra,Docklands,City Centre,North City Central/O'Connell Street",
       ',Drumcondra,Beaumont,Clontarf,Glasnevin', ',Ballymun',
       ',City Centre,City Centre,Fing

In [49]:
# Assign the result back instead of using inplace=True on a column selection
new_df['tfilter_neighbourhoods'] = new_df['tfilter_neighbourhoods'].fillna(new_df['tfilter_neighbourhoods'].mode()[0])


In [50]:
new_df['tfilter_neighbourhoods'].isnull().sum()

0
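
Only 36,244 of the 677,445 rows had a neighbourhood filter, so filling the rest with the mode makes nearly every search look as if it targeted that one neighbourhood. A possible alternative, sketched on made-up rows, is to record whether the filter was used and label the gaps explicitly. The `"No filter"` label and the `used_neighborhood_filter` column are illustrative choices.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: half of the searches used the neighbourhood filter
df = pd.DataFrame({
    "filter_neighborhoods": [np.nan, ",City Centre", np.nan, ",Temple Bar"],
})

# Keep the information that the filter was not used before filling the gaps
df["used_neighborhood_filter"] = df["filter_neighborhoods"].notna()
df["filter_neighborhoods"] = df["filter_neighborhoods"].fillna("No filter")

# Share of searches that used the filter at all
share = df["used_neighborhood_filter"].mean()
print(share)
```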

In [None]:
# Some columns store the literal string 'NULL' as a value; let's replace it

In [65]:
new_df['tfilter_price_max'].dtypes

dtype('O')

In [66]:
new_df['tfilter_price_max'].mode()[0]

'NULL'

In [68]:
new_df['tfilter_price_max'].value_counts()

tfilter_price_max
NULL          339370
1073741823     28361
130             4889
100             4591
67              4119
               ...  
829                1
1095               1
958                1
1008               1
977                1
Name: count, Length: 861, dtype: int64

In [80]:
# Assign the result back instead of using inplace=True on a column selection
new_df['tfilter_price_max'] = new_df['tfilter_price_max'].replace('NULL', '1073741823')


In [74]:
new_df['tfilter_price_min'].dtypes

dtype('O')

In [75]:
new_df['tfilter_price_min'].value_counts()

tfilter_price_min
NULL    339370
0       322194
23         785
29         559
48         487
         ...  
548          1
137          1
323          1
274          1
151          1
Name: count, Length: 176, dtype: int64

In [81]:
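# Caution: '322194' is the row count of the value '0' in the output above, not a price,
# so this fill puts an implausible minimum price on every search without a price filter.
new_df['tfilter_price_min'] = new_df['tfilter_price_min'].replace('NULL', '322194')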

Analysis:

In [82]:
new_df.head(2)

Unnamed: 0,id_guest,tid_host,tid_listing,tts_contact_at,tts_reply_at,tts_accepted_at,tts_booking_at,tds_checkin,tds_checkout,tn_guests,...,tid_user,tn_searches,tn_nights,tn_guests_min,tn_guests_max,torigin_country,tfilter_price_min,tfilter_price_max,tfilter_room_types,tfilter_neighbourhoods
0,000dfad9-459b-4f0b-8310-3d6ab34e4f57,13bb24b8-d432-43a2-9755-5ea11b43bb69,21d2b1a2-fdc3-4b4c-a1f0-0eaf0cc02370,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-13,2014-10-15,2,...,111aff61-7560-4281-8529-f64859308bb0,4,2,2,2,DE,322194,1073741823,",Entire home/apt",",City Centre"
1,000dfad9-459b-4f0b-8310-3d6ab34e4f57,13bb24b8-d432-43a2-9755-5ea11b43bb69,21d2b1a2-fdc3-4b4c-a1f0-0eaf0cc02370,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-04 16:26:28.0,2014-10-13,2014-10-15,2,...,40984cb3-fd3b-40d8-ab2b-14151bac010d,8,2,5,5,GB,322194,1073741823,",Entire home/apt",",City Centre"


In [52]:
# Most and least frequently searched room types
a = new_df['tfilter_room_types'].value_counts()

tfilter_room_types
,Entire home/apt                                                                                                                           439166
Entire home/apt                                                                                                                             77945
,Private room                                                                                                                               38820
Private room                                                                                                                                24373
,Entire home/apt,Entire home/apt,Private room                                                                                               12736
                                                                                                                                            ...  
Entire home/apt,Private room,Entire home/apt,Private room,Shared room,Private room                       

In [83]:
new_df['tfilter_price_max'].value_counts()

tfilter_price_max
1073741823    367731
130             4889
100             4591
67              4119
80              4082
               ...  
625                1
790                1
1035               1
885                1
977                1
Name: count, Length: 860, dtype: int64

In [84]:
new_df['tfilter_price_min'].value_counts()

tfilter_price_min
322194    339370
0         322194
23           785
29           559
48           487
           ...  
548            1
137            1
323            1
274            1
151            1
Name: count, Length: 176, dtype: int64
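
The price columns are still strings, and the fill values dominate the counts above. A sketch of an alternative treatment follows: convert to numbers, let `'NULL'` become missing, and treat 1073741823 (2**30 - 1) as "no maximum". That reading of the sentinel is an assumption based on its frequency in the data, not something the data description states.

```python
import numpy as np
import pandas as pd

# Value seen in the data; assumed to mean that no upper price bound was set
NO_UPPER_BOUND = 1073741823

# Hypothetical sample using the original column names
prices = pd.DataFrame({
    "filter_price_min": ["NULL", "0", "23"],
    "filter_price_max": ["NULL", "1073741823", "130"],
})

# 'NULL' cannot be parsed as a number, so errors="coerce" turns it into NaN
for col in ["filter_price_min", "filter_price_max"]:
    prices[col] = pd.to_numeric(prices[col], errors="coerce")

# Drop the sentinel so it does not distort means, medians, or plots
prices["filter_price_max"] = prices["filter_price_max"].replace(NO_UPPER_BOUND, np.nan)

print(prices)
print(prices["filter_price_max"].median())
```

With missing values kept as NaN, summary statistics describe only the searches that actually set a price bound.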