# Marriott Hotel Category Change 2020 Analysis

In [2]:
# import pandas
import pandas as pd

In [3]:
# read data from csv
df = pd.read_csv('marriott-category-changes-2020.csv')
df.head()

Unnamed: 0,Hotel,Brand,Destination,Current Category,Current Standard Price,New Category,New Standard Price
0,Aberdeen Marriott Hotel,Marriott,United Kingdom,3,17500,2,12500
1,AC Hotel Aitana,AC Hotels by Marriott,Spain,3,17500,4,25000
2,AC Hotel Alcala de Henares,AC Hotels by Marriott,Spain,1,7500,2,12500
3,AC Hotel Almeria,AC Hotels by Marriott,Spain,1,7500,2,12500
4,AC Hotel Aravaca,AC Hotels by Marriott,Spain,1,7500,2,12500


## Question 1
Describe the data type of each feature/column, e.g., feature xxx is a string, feature yyy is a float, etc.

In [4]:
df.dtypes

Hotel                     object
Brand                     object
Destination               object
Current Category           int64
Current Standard Price     int64
New Category               int64
New Standard Price         int64
dtype: object

* The Hotel feature's data type is string (pandas reports it as object)
* The Brand feature's data type is string (pandas reports it as object)
* The Destination feature's data type is string (pandas reports it as object)
* The Current Category feature's data type is integer (int64)
* The Current Standard Price feature's data type is integer (int64)
* The New Category feature's data type is integer (int64)
* The New Standard Price feature's data type is integer (int64)
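
The split between text and integer columns can also be checked in code rather than read off the `dtypes` output. The sketch below is only an illustration: it rebuilds two rows shown above as a small DataFrame (the name `sample` is an assumption, not part of the notebook) and groups the columns with `select_dtypes`.

```python
import pandas as pd

# Two rows copied from the df.head() output above; the notebook itself reads the full CSV.
sample = pd.DataFrame({
    "Hotel": ["Aberdeen Marriott Hotel", "AC Hotel Aitana"],
    "Brand": ["Marriott", "AC Hotels by Marriott"],
    "Destination": ["United Kingdom", "Spain"],
    "Current Category": [3, 3],
    "Current Standard Price": [17500, 17500],
    "New Category": [2, 4],
    "New Standard Price": [12500, 25000],
})

# Group the columns by kind instead of reading the dtypes by eye.
text_cols = sample.select_dtypes(exclude="number").columns.tolist()
int_cols = sample.select_dtypes(include="number").columns.tolist()

# Confirm that every value in the text columns really is a Python str.
all_str = all(sample[c].map(lambda v: isinstance(v, str)).all() for c in text_cols)

print(text_cols)
print(int_cols)
print(all_str)
```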

## Question 2
- How many hotels are in this dataset?
- The hotels are from how many unique brands?
- Which destination/country has the most hotels listed in this dataset? List the total number of hotels in that country
- How many brands in China have hotel category changes?

In [5]:
df.shape

(2185, 7)

In [6]:
# drop_duplicates() returns a new DataFrame; df itself is not modified
df.drop_duplicates()

Unnamed: 0,Hotel,Brand,Destination,Current Category,Current Standard Price,New Category,New Standard Price
0,Aberdeen Marriott Hotel,Marriott,United Kingdom,3,17500,2,12500
1,AC Hotel Aitana,AC Hotels by Marriott,Spain,3,17500,4,25000
2,AC Hotel Alcala de Henares,AC Hotels by Marriott,Spain,1,7500,2,12500
3,AC Hotel Almeria,AC Hotels by Marriott,Spain,1,7500,2,12500
4,AC Hotel Aravaca,AC Hotels by Marriott,Spain,1,7500,2,12500
...,...,...,...,...,...,...,...
2180,Xiamen Marriott Hotel & Conference Centre,Marriott,China,2,12500,3,17500
2181,Xiangshui Bay Marriott Resort & Spa,Marriott,China,3,17500,2,12500
2182,Yogyakarta Marriott Hotel,Marriott,Indonesia,2,12500,3,17500
2183,Zhejiang Taizhou Marriott Hotel,Marriott,China,2,12500,1,7500


In [7]:
df.shape

(2185, 7)

There are 2185 hotels in this dataset.
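
The row count only equals the hotel count if no row is repeated. A minimal sketch of a more direct check follows; the toy frame and its deliberately repeated third row are assumptions made for illustration, not data from the CSV.

```python
import pandas as pd

# Toy frame: the third row is a deliberate, hypothetical duplicate of the first.
sample = pd.DataFrame({
    "Hotel": ["AC Hotel Almeria", "AC Hotel Aravaca", "AC Hotel Almeria"],
    "Brand": ["AC Hotels by Marriott"] * 3,
})

n_rows = len(sample)                       # rows, including any repeats
n_dupes = int(sample.duplicated().sum())   # rows that repeat an earlier row
n_hotels = sample["Hotel"].nunique()       # distinct hotel names

# drop_duplicates() returns a new frame, so the result must be kept in a variable.
deduped = sample.drop_duplicates()

print(n_rows, n_dupes, n_hotels, len(deduped))
```

On the real data, `n_dupes` being 0 and `n_hotels` matching the row count would support reading each row as one hotel.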

In [5]:
df.Brand.nunique()

30

The hotels come from 30 unique brands.

In [38]:
df.Destination.value_counts().sort_values(ascending=False)

USA          1548
Canada         76
China          68
Spain          34
India          26
             ... 
Bahrain         1
Bolivia         1
Uganda          1
Hong Kong       1
Qatar           1
Name: Destination, Length: 98, dtype: int64

The USA has the most hotels listed in this dataset, with 1548.
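
Instead of reading the top entry off the printed counts, the country and its total can be pulled out directly. This is a sketch on a made-up `destinations` Series (the values are assumptions for illustration); on the notebook's data the same calls would be applied to `df.Destination`.

```python
import pandas as pd

# Hypothetical stand-in for df.Destination.
destinations = pd.Series(
    ["USA", "USA", "USA", "Canada", "China", "China"], name="Destination"
)

counts = destinations.value_counts()   # sorted from most to least frequent
top_country = counts.idxmax()          # label with the highest count
top_count = int(counts.max())          # the count itself

print(top_country, top_count)
```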

In [7]:
# keep hotels in China whose category actually changes
df2 = df[df.Destination == 'China']
df2 = df2[df2['New Category'] != df2['Current Category']]
# count the distinct brands among them
df2.Brand.nunique()

14

14 brands in China have hotel category changes.
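
The two filtering steps can also be combined into one boolean mask. The sketch below uses a few China and Spain rows from the outputs in this notebook plus one hypothetical unchanged row ("Example Brand") added only to show that the mask excludes it.

```python
import pandas as pd

sample = pd.DataFrame({
    "Brand": ["Aloft", "Courtyard", "Courtyard", "JW Marriott",
              "AC Hotels by Marriott", "Example Brand"],  # last brand is hypothetical
    "Destination": ["China", "China", "China", "China", "Spain", "China"],
    "Current Category": [2, 2, 2, 3, 3, 5],
    "New Category": [3, 3, 3, 4, 4, 5],
})

in_china = sample["Destination"] == "China"
changed = sample["New Category"] != sample["Current Category"]

# Combine both conditions with &, then count the distinct brands that remain.
n_brands = sample.loc[in_china & changed, "Brand"].nunique()
print(n_brands)
```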

## Question 3
- What's the percentage of hotels worldwide with category upgrade in 2020?

In [8]:
# hotels whose new category is higher than the current one
df3 = df[df['New Category'] > df['Current Category']]
df3.shape

(1686, 7)

In [9]:
percent_changed = round(df3.shape[0]/df.shape[0]*100,2)
print(percent_changed)

77.16


77.16% of hotels worldwide (1686 of 2185) have a category upgrade in 2020.
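
The same percentage can be computed without an intermediate DataFrame, because the mean of a boolean mask is the share of `True` values. A small sketch on the five rows shown by `df.head()` (one downgrade and four upgrades):

```python
import pandas as pd

# Current and new categories of the first five hotels in the dataset.
current = pd.Series([3, 3, 1, 1, 1])
new = pd.Series([2, 4, 2, 2, 2])

is_upgrade = new > current                        # True where the category goes up
pct_upgraded = round(is_upgrade.mean() * 100, 2)  # share of True values, in percent

print(pct_upgraded)
```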

## Question 4
- List hotels with category changes greater than 1 if any, such as changing from category 3 to 5 or from category 7 to 4
- List all JW Marriott hotels in China that have a category upgrade

In [10]:
# absolute size of the category change, regardless of direction
df['Change'] = abs(df['New Category'] - df['Current Category'])
df.head()

Unnamed: 0,Hotel,Brand,Destination,Current Category,Current Standard Price,New Category,New Standard Price,Change
0,Aberdeen Marriott Hotel,Marriott,United Kingdom,3,17500,2,12500,1
1,AC Hotel Aitana,AC Hotels by Marriott,Spain,3,17500,4,25000,1
2,AC Hotel Alcala de Henares,AC Hotels by Marriott,Spain,1,7500,2,12500,1
3,AC Hotel Almeria,AC Hotels by Marriott,Spain,1,7500,2,12500,1
4,AC Hotel Aravaca,AC Hotels by Marriott,Spain,1,7500,2,12500,1


In [11]:
df4 = df[df.Change >= 2]
df4.head()

Unnamed: 0,Hotel,Brand,Destination,Current Category,Current Standard Price,New Category,New Standard Price,Change
913,"Four Points by Sheraton Bali, Ungasan",Four Points,Indonesia,4,25000,2,12500,2


The Four Points by Sheraton Bali, Ungasan in Indonesia is the only hotel with a category change greater than 1; it falls from Category 4 to Category 2.
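
Because `Change` stores an absolute value, the direction of a move is lost. A possible variant, sketched here with a hypothetical `Signed Change` column on three rows from this notebook, keeps the sign and applies `abs()` only when filtering.

```python
import pandas as pd

sample = pd.DataFrame({
    "Hotel": ["Aberdeen Marriott Hotel", "AC Hotel Aitana",
              "Four Points by Sheraton Bali, Ungasan"],
    "Current Category": [3, 3, 4],
    "New Category": [2, 4, 2],
})

# Negative values are downgrades, positive values are upgrades.
sample["Signed Change"] = sample["New Category"] - sample["Current Category"]

# Keep moves of more than one category in either direction.
big_moves = sample[sample["Signed Change"].abs() > 1]
print(big_moves[["Hotel", "Signed Change"]])
```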

In [12]:
# JW Marriott hotels in China whose category goes up
df5 = df[df.Destination == 'China']
df5 = df5[df5['Brand'] == 'JW Marriott']
df5 = df5[df5['New Category'] > df5['Current Category']]
df5.head()

Unnamed: 0,Hotel,Brand,Destination,Current Category,Current Standard Price,New Category,New Standard Price,Change
1074,JW Marriott Hotel Chengdu,JW Marriott,China,3,17500,4,25000,1
1078,JW Marriott Hotel Shenzhen,JW Marriott,China,3,17500,4,25000,1
1079,JW Marriott Hotel Shenzhen Bao'an,JW Marriott,China,3,17500,4,25000,1
1083,JW Marriott Hotel Zhengzhou,JW Marriott,China,2,12500,3,17500,1


In [13]:
df5.shape

(4, 8)

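Four JW Marriott hotels in China have a category upgrade: JW Marriott Hotel Chengdu, JW Marriott Hotel Shenzhen, JW Marriott Hotel Shenzhen Bao'an, and JW Marriott Hotel Zhengzhou.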
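
The three chained filters can be written as a single mask that returns just the hotel names. This is a sketch on four China rows taken from the outputs in this notebook; the variable names are assumptions.

```python
import pandas as pd

sample = pd.DataFrame({
    "Hotel": ["JW Marriott Hotel Chengdu", "JW Marriott Hotel Zhengzhou",
              "Xiangshui Bay Marriott Resort & Spa", "Aloft Guangzhou Tianhe"],
    "Brand": ["JW Marriott", "JW Marriott", "Marriott", "Aloft"],
    "Destination": ["China"] * 4,
    "Current Category": [3, 2, 3, 2],
    "New Category": [4, 3, 2, 3],
})

# All three conditions must hold at once.
mask = (
    (sample["Destination"] == "China")
    & (sample["Brand"] == "JW Marriott")
    & (sample["New Category"] > sample["Current Category"])
)
upgraded_jw = sample.loc[mask, "Hotel"].tolist()
print(upgraded_jw)
```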

## Question 5
Assume it is February 2020 and the category changes will take effect on March 4, 2020. You are planning a trip to Florence, Italy and Hong Kong, China in April. You only stay in category 8 hotels (existing category 8 or future category 8) and want to optimize your point spending. Based on the data, which hotels should you book? When should you book your hotels for Florence and Hong Kong? Why?

In [14]:
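# Destination holds countries, so filter by country first;
# W Hong Kong is listed under China in this dataset
df6 = df[df.Destination == 'Italy']
df7 = df[df.Destination == 'China']
df6.head()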

Unnamed: 0,Hotel,Brand,Destination,Current Category,Current Standard Price,New Category,New Standard Price,Change
11,AC Hotel Brescia,AC Hotels by Marriott,Italy,1,7500,2,12500,1
51,AC Hotel Venezia,AC Hotels by Marriott,Italy,5,35000,6,50000,1
52,AC Hotel Vicenza,AC Hotels by Marriott,Italy,1,7500,2,12500,1
145,"Cervo Hotel, Costa Smeralda Resort",Sheraton,Italy,7,60000,8,85000,1
519,"Cristallo, a Luxury Collection Resort & Spa, C...",Luxury Collection,Italy,8,85000,7,60000,1


In [15]:
df7.head()

Unnamed: 0,Hotel,Brand,Destination,Current Category,Current Standard Price,New Category,New Standard Price,Change
82,Aloft Guangzhou Tianhe,Aloft,China,2,12500,3,17500,1
83,Aloft Guangzhou University Park,Aloft,China,1,7500,2,12500,1
469,Courtyard Shanghai-Pudong,Courtyard,China,2,12500,3,17500,1
470,Courtyard Shenzhen Bay,Courtyard,China,2,12500,3,17500,1
484,Courtyard Suzhou,Courtyard,China,2,12500,3,17500,1


In [23]:
df8= pd.concat([df6,df7])

In [34]:
df9 = df8[df8['New Category']== 8]
df10 = df8[df8['Current Category']== 8]
df11 = pd.concat([df9,df10])

In [35]:
df11.head()

Unnamed: 0,Hotel,Brand,Destination,Current Category,Current Standard Price,New Category,New Standard Price,Change
145,"Cervo Hotel, Costa Smeralda Resort",Sheraton,Italy,7,60000,8,85000,1
1968,"The Westin Excelsior, Florence",Westin,Italy,7,60000,8,85000,1
519,"Cristallo, a Luxury Collection Resort & Spa, C...",Luxury Collection,Italy,8,85000,7,60000,1
2165,W Hong Kong,W Hotels,China,8,85000,7,60000,1


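Of the four hotels above, only The Westin Excelsior, Florence is in Florence and only the W Hong Kong is in Hong Kong. To spend the fewest points, I should book The Westin Excelsior, Florence before the categories change on March 4th, while it is still Category 7 at 60000 points; after that date it becomes Category 8 at 85000 points. I should wait until March 4th or later to book the W Hong Kong, which drops from Category 8 (85000 points) to Category 7 (60000 points). This assumes a booking is charged the standard price in effect on the day it is made.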
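
The timing of each booking can be checked by comparing the two standard prices per hotel. The sketch below uses the two rows from the table above and rests on one assumption that the dataset does not state: a booking is charged the standard price in effect on the day it is made. The column names `Best Price` and `Book` are hypothetical.

```python
import numpy as np
import pandas as pd

# The Florence and Hong Kong candidates from the table above.
stays = pd.DataFrame({
    "Hotel": ["The Westin Excelsior, Florence", "W Hong Kong"],
    "Current Standard Price": [60000, 85000],
    "New Standard Price": [85000, 60000],
})

# Assumption: the price is fixed by the booking date, not the stay date.
stays["Best Price"] = stays[["Current Standard Price", "New Standard Price"]].min(axis=1)
stays["Book"] = np.where(
    stays["Current Standard Price"] <= stays["New Standard Price"],
    "before March 4",
    "on or after March 4",
)

total_points = int(stays["Best Price"].sum())
print(stays[["Hotel", "Book", "Best Price"]])
print(total_points)
```

Under that assumption, each stay costs 60000 points if booked in the cheaper window, compared with 85000 points in the other one.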