# Coursera Capstone Report

### Accommodation in Copenhagen, Denmark - Potential Areas for a New Hotel

## Table of contents
* [Introduction: Business Problem](#Introduction)
* [Data](#Data)
* [Methodology](#Methodology)
* [Analysis](#Analysis)
* [Results](#Results)
* [Conclusion](#Conclusion)

# Introduction

As part of the IBM Applied Data Science Capstone course on Coursera, this data science project uses data to __identify the optimal location for a new tourist accommodation service (hotel) in Copenhagen, Denmark__. To find the optimal location, the project looks into the following parameters:
- __placement__ (i.e. few nearby competitors and a short distance to the city center)
- __strategy__ (i.e. pricing, unique selling point etc.)  

As such, this report is highly relevant for aspiring start-ups in the accommodation sector in the capital of Denmark. 

The background for this project is that I live in Copenhagen, where peer-to-peer accommodation services (primarily AirBnB) have become an increasingly popular way for tourists to visit the city. I use AirBnB myself to host tourists, but find that the supply of accommodation is heavily skewed across the city, with some areas having few or no offers. It therefore seems likely that there is unexploited potential for new businesses to grow in a city that is currently renewing several neighbourhoods.



# Data

For this project, the following data sources will be used:

- __Foursquare__ (API)
    - This project will use the search endpoint of the Foursquare data API to extract __data on the professional accommodation service providers in Copenhagen__ (hotels, inns, hostels, bed & breakfasts etc.). With this data, it should be possible to identify the competitive situation in different areas of the city (proximity, distance to city center). 
- __AirBnB__ _(can be accessed through this [link](http://insideairbnb.com/get-the-data.html))_
    - This project uses the publicly available AirBnB data (Inside Airbnb). Specifically, it uses __listings.csv.gz__, a dataset of listings in Copenhagen with data on the hosts' ratings, location and apartment/home characteristics. With this data, it is possible to identify areas with relatively more accommodation (popular areas) as well as areas with competitive advantages in terms of the ratings (location, value for money etc.) the listings have received. 
    - From the Inside Airbnb page it is also possible to download a __Neighborhoods__ GeoJSON file, which contains the neighbourhood boundaries for Copenhagen. This is used to plot the analysis results on a geographical map and better visualize the differences between neighbourhoods. 
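
The "distance to city center" used throughout the analysis can be derived from the latitude and longitude of each venue or listing. The report does not say how this was computed, so the following is only a minimal sketch using the haversine formula. The city-center coordinates and the function names are assumptions chosen for illustration.

```python
import math

# Assumed city-center coordinates (near Rådhuspladsen); not stated in the report.
CITY_CENTER = (55.6761, 12.5683)
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_city_center(lat, lon, center=CITY_CENTER):
    """Distance in kilometres from a point to the (assumed) city center."""
    return haversine_km(lat, lon, center[0], center[1])
```

For example, `distance_to_city_center(55.676001, 12.569647)` (the coordinates of a hotel on Rådhuspladsen) gives a value a little under 0.1 km under the assumed center.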



# Methodology

To reach the goal of this report, the analysis applies two methodological approaches: 
1. The analysis starts with __an exploratory analysis__ of the differences between the neighbourhoods in Copenhagen. Based on this exploratory analysis, the neighbourhood with the greatest potential for a new professional accommodation service is selected.
2. Then an __unsupervised machine learning model (k-means)__ is used to identify clusters of listings in the selected neighbourhood in order to identify the optimal location for a new accommodation service (hotel) in Copenhagen.

# Analysis

First, the data is presented before the exploratory analysis is applied. The two datasets are called df_fsq (Foursquare API data, 138 professional accommodation services in Copenhagen) and df_abnb (AirBnB listing data, 28195 AirBnB listings in Copenhagen). The shapes and the first five rows of the two dataframes are presented below:

In [10]:
df_abnb.shape, df_fsq.shape

((28195, 106), (138, 15))

In [25]:
# Foursquare Dataframe

| id | location.address | location.cc | location.city | location.country | location.distance | location.lat | location.lng | location.neighborhood | location.postalCode | location.state | name | Category | Distance_to_Citycenter |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4adcdaf9f964a520ba5c21e3 | Rådhuspladsen 57 | DK | København | Danmark | 85 | 55.676001 | 12.569647 | | 1550.0 | Region Hovedstaden | Scandic Palace Hotel | Hotel | 0.085209 |
| 4c1ffeb78b3aa593af5b9e5f | Løngangstræde 27 | DK | København | Danmark | 297 | 55.675367 | 12.572863 | | 1468.0 | Region Hovedstaden | First Hotel Twentyseven | Hotel | 0.297557 |
| 5bf19af6b9b37b00398d7d7b | H. C. Andersens Blvd. 12 | DK | København | Danmark | 134 | 55.676677 | 12.566421 | Indre By | 1553.0 | Region Hovedstaden | citizenM Copenhagen Radhuspladsen hotel | Hotel | 0.134206 |
| 5b28df571ffe97002cb5a0ae | Helgolandsgade 3 | DK | København | Danmark | 591 | 55.672745 | 12.561002 | | 1653.0 | Region Hovedstaden | Hotel Mayfair | Hotel | 0.590585 |
| 5bae844e73fe25002ca31c2e | | DK | | Danmark | 2 | 55.676098 | 12.568337 | | | | Hotel | Hotel | 0.002331 |


In [24]:
# AirBnB listings Dataframe

Unnamed: 0,id,listing_url,scrape_id,last_scraped,name,summary,space,description,experiences_offered,neighborhood_overview,notes,transit,access,interaction,house_rules,thumbnail_url,medium_url,picture_url,xl_picture_url,host_id,host_url,host_name,host_since,host_location,host_about,host_response_time,host_response_rate,host_acceptance_rate,host_is_superhost,host_thumbnail_url,host_picture_url,host_neighbourhood,host_listings_count,host_total_listings_count,host_verifications,host_has_profile_pic,host_identity_verified,street,neighbourhood,neighbourhood_cleansed,neighbourhood_group_cleansed,city,state,zipcode,market,smart_location,country_code,country,latitude,longitude,is_location_exact,property_type,room_type,accommodates,bathrooms,bedrooms,beds,bed_type,amenities,square_feet,price,weekly_price,monthly_price,security_deposit,cleaning_fee,guests_included,extra_people,minimum_nights,maximum_nights,minimum_minimum_nights,maximum_minimum_nights,minimum_maximum_nights,maximum_maximum_nights,minimum_nights_avg_ntm,maximum_nights_avg_ntm,calendar_updated,has_availability,availability_30,availability_60,availability_90,availability_365,calendar_last_scraped,number_of_reviews,number_of_reviews_ltm,first_review,last_review,review_scores_rating,review_scores_accuracy,review_scores_cleanliness,review_scores_checkin,review_scores_communication,review_scores_location,review_scores_value,requires_license,license,jurisdiction_names,instant_bookable,is_business_travel_ready,cancellation_policy,require_guest_profile_picture,require_guest_phone_verification,calculated_host_listings_count,calculated_host_listings_count_entire_homes,calculated_host_listings_count_private_rooms,calculated_host_listings_count_shared_rooms,reviews_per_month
0,6983,https://www.airbnb.com/rooms/6983,20200322041115,2020-03-23,Copenhagen 'N Livin',Lovely apartment located in the hip Nørrebro a...,Beautiful and cosy apartment conveniently loca...,Lovely apartment located in the hip Nørrebro a...,none,"Nice bars and cozy cafes just minutes away, ye...",,Bus 66 runs to the central station. Forum metr...,"Bedroom, living room, kitchen, and bathroom fo...","We are usually at work during day time, but wi...",No smoking allowed! No pets.,,,https://a0.muscache.com/im/pictures/42044170/f...,,16774,https://www.airbnb.com/users/show/16774,Simon,2009-05-12,"Copenhagen, Capital Region of Denmark, Denmark",I'm currently working as an environmental cons...,,,29%,f,https://a0.muscache.com/im/users/16774/profile...,https://a0.muscache.com/im/users/16774/profile...,Nørrebro,1.0,1.0,"['email', 'phone', 'reviews']",t,f,"Copenhagen, Hovedstaden, Denmark",Nørrebro,Nrrebro,,Copenhagen,Hovedstaden,2200,Copenhagen,"Copenhagen, Denmark",DK,Denmark,55.68798,12.54571,t,Apartment,Private room,2,1.0,1.0,1.0,Real Bed,"{TV,""Cable TV"",Wifi,Kitchen,""Paid parking off ...",97.0,$384.00,,,$0.00,$35.00,1,$70.00,2,15,2,2,15,15,2.0,15.0,6 weeks ago,t,0,0,20,20,2020-03-23,168,1,2009-09-04,2019-07-19,96.0,10.0,9.0,10.0,10.0,9.0,9.0,f,,,f,f,moderate,f,f,1,0,1,0,1.31
1,26057,https://www.airbnb.com/rooms/26057,20200322041115,2020-03-23,Lovely house - most attractive area,Our lovely house in the center of the city is ...,Totally charming old 150 m2 stone house from y...,Our lovely house in the center of the city is ...,none,The neighborhood is the most famous one and th...,,Walking-distance to metro/station for transpor...,You will have access to the whole house,,We will leave the house clean and in good and ...,,,https://a0.muscache.com/im/pictures/bfbca07e-4...,,109777,https://www.airbnb.com/users/show/109777,Kari,2010-04-17,"Copenhagen, Capital Region of Denmark, Denmark","We are a family with 2 children, and living in...",within a day,100%,13%,f,https://a0.muscache.com/im/users/109777/profil...,https://a0.muscache.com/im/users/109777/profil...,Indre By,1.0,1.0,"['email', 'phone', 'reviews', 'jumio', 'offlin...",t,f,"Copenhagen, Hovedstaden, Denmark",Indre By,Indre By,,Copenhagen,Hovedstaden,2100,Copenhagen,"Copenhagen, Denmark",DK,Denmark,55.69163,12.57459,t,House,Entire home/apt,6,1.5,4.0,4.0,Real Bed,"{TV,Wifi,Kitchen,""Indoor fireplace"",Heating,""F...",,"$2,403.00",,,"$5,000.00","$1,100.00",3,$350.00,3,30,3,3,30,30,3.0,30.0,2 weeks ago,t,28,57,73,348,2020-03-23,50,7,2013-12-02,2019-12-14,98.0,10.0,10.0,10.0,10.0,10.0,10.0,f,,,f,f,moderate,f,f,1,1,0,0,0.65
2,26473,https://www.airbnb.com/rooms/26473,20200322041115,2020-03-23,City Centre Townhouse Sleeps 1-12 persons,TOWN HOUSE ON KNABROSTRÆDE - located in the ab...,The house is a total of 240sqm divided on 4 fl...,TOWN HOUSE ON KNABROSTRÆDE - located in the ab...,none,,,,,,Please be respectful to the neighbors and keep...,,,https://a0.muscache.com/im/pictures/8e132ba0-b...,,112210,https://www.airbnb.com/users/show/112210,Oliver,2010-04-22,"Copenhagen, Capital Region of Denmark, Denmark","Gentle male, have travelled the world for 30 y...",within an hour,100%,100%,f,https://a0.muscache.com/im/users/112210/profil...,https://a0.muscache.com/im/users/112210/profil...,Indre By,4.0,4.0,"['email', 'phone', 'facebook', 'reviews', 'jum...",t,t,"Copenhagen, Hovedstaden, Denmark",Indre By,Indre By,,Copenhagen,Hovedstaden,1210,Copenhagen,"Copenhagen, Denmark",DK,Denmark,55.6759,12.57698,t,House,Entire home/apt,12,2.5,6.0,7.0,Real Bed,"{TV,Internet,Wifi,Kitchen,""Buzzer/wireless int...",,"$3,101.00","$17,553.00","$67,226.00","$3,735.00",$523.00,1,$0.00,3,31,3,3,1125,1125,3.0,1125.0,3 months ago,t,26,35,60,121,2020-03-23,293,45,2010-10-14,2020-03-02,91.0,10.0,9.0,10.0,10.0,10.0,9.0,f,,,f,f,moderate,f,f,1,1,0,0,2.55
3,29118,https://www.airbnb.com/rooms/29118,20200322041115,2020-03-22,Best Location in Cool Istedgade,,The apartment is situated in the middle of the...,The apartment is situated in the middle of the...,none,,,,,,Smoking is allowed on the balcony only. Pleas...,,,https://a0.muscache.com/im/pictures/236213/339...,,125230,https://www.airbnb.com/users/show/125230,Nana,2010-05-15,"Copenhagen, Capital Region of Denmark, Denmark",I have a Master of Arts in Musicology and I wo...,within a day,100%,20%,f,https://a0.muscache.com/im/users/125230/profil...,https://a0.muscache.com/im/users/125230/profil...,Vesterbro,1.0,1.0,"['email', 'phone', 'reviews']",t,f,"Copenhagen, Hovedstaden, Denmark",Vesterbro,Vesterbro-Kongens Enghave,,Copenhagen,Hovedstaden,1650,Copenhagen,"Copenhagen, Denmark",DK,Denmark,55.67069,12.5543,t,Apartment,Entire home/apt,2,1.0,1.0,1.0,Real Bed,"{Wifi,Kitchen,""Paid parking off premises"",Heat...",,$803.00,,,,$300.00,1,$0.00,7,14,3,5,14,14,4.1,14.0,2 weeks ago,t,0,0,0,8,2020-03-22,22,2,2010-06-17,2019-08-02,98.0,10.0,10.0,10.0,10.0,10.0,10.0,f,,,f,f,strict_14_with_grace_period,f,f,1,1,0,0,0.19
4,29618,https://www.airbnb.com/rooms/29618,20200322041115,2020-03-23,Artsy and familyfriendly home in lovely Copenh...,"Artsy, bright and spacious flat, close to the ...",It's a three bedroom apartment with a spacious...,"Artsy, bright and spacious flat, close to the ...",none,"The apartment is situated in Østerbro, very cl...",Please note that the bed in the second bedroom...,There are good bus connections very close to t...,,We will not be around during your stay but you...,Please respect that this is our home.,,,https://a0.muscache.com/im/pictures/547e4a06-9...,,127577,https://www.airbnb.com/users/show/127577,Simon And Anna,2010-05-18,"Copenhagen, Capital Region of Denmark, Denmark","We are Simon and Anna, 40ies, professionals, h...",,,,f,https://a0.muscache.com/im/users/127577/profil...,https://a0.muscache.com/im/users/127577/profil...,Østerbro,1.0,1.0,"['email', 'phone', 'facebook', 'reviews', 'jum...",t,t,"Copenhagen, Hovedstaden, Denmark",Østerbro,sterbro,,Copenhagen,Hovedstaden,2100,Copenhagen,"Copenhagen, Denmark",DK,Denmark,55.69375,12.56945,t,Apartment,Entire home/apt,4,1.0,3.0,3.0,Real Bed,"{TV,Internet,Wifi,Kitchen,""Buzzer/wireless int...",,$859.00,"$2,988.00","$8,963.00",,$75.00,1,$0.00,7,31,7,7,31,31,7.0,31.0,6 months ago,t,0,0,0,0,2020-03-23,90,0,2010-08-16,2017-06-03,94.0,10.0,9.0,10.0,9.0,10.0,9.0,f,,,t,f,moderate,f,f,1,1,0,0,0.77


# Results

First, we look at the Foursquare data. The map below visualizes the number of professional accommodation services (hotels, hostels, inns etc.) per neighbourhood in Copenhagen. It is clear that professional accommodation services are concentrated in the inner-city neighbourhood. Vesterbro and Frederiksberg follow, while areas like Østerbro have a significantly lower competitive density, i.e. this neighbourhood definitely has potential for a new professional accommodation service when we look solely at the competitive situation.

In [516]:
# Map visualization of the commercial accommodation services located in each Copenhagen neighbourhood
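
Counting venues per neighbourhood requires assigning each venue's coordinates to one of the neighbourhood polygons from the GeoJSON file. The report does not show how this was done, so the following is a minimal, dependency-free sketch using a ray-casting point-in-polygon test. It assumes each GeoJSON feature has a `Polygon` or `MultiPolygon` geometry and a `neighbourhood` property; both the property name and the helper names are assumptions.

```python
from collections import Counter


def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside a closed ring of [lon, lat] pairs?"""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i][0], ring[i][1]
        xj, yj = ring[j][0], ring[j][1]
        # Does the edge (j -> i) straddle the horizontal line through the point?
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def point_in_geometry(lon, lat, geometry):
    """True if the point lies in a GeoJSON Polygon or MultiPolygon (holes excluded)."""
    if geometry["type"] == "Polygon":
        polygons = [geometry["coordinates"]]
    elif geometry["type"] == "MultiPolygon":
        polygons = geometry["coordinates"]
    else:
        return False
    for rings in polygons:
        outer, holes = rings[0], rings[1:]
        in_hole = any(point_in_ring(lon, lat, hole) for hole in holes)
        if point_in_ring(lon, lat, outer) and not in_hole:
            return True
    return False


def assign_neighbourhood(lat, lon, features, name_key="neighbourhood"):
    """Return the name of the first feature containing the point, else None."""
    for feature in features:
        if point_in_geometry(lon, lat, feature["geometry"]):
            return feature["properties"][name_key]
    return None


def count_per_neighbourhood(points, features):
    """Count (lat, lon) points per neighbourhood name."""
    return Counter(assign_neighbourhood(lat, lon, features) for lat, lon in points)
```

Here `features` would be the `"features"` list of the loaded GeoJSON, and `points` the latitude/longitude pairs of the Foursquare venues. A geospatial library would be a reasonable alternative for larger datasets.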



If we plot the volume of AirBnB listings in the different Copenhagen neighbourhoods in the same way (see map below), a clear pattern emerges. The inner city, Vesterbro and Nørrebro neighbourhoods all have a high concentration of both professional accommodation services (hotels, hostels, inns etc.) and AirBnB listings. In contrast, Østerbro has a very low density of professional accommodation services but relatively more AirBnB listings. Again this neighbourhood shows great potential, as it is well used for AirBnB accommodation but has a low density of professional competitors.

In [183]:
# Map visualization of AirBnB listings in each Copenhagen neighbourhood

Furthermore, Østerbro also has the lowest availability among the Copenhagen neighbourhoods, measured by `availability_365`, the number of days a listing is available over a 365-day window (see table below). This indicates that even though Østerbro has an above-average volume of AirBnB listings, these listings have the lowest availability. One could therefore suspect that increasing the supply of accommodation in Østerbro would prove lucrative, as the demand seems to exist.

In [511]:
# Table for the average days of AirBnB listing availability per Copenhagen neighbourhood

| neighbourhood_cleansed | availability_365 |
|---|---|
| Valby | 132.0 |
| Bispebjerg | 124.5 |
| Indre By | 117.14 |
| Amager Vest | 116.87 |
| Frederiksberg | 109.54 |
| Amager Øst | 98.44 |
| Vesterbro-Kongens Enghave | 80.16 |
| Nørrebro | 67.26 |
| Østerbro | 66.67 |
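
A per-neighbourhood table like the one above can be produced with a grouped mean. The following is a minimal sketch, assuming a dataframe with the `neighbourhood_cleansed` and `availability_365` columns from the listings data. The helper name and the toy values are illustrative and not taken from the dataset.

```python
import pandas as pd


def mean_by_neighbourhood(df, column, group_col="neighbourhood_cleansed"):
    """Mean of `column` per neighbourhood, highest first, rounded to 2 decimals."""
    return (
        df.groupby(group_col)[column]
        .mean()
        .round(2)
        .sort_values(ascending=False)
        .reset_index()
    )


# Toy rows standing in for df_abnb (illustrative values only)
toy = pd.DataFrame(
    {
        "neighbourhood_cleansed": ["Valby", "Valby", "Østerbro", "Østerbro"],
        "availability_365": [100, 200, 50, 70],
    }
)
availability = mean_by_neighbourhood(toy, "availability_365")
print(availability)
```

The same helper could be reused for other per-neighbourhood averages, such as `review_scores_value` or `review_scores_location`.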


From the map below, we can see that Østerbro does not have the best AirBnB review score for listing location. The neighbourhood's location is not rated particularly poorly, but it is outscored by the Inner City, Frederiksberg, Valby and Amager Vest.

In [184]:
# A visualization map for the AirBnB-review score for listing location per Copenhagen Neighbourhood

Finally, we look at the average price per AirBnB listing in the different Copenhagen neighbourhoods. Again, Østerbro looks lucrative, as it has the second highest average price per listing, behind only the Inner City neighbourhood. Combined with the "value for money" score (see table below the map), this further supports Østerbro as a lucrative accommodation area within Copenhagen: tourists are willing to pay a high price for accommodation in this area, and they find that the accommodation in Østerbro is worth the money. 

In [538]:
# Map visualization of the average AirBnB listing price per Copenhagen neighbourhood
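
The `price` column in the listings data is stored as text (e.g. `$2,403.00`), so an average price per neighbourhood needs a numeric conversion first. The following is a minimal sketch of one way to do this. The function name is an assumption, `price_num` mirrors the column name that appears in the cluster table later in this report, and the toy rows are illustrative only.

```python
import pandas as pd


def parse_price(series):
    """Convert strings such as "$2,403.00" to floats; missing values become NaN."""
    cleaned = series.str.replace(r"[$,]", "", regex=True)
    return pd.to_numeric(cleaned, errors="coerce")


# Toy rows standing in for df_abnb (illustrative values only)
toy = pd.DataFrame(
    {
        "neighbourhood_cleansed": ["Indre By", "Indre By", "Østerbro", "Østerbro"],
        "price": ["$2,403.00", "$3,101.00", "$859.00", None],
    }
)
toy["price_num"] = parse_price(toy["price"])

# Mean price per neighbourhood; missing prices are ignored by mean()
mean_price = toy.groupby("neighbourhood_cleansed")["price_num"].mean()
print(mean_price)
```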



In [510]:
# Table of the "value for money" review score for the AirBnB listings per Copenhagen neighbourhood

| neighbourhood_cleansed | review_scores_value |
|---|---|
| Vesterbro-Kongens Enghave | 9.59 |
| Østerbro | 9.58 |
| Amager Øst | 9.56 |
| Frederiksberg | 9.54 |
| Amager Vest | 9.53 |
| Nørrebro | 9.4 |
| Valby | 9.29 |
| Indre By | 9.26 |
| Bispebjerg | 9.0 |


Based on the exploratory analysis above, we have identified Østerbro as a lucrative neighbourhood for a potential new professional accommodation service in Copenhagen. In the following, we use an unsupervised machine learning clustering analysis (k-means) to identify hidden groupings of the AirBnB listings in this neighbourhood. Clustering on the coordinates of the listings in Østerbro with k-means yields three clusters. The centers of these three clusters are visualized in the map below.

In [537]:
# Map visualization of the centers of the three Østerbro clusters
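
The clustering step can be sketched as follows. This is a minimal illustration, assuming scikit-learn's `KMeans` with three clusters fitted on latitude/longitude pairs. The report does not state which implementation or settings were used, and the synthetic points and made-up centers below only stand in for the Østerbro listings.

```python
import numpy as np
from sklearn.cluster import KMeans


def cluster_listings(coords, n_clusters=3, random_state=0):
    """Fit k-means on (latitude, longitude) pairs; return labels and cluster centers."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = model.fit_predict(coords)
    return labels, model.cluster_centers_


# Synthetic points scattered around three made-up centers (not real listings)
rng = np.random.default_rng(0)
centres = np.array([[55.700, 12.590], [55.715, 12.575], [55.695, 12.580]])
coords = np.vstack([c + rng.normal(scale=0.001, size=(50, 2)) for c in centres])

labels, fitted_centres = cluster_listings(coords)
print(fitted_centres)
```

Clustering on raw degrees treats a degree of latitude and a degree of longitude as equal distances, although at Copenhagen's latitude a degree of longitude covers a noticeably shorter distance. Projecting the coordinates to metres before clustering may therefore be worth considering.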

Of the three clusters, the one furthest to the south has the greatest potential for a new professional accommodation service (see table below). This cluster has the shortest distance to the city center, the highest average listing price (963.99) and the highest review scores, especially on the location score (9.76). Its listings also have comparatively few reviews per month (0.53) and low availability (about 48 days per year), which indicates potential for a new accommodation service here to meet the demand. Furthermore, the southernmost part of Østerbro is close to some of the biggest tourist attractions and locally beloved areas, such as the Little Mermaid statue, the lakes and Fælledparken.

Thus, the Østerbro neighbourhood is highly recommended for new accommodation service providers in the Danish capital area. The focus should be on the nearby attractions, combined with a slightly below-average price strategy, to make the new accommodation service lucrative in this area. Value for money is definitely a unique selling point that will help the new service thrive here.

In [23]:
# Table of the 3 clusters in Østerbro (Labels). 
# Label 0 is the cluster with the center furthest to the east 
# Label 1 is the cluster with the center furthest to the north
# Label 2 is the cluster with the center furthest to the south

| Labels | Distance_to_Citycenter | price_num | reviews_per_month | availability_365 | review_scores_rating | review_scores_accuracy | review_scores_cleanliness | review_scores_checkin | review_scores_communication | review_scores_location | review_scores_value |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3.38 | 828.84 | 0.53 | 49.66 | 95.48 | 9.74 | 9.39 | 9.82 | 9.85 | 9.57 | 9.47 |
| 1 | 4.07 | 840.22 | 0.63 | 47.24 | 94.81 | 9.7 | 9.34 | 9.81 | 9.84 | 9.39 | 9.4 |
| 2 | 2.58 | 963.99 | 0.53 | 47.83 | 95.93 | 9.82 | 9.49 | 9.85 | 9.89 | 9.76 | 9.52 |


# Conclusion

In this report, we have used data to identify the optimal location for a new professional tourist accommodation service (e.g. hotel, hostel, inn) in Copenhagen, Denmark. To find it, the project leveraged Foursquare data on professional accommodation services (hotels, hostels, inns etc.) and AirBnB listings data. With these data sources, we were able to identify Østerbro as a highly relevant and lucrative neighbourhood for aspiring start-ups in the professional accommodation sector in the capital of Denmark. 

Østerbro has a relatively low density of competing professional accommodation services. However, the volume of AirBnB listings is relatively high, indicating demand for accommodation options in the area. Furthermore, Østerbro has the second highest average listing price on AirBnB as well as the second highest score on the "value for money" review metric. This indicates room in the market for a slightly below-average price option that provides tourists with true value for money. 

Within Østerbro, this project has identified a particularly lucrative area in the southern part of the neighbourhood. This area has the highest review scores and average listing price, but relatively low availability and accommodation volume, i.e. this area is particularly in need of a new accommodation service that can meet this demand. 
