![Kayak](https://seekvectorlogo.com/wp-content/uploads/2018/01/kayak-vector-logo.png)

# Plan your trip with Kayak 

## Company's description 📇

<a href="https://www.kayak.com" target="_blank">Kayak</a> is a travel search engine that helps user plan their next trip at the best price.

The company was founded in 2004 by Steve Hafner & Paul M. English. After a few rounds of fundraising, Kayak was acquired by <a href="https://www.bookingholdings.com/" target="_blank">Booking Holdings</a> which now holds: 

* <a href="https://booking.com/" target="_blank">Booking.com</a>
* <a href="https://kayak.com/" target="_blank">Kayak</a>
* <a href="https://www.priceline.com/" target="_blank">Priceline</a>
* <a href="https://www.agoda.com/" target="_blank">Agoda</a>
* <a href="https://Rentalcars.com/" target="_blank">RentalCars</a>
* <a href="https://www.opentable.com/" target="_blank">OpenTable</a>

With over \$300 million revenue a year, Kayak operates in almost all countries and all languages to help their users book travels accros the globe. 

## Project 🚧

The marketing team needs help on a new project. After doing some user research, the team discovered that **70% of their users who are planning a trip would like to have more information about the destination they are going to**. 

In addition, user research shows that **people tend to be defiant about the information they are reading if they don't know the brand** which produced the content. 

Therefore, Kayak Marketing Team would like to create an application that will recommend where people should plan their next holidays. The application should be based on real data about:

* Weather 
* Hotels in the area 

The application should then be able to recommend the best destinations and hotels based on the above variables at any given time. 

## Goals 🎯

As the project has just started, your team doesn't have any data that can be used to create this application. Therefore, your job will be to: 

* Scrape data from destinations 
* Get weather data from each destination 
* Get hotels' info about each destination
* Store all the information above in a data lake
* Extract, transform and load cleaned data from your datalake to a data warehouse

## Scope of this project 🖼️

Marketing team wants to focus first on the best cities to travel to in France. According <a href="https://one-week-in.com/35-cities-to-visit-in-france/" target="_blank">One Week In.com</a> here are the top-35 cities to visit in France: 

```python 
["Mont Saint Michel",
"St Malo",
"Bayeux",
"Le Havre",
"Rouen",
"Paris",
"Amiens",
"Lille",
"Strasbourg",
"Chateau du Haut Koenigsbourg",
"Colmar",
"Eguisheim",
"Besancon",
"Dijon",
"Annecy",
"Grenoble",
"Lyon",
"Gorges du Verdon",
"Bormes les Mimosas",
"Cassis",
"Marseille",
"Aix en Provence",
"Avignon",
"Uzes",
"Nimes",
"Aigues Mortes",
"Saintes Maries de la mer",
"Collioure",
"Carcassonne",
"Ariege",
"Toulouse",
"Montauban",
"Biarritz",
"Bayonne",
"La Rochelle"]
```

Your team should focus **only on the above cities for your project**. 


## Helpers 🦮

To help you achieve this project, here are a few tips that should help you

### Get weather data with an API 

*   Use https://nominatim.org/ to get the gps coordinates of all the cities (no subscription required) Documentation : https://nominatim.org/release-docs/develop/api/Search/

*   Use https://openweathermap.org/appid (you have to subscribe to get a free apikey) and https://openweathermap.org/api/one-call-api to get some information about the weather for the 35 cities and put it in a DataFrame

*   Determine the list of cities where the weather will be the nicest within the next 7 days For example, you can use the values of daily.pop and daily.rain to compute the expected volume of rain within the next 7 days... But it's only an example, actually you can have different opinions on a what a nice weather would be like 😎 Maybe the most important criterion for you is the temperature or humidity, so feel free to change the rules !

*   Save all the results in a `.csv` file, you will use it later 😉 You can save all the informations that seem important to you ! Don't forget to save the name of the cities, and also to create a column containing a unique identifier (id) of each city (this is important for what's next in the project)

*   Use plotly to display the best destinations on a map

### Scrape Booking.com 

Since BookingHoldings doesn't have aggregated databases, it will be much faster to scrape data directly from booking.com 

You can scrap as many information asyou want, but we suggest that you get at least:

*   hotel name,
*   Url to its booking.com page,
*   Its coordinates: latitude and longitude
*   Score given by the website users
*   Text description of the hotel


### Create your data lake using S3 

Once you managed to build your dataset, you should store into S3 as a csv file. 

### ETL 

Once you uploaded your data onto S3, it will be better for the next data analysis team to extract clean data directly from a Data Warehouse. Therefore, create a SQL Database using AWS RDS, extract your data from S3 and store it in your newly created DB. 

## Deliverable 📬

To complete this project, your team should deliver:

* A `.csv` file in an S3 bucket containing enriched information about weather and hotels for each french city

* A SQL Database where we should be able to get the same cleaned data from S3 

* Two maps where you should have a Top-5 destinations and a Top-20 hotels in the area. You can use plotly or any other library to do so. It should look something like this: 

![Map](https://full-stack-assets.s3.eu-west-3.amazonaws.com/images/Kayak_best_destination_project.png)

In [447]:
# IMPORT librairies

import pandas as pd
import requests
import plotly.express as px
import plotly.io as pio

pio.renderers.default = "iframe_connected"


In [363]:
# PREPARE city LIST for data collection from API

original_city_list = ["Mont Saint Michel",
"St Malo",
"Bayeux",
"Le Havre",
"Rouen",
"Paris",
"Amiens",
"Lille",
"Strasbourg",
"Chateau du Haut Koenigsbourg",
"Colmar",
"Eguisheim",
"Besancon",
"Dijon",
"Annecy",
"Grenoble",
"Lyon",
"Gorges du Verdon",
"Bormes les Mimosas",
"Cassis",
"Marseille",
"Aix en Provence",
"Avignon",
"Uzes",
"Nimes",
"Aigues Mortes",
"Saintes Maries de la mer",
"Collioure",
"Carcassonne",
"Ariege",
"Toulouse",
"Montauban",
"Biarritz",
"Bayonne",
"La Rochelle"]

In [364]:
original_city_list[0]

'Mont Saint Michel'

In [365]:
# CITY LIST to be AMENDED to manage spaces into anmes and being well interpretated by API

amended_city_list = []
for s in original_city_list:
    new_list = s.replace(" ", "+")
# Modify old string
    amended_city_list.append(new_list)


In [366]:
# TEST amended CITY LIST -> ok
len(amended_city_list)

35

In [367]:
# SELECT 1st element
amended_city_list

['Mont+Saint+Michel',
 'St+Malo',
 'Bayeux',
 'Le+Havre',
 'Rouen',
 'Paris',
 'Amiens',
 'Lille',
 'Strasbourg',
 'Chateau+du+Haut+Koenigsbourg',
 'Colmar',
 'Eguisheim',
 'Besancon',
 'Dijon',
 'Annecy',
 'Grenoble',
 'Lyon',
 'Gorges+du+Verdon',
 'Bormes+les+Mimosas',
 'Cassis',
 'Marseille',
 'Aix+en+Provence',
 'Avignon',
 'Uzes',
 'Nimes',
 'Aigues+Mortes',
 'Saintes+Maries+de+la+mer',
 'Collioure',
 'Carcassonne',
 'Ariege',
 'Toulouse',
 'Montauban',
 'Biarritz',
 'Bayonne',
 'La+Rochelle']

In [368]:
# PREPARE URL for API

url = ('https://nominatim.openstreetmap.org/search?q={}&format=json').format(amended_city_list[0])

In [369]:
# TEST url -> ok
# url

In [370]:
# GET values from API

r = requests.get(url)
soup = r.json()


In [371]:
# TEST json answer -> ok
soup

[{'place_id': 151486647,
  'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright',
  'osm_type': 'way',
  'osm_id': 211285890,
  'boundingbox': ['48.6349172', '48.637031', '-1.5133292', '-1.5094796'],
  'lat': '48.6359541',
  'lon': '-1.511459954959514',
  'display_name': 'Mont Saint-Michel, Le Mont-Saint-Michel, Avranches, Manche, Normandie, France métropolitaine, 50170, France',
  'class': 'place',
  'type': 'islet',
  'importance': 0.755436556781574},
 {'place_id': 282955240,
  'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright',
  'osm_type': 'relation',
  'osm_id': 7360493,
  'boundingbox': ['46.7308335', '46.9105744', '-75.4379111', '-75.1592812'],
  'lat': '46.7798558',
  'lon': '-75.336261',
  'display_name': 'Mont-Saint-Michel, Antoine-Labelle, Laurentides, Québec, J0W 1P0, Canada',
  'class': 'boundary',
  'type': 'administrative',
  'importance': 0.7301414688716277,
  'icon': 'https://nominatim.openstreetmap.org/ui

In [372]:
# check lat data collection -> OK
latitude = soup[0]["lat"]

# latitude

In [373]:
# check lon data collection 1st element -> OK
longitude = soup[0]["lon"]
# longitude

In [374]:
# EXTRACT data to load from API

# data_to_load = city + longitude + latitude
# data_to_load

In [375]:
# PREPAPRE data collection OUTPUT table

cities_geo_coordinates = pd.DataFrame(columns = ["place_id","city", "latitude", "longitude"])

# cities_geo_coordinates

In [376]:
# test OUTPUT table

cities_geo_coordinates

Unnamed: 0,place_id,city,latitude,longitude


In [377]:
# ASSEMBLE code

for c in range (0,len(amended_city_list)):
    
    #append city geocoordinates
    city = amended_city_list[c]
    url_1 = ('https://nominatim.openstreetmap.org/search?q={}&format=json&countrycodes=fr&dedupe=[1]').format(city)
    r_1 = requests.get(url_1)
    soup_1 = r_1.json()
    cities_geo_coordinates = cities_geo_coordinates.append({"place_id":soup_1[0]["place_id"], "city":amended_city_list[c], "latitude":soup_1[0]["lat"], "longitude":soup_1[0]["lon"]}, ignore_index=True)
#     if c == len(amended_city_list)+1:
#         print(cities_geo_coordinates)

In [378]:
cities_geo_coordinates

Unnamed: 0,place_id,city,latitude,longitude
0,151486647,Mont+Saint+Michel,48.6359541,-1.511459954959514
1,121999,St+Malo,48.649518,-2.0260409
2,3126290,Bayeux,49.2764624,-0.7024738
3,17564290,Le+Havre,49.4938975,0.1079732
4,281721777,Rouen,49.4404591,1.0939658
5,281739181,Paris,48.8588897,2.3200410217200766
6,281746639,Amiens,49.8941708,2.2956951
7,282028769,Lille,50.6365654,3.0635282
8,121990,Strasbourg,48.584614,7.7507127
9,117144990,Chateau+du+Haut+Koenigsbourg,48.249489800000006,7.34429620253195


In [379]:
cities_infos_detailled = pd.DataFrame(columns=['city_id','city','latitude','longitude',
                                     'current_temp_feels_like',"current_humidity",
                                     'J+1_temp_feels_like',
                                     'J+2_temp_feels_like',
                                     'J+3_temp_feels_like',
                                     'J+4_temp_feels_like',
                                     'J+5_temp_feels_like',
                                     'J+6_temp_feels_like',
                                     'J+7_temp_feels_like',
                                     'temp_mean'
                                     'J+1_humidity',
                                     'J+2_humidity',
                                     'J+3_humidity',
                                     'J+4_humidity',
                                     'J+5_humidity',
                                     'J+6_humidity',
                                     'J+7_humidity',
                                     'humidity_mean'])
cities_infos_detailled

Unnamed: 0,city_id,city,latitude,longitude,current_temp_feels_like,current_humidity,J+1_temp_feels_like,J+2_temp_feels_like,J+3_temp_feels_like,J+4_temp_feels_like,...,J+6_temp_feels_like,J+7_temp_feels_like,temp_meanJ+1_humidity,J+2_humidity,J+3_humidity,J+4_humidity,J+5_humidity,J+6_humidity,J+7_humidity,humidity_mean


In [380]:
# APPEND city needed infos

for c in range (0,len(cities_geo_coordinates)):
    
    city_id = cities_geo_coordinates.loc[c, "place_id"]
    latitude = cities_geo_coordinates.loc[c, "latitude"]
    longitude = cities_geo_coordinates.loc[c, "longitude"]
    api_key = "06c201b8b437d60655925748fa2efc34"
    url_2 = ("https://api.openweathermap.org/data/2.5/onecall?lat={}&lon={}&appid={}&exclude=hourly,minutely&units=metric").format(latitude,longitude,api_key)
    r_2 = requests.get(url_2)
    city_info = r_2.json()
    cities_infos_detailled = cities_infos_detailled.append({
                                        "city_id": city_id,
                                        "city": original_city_list[c],
                                        "latitude": latitude,
                                        "longitude": longitude,
                                        'current_temp_feels_like': city_info["current"]["feels_like"],
                                        'current_humidity': city_info["current"]["humidity"],
                                        'J+1_temp_feels_like': city_info["daily"][1]['feels_like']['day'],
                                        'J+2_temp_feels_like':city_info["daily"][2]['feels_like']['day'],
                                        'J+3_temp_feels_like':city_info["daily"][3]['feels_like']['day'],
                                        'J+4_temp_feels_like':city_info["daily"][4]['feels_like']['day'],
                                        'J+5_temp_feels_like':city_info["daily"][5]['feels_like']['day'],
                                        'J+6_temp_feels_like':city_info["daily"][6]['feels_like']['day'],
                                        'J+7_temp_feels_like':city_info["daily"][7]['feels_like']['day'],
                                        "J+1_humidity": city_info["daily"][1]['humidity'],
                                        "J+2_humidity": city_info["daily"][2]['humidity'],                                        
                                        "J+3_humidity": city_info["daily"][3]['humidity'],                                        
                                        "J+4_humidity": city_info["daily"][4]['humidity'],                                        
                                        "J+5_humidity": city_info["daily"][5]['humidity'],                                        
                                        "J+6_humidity": city_info["daily"][6]['humidity'],                                        
                                        "J+7_humidity": city_info["daily"][7]['humidity']
                                        },
                                        ignore_index=True)
    print(c, end=' ')


0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 

In [381]:
 city_info = r_2.json()
# city_info

In [382]:
cities_infos_detailled.sort_values(by=['current_temp_feels_like','current_humidity'], ascending=False)

Unnamed: 0,city_id,city,latitude,longitude,current_temp_feels_like,current_humidity,J+1_temp_feels_like,J+2_temp_feels_like,J+3_temp_feels_like,J+4_temp_feels_like,...,J+7_temp_feels_like,temp_meanJ+1_humidity,J+2_humidity,J+3_humidity,J+4_humidity,J+5_humidity,J+6_humidity,J+7_humidity,humidity_mean,J+1_humidity
20,124969,Marseille,43.2961743,5.3699525,12.89,78,4.04,5.33,4.53,3.88,...,8.68,,44,48,47,57,35,28,,62.0
14,121462,Annecy,45.8992348,6.1288847,12.82,54,0.82,-0.51,0.75,2.65,...,-0.26,,68,75,71,94,83,81,,88.0
31,26251971,Montauban,44.0175835,1.3549991,12.75,73,2.92,1.17,1.62,7.34,...,8.85,,68,76,80,68,97,60,,62.0
19,281762295,Cassis,43.2181778,5.553394005675274,12.7,83,4.93,6.92,5.7,5.06,...,9.48,,37,44,41,56,29,24,,61.0
16,281934488,Lyon,45.7578137,4.8320114,12.55,71,0.93,-1.12,0.94,4.56,...,3.01,,78,86,68,69,90,68,,61.0
27,129118,Collioure,42.52505,3.0831554,12.42,89,3.94,3.01,8.85,10.96,...,9.21,,47,50,62,76,96,74,,52.0
15,123256,Grenoble,45.1875602,5.7357819,12.34,55,0.89,-1.32,2.28,0.75,...,3.95,,82,81,83,92,83,84,,98.0
18,282091180,Bormes les Mimosas,43.1506968,6.3419285,12.12,89,10.55,7.17,8.05,6.66,...,9.3,,54,48,43,41,40,24,,55.0
28,126816,Carcassonne,43.2130358,2.3491069,11.54,83,1.9,-0.39,-1.02,5.56,...,9.42,,68,86,83,83,96,67,,66.0
21,126708,Aix en Provence,43.5298424,5.4474738,11.46,89,4.08,4.73,4.41,4.06,...,9.37,,41,41,35,54,31,24,,59.0


In [383]:
cities_infos_synthesis = cities_infos_detailled.loc[0:,["city"]]
cities_infos_synthesis

Unnamed: 0,city
0,Mont Saint Michel
1,St Malo
2,Bayeux
3,Le Havre
4,Rouen
5,Paris
6,Amiens
7,Lille
8,Strasbourg
9,Chateau du Haut Koenigsbourg


In [384]:
cities_infos_synthesis = pd.DataFrame(columns = ['city_id', 'city', 'latitude', 'longitude', 'temp_mean', 'humidity_mean'])
cities_infos_synthesis

Unnamed: 0,city_id,city,latitude,longitude,temp_mean,humidity_mean


In [385]:

for c in range(0,len(cities_infos_detailled)):
    cities_infos_synthesis = cities_infos_synthesis.append({"city_id":  cities_infos_detailled.loc[c,'city_id'],
                                                            "city": cities_infos_detailled.loc[c,'city'],
                                                            "latitude": cities_infos_detailled.loc[c,'latitude'],
                                                            "longitude": cities_infos_detailled.loc[c,'longitude'],
                                                            "temp_mean": cities_infos_detailled.loc[c,['current_temp_feels_like','J+1_temp_feels_like','J+2_temp_feels_like','J+3_temp_feels_like','J+4_temp_feels_like','J+5_temp_feels_like','J+6_temp_feels_like','J+7_temp_feels_like']].mean(),
                                                            "humidity_mean": cities_infos_detailled.loc[c,['current_humidity', 'J+1_humidity', 'J+2_humidity', 'J+3_humidity', 'J+4_humidity', 'J+5_humidity', 'J+6_humidity', 'J+7_humidity']].mean()
                                                           },
                                                           ignore_index=True)
    print(c, end=" ")
cities_infos_synthesis

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 

Unnamed: 0,city_id,city,latitude,longitude,temp_mean,humidity_mean
0,151486647,Mont Saint Michel,48.6359541,-1.511459954959514,5.60625,81.5
1,121999,St Malo,48.649518,-2.0260409,5.66375,78.125
2,3126290,Bayeux,49.2764624,-0.7024738,4.74,82.5
3,17564290,Le Havre,49.4938975,0.1079732,4.36375,79.25
4,281721777,Rouen,49.4404591,1.0939658,4.37125,77.75
5,281739181,Paris,48.8588897,2.3200410217200766,4.565,72.5
6,281746639,Amiens,49.8941708,2.2956951,2.7475,77.75
7,282028769,Lille,50.6365654,3.0635282,2.47625,75.5
8,121990,Strasbourg,48.584614,7.7507127,0.98375,73.0
9,117144990,Chateau du Haut Koenigsbourg,48.249489800000006,7.34429620253195,-1.5725,78.125


In [386]:


cities_infos_synthesis.sort_values(by=['temp_mean','humidity_mean'], ascending=False)

Unnamed: 0,city_id,city,latitude,longitude,temp_mean,humidity_mean
32,15759375,Biarritz,43.4832523,-1.5592776,9.04625,83.625
33,15548981,Bayonne,43.4933379,-1.475099,9.03125,84.375
18,282091180,Bormes les Mimosas,43.1506968,6.3419285,8.59375,49.25
34,123543,La Rochelle,46.1591126,-1.1520434,8.02375,77.375
27,129118,Collioure,42.52505,3.0831554,7.59125,68.25
19,281762295,Cassis,43.2181778,5.553394005675274,7.15,46.875
25,122788,Aigues Mortes,43.5658225,4.1912837,7.12625,53.875
31,26251971,Montauban,44.0175835,1.3549991,6.3825,73.0
20,124969,Marseille,43.2961743,5.3699525,6.27625,49.875
26,125804,Saintes Maries de la mer,43.4522771,4.4287172,6.20875,56.375


### and the winner are ... Bayonne / Biarritz / La Rochelle because I prefer south west ;-)

In [387]:
cities_infos_synthesis.columns

Index(['city_id', 'city', 'latitude', 'longitude', 'temp_mean',
       'humidity_mean'],
      dtype='object')

### let's create a custom city ID in case of needed later on


In [388]:
cities_infos_synthesis['city_id_df']=''

In [389]:
cities_infos_synthesis

Unnamed: 0,city_id,city,latitude,longitude,temp_mean,humidity_mean,city_id_df
0,151486647,Mont Saint Michel,48.6359541,-1.511459954959514,5.60625,81.5,
1,121999,St Malo,48.649518,-2.0260409,5.66375,78.125,
2,3126290,Bayeux,49.2764624,-0.7024738,4.74,82.5,
3,17564290,Le Havre,49.4938975,0.1079732,4.36375,79.25,
4,281721777,Rouen,49.4404591,1.0939658,4.37125,77.75,
5,281739181,Paris,48.8588897,2.3200410217200766,4.565,72.5,
6,281746639,Amiens,49.8941708,2.2956951,2.7475,77.75,
7,282028769,Lille,50.6365654,3.0635282,2.47625,75.5,
8,121990,Strasbourg,48.584614,7.7507127,0.98375,73.0,
9,117144990,Chateau du Haut Koenigsbourg,48.249489800000006,7.34429620253195,-1.5725,78.125,


In [390]:
cities_infos_synthesis.index.values

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34])

In [391]:
# permet de créer une colonne avec les index

cities_infos_synthesis.reset_index(inplace=True)
cities_infos_synthesis

Unnamed: 0,index,city_id,city,latitude,longitude,temp_mean,humidity_mean,city_id_df
0,0,151486647,Mont Saint Michel,48.6359541,-1.511459954959514,5.60625,81.5,
1,1,121999,St Malo,48.649518,-2.0260409,5.66375,78.125,
2,2,3126290,Bayeux,49.2764624,-0.7024738,4.74,82.5,
3,3,17564290,Le Havre,49.4938975,0.1079732,4.36375,79.25,
4,4,281721777,Rouen,49.4404591,1.0939658,4.37125,77.75,
5,5,281739181,Paris,48.8588897,2.3200410217200766,4.565,72.5,
6,6,281746639,Amiens,49.8941708,2.2956951,2.7475,77.75,
7,7,282028769,Lille,50.6365654,3.0635282,2.47625,75.5,
8,8,121990,Strasbourg,48.584614,7.7507127,0.98375,73.0,
9,9,117144990,Chateau du Haut Koenigsbourg,48.249489800000006,7.34429620253195,-1.5725,78.125,


### CUSTOM CITY ID - contenation

In [392]:
type(cities_infos_synthesis)
cities_infos_synthesis['city_id_df'] = 'city_'+ cities_infos_synthesis["index"].astype(str)
# cities_infos_synthesis = cities_infos_synthesis[['city_id', 'city', 'latitude', 'longitude', 'temp_mean', 'humidity_mean']]
cities_infos_synthesis

Unnamed: 0,index,city_id,city,latitude,longitude,temp_mean,humidity_mean,city_id_df
0,0,151486647,Mont Saint Michel,48.6359541,-1.511459954959514,5.60625,81.5,city_0
1,1,121999,St Malo,48.649518,-2.0260409,5.66375,78.125,city_1
2,2,3126290,Bayeux,49.2764624,-0.7024738,4.74,82.5,city_2
3,3,17564290,Le Havre,49.4938975,0.1079732,4.36375,79.25,city_3
4,4,281721777,Rouen,49.4404591,1.0939658,4.37125,77.75,city_4
5,5,281739181,Paris,48.8588897,2.3200410217200766,4.565,72.5,city_5
6,6,281746639,Amiens,49.8941708,2.2956951,2.7475,77.75,city_6
7,7,282028769,Lille,50.6365654,3.0635282,2.47625,75.5,city_7
8,8,121990,Strasbourg,48.584614,7.7507127,0.98375,73.0,city_8
9,9,117144990,Chateau du Haut Koenigsbourg,48.249489800000006,7.34429620253195,-1.5725,78.125,city_9


In [393]:
type(cities_infos_synthesis)

pandas.core.frame.DataFrame

In [394]:
cities_infos_synthesis.columns = ['index_pa', 'city_id', 'city', 'latitude', 'longitude', 'temp_mean', 'humidity_mean', 'city_id_df']

In [395]:
cities_infos_synthesis

Unnamed: 0,index_pa,city_id,city,latitude,longitude,temp_mean,humidity_mean,city_id_df
0,0,151486647,Mont Saint Michel,48.6359541,-1.511459954959514,5.60625,81.5,city_0
1,1,121999,St Malo,48.649518,-2.0260409,5.66375,78.125,city_1
2,2,3126290,Bayeux,49.2764624,-0.7024738,4.74,82.5,city_2
3,3,17564290,Le Havre,49.4938975,0.1079732,4.36375,79.25,city_3
4,4,281721777,Rouen,49.4404591,1.0939658,4.37125,77.75,city_4
5,5,281739181,Paris,48.8588897,2.3200410217200766,4.565,72.5,city_5
6,6,281746639,Amiens,49.8941708,2.2956951,2.7475,77.75,city_6
7,7,282028769,Lille,50.6365654,3.0635282,2.47625,75.5,city_7
8,8,121990,Strasbourg,48.584614,7.7507127,0.98375,73.0,city_8
9,9,117144990,Chateau du Haut Koenigsbourg,48.249489800000006,7.34429620253195,-1.5725,78.125,city_9


In [396]:
# cities_infos_synthesis.pop('index_pa')
cities_infos_synthesis

Unnamed: 0,index_pa,city_id,city,latitude,longitude,temp_mean,humidity_mean,city_id_df
0,0,151486647,Mont Saint Michel,48.6359541,-1.511459954959514,5.60625,81.5,city_0
1,1,121999,St Malo,48.649518,-2.0260409,5.66375,78.125,city_1
2,2,3126290,Bayeux,49.2764624,-0.7024738,4.74,82.5,city_2
3,3,17564290,Le Havre,49.4938975,0.1079732,4.36375,79.25,city_3
4,4,281721777,Rouen,49.4404591,1.0939658,4.37125,77.75,city_4
5,5,281739181,Paris,48.8588897,2.3200410217200766,4.565,72.5,city_5
6,6,281746639,Amiens,49.8941708,2.2956951,2.7475,77.75,city_6
7,7,282028769,Lille,50.6365654,3.0635282,2.47625,75.5,city_7
8,8,121990,Strasbourg,48.584614,7.7507127,0.98375,73.0,city_8
9,9,117144990,Chateau du Haut Koenigsbourg,48.249489800000006,7.34429620253195,-1.5725,78.125,city_9


In [397]:
cities_infos_synthesis = cities_infos_synthesis[['city_id', 'city_id_df', 'city', 'latitude', 'longitude', 'temp_mean', 'humidity_mean']]
cities_infos_synthesis

Unnamed: 0,city_id,city_id_df,city,latitude,longitude,temp_mean,humidity_mean
0,151486647,city_0,Mont Saint Michel,48.6359541,-1.511459954959514,5.60625,81.5
1,121999,city_1,St Malo,48.649518,-2.0260409,5.66375,78.125
2,3126290,city_2,Bayeux,49.2764624,-0.7024738,4.74,82.5
3,17564290,city_3,Le Havre,49.4938975,0.1079732,4.36375,79.25
4,281721777,city_4,Rouen,49.4404591,1.0939658,4.37125,77.75
5,281739181,city_5,Paris,48.8588897,2.3200410217200766,4.565,72.5
6,281746639,city_6,Amiens,49.8941708,2.2956951,2.7475,77.75
7,282028769,city_7,Lille,50.6365654,3.0635282,2.47625,75.5
8,121990,city_8,Strasbourg,48.584614,7.7507127,0.98375,73.0
9,117144990,city_9,Chateau du Haut Koenigsbourg,48.249489800000006,7.34429620253195,-1.5725,78.125


## BOOKING scrapy !!

In [398]:
!pip install Scrapy

Collecting Scrapy
  Downloading Scrapy-2.5.1-py2.py3-none-any.whl (254 kB)
[K     |████████████████████████████████| 254 kB 10.2 MB/s eta 0:00:01
[?25hCollecting itemadapter>=0.1.0
  Using cached itemadapter-0.4.0-py3-none-any.whl (10 kB)
Collecting parsel>=1.5.0
  Using cached parsel-1.6.0-py2.py3-none-any.whl (13 kB)
Collecting itemloaders>=1.0.1
  Using cached itemloaders-1.0.4-py3-none-any.whl (11 kB)
Processing /home/jovyan/.cache/pip/wheels/d1/d7/61/11b5b370ee487d38b5408ecb7e0257db9107fa622412cbe2ff/PyDispatcher-2.0.5-py3-none-any.whl
Collecting service-identity>=16.0.0
  Using cached service_identity-21.1.0-py2.py3-none-any.whl (12 kB)
Collecting cssselect>=0.9.1
  Using cached cssselect-1.1.0-py2.py3-none-any.whl (16 kB)
Collecting Twisted[http2]>=17.9.0
  Using cached Twisted-21.7.0-py3-none-any.whl (3.1 MB)
Collecting w3lib>=1.17.0
  Using cached w3lib-1.22.0-py2.py3-none-any.whl (20 kB)
Collecting h2<4.0,>=3.0
  Using cached h2-3.2.0-py2.py3-none-any.whl (65 kB)
Collecting

In [399]:
import os 
import logging
import scrapy
from scrapy.crawler import CrawlerProcess

### TARGET REMINDER Scrape Booking.com

Since BookingHoldings doesn't have aggregated databases, it will be much faster to scrape data directly from booking.com

You can scrap as many information asyou want, but we suggest that you get at least:

hotel name,
Url to its booking.com page,
Its coordinates: latitude and longitude
Score given by the website users
Text description of the hotel

In [None]:
class BookingSpider(scrapy.Spider): 
  
    name = "booking"
    cities_urls = []
    for i in range (0, len(amended_city_list)):
        cities_urls.append('https://www.booking.com/searchresults.fr.html?order=class&nflt=distance%3D1000%3B&ss={}'.format(amended_city_list[i]))

        start_urls = cities_urls

    def parse(self, response):
        print(response.url)
        hotels = response.css('div.sr_property_block_main_row') 
        for hotel in hotels:
            yield {
                'hotel_city_extract': hotel.css('a.bui-link::text').get(),
                'hotel_url' : hotel.css('a::attr(href)').get(),
                'hotel_name': hotel.css('a span.sr-hotel__name::text').get(),
                'hotel_coordinates_lon_lat' : hotel.css('a::attr(data-coords)').get(),
                'hotel_score' : hotel.css('div.bui-review-score__badge::text').get(),
                'hotel_description' : hotel.css('div.hotel_desc::text').get()
            }

In [None]:
filename = "booking_test.json"
if filename in os.listdir('json_booking/'):
        os.remove('json_booking/' + filename)

process = CrawlerProcess(settings = {
    'USER_AGENT': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36',
    'LOG_LEVEL': logging.INFO,
    "FEEDS": {
        'json_booking/' + filename : {"format": "json"},
    },
    "AUTOTHROTTLE_ENABLED": True,
    'CONCURRENT_REQUESTS': 5
})

process.crawl(BookingSpider)
process.start()

### comment

The results of my booking scrapping are not directly visible HERE as booking changed his website since intial scapping

![image.png](attachment:ad5de047-4cdb-4a39-9454-ee4578b34236.png)

### REPROCESS from scrap result json file

REIMPORT library and rerun cleaning from json in order to avoid to scrap again as booking websit & balises changed from initial scrap

In [401]:
# read json output from scrap

df = pd.read_json('booking_test.json')

# Cleanup data collected

df['hotel_city_extract']=df['hotel_city_extract'].str.strip()
df['hotel_name']=df['hotel_name'].str.strip()
df['hotel_name']=df['hotel_name'].replace('\n','')
df['hotel_url'] = 'https://www.booking.com/'+df['hotel_url'].str.strip()
df['hotel_description'] = df['hotel_description'].str.strip()
df

Unnamed: 0,hotel_city_extract,hotel_url,hotel_name,hotel_coordinates_lon_lat,hotel_score,hotel_description
0,Le Mont-Saint-Michel,https://www.booking.com//hotel/fr/les-terrasse...,Les Terrasses Poulard,"-1.51037871837616,48.6353494256412",73,Occupant 2 bâtiments différents au cœur du Mon...
1,Le Mont-Saint-Michel,https://www.booking.com//hotel/fr/le-mouton-bl...,Le Mouton Blanc,"-1.50989592075348,48.6360229844471",72,"Situé au pied de l'abbaye, le Mouton Blanc Hot..."
2,Le Mont-Saint-Michel,https://www.booking.com//hotel/fr/ha-el-la-cro...,Hôtel la Croix Blanche,"-1.50986105203629,48.6357340642713",76,Installé au cœur du village médiéval du Mont-S...
3,Le Mont-Saint-Michel,https://www.booking.com//hotel/fr/auberge-sain...,Auberge Saint Pierre,"-1.5098825097084,48.6356879786914",81,L'Auberge Saint-Pierre occupe une maison à col...
4,Le Mont-Saint-Michel,https://www.booking.com//hotel/fr/la-mere-poul...,La Mère Poulard,"-1.51053965091705,48.635085317234",73,"Occupant un bâtiment historique, l'hôtel La Mè..."
...,...,...,...,...,...,...
829,"Centre-ville de La Rochelle, La Rochelle",https://www.booking.com//hotel/fr/pierre-vacan...,Résidence Pierre & Vacances Centre,"-1.14639222621918,46.1557694290086",75,La Résidence Pierre & Vacances Centre propose ...
830,"Centre-ville de La Rochelle, La Rochelle",https://www.booking.com//hotel/fr/comforthotel...,Hôtel Saint Nicolas,"-1.14891277107745,46.1581345696256",86,L'Hôtel Saint Nicolas dispose d'une connexion ...
831,"Centre-ville de La Rochelle, La Rochelle",https://www.booking.com//hotel/fr/le-yachtman....,Hôtel Le Yachtman,"-1.14996761083603,46.1572836509306",76,"Situé à La Rochelle, l'Hôtel Le Yachtman possè..."
832,La Rochelle,https://www.booking.com//hotel/fr/le-manoir-la...,Le Manoir Hôtel,"-1.16061061620712,46.1627047811366",86,Occupant un bâtiment du XIXe siècle à La Roche...


In [402]:
# clean up hotel GPS coordinates

df_coor = df['hotel_coordinates_lon_lat']

df['longitude']=''
df['latitude']=''

for y in range(0,len(df_coor)):
    df['longitude'][y] = (df['hotel_coordinates_lon_lat'].str.split(',')[y][0])
    df['latitude'][y] = (df['hotel_coordinates_lon_lat'].str.split(',')[y][1])
    
df    

Unnamed: 0,hotel_city_extract,hotel_url,hotel_name,hotel_coordinates_lon_lat,hotel_score,hotel_description,longitude,latitude
0,Le Mont-Saint-Michel,https://www.booking.com//hotel/fr/les-terrasse...,Les Terrasses Poulard,"-1.51037871837616,48.6353494256412",73,Occupant 2 bâtiments différents au cœur du Mon...,-1.51037871837616,48.6353494256412
1,Le Mont-Saint-Michel,https://www.booking.com//hotel/fr/le-mouton-bl...,Le Mouton Blanc,"-1.50989592075348,48.6360229844471",72,"Situé au pied de l'abbaye, le Mouton Blanc Hot...",-1.50989592075348,48.6360229844471
2,Le Mont-Saint-Michel,https://www.booking.com//hotel/fr/ha-el-la-cro...,Hôtel la Croix Blanche,"-1.50986105203629,48.6357340642713",76,Installé au cœur du village médiéval du Mont-S...,-1.50986105203629,48.6357340642713
3,Le Mont-Saint-Michel,https://www.booking.com//hotel/fr/auberge-sain...,Auberge Saint Pierre,"-1.5098825097084,48.6356879786914",81,L'Auberge Saint-Pierre occupe une maison à col...,-1.5098825097084,48.6356879786914
4,Le Mont-Saint-Michel,https://www.booking.com//hotel/fr/la-mere-poul...,La Mère Poulard,"-1.51053965091705,48.635085317234",73,"Occupant un bâtiment historique, l'hôtel La Mè...",-1.51053965091705,48.635085317234
...,...,...,...,...,...,...,...,...
829,"Centre-ville de La Rochelle, La Rochelle",https://www.booking.com//hotel/fr/pierre-vacan...,Résidence Pierre & Vacances Centre,"-1.14639222621918,46.1557694290086",75,La Résidence Pierre & Vacances Centre propose ...,-1.14639222621918,46.1557694290086
830,"Centre-ville de La Rochelle, La Rochelle",https://www.booking.com//hotel/fr/comforthotel...,Hôtel Saint Nicolas,"-1.14891277107745,46.1581345696256",86,L'Hôtel Saint Nicolas dispose d'une connexion ...,-1.14891277107745,46.1581345696256
831,"Centre-ville de La Rochelle, La Rochelle",https://www.booking.com//hotel/fr/le-yachtman....,Hôtel Le Yachtman,"-1.14996761083603,46.1572836509306",76,"Situé à La Rochelle, l'Hôtel Le Yachtman possè...",-1.14996761083603,46.1572836509306
832,La Rochelle,https://www.booking.com//hotel/fr/le-manoir-la...,Le Manoir Hôtel,"-1.16061061620712,46.1627047811366",86,Occupant un bâtiment du XIXe siècle à La Roche...,-1.16061061620712,46.1627047811366


In [403]:
# reorder columns

df= df[['hotel_city_extract','hotel_name', 'hotel_score', 'hotel_url', 'longitude', 'latitude', 'hotel_description']]
df.columns

Index(['hotel_city_extract', 'hotel_name', 'hotel_score', 'hotel_url',
       'longitude', 'latitude', 'hotel_description'],
      dtype='object')

## Check city name list from booking scrap

In [404]:
x = list(df['hotel_city_extract'])
hotel_city_extract_list = set(x)
hotel_city_extract_list

{'1er arr., Lyon',
 '1er arr., Paris',
 '2e arr., Lyon',
 '2e arr., Paris',
 '3e arr., Lyon',
 '3e arr., Paris',
 '4e arr., Paris',
 '5e arr., Lyon',
 '5e arr., Paris',
 '6e arr., Paris',
 'Aigues-Mortes',
 'Aigues-Mortes Medieval City, Aigues-Mortes',
 'Aiguines',
 'Aix-en-Provence',
 'Allemagne-en-Provence',
 'Amiens',
 'Annecy',
 'Ascou',
 'Aston',
 'Aulos',
 'Aulus-les-Bains',
 'Avignon',
 'Ax-les-Thermes',
 'Bauduen',
 'Bayeux',
 'Bayonne',
 'Besancon Old Town, Besançon',
 'Besançon',
 'Biarritz',
 'Bormes-les-Mimosas',
 'Bourse-Esplanade, Strasbourg',
 'Carcassonne',
 'Cassis',
 'Castellane',
 'Centre de Biarritz, Biarritz',
 'Centre de Lille, Lille',
 'Centre de Strasbourg - Petite France - Cathédrale, Strasbourg',
 'Centre de Toulouse, Toulouse',
 "Centre historique d'Aix-en-Provence, Aix-en-Provence",
 "Centre-ville d'Annecy, Annecy",
 "Centre-ville d'Avignon, Avignon",
 'Centre-ville de Colmar, Colmar',
 'Centre-ville de Dijon, Dijon',
 'Centre-ville de La Rochelle, La Rochel

### Flag city name list with issues

In [405]:
df['city_name_test'] = ''

for x in range(0, len(df['hotel_city_extract'])):
    if df['hotel_city_extract'][x].find(',')>0:
        df['city_name_test'][x] = 'pb'
    else:
        df['city_name_test'][x]='ok'
    print(x, end=(' '))



0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 27

In [406]:
# check

df

Unnamed: 0,hotel_city_extract,hotel_name,hotel_score,hotel_url,longitude,latitude,hotel_description,city_name_test
0,Le Mont-Saint-Michel,Les Terrasses Poulard,73,https://www.booking.com//hotel/fr/les-terrasse...,-1.51037871837616,48.6353494256412,Occupant 2 bâtiments différents au cœur du Mon...,ok
1,Le Mont-Saint-Michel,Le Mouton Blanc,72,https://www.booking.com//hotel/fr/le-mouton-bl...,-1.50989592075348,48.6360229844471,"Situé au pied de l'abbaye, le Mouton Blanc Hot...",ok
2,Le Mont-Saint-Michel,Hôtel la Croix Blanche,76,https://www.booking.com//hotel/fr/ha-el-la-cro...,-1.50986105203629,48.6357340642713,Installé au cœur du village médiéval du Mont-S...,ok
3,Le Mont-Saint-Michel,Auberge Saint Pierre,81,https://www.booking.com//hotel/fr/auberge-sain...,-1.5098825097084,48.6356879786914,L'Auberge Saint-Pierre occupe une maison à col...,ok
4,Le Mont-Saint-Michel,La Mère Poulard,73,https://www.booking.com//hotel/fr/la-mere-poul...,-1.51053965091705,48.635085317234,"Occupant un bâtiment historique, l'hôtel La Mè...",ok
...,...,...,...,...,...,...,...,...
829,"Centre-ville de La Rochelle, La Rochelle",Résidence Pierre & Vacances Centre,75,https://www.booking.com//hotel/fr/pierre-vacan...,-1.14639222621918,46.1557694290086,La Résidence Pierre & Vacances Centre propose ...,pb
830,"Centre-ville de La Rochelle, La Rochelle",Hôtel Saint Nicolas,86,https://www.booking.com//hotel/fr/comforthotel...,-1.14891277107745,46.1581345696256,L'Hôtel Saint Nicolas dispose d'une connexion ...,pb
831,"Centre-ville de La Rochelle, La Rochelle",Hôtel Le Yachtman,76,https://www.booking.com//hotel/fr/le-yachtman....,-1.14996761083603,46.1572836509306,"Situé à La Rochelle, l'Hôtel Le Yachtman possè...",pb
832,La Rochelle,Le Manoir Hôtel,86,https://www.booking.com//hotel/fr/le-manoir-la...,-1.16061061620712,46.1627047811366,Occupant un bâtiment du XIXe siècle à La Roche...,ok


In [407]:
# size
df.groupby('city_name_test').count()

Unnamed: 0_level_0,hotel_city_extract,hotel_name,hotel_score,hotel_url,longitude,latitude,hotel_description
city_name_test,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ok,439,439,366,439,439,439,439
pb,395,395,365,395,395,395,395


In [408]:
# Correct city name

df['city_name']=''

for x in range(0, len(df['hotel_city_extract'])):
    if df['city_name_test'][x] == 'pb':
        df['city_name'][x] = df['hotel_city_extract'].str.split(',')[x][1]
    else:
        df['city_name'][x] = df['hotel_city_extract'].str.split(',')[x][0]
    print(x, end=(' '))

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 27

In [409]:
# Check

df

Unnamed: 0,hotel_city_extract,hotel_name,hotel_score,hotel_url,longitude,latitude,hotel_description,city_name_test,city_name
0,Le Mont-Saint-Michel,Les Terrasses Poulard,73,https://www.booking.com//hotel/fr/les-terrasse...,-1.51037871837616,48.6353494256412,Occupant 2 bâtiments différents au cœur du Mon...,ok,Le Mont-Saint-Michel
1,Le Mont-Saint-Michel,Le Mouton Blanc,72,https://www.booking.com//hotel/fr/le-mouton-bl...,-1.50989592075348,48.6360229844471,"Situé au pied de l'abbaye, le Mouton Blanc Hot...",ok,Le Mont-Saint-Michel
2,Le Mont-Saint-Michel,Hôtel la Croix Blanche,76,https://www.booking.com//hotel/fr/ha-el-la-cro...,-1.50986105203629,48.6357340642713,Installé au cœur du village médiéval du Mont-S...,ok,Le Mont-Saint-Michel
3,Le Mont-Saint-Michel,Auberge Saint Pierre,81,https://www.booking.com//hotel/fr/auberge-sain...,-1.5098825097084,48.6356879786914,L'Auberge Saint-Pierre occupe une maison à col...,ok,Le Mont-Saint-Michel
4,Le Mont-Saint-Michel,La Mère Poulard,73,https://www.booking.com//hotel/fr/la-mere-poul...,-1.51053965091705,48.635085317234,"Occupant un bâtiment historique, l'hôtel La Mè...",ok,Le Mont-Saint-Michel
...,...,...,...,...,...,...,...,...,...
829,"Centre-ville de La Rochelle, La Rochelle",Résidence Pierre & Vacances Centre,75,https://www.booking.com//hotel/fr/pierre-vacan...,-1.14639222621918,46.1557694290086,La Résidence Pierre & Vacances Centre propose ...,pb,La Rochelle
830,"Centre-ville de La Rochelle, La Rochelle",Hôtel Saint Nicolas,86,https://www.booking.com//hotel/fr/comforthotel...,-1.14891277107745,46.1581345696256,L'Hôtel Saint Nicolas dispose d'une connexion ...,pb,La Rochelle
831,"Centre-ville de La Rochelle, La Rochelle",Hôtel Le Yachtman,76,https://www.booking.com//hotel/fr/le-yachtman....,-1.14996761083603,46.1572836509306,"Situé à La Rochelle, l'Hôtel Le Yachtman possè...",pb,La Rochelle
832,La Rochelle,Le Manoir Hôtel,86,https://www.booking.com//hotel/fr/le-manoir-la...,-1.16061061620712,46.1627047811366,Occupant un bâtiment du XIXe siècle à La Roche...,ok,La Rochelle


In [410]:
df.columns

Index(['hotel_city_extract', 'hotel_name', 'hotel_score', 'hotel_url',
       'longitude', 'latitude', 'hotel_description', 'city_name_test',
       'city_name'],
      dtype='object')

In [411]:
df = df[['city_name','hotel_city_extract','city_name_test','hotel_name', 'hotel_score', 'hotel_url', 'longitude', 'latitude', 'hotel_description']]
df

Unnamed: 0,city_name,hotel_city_extract,city_name_test,hotel_name,hotel_score,hotel_url,longitude,latitude,hotel_description
0,Le Mont-Saint-Michel,Le Mont-Saint-Michel,ok,Les Terrasses Poulard,73,https://www.booking.com//hotel/fr/les-terrasse...,-1.51037871837616,48.6353494256412,Occupant 2 bâtiments différents au cœur du Mon...
1,Le Mont-Saint-Michel,Le Mont-Saint-Michel,ok,Le Mouton Blanc,72,https://www.booking.com//hotel/fr/le-mouton-bl...,-1.50989592075348,48.6360229844471,"Situé au pied de l'abbaye, le Mouton Blanc Hot..."
2,Le Mont-Saint-Michel,Le Mont-Saint-Michel,ok,Hôtel la Croix Blanche,76,https://www.booking.com//hotel/fr/ha-el-la-cro...,-1.50986105203629,48.6357340642713,Installé au cœur du village médiéval du Mont-S...
3,Le Mont-Saint-Michel,Le Mont-Saint-Michel,ok,Auberge Saint Pierre,81,https://www.booking.com//hotel/fr/auberge-sain...,-1.5098825097084,48.6356879786914,L'Auberge Saint-Pierre occupe une maison à col...
4,Le Mont-Saint-Michel,Le Mont-Saint-Michel,ok,La Mère Poulard,73,https://www.booking.com//hotel/fr/la-mere-poul...,-1.51053965091705,48.635085317234,"Occupant un bâtiment historique, l'hôtel La Mè..."
...,...,...,...,...,...,...,...,...,...
829,La Rochelle,"Centre-ville de La Rochelle, La Rochelle",pb,Résidence Pierre & Vacances Centre,75,https://www.booking.com//hotel/fr/pierre-vacan...,-1.14639222621918,46.1557694290086,La Résidence Pierre & Vacances Centre propose ...
830,La Rochelle,"Centre-ville de La Rochelle, La Rochelle",pb,Hôtel Saint Nicolas,86,https://www.booking.com//hotel/fr/comforthotel...,-1.14891277107745,46.1581345696256,L'Hôtel Saint Nicolas dispose d'une connexion ...
831,La Rochelle,"Centre-ville de La Rochelle, La Rochelle",pb,Hôtel Le Yachtman,76,https://www.booking.com//hotel/fr/le-yachtman....,-1.14996761083603,46.1572836509306,"Situé à La Rochelle, l'Hôtel Le Yachtman possè..."
832,La Rochelle,La Rochelle,ok,Le Manoir Hôtel,86,https://www.booking.com//hotel/fr/le-manoir-la...,-1.16061061620712,46.1627047811366,Occupant un bâtiment du XIXe siècle à La Roche...


In [412]:
df['city_name'] = (df['city_name']).str.lstrip()

In [413]:
df['city_name'] = (df['city_name']).str.lstrip()
df['city_name'].sort_values()
df.groupby('city_name').count()

Unnamed: 0_level_0,hotel_city_extract,city_name_test,hotel_name,hotel_score,hotel_url,longitude,latitude,hotel_description
city_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
Aigues-Mortes,25,25,25,19,25,25,25,25
Aiguines,1,1,1,1,1,1,1,1
Aix-en-Provence,25,25,25,19,25,25,25,25
Allemagne-en-Provence,1,1,1,1,1,1,1,1
Amiens,25,25,25,25,25,25,25,25
...,...,...,...,...,...,...,...,...
Tignac,1,1,1,1,1,1,1,1
Toulouse,25,25,25,25,25,25,25,25
Uzès,25,25,25,21,25,25,25,25
Varilhes,1,1,1,0,1,1,1,1


In [414]:
x = list(df['city_name'])
city_name_list = set(x)
city_name_list

{'Aigues-Mortes',
 'Aiguines',
 'Aix-en-Provence',
 'Allemagne-en-Provence',
 'Amiens',
 'Annecy',
 'Ascou',
 'Aston',
 'Aulos',
 'Aulus-les-Bains',
 'Avignon',
 'Ax-les-Thermes',
 'Bauduen',
 'Bayeux',
 'Bayonne',
 'Besançon',
 'Biarritz',
 'Bormes-les-Mimosas',
 'Carcassonne',
 'Cassis',
 'Castellane',
 'Collioure',
 'Colmar',
 'Dijon',
 'Eguisheim',
 'Fougax-et-Barrineuf',
 'Grenoble',
 'Gréoux-les-Bains',
 'Ignaux',
 'La Martre',
 'La Palud-sur-Verdon',
 'La Rochelle',
 'Lasserre',
 'Le Havre',
 'Le Mont-Saint-Michel',
 'Les Cabannes',
 'Les Saintes-Maries-de-la-Mer',
 'Lille',
 'Lorp Sentaraille',
 'Lyon',
 'Léran',
 'Marseille',
 'Mirepoix',
 'Moissac-Bellevue',
 'Montagnac',
 'Montauban',
 'Moustiers-Sainte-Marie',
 'Nîmes',
 'Orschwiller',
 'Paris',
 'Riez',
 'Rouen',
 'Saint-Girons',
 'Saint-Malo',
 'Saint-Martin-de-Brômes',
 'Sainte-Croix-de-Verdon',
 'Saverdun',
 'Strasbourg',
 'Tarascon-sur-Ariège',
 'Tignac',
 'Toulouse',
 'Uzès',
 'Varilhes',
 'Vicdessos'}

In [415]:
# A voir manque colmar/les saintes maries de la mar

filter_pa = df['hotel_city_extract']=='Les+Saintes+Maries+'

# df['city_name'].loc[filter_pa,:]

df[filter_pa]

Unnamed: 0,city_name,hotel_city_extract,city_name_test,hotel_name,hotel_score,hotel_url,longitude,latitude,hotel_description


In [416]:
import pandas as pd

In [417]:
dfw = pd.read_csv('Jedha_1_KAYAC_cities_infos_synthesis.csv')

In [418]:
dfw

Unnamed: 0,city_id,city_id_df,city,latitude,longitude,temp_mean,humidity_mean
0,151486647,city_0,Mont Saint Michel,48.635954,-1.51146,5.285,77.125
1,257985771,city_0,St Malo,48.649518,-2.026041,17.38625,67.875
2,257654882,city_1,Bayeux,49.276462,-0.702474,18.3175,54.375
3,256418097,city_2,Le Havre,49.493897,0.107973,15.5,72.125
4,303984676,city_3,Rouen,49.440459,1.093966,19.27125,52.5
5,111607,city_4,Paris,48.856697,2.351462,19.84875,50.5
6,259023929,city_5,Amiens,49.894171,2.295695,19.0925,55.25
7,256373580,city_6,Lille,50.636565,3.063528,19.46125,52.25
8,258573835,city_7,Strasbourg,48.584614,7.750713,19.01,56.25
9,106552831,city_8,Chateau du Haut Koenigsbourg,48.24949,7.344296,16.39125,56.25


In [419]:
x = list(dfw['city'])
city_name_list = set(x)
city_name_list

{'Aigues Mortes',
 'Aix en Provence',
 'Amiens',
 'Annecy',
 'Ariege',
 'Avignon',
 'Bayeux',
 'Bayonne',
 'Besancon',
 'Biarritz',
 'Bormes les Mimosas',
 'Carcassonne',
 'Cassis',
 'Chateau du Haut Koenigsbourg',
 'Collioure',
 'Colmar',
 'Dijon',
 'Eguisheim',
 'Gorges du Verdon',
 'Grenoble',
 'La Rochelle',
 'Le Havre',
 'Lille',
 'Lyon',
 'Marseille',
 'Mont Saint Michel',
 'Montauban',
 'Nimes',
 'Paris',
 'Rouen',
 'Saintes Maries de la mer',
 'St Malo',
 'Strasbourg',
 'Toulouse',
 'Uzes'}

In [420]:
cmu = pd.read_csv('city_mapping_used.csv')

In [421]:
cmu

Unnamed: 0,kayac_city_name_list,dfw_city_mapped
0,Aigues-Mortes,Aigues Mortes
1,Aiguines,Gorges du Verdon
2,Aix-en-Provence,Aix en Provence
3,Allemagne-en-Provence,Gorges du Verdon
4,Amiens,Amiens
...,...,...
59,Tignac,Ariege
60,Toulouse,Toulouse
61,Uzès,Uzes
62,Varilhes,Ariege


In [422]:
dfw_mapped = pd.merge(dfw, cmu, left_on= "city", right_on= "dfw_city_mapped",how='outer')

In [423]:
dfw_mapped

Unnamed: 0,city_id,city_id_df,city,latitude,longitude,temp_mean,humidity_mean,kayac_city_name_list,dfw_city_mapped
0,151486647,city_0,Mont Saint Michel,48.635954,-1.511460,5.28500,77.125,Le Mont-Saint-Michel,Mont Saint Michel
1,257985771,city_0,St Malo,48.649518,-2.026041,17.38625,67.875,Saint-Malo,St Malo
2,257654882,city_1,Bayeux,49.276462,-0.702474,18.31750,54.375,Bayeux,Bayeux
3,256418097,city_2,Le Havre,49.493897,0.107973,15.50000,72.125,Le Havre,Le Havre
4,303984676,city_3,Rouen,49.440459,1.093966,19.27125,52.500,Rouen,Rouen
...,...,...,...,...,...,...,...,...,...
59,256315994,city_29,Toulouse,43.604462,1.444247,22.33000,48.875,Toulouse,Toulouse
60,258326981,city_30,Montauban,44.017584,1.354999,22.03125,51.750,Montauban,Montauban
61,259303061,city_31,Biarritz,43.471144,-1.552727,20.40500,66.500,Biarritz,Biarritz
62,258745823,city_32,Bayonne,43.493338,-1.475099,21.07875,61.875,Bayonne,Bayonne


In [424]:
df_final = pd.merge(df, dfw_mapped, left_on= "city_name", right_on= "kayac_city_name_list",how='outer')

In [425]:
df_final

Unnamed: 0,city_name,hotel_city_extract,city_name_test,hotel_name,hotel_score,hotel_url,longitude_x,latitude_x,hotel_description,city_id,city_id_df,city,latitude_y,longitude_y,temp_mean,humidity_mean,kayac_city_name_list,dfw_city_mapped
0,Le Mont-Saint-Michel,Le Mont-Saint-Michel,ok,Les Terrasses Poulard,73,https://www.booking.com//hotel/fr/les-terrasse...,-1.51037871837616,48.6353494256412,Occupant 2 bâtiments différents au cœur du Mon...,151486647,city_0,Mont Saint Michel,48.635954,-1.511460,5.28500,77.125,Le Mont-Saint-Michel,Mont Saint Michel
1,Le Mont-Saint-Michel,Le Mont-Saint-Michel,ok,Le Mouton Blanc,72,https://www.booking.com//hotel/fr/le-mouton-bl...,-1.50989592075348,48.6360229844471,"Situé au pied de l'abbaye, le Mouton Blanc Hot...",151486647,city_0,Mont Saint Michel,48.635954,-1.511460,5.28500,77.125,Le Mont-Saint-Michel,Mont Saint Michel
2,Le Mont-Saint-Michel,Le Mont-Saint-Michel,ok,Hôtel la Croix Blanche,76,https://www.booking.com//hotel/fr/ha-el-la-cro...,-1.50986105203629,48.6357340642713,Installé au cœur du village médiéval du Mont-S...,151486647,city_0,Mont Saint Michel,48.635954,-1.511460,5.28500,77.125,Le Mont-Saint-Michel,Mont Saint Michel
3,Le Mont-Saint-Michel,Le Mont-Saint-Michel,ok,Auberge Saint Pierre,81,https://www.booking.com//hotel/fr/auberge-sain...,-1.5098825097084,48.6356879786914,L'Auberge Saint-Pierre occupe une maison à col...,151486647,city_0,Mont Saint Michel,48.635954,-1.511460,5.28500,77.125,Le Mont-Saint-Michel,Mont Saint Michel
4,Le Mont-Saint-Michel,Le Mont-Saint-Michel,ok,La Mère Poulard,73,https://www.booking.com//hotel/fr/la-mere-poul...,-1.51053965091705,48.635085317234,"Occupant un bâtiment historique, l'hôtel La Mè...",151486647,city_0,Mont Saint Michel,48.635954,-1.511460,5.28500,77.125,Le Mont-Saint-Michel,Mont Saint Michel
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
829,La Rochelle,"Centre-ville de La Rochelle, La Rochelle",pb,Résidence Pierre & Vacances Centre,75,https://www.booking.com//hotel/fr/pierre-vacan...,-1.14639222621918,46.1557694290086,La Résidence Pierre & Vacances Centre propose ...,258418538,city_33,La Rochelle,46.159113,-1.152043,19.67625,56.250,La Rochelle,La Rochelle
830,La Rochelle,"Centre-ville de La Rochelle, La Rochelle",pb,Hôtel Saint Nicolas,86,https://www.booking.com//hotel/fr/comforthotel...,-1.14891277107745,46.1581345696256,L'Hôtel Saint Nicolas dispose d'une connexion ...,258418538,city_33,La Rochelle,46.159113,-1.152043,19.67625,56.250,La Rochelle,La Rochelle
831,La Rochelle,"Centre-ville de La Rochelle, La Rochelle",pb,Hôtel Le Yachtman,76,https://www.booking.com//hotel/fr/le-yachtman....,-1.14996761083603,46.1572836509306,"Situé à La Rochelle, l'Hôtel Le Yachtman possè...",258418538,city_33,La Rochelle,46.159113,-1.152043,19.67625,56.250,La Rochelle,La Rochelle
832,La Rochelle,La Rochelle,ok,Le Manoir Hôtel,86,https://www.booking.com//hotel/fr/le-manoir-la...,-1.16061061620712,46.1627047811366,Occupant un bâtiment du XIXe siècle à La Roche...,258418538,city_33,La Rochelle,46.159113,-1.152043,19.67625,56.250,La Rochelle,La Rochelle


### filtering on top 3 cities selected from weather dataset

In [426]:
df_final_filtered = df_final[
    (df_final['city_name']=='Bayonne') | 
    (df_final['city_name']=='Biarritz') | 
    (df_final['city_name']=='La Rochelle')
]

In [427]:
df_final_filtered = df_final_filtered.sort_values(by=['hotel_score','temp_mean','humidity_mean'], ascending=False)

In [428]:
df_final_filtered['hotel_score'] = df_final_filtered['hotel_score'].str.replace(',','.')


In [429]:
df_final_filtered['hotel_score'] = pd.to_numeric(df_final_filtered['hotel_score'])

In [430]:
selection = df_final_filtered[df_final_filtered['hotel_score']>=9]

In [438]:
selection

Unnamed: 0,city_name,hotel_city_extract,city_name_test,hotel_name,hotel_score,hotel_url,longitude_x,latitude_x,hotel_description,city_id,city_id_df,city,latitude_y,longitude_y,temp_mean,humidity_mean,kayac_city_name_list,dfw_city_mapped
819,La Rochelle,"Centre-ville de La Rochelle, La Rochelle",pb,Escale Rochelaise B&B,9.8,https://www.booking.com//hotel/fr/escale-roche...,-1.151562,46.16496,L’Escale Rochelaise B&B possède un jardin ains...,258418538,city_33,La Rochelle,46.159113,-1.152043,19.67625,56.25,La Rochelle,La Rochelle
799,Bayonne,Bayonne,ok,Péniche DJEBELLE,9.6,https://www.booking.com//hotel/fr/peniche-djeb...,-1.473749,43.496335,"Située à Bayonne, à moins de 2,3 km de la cath...",258745823,city_32,Bayonne,43.493338,-1.475099,21.07875,61.875,Bayonne,Bayonne
812,La Rochelle,"Centre-ville de La Rochelle, La Rochelle",pb,La Belle Amarre,9.6,https://www.booking.com//hotel/fr/la-belle-ama...,-1.155056,46.158662,"Situé à La Rochelle, à moins de 2,2 km des Min...",258418538,city_33,La Rochelle,46.159113,-1.152043,19.67625,56.25,La Rochelle,La Rochelle
824,La Rochelle,"Centre-ville de La Rochelle, La Rochelle",pb,21 Dupaty Le Studio,9.6,https://www.booking.com//hotel/fr/centre-histo...,-1.1529362,46.160046,"Situé dans un quartier central de La Rochelle,...",258418538,city_33,La Rochelle,46.159113,-1.152043,19.67625,56.25,La Rochelle,La Rochelle
822,La Rochelle,"Centre-ville de La Rochelle, La Rochelle",pb,Le Cours Ladauge,9.5,https://www.booking.com//hotel/fr/le-cours-lad...,-1.1483601,46.157467,"Situé à La Rochelle, à 700 mètres du parc des ...",258418538,city_33,La Rochelle,46.159113,-1.152043,19.67625,56.25,La Rochelle,La Rochelle
791,Bayonne,Bayonne,ok,Parc 709 Bayonne,9.4,https://www.booking.com//hotel/fr/parc-709-bay...,-1.481317,43.495606,"Situé à Bayonne, à moins de 1 km de la cathédr...",258745823,city_32,Bayonne,43.493338,-1.475099,21.07875,61.875,Bayonne,Bayonne
800,Bayonne,Bayonne,ok,5 Rue des Faures,9.4,https://www.booking.com//hotel/fr/5-rue-des-fa...,-1.4790953,43.4893908,Le 5 Rue des Faures propose un hébergement ave...,258745823,city_32,Bayonne,43.493338,-1.475099,21.07875,61.875,Bayonne,Bayonne
778,Biarritz,"Centre de Biarritz, Biarritz",pb,Appartement Centre de Biarritz,9.4,https://www.booking.com//hotel/fr/appartement-...,-1.55849993389211,43.4803863594993,"Situé à 3,9 km de la gare de Biarritz La Négre...",259303061,city_31,Biarritz,43.471144,-1.552727,20.405,66.5,Biarritz,Biarritz
817,La Rochelle,"Centre-ville de La Rochelle, La Rochelle",pb,"Le Villemarais, site d'exception",9.4,https://www.booking.com//hotel/fr/le-villemara...,-1.155067257672,46.16001032455,"Doté d'un jardin, Le Villemarais, site d'excep...",258418538,city_33,La Rochelle,46.159113,-1.152043,19.67625,56.25,La Rochelle,La Rochelle
784,Bayonne,Bayonne,ok,Large and confortable apartment on Bayonne cit...,9.3,https://www.booking.com//hotel/fr/large-and-co...,-1.46441379999999,43.4940723,Le Large and confortable apartment on Bayonne ...,258745823,city_32,Bayonne,43.493338,-1.475099,21.07875,61.875,Bayonne,Bayonne


### AWS push

In [461]:
csv = selection.to_csv('selection.csv')

In [462]:
import boto3
import awswrangler as wr
import io

ACCESS_ID="AKIAT7YRVEMFJKQ7QJRT" 
ACCESS_KEY="QBgX5b32yXIDc5+agTdXZXwdxkYIWDWpVbHRToF3"
    
session = boto3.Session(aws_access_key_id=ACCESS_ID, aws_secret_access_key=ACCESS_KEY)
# s3 = boto3.resource('s3',
#          aws_access_key_id=ACCESS_ID,
#          aws_secret_access_key= ACCESS_KEY)

In [463]:
s3 = session.client("s3")

In [464]:
bucket = s3.create_bucket(Bucket="1-kayak-files")

In [465]:
# put_object = bucket.put_object(Key="selection.csv", Body=csv)

s3.upload_file(Filename='selection.csv',Bucket='1-kayak-files',Key='selection.csv')

### AWS read

In [466]:
# s3 = boto3.client('s3')
obj = s3.get_object(Bucket='1-kayak-files',Key='selection.csv')
df = pd.read_csv(io.BytesIO(obj['Body'].read()))

In [467]:
df

Unnamed: 0.1,Unnamed: 0,city_name,hotel_city_extract,city_name_test,hotel_name,hotel_score,hotel_url,longitude_x,latitude_x,hotel_description,city_id,city_id_df,city,latitude_y,longitude_y,temp_mean,humidity_mean,kayac_city_name_list,dfw_city_mapped
0,819,La Rochelle,"Centre-ville de La Rochelle, La Rochelle",pb,Escale Rochelaise B&B,9.8,https://www.booking.com//hotel/fr/escale-roche...,-1.151562,46.16496,L’Escale Rochelaise B&B possède un jardin ains...,258418538,city_33,La Rochelle,46.159113,-1.152043,19.67625,56.25,La Rochelle,La Rochelle
1,799,Bayonne,Bayonne,ok,Péniche DJEBELLE,9.6,https://www.booking.com//hotel/fr/peniche-djeb...,-1.473749,43.496335,"Située à Bayonne, à moins de 2,3 km de la cath...",258745823,city_32,Bayonne,43.493338,-1.475099,21.07875,61.875,Bayonne,Bayonne
2,812,La Rochelle,"Centre-ville de La Rochelle, La Rochelle",pb,La Belle Amarre,9.6,https://www.booking.com//hotel/fr/la-belle-ama...,-1.155056,46.158662,"Situé à La Rochelle, à moins de 2,2 km des Min...",258418538,city_33,La Rochelle,46.159113,-1.152043,19.67625,56.25,La Rochelle,La Rochelle
3,824,La Rochelle,"Centre-ville de La Rochelle, La Rochelle",pb,21 Dupaty Le Studio,9.6,https://www.booking.com//hotel/fr/centre-histo...,-1.152936,46.160046,"Situé dans un quartier central de La Rochelle,...",258418538,city_33,La Rochelle,46.159113,-1.152043,19.67625,56.25,La Rochelle,La Rochelle
4,822,La Rochelle,"Centre-ville de La Rochelle, La Rochelle",pb,Le Cours Ladauge,9.5,https://www.booking.com//hotel/fr/le-cours-lad...,-1.14836,46.157467,"Situé à La Rochelle, à 700 mètres du parc des ...",258418538,city_33,La Rochelle,46.159113,-1.152043,19.67625,56.25,La Rochelle,La Rochelle
5,791,Bayonne,Bayonne,ok,Parc 709 Bayonne,9.4,https://www.booking.com//hotel/fr/parc-709-bay...,-1.481317,43.495606,"Situé à Bayonne, à moins de 1 km de la cathédr...",258745823,city_32,Bayonne,43.493338,-1.475099,21.07875,61.875,Bayonne,Bayonne
6,800,Bayonne,Bayonne,ok,5 Rue des Faures,9.4,https://www.booking.com//hotel/fr/5-rue-des-fa...,-1.479095,43.489391,Le 5 Rue des Faures propose un hébergement ave...,258745823,city_32,Bayonne,43.493338,-1.475099,21.07875,61.875,Bayonne,Bayonne
7,778,Biarritz,"Centre de Biarritz, Biarritz",pb,Appartement Centre de Biarritz,9.4,https://www.booking.com//hotel/fr/appartement-...,-1.5585,43.480386,"Situé à 3,9 km de la gare de Biarritz La Négre...",259303061,city_31,Biarritz,43.471144,-1.552727,20.405,66.5,Biarritz,Biarritz
8,817,La Rochelle,"Centre-ville de La Rochelle, La Rochelle",pb,"Le Villemarais, site d'exception",9.4,https://www.booking.com//hotel/fr/le-villemara...,-1.155067,46.16001,"Doté d'un jardin, Le Villemarais, site d'excep...",258418538,city_33,La Rochelle,46.159113,-1.152043,19.67625,56.25,La Rochelle,La Rochelle
9,784,Bayonne,Bayonne,ok,Large and confortable apartment on Bayonne cit...,9.3,https://www.booking.com//hotel/fr/large-and-co...,-1.464414,43.494072,Le Large and confortable apartment on Bayonne ...,258745823,city_32,Bayonne,43.493338,-1.475099,21.07875,61.875,Bayonne,Bayonne


In [448]:
!pip install plotly

Collecting plotly
  Downloading plotly-5.5.0-py2.py3-none-any.whl (26.5 MB)
[K     |████████████████████████████████| 26.5 MB 8.5 MB/s eta 0:00:01     |███████████████▍                | 12.8 MB 8.5 MB/s eta 0:00:02
Collecting tenacity>=6.2.0
  Using cached tenacity-8.0.1-py3-none-any.whl (24 kB)
Installing collected packages: tenacity, plotly
Successfully installed plotly-5.5.0 tenacity-8.0.1


### Let's select a city based on temperature

In [492]:
# Hotels results

map = px.scatter_mapbox(df, lat="latitude_y", lon="longitude_y", color="temp_mean",hover_name="hotel_name", size="temp_mean", size_max=20,
                        mapbox_style="carto-positron",zoom=5)
map.show()

### ok the winner is basque country !

### Let's dive into these cities to check wich hotel to pickup based on their booking scores !

In [488]:
df_otherscities = df[df['city_name']!='La Rochelle']

map = px.scatter_mapbox(df_otherscities, lat="latitude_x", lon="longitude_x", color="hotel_score",hover_name="hotel_name", size="temp_mean",
                        mapbox_style="carto-positron",zoom=12)
map.show()

### Based on score, we are going to bayonne peniche ... 

In [479]:
# Hotels results La Rochelle
df_city1 = df[df['city_name']=='La Rochelle']

map = px.scatter_mapbox(df_city1, lat="latitude_x", lon="longitude_x", color="hotel_score",hover_name="hotel_name", size="temp_mean",
                        mapbox_style="carto-positron",zoom=14)
map.show()