# Plan your trip with booking

## Project 🚧

The marketing team needs help on a new project. After doing some user research, the team discovered that **70% of their users who are planning a trip would like to have more information about the destination they are going to**. 

In addition, user research shows that **people tend to distrust the information they read if they don't know the brand** that produced the content. 

Therefore, the Kayak Marketing Team would like to create an application that recommends where people should plan their next holidays. The application should be based on real data about:

* Weather 
* Hotels in the area 

The application should then be able to recommend the best destinations and hotels based on the above variables at any given time.

To complete this project we will deliver:

* A `.csv` file in an S3 bucket containing enriched information about weather and hotels for each French city

* A SQL database from which we can retrieve the same cleaned data as in S3 

* Two maps showing the Top-5 destinations and the Top-20 hotels in the area.

In [1]:
!pip install plotly -q
!pip install scrapy -q
# Importing libraries
import requests
import pandas as pd
# Import os => Library used to interact with the operating system
## More info => https://docs.python.org/3/library/os.html
import os
# Import logging => Library used to emit log messages
## More info => https://docs.python.org/3/library/logging.html
import logging

# Import scrapy and scrapy.crawler
import scrapy
from scrapy.crawler import CrawlerProcess

import plotly.express as px
import plotly.graph_objects as go
import matplotlib.pyplot as pyp
import plotly.io as pio
pd.options.display.max_columns = None
pio.renderers.default = "iframe_connected"

### Cities list

In [2]:
# Cities 
desired_booking_searches = ["Mont Saint Michel","St Malo","Bayeux","Le Havre","Rouen","Paris","Amiens",
"Lille","Strasbourg","Chateau du Haut Koenigsbourg","Colmar","Eguisheim","Besancon","Dijon","Annecy","Grenoble","Lyon",
"Gorges du Verdon","Bormes les Mimosas","Cassis","Marseille","Aix en Provence","Avignon","Uzes","Nimes","Aigues Mortes",
"Saintes Maries de la mer","Collioure","Carcassonne","Toulouse","Montauban","Biarritz","Bayonne","La Rochelle"]


# City names with '+' replacing spaces, so they can be inserted into the API request URL.
desired_booking_searches2 = [city.replace(" ", "+") for city in desired_booking_searches]
print(desired_booking_searches2)

['Mont+Saint+Michel', 'St+Malo', 'Bayeux', 'Le+Havre', 'Rouen', 'Paris', 'Amiens', 'Lille', 'Strasbourg', 'Chateau+du+Haut+Koenigsbourg', 'Colmar', 'Eguisheim', 'Besancon', 'Dijon', 'Annecy', 'Grenoble', 'Lyon', 'Gorges+du+Verdon', 'Bormes+les+Mimosas', 'Cassis', 'Marseille', 'Aix+en+Provence', 'Avignon', 'Uzes', 'Nimes', 'Aigues+Mortes', 'Saintes+Maries+de+la+mer', 'Collioure', 'Carcassonne', 'Toulouse', 'Montauban', 'Biarritz', 'Bayonne', 'La+Rochelle']
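
The standard library offers the same encoding and also escapes characters that a plain space replacement would leave untouched, such as accents. A minimal sketch, assuming a short sample of the city list above; `encode_city` is a hypothetical helper name that does not appear in the notebook:

```python
from urllib.parse import quote_plus


def encode_city(name):
    """Encode a city name for a URL query string (spaces become '+')."""
    return quote_plus(name)


# A short sample of the city list, for illustration only
sample_cities = ["Mont Saint Michel", "St Malo", "Paris"]
encoded = [encode_city(city) for city in sample_cities]
print(encoded)
```

For the unaccented names used in this notebook, the result matches the list printed above.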


### Getting information from the API

In [3]:
#Let's take a look at the JSON file for Paris only.
city_url = "https://nominatim.openstreetmap.org/search?q={}&format=json".format("Paris")
r = requests.get(city_url)

In [4]:
r.json()

[{'place_id': 281739181,
  'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright',
  'osm_type': 'relation',
  'osm_id': 7444,
  'boundingbox': ['48.8155755', '48.902156', '2.224122', '2.4697602'],
  'lat': '48.8588897',
  'lon': '2.3200410217200766',
  'display_name': 'Paris, Île-de-France, France métropolitaine, France',
  'class': 'boundary',
  'type': 'administrative',
  'importance': 0.9417101715588673,
  'icon': 'https://nominatim.openstreetmap.org/ui/mapicons//poi_boundary_administrative.p.20.png'},
 {'place_id': 281718298,
  'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright',
  'osm_type': 'relation',
  'osm_id': 71525,
  'boundingbox': ['48.8155755', '48.902156', '2.224122', '2.4697602'],
  'lat': '48.8566969',
  'lon': '2.3514616',
  'display_name': 'Paris, Île-de-France, France métropolitaine, France',
  'class': 'boundary',
  'type': 'administrative',
  'importance': 0.9417101715588673,
  'icon': 'https://nominat

In [5]:
#Here, we extract the latitude and longitude from the first result
first_result = r.json()[0]
latitude = first_result['lat']
longitude = first_result['lon']
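
The API returns coordinates as strings and may return an empty list when a place is not found. The sketch below shows one way to handle both cases; `extract_coordinates` is a hypothetical helper, and the sample payload is trimmed from the Paris response shown above:

```python
def extract_coordinates(results):
    """Return (lat, lon) as floats from a Nominatim-style result list.

    Returns None when the list is empty (no match found).
    """
    if not results:
        return None
    best = results[0]  # the first entry is taken as the best match
    return float(best["lat"]), float(best["lon"])


# Trimmed sample taken from the Paris response above
sample_results = [
    {
        "lat": "48.8588897",
        "lon": "2.3200410217200766",
        "display_name": "Paris, Île-de-France, France métropolitaine, France",
    }
]
coords = extract_coordinates(sample_results)
print(coords)
```

Converting to `float` at this stage keeps the coordinate columns numeric, which is convenient for plotting later.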

In [6]:
#Let's create a dataframe to store every new piece of information we get.
df = pd.DataFrame(desired_booking_searches)
df = df.rename(columns={0: 'Cities'})
df.head()

Unnamed: 0,Cities
0,Mont Saint Michel
1,St Malo
2,Bayeux
3,Le Havre
4,Rouen


GETTING CITIES COORDINATES

In [7]:
#Let's loop over the city names to add latitudes and longitudes to our dataframe df.
for i in range(len(df)):

    city_url = "https://nominatim.openstreetmap.org/search?q={}&format=json".format(desired_booking_searches2[i])
    r = requests.get(city_url)
    first_result = r.json()[0] #Parse the response once and keep the first match
    df.loc[i, 'Latitude'] = first_result["lat"] #Adds the latitude to the dataframe
    df.loc[i, 'Longitude'] = first_result["lon"] #Adds the longitude to the dataframe
    
df.head()

Unnamed: 0,Cities,Latitude,Longitude
0,Mont Saint Michel,48.6359541,-1.511459954959514
1,St Malo,48.649518,-2.0260409
2,Bayeux,49.2764624,-0.7024738
3,Le Havre,49.4938975,0.1079732
4,Rouen,49.4404591,1.0939658


ADDING WEATHER INFORMATION

In [8]:
#Getting information from the weather API (here for the coordinates of Mont Saint Michel)
city_weather = "https://api.openweathermap.org/data/2.5/onecall?lat=48.63552322&lon=-1.5102571&units=metric&exclude=hourly,minutely,alerts&appid=43bc8aba840c225183db5d27a92fe653"
r2 = requests.get(city_weather)
r2

<Response [200]>

In [9]:
r2.json()['daily']

[{'dt': 1641556800,
  'sunrise': 1641542247,
  'sunset': 1641572799,
  'moonrise': 1641553680,
  'moonset': 1641594120,
  'moon_phase': 0.17,
  'temp': {'day': 6.74,
   'min': 5.45,
   'max': 8.97,
   'night': 8.65,
   'eve': 8.54,
   'morn': 6.55},
  'feels_like': {'day': 2.19, 'night': 4.84, 'eve': 3.9, 'morn': 2.01},
  'pressure': 1019,
  'humidity': 70,
  'dew_point': 1.66,
  'wind_speed': 11.92,
  'wind_deg': 257,
  'wind_gust': 19.7,
  'weather': [{'id': 501,
    'main': 'Rain',
    'description': 'moderate rain',
    'icon': '10d'}],
  'clouds': 75,
  'pop': 1,
  'rain': 10.73,
  'uvi': 0.61},
 {'dt': 1641643200,
  'sunrise': 1641628626,
  'sunset': 1641659270,
  'moonrise': 1641641160,
  'moonset': 0,
  'moon_phase': 0.21,
  'temp': {'day': 9.77,
   'min': 6.6,
   'max': 10.31,
   'night': 6.6,
   'eve': 7.2,
   'morn': 8.51},
  'feels_like': {'day': 5.7, 'night': 1.34, 'eve': 1.42, 'morn': 4.93},
  'pressure': 1008,
  'humidity': 88,
  'dew_point': 7.66,
  'wind_speed': 15.41,

In [10]:
r2.json()['daily'][0]['dt']

1641556800
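
The `dt` field is a Unix timestamp in seconds. A minimal sketch of converting it to a readable UTC date with the standard library; `to_utc_date` is a hypothetical helper name:

```python
from datetime import datetime, timezone


def to_utc_date(unix_seconds):
    """Convert a Unix timestamp (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


forecast_day = to_utc_date(1641556800)
print(forecast_day.isoformat())  # 2022-01-07T12:00:00+00:00
```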

In [11]:
#Let's locate the temperature information
day_temp = r2.json()['daily'][0]['temp']['day']
night_temp = r2.json()['daily'][0]['temp']['night']

In [12]:
#Daily Weather conditions
weather_conditions = r2.json()['daily'][0]['weather'][0]['id']

In [13]:
#Weather for the current day (day temperature, night temperature and overall weather description)
for i in range(len(df)):

    city_weather = "https://api.openweathermap.org/data/2.5/onecall?lat={}&lon={}&units=metric&exclude=hourly,minutely,alerts&appid=43bc8aba840c225183db5d27a92fe653".format(df['Latitude'][i],df['Longitude'][i])
    r2 = requests.get(city_weather)
    today = r2.json()['daily'][0] #Parse the response once; index 0 is the current day
    #Let's locate the temperature information
    day_temp = today['temp']['day']
    night_temp = today['temp']['night']
    #Daily weather conditions
    weather_conditions = today['weather'][0]['description']
    df.loc[i, 'dtf_0'] = day_temp #dtf = day temperature forecast, '0' = current day
    df.loc[i, 'ntf_0'] = night_temp #ntf = night temperature forecast, '0' = current day
    df.loc[i, 'wf_0'] = weather_conditions #wf = weather forecast, '0' = current day
    
df.head()

Unnamed: 0,Cities,Latitude,Longitude,dtf_0,ntf_0,wf_0
0,Mont Saint Michel,48.6359541,-1.511459954959514,6.89,8.65,moderate rain
1,St Malo,48.649518,-2.0260409,7.04,9.23,moderate rain
2,Bayeux,49.2764624,-0.7024738,6.82,6.42,light rain
3,Le Havre,49.4938975,0.1079732,6.69,6.53,light rain
4,Rouen,49.4404591,1.0939658,6.35,4.34,moderate rain
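
Building the request URL from a dictionary of parameters makes typos in the query string easier to spot, and reading the key from the environment keeps it out of the notebook. The sketch below uses only the standard library; `build_weather_url` and the `OPENWEATHER_API_KEY` variable name are assumptions made for illustration, not part of the original code:

```python
import os
from urllib.parse import urlencode

ONECALL_URL = "https://api.openweathermap.org/data/2.5/onecall"


def build_weather_url(lat, lon, api_key):
    """Build the forecast request URL with the same parameters as the loop above."""
    params = {
        "lat": lat,
        "lon": lon,
        "units": "metric",
        "exclude": "hourly,minutely,alerts",
        "appid": api_key,
    }
    # urlencode percent-encodes the commas in 'exclude' (%2C)
    return f"{ONECALL_URL}?{urlencode(params)}"


# Hypothetical environment variable; falls back to a placeholder if unset
api_key = os.environ.get("OPENWEATHER_API_KEY", "YOUR_API_KEY")
url = build_weather_url(48.6359541, -1.511459954959514, api_key)
print(url)
```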


In [14]:
#Let's get the forecast for the next 6 days (days 1 to 6; day 0 is the current day).
for i in range(len(df)):
    
    city_weather = "https://api.openweathermap.org/data/2.5/onecall?lat={}&lon={}&units=metric&exclude=hourly,minutely,alerts&appid=43bc8aba840c225183db5d27a92fe653".format(df['Latitude'][i],df['Longitude'][i])
    r2 = requests.get(city_weather)
    daily = r2.json()['daily'] #Parse the response once
    
    for dt in range(1,7): #Weather description for days 1 to 6
        get_weather = daily[dt]['weather'][0]['description']
        df.loc[i, 'wf_{}'.format(dt)] = get_weather #wf = weather forecast
        
    for dt in range(1,7): #Forecasted day temperature
        get_temp = daily[dt]['temp']['day']
        df.loc[i, 'dtf_{}'.format(dt)] = get_temp #dtf = day temperature forecast
        
    for dt in range(1,7): #Forecasted night temperature
        get_temp = daily[dt]['temp']['night']
        df.loc[i, 'ntf_{}'.format(dt)] = get_temp #ntf = night temperature forecast
        
df.head()

Unnamed: 0,Cities,Latitude,Longitude,dtf_0,ntf_0,wf_0,wf_1,wf_2,wf_3,wf_4,wf_5,wf_6,dtf_1,dtf_2,dtf_3,dtf_4,dtf_5,dtf_6,ntf_1,ntf_2,ntf_3,ntf_4,ntf_5,ntf_6
0,Mont Saint Michel,48.6359541,-1.511459954959514,6.89,8.65,moderate rain,heavy intensity rain,light rain,overcast clouds,scattered clouds,broken clouds,clear sky,9.77,8.21,9.39,6.03,6.55,4.0,6.61,8.43,7.56,2.85,1.9,-1.12
1,St Malo,48.649518,-2.0260409,7.04,9.23,moderate rain,heavy intensity rain,light rain,overcast clouds,scattered clouds,broken clouds,clear sky,10.32,8.56,9.56,6.31,6.64,4.08,7.31,9.66,8.48,4.56,4.7,1.22
2,Bayeux,49.2764624,-0.7024738,6.82,6.42,light rain,heavy intensity rain,light rain,overcast clouds,broken clouds,overcast clouds,clear sky,9.01,8.09,8.67,6.04,5.93,3.38,5.35,5.5,6.24,2.45,2.44,-1.75
3,Le Havre,49.4938975,0.1079732,6.69,6.53,light rain,heavy intensity rain,moderate rain,light rain,overcast clouds,overcast clouds,clear sky,9.23,8.06,8.49,5.37,6.32,4.11,7.49,8.3,5.81,4.03,3.38,1.15
4,Rouen,49.4404591,1.0939658,6.35,4.34,moderate rain,heavy intensity rain,light rain,overcast clouds,overcast clouds,overcast clouds,clear sky,7.98,5.47,7.01,4.69,4.99,3.67,5.72,3.87,2.01,1.02,0.44,-0.9
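
The mapping from the nested `daily` list to the flat `dtf_`, `ntf_` and `wf_` columns can be isolated in a small function, which makes it testable without calling the API. This is a sketch only: `flatten_daily` is a hypothetical helper, and the sample payload is a hand-trimmed illustration loosely based on the response shown earlier, not real API output:

```python
def flatten_daily(daily, days):
    """Map daily forecast entries to flat column names (dtf_N, ntf_N, wf_N)."""
    row = {}
    for day in days:
        entry = daily[day]
        row[f"dtf_{day}"] = entry["temp"]["day"]      # day temperature
        row[f"ntf_{day}"] = entry["temp"]["night"]    # night temperature
        row[f"wf_{day}"] = entry["weather"][0]["description"]
    return row


# Illustrative payload with only the fields used above
sample_daily = [
    {"temp": {"day": 6.74, "night": 8.65},
     "weather": [{"description": "moderate rain"}]},
    {"temp": {"day": 9.77, "night": 6.6},
     "weather": [{"description": "heavy intensity rain"}]},
]
row = flatten_daily(sample_daily, range(2))
print(row)
```

One such dictionary per city could then be passed to `pd.DataFrame` in a single step instead of filling cells one by one with `df.loc`.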


In [15]:
#Let's create a specific ID for every city, starting at 1, and insert it as the first column
df.insert(0, 'id', list(range(1, len(df) + 1)))

df.head()

Unnamed: 0,id,Cities,Latitude,Longitude,dtf_0,ntf_0,wf_0,wf_1,wf_2,wf_3,wf_4,wf_5,wf_6,dtf_1,dtf_2,dtf_3,dtf_4,dtf_5,dtf_6,ntf_1,ntf_2,ntf_3,ntf_4,ntf_5,ntf_6
0,1,Mont Saint Michel,48.6359541,-1.511459954959514,6.89,8.65,moderate rain,heavy intensity rain,light rain,overcast clouds,scattered clouds,broken clouds,clear sky,9.77,8.21,9.39,6.03,6.55,4.0,6.61,8.43,7.56,2.85,1.9,-1.12
1,2,St Malo,48.649518,-2.0260409,7.04,9.23,moderate rain,heavy intensity rain,light rain,overcast clouds,scattered clouds,broken clouds,clear sky,10.32,8.56,9.56,6.31,6.64,4.08,7.31,9.66,8.48,4.56,4.7,1.22
2,3,Bayeux,49.2764624,-0.7024738,6.82,6.42,light rain,heavy intensity rain,light rain,overcast clouds,broken clouds,overcast clouds,clear sky,9.01,8.09,8.67,6.04,5.93,3.38,5.35,5.5,6.24,2.45,2.44,-1.75
3,4,Le Havre,49.4938975,0.1079732,6.69,6.53,light rain,heavy intensity rain,moderate rain,light rain,overcast clouds,overcast clouds,clear sky,9.23,8.06,8.49,5.37,6.32,4.11,7.49,8.3,5.81,4.03,3.38,1.15
4,5,Rouen,49.4404591,1.0939658,6.35,4.34,moderate rain,heavy intensity rain,light rain,overcast clouds,overcast clouds,overcast clouds,clear sky,7.98,5.47,7.01,4.69,4.99,3.67,5.72,3.87,2.01,1.02,0.44,-0.9


### Rearranging the dataframe to show weather forecasts vertically, by city.

In [16]:
#Flipping the dataframe into multiindex df to create animated maps for Day temp forecast + Night temp forecast and weather description
df_flip = pd.wide_to_long(df, 
                          stubnames=['dtf', 'ntf', 'wf'], 
                          sep ='_',
                          i=['id', 'Cities', 'Latitude', 'Longitude'], 
                          j='day')
df_flip

id,Cities,Latitude,Longitude,day,dtf,ntf,wf
1,Mont Saint Michel,48.6359541,-1.511459954959514,0,6.89,8.65,moderate rain
1,Mont Saint Michel,48.6359541,-1.511459954959514,1,9.77,6.61,heavy intensity rain
1,Mont Saint Michel,48.6359541,-1.511459954959514,2,8.21,8.43,light rain
1,Mont Saint Michel,48.6359541,-1.511459954959514,3,9.39,7.56,overcast clouds
1,Mont Saint Michel,48.6359541,-1.511459954959514,4,6.03,2.85,scattered clouds
...,...,...,...,...,...,...,...
34,La Rochelle,46.1591126,-1.1520434,2,9.64,10.02,moderate rain
34,La Rochelle,46.1591126,-1.1520434,3,9.59,8.86,light rain
34,La Rochelle,46.1591126,-1.1520434,4,6.44,3.82,clear sky
34,La Rochelle,46.1591126,-1.1520434,5,5.92,3.60,broken clouds


In [17]:
df_flip.reset_index(level=['Cities', 'Latitude', 'Longitude', 'day'], inplace = True) #Multiindexes as columns 
df_flip

id,Cities,Latitude,Longitude,day,dtf,ntf,wf
1,Mont Saint Michel,48.6359541,-1.511459954959514,0,6.89,8.65,moderate rain
1,Mont Saint Michel,48.6359541,-1.511459954959514,1,9.77,6.61,heavy intensity rain
1,Mont Saint Michel,48.6359541,-1.511459954959514,2,8.21,8.43,light rain
1,Mont Saint Michel,48.6359541,-1.511459954959514,3,9.39,7.56,overcast clouds
1,Mont Saint Michel,48.6359541,-1.511459954959514,4,6.03,2.85,scattered clouds
...,...,...,...,...,...,...,...
34,La Rochelle,46.1591126,-1.1520434,2,9.64,10.02,moderate rain
34,La Rochelle,46.1591126,-1.1520434,3,9.59,8.86,light rain
34,La Rochelle,46.1591126,-1.1520434,4,6.44,3.82,clear sky
34,La Rochelle,46.1591126,-1.1520434,5,5.92,3.60,broken clouds
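
To see what the reshape does on a small scale, here is a sketch on a two-city, two-day toy dataframe. The values are copied from the tables above; the reduced column set and the names `wide` and `long_df` are choices made for this illustration:

```python
import pandas as pd

# Toy wide table: one row per city, one column per (variable, day) pair
wide = pd.DataFrame({
    "id": [1, 2],
    "Cities": ["Mont Saint Michel", "St Malo"],
    "dtf_0": [6.89, 7.04],
    "dtf_1": [9.77, 10.32],
    "wf_0": ["moderate rain", "moderate rain"],
    "wf_1": ["heavy intensity rain", "heavy intensity rain"],
})

# Each stub ('dtf', 'wf') becomes one column; the numeric suffix becomes 'day'
long_df = pd.wide_to_long(
    wide, stubnames=["dtf", "wf"], i=["id", "Cities"], j="day", sep="_"
).reset_index()
print(long_df)
```

The result has one row per city and day, which is the shape an animated map needs when each frame corresponds to a day.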


In [19]:
#Export the dataframe
df_flip.to_csv('cities_weather.csv')

#### Open 'PROJECT Plan_your_trip_part2' to read more 