<p>In this last phase, we will:</p>
<ul>
    <li>Retrieve the structured data from the database</li>
    <li>Visualize recommendations</li>
</ul>

### Table of Contents

* [1. Relational Data Storage](#section1)
* [2. Cities where the weather will be the nicest](#section2)
    * [2.1. Visualize cities with mapBox](#section21)
* [3. Discover hotels in the chosen cities](#section3)
    * [3.1. Visualize hotels with mapBox](#section31)

In [None]:
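# access to RDS
from sqlalchemy import create_engine, text
import psycopg2  # PostgreSQL driver referenced by the engine URL

# data handling (needed for pd.DataFrame in the query cells)
import pandas as pd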

# plotting
import plotly.express as px
import plotly.io as pio
pio.renderers.default = "iframe_connected"

# 1. Relational Data Storage <a class="anchor" id="section1"></a>

In [8]:
dbuser = ''
dbpass = ''
dbhost = ''
dbname = ''

engine = create_engine(f"postgresql+psycopg2://{dbuser}:{dbpass}@{dbhost}/{dbname}", echo=True)
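
The credentials above are left blank. One way to supply them without hard-coding secrets in the notebook is to read them from environment variables, as in the minimal sketch below. The variable names (`DB_USER`, `DB_PASS`, `DB_HOST`, `DB_NAME`) and the helper `build_db_url` are hypothetical and not part of the original project. `quote_plus` escapes special characters in the password so the URL stays valid.

```python
import os
from urllib.parse import quote_plus


def build_db_url(env=os.environ):
    """Build a PostgreSQL URL for SQLAlchemy from environment variables.

    The variable names used here are illustrative assumptions.
    """
    user = env["DB_USER"]
    password = quote_plus(env["DB_PASS"])  # escape '@', '/', ':' and similar
    host = env["DB_HOST"]
    name = env["DB_NAME"]
    return f"postgresql+psycopg2://{user}:{password}@{host}/{name}"


# A plain dict stands in for os.environ in this example
example_url = build_db_url(
    {"DB_USER": "me", "DB_PASS": "p@ss/word", "DB_HOST": "localhost:5432", "DB_NAME": "trips"}
)
print(example_url)
```

The returned string could then be passed to `create_engine` in place of the f-string above.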

# 2. Cities where the weather will be the nicest <a class="anchor" id="section2"></a> ☀️ 😎

🗒 Comparing two cities on the three criteria (temperature, humidity and precipitation_p) depends on one's perception of what good weather is. That perception depends on one's lifestyle and especially on the climate of one's home region.   
🗒  We could have used the Universal Thermal Climate Index (UTCI); however, the One Call API doesn't supply such data.   
🗒  According to this article ([Weather perception and its impact on out-of-home leisure activity participation decisions](https://www.tandfonline.com/doi/full/10.1080/21680566.2020.1733703)), temperature, precipitation and UTCI are the most important factors that may influence someone's decision to go out. That's why we sort our data by the three available criteria in this order: temperature, precipitation, humidity.
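
The ranking described above is a lexicographic sort: cities are compared on temperature first, and the next criterion only matters when the previous one ties. The sketch below reproduces that ordering in plain Python on three rows taken from this notebook's own query output (values rounded). It mirrors the query's descending order on every criterion. Whether high precipitation and humidity should rank first or last is a judgment call, and `rank_cities` is a hypothetical helper that is not part of the original code.

```python
# (city, avg_temperature, avg_precipitation, avg_humidity), rounded from the query output
rows = [
    ("Cassis", 11.09, 0.007, 49.86),
    ("Bormes les Mimosas", 11.66, 0.024, 51.14),
    ("Collioure", 11.21, 0.0, 49.29),
]


def rank_cities(rows, n_best):
    """Sort by temperature, then precipitation, then humidity, all descending."""
    ordered = sorted(rows, key=lambda r: (r[1], r[2], r[3]), reverse=True)
    return ordered[:n_best]


best = rank_cities(rows, n_best=2)
print([city for city, *_ in best])
```

To prefer drier cities on a temperature tie, the key could negate the precipitation value, for example `(r[1], -r[2], -r[3])`.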

In [74]:
n_best = int(input('How many cities would you like to get in your recommendation: '))

How many cities would you like to get in your recommendation:  6
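
`int(input(...))` raises `ValueError` on non-numeric input, and a zero or negative value would lead to an empty result or a rejected `LIMIT`. A minimal sketch of a stricter parser follows. `parse_city_count` and its `maximum` bound are hypothetical additions, not part of the original notebook.

```python
def parse_city_count(raw, maximum=50):
    """Return a positive integer parsed from raw, or raise ValueError."""
    try:
        value = int(raw.strip())
    except ValueError:
        raise ValueError(f"not a whole number: {raw!r}") from None
    if not 1 <= value <= maximum:
        raise ValueError(f"expected a value between 1 and {maximum}, got {value}")
    return value


print(parse_city_count(" 6 "))
```

In the cell above, this would wrap the prompt as `n_best = parse_city_count(input(...))`.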


In [78]:
# get the data from the relational database (RDS)
conn = engine.connect()

query = f'SELECT city.id, city.name, city.latitude, city.longitude, AVG(weather.temperature), AVG(weather.precipitation_p), AVG(weather.humidity)\
         FROM weather\
         join city\
         on city.id=weather.cid\
         GROUP BY  weather.cid, city.id \
         ORDER BY AVG(weather.temperature) desc, AVG(weather.precipitation_p) desc, AVG(weather.humidity) desc\
         LIMIT {n_best}'

stmt = text(query)
result = conn.execute(stmt)
# load every returned row into a DataFrame with readable column names
best_cities_df = pd.DataFrame(result.fetchall(), columns=['id', 'city', 'latitude', 'longitude','avg_temperature','avg_precipitation','avg_humidity'])
best_cities_df

2022-01-16 22:17:17,232 INFO sqlalchemy.engine.base.Engine SELECT city.id, city.name, city.latitude, city.longitude, AVG(weather.temperature), AVG(weather.precipitation_p), AVG(weather.humidity)         FROM weather         join city         on city.id=weather.cid         GROUP BY  weather.cid, city.id          ORDER BY AVG(weather.temperature) desc, AVG(weather.precipitation_p) desc, AVG(weather.humidity) desc         LIMIT 6
2022-01-16 22:17:17,234 INFO sqlalchemy.engine.base.Engine {}


| | id | city | latitude | longitude | avg_temperature | avg_precipitation | avg_humidity |
|---|---|---|---|---|---|---|---|
| 0 | 18 | Bormes les Mimosas | 43.157217 | 6.329254 | 11.664286 | 0.024286 | 51.14285714285714 |
| 1 | 27 | Collioure | 42.52505 | 3.083155 | 11.21 | 0.0 | 49.285714285714285 |
| 2 | 19 | Cassis | 43.214036 | 5.539632 | 11.094286 | 0.007143 | 49.857142857142854 |
| 3 | 20 | Marseille | 43.296174 | 5.369953 | 10.222857 | 0.002857 | 51.57142857142857 |
| 4 | 25 | Aigues Mortes | 43.565823 | 4.191284 | 9.621429 | 0.0 | 49.42857142857143 |
| 5 | 26 | Saintes Maries de la mer | 43.452277 | 4.428717 | 9.427143 | 0.0 | 53.857142857142854 |


## 2.1. Visualize cities with mapBox 📊 <a class="anchor" id="section21"></a>

In [6]:
# marker color encodes the average temperature; marker size shrinks as avg_precipitation grows
fig = px.scatter_mapbox(best_cities_df, lat="latitude", lon="longitude", hover_name='city',
                        color="avg_temperature", size=1 - best_cities_df['avg_precipitation'],
                        hover_data=['avg_temperature','avg_precipitation'], zoom=6, height=500, width=900,
                        color_continuous_scale='bluered')

fig.update_layout(mapbox_style="open-street-map")

fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})

fig.show()

![image](data/best_city.png) 

In [1]:
import webbrowser
import os

filename = 'file:///'+os.getcwd()+'/' + 'maps/best_cities.html'
webbrowser.open_new_tab(filename)

True
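
Concatenating `'file:///'` with `os.getcwd()` yields four slashes on POSIX systems and backslashes on Windows, which browsers often tolerate but which is not a well-formed URI. A minimal sketch using `pathlib` follows; `local_file_uri` is a hypothetical helper. The notebook does not show where `maps/best_cities.html` is written, so the sketch assumes the file has already been saved, for instance with Plotly's `fig.write_html`.

```python
from pathlib import Path


def local_file_uri(relative_path, base=None):
    """Return a file:// URI for relative_path resolved against base (default: cwd)."""
    base = Path.cwd() if base is None else Path(base)
    # resolve() makes the path absolute, which as_uri() requires
    return (base / relative_path).resolve().as_uri()


uri = local_file_uri("maps/best_cities.html")
print(uri)
# webbrowser.open_new_tab(uri) would then open the saved map
```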

# 3. Discover hotels 🏩 in the chosen cities 🌇 <a class="anchor" id="section3"></a>

In [97]:
# This is a very simple way to interact with the user, without any checking for erroneous input.
# We could use the Tkinter package to build a more ergonomic GUI.
def interface():
    n_cities = int(input('Enter the number of cities to discover:\n'))
    list_cities = tuple(input("Enter the city name and press enter: \n") for _ in range(n_cities))
    criterion = input("Choose one criterion to filter hotels: \n s for star rating \n g for guests reviews \n p for price per night \n ")
    
    return list_cities, criterion

In [98]:
list_cities, criterion = interface()

Enter the number of cities to discover:
 2
Enter the city name and press enter: 
 Bormes les mimosas
Enter the city name and press enter: 
 Collioure
Choose one criterion to filter hotels: 
 s for star rating 
 g for guests reviews 
 p for price per night 
  s


In [109]:
# we assume the input contains no errors; any letter other than 'g' or 'p' falls back to 'star'
list_cities = tuple(s.lower() for s in list_cities)

filter_col = 'star'
if criterion == 'g':
    filter_col = 'rating'
elif criterion == 'p':
    filter_col = 'price'
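
Because the chosen column name is later interpolated into the SQL text, it should only ever come from a fixed set of known columns. The sketch below expresses the mapping as a dictionary and rejects unknown letters instead of silently falling back to `star`. `CRITERION_TO_COLUMN` and `column_for` are hypothetical names, and raising on bad input is a design choice that differs from the original cell.

```python
# Map each menu letter to the hotels column it filters on
CRITERION_TO_COLUMN = {"s": "star", "g": "rating", "p": "price"}


def column_for(criterion):
    """Return the hotels column for a menu letter, or raise ValueError."""
    key = criterion.strip().lower()
    if key not in CRITERION_TO_COLUMN:
        allowed = ", ".join(sorted(CRITERION_TO_COLUMN))
        raise ValueError(f"unknown criterion {criterion!r}; expected one of: {allowed}")
    return CRITERION_TO_COLUMN[key]


print(column_for("s"))
```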

In [110]:
# get the data from the relational database (RDS)
conn = engine.connect()

query = f'select hotels.name, hotels.star, hotels.rating, hotels.rating_title, hotels.price, hotels.lat, hotels.lon, city.name\
          from hotels\
          join city\
          on hotels.cid = city.id\
          where lower(city.name) in {list_cities}\
          and hotels.{filter_col} is not null\
          order by hotels.{filter_col} desc'

stmt = text(query)
result = conn.execute(stmt)
# load every returned row into a DataFrame with readable column names
best_hotels_df = pd.DataFrame(result.fetchall(), columns=['hotel_name', 'star', 'rating', 'rating_title','price','latitude','longitude', 'city_name'])
best_hotels_df

2022-01-16 23:36:00,310 INFO sqlalchemy.engine.base.Engine select hotels.name, hotels.star, hotels.rating, hotels.rating_title, hotels.price, hotels.lat, hotels.lon, city.name          from hotels          join city          on hotels.cid = city.id          where lower(city.name) in ('bormes les mimosas', 'collioure')          and hotels.star is not null          order by hotels.star desc
2022-01-16 23:36:00,312 INFO sqlalchemy.engine.base.Engine {}


| | hotel_name | star | rating | rating_title | price | latitude | longitude | city_name |
|---|---|---|---|---|---|---|---|---|
| 0 | Hôtel La Casa Pairal | 4.0 | 8.7 | Superbe | 145.5 | 42.526167 | 3.082356 | Collioure |
| 1 | Eden Rose Grand Hotel BW Premier Collection | 4.0 | 8.7 | Superbe | 203.4 | 43.152872 | 6.342654 | Bormes les Mimosas |
| 2 | Hotel Les Jardins de Bormes | 3.0 | 8.6 | Superbe | 93.3 | 43.148934 | 6.303266 | Bormes les Mimosas |
| 3 | Hotel La Voile | 3.0 | 7.9 | Bien | | 43.12549 | 6.3573 | Bormes les Mimosas |
| 4 | Hotel Méditerranée | 3.0 | 7.9 | Bien | | 42.527083 | 3.080263 | Collioure |
| 5 | Hostellerie du Cigalou - Les Collectionneurs | 3.0 | 8.0 | Très bien | 83.3 | 43.151958 | 6.343284 | Bormes les Mimosas |
| 6 | Hôtel Princes de Catalogne | 3.0 | 8.1 | Très bien | | 42.525914 | 3.082682 | Collioure |
| 7 | La Frégate | 3.0 | 8.1 | Très bien | 85.3 | 42.526386 | 3.083257 | Collioure |
| 8 | Le Madeloc Hôtel & Spa | 3.0 | 8.8 | Superbe | 106.48 | 42.528765 | 3.078654 | Collioure |
| 9 | Le Mas des Citronniers | 3.0 | 8.0 | Très bien | | 42.525382 | 3.082564 | Collioure |
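
The hotel query above formats a Python tuple directly into the SQL text. That works for two or more cities, but a single city renders as `('collioure',)`, whose trailing comma PostgreSQL rejects, and any quote in a city name would break the statement. A minimal sketch of an alternative builds named placeholders and a matching parameter dictionary. `in_clause` is a hypothetical helper, not part of the original notebook.

```python
def in_clause(values, prefix="city"):
    """Return (placeholders, params) for a named-parameter IN clause."""
    if not values:
        raise ValueError("at least one value is required")
    params = {f"{prefix}_{i}": v for i, v in enumerate(values)}
    placeholders = "(" + ", ".join(f":{name}" for name in params) + ")"
    return placeholders, params


placeholders, params = in_clause(("collioure",))
where = f"where lower(city.name) in {placeholders}"
print(where)
print(params)
# With SQLAlchemy, the values would be bound at execution time:
# conn.execute(text(query), params)
```

Only the placeholder names end up in the SQL string; the city names travel separately as bound parameters.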


## 3.1. Visualize hotels with mapBox 📊 <a class="anchor" id="section31"></a>

In [34]:
fig = px.scatter_mapbox(best_hotels_df, lat="latitude", lon="longitude", hover_name='hotel_name', color=filter_col, size = filter_col,
                        hover_data=['city_name', 'star', 'rating', 'rating_title','price'], height=500, width=900, 
                        color_continuous_scale='oranges', zoom=7)

fig.update_layout(mapbox_style="open-street-map")

fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})

fig.show()

![image](data/best_hotel.png) 

In [5]:
import webbrowser
import os

filename = 'file:///'+os.getcwd()+'/' + 'maps/best_hotels.html'
webbrowser.open_new_tab(filename)

True