# Getting Started with Bokeh and Pandas

-------------------------------------------------

## Outline

Here we look at populous cities from each continent except Antarctica. We will use:

- Bokeh's map tiles to show each city's location
- A scatter plot to compare the climate of the cities
- `hbar` to plot the population of the cities
- `vbar` to plot the highest natural altitude in each urban area
- A Bokeh palette to color each city
- Tooltips across the various plots
- The `box_select` or `lasso_select` tool to filter cities


## Dataset

- Population data comes from the [Wikipedia list of largest cities](https://en.wikipedia.org/wiki/List_of_largest_cities), using the urban area population. Sydney was an outlier; its data was gathered from this [Wikipedia page](https://en.wikipedia.org/wiki/Sydney).


- Climate data has the maximum average temperature for the hottest summer month and minimum average temperature for the coldest month for each city.


- Altitude is the highest natural elevation found in the Urban Area of each city.


## Disclaimer

This is not a scientific study, just a showcase of how Bokeh and Pandas can be combined to create interactive plots.

---------------------------------------------------------

# Import Libraries
---

In [None]:
import numpy as np
import pandas as pd
from bokeh.io import show, output_notebook
from bokeh.models import ColumnDataSource, Range1d
from bokeh.plotting import figure, output_file
from bokeh.layouts import column, row, gridplot
from bokeh.tile_providers import CARTODBPOSITRON, get_provider
from bokeh.palettes import Colorblind6
from bokeh.transform import factor_cmap

output_notebook()

# Import Dataset
---

In [None]:
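# Build the path to the CSV file, relative to the notebook
file_path = "./data/"
file_name = "most_populated_cities_world.csv"
df_world_cities = pd.read_csv(file_path + file_name)
# Store city names as strings so they can be used as categorical factors
df_world_cities["city"] = df_world_cities["city"].astype(str)
# Quick sanity check: number of rows, then the full table
print(len(df_world_cities.index))
print(df_world_cities)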
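
The later cells read a fixed set of columns from this CSV. The following is a minimal sketch of a check that they are all present, assuming the column names used in the plotting cells of this notebook. `REQUIRED_COLUMNS` and `missing_columns` are hypothetical names introduced here for illustration.

```python
# Column names taken from the plotting cells in this notebook
REQUIRED_COLUMNS = [
    "city", "continent", "population_million",
    "longitude", "latitude",
    "avg_low_Winter_C", "avg_high_Summer_C",
    "highest_altitude_m",
]


def missing_columns(columns, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from `columns`."""
    present = set(columns)
    return [name for name in required if name not in present]


# Usage with the DataFrame loaded above
# (commented out so the sketch runs without the CSV file):
# missing = missing_columns(df_world_cities.columns)
# if missing:
#     raise KeyError(f"CSV is missing columns: {missing}")
```

Failing early here gives a clearer message than a glyph that silently renders nothing because a column name is misspelled.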

# Function to convert Latitude and Longitude to Mercator
---

In [None]:
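def convert_lat_long_to_mercator(pandas_dataframe):
    """Convert the "longitude" and "latitude" columns from degrees to Web Mercator metres.

    The DataFrame is modified in place and also returned.
    """
    # Earth's equatorial radius in metres (WGS 84 semi-major axis)
    conversion_constant = 6378137
    pandas_dataframe["longitude"] = pandas_dataframe["longitude"] * (conversion_constant * np.pi / 180.0)
    pandas_dataframe["latitude"] = np.log(
        np.tan((90 + pandas_dataframe["latitude"]) * np.pi / 360.0)
    ) * conversion_constant

    return pandas_dataframe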
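
To make the formula easier to check by hand, here is a scalar sketch of the same conversion using only the standard library. `to_web_mercator` and `EARTH_RADIUS_M` are hypothetical names introduced for this illustration; the arithmetic mirrors the function above.

```python
import math

EARTH_RADIUS_M = 6378137  # same value as conversion_constant above


def to_web_mercator(longitude_deg, latitude_deg):
    """Scalar version of the conversion above: degrees in, metres out."""
    x = longitude_deg * (EARTH_RADIUS_M * math.pi / 180.0)
    y = math.log(math.tan((90 + latitude_deg) * math.pi / 360.0)) * EARTH_RADIUS_M
    return x, y


# The equator and prime meridian map to (approximately) the origin
x0, y0 = to_web_mercator(0.0, 0.0)
# Longitude 180 maps to pi * R, roughly 20,037,508 m
x_edge, _ = to_web_mercator(180.0, 0.0)
# y grows without bound towards the poles, so latitude must stay
# strictly between -90 and 90 degrees
_, y_60 = to_web_mercator(0.0, 60.0)
print(round(x_edge), round(y_60))
```

The full east-west extent is therefore about ±20,037,508 m, so a map range of ±15,000,000 m shows most, but not all, of the world.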

# Common Plot Properties
---

In [None]:
# Converts the coordinates in place: run this cell only once per data load,
# otherwise the already-converted values are converted a second time
convert_lat_long_to_mercator(df_world_cities)
# One shared data source, used by every plot below
data_source = ColumnDataSource(df_world_cities)
TOOLS = "pan,wheel_zoom,box_zoom,reset,save,box_select,lasso_select"
TOOLTIPS = [
    ("City_Name", "@city"),
    ("Continent", "@continent"),
    ("Population (millions)", "@population_million{0,0.0}"),
]
output_file("./world_cities_visualization.html")
# Set up colors for each city
city_colors = factor_cmap('city', palette=Colorblind6, factors=df_world_cities['city'].unique())
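
`Colorblind6` holds six colors, which matches one city per continent. To my understanding, Bokeh's categorical color mapper gives factors beyond the end of the palette its `nan_color` and emits a warning, so a quick count check can be useful if the dataset grows. The following is a minimal sketch in plain Python; `check_palette_covers_factors` is a hypothetical helper, and the inputs are placeholders rather than the notebook's data or the real palette values.

```python
def check_palette_covers_factors(factors, palette):
    """Raise ValueError if there are more distinct factors than palette colors."""
    distinct = list(dict.fromkeys(factors))  # de-duplicate, keep first-seen order
    if len(distinct) > len(palette):
        uncovered = distinct[len(palette):]
        raise ValueError(f"No palette color for: {uncovered}")
    return distinct


# Placeholder inputs for illustration only
example_palette = ["red", "green", "blue"]
covered = check_palette_covers_factors(["a", "b", "a", "c"], example_palette)

# In the notebook this could be called as (assumption, not part of the original):
# check_palette_covers_factors(df_world_cities["city"], Colorblind6)
```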

# Set up Map Plot
---

In [None]:
tile_provider = get_provider(CARTODBPOSITRON)
# Range start and end supplied in Web Mercator coordinates (metres)
mercator_extent = dict(start=-15000000, end=15000000, bounds=None)
map_x_range = Range1d(**mercator_extent)
map_y_range = Range1d(**mercator_extent)
map_plot = figure(title="City locations", x_range=map_x_range, y_range=map_y_range, tools=TOOLS,
                  tooltips=TOOLTIPS, x_axis_type="mercator", y_axis_type="mercator",
                  plot_width=800, plot_height=600)
map_plot.add_tile(tile_provider)
map_plot.circle("longitude", "latitude", source=data_source, color=city_colors, fill_alpha=0.75, size=8,
                line_color='black')

# Set up Climate Plot
---

In [None]:
climate_plot = figure(title="Climate Conditions", tools=TOOLS, tooltips=TOOLTIPS,
                      plot_width=800, plot_height=600)
climate_plot.circle("avg_low_Winter_C", "avg_high_Summer_C", source=data_source, color=city_colors,
                    fill_alpha=0.75, size=8, line_color='black')
# Label the axes
climate_plot.xaxis.axis_label = "Coldest Winter Month Avg Temp (C)"
climate_plot.yaxis.axis_label = "Hottest Summer Month Avg Temp (C)"

# Set up Population Plot
---

In [None]:
population_plot = figure(title="Population in millions", tools=TOOLS, tooltips=TOOLTIPS,
                         y_range=df_world_cities["city"], plot_width=800, plot_height=600)
population_plot.hbar(y="city", left=0, right="population_million", height=0.9, source=data_source,
                     color=city_colors, fill_alpha=0.75)
population_plot.xaxis.axis_label = "Population in Millions"

# Set up Altitude Plot
---

In [None]:
altitude_plot = figure(title="Highest Natural Altitude in m", tools=TOOLS, tooltips=TOOLTIPS,
                       x_range=df_world_cities["city"], plot_width=800, plot_height=600)
altitude_plot.vbar(x="city", top="highest_altitude_m", width=0.9, source=data_source, color=city_colors,
                   fill_alpha=0.75)
# Label the y-axis
altitude_plot.yaxis.axis_label = "Highest Altitude (m)"

# Set up Layout and Output Plot
---

In [None]:
output_html_layout = column(row(map_plot, climate_plot),
                     row(population_plot, altitude_plot))
show(output_html_layout)
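
All four plots read from the same `data_source`, which is what lets `box_select` or `lasso_select` in one plot highlight the same cities in the others. The following is a minimal sketch of that linkage, assuming Bokeh is installed. The data values are placeholders, and the sketch uses `scatter` and omits plot sizes so that it does not depend on a particular Bokeh version.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Placeholder values for illustration only (not the notebook's dataset)
source = ColumnDataSource({"x": [1, 2, 3], "y": [4, 5, 6], "z": [7, 8, 9]})

left = figure(tools="box_select,lasso_select,reset")
right = figure(tools="box_select,lasso_select,reset")

# Both glyphs are backed by the same ColumnDataSource object
left_renderer = left.scatter("x", "y", source=source, size=8)
right_renderer = right.scatter("x", "z", source=source, size=8)

# A selection is stored on the source itself, so every glyph that
# uses the source sees the same selected row indices
source.selected.indices = [0, 2]
```

Creating a separate `ColumnDataSource` per plot would break this link, even if each one wrapped the same DataFrame.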