## Activity n: City density by first letter using an interactive custom layer

In this last activity for geoplotlib, you'll combine all the methodologies learned in the previous exercises and activity to create an interactive visualization that displays the cities that start with a given letter, controlled by simply pressing the left and right arrow keys on your keyboard.

Since we use the same setup to create custom layers as the library does, you will be able to understand the library implementations of most of the layers provided by geoplotlib after this activity: https://github.com/andrea-cuttone/geoplotlib/blob/master/geoplotlib/layers.py.

Before we can start, however, we need to import our dataset.   
For this activity, we'll work with geo-spatial data that contains all cities with their coordinates and their population.

**Note:**   
This time the dataset is not yet added into the data folder. You have to download it from here:   
https://www.kaggle.com/max-mind/world-cities-database#worldcitiespop.csv

#### Loading the dataset

In [1]:
# importing the necessary dependencies


Again, provide the `dtype` argument to tell pandas to read the `'Region'` column as strings (`str`).

In [2]:
# loading the Dataset (make sure to have the dataset downloaded)


**Note:**   
If we import our dataset without defining the dtype of the *Region* column as a string, we will get a warning telling us that it has mixed data types.   
We can get rid of this warning by explicitly defining the type of the values in this column using the `dtype` parameter (the older `np.str` alias was removed in NumPy 1.24, so use the built-in `str`).   
`dtype={'Region': str}`
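
As a minimal sketch of what the `dtype` argument does, the snippet below reads a tiny in-memory CSV instead of the real file. The sample rows and the `load_cities` helper are made up for illustration; only the column names are meant to mirror the dataset. It passes the built-in `str`, which pandas accepts as a dtype.

```python
import io

import pandas as pd

# Made-up sample rows for illustration; the real data comes from worldcitiespop.csv.
SAMPLE_CSV = """Country,City,AccentCity,Region,Population,Latitude,Longitude
ad,andorra la vella,Andorra la Vella,07,20430,42.5,1.5167
ad,canillo,Canillo,02,3292,42.5667,1.6
"""


def load_cities(source):
    """Read a cities CSV and force the Region column to be parsed as text."""
    return pd.read_csv(source, dtype={'Region': str})


sample = load_cities(io.StringIO(SAMPLE_CSV))

# Region codes such as '07' keep their leading zero because they stay strings.
print(sample['Region'].tolist())
```

For the real file, `load_cities` would be called with the path to the downloaded CSV instead of the `StringIO` object.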

In [3]:
# looking at the data types of each column


---

#### Mapping `Latitude` and `Longitude` to `lat` and `lon`

As we have learned in Activity 27, the first thing we have to do is prepare our dataset to be usable by geoplotlib by assigning two new columns, `lat` and `lon`.

Map the `Latitude` and `Longitude` columns into `lat` and `lon` columns which are used by geoplotlib.
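
One possible way to do this mapping is a plain column assignment. The sketch below uses a small made-up DataFrame named `cities` in place of the full dataset.

```python
import pandas as pd

# Made-up sample standing in for the full dataset.
cities = pd.DataFrame({
    'AccentCity': ['Zug', 'Vienna'],
    'Latitude': [47.1667, 48.2],
    'Longitude': [8.5167, 16.3667],
})

# geoplotlib looks for columns named lat and lon, so copy the values over.
cities['lat'] = cities['Latitude']
cities['lon'] = cities['Longitude']

print(cities[['AccentCity', 'lat', 'lon']])
```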

In [4]:
# mapping Latitude to lat and Longitude to lon


---

#### Filtering our dataset for cities in Europe

We want to focus our attention on the European countries and their cities.   
A list of all European countries is given below.

In [100]:
# 2 letter country codes of europe without russia
europe_country_codes = ['al', 'ad', 'at', 'by', 'be', 'ba', 'bg', 'hr', 'cy', 'cz', 'dk', 'ee', 'fo', 'fi', 'fr', 'de'
                        , 'gi', 'gr', 'hu', 'is', 'ie', 'im', 'it', 'xk', 'lv', 'li', 'lt', 'lu', 'mk', 'mt', 'md', 'mc'
                        , 'me', 'nl', 'no', 'pl', 'pt', 'ro', 'sm', 'rs', 'sk', 'si', 'es', 'se', 'ch', 'ua', 'gb'
                        , 'va']

Given this list, we want to use filtering to get a dataset that only contains European cities.   
The filtering works exactly how we learned it in the data wrangling chapter.

Use the `europe_country_codes` to filter down our dataset by using the `isin()` method as a condition for our DataFrame.   
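
A minimal sketch of this kind of filter, assuming the country codes live in a column named `Country`. The sample rows, the shortened code list `sample_codes`, and the name `europe_sample` are illustrative only.

```python
import pandas as pd

# Shortened code list and made-up rows, for illustration only.
sample_codes = ['at', 'ch', 'de']
cities = pd.DataFrame({
    'Country': ['ch', 'us', 'de', 'jp'],
    'AccentCity': ['Zug', 'Boston', 'Berlin', 'Osaka'],
})

# isin() yields a boolean mask that is True where the code is in the list.
europe_sample = cities[cities['Country'].isin(sample_codes)]

# Comparing both lengths shows how many rows the filter removed.
print(len(cities), len(europe_sample))
```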

In [5]:
# filtering the dataset for countries in europe


Print the length of both the whole dataset and the filtered dataset.

In [6]:
# printing the length of both datasets


---

#### Observing cities that start with a Z

As a preparation for our interactive visualization, we want to do a test run with cities that start with the letter Z.

Filter down our europe dataset by using `europe_dataset['AccentCity'].str.startswith('Z')` as a filter condition.   
Print out the number of cities starting with Z and the first 5 rows of our filtered dataset.
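
The filter condition can be tried on a small made-up sample first. The DataFrame and the name `cities_with_z` below are illustrative, not part of the dataset.

```python
import pandas as pd

# Made-up rows standing in for the filtered Europe dataset.
europe_sample = pd.DataFrame({
    'Country': ['hr', 'de', 'hr', 'ch'],
    'AccentCity': ['Zagreb', 'Berlin', 'Zadar', 'Zug'],
})

# str.startswith is case-sensitive, so only names with a capital Z match.
# If the column held missing names, passing na=False would keep the mask boolean.
cities_with_z = europe_sample[europe_sample['AccentCity'].str.startswith('Z')]

print(len(cities_with_z))
print(cities_with_z.head())
```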

In [7]:
# filtering the europe dataset for cities that start with Z


We want to take a quick look at the dataset of cities starting with Z using a dot density plot, and also get some information about each city using the previously seen `f_tooltip` argument. In order to use `f_tooltip`, we need to wrap our dataset in a `DataAccessObject`.

Create a new `DataAccessObject` from our cities with Z dataset and visualize it with the dot plot.   
Use a tooltip that outputs the Country and City name separated by a `-` (e.g. Ch - Zürich). 
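
A rough sketch of how the tooltip and the dot plot could fit together. The helper names `city_tooltip` and `show_dot_map` are hypothetical, and the sketch assumes the tooltip callback receives one record that can be indexed by column name. geoplotlib is imported inside the function so the tooltip formatting can be tried on its own.

```python
def city_tooltip(record):
    """Format a record as 'Country - City', e.g. 'Ch - Zug'."""
    return '{} - {}'.format(record['Country'].title(), record['AccentCity'])


def show_dot_map(cities):
    """Plot one dot per city with a hover tooltip (needs lat and lon columns)."""
    # Imported here so city_tooltip can be used without geoplotlib installed.
    import geoplotlib
    from geoplotlib.utils import DataAccessObject

    data = DataAccessObject(cities)
    geoplotlib.dot(data, f_tooltip=city_tooltip)
    geoplotlib.show()


# Trying the formatter on a plain dict standing in for one record.
print(city_tooltip({'Country': 'ch', 'AccentCity': 'Zug'}))
```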

In [9]:
# using dot density to plot a point for each city
from geoplotlib.utils import DataAccessObject


As a second step, we want to use a Voronoi plot to display the density of cities that start with the letter Z.

Create a new Voronoi plot using a color map of `Reds_r`, a max area of `1e5`, and an alpha value of `50` so we can still see the map peeking through.
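
The three settings named above map onto keyword arguments of `geoplotlib.voronoi`. In the sketch below, the `VORONOI_OPTIONS` dictionary and the `show_voronoi_map` helper are hypothetical names, and `data` is assumed to be the Z-cities dataset with `lat` and `lon` columns.

```python
# Settings from the task description, collected in one place.
VORONOI_OPTIONS = {'cmap': 'Reds_r', 'max_area': 1e5, 'alpha': 50}


def show_voronoi_map(data):
    """Draw Voronoi cells for the given cities; smaller cells mean denser areas."""
    # Imported here so the options can be inspected without geoplotlib installed.
    import geoplotlib

    geoplotlib.voronoi(data, **VORONOI_OPTIONS)
    geoplotlib.show()
```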

In [10]:
# displaying the density of cities starting with z using a voronoi plot


Here we can see that cities whose names start with Z are more common in the Eastern European countries.

#### Creating our interactive custom layer

We now want to create an interactive visualization that displays, as a dot, each city that starts with the currently selected first letter. The letter selected by default is `A`.   
We need a way to iterate through the letters using the left and right arrow keys.   
As described in the introduction section of custom layers, we can make use of the `on_key_release` method, which is specifically designed for this.
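
The wrap-around between A and Z can be worked out separately from geoplotlib. The sketch below is one possible way to do it with the modulo operator. The helper names `step_letter` and `handle_key` are hypothetical, and the key codes are passed in as parameters so the logic runs without pyglet. In the layer itself they would be `pyglet.window.key.LEFT` and `pyglet.window.key.RIGHT`.

```python
import string

start_letters = list(string.ascii_uppercase)


def step_letter(index, step):
    """Move the index by step and wrap around at both ends of the alphabet."""
    return (index + step) % len(start_letters)


def handle_key(index, key, left_key, right_key):
    """Return (new_index, changed) for a released key."""
    if key == right_key:
        return step_letter(index, 1), True
    if key == left_key:
        return step_letter(index, -1), True
    # Any other key leaves the selection untouched.
    return index, False


# Placeholder key codes for the demonstration only.
LEFT, RIGHT = 1, 2
index, changed = handle_key(0, LEFT, LEFT, RIGHT)
print(start_letters[index], changed)
```

Python's `%` returns a non-negative result for a positive divisor, so stepping left from index 0 lands on the last letter without any special case.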

In order to create our custom layer:
- Filter the dataset (`self.data`) in the `invalidate` method, keeping only the cities that start with the current letter, `start_letters[self.start_letter]`.
- Create a new `BatchPainter()` and project the `lon` and `lat` values to `x` and `y` values.
- Use the BatchPainter to paint the points on the map with a size of 2.

- Call the `batch_draw()` method in the `draw` method and use the `ui_manager` to add an `info` dialog to the screen telling the user which start letter is currently used.

- Check which key was released using pyglet (e.g. `pyglet.window.key.RIGHT`). If the right or left arrow key was used, increment or decrement the `start_letter` value of the `FilterLayer` class accordingly. (Use modulo so the index wraps around from Z to A and from A to Z.)
- Make sure to `return True` in the `on_key_release` method if you changed `start_letter`, to trigger a re-draw of the points.
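
The filtering and painting steps above can be sketched as two small helpers. This is only one possible arrangement, not the reference solution. The names `cities_for_letter` and `build_painter` are hypothetical, and `proj` is assumed to be the projection object that geoplotlib passes to `invalidate`.

```python
import pandas as pd


def cities_for_letter(data, letter):
    """Keep only the rows whose AccentCity starts with the given letter."""
    return data[data['AccentCity'].str.startswith(letter)]


def build_painter(data, letter, proj):
    """Create a BatchPainter holding one point per matching city."""
    # Imported here so the filter above can be tried without geoplotlib installed.
    from geoplotlib.core import BatchPainter

    subset = cities_for_letter(data, letter)
    painter = BatchPainter()
    # Project geographic coordinates to screen coordinates.
    x, y = proj.lonlat_to_screen(subset['lon'].values, subset['lat'].values)
    painter.points(x, y, 2)
    return painter


# Made-up rows to try the filter on its own.
sample = pd.DataFrame({
    'AccentCity': ['Aachen', 'Berlin', 'Arles'],
    'lat': [50.7753, 52.52, 43.6768],
    'lon': [6.0839, 13.405, 4.6278],
})
print(cities_for_letter(sample, 'A')['AccentCity'].tolist())
```

Inside the layer, `invalidate` would store the returned painter on `self`, and `draw` would call its `batch_draw()` method before showing the current letter through `ui_manager.info(...)`.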

> Note: Make use of the previous activity and the examples at https://github.com/andrea-cuttone/geoplotlib/tree/master/examples

In [11]:
# custom layer creation
import pyglet
import geoplotlib
from geoplotlib.layers import BaseLayer
from geoplotlib.core import BatchPainter
from geoplotlib.utils import BoundingBox

start_letters = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L'
                , 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W'
                , 'X', 'Y', 'Z']

class FilterLayer(BaseLayer):
    def __init__(self, dataset, bbox=BoundingBox.WORLD):
        self.data = dataset
        self.start_letter = 0
        self.view = bbox
        
    def invalidate(self, proj):
        pass
        
    def draw(self, proj, mouse_x, mouse_y, ui_manager):  
        pass
        
    def on_key_release(self, key, modifiers):
        return False
        
    # bounding box that gets used when layer is created
    def bbox(self):
        return self.view

Once you've created the custom layer, we only need to call the `add_layer()` method of geoplotlib, providing our custom layer with the given `BoundingBox` of Europe.

In [12]:
# adding the custom layer with the bounding box of Europe
europe_bbox = BoundingBox(north=68.574309, west=-25.298424, south=34.266013, east=47.387123)
