# Simple Pydeck Guide 



**Pydeck** is a package for visualizing geographic data. It is optimized for Jupyter notebooks (not Colab!).

This notebook walks through the most frequently used pydeck layers, using the [NYC taxi trip duration data](https://www.kaggle.com/c/nyc-taxi-trip-duration/data).


### Table of Contents
 + [Data description & preparation](#section-one)
 + [Scatterplot Layer](#section-two)
 + [Heatmap Layer](#section-three)
 + [ArcLayer / LineLayer](#section-four)
 + [Grid / Hexagon Layer](#section-five)


Reference: <https://pydeck.gl/index.html>

In [None]:
pip install pydeck

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
%matplotlib inline

import scipy.cluster.hierarchy as spc

import geopandas as gpd
from IPython.display import display
from shapely.geometry import Polygon
from shapely.geometry import LineString, shape
from shapely.geometry import Point
from shapely.geometry.polygon import Polygon
import pydeck as pdk
import pydeck
from itertools import product

import warnings
warnings.filterwarnings('ignore')


<a id="section-one"></a>
# Data description

In this notebook, I'll use the NYC taxi trip duration data, which contains location-related variables. To use pydeck, a dataset needs location variables such as longitude and latitude, or polygon / multipolygon geometries.

 + pickup_longitude: longitude of taxi pickup location
 + pickup_latitude: latitude of taxi pickup location
 + dropoff_longitude: longitude of taxi dropoff location
 + dropoff_latitude: latitude of taxi dropoff location
 + passenger_count: the number of passengers in taxi
 + trip_duration: duration of the taxi trip in seconds

In [None]:
# load data
df = pd.read_csv('../input/nyc-taxi-trip-duration/train.zip', header = 0)
df.head(3)

In [None]:
df.shape

The original dataset has over 1,400,000 rows. With that many rows, pydeck renders very slowly, so I'm going to use only the first 10,000 rows for these simple visualizations. This reduced dataset is called **sample** below.

In [None]:
# reduce size of dataset
sample = df[:10000]

# Structure of pydeck layer
Below is the basic structure for creating a visualization layer with the pydeck package. You can customize each layer depending on the layer type.


---
**Layer(pdk.Layer)**
* layer_type: type of layer to apply, e.g. 'ScatterplotLayer', 'HeatmapLayer'
* input_data: your own data
* get_position: receives longitude and latitude (location data), in that order
* pickable, auto_highlight: make the layer respond to the cursor, so that hovering over a data point highlights it and can show its information box


**View state(pdk.ViewState)**
* longitude: longitude of center you choose
* latitude: latitude of center you choose
* zoom: magnification level of the map (the higher the zoom, the closer the view)
* pitch: up/down angle relative to map's plane. (0: looking directly at the map)
* bearing: left/right angle relative to map's true north. (0: aligned to true north)

**Rendering(pdk.Deck)**
* layers: the list of layers you want to show on the map
* initial_view_state: the view state you set


```python
layer = pdk.Layer(layer_type,
                  input_data,
                  get_position = '[longitude, latitude]',  # column names, longitude first
                  pickable = True,
                  auto_highlight = True)

view_state = pdk.ViewState(latitude = 40.75,   # NYC: latitude is about 40.75
                           longitude = -73.88, # NYC: longitude is about -73.88
                           zoom = 10,
                           pitch = 40,
                           bearing = 30)

r = pdk.Deck(layers = [layer], initial_view_state = view_state)
```
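If you don't want to hard-code the center, it can be derived from the data. Below is a minimal sketch in plain Python; the helper name `view_kwargs_from_points` is my own (not part of pydeck), and it simply takes the middle of the bounding box of the points. The resulting dictionary uses the same argument names as the view state above, so it could be passed on as `pdk.ViewState(**kwargs)`.

```python
def view_kwargs_from_points(points, zoom=10, pitch=0, bearing=0):
    """Return view-state keyword arguments centered on the points' bounding box.

    points: iterable of (longitude, latitude) pairs, longitude first.
    This helper is hypothetical and only illustrates the idea.
    """
    points = list(points)
    if not points:
        raise ValueError("points must not be empty")
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return {
        "longitude": (min(lons) + max(lons)) / 2,
        "latitude": (min(lats) + max(lats)) / 2,
        "zoom": zoom,
        "pitch": pitch,
        "bearing": bearing,
    }


# A few made-up pickup locations around NYC (longitude, latitude)
pickups = [(-74.00, 40.70), (-73.80, 40.80), (-73.90, 40.75)]
kwargs = view_kwargs_from_points(pickups, zoom=10, pitch=40, bearing=30)
print(kwargs)
```

Keeping longitude first everywhere avoids the most common mistake with map data, which is swapping the two coordinates.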

<a id="section-two"></a>
## Scatterplot Layer

The scatterplot layer shows the overall distribution of geographic data. Use this layer when you want to plot the location of every data point. Here I plot the pickup longitude and latitude on the map.

---

* get_fill_color: color of dots (RGBA color)
* get_line_color: color of border line of dots. (RGBA color)


In [None]:
layer = pdk.Layer('ScatterplotLayer', # Layer type
                  sample, # input data
                  get_position = '[pickup_longitude, pickup_latitude]', # location data
                  get_fill_color = '[255, 0, 255]',
                  get_line_color = '[0, 0, 0]',
                  get_radius = 30,
                  pickable = True,
                  auto_highlight = True)

center = [-73.8827157874292, 40.755388418135865] # center: lon, lat of the center of NYC
view_state = pdk.ViewState(latitude = center[1], 
                           longitude = center[0], 
                           zoom = 10)

# rendering
r = pdk.Deck(layers = [layer], initial_view_state = view_state)
r.to_html('demo.html')

You can also plot multiple layers at once! This time, I'll plot the pickup layer and the dropoff layer together. Purple is the pickup layer, green is the dropoff layer.

**How to?**
* Create each layer you want to show by calling pdk.Layer multiple times: layer1, layer2, layer3
* When rendering, put the layers in a list: [layer1, layer2, layer3]

That's it!

In [None]:
pickup_layer = pdk.Layer('ScatterplotLayer',
                  sample,
                  get_position = '[pickup_longitude, pickup_latitude]',
                  get_fill_color = '[255, 0, 255]',
                  get_line_color = '[0, 0, 0]',
                  get_radius = 50,
                  pickable = True,
                  auto_highlight = True,
                  filled = True)

dropoff_layer = pdk.Layer('ScatterplotLayer',
                  sample,
                  get_position = '[dropoff_longitude, dropoff_latitude]',
                  get_fill_color = '[0, 255, 0]',
                  get_line_color = '[0, 0, 0]',
                  get_radius = 50,
                  pickable = True,
                  auto_highlight = True,
                  filled = True)

center = [-73.8827157874292, 40.755388418135865]
view_state = pdk.ViewState(latitude = center[1], 
                           longitude = center[0], 
                           zoom = 10)


r = pdk.Deck(layers = [pickup_layer, dropoff_layer], initial_view_state = view_state)
r.to_html('scatter_2.html')

<a id="section-three"></a>
## Heatmap Layer
Use the heatmap layer when you want to analyze the density of regions on a map. You can customize the colors of the heatmap with `color_range`. Here I use the layer's standard color map.

The closer a region's color is to red, the higher its data density is relative to other regions.
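To build a custom `color_range`, you need a list of RGB colors ordered from low density to high density. Here is a small sketch that interpolates such a list between two colors. The helper `color_ramp` and the two end colors are my own choices for illustration, and the six steps are an assumption (I believe the default range also has six colors, but check the documentation).

```python
def color_ramp(start, end, steps=6):
    """Linearly interpolate `steps` RGB colors from start to end (inclusive).

    Hypothetical helper, not part of pydeck.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    ramp = []
    for i in range(steps):
        t = i / (steps - 1)  # 0.0 for the first color, 1.0 for the last
        ramp.append([round(s + (e - s) * t) for s, e in zip(start, end)])
    return ramp


# Light yellow (low density) to dark red (high density)
custom_range = color_ramp([255, 255, 178], [189, 0, 38])
for color in custom_range:
    print(color)
```

The resulting list can then be passed to the layer as `color_range = custom_range`.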

### 1. Heatmap Layer

In [None]:
layer = pdk.Layer('HeatmapLayer',
                  sample,
                  get_position = '[dropoff_longitude, dropoff_latitude]',
                  #color_range = your customized color,
                  pickable = True,
                  auto_highlight = True)

center = [-73.8827157874292, 40.755388418135865]
view_state = pdk.ViewState(latitude = center[1], 
                           longitude = center[0], 
                           zoom = 10)


r = pdk.Deck(layers = [layer], initial_view_state = view_state)
r.to_html('heatmap.html')

### 2. ScreenGridLayer
The first heatmap shows the density of the data well, but the borders between its colors are quite vague. Instead, we can apply the **ScreenGridLayer**, which draws the heatmap as a grid of square cells.

---
* cell_size_pixels: adjusts the size of each cell, in pixels.

In [None]:
layer = pdk.Layer('ScreenGridLayer',
                  sample,
                  get_position = '[dropoff_longitude, dropoff_latitude]',
                  cell_size_pixels = 5,
                  pickable = True,
                  auto_highlight = True)

center = [-73.8827157874292, 40.755388418135865]
view_state = pdk.ViewState(latitude = center[1], 
                           longitude = center[0], 
                           zoom = 9)


r = pdk.Deck(layers = [layer], initial_view_state = view_state)
r.to_html('h2.html')

<a id="section-four"></a>
## Line/Arc Layer
The line layer and the arc layer draw lines that connect start and end locations. The line layer draws straight lines, while the arc layer draws arch-shaped lines from the start position to the end position. (Personal opinion: the arc layer is easier to read!)


I vary the color of the lines depending on the number of passengers in each row. Similarly, you can vary the color of the lines using other variables.

**How to?**
* Put "passenger_count * some multiplier" in one of the RGB channels. Remember that the maximum value of a color channel is 255.

---
* get_source_position: start location(lon, lat)
* get_target_position: end location(lon, lat)
* get_width: width of lines
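Because a color channel cannot go above 255, multiplying `passenger_count` by a fixed number can overshoot for larger counts. One way around this is to compute the color in Python first. The sketch below is only an illustration: the helper `passenger_color` is hypothetical, and the cap of 6 passengers is an assumption about the data, not something checked here.

```python
def passenger_color(passenger_count, max_passengers=6, alpha=40):
    """Map a passenger count to an RGBA color with the green channel in 0..255.

    max_passengers is an assumed upper bound; larger counts are clipped to it.
    """
    clipped = max(0, min(passenger_count, max_passengers))
    green = 255 * clipped // max_passengers  # integer scaling, never above 255
    return [255, green, 0, alpha]


colors = [passenger_color(n) for n in (1, 2, 6, 9)]
print(colors)
```

If such colors are stored in a DataFrame column (for example a column named `color`), the column name can be given to the layer instead of an expression, e.g. `get_color = 'color'`.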

### 1. Line Layer

In [None]:
sample2 = df[:1000] # use 1000 data

layer = pdk.Layer(
    'LineLayer',
    sample2,
    get_source_position='[pickup_longitude, pickup_latitude]',
    get_target_position='[dropoff_longitude, dropoff_latitude]',
    get_width = 2,
    get_color = '[255, passenger_count * 130, 0, 40]',
    pickable=True,
    auto_highlight=True)
    
view_state = pydeck.data_utils.compute_view(sample[['pickup_longitude', 'pickup_latitude']].values)
view_state.zoom = 10

r = pdk.Deck(layers=[layer], initial_view_state=view_state)
r.to_html('h.html')

### 2. Arc Layer

---
* get_source_color: color of source position
* get_target_color: color of target position

In [None]:
layer = pdk.Layer(
    'ArcLayer',
    sample2,
    get_source_position='[pickup_longitude, pickup_latitude]',
    get_target_position='[dropoff_longitude, dropoff_latitude]',
    get_width = 'passenger_count', # vary line width by number of passengers
    get_source_color='[255, 255, 120]',
    get_target_color='[255, 0, 0]',
    pickable=True,
    auto_highlight=True)
    

view_state = pydeck.data_utils.compute_view(sample[['pickup_longitude', 'pickup_latitude']].values)
view_state.zoom = 8

r = pdk.Deck(layers=[layer], initial_view_state=view_state)
r.to_html('arc.html')

<a id="section-five"></a>
## Grid/Hexagon Layer
The grid and hexagon layers are similar to the heatmap layer, but differ in that they can raise (extrude) each cell according to the density of that region on the map. The grid layer draws the heatmap on square cells, whereas the hexagon layer draws it on hexagon-shaped cells.

---
* elevation_scale: adjusts height of regions
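To get a feeling for what these layers do, here is a rough conceptual sketch of grid aggregation in plain Python: points are counted per cell, and the height of a cell grows with its count. This is not how pydeck implements it (as far as I know the real layers size cells in meters and apply their own elevation range); the helper `grid_counts` and the cell size in degrees are assumptions for illustration only.

```python
import math
from collections import Counter


def grid_counts(points, cell_deg=0.01):
    """Count (longitude, latitude) points per square cell of cell_deg degrees.

    Hypothetical helper that only illustrates the aggregation idea.
    """
    counts = Counter()
    for lon, lat in points:
        cell = (math.floor(lon / cell_deg), math.floor(lat / cell_deg))
        counts[cell] += 1
    return counts


# Made-up pickups: the first two fall into the same cell
pickups = [(-73.985, 40.755), (-73.984, 40.756), (-73.955, 40.775)]
counts = grid_counts(pickups)

elevation_scale = 10
heights = {cell: n * elevation_scale for cell, n in counts.items()}
print(heights)
```

A larger `elevation_scale` stretches every column by the same factor, so the relative differences between dense and sparse cells stay the same.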


### 1. Grid Layer

In [None]:
layer = pdk.Layer(
    "GridLayer",
    sample,
    get_position = '[pickup_longitude, pickup_latitude]',
    auto_highlight = True,
    elevation_scale = 10,
    pickable=True,
    extruded=True
)

# Set the viewport location
view_state = pdk.ViewState(
    longitude = center[0], latitude = center[1], zoom = 9, pitch = 50, bearing = -45,
)

# Render
r = pdk.Deck(layers=[layer], initial_view_state=view_state)
r.to_html('r.html')

### 2. Hexagon Layer

In [None]:
layer = pdk.Layer(
    "HexagonLayer",
    sample,
    get_position = '[pickup_longitude, pickup_latitude]',
    auto_highlight = True,
    elevation_scale = 10,
    pickable=True,
    extruded=True
)

# Set the viewport location
view_state = pdk.ViewState(
    longitude = center[0], latitude = center[1], zoom = 9, pitch = 50, bearing = -45,
)

# Render
r = pdk.Deck(layers=[layer], initial_view_state=view_state)
r.to_html('hexagon.html')