<a href="https://colab.research.google.com/github/vitorgaboardi/data-science/blob/master/Dash_Uber_Data_App.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 1.0 Import Libraries

In [1]:
# Install Plotly
!pip install Plotly==4.12

# Install Dash
!pip install dash
!pip install dash-html-components
!pip install dash-core-components
!pip install dash-table


In [2]:
import os.path
import sys, json
import requests
import subprocess

import numpy as np
import pandas as pd
import plotly.express as px

from requests.exceptions import RequestException
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

from collections import namedtuple

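# 2.0 Configuring Ngrok

[Ngrok](https://ngrok.com/) will be used to expose the Dash app, which runs locally inside the notebook environment, through a public URL.

First, you need to download Ngrok. The first function below downloads and unzips the Ngrok binary if it is not already present. The second one starts the tunnel and retrieves its public URL: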

In [3]:
def download_ngrok():
    # Download and unzip the ngrok binary only if it is not already present
    if not os.path.isfile('ngrok'):
        !wget https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
        !unzip -o ngrok-stable-linux-amd64.zip

In [4]:
Response = namedtuple('Response', ['url', 'error'])

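def get_tunnel():
    try:
        # Start ngrok in the background, forwarding to port 8050 (where Dash runs)
        tunnel = subprocess.Popen(['./ngrok', 'http', '8050'])

        # Retry the connection while ngrok's local API is still starting up
        session = requests.Session()
        retry = Retry(connect=3, backoff_factor=0.5)
        adapter = HTTPAdapter(max_retries=retry)
        session.mount('http://', adapter)

        # Ask ngrok's local API for the list of open tunnels
        res = session.get('http://localhost:4040/api/tunnels')
        res.raise_for_status()

        # Read the public URL of the first tunnel from the JSON response
        tunnel_str = res.text
        tunnel_cfg = json.loads(tunnel_str)
        tunnel_url = tunnel_cfg['tunnels'][0]['public_url']

        return Response(url=tunnel_url, error=None)
    except RequestException as e:
        return Response(url=None, error=str(e))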
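
The lookup in `get_tunnel` takes the first entry of `tunnels`, which raises an error if the list is still empty. As a minimal sketch, assuming the same JSON shape that the function above reads, the helper below prefers an `https://` tunnel when one is listed and returns `None` when no tunnel is reported. The name `pick_public_url` and the sample payload are hypothetical and only serve as an illustration; they are not part of ngrok or of the original notebook.

```python
import json

def pick_public_url(tunnel_json):
    """Return a public URL from the /api/tunnels JSON text, or None."""
    tunnels = json.loads(tunnel_json).get('tunnels', [])
    urls = [t['public_url'] for t in tunnels if 'public_url' in t]
    # Prefer the HTTPS tunnel when both HTTP and HTTPS are listed.
    for url in urls:
        if url.startswith('https://'):
            return url
    # Fall back to the first URL of any scheme, or None if nothing is listed.
    return urls[0] if urls else None

# Made-up payload containing only the field that get_tunnel() reads.
sample = json.dumps({'tunnels': [
    {'public_url': 'http://example.ngrok.io'},
    {'public_url': 'https://example.ngrok.io'},
]})
print(pick_public_url(sample))
print(pick_public_url('{"tunnels": []}'))
```

Returning `None` instead of raising lets the caller decide whether to wait and query the API again.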

# 3.0 Dataset preparation

In this application, we are going to plot Uber rides in NYC using a dataset with records from May 2014.

In [176]:
import pandas as pd

# Reading DataFrame
data = pd.read_csv("uber-raw-data-may14.csv")
data.head()

| | Date/Time | Lat | Lon | Base |
|---|---|---|---|---|
| 0 | 5/1/2014 0:02:00 | 40.7521 | -73.9914 | B02512 |
| 1 | 5/1/2014 0:06:00 | 40.6965 | -73.9715 | B02512 |
| 2 | 5/1/2014 0:15:00 | 40.7464 | -73.9838 | B02512 |
| 3 | 5/1/2014 0:17:00 | 40.7463 | -74.0011 | B02512 |
| 4 | 5/1/2014 0:17:00 | 40.7594 | -73.9734 | B02512 |


In [100]:
# Data Information
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 652435 entries, 0 to 652434
Data columns (total 4 columns):
 #   Column     Non-Null Count   Dtype  
---  ------     --------------   -----  
 0   Date/Time  652435 non-null  object 
 1   Lat        652435 non-null  float64
 2   Lon        652435 non-null  float64
 3   Base       652435 non-null  object 
dtypes: float64(2), object(2)
memory usage: 19.9+ MB


In [177]:
## Splitting Date and Time information
# Convert to timestamp
data['Date/Time'] = pd.to_datetime(data['Date/Time'])    

# Getting information
data['Day'] = data['Date/Time'].dt.day
data['Hour'] = data['Date/Time'].dt.hour
data['Size'] = 1                            #Necessary to set the size of each circle
#data['Month'] = data['Date/Time'].dt.month
#data['Minute'] = data['Date/Time'].dt.minute

# Dropping the "Date/Time" and "Base" columns
data = data.drop(['Date/Time', 'Base'], axis=1)


In [178]:
data.head()

| | Lat | Lon | Day | Hour | Size |
|---|---|---|---|---|---|
| 0 | 40.7521 | -73.9914 | 1 | 0 | 1 |
| 1 | 40.6965 | -73.9715 | 1 | 0 | 1 |
| 2 | 40.7464 | -73.9838 | 1 | 0 | 1 |
| 3 | 40.7463 | -74.0011 | 1 | 0 | 1 |
| 4 | 40.7594 | -73.9734 | 1 | 0 | 1 |
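
The date handling above can be tried on a tiny frame before running it on the full file. The sketch below is only an illustration: the first row copies a timestamp from the dataset preview, the second row is made up, and passing an explicit `format` is an optional choice that states the month/day/year order instead of leaving it to inference.

```python
import pandas as pd

# Two rows in the dataset's Date/Time layout; the second row is made up.
sample = pd.DataFrame({
    'Date/Time': ['5/1/2014 0:02:00', '5/25/2014 13:45:00'],
    'Lat': [40.7521, 40.7500],
    'Lon': [-73.9914, -73.9900],
})

# Parse the strings with an explicit month/day/year format.
stamps = pd.to_datetime(sample['Date/Time'], format='%m/%d/%Y %H:%M:%S')

# Keep only the parts used later: day of the month and hour of the day.
sample['Day'] = stamps.dt.day
sample['Hour'] = stamps.dt.hour
sample = sample.drop(columns=['Date/Time'])
print(sample)
```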


## 3.1 Selecting information based on day

In [124]:
day = 25
data_day = data.loc[data['Day'] == day]
data_day.head()

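| | Lat | Lon | Base | Day | Hour | Size |
|---|---|---|---|---|---|---|
| 29868 | 40.7669 | -73.9676 | B02512 | 25 | 0 | 1 |
| 29869 | 40.7325 | -73.9935 | B02512 | 25 | 0 | 1 |
| 29870 | 40.7426 | -74.0072 | B02512 | 25 | 0 | 1 |
| 29871 | 40.7471 | -73.9864 | B02512 | 25 | 0 | 1 |
| 29872 | 40.7278 | -73.9857 | B02512 | 25 | 0 | 1 |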
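
Once a day is selected, the number of rides per hour can be computed directly with pandas. The sketch below uses a small made-up stand-in for the prepared `data` frame, so the counts are illustrative only; `rides_per_hour` is a name introduced here.

```python
import pandas as pd

# Tiny stand-in for the prepared `data` frame; the values are made up.
data = pd.DataFrame({
    'Day':  [24, 25, 25, 25, 25],
    'Hour': [23, 0, 0, 17, 17],
})

day = 25
data_day = data.loc[data['Day'] == day]

# Number of rows (rides) per hour for the chosen day, ordered by hour.
rides_per_hour = data_day.groupby('Hour').size()
print(rides_per_hour)
```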


## 3.2 The Dash app

Dash apps are composed of two parts: the [layout](https://dash.plotly.com/layout), which describes what the application looks like, and the [interactivity](https://dash.plotly.com/basic-callbacks) of the application.

For the layout, we will use the `dash_core_components` and `dash_html_components` libraries, but you can also build your own components with JavaScript and React.js.


- The layout is composed of components like `html.Div` and `dcc.Graph`.
- The `dash_html_components` library has a component for every HTML tag. For example: Div, H6 and Br.
- The `dash_core_components` library provides interactive, higher-level components. For example: Graph, Input and Slider.

First, we will build the figures we want with Plotly Express, and then we will embed them in a Dash app:

In [164]:
print(px.colors.sequential.Sunset)

['rgb(243, 231, 155)', 'rgb(250, 196, 132)', 'rgb(248, 160, 126)', 'rgb(235, 127, 134)', 'rgb(206, 102, 147)', 'rgb(160, 89, 160)', 'rgb(92, 83, 165)']


In [163]:
# Create a histogram of the rides per hour for the selected day
fig = px.histogram(data_day, x="Hour", color="Hour", 
                   color_discrete_sequence=px.colors.qualitative.Light24)
fig.show()

In [179]:
fig = px.scatter_mapbox(data_day, 
                        lat='Lat', 
                        lon='Lon', 
                        size='Size',
                        color='Hour',
                        color_continuous_scale=px.colors.sequential.Sunset,   #px.colors.cyclical.HSV,
                        size_max=5, 
                        zoom=12,
                        mapbox_style='carto-positron')
fig.update_layout(margin={'r':0,'t':0,'l':0,'b':0})
fig.show()

The final script:

In [173]:
%%writefile my_dash_app.py
import dash
import dash_core_components as dcc
import dash_html_components as html
import plotly.express as px
import pandas as pd


external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']

app = dash.Dash(__name__, external_stylesheets=external_stylesheets)

#Dataset
day = 25
data = pd.read_csv("uber-raw-data-may14.csv")

data['Date/Time'] = pd.to_datetime(data['Date/Time'])    
data['Day'] = data['Date/Time'].dt.day
data['Hour'] = data['Date/Time'].dt.hour
data['Size'] = 1                            

data = data.drop(['Date/Time'], axis=1)
data_day = data.loc[data['Day'] == day]

fig = px.scatter_mapbox(data_day, 
                        lat='Lat', 
                        lon='Lon', 
                        size='Size',
                        color='Hour',
                        color_continuous_scale=px.colors.sequential.Sunset,
                        size_max=5, 
                        zoom=12,
                        mapbox_style='carto-positron',
                        height=400)

fig2 = px.histogram(data_day, x="Hour", color="Hour", 
                   color_discrete_sequence=px.colors.qualitative.Light24)

app.layout = html.Div([
    html.Div([
        dcc.Graph(
          id='hour-uber',
          figure=fig
        )
    ]),
    html.Div([
        dcc.Graph( 
          id='histogram',
          figure=fig2
        )
    ])
])

if __name__ == '__main__':
    app.run_server(debug=True, use_reloader=False)

Overwriting my_dash_app.py


Run the Dash app:

In [120]:
download_ngrok()

--2020-11-23 20:00:50--  https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
Resolving bin.equinox.io (bin.equinox.io)... 34.233.2.239, 54.236.74.205, 35.174.46.144, ...
Connecting to bin.equinox.io (bin.equinox.io)|34.233.2.239|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13773305 (13M) [application/octet-stream]
Saving to: ‘ngrok-stable-linux-amd64.zip’


2020-11-23 20:00:51 (16.8 MB/s) - ‘ngrok-stable-linux-amd64.zip’ saved [13773305/13773305]

Archive:  ngrok-stable-linux-amd64.zip
  inflating: ngrok                   


In [174]:
retorno = get_tunnel()
print(retorno)
!python my_dash_app.py

Response(url='https://6130ba0981ba.ngrok.io', error=None)
Dash is running on http://127.0.0.1:8050/

 * Serving Flask app "my_dash_app" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: on
Exception ignored in: <module 'threading' from '/usr/lib/python3.6/threading.py'>
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 1279, in _shutdown
    def _shutdown():
KeyboardInterrupt


To build a Dash app with user interaction, use [Dash Callbacks](https://dash.plotly.com/basic-callbacks).

# References

* [Andressa Stéfany](https://github.com/AndressaStefany)
* [Video: Tableros Dash en Colab y tuneles con ngrok](https://www.youtube.com/watch?v=g6M3mAHFcyU)
* [ngrok](https://ngrok.com/)
* [Dash Plotly](https://dash.plotly.com/)
* [Interactive Graphing](https://dash.plotly.com/interactive-graphing)
* [Dash for Beginners - DataCamp](https://www.datacamp.com/community/tutorials/learn-build-dash-python)
