<a href="https://colab.research.google.com/github/ipeirotis/dealing_with_data/blob/master/11-Flask/B-Create_API_call_and_Connecting_to_MySQL.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Creating a Flask application showing Citibike stations


In this segment, we create a basic app that connects to the Citibike database and displays the list of stations.



In [None]:
!pip install -U PyMySQL sqlalchemy flask pyngrok geopandas

## A Refresher

Let's first recall how to get data from the database.

In [None]:
# This code creates a connection to the database
from sqlalchemy import create_engine, text
import pandas as pd
import geopandas as gpd

conn_string = 'mysql+pymysql://{user}:{password}@{host}/{db}?charset={encoding}'.format(
    host = 'db.ipeirotis.org',
    user = 'student',
    db = 'citibike_fall2017',
    password = 'dwdstudent2015',
    encoding = 'utf8mb4')

engine = create_engine(conn_string)

In [None]:
sql = "SELECT DISTINCT id, name, capacity, lat, lon  FROM status_fall2017"

with engine.connect() as connection:
    stations = pd.read_sql(text(sql), con=connection)

stations.to_dict(orient='records')

In [None]:
stations.plot(kind='scatter', x='lon', y='lat')

In [None]:
# Dataset from NYC Open Data: https://data.cityofnewyork.us/City-Government/Neighborhood-Tabulation-Areas/cpf4-rkhq
!curl 'https://data.cityofnewyork.us/api/geospatial/cpf4-rkhq?method=export&format=GeoJSON' -o nyc-neighborhoods.geojson

# Load the GeoJSON file
df_nyc = gpd.GeoDataFrame.from_file('nyc-neighborhoods.geojson')

# Create the map of NYC neighborhoods
nyc_map = df_nyc.plot(linewidth=0.5, color='White', edgecolor='Black', figsize=(7, 5))

# Plot the stations on top of the map
stations.plot(kind='scatter', x='lon', y='lat', s=1, ax = nyc_map)

In [None]:
# The code below is purely optional.
# For comparison, let's get the current data from the Citibike API
import requests

# This gives information for each station that remains stable over time
url_stations = "https://gbfs.citibikenyc.com/gbfs/en/station_information.json"

# We fetch for now just the time-invariant data
results = requests.get(url_stations).json()

# Put the results from the Citibike API in a dataframe
df = pd.DataFrame(results['data']['stations'])

# Remove the noisy lon/lat points with coordinates (0,0)
df = df.query("lon!=0  and lat!=0")

# Create the map of NYC neighborhoods
nyc_map = df_nyc.plot(linewidth=0.5, color='White', edgecolor='Black', figsize=(7, 5))

# Plot the stations on top of the map
df.plot(kind='scatter', x='lon', y='lat', s=1, ax = nyc_map)

## Creating an API endpoint

In [None]:
import os
import threading

from flask import Flask, jsonify, render_template
from pyngrok import ngrok

# text() wraps the raw SQL strings that the routes pass to pd.read_sql
from sqlalchemy import create_engine, text
import pandas as pd
import base64
from io import BytesIO
import matplotlib.pyplot as plt

In [None]:
# Setup Flask and ngrok

os.environ["FLASK_DEBUG"] = "true"

app = Flask(__name__)
port = 5000

# Open a ngrok tunnel to the HTTP server
ngrok_authtoken = '2WgDffgQcSJcesOPKNnZ1jvwxXJ_5sR4FFXtByxhjgkFB62QP'
ngrok.set_auth_token(ngrok_authtoken)
public_url = ngrok.connect(port).public_url
print(f" * ngrok tunnel '{public_url}' -> 'http://127.0.0.1:{port}'")

# Update any base URLs to use the public ngrok URL
app.config["BASE_URL"] = public_url


In [None]:
# Setup a connection to the database
conn_string = 'mysql+pymysql://{user}:{password}@{host}/{db}?charset={encoding}'.format(
    host = 'db.ipeirotis.org',
    user = 'student',
    db = 'citibike_fall2017',
    password = 'dwdstudent2015',
    encoding = 'utf8mb4')
engine = create_engine(conn_string)

A few things to notice in the code that follows.

First, notice the `@app.route('/citibike_api', methods=['GET'])` decorator. It specifies that our API endpoint will be accessible at `http://our_web_server_address/citibike_api`.


Then, we define the function `citibike_stations()`, which creates the response for that API call. Whatever the function returns is what the API call returns.

Notice that inside the `citibike_stations` function, we connect to the database and issue an SQL query.

Finally, we take the results of the query, put them in a Python dictionary, and use the `jsonify` function to convert the dictionary to JSON and return it as the API result.


In [None]:
# Define Flask routes
@app.route("/")
def index():
    return "Hello from Colab!"

In [None]:
@app.route('/citibike_api',  methods=['GET'])
def citibike_stations():

    sql = "SELECT DISTINCT id, name, capacity, lat, lon  FROM status_fall2017"
    # Connect to the database, execute the query, and get back the results
    with engine.connect() as connection:
        stations = pd.read_sql(text(sql), con=connection)

    # Create the response. We will put the retrieved data as a list of
    # dictionaries, under the key "stations".
    list_of_stations = stations.to_dict(orient='records')

    api_results = {"stations": list_of_stations}

    # We JSON-ify our dictionary and return it as the API response
    return jsonify(api_results)
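
One way to check an endpoint like this without opening a browser is Flask's built-in test client. The sketch below is a self-contained stand-in, not the notebook's app: `demo_app`, `demo_stations`, and the two hard-coded station records are made up for illustration and replace the database query.

```python
from flask import Flask, jsonify

# Hypothetical stand-in app; the records below are illustrative, not real data
demo_app = Flask(__name__)

@demo_app.route('/citibike_api', methods=['GET'])
def demo_stations():
    # Same response shape as above: a list of dicts under the key "stations"
    list_of_stations = [
        {"id": 1, "name": "Station A", "capacity": 30, "lat": 40.7, "lon": -74.0},
        {"id": 2, "name": "Station B", "capacity": 25, "lat": 40.8, "lon": -73.9},
    ]
    return jsonify({"stations": list_of_stations})

# The test client issues requests in-process, without starting a server
with demo_app.test_client() as client:
    response = client.get('/citibike_api')
    payload = response.get_json()

print(response.status_code, len(payload["stations"]))
```

The same `test_client()` pattern should work against the notebook's `app` object, provided the database is reachable.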

The next API call, `/station_map`, is a bit different. Instead of returning the raw data, it creates a plot and then returns the image data in the JSON response. The image data is encoded in "base64" format. The `/station_image` call after it builds the same plot but returns an HTML `<img>` tag with the image embedded, so a browser can display it directly. We will examine how we can use such API calls to serve dynamically generated images for our website.

In [None]:
@app.route('/station_map',  methods=['GET'])
def station_map():

    # Connect to the database, execute the query, and get back the results
    sql = "SELECT DISTINCT id, name, capacity, lat, lon  FROM status_fall2017"
    with engine.connect() as connection:
        stations = pd.read_sql(text(sql), con=connection)

    fig, ax = plt.subplots()
    ax = stations.plot(kind='scatter', x='lon', y='lat', ax=ax)

    # Render the figure as PNG into an in-memory buffer
    buf = BytesIO()
    fig.savefig(buf, format="png")
    # Release the figure so repeated requests do not accumulate open figures
    plt.close(fig)

    # Encode the PNG bytes as base64 text so they can travel inside JSON
    data = base64.b64encode(buf.getbuffer()).decode("ascii")

    # Create the response, with the encoded image under the key "image"
    results = {"image": data}

    # We JSON-ify our dictionary and return it as the API response
    return jsonify(results)



In [None]:
@app.route('/station_image',  methods=['GET'])
def station_image():

    # Connect to the database, execute the query, and get back the results
    sql = "SELECT DISTINCT id, name, capacity, lat, lon  FROM status_fall2017"
    with engine.connect() as connection:
        stations = pd.read_sql(text(sql), con=connection)

    fig, ax = plt.subplots()
    ax = stations.plot(kind='scatter', x='lon', y='lat', ax=ax)

    # Render the figure as PNG into an in-memory buffer
    buf = BytesIO()
    fig.savefig(buf, format="png")
    # Release the figure so repeated requests do not accumulate open figures
    plt.close(fig)

    # Encode the PNG bytes as base64 text so they can be embedded in HTML
    data = base64.b64encode(buf.getbuffer()).decode("ascii")

    # Instead of JSON, return an <img> tag with the image embedded as a data URI

    return f"<img src='data:image/png;base64,{data}'/>"
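
To make the base64 step in the two image endpoints concrete, here is a minimal round trip using only the standard library. The `fake_png` bytes are a placeholder assumed for illustration; in the endpoints, the bytes come from `fig.savefig`.

```python
import base64

# Placeholder bytes standing in for a real PNG produced by fig.savefig
fake_png = b"\x89PNG\r\n\x1a\n" + b"example-bytes"

# Server side: bytes -> base64 text that is safe to embed in JSON or HTML
encoded = base64.b64encode(fake_png).decode("ascii")

# Client side, JSON case: decode the text back into the original bytes
decoded = base64.b64decode(encoded)

# Client side, HTML case: wrap the text in a data URI for an <img> tag
img_tag = f"<img src='data:image/png;base64,{encoded}'/>"

print(decoded == fake_png)
```

A client of `/station_map` would apply `base64.b64decode` to the `"image"` field of the JSON response and could then write the bytes to a `.png` file.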



In [None]:
# Once you run your app, check the /citibike_api and the /station_map API calls

print(f" * ngrok tunnel '{public_url}' -> 'http://127.0.0.1:{port}'")
app.run(use_reloader=False, port=port)

## Accepting Parameters

Now let's see how we can query for the status of a Citibike station over time.

For this part, we create a new function that receives the `station_id` as a **parameter**. Our code reads the value of `station_id` and then queries the database for the status of that station.

The related pieces of code for reading a parameter are the following:

> `from flask import request`

> `station_id = request.args.get('station_id')`

In [None]:
from flask import request

@app.route('/station_status')
def station_status():
  """
  API endpoint to get the status of a specific Citibike station.
  """
  param = request.args.get('station_id')
  try:
    param_value = int(param)
  except (TypeError, ValueError):
    # int(None) raises TypeError (parameter missing); int("abc") raises ValueError
    return jsonify({"error": "Missing or non-integer station_id parameter"}), 400

  sql = '''SELECT available_bikes,
                      available_docks,
                      capacity,
                      available_bikes / capacity AS percent_full,
                      communication_time
               FROM status_fall2017
               WHERE id = :station_id'''

  with engine.connect() as con:
    station_status = pd.read_sql(text(sql),
                                 con=con,
                                 params={"station_id": param_value})

  station_status_over_time = station_status.to_dict(orient='records')

  api_results = {
      "station_id": param_value,
      "status_over_time": station_status_over_time
  }

  # We JSON-ify our dictionary and return it as the API response
  return jsonify(api_results)
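
The query above passes `station_id` as a bound parameter (`:station_id`) rather than pasting it into the SQL string. The sketch below illustrates the same idea with the standard library's `sqlite3`, which also accepts `:name` placeholders. The in-memory table and its rows are invented for illustration and are not the Citibike data.

```python
import sqlite3

# In-memory database with a tiny made-up table (not the real Citibike schema)
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE status (id INTEGER, available_bikes INTEGER, capacity INTEGER)")
con.executemany(
    "INSERT INTO status VALUES (?, ?, ?)",
    [(72, 10, 40), (72, 20, 40), (79, 5, 30)],
)

# The :station_id placeholder is filled in by the driver, not by string formatting
sql = """SELECT available_bikes, capacity
         FROM status
         WHERE id = :station_id
         ORDER BY available_bikes"""

rows = con.execute(sql, {"station_id": 72}).fetchall()

# A hostile string is treated as a plain value, so it matches no rows
hostile = con.execute(sql, {"station_id": "72 OR 1=1"}).fetchall()

con.close()
print(rows, hostile)
```

Because the value never becomes part of the SQL text, the `OR 1=1` fragment cannot change the query. The endpoint above adds a second layer by converting the parameter to `int` first.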

In [None]:
import sqlalchemy
sqlalchemy.__version__

In [None]:
# Now, in addition to the /citibike_api and the /station_map API calls
# you also have the /station_status API call.

# You can then visit <your ngrok URL>/station_status?station_id=72
# and see the results

print(f" * ngrok tunnel '{public_url}' -> 'http://127.0.0.1:{port}'")
print(f" * Try: {public_url}/station_status?station_id=72")
app.run(use_reloader=False, port=port)
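
The URL in the cell above carries the parameter in its query string, the part after the `?`. As a rough illustration of what `request.args.get('station_id')` reads, the sketch below parses such URLs with the standard library. The helper `parse_station_id` and the `example.com` host are hypothetical and not part of the app.

```python
from urllib.parse import urlsplit, parse_qs

def parse_station_id(url):
    """Return the station_id in the URL's query string as an int, or None."""
    # parse_qs maps each parameter name to a list of its values
    query = parse_qs(urlsplit(url).query)
    values = query.get("station_id")
    if not values:
        return None  # parameter missing (or given with an empty value)
    try:
        return int(values[0])
    except ValueError:
        return None  # parameter present but not an integer

good = parse_station_id("https://example.com/station_status?station_id=72")
missing = parse_station_id("https://example.com/station_status")
invalid = parse_station_id("https://example.com/station_status?station_id=abc")

print(good, missing, invalid)
```

Flask does the parsing for us, so the endpoint only needs the `int` conversion and the error branch.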

## Exercise

a. Connect to a database of your choice and create an API call that returns the results from the database in JSON format. (For example, return the recipe entries from your database, in JSON format.)

b. Using the results from your query in part (a), create a plot and create an API call that returns the plot encoded in "base64" format.

c. Create an API call that receives a parameter.