## REST Payload to DB Table

Let us perform an exercise to get the REST Payload into a database table.
* REST API URL: https://gbfs.citibikenyc.com/gbfs/en/station_information.json
* Database Name: **{username}_sms_db**
* First Table Name: **stations**
* Create the table with the following fields. Make sure to use appropriate data types.
  * id - sequence-generated primary key.
  * station_id - uniqueness needs to be enforced.
  * station_type
  * name
  * short_name
  * capacity
  * external_id
  * has_kiosk
  * legacy_id
  * region_id
  * electric_bike_surcharge_waiver
  * eightd_station_services - you can convert this field to a delimited string before loading the data.
* Get the data from the REST payload into the table **stations** created.
* Run queries for following scenarios.
  * Get distinct station types.
  * Get number of stations per region_id.
  * Get top 10 stations by capacity.
  * Get number of stations where there are no kiosks.
* Second Table Name: **station_rental_types**
* Create the table with the following fields.
  * station_id
  * rental_type - the source field is a list. The target column in the table should be of type VARCHAR.
  * station_rental_type_id - sequence-generated primary key.
  * The combination of station_id and rental_type must be unique.
* For every station with one or more rental types, insert one row per rental_type.
* Sample input record: `{'station_id': 1, 'rental_types': ['KEY', 'CREDIT CARD']}`
* Sample data in the table

|station_id|rental_type|
|---|---|
|1|KEY|
|1|CREDIT CARD|
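
The mapping from the sample input record to the sample rows can be sketched in plain Python. This is only an illustration of the expected shape, not the approach the notebook takes later; the helper name `explode_rental_types` is hypothetical, and the record layout is assumed to match the sample above.

```python
def explode_rental_types(record):
    """Return one (station_id, rental_type) tuple per entry in rental_types."""
    # A missing or empty list yields no rows at all.
    return [(record['station_id'], rental_type)
            for rental_type in record.get('rental_types') or []]


sample = {'station_id': 1, 'rental_types': ['KEY', 'CREDIT CARD']}
rows = explode_rental_types(sample)
print(rows)  # [(1, 'KEY'), (1, 'CREDIT CARD')]
```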

* Run queries for following scenarios.
  * Get number of records from **station_rental_types**
  * Get number of stations where rental_type is **KEY**
  * Get number of stations where rental_type is **CREDIT CARD**
  * Get number of stations by rental_type.
  * Get the stations where there is no rental type.

In [1]:
%load_ext sql

In [2]:
%env DATABASE_URL=postgresql://deepan:DB_PASSWORD@localhost:5432/sms_db

env: DATABASE_URL=postgresql://deepan:DB_PASSWORD@localhost:5432/sms_db


In [3]:
%sql DROP TABLE IF EXISTS stations

Done.


[]

In [4]:
%%sql

CREATE TABLE stations(
    id  SERIAL PRIMARY KEY,
    station_id INT NOT NULL UNIQUE,
    station_type VARCHAR(100),
    name VARCHAR(80),
    short_name VARCHAR(50),
    capacity INT,
    external_id VARCHAR(90),
    has_kiosk BOOLEAN,
    legacy_id VARCHAR(50),
    region_id VARCHAR(50),
    electric_bike_surcharge_waiver BOOLEAN,
    eightd_station_services VARCHAR ARRAY
)

 * postgresql://deepan:***@localhost:5432/sms_db
Done.


[]
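
The table above stores eightd_station_services as a VARCHAR ARRAY, but the exercise also allows a delimited string. A minimal sketch of that conversion follows, assuming the target column were a plain VARCHAR or TEXT instead. The helper name `services_to_delimited`, the `|` delimiter, and the example entries are illustrative choices, not part of the original notebook.

```python
import json


def services_to_delimited(services, delimiter='|'):
    """Join a list of service entries into one delimited string.

    Each entry is serialized with json.dumps so dict entries stay parseable.
    Returns None for a missing or empty list so the column is stored as NULL.
    """
    if not services:
        return None
    return delimiter.join(json.dumps(s, sort_keys=True) for s in services)


# Hypothetical entries; the keys in the real payload may differ.
example = services_to_delimited([{'id': 'a'}, {'id': 'b'}])
print(example)  # {"id": "a"}|{"id": "b"}
```

The delimiter must not occur inside the serialized entries. If it can, an array or JSON column is the safer choice.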

In [5]:
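import json

import psycopg2
import requests

conn = psycopg2.connect(user="deepan",
                        password="DB_PASSWORD",
                        host="localhost",
                        port="5432",
                        database="sms_db")

response = requests.get('https://gbfs.citibikenyc.com/gbfs/en/station_information.json')
response.raise_for_status()  # stop early on an HTTP error

# The station records live under data -> stations in the payload
res = response.json()['data']['stations']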

In [6]:
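# Build one tuple per station, in the same column order as the INSERT below.
# psycopg2 adapts a Python list to a PostgreSQL ARRAY, so each service entry
# is serialized to its own JSON string element; an empty list stays empty.
data = [
    (int(x['station_id']), x['station_type'], x['name'], x['short_name'],
     x['capacity'], x['external_id'], x['has_kiosk'], x['legacy_id'],
     x['region_id'], x['electric_bike_surcharge_waiver'],
     [json.dumps(s) for s in (x['eightd_station_services'] or [])])
    for x in res
]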

query = """
INSERT INTO stations(station_id, station_type, name, short_name, capacity, external_id,
    has_kiosk, legacy_id, region_id, electric_bike_surcharge_waiver,
    eightd_station_services) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
"""


In [7]:
cur = conn.cursor()
cur.executemany(query, data)
conn.commit()
cur.close()
conn.close()
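
If any row violates a constraint (for example the UNIQUE constraint on station_id), `executemany` raises and the connection stays in a failed transaction until it is rolled back. A minimal sketch of a load helper that commits on success and rolls back on failure is shown below. It assumes only a DB-API 2.0 connection, which psycopg2 provides. The helper name `insert_rows` is hypothetical, and the demonstration uses the standard-library `sqlite3` module with `?` placeholders purely so that it runs without a database server (psycopg2 uses `%s`).

```python
import sqlite3


def insert_rows(conn, query, rows):
    """Insert all rows in one transaction; roll back if any row fails."""
    cur = conn.cursor()
    try:
        cur.executemany(query, rows)
        conn.commit()
    except Exception:
        conn.rollback()  # discard the partial batch
        raise
    finally:
        cur.close()


# Demonstration against an in-memory SQLite database (a stand-in for PostgreSQL).
demo = sqlite3.connect(':memory:')
demo.execute('CREATE TABLE stations ('
             'id INTEGER PRIMARY KEY, station_id INTEGER NOT NULL UNIQUE)')
insert_sql = 'INSERT INTO stations (station_id) VALUES (?)'

insert_rows(demo, insert_sql, [(72,), (79,)])

try:
    insert_rows(demo, insert_sql, [(82,), (72,)])  # 72 is a duplicate
except sqlite3.IntegrityError:
    print('batch rejected and rolled back')

count = demo.execute('SELECT count(*) FROM stations').fetchone()[0]
print(count)  # 2
```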

In [8]:
%%sql 

SELECT count(*) FROM stations

 * postgresql://deepan:***@localhost:5432/sms_db
1 rows affected.


count
1705


### Distinct station types

In [9]:
%%sql

SELECT DISTINCT station_type FROM stations

 * postgresql://deepan:***@localhost:5432/sms_db
1 rows affected.


station_type
classic


In [10]:
%%sql 

SELECT * FROM stations LIMIT 2;

 * postgresql://deepan:***@localhost:5432/sms_db
2 rows affected.


id,station_id,station_type,name,short_name,capacity,external_id,has_kiosk,legacy_id,region_id,electric_bike_surcharge_waiver,eightd_station_services
1,72,classic,W 52 St & 11 Ave,6926.01,32,66db237e-0aca-11e7-82f6-3863bb44ef7c,True,72,71,False,[]
2,79,classic,Franklin St & W Broadway,5430.08,33,66db269c-0aca-11e7-82f6-3863bb44ef7c,True,79,71,False,[]


### Number of stations per region_id

In [11]:
%%sql 

SELECT region_id, count(station_id) FROM stations GROUP BY region_id

 * postgresql://deepan:***@localhost:5432/sms_db
3 rows affected.


region_id,count
311,28
71,1624
70,53


### Top 10 stations by capacity

In [12]:
%%sql 

SELECT * FROM stations ORDER BY capacity DESC NULLS LAST LIMIT 10;

 * postgresql://deepan:***@localhost:5432/sms_db
10 rows affected.


id,station_id,station_type,name,short_name,capacity,external_id,has_kiosk,legacy_id,region_id,electric_bike_surcharge_waiver,eightd_station_services
174,445,classic,E 10 St & Avenue A,5659.05,107,66dc1beb-0aca-11e7-82f6-3863bb44ef7c,True,445,71,False,[]
159,422,classic,W 59 St & 10 Ave,7023.04,100,66dc0dab-0aca-11e7-82f6-3863bb44ef7c,True,422,71,False,[]
58,281,classic,Grand Army Plaza & Central Park S,6839.1,96,66db5fae-0aca-11e7-82f6-3863bb44ef7c,True,281,71,False,[]
63,293,classic,Lafayette St & E 8 St,5788.13,91,66db65aa-0aca-11e7-82f6-3863bb44ef7c,True,293,71,False,[]
1567,4675,classic,E 40 St & Park Ave,6432.11,87,c638ec67-9ac0-416f-944f-619926144931,True,4675,71,False,[]
181,455,classic,1 Ave & E 44 St,6379.03,87,66dc2172-0aca-11e7-82f6-3863bb44ef7c,True,455,71,False,[]
38,248,classic,Laight St & Hudson St,5539.06,87,66db402c-0aca-11e7-82f6-3863bb44ef7c,True,248,71,False,[]
192,469,classic,Broadway & W 53 St,6779.05,85,66dc2c78-0aca-11e7-82f6-3863bb44ef7c,True,469,71,False,[]
193,470,classic,W 20 St & 8 Ave,6224.05,84,66dc36c3-0aca-11e7-82f6-3863bb44ef7c,True,470,71,False,[]
217,501,classic,FDR Drive & E 35 St,6230.04,83,66dc7659-0aca-11e7-82f6-3863bb44ef7c,True,501,71,False,[]


### Number of stations with no kiosks

In [13]:
%%sql 

SELECT count(station_id) FROM stations
WHERE has_kiosk = FALSE

 * postgresql://deepan:***@localhost:5432/sms_db
1 rows affected.


count
35
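
The four query results above can be cross-checked against the raw payload in plain Python. The sketch below assumes each station is a dict with the keys used in the INSERT statement. The helper name `summarize_stations` is hypothetical and the `sample` list is made-up data for illustration; with the live payload, pass the list of station dicts fetched from the API instead.

```python
from collections import Counter


def summarize_stations(stations, top_n=10):
    """Compute in Python the same summaries the SQL queries return."""
    by_capacity = sorted(stations, key=lambda s: s['capacity'], reverse=True)
    return {
        'station_types': sorted({s['station_type'] for s in stations}),
        'per_region': dict(Counter(s['region_id'] for s in stations)),
        'top_by_capacity': [s['station_id'] for s in by_capacity[:top_n]],
        # Only an explicit False counts, mirroring has_kiosk = FALSE in SQL
        'no_kiosk': sum(1 for s in stations if s['has_kiosk'] is False),
    }


# Made-up records for illustration only.
sample = [
    {'station_id': '1', 'station_type': 'classic', 'region_id': '71',
     'capacity': 30, 'has_kiosk': True},
    {'station_id': '2', 'station_type': 'classic', 'region_id': '71',
     'capacity': 50, 'has_kiosk': False},
    {'station_id': '3', 'station_type': 'classic', 'region_id': '70',
     'capacity': 20, 'has_kiosk': True},
]
summary = summarize_stations(sample, top_n=2)
print(summary)
```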


In [14]:
import requests
import pandas as pd

payload = requests.get('https://gbfs.citibikenyc.com/gbfs/en/station_information.json').json()
payload_df = pd.DataFrame(payload)


In [15]:
payload_df.head()

Unnamed: 0,data,last_updated,ttl
stations,"[{'eightd_has_key_dispenser': False, 'external...",1660916036,5


In [16]:
%sql DROP TABLE IF EXISTS station_rental_types

 * postgresql://deepan:***@localhost:5432/sms_db
Done.


[]

In [17]:
%%sql

CREATE TABLE station_rental_types(
        station_rental_type_id SERIAL PRIMARY KEY,
        station_id VARCHAR(60),
        rental_type VARCHAR(60),
        station_id_rental_type VARCHAR(60) NOT NULL UNIQUE,
        UNIQUE (station_id, rental_type)
)

 * postgresql://deepan:***@localhost:5432/sms_db
Done.


[]

In [18]:
# The payload calls the field rental_methods; rename it to rental_types here
station_rental_types = [{'station_id': a['station_id'],
                         'rental_types': a['rental_methods']}
                        for a in payload['data']['stations']]

In [19]:
import pandas as pd

b = pd.DataFrame.from_dict(station_rental_types)
# explode() emits one row per element of each rental_types list
new = b.explode('rental_types')
new

Unnamed: 0,station_id,rental_types
0,72,KEY
0,72,CREDITCARD
1,79,KEY
1,79,CREDITCARD
2,82,KEY
...,...,...
1702,4873,CREDITCARD
1703,4876,KEY
1703,4876,CREDITCARD
1704,4885,KEY


In [20]:
b.head()

Unnamed: 0,station_id,rental_types
0,72,"[KEY, CREDITCARD]"
1,79,"[KEY, CREDITCARD]"
2,82,"[KEY, CREDITCARD]"
3,83,"[KEY, CREDITCARD]"
4,116,"[KEY, CREDITCARD]"


In [21]:
new['station_id_rental_type'] = new['station_id'] + " " + new['rental_types']

In [22]:
new.head()

Unnamed: 0,station_id,rental_types,station_id_rental_type
0,72,KEY,72 KEY
0,72,CREDITCARD,72 CREDITCARD
1,79,KEY,79 KEY
1,79,CREDITCARD,79 CREDITCARD
2,82,KEY,82 KEY


In [23]:
# Despite its name, new_df is a list of tuples ready for executemany
new_df = [tuple(i) for i in new.values]

In [24]:
new_df[:5]

[('72', 'KEY', '72 KEY'),
 ('72', 'CREDITCARD', '72 CREDITCARD'),
 ('79', 'KEY', '79 KEY'),
 ('79', 'CREDITCARD', '79 CREDITCARD'),
 ('82', 'KEY', '82 KEY')]
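
Because station_id_rental_type carries a UNIQUE constraint, a duplicated pair in the payload would abort the insert. A quick pre-check on the tuples can surface such duplicates first. This is a sketch only: `find_duplicate_keys` is a hypothetical helper, and it assumes the tuple layout shown above, with the composite key in the last position.

```python
from collections import Counter


def find_duplicate_keys(rows, key_index=-1):
    """Return the key values that occur more than once in rows."""
    counts = Counter(row[key_index] for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


rows = [('72', 'KEY', '72 KEY'),
        ('72', 'CREDITCARD', '72 CREDITCARD'),
        ('72', 'KEY', '72 KEY')]  # deliberate duplicate for illustration
duplicates = find_duplicate_keys(rows)
print(duplicates)  # ['72 KEY']
```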

In [25]:
import psycopg2

# Reconnect: the connection opened earlier was closed after loading stations.
conn = psycopg2.connect(user="deepan",
                        password="DB_PASSWORD",
                        host="localhost",
                        port="5432",
                        database="sms_db")

cursor = conn.cursor()

query = """
INSERT INTO station_rental_types
    (station_id, rental_type, station_id_rental_type)
VALUES (%s, %s, %s)
"""

cursor.executemany(query, new_df)
conn.commit()

cursor.close()
conn.close()


In [26]:
%sql SELECT * FROM station_rental_types limit 10

 * postgresql://deepan:***@localhost:5432/sms_db
10 rows affected.


station_rental_type_id,station_id,rental_type,station_id_rental_type
1,72,KEY,72 KEY
2,72,CREDITCARD,72 CREDITCARD
3,79,KEY,79 KEY
4,79,CREDITCARD,79 CREDITCARD
5,82,KEY,82 KEY
6,82,CREDITCARD,82 CREDITCARD
7,83,KEY,83 KEY
8,83,CREDITCARD,83 CREDITCARD
9,116,KEY,116 KEY
10,116,CREDITCARD,116 CREDITCARD


### Number of records in station_rental_types

In [27]:
%%sql

SELECT count(station_rental_type_id) FROM station_rental_types

 * postgresql://deepan:***@localhost:5432/sms_db
1 rows affected.


count
3410


### Number of stations where rental_type is KEY

In [28]:
%%sql

SELECT count(station_rental_type_id) FROM station_rental_types
WHERE rental_type = 'KEY'

 * postgresql://deepan:***@localhost:5432/sms_db
1 rows affected.


count
1705


### Number of stations where rental_type is CREDITCARD

In [29]:
%%sql

SELECT count(station_rental_type_id) FROM station_rental_types
WHERE rental_type = 'CREDITCARD'

 * postgresql://deepan:***@localhost:5432/sms_db
1 rows affected.


count
1705


### Number of stations by rental_type

In [30]:
%%sql

SELECT rental_type,count(station_rental_type_id) FROM station_rental_types
GROUP BY rental_type

 * postgresql://deepan:***@localhost:5432/sms_db
2 rows affected.


rental_type,count
CREDITCARD,1705
KEY,1705


### Stations with no rental type

In [32]:
%%sql

SELECT s.station_id, s.name FROM stations s
WHERE NOT EXISTS (
    SELECT 1 FROM station_rental_types srt
    WHERE srt.station_id = CAST(s.station_id AS VARCHAR)
)

 * postgresql://deepan:***@localhost:5432/sms_db
0 rows affected.


station_id,name
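
A station whose rental_methods list is empty or missing carries no rental type in the payload, so such stations can also be identified directly from the payload. A minimal sketch, assuming each station dict has `station_id` and a `rental_methods` list as in the cells above; the helper name and the sample records are hypothetical.

```python
def stations_without_rental_types(stations):
    """Return the station_ids whose rental_methods list is missing or empty."""
    return [s['station_id'] for s in stations if not s.get('rental_methods')]


# Made-up records for illustration only.
sample = [
    {'station_id': '72', 'rental_methods': ['KEY', 'CREDITCARD']},
    {'station_id': '9999', 'rental_methods': []},
    {'station_id': '8888'},
]
missing = stations_without_rental_types(sample)
print(missing)  # ['9999', '8888']
```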
