## REST Payload to DB Table

Let us perform an exercise to get the REST Payload into a database table.
* REST API URL: https://gbfs.citibikenyc.com/gbfs/en/station_information.json
* Database Name: **{username}_sms_db**
* First Table Name: **stations**
* Create table for following fields. Make sure to use appropriate data types
  * id - Sequence generated primary key
  * station_id - Uniqueness needs to be enforced.
  * station_type
  * name
  * short_name
  * capacity
  * external_id
  * has_kiosk
  * legacy_id
  * region_id
  * electric_bike_surcharge_waiver
  * eightd_station_services
* Get the data from the REST payload into the table **stations** created.
* Run queries for following scenarios.
  * Get distinct station types.
  * Get number of stations per region_id.
  * Get top 10 stations by capacity.
  * Get number of stations where there are no kiosks.
* Second Table Name: **station_rental_types**
* Create table with following fields
  * station_id
  * rental_type - the source field is of type list. The target column in the table should be of type VARCHAR.
  * station_rental_type_id - sequence generated primary key.
  * Combination of station_id and rental_type is supposed to be unique.
* For all station ids where there is one or more rental_types, the data should be inserted into the table separately with rental_type.
* Sample input record `{'station_id': 1, 'rental_types': ['KEY', 'CREDIT CARD]}`
* Sample data in the table

|station_id|rental_type|
|---|---|
|1|KEY|
|1|CREDIT CARD|

* Run queries for following scenarios.
  * Get number of records from **station_rental_types**
  * Get number of stations where rental_type is **KEY**
  * Get number of stations where rental_type is **CREDIT CARD**
  * Get number stations by rental_type.
  * Get the stations where there is no rental type.

## Create Table stations in sms_db

In [59]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [60]:
%env DATABASE_URL=postgresql://itv002461_sms_user:7ji8g7gg8p8olbqbna5vz1tjyikaixco@pg.itversity.com:5433/itv002461_sms_db

env: DATABASE_URL=postgresql://itv002461_sms_user:7ji8g7gg8p8olbqbna5vz1tjyikaixco@pg.itversity.com:5433/itv002461_sms_db


In [5]:
%%sql
CREATE TABLE stations(
    id SERIAL PRIMARY KEY,
    station_id VARCHAR(10) NOT NULL,
    station_type VARCHAR(20) NOT NULL,
    name VARCHAR(20) NOT NULL,
    short_name VARCHAR(10) NOT NULL,
    capacity INTEGER NOT NULL,
    external_id VARCHAR(100) NOT NULL,
    has_kiosk BOOLEAN,
    legacy_id VARCHAR(10),
    region_id VARCHAR(10),
    electric_bike_surcharge_waiver BOOLEAN,
    eightd_station_services VARCHAR(20) ,
    CONSTRAINT ustaion_id UNIQUE(station_id)
)

 * postgresql://itv002461_sms_user:***@pg.itversity.com:5433/itv002461_sms_db
Done.


[]

In [None]:
# %%sql
# CREATE TABLE stations(
#     id SERIAL PRIMARY KEY,
#     station_id VARCHAR(10) NOT NULL,
#     station_type VARCHAR(20) NOT NULL,
#     name VARCHAR(20) NOT NULL,
#     short_name VARCHAR(10) NOT NULL,
#     capacity INTEGER NOT NULL,
#     external_id VARCHAR(100) NOT NULL,
#     has_kiosk BOOLEAN,
#     legacy_id VARCHAR(10),
#     region_id VARCHAR(10),
#     electric_bike_surcharge_waiver BOOLEAN,
#     eightd_station_services VARCHAR ARRAY ,
#     CONSTRAINT ustaion_id UNIQUE(station_id)
# )

In [8]:
%%sql
select * from information_schema.columns where table_name='stations'

 * postgresql://itv002461_sms_user:***@pg.itversity.com:5433/itv002461_sms_db
12 rows affected.


table_catalog,table_schema,table_name,column_name,ordinal_position,column_default,is_nullable,data_type,character_maximum_length,character_octet_length,numeric_precision,numeric_precision_radix,numeric_scale,datetime_precision,interval_type,interval_precision,character_set_catalog,character_set_schema,character_set_name,collation_catalog,collation_schema,collation_name,domain_catalog,domain_schema,domain_name,udt_catalog,udt_schema,udt_name,scope_catalog,scope_schema,scope_name,maximum_cardinality,dtd_identifier,is_self_referencing,is_identity,identity_generation,identity_start,identity_increment,identity_maximum,identity_minimum,identity_cycle,is_generated,generation_expression,is_updatable
itv002461_sms_db,public,stations,capacity,6,,NO,integer,,,32.0,2.0,0.0,,,,,,,,,,,,,itv002461_sms_db,pg_catalog,int4,,,,,6,NO,NO,,,,,,NO,NEVER,,YES
itv002461_sms_db,public,stations,has_kiosk,8,,YES,boolean,,,,,,,,,,,,,,,,,,itv002461_sms_db,pg_catalog,bool,,,,,8,NO,NO,,,,,,NO,NEVER,,YES
itv002461_sms_db,public,stations,electric_bike_surcharge_waiver,11,,YES,boolean,,,,,,,,,,,,,,,,,,itv002461_sms_db,pg_catalog,bool,,,,,11,NO,NO,,,,,,NO,NEVER,,YES
itv002461_sms_db,public,stations,id,1,nextval('stations_id_seq'::regclass),NO,integer,,,32.0,2.0,0.0,,,,,,,,,,,,,itv002461_sms_db,pg_catalog,int4,,,,,1,NO,NO,,,,,,NO,NEVER,,YES
itv002461_sms_db,public,stations,short_name,5,,NO,character varying,10.0,40.0,,,,,,,,,,,,,,,,itv002461_sms_db,pg_catalog,varchar,,,,,5,NO,NO,,,,,,NO,NEVER,,YES
itv002461_sms_db,public,stations,external_id,7,,NO,character varying,100.0,400.0,,,,,,,,,,,,,,,,itv002461_sms_db,pg_catalog,varchar,,,,,7,NO,NO,,,,,,NO,NEVER,,YES
itv002461_sms_db,public,stations,legacy_id,9,,YES,character varying,10.0,40.0,,,,,,,,,,,,,,,,itv002461_sms_db,pg_catalog,varchar,,,,,9,NO,NO,,,,,,NO,NEVER,,YES
itv002461_sms_db,public,stations,region_id,10,,YES,character varying,10.0,40.0,,,,,,,,,,,,,,,,itv002461_sms_db,pg_catalog,varchar,,,,,10,NO,NO,,,,,,NO,NEVER,,YES
itv002461_sms_db,public,stations,eightd_station_services,12,,YES,character varying,20.0,80.0,,,,,,,,,,,,,,,,itv002461_sms_db,pg_catalog,varchar,,,,,12,NO,NO,,,,,,NO,NEVER,,YES
itv002461_sms_db,public,stations,station_id,2,,NO,character varying,10.0,40.0,,,,,,,,,,,,,,,,itv002461_sms_db,pg_catalog,varchar,,,,,2,NO,NO,,,,,,NO,NEVER,,YES


### Read data from JSON

In [3]:
import pandas as pd
import requests

url=' https://gbfs.citibikenyc.com/gbfs/en/station_information.json'
response=requests.get(url).json()
data=response['data']['stations']
df=pd.DataFrame(data)
#df=df.drop(['rental_methods','rental_uris','lat','lon','eightd_has_key_dispenser','eightd_station_services'],axis=1)
df


Unnamed: 0,station_type,name,has_kiosk,capacity,rental_methods,external_id,lat,eightd_has_key_dispenser,electric_bike_surcharge_waiver,short_name,station_id,legacy_id,rental_uris,lon,eightd_station_services,region_id
0,classic,W 52 St & 11 Ave,True,55,"[CREDITCARD, KEY]",66db237e-0aca-11e7-82f6-3863bb44ef7c,40.767272,False,False,6926.01,72,72,{'android': 'https://bkn.lft.to/lastmile_qr_sc...,-73.993929,[],71
1,classic,Franklin St & W Broadway,True,33,"[CREDITCARD, KEY]",66db269c-0aca-11e7-82f6-3863bb44ef7c,40.719116,False,False,5430.08,79,79,{'android': 'https://bkn.lft.to/lastmile_qr_sc...,-74.006667,[],71
2,classic,St James Pl & Pearl St,True,27,"[CREDITCARD, KEY]",66db277a-0aca-11e7-82f6-3863bb44ef7c,40.711174,False,False,5167.06,82,82,{'android': 'https://bkn.lft.to/lastmile_qr_sc...,-74.000165,[],71
3,classic,Atlantic Ave & Fort Greene Pl,True,62,"[CREDITCARD, KEY]",66db281e-0aca-11e7-82f6-3863bb44ef7c,40.683826,False,False,4354.07,83,83,{'android': 'https://bkn.lft.to/lastmile_qr_sc...,-73.976323,[],71
4,classic,W 17 St & 8 Ave,True,50,"[CREDITCARD, KEY]",66db28b5-0aca-11e7-82f6-3863bb44ef7c,40.741776,False,False,6148.02,116,116,{'android': 'https://bkn.lft.to/lastmile_qr_sc...,-74.001497,[],71
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1618,classic,E 156 St & Courtlandt Ave,True,23,"[CREDITCARD, KEY]",4dcff9e7-7cb6-4d3d-ad67-c2e4c05933d3,40.821239,False,False,7940.04,4745,4745,{'android': 'https://bkn.lft.to/lastmile_qr_sc...,-73.917330,[],71
1619,classic,W 111 St & 5 Ave,True,24,"[CREDITCARD, KEY]",0f51cb84-e386-44ac-947f-a969f7617d17,40.797521,False,False,7587.16,4746,4746,{'android': 'https://bkn.lft.to/lastmile_qr_sc...,-73.948940,[],71
1620,classic,South St & Broad St,True,31,"[CREDITCARD, KEY]",c00ef46d-fcde-48e2-afbd-0fb595fe3fa7,40.701889,False,False,4920.13,4748,4748,{'android': 'https://bkn.lft.to/lastmile_qr_sc...,-74.010899,[],71
1621,classic,30 Ave & 12 St,True,33,"[CREDITCARD, KEY]",ce669a45-ad63-43bd-8aa1-4fc07237edaf,40.771749,False,False,7034.08,4753,4753,{'android': 'https://bkn.lft.to/lastmile_qr_sc...,-73.931613,[],71


### Connection

In [123]:
import psycopg2

def get_connection(host, port, database, user, password):
    connection = None
    try:
        connection = psycopg2.connect(
            host=host,
            port=port,
            database=database,
            user=user,
            password=password
        )
    except Exception as e:
        raise(e)
    
    return connection

In [124]:
host = 'pg.itversity.com'
port = '5433'
database = 'itv002461_sms_db'
user = 'itv002461_sms_user'
password = '7ji8g7gg8p8olbqbna5vz1tjyikaixco'

sms_connection = get_connection(
    host=host,
    port=port,
    database=database,
    user=user,
    password=password
)

In [None]:
# df.to_sql('stations',sms_connection,if_exists='append',index=False)

### Insert Data into DataBase

In [None]:
cursor = sms_connection.cursor()

query = ("""
    INSERT INTO stations 
        (id,station_id,station_type,name ,short_name ,capacity ,external_id ,has_kiosk ,legacy_id ,region_id ,
        electric_bike_surcharge_waiver)
    VALUES 
        (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
""")
i=0
for index,data in df.iterrows():
    i+=1
    insert_data=[str(i),data.station_id,data.station_type,data.name,data.short_name,data.capacity,data.external_id,data.has_kiosk,
                data.legacy_id,data.region_id,data.electric_bike_surcharge_waiver]
    cursor.execute(query,insert_data)

sms_connection.commit()
print("Inserted")


In [None]:
# cursor = sms_connection.cursor()
# cursor.execute('select * from stations limit 10')
# for data in cursor:
#     print(data)

In [49]:
%sql select * from stations limit 5

 * postgresql://itv002461_sms_user:***@pg.itversity.com:5433/itv002461_sms_db
5 rows affected.


id,station_id,station_type,name,short_name,capacity,external_id,has_kiosk,legacy_id,region_id,electric_bike_surcharge_waiver,eightd_station_services
1,72,classic,0,6926.01,55,66db237e-0aca-11e7-82f6-3863bb44ef7c,True,72,71,False,
2,79,classic,1,5430.08,33,66db269c-0aca-11e7-82f6-3863bb44ef7c,True,79,71,False,
3,82,classic,2,5167.06,27,66db277a-0aca-11e7-82f6-3863bb44ef7c,True,82,71,False,
4,83,classic,3,4354.07,62,66db281e-0aca-11e7-82f6-3863bb44ef7c,True,83,71,False,
5,116,classic,4,6148.02,50,66db28b5-0aca-11e7-82f6-3863bb44ef7c,True,116,71,False,


#### Distinct station_type

In [44]:
%sql select distinct station_type from stations

 * postgresql://itv002461_sms_user:***@pg.itversity.com:5433/itv002461_sms_db
1 rows affected.


station_type
classic


#### Get number of stations per region_id

In [50]:
%%sql

select region_id, count(1) 
from stations 
group by region_id

 * postgresql://itv002461_sms_user:***@pg.itversity.com:5433/itv002461_sms_db
3 rows affected.


region_id,count
311,28
71,1542
70,53


#### Get top 10 stations by capacity.

In [54]:
%%sql

select station_id, capacity
from stations
group by station_id,capacity
order by capacity desc
limit 10

 * postgresql://itv002461_sms_user:***@pg.itversity.com:5433/itv002461_sms_db
10 rows affected.


station_id,capacity
445,107
293,91
248,91
4675,87
470,84
3687,83
501,83
386,82
426,81
4406,80


#### Get number of stations where there are no kiosks

In [56]:
%sql select count(1) from stations where has_kiosk=False

 * postgresql://itv002461_sms_user:***@pg.itversity.com:5433/itv002461_sms_db
1 rows affected.


count
3


In [61]:
%%sql
create table station_rental_types(
    station_id varchar(10) not null,
    rental_type VARCHAR(50) CHECK (rental_type = 'KEY' or rental_type = 'CREDITCARD'),
    station_rental_type_id serial primary key,
    unique(station_id,rental_type) 
)

 * postgresql://itv002461_sms_user:***@pg.itversity.com:5433/itv002461_sms_db
Done.


[]

In [64]:
%%sql

select * from information_schema.columns where table_name='station_rental_types'

 * postgresql://itv002461_sms_user:***@pg.itversity.com:5433/itv002461_sms_db
3 rows affected.


table_catalog,table_schema,table_name,column_name,ordinal_position,column_default,is_nullable,data_type,character_maximum_length,character_octet_length,numeric_precision,numeric_precision_radix,numeric_scale,datetime_precision,interval_type,interval_precision,character_set_catalog,character_set_schema,character_set_name,collation_catalog,collation_schema,collation_name,domain_catalog,domain_schema,domain_name,udt_catalog,udt_schema,udt_name,scope_catalog,scope_schema,scope_name,maximum_cardinality,dtd_identifier,is_self_referencing,is_identity,identity_generation,identity_start,identity_increment,identity_maximum,identity_minimum,identity_cycle,is_generated,generation_expression,is_updatable
itv002461_sms_db,public,station_rental_types,station_rental_type_id,3,nextval('station_rental_types_station_rental_type_id_seq'::regclass),NO,integer,,,32.0,2.0,0.0,,,,,,,,,,,,,itv002461_sms_db,pg_catalog,int4,,,,,3,NO,NO,,,,,,NO,NEVER,,YES
itv002461_sms_db,public,station_rental_types,station_id,1,,NO,character varying,10.0,40.0,,,,,,,,,,,,,,,,itv002461_sms_db,pg_catalog,varchar,,,,,1,NO,NO,,,,,,NO,NEVER,,YES
itv002461_sms_db,public,station_rental_types,rental_type,2,,YES,character varying,50.0,200.0,,,,,,,,,,,,,,,,itv002461_sms_db,pg_catalog,varchar,,,,,2,NO,NO,,,,,,NO,NEVER,,YES


In [65]:
host = 'pg.itversity.com'
port = '5433'
database = 'itv002461_sms_db'
user = 'itv002461_sms_user'
password = '7ji8g7gg8p8olbqbna5vz1tjyikaixco'

sms_connection = get_connection(
    host=host,
    port=port,
    database=database,
    user=user,
    password=password
)

In [None]:
cursor = sms_connection.cursor()

query = ("""
    INSERT INTO station_rental_types 
        (station_id,rental_type)
    VALUES 
        (%s, %s)
""")


In [105]:
count=0
rental_method=[]
for i in range(0,(df.shape)[0]):
    if(len(df.rental_methods)==2):
        rental_method.extend([(df.station_id,df.rental_methods[0]),(df.station_id,df.rental_methods[1])])
    else:
        rental_method.append((df.station_id,df.rental_methods[0]))
df2=pd.DataFrame(rental_method)
df2

Unnamed: 0,0,1
0,0 72 1 79 2 82 3 ...,"[CREDITCARD, KEY]"
1,0 72 1 79 2 82 3 ...,"[CREDITCARD, KEY]"
2,0 72 1 79 2 82 3 ...,"[CREDITCARD, KEY]"
3,0 72 1 79 2 82 3 ...,"[CREDITCARD, KEY]"
4,0 72 1 79 2 82 3 ...,"[CREDITCARD, KEY]"
...,...,...
1618,0 72 1 79 2 82 3 ...,"[CREDITCARD, KEY]"
1619,0 72 1 79 2 82 3 ...,"[CREDITCARD, KEY]"
1620,0 72 1 79 2 82 3 ...,"[CREDITCARD, KEY]"
1621,0 72 1 79 2 82 3 ...,"[CREDITCARD, KEY]"


In [5]:
rental_method=map(lambda rental_type: [(df.station_id,rental_type[0]),(df.station_id,rental_type[1])] , list(df.rental_methods))
filter_data=list(rental_method)
for data in filter_data:
    if len(data)==2:
        print(data)
        break

[(0         72
1         79
2         82
3         83
4        116
        ... 
1618    4745
1619    4746
1620    4748
1621    4753
1622    4754
Name: station_id, Length: 1623, dtype: object, 'CREDITCARD'), (0         72
1         79
2         82
3         83
4        116
        ... 
1618    4745
1619    4746
1620    4748
1621    4753
1622    4754
Name: station_id, Length: 1623, dtype: object, 'KEY')]
