# Redshift Challenge

We have a table called dummytable in the ds schema of the datascience database on our qa-app cluster. The challenge is to create a notebook in your sandbox that does the following:
* Read the records from this table and display them in a pandas DataFrame.
* Create a new record in the table, following the format of the other records, then display only the new record in a DataFrame.
* Update an existing record, then display the updated record in a DataFrame.
* Delete the new record you created, then display all records in a DataFrame to confirm it was deleted.
* Add markdown documentation and inline code comments that describe what your code is doing.
* Once finished, push the notebook to our data-sci git repo.

In [6]:
import boto3 
import json
import pandas as pd
import psycopg2

In [7]:
! aws sso login --profile Stellaralgo-DataScienceAdmin

Attempting to automatically open the SSO authorization page in your default browser.
If the browser does not open or you wish to use a different device to authorize this request, open the following URL:

https://device.sso.us-east-1.amazonaws.com/

Then enter the code:

SLBF-TSPL
Successfully logged into Start URL: https://stellaralgo.awsapps.com/start#/


### Assign global variables

In [8]:
CLUSTER = 'qa-app'
DBNAME = 'datascience'
SCHEMA = 'ds'
TABLE = 'dummytable'

### Connect to Redshift

In [9]:
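# setup_default_session() configures boto3's default session in place and returns None,
# so its result is not assigned.
boto3.setup_default_session(profile_name='Stellaralgo-DataScienceAdmin')
rs_client = boto3.client('redshift', region_name='us-east-1')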

endpoint = 'qa-app.ctjussvyafp4.us-east-1.redshift.amazonaws.com'

# Request temporary database credentials for the cluster through IAM.
cluster_credentials = rs_client.get_cluster_credentials(
    ClusterIdentifier = CLUSTER,
    DbUser = 'admin',
    DbName = DBNAME,
    DbGroups = ["admin_group"],
    AutoCreate = True
)

# Open a psycopg2 connection to the cluster endpoint with the temporary credentials.
CNXN = psycopg2.connect(
    host = endpoint,
    port = 5439,
    user = cluster_credentials["DbUser"],
    password = cluster_credentials["DbPassword"],
    database = DBNAME
)

### Read the table from Redshift into a pandas DataFrame

In [10]:
# Cursor used for the INSERT, UPDATE, and DELETE statements in the cells below.
cursor = CNXN.cursor()

select_sql = f"""
    SELECT *
    FROM {DBNAME}.{SCHEMA}.{TABLE}
"""

df_customerScores = pd.read_sql(select_sql, CNXN)

display(df_customerScores)



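|   | playerid | dob | gamesplayed | injured | position | name | numassists | numgoals | pointpercentage |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 11 | 1988-08-04 | 20 | False | RW | Dale | 24 | 21 | 2.1 |
| 1 | 12 | 1985-06-05 | 20 | False | C | Skip | 15 | 36 | 2.5 |
| 2 | 13 | 1985-03-15 | 15 | True | LW | Sanders | 20 | 30 | 1.9 |
| 3 | 14 | 1983-02-20 | 20 | False | LD | Patty | 38 | 12 | 1.5 |
| 4 | 15 | 1987-08-04 | 18 | False | RD | Reynolds | 16 | 6 | 0.8 |
| 5 | 25 | 1993-03-02 | 20 | True | G | Frank | 2 | 0 | 0.1 |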


### Add new row to the Redshift table

In [11]:
new_row_sql = f"""
    INSERT INTO {DBNAME}.{SCHEMA}.{TABLE} (
        dob,
        gamesplayed,
        injured,
        position,
        name,
        numassists,
        numgoals,
        pointpercentage
    ) VALUES (
        '1993-03-02',
        20,
        False,
        'G',
        'Frank',
        2,
        0,
        0.1
    )
"""

cursor.execute(new_row_sql)
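
The INSERT above writes its values directly into the SQL string. A minimal sketch of the same insert with bound parameters follows, so the driver handles quoting. To keep it runnable without cluster access, an in-memory `sqlite3` database stands in for Redshift (an assumption for illustration only), and the column types are guesses based on the output shown earlier. With psycopg2 the placeholders would be `%s` rather than `?`.

```python
import sqlite3

# Stand-in for the Redshift connection so the sketch runs anywhere (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table shape, guessed from the columns in the notebook output.
cur.execute("""
    CREATE TABLE dummytable (
        playerid INTEGER PRIMARY KEY AUTOINCREMENT,
        dob TEXT,
        gamesplayed INTEGER,
        injured BOOLEAN,
        position TEXT,
        name TEXT,
        numassists INTEGER,
        numgoals INTEGER,
        pointpercentage REAL
    )
""")

new_player = ("1993-03-02", 20, False, "G", "Frank", 2, 0, 0.1)

# The values travel separately from the SQL text.
# sqlite3 uses "?" placeholders; psycopg2 uses "%s".
cur.execute(
    """
    INSERT INTO dummytable (
        dob, gamesplayed, injured, position, name,
        numassists, numgoals, pointpercentage
    ) VALUES (?, ?, ?, ?, ?, ?, ?, ?)
    """,
    new_player,
)
conn.commit()

inserted = cur.execute(
    "SELECT name, position, injured FROM dummytable"
).fetchall()
```

Table and schema names cannot be bound this way, so interpolating `DBNAME`, `SCHEMA`, and `TABLE` from trusted constants, as the notebook does, remains reasonable.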

### Show new row

In [12]:
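# Filter on name and position. As the output shows, this also matches the
# pre-existing Frank row (playerid 25), so two rows come back.
get_new_row_sql = f"""
    SELECT *
    FROM {DBNAME}.{SCHEMA}.{TABLE}
    WHERE name = 'Frank' AND position = 'G'
"""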

display(pd.read_sql(get_new_row_sql, CNXN))



|   | playerid | dob | gamesplayed | injured | position | name | numassists | numgoals | pointpercentage |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 25 | 1993-03-02 | 20 | True | G | Frank | 2 | 0 | 0.1 |
| 1 | 26 | 1993-03-02 | 20 | False | G | Frank | 2 | 0 | 0.1 |
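
The name and position filter returns two rows here, while the challenge asks for only the new record. One possible way to isolate it, sketched below, is to compare the set of `playerid` values before and after the insert. An in-memory `sqlite3` table stands in for Redshift, the helper `fetch_ids` is hypothetical, and the approach assumes no one else writes to the table between the two reads.

```python
import sqlite3

# Stand-in for the Redshift connection (assumption for illustration).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE dummytable (playerid INTEGER PRIMARY KEY, name TEXT, position TEXT)"
)
# Pre-existing row that shares name and position with the one about to be added.
cur.execute(
    "INSERT INTO dummytable (playerid, name, position) VALUES (25, 'Frank', 'G')"
)


def fetch_ids(cursor):
    """Return the set of playerid values currently in the table (hypothetical helper)."""
    cursor.execute("SELECT playerid FROM dummytable")
    return {row[0] for row in cursor.fetchall()}


ids_before = fetch_ids(cur)
cur.execute("INSERT INTO dummytable (name, position) VALUES ('Frank', 'G')")
ids_after = fetch_ids(cur)

# Whatever id appeared between the two reads belongs to the new row.
new_ids = ids_after - ids_before

# Select only the new row by id, with one bound placeholder per id.
placeholders = ", ".join("?" for _ in new_ids)
cur.execute(
    f"SELECT playerid, name, position FROM dummytable WHERE playerid IN ({placeholders})",
    tuple(new_ids),
)
new_rows = cur.fetchall()
```

Keeping `new_ids` around also gives the later update and delete steps an unambiguous key to target.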


### Update record in Redshift table

In [13]:
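# Mark the player as injured. The filter matches every Frank/G row,
# so both playerid 25 and playerid 26 are updated.
update_sql = f"""
    UPDATE {DBNAME}.{SCHEMA}.{TABLE}
    SET injured = True
    WHERE name = 'Frank' AND position = 'G'
"""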

cursor.execute(update_sql)

display(pd.read_sql(get_new_row_sql, CNXN))



|   | playerid | dob | gamesplayed | injured | position | name | numassists | numgoals | pointpercentage |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 25 | 1993-03-02 | 20 | True | G | Frank | 2 | 0 | 0.1 |
| 1 | 26 | 1993-03-02 | 20 | True | G | Frank | 2 | 0 | 0.1 |


### Delete a record in the Redshift table

In [14]:
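# NOTE: no row in the outputs above has playerid 22, so this DELETE matches nothing
# and the listing below still contains the row inserted earlier (playerid 26).
delete_sql = f"""
    DELETE
    FROM {DBNAME}.{SCHEMA}.{TABLE}
    WHERE playerid = 22
"""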

cursor.execute(delete_sql)

display(pd.read_sql(select_sql, CNXN))



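|   | playerid | dob | gamesplayed | injured | position | name | numassists | numgoals | pointpercentage |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 11 | 1988-08-04 | 20 | False | RW | Dale | 24 | 21 | 2.1 |
| 1 | 12 | 1985-06-05 | 20 | False | C | Skip | 15 | 36 | 2.5 |
| 2 | 13 | 1985-03-15 | 15 | True | LW | Sanders | 20 | 30 | 1.9 |
| 3 | 14 | 1983-02-20 | 20 | False | LD | Patty | 38 | 12 | 1.5 |
| 4 | 15 | 1987-08-04 | 18 | False | RD | Reynolds | 16 | 6 | 0.8 |
| 5 | 25 | 1993-03-02 | 20 | True | G | Frank | 2 | 0 | 0.1 |
| 6 | 26 | 1993-03-02 | 20 | True | G | Frank | 2 | 0 | 0.1 |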
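
The listing still contains playerid 26, which suggests the DELETE did not match the intended row. A minimal sketch of a guard follows: it checks the cursor's `rowcount` after the DELETE and rolls back unless exactly one row was affected. An in-memory `sqlite3` table stands in for Redshift, and `delete_player` is a hypothetical helper; psycopg2 cursors expose `rowcount` as well.

```python
import sqlite3

# Stand-in for the Redshift connection (assumption for illustration).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE dummytable (playerid INTEGER PRIMARY KEY, name TEXT)")
cur.executemany(
    "INSERT INTO dummytable VALUES (?, ?)", [(25, "Frank"), (26, "Frank")]
)
conn.commit()


def delete_player(connection, playerid):
    """Delete one player by id; roll back and raise unless exactly one row matched."""
    cursor = connection.cursor()
    cursor.execute("DELETE FROM dummytable WHERE playerid = ?", (playerid,))
    deleted = cursor.rowcount  # rows affected by the DELETE
    if deleted != 1:
        connection.rollback()
        raise LookupError(f"expected to delete 1 row, matched {deleted}")
    connection.commit()


# An id that does not exist is reported instead of passing silently.
try:
    delete_player(conn, 22)
except LookupError as err:
    missing_message = str(err)

# The id of the inserted row deletes exactly one record.
delete_player(conn, 26)
remaining = cur.execute(
    "SELECT playerid FROM dummytable ORDER BY playerid"
).fetchall()
```

Deleting by primary key rather than by name and position also avoids removing the pre-existing Frank row by accident.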


### Commit and close connections

In [15]:
# Persist the changes made above, then release the cursor and the connection.
CNXN.commit()
cursor.close()
CNXN.close()
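
Because the commit happens only in the last cell, an error in any earlier cell leaves the transaction open. One possible alternative, sketched here with an in-memory `sqlite3` database standing in for Redshift (an assumption for illustration), is to wrap each write in `with conn:`, which commits on success and rolls back on an exception. psycopg2 connections support the same pattern, and in both libraries the `with` block does not close the connection, so `contextlib.closing` handles that separately.

```python
import sqlite3
from contextlib import closing

# closing() guarantees conn.close() runs even if something below raises.
with closing(sqlite3.connect(":memory:")) as conn:
    conn.execute(
        "CREATE TABLE dummytable (playerid INTEGER PRIMARY KEY, injured BOOLEAN)"
    )
    conn.execute("INSERT INTO dummytable VALUES (26, 0)")
    conn.commit()

    try:
        # "with conn" commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute("UPDATE dummytable SET injured = 1 WHERE playerid = 26")
            raise RuntimeError("simulated failure after the UPDATE")
    except RuntimeError:
        pass

    # The failed UPDATE was rolled back, so the stored value is unchanged.
    injured_after_failure = conn.execute(
        "SELECT injured FROM dummytable WHERE playerid = 26"
    ).fetchone()[0]
```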