# Using SQLAlchemy to implement PostgreSQL code in a more Pythonic way

SQLAlchemy is a Python library that offers several styles of syntax for issuing SQL commands against many kinds of SQL databases, including MySQL, Oracle, SQLite, and PostgreSQL. This notebook uses SQLAlchemy to run PostgreSQL commands. 

SQLAlchemy supports two main styles of syntax:

1.) One option is essentially the same as using psycopg2: the SQL code is written verbatim, embedded within strings that are executed after establishing a connection to the database.

2.) The other option, which this notebook uses, builds SQL commands from Python objects and method calls, without the need to bury SQL syntax within strings.
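To make the contrast between the two options concrete, here is a minimal sketch that runs the same query both ways. It assumes an in-memory SQLite database in place of the PostgreSQL server so that it runs anywhere, and the two sample rows are only illustrative.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine, text

# Assumption: in-memory SQLite stands in for PostgreSQL so no server is needed.
engine = create_engine("sqlite:///:memory:")
meta = MetaData()
v_games = Table(
    "v_games", meta,
    Column("ID", Integer, primary_key=True),
    Column("title", String),
    Column("genre", String),
    Column("year", String),
)

with engine.begin() as conn:
    meta.create_all(conn)
    conn.execute(v_games.insert(), [
        {"ID": 2, "title": "Persona_5", "genre": "RPG", "year": "2017"},
        {"ID": 4, "title": "Okami", "genre": "Adventure", "year": "2006"},
    ])

    # Option 1: SQL written as a string, with a bound parameter.
    raw = text("SELECT title FROM v_games WHERE genre = :genre")
    titles_from_text = [row[0] for row in conn.execute(raw, {"genre": "RPG"})]

    # Option 2: the same query built from Python objects (title is column 1).
    stmt = v_games.select().where(v_games.c.genre == "RPG")
    titles_from_expr = [row[1] for row in conn.execute(stmt)]

print(titles_from_text, titles_from_expr)
```

Both lists should contain the same title; the difference is only in how the query is written.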

In a few previous notebooks, I created a PostgreSQL database and ran some queries using the psycopg2 library. This time, I will also create a table within the same database, but using the SQLAlchemy library. The first several cells of code will implement CRUD: create, read, update, and delete, for demonstration purposes.

The cells after that will implement more advanced (and arguably more interesting) SQL commands, such as joins and other SELECT queries.

In [14]:
#import SQLAlchemy library 
from sqlalchemy import create_engine
from sqlalchemy import Table, Column, String, MetaData, Float, Integer, ForeignKey

#establish database connection
#via sqlalchemy, the general syntax for creating a database connection is: dialect+driver://username:password@host:port/database
db_string = "postgresql://postgres:five@localhost:5433/goods"

db_co = create_engine(db_string)

meta = MetaData(db_co)  
v_games = Table('v_games', meta,
                      Column('ID', Integer, primary_key=True),
                      Column('title', String),
                      Column('genre', String),
                      Column('year', String))
#connect Python to the database
with db_co.connect() as conn:

    #Create table called v_games
    v_games.create()
    #specify the data to insert into a row of each column in the table
    insert_statement = v_games.insert().values(ID = 1, title="XCOM2", genre="Strategy", year="2016")
    #commit/save the change to the table
    conn.execute(insert_statement)

    #do SELECT statement, and print out its output
    #specify the SELECT statement
    select_statement = v_games.select()
    #execute the SELECT query
    result_set = conn.execute(select_statement)
    #iterate on each row from the results of the SELECT query
    for r in result_set:
        #print each row returned by the SELECT query
        print(r)

(1, 'XCOM2', 'Strategy', '2016')


Notice that the code creates a table called "v_games" in the database, with a unique integer ID as the primary key plus 3 string columns.

The code then inserts 1 row of data into the table, and then selects every row from the v_games table.
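The connection string in the cell above follows the dialect+driver://username:password@host:port/database pattern. As a rough illustration of how those pieces line up, the sketch below splits such a string with Python's standard urllib.parse (not SQLAlchemy's own URL handling). The user name and password shown are hypothetical placeholders.

```python
from urllib.parse import urlsplit

# Hypothetical credentials, for illustration only.
url = "postgresql+psycopg2://some_user:some_password@localhost:5433/goods"
parts = urlsplit(url)

# The scheme carries the dialect and, optionally, the driver after a "+".
dialect, _, driver = parts.scheme.partition("+")

components = {
    "dialect": dialect,
    "driver": driver,
    "username": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,
    "database": parts.path.lstrip("/"),
}
print(components)
```

When the driver part is omitted, as in this notebook, SQLAlchemy falls back to the default driver for the dialect.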

In [15]:
#import SQLAlchemy library 
from sqlalchemy import create_engine
from sqlalchemy import Table, Column, String, MetaData, Float, Integer, ForeignKey

#establish database connection
#via sqlalchemy, the general syntax for creating a database connection is: dialect+driver://username:password@host:port/database
db_string = "postgresql://postgres:five@localhost:5433/goods"

db_co = create_engine(db_string)

meta = MetaData(db_co)  
#define/specify the table, so that SQL commands can be implemented on it
v_games = Table('v_games', meta,
                      Column('ID', Integer, primary_key=True),
                      Column('title', String),
                      Column('genre', String),
                      Column('year', String))

#connect Python to the database
with db_co.connect() as conn:

    #specify the data to insert into a row of each column in the table
    insert_statement = v_games.insert().values(ID = 2, title="Persona_5", genre="RPG", year="2017")
    #another insert statement
    insert_statement2 = v_games.insert().values(ID = 3, title="Witcher_3", genre="RPG", year="2015")

    #commit/save the change to the table
    conn.execute(insert_statement)
    conn.execute(insert_statement2)

    #do SELECT statement, and print out its output as a sanity check
    #specify the SELECT statement
    select_statement = v_games.select()
    #execute the SELECT query
    result_set = conn.execute(select_statement)
    #iterate on each row from the results of the SELECT query
    for r in result_set:
        #print the select statement
        print(r)

(1, 'XCOM2', 'Strategy', '2016')
(2, 'Persona_5', 'RPG', '2017')
(3, 'Witcher_3', 'RPG', '2015')


Notice that the 2 rows of data were added successfully. 

Say you notice a spelling error, or some other aspect of the existing rows that needs to be updated. This can be done via SQLAlchemy as well. 

A specific row of data can be singled out for updating by chaining the where() method to the update() command; only rows that match the where() condition are changed.
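Since where() affects every matching row, filtering on the primary key is one way to be sure only a single row is touched. The following is a minimal sketch of that idea, assuming an in-memory SQLite database as a stand-in for PostgreSQL; the rowcount attribute of the result reports how many rows matched.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

# Assumption: in-memory SQLite stands in for the PostgreSQL database.
engine = create_engine("sqlite:///:memory:")
meta = MetaData()
v_games = Table(
    "v_games", meta,
    Column("ID", Integer, primary_key=True),
    Column("title", String),
    Column("year", String),
)

with engine.begin() as conn:
    meta.create_all(conn)
    conn.execute(v_games.insert(), [
        {"ID": 1, "title": "XCOM2", "year": "2016"},
        {"ID": 2, "title": "Persona_5", "year": "2017"},
    ])

    # Filter on the primary key so exactly one row can match.
    stmt = v_games.update().where(v_games.c.ID == 1).values(title="XCOM_2")
    updated_rows = conn.execute(stmt).rowcount

    titles = [row[1] for row in conn.execute(v_games.select().order_by(v_games.c.ID))]

print(updated_rows, titles)
```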

In [15]:
#re-specify the password, host, and other info needed to access the database
db_string = "postgresql://postgres:five@localhost:5433/goods"
db_co = create_engine(db_string)

meta = MetaData(db_co)  

with db_co.connect() as conn: 
    # Update the title of the game released in 2016
    update_statement = v_games.update().where(v_games.c.year=="2016").values(title = "XCOM_2")
    conn.execute(update_statement)
    
    # do SELECT statement, and print out its output
    #this way, we can see whether the update statement worked as expected
    select_statement = v_games.select()
    result_set = conn.execute(select_statement)
    for r in result_set:
        print(r)

(2, 'Persona_5', 'RPG', '2017')
(3, 'Witcher_3', 'RPG', '2015')
(1, 'XCOM_2', 'Strategy', '2016')


Notice the update() command successfully changed the title from XCOM2 to XCOM_2 within the title column. 

Any data within a given table can, of course, also be deleted. The next lines of code will delete the XCOM_2 row that was just updated (again, this is only for demonstration purposes).

In [28]:
db_string = "postgresql://postgres:five@localhost:5433/goods"

db_co = create_engine(db_string)

meta = MetaData(db_co)

with db_co.connect() as conn: 
    

    # Delete the row of data for XCOM_2: i.e., released where year equals 2016
    delete_statement = v_games.delete().where(v_games.c.year == "2016")
    conn.execute(delete_statement)
    
    
    # do SELECT statement, and print out its output
    select_statement = v_games.select()
    result_set = conn.execute(select_statement)
    for r in result_set:
        print(r)
    

(2, 'Persona_5', 'RPG', '2017')
(3, 'Witcher_3', 'RPG', '2015')


In [30]:
#import SQLAlchemy library 
from sqlalchemy import create_engine
from sqlalchemy import Table, Column, String, MetaData, Float, Integer, ForeignKey

#establish database connection
#via sqlalchemy, the general syntax for creating a database connection is: dialect+driver://username:password@host:port/database
db_string = "postgresql://postgres:five@localhost:5433/goods"

db_co = create_engine(db_string)

#show the SQL code that's being generated via the Python SQLAlchemy code
db_co.echo=True

meta = MetaData(db_co)  


#connect Python to the database
with db_co.connect() as conn:

    #do SELECT statement, and print out its output as a sanity check
    #specify the SELECT statement
    select_statement = v_games.select()
    #execute the SELECT query
    result_set = conn.execute(select_statement)
    #iterate on each row from the results of the SELECT query
    for r in result_set:
        #print the select statement
        print(r)

2019-02-01 14:57:00,372 INFO sqlalchemy.engine.base.Engine select version()
2019-02-01 14:57:00,373 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 14:57:00,374 INFO sqlalchemy.engine.base.Engine select current_schema()
2019-02-01 14:57:00,375 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 14:57:00,377 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2019-02-01 14:57:00,378 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 14:57:00,379 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2019-02-01 14:57:00,380 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 14:57:00,383 INFO sqlalchemy.engine.base.Engine show standard_conforming_strings
2019-02-01 14:57:00,384 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 14:57:00,387 INFO sqlalchemy.engine.base.Engine SELECT v_games."ID", v_games.title, v_games.genre, v_games.year 
FROM v_games
2019-02-01 14:57:00,388 INFO sqlalchemy.engine.base.Engine {}
(2

### Finally, insert several more rows of data, so the dataset isn't quite as trivial in length. The dataset will span release years from 2004 to 2017. 

In [31]:
#import SQLAlchemy library 
from sqlalchemy import create_engine
from sqlalchemy import Table, Column, String, MetaData, Float, Integer, ForeignKey

#establish database connection
#via sqlalchemy, the general syntax for creating a database connection is: dialect+driver://username:password@host:port/database
db_string = "postgresql://postgres:five@localhost:5433/goods"

db_co = create_engine(db_string)

#show the SQL code that's being generated via the Python SQLAlchemy code
db_co.echo=True

meta = MetaData(db_co)  


#connect Python to the database
with db_co.connect() as conn:
    
    #specify the data to insert into a row of each column in the table
    insert_statement = v_games.insert().values(ID = 1, title="Borderlands_2", genre="FPS", year="2012")
    insert_statement2 = v_games.insert().values(ID = 4, title="Okami", genre="Adventure", year="2006")
    insert_statement3 = v_games.insert().values(ID = 5, title="Fallout_New_Vegas", genre="RPG", year="2010")
    insert_statement4 = v_games.insert().values(ID = 6, title="Red_Faction_Guerrilla", genre="Action", year="2009")
    insert_statement5 = v_games.insert().values(ID = 7, title="Bioshock", genre="RPG", year="2007")
    insert_statement6 = v_games.insert().values(ID = 8, title="Age_of_Empires_III", genre="Strategy", year="2005")
    insert_statement7 = v_games.insert().values(ID = 9, title="Uncharted_3", genre="Adventure", year="2011")
    insert_statement8 = v_games.insert().values(ID = 10, title="Fallout_3", genre="RPG", year="2008")
    insert_statement9 = v_games.insert().values(ID = 11, title="Tony_Hawk's_Underground_2", genre="Sports", year="2004")
    insert_statement10 = v_games.insert().values(ID = 12, title="XCOM: Enemy Within", genre="Strategy", year="2013")
    insert_statement11 = v_games.insert().values(ID = 13, title="Divinity_Original_Sin", genre="RPG", year="2014")
    insert_statement12 = v_games.insert().values(ID = 14, title="Pillars_of_Eternity", genre="RPG", year="2015")

    #commit/save the change to the table
    conn.execute(insert_statement)
    conn.execute(insert_statement2)
    conn.execute(insert_statement3)
    conn.execute(insert_statement4)
    conn.execute(insert_statement5)
    conn.execute(insert_statement6)
    conn.execute(insert_statement7)
    conn.execute(insert_statement8)
    conn.execute(insert_statement9)
    conn.execute(insert_statement10)
    conn.execute(insert_statement11)
    conn.execute(insert_statement12)

    #do SELECT statement, and print out its output as a sanity check
    #specify the SELECT statement
    select_statement = v_games.select()
    #execute the SELECT query
    result_set = conn.execute(select_statement)
    #iterate on each row from the results of the SELECT query
    for r in result_set:
        #print the select statement
        print(r)

2019-02-01 14:57:45,508 INFO sqlalchemy.engine.base.Engine select version()
2019-02-01 14:57:45,509 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 14:57:45,510 INFO sqlalchemy.engine.base.Engine select current_schema()
2019-02-01 14:57:45,511 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 14:57:45,513 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2019-02-01 14:57:45,513 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 14:57:45,515 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2019-02-01 14:57:45,516 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 14:57:45,517 INFO sqlalchemy.engine.base.Engine show standard_conforming_strings
2019-02-01 14:57:45,519 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 14:57:45,520 INFO sqlalchemy.engine.base.Engine INSERT INTO v_games ("ID", title, genre, year) VALUES (%(ID)s, %(title)s, %(genre)s, %(year)s)
2019-02-01 14:57:45,521 INFO sqlalchemy.engine.
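The twelve separate insert statements in the cell above can also be sent in a single call by passing a list of dictionaries to execute() alongside one insert() construct. The sketch below shows the idea with three of the rows; it assumes an in-memory SQLite database rather than the PostgreSQL server used in this notebook.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

# Assumption: in-memory SQLite stands in for the PostgreSQL database.
engine = create_engine("sqlite:///:memory:")
meta = MetaData()
v_games = Table(
    "v_games", meta,
    Column("ID", Integer, primary_key=True),
    Column("title", String),
    Column("genre", String),
    Column("year", String),
)

# One dictionary per row; keys are the column names.
rows = [
    {"ID": 4, "title": "Okami", "genre": "Adventure", "year": "2006"},
    {"ID": 5, "title": "Fallout_New_Vegas", "genre": "RPG", "year": "2010"},
    {"ID": 7, "title": "Bioshock", "genre": "RPG", "year": "2007"},
]

with engine.begin() as conn:
    meta.create_all(conn)
    # A single insert() plus a list of parameter sets inserts every row.
    conn.execute(v_games.insert(), rows)
    inserted = [tuple(r) for r in conn.execute(v_games.select().order_by(v_games.c.ID))]

for row in inserted:
    print(row)
```

This keeps the data in one list, which is easier to extend than a numbered variable per row.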

# Code will now create a 2nd table containing gamefaqs as well as IGN ratings. Then, some SQL queries and JOINS will be performed.


The game ratings will be from 2 sources: gamefaqs user ratings (this will be called user_rating) and IGN (which features video game critic reviews; I will call this ign_rating). Since ratings may vary across platforms, PC ratings will be used for all games, except for console exclusives. Ratings will be entered only for the original release of the games in this dataset (i.e., no ratings for any remakes or subsequent DLC for these games). 

#### Note the following links as sources for ratings and video game sales data:

--video game user ratings: www.gamefaqs.com

--IGN ratings, see following kaggle kernel: https://www.kaggle.com/marekpagel/video-game-sales-throughout-1985-2017/data

--video game sales data: http://www.vgchartz.com/

### Caveat: the rating scales differ: gamefaqs ratings are out of 5, while IGN ratings are out of 10. 
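One simple way to compare the two sources is to rescale the gamefaqs ratings to a 10-point scale by doubling them. The sketch below does this in plain Python for three games whose ratings appear in this notebook; the helper name to_ten_point is hypothetical, and linear rescaling is only one possible choice.

```python
# (title, gamefaqs rating out of 5, IGN rating out of 10), as entered in this notebook.
sample_ratings = [
    ("Persona_5", 4.58, 9.7),
    ("Witcher_3", 4.54, 9.3),
    ("Okami", 4.45, 9.1),
]

def to_ten_point(user_rating):
    """Rescale a rating out of 5 to a rating out of 10 (hypothetical helper)."""
    return round(user_rating * 2, 2)

comparison = [(title, to_ten_point(user), ign) for title, user, ign in sample_ratings]

for title, user_10, ign in comparison:
    print(title, user_10, ign)
```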

In [39]:
#import SQLAlchemy library 
from sqlalchemy import create_engine
from sqlalchemy import Table, Column, String, MetaData, Float, Integer, ForeignKey

#establish database connection
db_string = "postgresql://postgres:five@localhost:5433/goods"

db_co = create_engine(db_string)

#show the SQL code that's being generated via the Python SQLAlchemy code
db_co.echo=True

meta = MetaData(db_co)  

#define the new table
ratings = Table('ratings', meta,
                      Column('ID', Integer, primary_key=True),
                      Column('user_rating', Float),
                      Column('ign_rating', Float))

#connect Python to the database
with db_co.connect() as conn:
                
    #Create table called ratings
    ratings.create() 
                
    #specify the data to insert into a row of each column in the table
    insert_statement = ratings.insert().values(ID = 1, user_rating= 4.1, ign_rating = 9.0)
    insert_statement2 = ratings.insert().values(ID = 2, user_rating= 4.58 , ign_rating =9.7)
    insert_statement3 = ratings.insert().values(ID = 3, user_rating= 4.54, ign_rating = 9.3)
    insert_statement4 = ratings.insert().values(ID = 4, user_rating= 4.45 , ign_rating = 9.1)
    insert_statement5 = ratings.insert().values(ID = 5, user_rating= 4.24, ign_rating = 9.0)
    insert_statement6 = ratings.insert().values(ID = 6, user_rating= 3.63, ign_rating =8.0)
    insert_statement7 = ratings.insert().values(ID = 7, user_rating= 4.13, ign_rating  =9.7)
    insert_statement8 = ratings.insert().values(ID = 8, user_rating= 3.82, ign_rating  = 8.8)
    insert_statement9 = ratings.insert().values(ID = 9, user_rating= 4.22, ign_rating  = 10)
    insert_statement10 = ratings.insert().values(ID = 10, user_rating= 4.07, ign_rating  =9.6)
    insert_statement11 = ratings.insert().values(ID = 11, user_rating= 4.00, ign_rating  =8.6)
    insert_statement12 = ratings.insert().values(ID = 12, user_rating= 4.33, ign_rating  =9.0)
    insert_statement13 = ratings.insert().values(ID = 13, user_rating=4.14, ign_rating  =9.0)
    insert_statement14 = ratings.insert().values(ID = 14, user_rating=4.04, ign_rating  =9.0)

    #commit/save the change to the table
    conn.execute(insert_statement)
    conn.execute(insert_statement2)
    conn.execute(insert_statement3)
    conn.execute(insert_statement4)
    conn.execute(insert_statement5)
    conn.execute(insert_statement6)
    conn.execute(insert_statement7)
    conn.execute(insert_statement8)
    conn.execute(insert_statement9)
    conn.execute(insert_statement10)
    conn.execute(insert_statement11)
    conn.execute(insert_statement12)
    conn.execute(insert_statement13)
    conn.execute(insert_statement14)


    #do SELECT statement, and print out its output as a sanity check
    #specify the SELECT statement
    select_statement = ratings.select()
    #execute the SELECT query
    result_set = conn.execute(select_statement)
    #iterate on each row from the results of the SELECT query
    for r in result_set:
        #print the select statement
        print(r)

2019-02-01 15:10:20,570 INFO sqlalchemy.engine.base.Engine select version()
2019-02-01 15:10:20,570 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 15:10:20,572 INFO sqlalchemy.engine.base.Engine select current_schema()
2019-02-01 15:10:20,572 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 15:10:20,575 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2019-02-01 15:10:20,577 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 15:10:20,581 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2019-02-01 15:10:20,581 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 15:10:20,583 INFO sqlalchemy.engine.base.Engine show standard_conforming_strings
2019-02-01 15:10:20,583 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 15:10:20,590 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE ratings (
	"ID" SERIAL NOT NULL, 
	user_rating FLOAT, 
	ign_rating FLOAT, 
	PRIMARY KEY ("ID")
)


2019-02-01 15:10:20,591 INFO 

# More Advanced SELECT queries and Joins 

### Select the title and year of games from the v_games table that are of the RPG genre.

In [63]:
#import SQLAlchemy library 
from sqlalchemy import select

from sqlalchemy import create_engine
from sqlalchemy import Table, Column, String, MetaData, Float, Integer, ForeignKey

#establish database connection
#via sqlalchemy, the general syntax for creating a database connection is: dialect+driver://username:password@host:port/database
db_string = "postgresql://postgres:five@localhost:5433/goods"

db_co = create_engine(db_string)

#show the SQL code that's being generated via the Python SQLAlchemy code
db_co.echo=True

meta = MetaData(db_co)  

#connect Python to the database
with db_co.connect() as conn:

    #do SELECT statement, and print out its output as a sanity check
    #specify the SELECT statement
    select_statement = select([v_games.c.title, v_games.c.year]).where(v_games.c.genre =='RPG')
    
    #execute the SELECT query
    result_set = conn.execute(select_statement)
    #iterate on each row from the results of the SELECT query
    for r in result_set:
        #print the select statement
        print(r)

2019-02-01 16:06:19,065 INFO sqlalchemy.engine.base.Engine select version()
2019-02-01 16:06:19,065 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 16:06:19,067 INFO sqlalchemy.engine.base.Engine select current_schema()
2019-02-01 16:06:19,068 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 16:06:19,070 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2019-02-01 16:06:19,071 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 16:06:19,072 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2019-02-01 16:06:19,073 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 16:06:19,074 INFO sqlalchemy.engine.base.Engine show standard_conforming_strings
2019-02-01 16:06:19,075 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 16:06:19,077 INFO sqlalchemy.engine.base.Engine SELECT v_games.title, v_games.year 
FROM v_games 
WHERE v_games.genre = %(genre_1)s
2019-02-01 16:06:19,078 INFO sqlalchemy.engine.base.Engine

## Gamefaqs ratings data: inner join of the v_games and ratings tables (matching on ID) to select all video game titles, years of release, and show their gamefaqs (user) ratings

In [68]:
#import SQLAlchemy library 
from sqlalchemy import select

from sqlalchemy import create_engine
from sqlalchemy import Table, Column, String, MetaData, Float, Integer, ForeignKey
import statistics

#establish database connection
#via sqlalchemy, the general syntax for creating a database connection is: dialect+driver://username:password@host:port/database
db_string = "postgresql://postgres:five@localhost:5433/goods"

db_co = create_engine(db_string)

#show the SQL code that's being generated via the Python SQLAlchemy code
db_co.echo=True

meta = MetaData(db_co)  

#connect Python to the database
with db_co.connect() as conn:

    #do SELECT statement, and print out its output as a sanity check
    #specify the SELECT statement
    select_statement = select([v_games.c.title, v_games.c.year, ratings.c.user_rating], v_games.c.ID == ratings.c.ID)
    
    #execute the SELECT query
    result_set = conn.execute(select_statement)
    #iterate on each row from the results of the SELECT query
    for r in result_set:
        #print the select statement
        print(r)

2019-02-01 16:39:34,993 INFO sqlalchemy.engine.base.Engine select version()
2019-02-01 16:39:34,994 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 16:39:34,996 INFO sqlalchemy.engine.base.Engine select current_schema()
2019-02-01 16:39:34,997 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 16:39:34,999 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2019-02-01 16:39:35,000 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 16:39:35,001 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2019-02-01 16:39:35,002 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 16:39:35,003 INFO sqlalchemy.engine.base.Engine show standard_conforming_strings
2019-02-01 16:39:35,004 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 16:39:35,005 INFO sqlalchemy.engine.base.Engine SELECT v_games.title, v_games.year, ratings.user_rating 
FROM v_games, ratings 
WHERE v_games."ID" = ratings."ID"
2019-02-01 16:39:35,006 INFO

## IGN ratings data: inner join of the v_games and ratings tables (matching on ID) to select all titles, years, and show each of their IGN ratings

In [69]:
#import SQLAlchemy library 
from sqlalchemy import create_engine, select
from sqlalchemy import Table, Column, String, MetaData, Float, Integer, ForeignKey
import statistics

#establish database connection
#via sqlalchemy, the general syntax for creating a database connection is: dialect+driver://username:password@host:port/database
db_string = "postgresql://postgres:five@localhost:5433/goods"

db_co = create_engine(db_string)

#show the SQL code that's being generated via the Python SQLAlchemy code
db_co.echo=True

meta = MetaData(db_co)  

#connect Python to the database
with db_co.connect() as conn:

    #do SELECT statement, and print out its output as a sanity check
    #specify the SELECT statement
    select_statement = select([v_games.c.title, v_games.c.year, ratings.c.ign_rating], v_games.c.ID == ratings.c.ID)
    
    #execute the SELECT query
    result_set = conn.execute(select_statement)
    #iterate on each row from the results of the SELECT query
    for r in result_set:
        #print the select statement
        print(r)

2019-02-01 16:40:48,595 INFO sqlalchemy.engine.base.Engine select version()
2019-02-01 16:40:48,596 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 16:40:48,598 INFO sqlalchemy.engine.base.Engine select current_schema()
2019-02-01 16:40:48,599 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 16:40:48,602 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2019-02-01 16:40:48,603 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 16:40:48,605 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2019-02-01 16:40:48,606 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 16:40:48,608 INFO sqlalchemy.engine.base.Engine show standard_conforming_strings
2019-02-01 16:40:48,608 INFO sqlalchemy.engine.base.Engine {}
2019-02-01 16:40:48,610 INFO sqlalchemy.engine.base.Engine SELECT v_games.title, v_games.year, ratings.ign_rating 
FROM v_games, ratings 
WHERE v_games."ID" = ratings."ID"
2019-02-01 16:40:48,611 INFO 

## Which games received a gamefaqs rating greater than 4.3/5? Also, what were the IGN ratings of these games?
## Implement an Inner Join between the v_games and ratings tables, and filter where user_rating > 4.3.

In [83]:
#import SQLAlchemy library 

from sqlalchemy import create_engine, select
from sqlalchemy import Table, Column, String, MetaData, Float, Integer, ForeignKey

#establish database connection
#via sqlalchemy, the general syntax for creating a database connection is: dialect+driver://username:password@host:port/database
db_string = "postgresql://postgres:five@localhost:5433/goods"

db_co = create_engine(db_string)


meta = MetaData(db_co)  

#connect Python to the database
with db_co.connect() as conn:

    #do SELECT statement, and print out its output as a sanity check
    #specify the SELECT statement
    select_statement = select([v_games.c.title, v_games.c.year,v_games.c.genre, ratings.c.ign_rating], v_games.c.ID == ratings.c.ID).where(ratings.c.user_rating > 4.3)
    
    #execute the SELECT query
    result_set = conn.execute(select_statement)
    #iterate on each row from the results of the SELECT query
    for r in result_set:
        #print the select statement
        print(r)

('Persona_5', '2017', 'RPG', 9.7)
('Witcher_3', '2015', 'RPG', 9.3)
('Okami', '2006', 'Adventure', 9.1)
('XCOM: Enemy Within', '2013', 'Strategy', 9.0)


### Notice that each of these games, which gamefaqs users rate as fairly high-tier, is also highly regarded by critics: i.e., these high-scoring gamefaqs games also rate highly on IGN.

### Another interesting observation is that half of the games are RPGs, and three of the four were released in 2013 or later.

## Left Outer Join of v_games table on ratings table, to examine the following question:
## What is the average gamefaqs rating (user_rating) for each video game genre in the database?

In [None]:
SELECT v_games.title, ratings.user_rating 
FROM v_games LEFT JOIN ratings ON v_games."ID" = ratings."ID";
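A LEFT JOIN keeps every row of v_games even when ratings has no matching ID, whereas an inner join drops such rows. The sketch below illustrates the difference, and the per-genre average asked about above, using Python's built-in sqlite3 module in place of PostgreSQL. In this notebook every game has a rating, so Okami's rating is deliberately left out here (an assumption made only to show the effect).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE v_games ("ID" INTEGER PRIMARY KEY, title TEXT, genre TEXT);
    CREATE TABLE ratings ("ID" INTEGER PRIMARY KEY, user_rating REAL);
    INSERT INTO v_games VALUES
        (2, 'Persona_5', 'RPG'), (3, 'Witcher_3', 'RPG'), (4, 'Okami', 'Adventure');
    -- Okami (ID 4) is intentionally given no rating in this sketch.
    INSERT INTO ratings VALUES (2, 4.58), (3, 4.54);
""")

# Inner join: only games that have a matching ratings row.
inner_rows = conn.execute(
    'SELECT g.title, r.user_rating FROM v_games g '
    'JOIN ratings r ON g."ID" = r."ID" ORDER BY g."ID"'
).fetchall()

# Left outer join: every game, with NULL (None) where no rating exists.
left_rows = conn.execute(
    'SELECT g.title, r.user_rating FROM v_games g '
    'LEFT JOIN ratings r ON g."ID" = r."ID" ORDER BY g."ID"'
).fetchall()

# Average user rating per genre; AVG ignores NULL values.
avg_by_genre = conn.execute(
    'SELECT g.genre, AVG(r.user_rating) FROM v_games g '
    'LEFT JOIN ratings r ON g."ID" = r."ID" GROUP BY g.genre ORDER BY g.genre'
).fetchall()

print(inner_rows)
print(left_rows)
print(avg_by_genre)
conn.close()
```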

In [76]:
#import SQLAlchemy library 
from sqlalchemy import create_engine, select
from sqlalchemy import Table, Column, String, MetaData, Float, Integer, ForeignKey
from sqlalchemy.sql import func
#establish database connection
#via sqlalchemy, the general syntax for creating a database connection is: dialect+driver://username:password@host:port/database
db_string = "postgresql://postgres:five@localhost:5433/goods"

db_co = create_engine(db_string)

#show the SQL code that's being generated via the Python SQLAlchemy code
db_co.echo=True

meta = MetaData(db_co)  

#connect Python to the database
with db_co.connect() as conn:

    #left outer join v_games to ratings, matching rows on ID
    games_with_ratings = v_games.outerjoin(ratings, v_games.c.ID == ratings.c.ID)
    #specify the SELECT statement: average user_rating for each genre
    select_statement = (
        select([v_games.c.genre, func.avg(ratings.c.user_rating)])
        .select_from(games_with_ratings)
        .group_by(v_games.c.genre)
    )
    
    #execute the SELECT query
    result_set = conn.execute(select_statement)
    #iterate on each row from the results of the SELECT query
    for r in result_set:
        #print each (genre, average user rating) row
        print(r)

## Create table of video game sales data

### Note: the video game sales data is current as of February 1, 2019. 


### Sales figures are in millions. 

#### To keep the statistics consistent, the sales numbers will only be for PC, except for console exclusives (again, statistics will only be for the original releases). Where the data are ambiguous or unavailable, NULL values will be filled in.
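Because unavailable figures are stored as NULL (None in Python), SQL aggregates such as COUNT(column) and AVG skip them instead of treating them as zero. The sketch below shows this with the sales figures used in this notebook, loaded into an in-memory SQLite database through Python's built-in sqlite3 module (an assumption made so the example runs without the PostgreSQL server).

```python
import sqlite3

# (ID, sales in millions); None marks ambiguous or unavailable data.
sales_rows = [
    (1, 0.94), (2, 1.64), (3, 5.48), (4, 0.63), (5, 5.22), (6, None), (7, 0.41),
    (8, 0.37), (9, 6.84), (10, 0.98), (11, None), (12, None), (13, 0.02), (14, None),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE sales ("ID" INTEGER PRIMARY KEY, sales REAL)')
conn.executemany("INSERT INTO sales VALUES (?, ?)", sales_rows)

# COUNT(*) counts all rows; COUNT(sales) and AVG(sales) ignore NULLs.
total_rows, rows_with_sales, mean_sales = conn.execute(
    "SELECT COUNT(*), COUNT(sales), AVG(sales) FROM sales"
).fetchone()

print(total_rows, rows_with_sales, round(mean_sales, 3))
conn.close()
```

The average is therefore taken over the games with known sales only, which is worth keeping in mind when interpreting it.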

In [46]:
#import SQLAlchemy library 
from sqlalchemy import create_engine
from sqlalchemy import Table, Column, String, MetaData, Float, Integer, ForeignKey

#establish database connection
db_string = "postgresql://postgres:five@localhost:5433/goods"

db_co = create_engine(db_string)

meta = MetaData(db_co)  

#define the new table
sales = Table('sales', meta,
                      Column('ID', Integer, primary_key=True),
                      Column('sales', Float))
#connect Python to the database
with db_co.connect() as conn:

    #Create table called sales
    sales.create()           
              

    #specify the data to insert into a row of each column in the table
    #each statement gets its own name so that none is overwritten
    insert_statement = sales.insert().values(ID = 1, sales= 0.94)
    insert_statement2 = sales.insert().values(ID = 2, sales=1.64)
    insert_statement3 = sales.insert().values(ID = 3, sales= 5.48)
    insert_statement4 = sales.insert().values(ID = 4, sales= 0.63)
    insert_statement5 = sales.insert().values(ID = 5, sales=5.22)
    insert_statement6 = sales.insert().values(ID = 6, sales= None)
    insert_statement7 = sales.insert().values(ID = 7, sales=0.41)
    insert_statement8 = sales.insert().values(ID = 8, sales= 0.37)
    insert_statement9 = sales.insert().values(ID = 9, sales= 6.84)
    insert_statement10 = sales.insert().values(ID = 10, sales= 0.98)
    insert_statement11 = sales.insert().values(ID = 11, sales= None)
    insert_statement12 = sales.insert().values(ID = 12, sales= None)
    insert_statement13 = sales.insert().values(ID = 13, sales=0.02)
    insert_statement14 = sales.insert().values(ID = 14, sales= None)

    #commit/save the change to the table
    conn.execute(insert_statement)
    conn.execute(insert_statement2)
    conn.execute(insert_statement3)
    conn.execute(insert_statement4)
    conn.execute(insert_statement5)
    conn.execute(insert_statement6)
    conn.execute(insert_statement7)
    conn.execute(insert_statement8)
    conn.execute(insert_statement9)
    conn.execute(insert_statement10)
    conn.execute(insert_statement11)
    conn.execute(insert_statement12)
    conn.execute(insert_statement13)
    conn.execute(insert_statement14)

    #do SELECT statement, and print out its output as a sanity check
    #specify the SELECT statement on the new sales table
    select_statement = sales.select()
    #execute the SELECT query
    result_set = conn.execute(select_statement)
    #iterate on each row from the results of the SELECT query
    for r in result_set:
        #print each row of the sales table
        print(r)