In [1]:
!pip install psycopg2



In [2]:
import psycopg2
import pandas as pd

LINK TO DATABASE INFO & SCHEMA: https://github.com/isaac-campbell-smith/Pokestars

In [3]:
def pretty_query(cur, query, conn):
    # A failed query leaves the transaction aborted, and no further query
    # will run until it is rolled back, so always roll back first.
    conn.rollback()
    cur.execute(query)
    data = cur.fetchall()
    # cur.description holds one Column object per result column;
    # its .name attribute is the SQL column label.
    headers = [head.name for head in cur.description]
    out = pd.DataFrame(data=data, columns=headers)

    return out
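
If you want to see what `cur.description` looks like without connecting to the workshop database, here is a minimal sketch. It assumes Python's built-in `sqlite3` module and a made-up `demo` table (neither is part of this notebook's setup); `sqlite3` cursors only support index access, so the label is read with `col[0]` instead of `.name`.

```python
import sqlite3

# Hypothetical in-memory table, used only to illustrate cursor.description.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE demo (id INTEGER, name TEXT)")
cur.executemany("INSERT INTO demo VALUES (?, ?)", [(1, "a"), (2, "b")])

cur.execute("SELECT id, name FROM demo ORDER BY id")
rows = cur.fetchall()

# DB-API cursors describe each result column as a 7-item sequence whose
# first item is the column label.
headers = [col[0] for col in cur.description]

# Pair every row with the labels, much like the DataFrame columns above.
records = [dict(zip(headers, row)) for row in rows]
print(headers)
print(records)
conn.close()
```

Passing `rows` and `headers` to `pd.DataFrame(data=rows, columns=headers)` would give the same kind of table that `pretty_query` returns.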

In [4]:
conn = psycopg2.connect('') 
cur = conn.cursor()

In [5]:
# Example Query to test connection
query = """
SELECT * FROM pokemon LIMIT 10;
"""
pretty_query(cur, query, conn)

Unnamed: 0,id,name,type_1,type_2,hp,attack,defense,sp_attack,sp_defense,speed
0,0,Gliscor,Ground,Flying,75,95,125,45,75,95
1,1,Fletchinder,Fire,Flying,62,73,55,56,52,84
2,2,Poliwrath,Water,Fighting,90,95,95,70,90,70
3,3,Cresselia,Psychic,,120,70,120,75,130,85
4,4,Frogadier,Water,,54,63,52,83,56,97
5,5,Sceptile,Grass,,70,110,75,145,85,145
6,6,Lapras,Water,Ice,130,85,80,85,95,60
7,7,Sharpedo,Water,Dark,70,140,70,110,65,105
8,8,Slowbro,Water,Psychic,95,75,180,130,80,30
9,9,Omastar,Rock,Water,70,60,125,115,70,55


### WARM-UP
There was a new Pokemon officially released last month (08-2020); will it rise to the top of the ladder? I'd like to know where it ranked in usage, but we'll start small and build that query out in pieces. To start, tell me its name, typing and usage from last month.

Pro Tip: Identify this Pokemon from the `pokemon` table rather than the `battles` table, since it is the most recent entry there. A second Pokemon that was already in the `pokemon` table also shows up in the `battles` table for the first time last month, because its usage climbed above a relevant threshold.<br><br>
[Solution](#Warm-Up)

In [6]:
query = """
;
"""
pretty_query(cur, query, conn)

### QUESTION 1.a
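Now execute a query for this Pokemon's id. The catch? Create a temporary table that contains everything from the previous query (plus its id, of course), then query its id from your temporary table. Strictly speaking, a `WITH` clause defines a common table expression (CTE) that exists only for the duration of the query, but "temporary table" is a fine mental model here.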

Temporary Table structure:

>WITH &nbsp;&nbsp; temp_table_name(column_label_1, column_label_2, etc.) <br>
&nbsp;&nbsp;&nbsp;  AS <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;    ( QUERY )
    <br>SELECT id FROM temp_table_name;
<br><br>
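
To see the `WITH` structure run end to end, here is a small sketch on toy data. It assumes Python's built-in `sqlite3` module and a made-up `fruit` table instead of the workshop's PostgreSQL database, so the table, columns and prices are all hypothetical.

```python
import sqlite3

# Hypothetical toy table; not part of the Pokestars schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fruit (id INTEGER, name TEXT, price REAL);
INSERT INTO fruit VALUES (1, 'apple', 0.5), (2, 'mango', 1.75), (3, 'plum', 0.8);
""")

# The labels in parentheses rename the columns of the inner query,
# so the outer SELECT refers to fruit_id rather than id.
query = """
WITH pricey(fruit_id, fruit_name)
  AS
     (SELECT id, name
        FROM fruit
       WHERE price > 0.75)
SELECT fruit_id FROM pricey ORDER BY fruit_id;
"""
pricey_ids = [row[0] for row in conn.execute(query)]
print(pricey_ids)
conn.close()
```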

[Solution](#Q1.a)

In [7]:
query = """
;
"""
pretty_query(cur, query, conn)

### QUESTION 1.b
What were the top 10 most used Pokemon last month ('08-2020')? <br> Your query should return columns: id & rank

If you've never used the SQL RANK() function before (it is one of SQL's window functions), it's a simple but powerful tool that ranks the rows of a query by a given column and outputs the rank as a new column. The general syntax is:<br>
> SELECT RANK() OVER(ORDER BY column [ASC/DESC])

An additional clause commonly used with RANK() is PARTITION BY. You can think of it as similar to a GROUP BY, in that the rank resets for each value of the column you partition by. So, for example, if you wanted the ranking of cities within every state, the syntax would be:
> SELECT RANK() OVER(PARTITION BY state ORDER BY city DESC)

You shouldn't need to use a partition clause for this query, but it will come in handy later!
<br><br>
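
Here is a rough sketch of the difference between a plain RANK() and one with PARTITION BY. It assumes Python's built-in `sqlite3` module (window functions need SQLite 3.25 or newer) and a made-up `cities` table; the population figures are invented for illustration only.

```python
import sqlite3

# Hypothetical toy data; the population numbers are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cities (state TEXT, city TEXT, population INTEGER);
INSERT INTO cities VALUES
  ('CO', 'Denver', 700), ('CO', 'Boulder', 100), ('CO', 'Aurora', 380),
  ('WA', 'Seattle', 730), ('WA', 'Tacoma', 220);
""")

# overall_rank ranks every row together;
# state_rank restarts at 1 for each state because of PARTITION BY.
query = """
SELECT state, city,
       RANK() OVER(ORDER BY population DESC) AS overall_rank,
       RANK() OVER(PARTITION BY state ORDER BY population DESC) AS state_rank
  FROM cities
 ORDER BY overall_rank;
"""
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
conn.close()
```

Notice that Tacoma is 4th overall but 2nd within its own state, which is exactly the "reset" behaviour described above.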

[Solution](#Q1.b)

In [8]:
query="""
;
      """
pretty_query(cur, query, conn)

### QUESTION 1.c
Now, using the temporary table from 1.a and our query from 1.b as a second temporary table (don't forget to remove your LIMIT 10 clause!), you should be able to answer the question: what was Zarude's usage ranking last month?

<br> Adding temp tables is as simple as

WITH &nbsp;&nbsp; table_name(column_label_1, column_label_2, etc.) <br>
&nbsp;&nbsp;&nbsp;  AS <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;    ( QUERY ),<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; table_name_2(column_label_1, column_label_2, etc.) <br>
&nbsp;&nbsp;&nbsp;  AS <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;    ( QUERY )

<br>
This query doesn't really <i>need</i> to use temporary tables but it's good to practice them. Refactor the solution if you have time.
<br><br>
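
As a sketch of two temporary tables chained with a comma and then joined, here is a toy version. It assumes Python's built-in `sqlite3` module (SQLite 3.25 or newer for RANK()) and made-up `players` and `scores` tables, none of which belong to the workshop database.

```python
import sqlite3

# Hypothetical toy tables; not part of the Pokestars schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (id INTEGER, name TEXT);
CREATE TABLE scores (id INTEGER, points INTEGER);
INSERT INTO players VALUES (1, 'Ann'), (2, 'Bo'), (3, 'Cy');
INSERT INTO scores VALUES (1, 40), (2, 90), (3, 65);
""")

# Two temp tables under one WITH, separated by a comma,
# then joined in the final SELECT.
query = """
WITH newest(id, name)
  AS
     (SELECT id, name
        FROM players
       WHERE id = (SELECT MAX(id) FROM players)
     ),
     score_ranks(id, score_rank)
  AS
     (SELECT id, RANK() OVER(ORDER BY points DESC)
        FROM scores
     )
SELECT newest.name, score_ranks.score_rank
  FROM score_ranks
  JOIN newest ON score_ranks.id = newest.id;
"""
result = conn.execute(query).fetchall()
print(result)
conn.close()
```

The ranking is computed over every row of `scores` before the join narrows it to one player, which is why the rank stays meaningful.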

[Solution](#Q1.c)

In [9]:
query = """
;
"""
pretty_query(cur, query, conn)

### QUESTION 2.a
How did the average usage ranking for Grass and Dark types compare against all other types in August? In other words, what was the average ranking for all types?
<br><br>
To properly execute this query you'll need to use a UNION ALL statement since every Pokemon has 1-2 types. You can think of UNION ALL as the SQL equivalent of pandas pd.concat([df1, df2], axis=0). You'll need to add some JOINs and WHERE clauses here, but the basic syntax for this sub-query will be:

SELECT type_1 AS type<br>
&nbsp;&nbsp; UNION ALL<br>
SELECT type_2 AS type WHERE type_2 != 'None'

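Note that I aliased type_1 & type_2 as type. UNION ALL requires both SELECT statements to return the same number of columns with compatible data types; the result takes its column names from the first SELECT, so aliasing both simply keeps the query readable. We also don't want to count a non-existent second type for mono-type Pokemon (hence WHERE type_2 != 'None').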
<br><br>
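
Here is a small sketch of stacking two type columns with UNION ALL and then aggregating. It assumes Python's built-in `sqlite3` module and a made-up `creatures` table in which a missing second type is stored as the string 'None' (an assumption that mirrors the WHERE clause above, not a confirmed detail of the real schema).

```python
import sqlite3

# Hypothetical toy table; 'None' as a string is an assumption for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE creatures (name TEXT, type_1 TEXT, type_2 TEXT);
INSERT INTO creatures VALUES
  ('A', 'Water', 'Ice'), ('B', 'Water', 'None'), ('C', 'Ice', 'Rock');
""")

# Stack type_1 on top of type_2 (skipping the 'None' placeholder),
# then count how often each type appears in the stacked column.
query = """
SELECT type, COUNT(*) AS n
  FROM (SELECT type_1 AS type FROM creatures
         UNION ALL
        SELECT type_2 AS type FROM creatures WHERE type_2 != 'None')
 GROUP BY type
 ORDER BY n DESC, type;
"""
type_counts = conn.execute(query).fetchall()
print(type_counts)
conn.close()
```

Creature B contributes only its first type, so it is counted once for Water and never under 'None'.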

[Solution](#Q2.a)

In [10]:
query = """
;
"""
pretty_query(cur, query, conn)

### QUESTION 2.b
I'm not convinced this gives us the best picture of what Zarude is up against. Instead of calculating the average usage, get a count of each type for Pokemon with a top 30 usage ranking.
<br><br>

[Solution](#Q2.b)

In [11]:
query = """
;
"""
pretty_query(cur, query, conn)

### QUESTION 3

Two Pokemon were in the Top 10 July usage rankings and had been dominating the meta-game all summer. But they got hit with bans around mid-August and, as a result, dropped dramatically in usage for the whole of August. What are their names?
<br><br>There are a number of ways you can accomplish this query, but I'd recommend starting with a temporary rankings table that uses a PARTITION BY clause and going from there.
<br><br>

[Solution](#Q3)

In [12]:
query = """
;
"""
pretty_query(cur, query, conn)

# RUN THIS CELL WHEN YOU'RE DONE OR ELSE I WILL FIND YOU AND HURT YOU

In [21]:
cur.close()  # Close the cursor
conn.close()

# Warm-Up

In [13]:
query = """
SELECT p.name, type_1, type_2, usage
  FROM battles b
  JOIN pokemon p ON p.id=b.id
  WHERE b.id=(SELECT MAX(id) FROM pokemon);
"""
pretty_query(cur, query, conn)

Unnamed: 0,name,type_1,type_2,usage
0,Zarude,Dark,Grass,0.023724


# Q1.a

In [14]:
query = """
WITH zarude(id, name, type_1, type_2, usage)
  AS 
     (SELECT b.id, p.name, p.type_1, p.type_2, b.usage
        FROM battles b
        JOIN pokemon p ON p.id=b.id
       WHERE b.id=(SELECT MAX(id) FROM pokemon)
     )
SELECT id FROM zarude;
"""
pretty_query(cur, query, conn)

Unnamed: 0,id
0,712


# Q1.b

In [15]:
query="""
     SELECT id, RANK () OVER(ORDER BY usage DESC)
       FROM battles
      WHERE month='08-2020'
      ORDER BY rank  -- without an explicit ORDER BY, LIMIT may return any 10 rows
      LIMIT 10;
      """
pretty_query(cur, query, conn)

Unnamed: 0,id,rank
0,674,1
1,32,2
2,673,3
3,624,4
4,206,5
5,84,6
6,707,7
7,531,8
8,186,9
9,658,10


# Q1.c

In [16]:
query = """
WITH zarude(id, name, type_1, type_2, usage)
  AS 
     (SELECT b.id, p.name, p.type_1, p.type_2, b.usage
        FROM battles b
        JOIN pokemon p ON p.id=b.id
       WHERE b.id=(SELECT MAX(id) FROM pokemon)
     ),

     august_ranks(id, rank)
  AS
    (
     SELECT id, RANK () OVER(ORDER BY usage DESC)
       FROM battles b
      WHERE month='08-2020'
    )
     
SELECT rank
  FROM august_ranks
  JOIN zarude ON august_ranks.id=zarude.id;
"""
pretty_query(cur, query, conn)

Unnamed: 0,rank
0,61


# Q2.a

In [18]:
query = """
WITH type_ranks(type, usage)
  AS 
    (SELECT p.type_1 AS type, usage 
       FROM battles b 
       JOIN pokemon p ON b.id=p.id 
      WHERE month='08-2020'
   UNION ALL
     SELECT p.type_2 AS type, usage 
       FROM battles b 
       JOIN pokemon p ON b.id=p.id 
      WHERE type_2 !='None' AND month='08-2020')
      
SELECT type, RANK() OVER(ORDER BY AVG(usage) DESC)
  FROM type_ranks 
  GROUP BY type
;
"""
pretty_query(cur, query, conn)

Unnamed: 0,type,rank
0,Grass,1
1,Steel,2
2,Flying,3
3,Ghost,4
4,Dragon,5
5,Electric,6
6,Dark,7
7,Fire,8
8,Fairy,9
9,Ground,10


# Q2.b

In [19]:
query = """
WITH type_ranks(type, rank)
  AS 
    (SELECT p.type_1 AS type, RANK() OVER(ORDER BY usage DESC)
       FROM battles b 
       JOIN pokemon p ON b.id=p.id 
      WHERE month='08-2020'
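   UNION ALL
     -- Caveat: WHERE filters rows before RANK() runs, so this branch ranks
     -- dual-type Pokemon only against each other, not the full August ladder.
     SELECT p.type_2 AS type, RANK() OVER(ORDER BY usage DESC)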
       FROM battles b 
       JOIN pokemon p ON b.id=p.id 
      WHERE type_2 !='None' AND month='08-2020')
      
SELECT type, COUNT(rank)
  FROM type_ranks 
 WHERE rank<=30
  GROUP BY type
 ORDER BY count DESC
;
"""
pretty_query(cur, query, conn)

Unnamed: 0,type,count
0,Flying,5
1,Steel,5
2,Poison,5
3,Fairy,5
4,Water,5
5,Dark,5
6,Ghost,4
7,Grass,4
8,Dragon,4
9,Fighting,4


# Q3

In [20]:
query = """
WITH 
     july_august(id, month, rank)
  AS
    (
     SELECT id, month, RANK () OVER(PARTITION BY month ORDER BY usage DESC)
       FROM battles
      WHERE month IN ('07-2020','08-2020')
    )
    
SELECT p.name
 FROM july_august AS ja
 JOIN pokemon AS p
   ON ja.id=p.id
 WHERE ja.id NOT IN (SELECT id FROM july_august WHERE month='08-2020' AND rank <= 10)
       AND ja.rank <= 10 AND ja.month='07-2020';
"""
pretty_query(cur, query, conn)

Unnamed: 0,name
0,Magearna
1,Cinderace
