# Other Statements

In this notebook we'll present several SQL statements and functions so that we can add new skills to our SQL arsenal.

Just like in previous notebooks, we create the usual functions and objects.

In [1]:
# Usual libraries
import pandas as pd
import psycopg2 as pg2

# Creates the connection to the database
connection = pg2.connect(database = 'premier_league', user = 'postgres', password = 'password')

# Defines get_data function
def get_data(query, rows = 10):

    with connection.cursor() as cursor:
        cursor.execute(query)

        if rows == 'all':
            raw_data = cursor.fetchall()
        else:
            raw_data = cursor.fetchmany(rows) 

        col_names = [col_desc[0] for col_desc in cursor.description]
        data = pd.DataFrame(raw_data, columns = col_names)

    return data

# Creates cursor
cursor = connection.cursor()

Recall that, in the previous notebook, to show the usage of the DELETE and DROP statements we removed some rows and columns from the premier_league database. To have more data to play with, we'll restore the database to its original state.

In [2]:
# Deletes tables
delete_tables = 'DROP TABLE IF EXISTS clubs, players CASCADE'
cursor.execute(delete_tables)

create_clubs_table = '''
                     CREATE TABLE clubs(
                         club_id SERIAL PRIMARY KEY,
                         name VARCHAR(100) UNIQUE NOT NULL,
                         stadium_name VARCHAR(100), 
                         location VARCHAR(100),
                         times_champion INTEGER CHECK(times_champion >= 0)
                     )
                     '''

create_players_table = '''
                       CREATE TABLE players(
                           player_id SERIAL PRIMARY KEY,
                           first_name VARCHAR(100) NOT NULL,
                           last_name VARCHAR(100) NOT NULL, 
                           club_id INTEGER REFERENCES clubs(club_id),
                           nationality VARCHAR(100) NOT NULL
                       )
                       '''

# Creates tables
cursor.execute(create_clubs_table)
cursor.execute(create_players_table)

Insert the rows into both tables.

In [3]:
# Inserts clubs
clubs = [
        ('Liverpool FC', 'Anfield', 'Liverpool', 19),
        ('Manchester United FC', 'Old Trafford', 'Manchester', 20),
        ('Manchester City FC', 'Etihad Stadium', 'Manchester', 6),
        ('Chelsea FC', 'Stamford Bridge', 'London', 6),
        ('Tottenham Hotspur FC', 'Tottenham Hotspur Stadium', 'London', 2)
        ]

insert_generic_club = '''
                      INSERT INTO clubs(name, stadium_name, location, times_champion)
                      VALUES
                      (%s, %s, %s, %s)
                      '''

for club in clubs:
    cursor.execute(insert_generic_club, club)

# Inserts players
players = [
          ('Mohamed', 'Salah', 1, 'Egyptian'),
          ('Sadio', 'Mane', 1, 'Senegalese'),
          ('Marcus', 'Rashford', 2, 'British'),
          ('Sergio', 'Aguero', 3, 'Argentine'),
          ('Timo', 'Werner', 4, 'German') ,
          ('Harry', 'Kane', 5, 'British')
          ]

insert_player = '''
                INSERT INTO players(first_name, last_name, club_id, nationality)
                VALUES
                (%s, %s, %s, %s)
                '''
                
for player in players:
    cursor.execute(insert_player, player)

Let's display the tables

In [4]:
clubs = get_data('SELECT * FROM clubs')
players = get_data('SELECT * FROM players')

clubs

Unnamed: 0,club_id,name,stadium_name,location,times_champion
0,1,Liverpool FC,Anfield,Liverpool,19
1,2,Manchester United FC,Old Trafford,Manchester,20
2,3,Manchester City FC,Etihad Stadium,Manchester,6
3,4,Chelsea FC,Stamford Bridge,London,6
4,5,Tottenham Hotspur FC,Tottenham Hotspur Stadium,London,2


In [5]:
players

Unnamed: 0,player_id,first_name,last_name,club_id,nationality
0,1,Mohamed,Salah,1,Egyptian
1,2,Sadio,Mane,1,Senegalese
2,3,Marcus,Rashford,2,British
3,4,Sergio,Aguero,3,Argentine
4,5,Timo,Werner,4,German
5,6,Harry,Kane,5,British


## CASE Statement

The CASE expression is very similar to the _if, else if, else_ block in other programming languages. It lets us return a different value for each row depending on which conditions that row satisfies.

## Ex. 1

For example, we can ask whether or not a player plays for Liverpool.

In [6]:
query_1 = '''
          SELECT first_name, last_name, 
          CASE 
              WHEN club_id = 1 THEN 'yes'
              ELSE 'no'
          END AS plays_for_liverpool 
          FROM players        
          '''

plays_for_liverpool = get_data(query_1)
plays_for_liverpool

Unnamed: 0,first_name,last_name,plays_for_liverpool
0,Mohamed,Salah,yes
1,Sadio,Mane,yes
2,Marcus,Rashford,no
3,Sergio,Aguero,no
4,Timo,Werner,no
5,Harry,Kane,no
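
A CASE expression can also chain several WHEN branches, which are checked from top to bottom until one matches. The cell below is a minimal sketch of that idea. It assumes an in-memory SQLite database as a stand-in for our PostgreSQL one (so it runs without a server), and the category labels are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clubs (name TEXT, times_champion INTEGER)")
conn.executemany(
    "INSERT INTO clubs VALUES (?, ?)",
    [
        ("Liverpool FC", 19),
        ("Manchester United FC", 20),
        ("Tottenham Hotspur FC", 2),
    ],
)

# Branches are evaluated in order; the first matching WHEN decides the value.
# The labels are hypothetical and only serve the example.
query = """
    SELECT name,
           CASE
               WHEN times_champion >= 20 THEN 'record holder'
               WHEN times_champion >= 10 THEN 'frequent champion'
               ELSE 'occasional champion'
           END AS category
    FROM clubs
"""

# Each row is a (name, category) pair, so dict() maps names to categories
categories = dict(conn.execute(query).fetchall())
conn.close()

for name, category in categories.items():
    print(name, "->", category)
```

Note that the order of the branches matters: if the `>= 10` branch came first, Manchester United FC would never reach the `>= 20` branch.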


## COALESCE Function

The COALESCE function accepts any number of arguments and returns the first one that is not NULL (or NULL if all of them are NULL). The main purpose of this function is to help us deal with the NULL values in a table.

Notice that neither table contains NULL values. We'll modify a couple of rows in the clubs table so that it does.

In [7]:
insert_null_values = '''
                     UPDATE clubs
                     SET times_champion = NULL
                     WHERE name IN ('Chelsea FC', 'Manchester City FC')
                     '''

cursor.execute(insert_null_values)
# Displays the clubs table
clubs = get_data('SELECT * FROM clubs')
clubs

Unnamed: 0,club_id,name,stadium_name,location,times_champion
0,1,Liverpool FC,Anfield,Liverpool,19.0
1,2,Manchester United FC,Old Trafford,Manchester,20.0
2,5,Tottenham Hotspur FC,Tottenham Hotspur Stadium,London,2.0
3,3,Manchester City FC,Etihad Stadium,Manchester,
4,4,Chelsea FC,Stamford Bridge,London,


## Ex. 2

Imagine that, for some reason, we need to add one to all the times_champion values. Also, let's suppose that some clubs have a NULL value in the times_champion field because they have never won the Premier League (even though we know that Chelsea and Man. City have, in fact, won the league several times).

Let's see what happens if we just try to compute the sum without paying attention to the NULL values in the table.

In [8]:
query_2 = '''
          SELECT name, (times_champion + 1) AS new_values
          FROM clubs      
          '''

new_values = get_data(query_2)
new_values

Unnamed: 0,name,new_values
0,Liverpool FC,20.0
1,Manchester United FC,21.0
2,Tottenham Hotspur FC,3.0
3,Manchester City FC,
4,Chelsea FC,


Note that SQL could not calculate the new value for the clubs whose times_champion is NULL. This is because any arithmetic operation that involves a NULL value returns NULL.

We can overcome this problem by using the COALESCE function.

In [9]:
query_3 = '''
          SELECT name, (COALESCE(times_champion, 0) + 1) AS new_values
          FROM clubs      
          '''

new_values = get_data(query_3)
new_values

Unnamed: 0,name,new_values
0,Liverpool FC,20
1,Manchester United FC,21
2,Tottenham Hotspur FC,3
3,Manchester City FC,1
4,Chelsea FC,1


This way, if the times_champion value happens to be NULL, the COALESCE function will replace it with a zero.
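
COALESCE is not limited to two arguments: it walks through them from left to right and stops at the first one that is not NULL. The following sketch shows this with a fallback chain. It assumes an in-memory SQLite database instead of PostgreSQL, and the rows for "Club A" and "Club B" are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clubs (name TEXT, stadium_name TEXT, location TEXT)")
conn.executemany(
    "INSERT INTO clubs VALUES (?, ?, ?)",
    [
        ("Liverpool FC", "Anfield", "Liverpool"),
        ("Club A", None, "London"),  # hypothetical row: stadium is missing
        ("Club B", None, None),      # hypothetical row: both are missing
    ],
)

# Arithmetic with NULL yields NULL, which Python receives as None
null_sum = conn.execute("SELECT NULL + 1").fetchone()[0]

# Falls back from stadium_name to location to a fixed default value
query = """
    SELECT name, COALESCE(stadium_name, location, 'unknown') AS venue
    FROM clubs
"""
venues = dict(conn.execute(query).fetchall())
conn.close()

print(null_sum)
print(venues)
```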

## CAST Function

The CAST function converts a value into a specified data type. Keep in mind that the conversion we try to perform has to make sense; for example, there is no way to convert the string _hola_ into an integer.


## Ex. 3 

We can convert the club_id values into strings.

In [10]:
query_4 = 'SELECT CAST(club_id AS VARCHAR) FROM clubs'

club_id_strings = get_data(query_4)
club_id_strings

Unnamed: 0,club_id
0,1
1,2
2,5
3,3
4,4


Let's make sure that these values are, in fact, strings.

In [11]:
value = club_id_strings.at[0,'club_id']
value, type(value)

('1', str)
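
The same check can be done for conversions in the other direction. The cell below is a small sketch that casts literals to several types and inspects the Python types we get back. It assumes an in-memory SQLite database as a stand-in for PostgreSQL. Keep in mind that the two engines differ on casts that do not make sense: PostgreSQL raises an error, while SQLite is more lenient, so only sensible conversions are shown here.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption)
conn = sqlite3.connect(":memory:")

# Integer to text, numeric text to integer, numeric text to real
row = conn.execute(
    "SELECT CAST(7 AS TEXT), CAST('42' AS INTEGER), CAST('3.5' AS REAL)"
).fetchone()
conn.close()

# Pairs each converted value with the Python type it arrives as
converted = [(value, type(value).__name__) for value in row]
print(converted)
```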

## CREATE VIEW Statement

Imagine that, in the project we're working on, we end up running a specific query over and over again. The CREATE VIEW statement lets us access the output of that query as if it were a table in the database.

## Ex. 4

Consider the following query


In [12]:
query_5 = '''
          SELECT player_id, first_name, last_name, name AS club_name, times_champion     
          FROM clubs INNER JOIN players
          ON clubs.club_id = players.club_id
          '''
joined_data = get_data(query_5)
joined_data

Unnamed: 0,player_id,first_name,last_name,club_name,times_champion
0,1,Mohamed,Salah,Liverpool FC,19.0
1,2,Sadio,Mane,Liverpool FC,19.0
2,3,Marcus,Rashford,Manchester United FC,20.0
3,4,Sergio,Aguero,Manchester City FC,
4,5,Timo,Werner,Chelsea FC,
5,6,Harry,Kane,Tottenham Hotspur FC,2.0


We can access this data in a simpler manner. To do so, we need to create a VIEW.

In [13]:
create_view = '''
              CREATE VIEW joined_data AS
              SELECT player_id, first_name, last_name, name AS club_name, times_champion     
              FROM clubs INNER JOIN players
              ON clubs.club_id = players.club_id
              '''

cursor.execute(create_view)

Now, we can do the following.

In [14]:
joined_data = get_data('SELECT * FROM joined_data')
joined_data

Unnamed: 0,player_id,first_name,last_name,club_name,times_champion
0,1,Mohamed,Salah,Liverpool FC,19.0
1,2,Sadio,Mane,Liverpool FC,19.0
2,3,Marcus,Rashford,Manchester United FC,20.0
3,4,Sergio,Aguero,Manchester City FC,
4,5,Timo,Werner,Chelsea FC,
5,6,Harry,Kane,Tottenham Hotspur FC,2.0
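
It is worth noting that a view stores the query, not a copy of its result, so it always reflects the current content of the underlying tables. The sketch below illustrates this. It assumes an in-memory SQLite database instead of PostgreSQL, uses trimmed-down versions of our tables, and the updated value is hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption)
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE clubs (club_id INTEGER PRIMARY KEY, name TEXT, times_champion INTEGER);
    CREATE TABLE players (player_id INTEGER PRIMARY KEY, last_name TEXT, club_id INTEGER);
    INSERT INTO clubs VALUES (1, 'Liverpool FC', 19);
    INSERT INTO players VALUES (1, 'Salah', 1);

    CREATE VIEW joined_data AS
        SELECT last_name, name AS club_name, times_champion
        FROM clubs INNER JOIN players
        ON clubs.club_id = players.club_id;
    """
)

before = conn.execute("SELECT * FROM joined_data").fetchall()

# Changes the underlying table (hypothetical value); the view is not touched
conn.execute("UPDATE clubs SET times_champion = 20 WHERE club_id = 1")

# The view runs its query again, so it sees the updated value
after = conn.execute("SELECT * FROM joined_data").fetchall()
conn.close()

print(before)
print(after)
```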


## Importing Data

In this section we'll show you how to load a .csv file into an already existing table in the database. We've already created a .csv file for this example; despite its name, it contains information about other clubs in the Premier League. In the following cell we use pandas to show its content.

In [15]:
file_path = 'csv_files/new_players.csv'
new_players = pd.read_csv(file_path)
new_players

Unnamed: 0,club_id,name,stadium_name,location,times_champion
0,6,Wolverhampton Wanderers FC,Molineux Stadium,Waterloo Road,3
1,7,West Ham United FC,Olympic Stadium,London,0
2,8,Arsenal FC,Emirates Stadium,London,13


## Ex. 5

Now that we know the content of the file, we can load it into the clubs table in the database. To do so, we use the copy_from method of the cursor object.

In [16]:
with open(file_path, 'r') as new_table:
    # Skips the header row
    next(new_table) 
    cursor.copy_from(new_table, 'clubs', sep=',')

clubs = get_data('SELECT * FROM clubs')
clubs

Unnamed: 0,club_id,name,stadium_name,location,times_champion
0,1,Liverpool FC,Anfield,Liverpool,19.0
1,2,Manchester United FC,Old Trafford,Manchester,20.0
2,5,Tottenham Hotspur FC,Tottenham Hotspur Stadium,London,2.0
3,3,Manchester City FC,Etihad Stadium,Manchester,
4,4,Chelsea FC,Stamford Bridge,London,
5,6,Wolverhampton Wanderers FC,Molineux Stadium,Waterloo Road,3.0
6,7,West Ham United FC,Olympic Stadium,London,0.0
7,8,Arsenal FC,Emirates Stadium,London,13.0
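
Instead of discarding the header row, we could read it and pass the column names to copy_from through its `columns` parameter, so that the load does not depend on the file having the same column order as the table. The cell below is a sketch of that idea. `load_csv` is a hypothetical helper, and `RecordingCursor` is a hypothetical stand-in that only records the call so the example runs without a database; with a real connection we would pass the psycopg2 cursor instead.

```python
import io


class RecordingCursor:
    """Hypothetical stand-in for a psycopg2 cursor; it only records the call."""

    def copy_from(self, file, table, sep="\t", null="\\N", size=8192, columns=None):
        self.table = table
        self.columns = columns
        self.rows = file.read().splitlines()


def load_csv(cursor, csv_file, table):
    """Reads the header to get the column names, then copies the remaining lines."""
    header = next(csv_file)
    columns = [name.strip() for name in header.split(",")]
    cursor.copy_from(csv_file, table, sep=",", columns=columns)
    return columns


# In-memory file with the same layout as the .csv file used above
sample = io.StringIO(
    "club_id,name,stadium_name,location,times_champion\n"
    "8,Arsenal FC,Emirates Stadium,London,13\n"
)

stub = RecordingCursor()
used_columns = load_csv(stub, sample, "clubs")
print(used_columns)
print(stub.rows)
```

Keep in mind that copy_from splits each line on the separator and does not interpret CSV quoting, so this approach assumes that no field contains a comma.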


## Exporting Data

We can also export a table to a .csv file very easily using the copy_to method of the cursor object.

In [17]:
with open('csv_files/exported_table.csv', 'w') as new_file:
    cursor.copy_to(new_file, 'clubs', sep = ',', null = 'NULL')

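We can use pandas to check the content of the file that we just created. Note that copy_to does not write a header row, so pandas takes the first record as the column names.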

In [18]:
exported_file = pd.read_csv('csv_files/exported_table.csv')
exported_file

Unnamed: 0,1,Liverpool FC,Anfield,Liverpool,19
0,2,Manchester United FC,Old Trafford,Manchester,20.0
1,5,Tottenham Hotspur FC,Tottenham Hotspur Stadium,London,2.0
2,3,Manchester City FC,Etihad Stadium,Manchester,
3,4,Chelsea FC,Stamford Bridge,London,
4,6,Wolverhampton Wanderers FC,Molineux Stadium,Waterloo Road,3.0
5,7,West Ham United FC,Olympic Stadium,London,0.0
6,8,Arsenal FC,Emirates Stadium,London,13.0
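
Since the exported file has no header row, one option is to supply the column names ourselves when reading it back. The following sketch does this with the standard csv module and also turns the `NULL` marker we chose in copy_to back into `None`. `read_export` is a hypothetical helper, and the in-memory sample only mimics the layout of the exported file.

```python
import csv
import io

# Column names of the clubs table, in the order they were exported
COLUMNS = ["club_id", "name", "stadium_name", "location", "times_champion"]


def read_export(csv_file, columns, null="NULL"):
    """Reads a header-less export and maps the null marker back to None."""
    reader = csv.DictReader(csv_file, fieldnames=columns)
    return [
        {key: (None if value == null else value) for key, value in row.items()}
        for row in reader
    ]


# In-memory sample with the same layout as the exported file
sample = io.StringIO(
    "1,Liverpool FC,Anfield,Liverpool,19\n"
    "4,Chelsea FC,Stamford Bridge,London,NULL\n"
)

records = read_export(sample, COLUMNS)
for record in records:
    print(record)
```

All the values come back as strings, so numeric columns such as times_champion would still need to be converted.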


Let's save the changes

In [19]:
connection.commit()
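
Until commit is called, the changes live inside an open transaction and can still be undone with a rollback. The sketch below shows the commit-or-rollback behaviour using the connection as a context manager. It assumes an in-memory SQLite database as a stand-in; to the best of our knowledge, psycopg2 connections used in a `with` block behave in a similar way (commit on success, rollback on error, without closing the connection).

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clubs (name TEXT UNIQUE NOT NULL)")
conn.commit()

try:
    # Commits if the block finishes, rolls back if it raises an exception
    with conn:
        conn.execute("INSERT INTO clubs VALUES ('Arsenal FC')")
        # Violates the UNIQUE constraint, so the whole block is rolled back
        conn.execute("INSERT INTO clubs VALUES ('Arsenal FC')")
except sqlite3.IntegrityError as error:
    print("Rolled back:", error)

# Neither insert survived the rollback
remaining = conn.execute("SELECT COUNT(*) FROM clubs").fetchone()[0]
conn.close()
print(remaining)
```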

Finally, we must close the cursor and the connection.

In [20]:
cursor.close()
connection.close()