### Setup
Please complete this exercise using sqlite3 and a Jupyter notebook.

Download the [SQLite database](https://www.kaggle.com/hugomathien/soccer/downloads/soccer.zip) and load it in your notebook using the sqlite3 library.

In [2]:
! ls -al

total 128
drwxr-xr-x  11 jadams  staff    352 Oct 31 15:22 .
drwxr-xr-x  12 jadams  staff    384 Oct 20 13:45 ..
drwxr-xr-x   4 jadams  staff    128 Oct 31 15:22 .ipynb_checkpoints
-rw-r--r--@  1 jadams  staff     32 Oct 24 10:28 02_SQL_answers.sql
-rw-r--r--   1 jadams  staff   2476 Oct 20 11:32 09_part_i_sql_w3school.md
-rw-r--r--   1 jadams  staff  34163 Oct 31 15:21 09_part_ii_baseball.ipynb
-rw-r--r--   1 jadams  staff   1625 Oct 22 12:55 09_part_ii_sql_baseball.md
-rw-r--r--   1 jadams  staff   1135 Oct 20 11:20 09_part_iii_sql_soccer.md
-rw-r--r--   1 jadams  staff   5915 Oct 20 11:20 09_part_iv_sql_tennis.md
-rw-r--r--   1 jadams  staff     72 Oct 31 15:22 Untitled.ipynb
drwxr-xr-x  28 jadams  staff    896 Oct 20 13:51 baseball


In [3]:
! mkdir soccer

In [8]:
!wget https://www.kaggle.com/hugomathien/soccer/downloads/soccer.zip -P 'soccer/'

--2018-10-31 15:24:14--  https://www.kaggle.com/hugomathien/soccer/downloads/soccer.zip
Resolving www.kaggle.com (www.kaggle.com)... 23.96.207.25
Connecting to www.kaggle.com (www.kaggle.com)|23.96.207.25|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: /account/login?returnUrl=%2Fhugomathien%2Fsoccer%2Fdata [following]
--2018-10-31 15:24:15--  https://www.kaggle.com/account/login?returnUrl=%2Fhugomathien%2Fsoccer%2Fdata
Reusing existing connection to www.kaggle.com:443.
HTTP request sent, awaiting response... 200 OK
Length: 6849 (6.7K) [text/html]
Saving to: ‘soccer/soccer.zip’


2018-10-31 15:24:15 (37.1 MB/s) - ‘soccer/soccer.zip’ saved [6849/6849]
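
The wget log above shows a 302 redirect to a login page and a 6.7K `text/html` response, which suggests the saved `soccer/soccer.zip` may be an HTML page rather than the archive. A minimal sketch of a sanity check before unzipping, using only the standard library (the helper name `looks_like_zip` is hypothetical):

```python
import zipfile
from pathlib import Path


def looks_like_zip(path):
    """Return True if `path` exists and is a readable zip archive."""
    p = Path(path)
    return p.is_file() and zipfile.is_zipfile(str(p))


# Path taken from the wget command above.
archive = Path("soccer/soccer.zip")
if looks_like_zip(archive):
    print(f"{archive} is a zip archive")
else:
    print(f"{archive} is missing or not a zip archive (possibly an HTML login page)")
```

If the check fails, downloading the file manually while logged in to Kaggle and placing it in `soccer/` is one way to proceed.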



In [15]:
! unzip soccer/soccer.zip -d 'soccer/'
# ! unzip baseball/lahman-csv_2014-02-14.zip -d 'baseball/'

Archive:  soccer/soccer.zip
  inflating: soccer/database.sqlite  


In [25]:
from sqlalchemy import create_engine, inspect
import pandas as pd

engine = create_engine('sqlite:///soccer/database.sqlite')
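
The setup asks for the sqlite3 library, while the cell above goes through SQLAlchemy. As a rough sketch of the standard-library route, the snippet below opens a connection and lists the tables by querying `sqlite_master`. It uses an in-memory database with a toy `Team` table so it runs anywhere; the helper name `list_tables` is hypothetical.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the connected SQLite database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# An in-memory database with a toy table keeps the sketch runnable anywhere;
# for the exercise, connect to 'soccer/database.sqlite' instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Team (team_api_id INTEGER, team_long_name TEXT)")
print(list_tables(conn))
conn.close()
```

A `sqlite3` connection can also be passed to `pd.read_sql` in place of the SQLAlchemy engine.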

### Question 1

1. Which team scored the most points when playing at home?  

In [24]:
inspector = inspect(engine)
for table_name in inspector.get_table_names():
    print(table_name)

Country
League
Match
Player
Player_Attributes
Team
Team_Attributes
sqlite_sequence


In [37]:
sql = """
            SELECT t.team_long_name as team_name, sum(m.home_team_goal) as goals_scored_at_home
            FROM Match as m
            INNER JOIN Team as t
                ON m.home_team_api_id = t.team_api_id
            GROUP BY team_name
            ORDER BY goals_scored_at_home DESC
            LIMIT 1;
    """

pd.read_sql(sql, engine)

Unnamed: 0,team_name,goals_scored_at_home
0,Real Madrid CF,505


Real Madrid CF scored the most home goals (505).

### Question 2

2. Did this team also score the most points when playing away?  

In [38]:
sql = """
            SELECT t.team_long_name as team_name, sum(m.away_team_goal) as goals_scored_away
            FROM Match as m
            INNER JOIN Team as t
                ON m.away_team_api_id = t.team_api_id
            GROUP BY team_name
            ORDER BY goals_scored_away DESC
            LIMIT 1;
    """

pd.read_sql(sql, engine)

Unnamed: 0,team_name,goals_scored_away
0,FC Barcelona,354


No. FC Barcelona, Real Madrid's arch-rival, scored the most away goals (354).

### Question 3
3. How many matches resulted in a tie?  

In [43]:
sql = """
            SELECT COUNT(1) as num_matches_tied
            FROM Match as m
            WHERE home_team_goal = away_team_goal;
    """

pd.read_sql(sql, engine)

Unnamed: 0,num_matches_tied
0,6596


6,596 matches ended in a tie.

### Question 4
4. How many players have Smith for their last name? How many have 'smith' anywhere in their name?

In [50]:
sql = """
            SELECT COUNT(1) as num_players
            FROM Player
            WHERE player_name LIKE '%smith';
    """

pd.read_sql(sql, engine)

Unnamed: 0,num_players
0,18


In [49]:
sql = """
            SELECT COUNT(1) as num_players
            FROM Player
            WHERE player_name LIKE '%smith%';
    """

pd.read_sql(sql, engine)

Unnamed: 0,num_players
0,18


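There are 18 players whose name ends in 'smith' (SQLite's `LIKE` is case-insensitive for ASCII letters), which is taken here to mean a last name of Smith. Because both queries return 18, these are also the only players with 'smith' anywhere in their name.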
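
Note that `'%smith'` also matches surnames that merely end in "smith". The sketch below contrasts three patterns on a toy `Player` table; the names are hypothetical and chosen only to separate the cases, and `count_like` is an illustrative helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Player (player_name TEXT)")
# Hypothetical names chosen to separate the three patterns.
conn.executemany(
    "INSERT INTO Player (player_name) VALUES (?)",
    [("Alan Smith",), ("Tom Goldsmith",), ("Smithy Jones",), ("Ann Lee",)],
)


def count_like(pattern):
    """Count players whose name matches the given LIKE pattern."""
    (n,) = conn.execute(
        "SELECT COUNT(1) FROM Player WHERE player_name LIKE ?", (pattern,)
    ).fetchone()
    return n


print(count_like("% smith"))  # last word is exactly "smith"
print(count_like("%smith"))   # name ends in "smith" (includes Goldsmith)
print(count_like("%smith%"))  # "smith" anywhere in the name
```

Running `'% smith'` against the real `Player` table would show whether all 18 matches are exact "Smith" surnames.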

### Question 5
5. What was the median tie score? Use the value determined in the previous question for the number of tie games. *Hint:* PostgreSQL does not have a median function. Instead, think about the steps required to calculate a median and use the [`WITH`](https://www.postgresql.org/docs/8.4/static/queries-with.html) command to store stepwise results as a table and then operate on these results. 

In [86]:
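# Sort tied matches by total goals and keep only the lower half (COUNT / 2 rows).
# This does not return the median directly: the median sits at the end of
# this result (its last row), not at the top.
sql_sort_total_goals = """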
        WITH middle AS (
            SELECT COUNT(1) / 2 as num_matches_tied
            FROM Match as m
            WHERE home_team_goal = away_team_goal
            )

        SELECT  home_team_goal 
            ,   away_team_goal
            ,   home_team_goal + away_team_goal AS total_goals_scored
        FROM Match
        WHERE home_team_goal = away_team_goal
        ORDER BY total_goals_scored
        LIMIT (SELECT num_matches_tied FROM middle);
    """

pd.read_sql(sql_sort_total_goals, engine)

Unnamed: 0,home_team_goal,away_team_goal,total_goals_scored
0,0,0,0
1,0,0,0
2,0,0,0
3,0,0,0
4,0,0,0
5,0,0,0
6,0,0,0
7,0,0,0
8,0,0,0
9,0,0,0
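
The query above lists the lower half of the sorted tie games rather than a single median value. One possible way to get the median in a single statement is to pick the middle one or two rows with `LIMIT`/`OFFSET` and average them. This is a sketch on toy data, assuming "tie score" means total goals in the match (as in the query above); `MEDIAN_SQL` and `median_tie_total` are hypothetical names.

```python
import sqlite3

# Median of total goals across tied matches. The middle one or two rows of the
# sorted list are selected with LIMIT/OFFSET and then averaged.
MEDIAN_SQL = """
    WITH ties AS (
        SELECT home_team_goal + away_team_goal AS total_goals
        FROM Match
        WHERE home_team_goal = away_team_goal
    ),
    middle AS (
        SELECT total_goals
        FROM ties
        ORDER BY total_goals
        LIMIT 2 - (SELECT COUNT(*) FROM ties) % 2
        OFFSET (SELECT (COUNT(*) - 1) / 2 FROM ties)
    )
    SELECT AVG(total_goals) AS median_total_goals FROM middle;
"""


def median_tie_total(conn):
    """Return the median total goals over tied matches, or None if there are none."""
    (median,) = conn.execute(MEDIAN_SQL).fetchone()
    return median


# Toy data (hypothetical scores), including one non-tie that must be ignored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Match (home_team_goal INTEGER, away_team_goal INTEGER)")
conn.executemany(
    "INSERT INTO Match VALUES (?, ?)",
    [(0, 0), (0, 0), (1, 1), (2, 2), (3, 1)],
)
print(median_tie_total(conn))
```

With an even number of ties the two middle rows are averaged; with an odd number the single middle row is returned. The same SQL string could be passed to `pd.read_sql(MEDIAN_SQL, engine)` against the real database.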
