# Goal
- Look at the [Kaggle page](https://www.kaggle.com/laudanum/footballdelphi)
    - Load the data into a SQL DB
- Information Needed
    - Name of the team
    - Total number of goals scored in the 2011 season
    - Total number of wins in the 2011 season
    - Histogram visualization of the team's wins and losses for the 2011 season
    - Team's win percentage on days when it rained during the game in the 2011 season
    

# Getting Weather Data
- This section supports the last item under "Information Needed"
- Weather data is needed to find the team's win percentage on rainy days
- Use the [DarkSky API](https://darksky.net/dev) to get historical weather data
    - Use the weather in Berlin, Germany as a proxy
    - If it rained in Berlin on a game day, count that game as a rain game
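
The rain-game rule above can be sketched without making any network call. This is a minimal sketch under several assumptions that are not in the original notes: the approximate Berlin coordinates, the hypothetical helper names (`darksky_url`, `is_rain_day`, `rain_win_pct`), match dates stored as `YYYY-MM-DD` strings, and a response whose first daily data point carries an `icon` field equal to `"rain"` on rainy days. The Dark Sky API has since been retired, so the block only illustrates the request shape and the parsing logic.

```python
from datetime import datetime, timezone

BERLIN_LAT, BERLIN_LON = 52.52, 13.405  # approximate coordinates (assumption)


def darksky_url(api_key, date_str):
    """Build a Time Machine style request URL for noon UTC on a YYYY-MM-DD date."""
    day = datetime.strptime(date_str, "%Y-%m-%d").replace(hour=12, tzinfo=timezone.utc)
    return (
        f"https://api.darksky.net/forecast/{api_key}/"
        f"{BERLIN_LAT},{BERLIN_LON},{int(day.timestamp())}?exclude=currently,hourly"
    )


def is_rain_day(payload):
    """Return True when the first daily data point is flagged as rain (assumed shape)."""
    days = payload.get("daily", {}).get("data", [])
    return bool(days) and days[0].get("icon") == "rain"


def rain_win_pct(results, rainy_dates):
    """results: iterable of (date, outcome) pairs with outcome in {"W", "D", "L"}."""
    rain_games = [outcome for date, outcome in results if date in rainy_dates]
    if not rain_games:
        return None  # no rain games, so the percentage is undefined
    return 100 * rain_games.count("W") / len(rain_games)
```

In use, `rainy_dates` would be the set of match dates for which `is_rain_day` returned `True`, and `results` would hold one team's 2011 outcomes keyed by date.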

In [448]:
import pandas as pd
import requests
import sqlite3
import seaborn as sns

In [465]:
# Open a connection (and a cursor) to the soccer SQLite DB
conn = sqlite3.connect("Data/soccer.sqlite")
cur = conn.cursor()

# Per home team in 2011: goals scored at home (home_goals) and goals
# scored by visiting teams in those same games (away_goals).
# Only grouped or aggregated columns are selected, so every value is well defined.
soccer_df = pd.read_sql_query("""
SELECT HomeTeam, SUM(FTHG) AS home_goals, SUM(FTAG) AS away_goals
FROM Matches
WHERE Season = 2011
GROUP BY HomeTeam
""", conn)


# GOALS TOTAL

In [260]:
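# Total goals scored by each team in 2011: home goals when the team is at home,
# away goals when it is away. Assumes TeamName uses the same spelling as
# HomeTeam/AwayTeam in Matches.
goals_total_df = pd.read_sql_query("""
SELECT TeamName,
       SUM(CASE WHEN HomeTeam = TeamName THEN FTHG ELSE FTAG END) AS goals_total
FROM Unique_Teams
LEFT JOIN Teams_in_Matches
USING(Unique_Team_ID)
LEFT JOIN Matches
USING(Match_ID)
WHERE Season = 2011
GROUP BY TeamName -- one row per team
""", conn)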

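As a cross-check on the per-team goal total, the same figure can be derived from `Matches` alone by stacking home and away rows. The sketch below runs on a toy in-memory table with made-up team names (`A`, `B`, `C`); only the column names `Season`, `HomeTeam`, `AwayTeam`, `FTHG`, and `FTAG` come from the queries above.

```python
import sqlite3

# Toy in-memory table; rows are illustrative, not real results
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE Matches (Season INTEGER, HomeTeam TEXT, AwayTeam TEXT, "
    "FTHG INTEGER, FTAG INTEGER)"
)
demo.executemany("INSERT INTO Matches VALUES (?, ?, ?, ?, ?)", [
    (2011, "A", "B", 2, 1),
    (2011, "B", "A", 0, 3),
    (2011, "A", "C", 1, 1),
    (2010, "A", "B", 5, 0),  # other season, must be ignored
])

# One row per (team, match) with the goals that team scored, then sum per team
goals_scored = dict(demo.execute("""
SELECT team, SUM(goals) AS goals_scored
FROM (
    SELECT HomeTeam AS team, FTHG AS goals FROM Matches WHERE Season = 2011
    UNION ALL
    SELECT AwayTeam AS team, FTAG AS goals FROM Matches WHERE Season = 2011
)
GROUP BY team
""").fetchall())
```

Summing `FTHG + FTAG` per match would instead count both sides' goals, which is why the home and away halves are kept separate here.
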
# HOME AND AWAY WINS

In [276]:
# Finding total wins for home teams
home_wins_df = pd.read_sql_query("""
SELECT HomeTeam as team, COUNT(FTR) as total_home_wins
FROM Matches
WHERE Season = 2011 AND FTR = 'H'
GROUP BY HomeTeam -- one row per team
""", conn)

In [None]:
# Finding total wins for away teams
away_wins_df = pd.read_sql_query("""
SELECT AwayTeam as team, COUNT(FTR) as total_away_wins
FROM Matches
WHERE Season = 2011 AND FTR = 'A'
GROUP BY AwayTeam -- one row per team
""", conn)

# HOME AND AWAY DRAWS

In [344]:
# Finding total draws for away teams
away_draws_df = pd.read_sql_query("""
SELECT AwayTeam as team, COUNT(FTR) as total_away_draws
FROM Matches
WHERE Season = 2011 AND FTR = 'D'
GROUP BY AwayTeam -- one row per team
""", conn)

In [345]:
# Finding total draws for home teams
home_draws_df = pd.read_sql_query("""
SELECT HomeTeam as team, COUNT(FTR) as total_home_draws
FROM Matches
WHERE Season = 2011 AND FTR = 'D'
GROUP BY HomeTeam -- one row per team
""", conn)

# HOME AND AWAY LOSSES

In [349]:
# Finding total losses for home teams (the away side won)
home_loss_df = pd.read_sql_query("""
SELECT HomeTeam as team, COUNT(FTR) as home_loss
FROM Matches
WHERE Season = 2011 AND FTR = 'A'
GROUP BY HomeTeam -- one row per team
""", conn)

In [370]:
# Finding total losses for away teams (the home side won)
away_loss_df = pd.read_sql_query("""
SELECT AwayTeam as team, COUNT(FTR) as away_loss
FROM Matches
WHERE Season = 2011 AND FTR = 'H'
GROUP BY AwayTeam -- one row per team
""", conn)

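The per-outcome queries above can also be collapsed into a single query that returns wins, draws, and losses per team, including teams whose count for some outcome is zero. This is a sketch on a toy in-memory table with made-up teams; the `result` column and the `W`/`D`/`L` codes are illustrative names, while `FTR` and its `H`/`A`/`D` values come from the queries above.

```python
import sqlite3

# Toy in-memory table; rows are illustrative, not real results
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE Matches (Season INTEGER, HomeTeam TEXT, AwayTeam TEXT, FTR TEXT)"
)
demo.executemany("INSERT INTO Matches VALUES (?, ?, ?, ?)", [
    (2011, "A", "B", "H"),
    (2011, "B", "A", "D"),
    (2011, "C", "A", "A"),
    (2011, "B", "C", "H"),
])

# Translate FTR into each side's own result, stack both sides, then count
record = demo.execute("""
SELECT team,
       SUM(result = 'W') AS wins,
       SUM(result = 'D') AS draws,
       SUM(result = 'L') AS losses
FROM (
    SELECT HomeTeam AS team,
           CASE FTR WHEN 'H' THEN 'W' WHEN 'A' THEN 'L' ELSE 'D' END AS result
    FROM Matches WHERE Season = 2011
    UNION ALL
    SELECT AwayTeam AS team,
           CASE FTR WHEN 'A' THEN 'W' WHEN 'H' THEN 'L' ELSE 'D' END AS result
    FROM Matches WHERE Season = 2011
)
GROUP BY team
ORDER BY team
""").fetchall()
```

Because every team appears in the stacked rows, a team with no losses still gets a row with `losses = 0` instead of disappearing from the result.
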
# TOTAL LOSSES

In [421]:
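# Element-wise sum; pandas aligns on the row index, so this assumes both
# frames list the same teams in the same order
total_loss = away_loss_df.away_loss + home_loss_df.home_loss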

In [443]:
total_loss = pd.DataFrame(total_loss)
total_loss.head()

|  | 0 |
|---|---|
| 0 | 15.0 |
| 1 | 10.0 |
| 2 | 14.0 |
| 3 | 12.0 |
| 4 | 7.0 |
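
The index-aligned addition works only while both loss frames contain the same teams in the same row order; a team with no home losses, for example, would be missing from `home_loss_df` and shift every later row. A more defensive variant, sketched here on made-up data with hypothetical frame names, joins on the `team` key and fills missing counts with zero.

```python
import pandas as pd

# Toy frames: team "C" has no home losses, so it is absent from home_loss
home_loss = pd.DataFrame({"team": ["A", "B"], "home_loss": [3, 1]})
away_loss = pd.DataFrame({"team": ["A", "B", "C"], "away_loss": [2, 5, 6]})

# Outer join keeps every team; a missing count becomes 0 instead of NaN
losses = home_loss.merge(away_loss, on="team", how="outer").fillna(0)
losses["total_loss"] = (losses["home_loss"] + losses["away_loss"]).astype(int)
```

Keeping `team` as a column in the result also means the totals can later be merged into other per-team frames by name instead of by position.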


# TEAM INFO: TEAM NAMES, WINS, AND LOSSES

In [444]:
# Inner join on the shared "team" column
team_info = pd.merge(left=away_wins_df, right=home_wins_df, on="team")

In [445]:
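# Sum only the numeric win columns; the string "team" column must be excluded
team_wins = team_info[["total_away_wins", "total_home_wins"]].sum(axis=1)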

In [446]:
team_info["team_wins"] = team_wins

In [447]:
team_info['total_loss'] = total_loss
team_info

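|  | team | total_away_wins | total_home_wins | team_wins | total_loss |
|---|---|---|---|---|---|
| 0 | Aachen | 2 | 4 | 6 | 15.0 |
| 1 | Arsenal | 9 | 12 | 21 | 10.0 |
| 2 | Aston Villa | 3 | 4 | 7 | 14.0 |
| 3 | Augsburg | 2 | 6 | 8 | 12.0 |
| 4 | Bayern Munich | 9 | 14 | 23 | 7.0 |
| 5 | Blackburn | 2 | 6 | 8 | 23.0 |
| 6 | Bochum | 3 | 7 | 10 | 17.0 |
| 7 | Bolton | 6 | 4 | 10 | 22.0 |
| 8 | Braunschweig | 4 | 6 | 10 | 9.0 |
| 9 | Chelsea | 6 | 12 | 18 | 10.0 |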


# TEAM INFO FOR DRAWS

In [None]:
# Histogram of the team's win/loss for the 2011 season