# Football Database

In this notebook I analyze a database that contains European football data.

The database is composed of 7 tables, which I describe below:

* `Country`: 11 rows and 2 columns

    - id: Country id
    - name: Name of the country

* `League`: 11 rows and 3 columns

    - id: League id
    - country_id: Country id of the League
    - name: Name of the League

* `Match`: 26k rows and 115 columns

    - id: Id of the match
    - country_id: Id of the country
    - league_id: Id of the league
    - season: Season the match happened in (goes from 2008/2009 to 2015/2016 season)
    - home_team_goal
    - others
    
* `Player`: 11.1k rows and 7 columns

    - id: Id of the player
    - player_name: Name of the player
    - birthday: Date of birth of player
    - height: Height of the player
    - weight: Weight of the player
    - others
    
* `Player_Attributes`: 184k rows and 42 columns
 
    - overall_rating
    - potential
    - preferred_foot
    - attacking_work_rate
    - defensive_work_rate
    - crossing
    - others
    
* `Team`: 299 rows and 5 columns

    - id: Id of the team
    - team_api_id
    - team_fifa_api_id
    - team_long_name
    - team_short_name

* `Team_Attributes`: 1458 rows and 25 columns

    - id: Id of the attribute record (the team itself is referenced through team_api_id)
    - buildUpPlaySpeed
    - buildUpPlaySpeedClass
    - buildUpPlayDribbling
    - buildUpPlayDribblingClass
    - buildUpPlayPassing
    - buildUpPlayPassingClass

This is an extensive dataset, with more than 11,000 players, about 300 teams and more than 25k matches. The attributes listed above are the main ones I will use to answer the questions below.
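
The row and column counts listed above can be checked directly against the database. The following is a minimal sketch, assuming Python's built-in `sqlite3` module rather than the `%sql` magic; the `table_shapes` helper and the tiny in-memory tables are hypothetical stand-ins so the example runs without the real file.

```python
import sqlite3


def table_shapes(conn, tables):
    """Return {table_name: (row_count, column_count)} for the given tables."""
    shapes = {}
    for name in tables:
        # Quote the identifier so names such as Match are always safe to use.
        quoted = '"' + name.replace('"', '""') + '"'
        n_rows = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        # LIMIT 0 returns no rows but still exposes the column metadata.
        cursor = conn.execute(f"SELECT * FROM {quoted} LIMIT 0")
        n_cols = len(cursor.description)
        shapes[name] = (n_rows, n_cols)
    return shapes


# Toy stand-in for database.sqlite (assumption: only two small tables).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Country (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE League (id INTEGER PRIMARY KEY, country_id INTEGER, name TEXT UNIQUE);
    INSERT INTO Country VALUES (1729, 'England'), (21518, 'Spain');
    INSERT INTO League VALUES (1729, 1729, 'England Premier League');
    """
)
shapes = table_shapes(conn, ["Country", "League"])
print(shapes)
```

Against the real file, the same helper could be called with the seven table names from the list above.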

## Goals

The main goal of this analysis is to use SQL (SQLite) to extract analytical information, answer specific questions and provide insights. The topics I address are the following:

* Best teams per league
* Best league: I will focus on the 5 biggest and best known leagues: Spain, France, Germany, England and Italy
* Best teams overall
* Players with best features
* Relationship between players and their team's play


In [1]:
# Installing ipython-sql to use SQL with Python
!conda install -yc conda-forge ipython-sql

**Connecting the Jupyter notebook to the database file: database.sqlite**

In [12]:
%%capture
%load_ext sql
%sql sqlite:///database.sqlite
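
Outside the notebook magic, the same file can be opened with Python's built-in `sqlite3` module. A plain `sqlite3.connect` silently creates an empty database when the file name is mistyped, so the sketch below opens file paths in `mode=rw`, which fails if the file is missing. The `open_db` helper is a hypothetical name, and the demo uses an in-memory database so it runs without the real file.

```python
import sqlite3
from pathlib import Path


def open_db(path=":memory:"):
    """Open a SQLite database; for a file path, refuse to create a missing file."""
    if path == ":memory:":
        conn = sqlite3.connect(path)
    else:
        # mode=rw opens an existing file for reading and writing and raises
        # sqlite3.OperationalError if the file does not exist.
        uri = Path(path).resolve().as_uri() + "?mode=rw"
        conn = sqlite3.connect(uri, uri=True)
    conn.row_factory = sqlite3.Row  # rows can be indexed by column name
    return conn


# In-memory demo; with the real file this would be open_db("database.sqlite").
conn = open_db()
row = conn.execute("SELECT sqlite_version() AS version").fetchone()
print(row["version"])
```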

## Extracting information from the database

In [13]:
%%sql
SELECT *
FROM sqlite_master
WHERE type='table';

 * sqlite:///database.sqlite
Done.


type,name,tbl_name,rootpage,sql
table,sqlite_sequence,sqlite_sequence,4,"CREATE TABLE sqlite_sequence(name,seq)"
table,Player_Attributes,Player_Attributes,11,"CREATE TABLE ""Player_Attributes"" ( 	`id`	INTEGER PRIMARY KEY AUTOINCREMENT, 	`player_fifa_api_id`	INTEGER, 	`player_api_id`	INTEGER, 	`date`	TEXT, 	`overall_rating`	INTEGER, 	`potential`	INTEGER, 	`preferred_foot`	TEXT, 	`attacking_work_rate`	TEXT, 	`defensive_work_rate`	TEXT, 	`crossing`	INTEGER, 	`finishing`	INTEGER, 	`heading_accuracy`	INTEGER, 	`short_passing`	INTEGER, 	`volleys`	INTEGER, 	`dribbling`	INTEGER, 	`curve`	INTEGER, 	`free_kick_accuracy`	INTEGER, 	`long_passing`	INTEGER, 	`ball_control`	INTEGER, 	`acceleration`	INTEGER, 	`sprint_speed`	INTEGER, 	`agility`	INTEGER, 	`reactions`	INTEGER, 	`balance`	INTEGER, 	`shot_power`	INTEGER, 	`jumping`	INTEGER, 	`stamina`	INTEGER, 	`strength`	INTEGER, 	`long_shots`	INTEGER, 	`aggression`	INTEGER, 	`interceptions`	INTEGER, 	`positioning`	INTEGER, 	`vision`	INTEGER, 	`penalties`	INTEGER, 	`marking`	INTEGER, 	`standing_tackle`	INTEGER, 	`sliding_tackle`	INTEGER, 	`gk_diving`	INTEGER, 	`gk_handling`	INTEGER, 	`gk_kicking`	INTEGER, 	`gk_positioning`	INTEGER, 	`gk_reflexes`	INTEGER, 	FOREIGN KEY(`player_fifa_api_id`) REFERENCES `Player`(`player_fifa_api_id`), 	FOREIGN KEY(`player_api_id`) REFERENCES `Player`(`player_api_id`) )"
table,Player,Player,14,"CREATE TABLE `Player` ( 	`id`	INTEGER PRIMARY KEY AUTOINCREMENT, 	`player_api_id`	INTEGER UNIQUE, 	`player_name`	TEXT, 	`player_fifa_api_id`	INTEGER UNIQUE, 	`birthday`	TEXT, 	`height`	INTEGER, 	`weight`	INTEGER )"
table,Match,Match,18,"CREATE TABLE `Match` ( 	`id`	INTEGER PRIMARY KEY AUTOINCREMENT, 	`country_id`	INTEGER, 	`league_id`	INTEGER, 	`season`	TEXT, 	`stage`	INTEGER, 	`date`	TEXT, 	`match_api_id`	INTEGER UNIQUE, 	`home_team_api_id`	INTEGER, 	`away_team_api_id`	INTEGER, 	`home_team_goal`	INTEGER, 	`away_team_goal`	INTEGER, 	`home_player_X1`	INTEGER, 	`home_player_X2`	INTEGER, 	`home_player_X3`	INTEGER, 	`home_player_X4`	INTEGER, 	`home_player_X5`	INTEGER, 	`home_player_X6`	INTEGER, 	`home_player_X7`	INTEGER, 	`home_player_X8`	INTEGER, 	`home_player_X9`	INTEGER, 	`home_player_X10`	INTEGER, 	`home_player_X11`	INTEGER, 	`away_player_X1`	INTEGER, 	`away_player_X2`	INTEGER, 	`away_player_X3`	INTEGER, 	`away_player_X4`	INTEGER, 	`away_player_X5`	INTEGER, 	`away_player_X6`	INTEGER, 	`away_player_X7`	INTEGER, 	`away_player_X8`	INTEGER, 	`away_player_X9`	INTEGER, 	`away_player_X10`	INTEGER, 	`away_player_X11`	INTEGER, 	`home_player_Y1`	INTEGER, 	`home_player_Y2`	INTEGER, 	`home_player_Y3`	INTEGER, 	`home_player_Y4`	INTEGER, 	`home_player_Y5`	INTEGER, 	`home_player_Y6`	INTEGER, 	`home_player_Y7`	INTEGER, 	`home_player_Y8`	INTEGER, 	`home_player_Y9`	INTEGER, 	`home_player_Y10`	INTEGER, 	`home_player_Y11`	INTEGER, 	`away_player_Y1`	INTEGER, 	`away_player_Y2`	INTEGER, 	`away_player_Y3`	INTEGER, 	`away_player_Y4`	INTEGER, 	`away_player_Y5`	INTEGER, 	`away_player_Y6`	INTEGER, 	`away_player_Y7`	INTEGER, 	`away_player_Y8`	INTEGER, 	`away_player_Y9`	INTEGER, 	`away_player_Y10`	INTEGER, 	`away_player_Y11`	INTEGER, 	`home_player_1`	INTEGER, 	`home_player_2`	INTEGER, 	`home_player_3`	INTEGER, 	`home_player_4`	INTEGER, 	`home_player_5`	INTEGER, 	`home_player_6`	INTEGER, 	`home_player_7`	INTEGER, 	`home_player_8`	INTEGER, 	`home_player_9`	INTEGER, 	`home_player_10`	INTEGER, 	`home_player_11`	INTEGER, 	`away_player_1`	INTEGER, 	`away_player_2`	INTEGER, 	`away_player_3`	INTEGER, 	`away_player_4`	INTEGER, 	`away_player_5`	INTEGER, 	`away_player_6`	INTEGER, 	`away_player_7`	INTEGER, 	
`away_player_8`	INTEGER, 	`away_player_9`	INTEGER, 	`away_player_10`	INTEGER, 	`away_player_11`	INTEGER, 	`goal`	TEXT, 	`shoton`	TEXT, 	`shotoff`	TEXT, 	`foulcommit`	TEXT, 	`card`	TEXT, 	`cross`	TEXT, 	`corner`	TEXT, 	`possession`	TEXT, 	`B365H`	NUMERIC, 	`B365D`	NUMERIC, 	`B365A`	NUMERIC, 	`BWH`	NUMERIC, 	`BWD`	NUMERIC, 	`BWA`	NUMERIC, 	`IWH`	NUMERIC, 	`IWD`	NUMERIC, 	`IWA`	NUMERIC, 	`LBH`	NUMERIC, 	`LBD`	NUMERIC, 	`LBA`	NUMERIC, 	`PSH`	NUMERIC, 	`PSD`	NUMERIC, 	`PSA`	NUMERIC, 	`WHH`	NUMERIC, 	`WHD`	NUMERIC, 	`WHA`	NUMERIC, 	`SJH`	NUMERIC, 	`SJD`	NUMERIC, 	`SJA`	NUMERIC, 	`VCH`	NUMERIC, 	`VCD`	NUMERIC, 	`VCA`	NUMERIC, 	`GBH`	NUMERIC, 	`GBD`	NUMERIC, 	`GBA`	NUMERIC, 	`BSH`	NUMERIC, 	`BSD`	NUMERIC, 	`BSA`	NUMERIC, 	FOREIGN KEY(`country_id`) REFERENCES `country`(`id`), 	FOREIGN KEY(`league_id`) REFERENCES `League`(`id`), 	FOREIGN KEY(`home_team_api_id`) REFERENCES `Team`(`team_api_id`), 	FOREIGN KEY(`away_team_api_id`) REFERENCES `Team`(`team_api_id`), 	FOREIGN KEY(`home_player_1`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_2`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_3`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_4`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_5`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_6`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_7`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_8`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_9`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_10`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_11`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_1`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_2`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_3`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_4`) REFERENCES `Player`(`player_api_id`), 	FOREIGN 
KEY(`away_player_5`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_6`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_7`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_8`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_9`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_10`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_11`) REFERENCES `Player`(`player_api_id`) )"
table,League,League,24,"CREATE TABLE `League` ( 	`id`	INTEGER PRIMARY KEY AUTOINCREMENT, 	`country_id`	INTEGER, 	`name`	TEXT UNIQUE, 	FOREIGN KEY(`country_id`) REFERENCES `country`(`id`) )"
table,Country,Country,26,"CREATE TABLE `Country` ( 	`id`	INTEGER PRIMARY KEY AUTOINCREMENT, 	`name`	TEXT UNIQUE )"
table,Team,Team,29,"CREATE TABLE ""Team"" ( 	`id`	INTEGER PRIMARY KEY AUTOINCREMENT, 	`team_api_id`	INTEGER UNIQUE, 	`team_fifa_api_id`	INTEGER, 	`team_long_name`	TEXT, 	`team_short_name`	TEXT )"
table,Team_Attributes,Team_Attributes,2,"CREATE TABLE `Team_Attributes` ( 	`id`	INTEGER PRIMARY KEY AUTOINCREMENT, 	`team_fifa_api_id`	INTEGER, 	`team_api_id`	INTEGER, 	`date`	TEXT, 	`buildUpPlaySpeed`	INTEGER, 	`buildUpPlaySpeedClass`	TEXT, 	`buildUpPlayDribbling`	INTEGER, 	`buildUpPlayDribblingClass`	TEXT, 	`buildUpPlayPassing`	INTEGER, 	`buildUpPlayPassingClass`	TEXT, 	`buildUpPlayPositioningClass`	TEXT, 	`chanceCreationPassing`	INTEGER, 	`chanceCreationPassingClass`	TEXT, 	`chanceCreationCrossing`	INTEGER, 	`chanceCreationCrossingClass`	TEXT, 	`chanceCreationShooting`	INTEGER, 	`chanceCreationShootingClass`	TEXT, 	`chanceCreationPositioningClass`	TEXT, 	`defencePressure`	INTEGER, 	`defencePressureClass`	TEXT, 	`defenceAggression`	INTEGER, 	`defenceAggressionClass`	TEXT, 	`defenceTeamWidth`	INTEGER, 	`defenceTeamWidthClass`	TEXT, 	`defenceDefenderLineClass`	TEXT, 	FOREIGN KEY(`team_fifa_api_id`) REFERENCES `Team`(`team_fifa_api_id`), 	FOREIGN KEY(`team_api_id`) REFERENCES `Team`(`team_api_id`) )"
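
The raw `CREATE TABLE` statements above are hard to read, especially for `Match`. As an alternative, SQLite's `PRAGMA table_info` and `PRAGMA foreign_key_list` return the same information in tabular form. The sketch below assumes Python's `sqlite3` module; `describe_table` is a hypothetical helper, and the in-memory schema only reproduces the `Country` and `League` definitions shown in the output.

```python
import sqlite3


def describe_table(conn, table):
    """Return (columns, foreign_keys) for one table.

    columns:      list of (column_name, declared_type)
    foreign_keys: list of (from_column, referenced_table, referenced_column)
    """
    quoted = '"' + table.replace('"', '""') + '"'
    # table_info rows: cid, name, type, notnull, dflt_value, pk
    columns = [(r[1], r[2]) for r in conn.execute(f"PRAGMA table_info({quoted})")]
    # foreign_key_list rows: id, seq, table, from, to, on_update, on_delete, match
    foreign_keys = [
        (r[3], r[2], r[4])
        for r in conn.execute(f"PRAGMA foreign_key_list({quoted})")
    ]
    return columns, foreign_keys


# Toy schema copied from the Country and League definitions in the output above.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE `Country` (`id` INTEGER PRIMARY KEY AUTOINCREMENT, `name` TEXT UNIQUE);
    CREATE TABLE `League` (
        `id` INTEGER PRIMARY KEY AUTOINCREMENT,
        `country_id` INTEGER,
        `name` TEXT UNIQUE,
        FOREIGN KEY(`country_id`) REFERENCES `country`(`id`)
    );
    """
)
columns, foreign_keys = describe_table(conn, "League")
print(columns)
print(foreign_keys)
```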


**Extracting name of the tables**

In [14]:
%%sql
SELECT name, type
FROM sqlite_master
WHERE type IN ('table', 'view');

 * sqlite:///database.sqlite
Done.


name,type
sqlite_sequence,table
Player_Attributes,table
Player,table
Match,table
League,table
Country,table
Team,table
Team_Attributes,table
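
The first row of this result, `sqlite_sequence`, is an internal table that SQLite creates for `AUTOINCREMENT` columns, not one of the 7 data tables. A small sketch of how it could be filtered out, assuming Python's `sqlite3` module and a toy in-memory table (the `user_tables` helper is a hypothetical name):

```python
import sqlite3


def user_tables(conn):
    """List tables and views, skipping SQLite's internal sqlite_* objects."""
    query = """
        SELECT name
        FROM sqlite_master
        WHERE type IN ('table', 'view')
          AND substr(name, 1, 7) <> 'sqlite_'
        ORDER BY name
    """
    return [row[0] for row in conn.execute(query)]


# Toy database: the AUTOINCREMENT column makes SQLite create sqlite_sequence.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Country (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT UNIQUE)"
)
all_names = [
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]
names = user_tables(conn)
print(all_names)
print(names)
```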


## Extracting information about our teams and leagues

**Name of the leagues**

In [15]:
%%sql
SELECT * FROM league;

 * sqlite:///database.sqlite
Done.


id,country_id,name
1,1,Belgium Jupiler League
1729,1729,England Premier League
4769,4769,France Ligue 1
7809,7809,Germany 1. Bundesliga
10257,10257,Italy Serie A
13274,13274,Netherlands Eredivisie
15722,15722,Poland Ekstraklasa
17642,17642,Portugal Liga ZON Sagres
19694,19694,Scotland Premier League
21518,21518,Spain LIGA BBVA


**Note:** As I mentioned above, I will focus on the 5 best known leagues: 

* England Premier League
* France Ligue 1
* Italy Serie A
* Germany 1. Bundesliga
* Spain LIGA BBVA
    
I will create a view that contains only the data of these leagues.

In [23]:
%%sql
DROP VIEW IF EXISTS best_leagues;

CREATE VIEW best_leagues AS
SELECT id, country_id,
CASE 
    WHEN name = 'England Premier League' THEN 'Premier League'
    WHEN name = 'France Ligue 1' THEN 'League 1'
    WHEN name = 'Germany 1. Bundesliga' THEN 'Bundesliga'
    WHEN name = 'Italy Serie A' THEN 'Serie A'
    WHEN name = 'Spain LIGA BBVA' THEN 'La Liga'
END AS name
FROM league
WHERE name IN ('England Premier League', 'France Ligue 1', 'Germany 1. Bundesliga', 'Italy Serie A','Spain LIGA BBVA');

SELECT * FROM best_leagues;

 * sqlite:///database.sqlite
Done.
Done.
Done.


id,country_id,name
1729,1729,Premier League
4769,4769,League 1
7809,7809,Bundesliga
10257,10257,Serie A
21518,21518,La Liga
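
A view like this is mainly useful as a filter in joins: joining `Match` to it on `league_id` keeps only matches from the selected leagues. The sketch below illustrates the idea on toy data, assuming Python's `sqlite3` module. The view name `top_leagues_demo`, the reduced `Match` columns and the inserted rows are all hypothetical, and the view keeps the original league names for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE League (id INTEGER PRIMARY KEY, country_id INTEGER, name TEXT UNIQUE);
    CREATE TABLE "Match" (
        id INTEGER PRIMARY KEY,
        league_id INTEGER,
        season TEXT,
        home_team_goal INTEGER,
        away_team_goal INTEGER
    );

    -- Toy rows (made up for illustration, not taken from the dataset).
    INSERT INTO League VALUES
        (1, 1, 'Belgium Jupiler League'),
        (1729, 1729, 'England Premier League'),
        (21518, 21518, 'Spain LIGA BBVA');
    INSERT INTO "Match" (league_id, season, home_team_goal, away_team_goal) VALUES
        (1, '2008/2009', 1, 1),
        (1729, '2008/2009', 1, 0),
        (1729, '2008/2009', 2, 2),
        (21518, '2008/2009', 3, 1);

    -- Hypothetical view restricted to two of the leagues.
    CREATE VIEW top_leagues_demo AS
    SELECT id, country_id, name
    FROM League
    WHERE name IN ('England Premier League', 'Spain LIGA BBVA');
    """
)

# The inner join drops every match whose league is not in the view.
rows = conn.execute(
    """
    SELECT l.name,
           COUNT(*) AS matches,
           SUM(m.home_team_goal + m.away_team_goal) AS goals
    FROM "Match" AS m
    JOIN top_leagues_demo AS l ON l.id = m.league_id
    GROUP BY l.name
    ORDER BY l.name
    """
).fetchall()
print(rows)
```

The Belgian match does not appear in the result because its league is not part of the view.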
