# Playstyle for Victory

![](./images/teams.png)

## Analysis of Football European Teams

Using a database from [Kaggle](https://www.kaggle.com/omarmomen/football-database), this notebook analyzes football teams from the most popular European leagues. The database consists of 7 tables with 199 columns in total.

Description of the tables:

* **`Country` (11 rows and 2 columns)**: Describes the countries that the leagues belong to.

    - id: Country id
    - name: Name of the country

* **`League` (11 rows and 3 columns)**: Describes the names of the leagues and the country each one belongs to.

    - id: League id
    - country_id: Country id of the League
    - name: Name of the League

* **`Match` (26k rows and 115 columns)**: Describes the matches between the teams in their leagues. The table specifies the date of each match and the goals each team scored.

    - id: Id of the match
    - country_id: Id of the country
    - league_id: Id of the league
    - season: Season the match happened in (goes from 2008/2009 to 2015/2016 season)
    - home_team_goal
    - others
    
* **`Player` (11.1k rows and 7 columns)**: Describes the players: their id, name and features such as birth date, height and weight.

    - id: Id of the player
    - player_name: Name of the player
    - birthday: Date of birth of player
    - height: Height of the player
    - weight: Weight of the player
    - others
    
* **`Player_Attributes` (+184k rows and 42 columns)**: Describes attributes of the different players such as rating, preferred foot, and potential, among others. These values are based on the FIFA attributes.
 
    - overall_rating
    - potential
    - preferred_foot
    - attacking_work_rate
    - defensive_work_rate
    - crossing
    - others
    
* **`Team` (299 rows and 5 columns)**: Describes the teams, with their long and short names and their FIFA ids.

    - id: Id of the team
    - team_api_id
    - team_fifa_api_id
    - team_long_name
    - team_short_name

* **`Team_Attributes` (1458 rows and 25 columns)**: Describes team attributes such as build-up play speed, type of defence, chance creation through passing, and other parameters that define the playstyle of the teams.

    - id: Id of the attribute record
    - buildUpPlaySpeed
    - buildUpPlaySpeedClass
    - buildUpPlayDribbling
    - buildUpPlayDribblingClass
    - buildUpPlayPassing
    - buildUpPlayPassingClass

This is an extensive dataset with more than 11,000 players, about 300 teams and more than 25k matches. Most of the attributes mentioned above are the ones I will use to answer different questions.

## Objectives

The main goal of this analysis is to use SQL (SQLite) to extract analytical information, answer specific questions and provide insights.

**Technical skills used in SQLite**:

* Joins
* Views
* Common Table Expressions
* Window Functions
* Nested Queries

**Topics addressed in the analysis:**

* Best teams per league 
* Best league: I will focus on the 5 biggest and best known leagues: Spain, France, Germany, England and Italy
* Comparison of the teams per league according to their attributes
* Best players

## Main Question

* What attributes contribute to the victory of a team? Are they definitive?


# Table Of Contents:

<a href='#Extracting information from the database'>1. Information of the database</a>

<a href='#Wrangling Data'>2. Data Wrangling</a>

&emsp;  <a href='#Wrangling Data End'>2.1 Analysis of Data Wrangling</a>

<a href='#Exploratory Data Analysis'>3. Exploratory Data Analysis</a>

&emsp; <a href='#Attribute Classes'>3.1 Classes of Attributes</a>

&emsp; <a href='#Attribute Classes Summary'>3.2 Summary of Classes of Attributes</a>

&emsp; <a href='#Ranking Teams'>3.3 Teams Rank</a>

&emsp;     &emsp; <a href='#Top 5 Teams per Attribute'>3.3.1 Top 5 Teams per Attribute</a>

&emsp;     &emsp; <a href='#Ranking Teams on Match'>3.3.2 Winners and Losers</a>

&emsp;     &emsp; <a href='#Teams with more Wins'>3.3.3 Ranking of Teams per Wins</a>

&emsp;     &emsp; <a href='#Teams with more Losses'>3.3.4 Ranking of Teams per Losses</a>

&emsp; <a href='#Attributes Analysis'>3.4 Analyzing Team Attributes</a>

&emsp;    &emsp; <a href='#Summary Attributes Analysis'>3.4.1 Summary of Team Attributes Analysis</a>

&emsp; <a href='#Attributes vs Team Victory'>3.5 Attributes vs Team Victory</a>

&emsp;   &emsp; <a href='#Attributes vs Team Victory Summary'>3.5.1 Attributes vs Team Victory. Summary of Results</a>

&emsp; <a href='#Combination of Attributes vs Team Victory'>3.6 Combined Attributes vs Team Victory</a>

&emsp;    &emsp; <a href='#Combination of Attributes vs Team Victory Summary'>3.6.1 Combined Attributes vs Team Victory. Summary of Results</a>

&emsp; <a href='#Best Players'>3.7 Best Players</a>

&emsp;&emsp; <a href='#Best Players'>3.7.1 Best Players Summary</a>

<a href='#Analysis of Results'>4. Analysis of Results</a>

<a href='#Limitations'>5. Limitations</a>

<a href='#Future Ideas'>6. Future Ideas</a>

In [1]:
# Installing ipython-sql to use SQL with Python
!conda install -yc conda-forge ipython-sql

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: C:\Users\manit\anaconda3\envs\sql_analysis

  added / updated specs:
    - ipython-sql


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    ca-certificates-2020.12.5  |       h5b45459_0         173 KB  conda-forge
    certifi-2020.12.5          |   py39hcbf5309_0         144 KB  conda-forge
    ipython-sql-0.3.9          |py39hde42818_1002          28 KB  conda-forge
    openssl-1.1.1i             |       h8ffe710_0         5.8 MB  conda-forge
    prettytable-2.0.0          |     pyhd8ed1ab_0          22 KB  conda-forge
    python_abi-3.9             |           1_cp39           4 KB  conda-forge
    sqlalchemy-1.3.20          |   py39h4cdbadb_0         1.8 MB  conda-forge
    sqlparse-0.4.1             |     pyh9f0ad1d_0          34 KB

**Connecting the Jupyter notebook to the database file: database.sqlite**

In [1]:
%%capture
%load_ext sql
%sql sqlite:///database.sqlite

<a id='Extracting information from the database'></a>
# **1. Extracting information from the database**

In [2]:
%%sql
SELECT *
FROM sqlite_master
WHERE type='table';

 * sqlite:///database.sqlite
Done.


type,name,tbl_name,rootpage,sql
table,sqlite_sequence,sqlite_sequence,4,"CREATE TABLE sqlite_sequence(name,seq)"
table,Player_Attributes,Player_Attributes,11,"CREATE TABLE ""Player_Attributes"" ( 	`id`	INTEGER PRIMARY KEY AUTOINCREMENT, 	`player_fifa_api_id`	INTEGER, 	`player_api_id`	INTEGER, 	`date`	TEXT, 	`overall_rating`	INTEGER, 	`potential`	INTEGER, 	`preferred_foot`	TEXT, 	`attacking_work_rate`	TEXT, 	`defensive_work_rate`	TEXT, 	`crossing`	INTEGER, 	`finishing`	INTEGER, 	`heading_accuracy`	INTEGER, 	`short_passing`	INTEGER, 	`volleys`	INTEGER, 	`dribbling`	INTEGER, 	`curve`	INTEGER, 	`free_kick_accuracy`	INTEGER, 	`long_passing`	INTEGER, 	`ball_control`	INTEGER, 	`acceleration`	INTEGER, 	`sprint_speed`	INTEGER, 	`agility`	INTEGER, 	`reactions`	INTEGER, 	`balance`	INTEGER, 	`shot_power`	INTEGER, 	`jumping`	INTEGER, 	`stamina`	INTEGER, 	`strength`	INTEGER, 	`long_shots`	INTEGER, 	`aggression`	INTEGER, 	`interceptions`	INTEGER, 	`positioning`	INTEGER, 	`vision`	INTEGER, 	`penalties`	INTEGER, 	`marking`	INTEGER, 	`standing_tackle`	INTEGER, 	`sliding_tackle`	INTEGER, 	`gk_diving`	INTEGER, 	`gk_handling`	INTEGER, 	`gk_kicking`	INTEGER, 	`gk_positioning`	INTEGER, 	`gk_reflexes`	INTEGER, 	FOREIGN KEY(`player_fifa_api_id`) REFERENCES `Player`(`player_fifa_api_id`), 	FOREIGN KEY(`player_api_id`) REFERENCES `Player`(`player_api_id`) )"
table,Player,Player,14,"CREATE TABLE `Player` ( 	`id`	INTEGER PRIMARY KEY AUTOINCREMENT, 	`player_api_id`	INTEGER UNIQUE, 	`player_name`	TEXT, 	`player_fifa_api_id`	INTEGER UNIQUE, 	`birthday`	TEXT, 	`height`	INTEGER, 	`weight`	INTEGER )"
table,Match,Match,18,"CREATE TABLE `Match` ( 	`id`	INTEGER PRIMARY KEY AUTOINCREMENT, 	`country_id`	INTEGER, 	`league_id`	INTEGER, 	`season`	TEXT, 	`stage`	INTEGER, 	`date`	TEXT, 	`match_api_id`	INTEGER UNIQUE, 	`home_team_api_id`	INTEGER, 	`away_team_api_id`	INTEGER, 	`home_team_goal`	INTEGER, 	`away_team_goal`	INTEGER, 	`home_player_X1`	INTEGER, 	`home_player_X2`	INTEGER, 	`home_player_X3`	INTEGER, 	`home_player_X4`	INTEGER, 	`home_player_X5`	INTEGER, 	`home_player_X6`	INTEGER, 	`home_player_X7`	INTEGER, 	`home_player_X8`	INTEGER, 	`home_player_X9`	INTEGER, 	`home_player_X10`	INTEGER, 	`home_player_X11`	INTEGER, 	`away_player_X1`	INTEGER, 	`away_player_X2`	INTEGER, 	`away_player_X3`	INTEGER, 	`away_player_X4`	INTEGER, 	`away_player_X5`	INTEGER, 	`away_player_X6`	INTEGER, 	`away_player_X7`	INTEGER, 	`away_player_X8`	INTEGER, 	`away_player_X9`	INTEGER, 	`away_player_X10`	INTEGER, 	`away_player_X11`	INTEGER, 	`home_player_Y1`	INTEGER, 	`home_player_Y2`	INTEGER, 	`home_player_Y3`	INTEGER, 	`home_player_Y4`	INTEGER, 	`home_player_Y5`	INTEGER, 	`home_player_Y6`	INTEGER, 	`home_player_Y7`	INTEGER, 	`home_player_Y8`	INTEGER, 	`home_player_Y9`	INTEGER, 	`home_player_Y10`	INTEGER, 	`home_player_Y11`	INTEGER, 	`away_player_Y1`	INTEGER, 	`away_player_Y2`	INTEGER, 	`away_player_Y3`	INTEGER, 	`away_player_Y4`	INTEGER, 	`away_player_Y5`	INTEGER, 	`away_player_Y6`	INTEGER, 	`away_player_Y7`	INTEGER, 	`away_player_Y8`	INTEGER, 	`away_player_Y9`	INTEGER, 	`away_player_Y10`	INTEGER, 	`away_player_Y11`	INTEGER, 	`home_player_1`	INTEGER, 	`home_player_2`	INTEGER, 	`home_player_3`	INTEGER, 	`home_player_4`	INTEGER, 	`home_player_5`	INTEGER, 	`home_player_6`	INTEGER, 	`home_player_7`	INTEGER, 	`home_player_8`	INTEGER, 	`home_player_9`	INTEGER, 	`home_player_10`	INTEGER, 	`home_player_11`	INTEGER, 	`away_player_1`	INTEGER, 	`away_player_2`	INTEGER, 	`away_player_3`	INTEGER, 	`away_player_4`	INTEGER, 	`away_player_5`	INTEGER, 	`away_player_6`	INTEGER, 	`away_player_7`	INTEGER, 	
`away_player_8`	INTEGER, 	`away_player_9`	INTEGER, 	`away_player_10`	INTEGER, 	`away_player_11`	INTEGER, 	`goal`	TEXT, 	`shoton`	TEXT, 	`shotoff`	TEXT, 	`foulcommit`	TEXT, 	`card`	TEXT, 	`cross`	TEXT, 	`corner`	TEXT, 	`possession`	TEXT, 	`B365H`	NUMERIC, 	`B365D`	NUMERIC, 	`B365A`	NUMERIC, 	`BWH`	NUMERIC, 	`BWD`	NUMERIC, 	`BWA`	NUMERIC, 	`IWH`	NUMERIC, 	`IWD`	NUMERIC, 	`IWA`	NUMERIC, 	`LBH`	NUMERIC, 	`LBD`	NUMERIC, 	`LBA`	NUMERIC, 	`PSH`	NUMERIC, 	`PSD`	NUMERIC, 	`PSA`	NUMERIC, 	`WHH`	NUMERIC, 	`WHD`	NUMERIC, 	`WHA`	NUMERIC, 	`SJH`	NUMERIC, 	`SJD`	NUMERIC, 	`SJA`	NUMERIC, 	`VCH`	NUMERIC, 	`VCD`	NUMERIC, 	`VCA`	NUMERIC, 	`GBH`	NUMERIC, 	`GBD`	NUMERIC, 	`GBA`	NUMERIC, 	`BSH`	NUMERIC, 	`BSD`	NUMERIC, 	`BSA`	NUMERIC, 	FOREIGN KEY(`country_id`) REFERENCES `country`(`id`), 	FOREIGN KEY(`league_id`) REFERENCES `League`(`id`), 	FOREIGN KEY(`home_team_api_id`) REFERENCES `Team`(`team_api_id`), 	FOREIGN KEY(`away_team_api_id`) REFERENCES `Team`(`team_api_id`), 	FOREIGN KEY(`home_player_1`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_2`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_3`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_4`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_5`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_6`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_7`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_8`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_9`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_10`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`home_player_11`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_1`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_2`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_3`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_4`) REFERENCES `Player`(`player_api_id`), 	FOREIGN 
KEY(`away_player_5`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_6`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_7`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_8`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_9`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_10`) REFERENCES `Player`(`player_api_id`), 	FOREIGN KEY(`away_player_11`) REFERENCES `Player`(`player_api_id`) )"
table,League,League,24,"CREATE TABLE `League` ( 	`id`	INTEGER PRIMARY KEY AUTOINCREMENT, 	`country_id`	INTEGER, 	`name`	TEXT UNIQUE, 	FOREIGN KEY(`country_id`) REFERENCES `country`(`id`) )"
table,Country,Country,26,"CREATE TABLE `Country` ( 	`id`	INTEGER PRIMARY KEY AUTOINCREMENT, 	`name`	TEXT UNIQUE )"
table,Team,Team,29,"CREATE TABLE ""Team"" ( 	`id`	INTEGER PRIMARY KEY AUTOINCREMENT, 	`team_api_id`	INTEGER UNIQUE, 	`team_fifa_api_id`	INTEGER, 	`team_long_name`	TEXT, 	`team_short_name`	TEXT )"
table,Team_Attributes,Team_Attributes,2,"CREATE TABLE `Team_Attributes` ( 	`id`	INTEGER PRIMARY KEY AUTOINCREMENT, 	`team_fifa_api_id`	INTEGER, 	`team_api_id`	INTEGER, 	`date`	TEXT, 	`buildUpPlaySpeed`	INTEGER, 	`buildUpPlaySpeedClass`	TEXT, 	`buildUpPlayDribbling`	INTEGER, 	`buildUpPlayDribblingClass`	TEXT, 	`buildUpPlayPassing`	INTEGER, 	`buildUpPlayPassingClass`	TEXT, 	`buildUpPlayPositioningClass`	TEXT, 	`chanceCreationPassing`	INTEGER, 	`chanceCreationPassingClass`	TEXT, 	`chanceCreationCrossing`	INTEGER, 	`chanceCreationCrossingClass`	TEXT, 	`chanceCreationShooting`	INTEGER, 	`chanceCreationShootingClass`	TEXT, 	`chanceCreationPositioningClass`	TEXT, 	`defencePressure`	INTEGER, 	`defencePressureClass`	TEXT, 	`defenceAggression`	INTEGER, 	`defenceAggressionClass`	TEXT, 	`defenceTeamWidth`	INTEGER, 	`defenceTeamWidthClass`	TEXT, 	`defenceDefenderLineClass`	TEXT, 	FOREIGN KEY(`team_fifa_api_id`) REFERENCES `Team`(`team_fifa_api_id`), 	FOREIGN KEY(`team_api_id`) REFERENCES `Team`(`team_api_id`) )"


**Extracting the names of the tables and views**

In [3]:
%%sql
SELECT name, type
FROM sqlite_master
WHERE type IN ('table', 'view');

 * sqlite:///database.sqlite
Done.


name,type
sqlite_sequence,table
Player_Attributes,table
Player,table
Match,table
League,table
Country,table
Team,table
Team_Attributes,table
best_leagues,view
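
As a cross-check of the table and column totals quoted in the introduction, the columns of each table can be counted programmatically. The following is only a minimal sketch in plain Python using the standard-library `sqlite3` module instead of the `%%sql` magic; the helper name `column_counts` and the two-table toy schema are hypothetical and merely stand in for `database.sqlite`.

```python
import sqlite3


def column_counts(conn):
    """Return {table_name: number_of_columns} for every user table."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    counts = {}
    for table in tables:
        # PRAGMA table_info returns one row per column of the table.
        quoted = '"' + table.replace('"', '""') + '"'
        counts[table] = len(conn.execute(f"PRAGMA table_info({quoted})").fetchall())
    return counts


# Toy schema (an assumption for illustration, not the real database file).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Country (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT UNIQUE);
    CREATE TABLE League (id INTEGER PRIMARY KEY AUTOINCREMENT,
                         country_id INTEGER, name TEXT UNIQUE);
""")
counts = column_counts(conn)
print(counts, "total columns:", sum(counts.values()))
```

Pointing `sqlite3.connect` at `database.sqlite` instead of `:memory:` should give the per-table counts listed in the introduction, with the internal `sqlite_sequence` table skipped.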


## Leagues

![](./images/best_leagues.jpg)

**Name of the leagues**

In [4]:
%%sql
SELECT * FROM league;

 * sqlite:///database.sqlite
Done.


id,country_id,name
1,1,Belgium Jupiler League
1729,1729,England Premier League
4769,4769,France Ligue 1
7809,7809,Germany 1. Bundesliga
10257,10257,Italy Serie A
13274,13274,Netherlands Eredivisie
15722,15722,Poland Ekstraklasa
17642,17642,Portugal Liga ZON Sagres
19694,19694,Scotland Premier League
21518,21518,Spain LIGA BBVA


**Note:** As I mentioned above, I will focus on the 5 best known leagues: 

    * England Premier League
    * France Ligue 1
    * Italy Serie A
    * Germany 1. Bundesliga
    * Spain LIGA BBVA
    
I will create a view with the data of only these leagues.

<a id='Wrangling Data'></a>
# **2. Wrangling Data**

**Creating a VIEW with the best leagues**

In [5]:
%%sql
DROP VIEW IF EXISTS best_leagues;

CREATE VIEW best_leagues AS
SELECT id, country_id,
CASE 
    WHEN name = 'England Premier League' THEN 'Premier League'
    WHEN name = 'France Ligue 1' THEN 'League 1'
    WHEN name = 'Germany 1. Bundesliga' THEN 'Bundesliga'
    WHEN name = 'Italy Serie A' THEN 'Serie A'
    WHEN name = 'Spain LIGA BBVA' THEN 'La Liga'
END AS name
FROM league
WHERE name IN ('England Premier League', 'France Ligue 1', 'Germany 1. Bundesliga', 'Italy Serie A','Spain LIGA BBVA');

SELECT * FROM best_leagues;

 * sqlite:///database.sqlite
Done.
Done.
Done.


id,country_id,name
1729,1729,Premier League
4769,4769,League 1
7809,7809,Bundesliga
10257,10257,Serie A
21518,21518,La Liga


**Creating a VIEW with the teams of the biggest leagues**

In [8]:
%%sql
SELECT t.* FROM team t LIMIT 1;

 * sqlite:///database.sqlite
Done.


id,team_api_id,team_fifa_api_id,team_long_name,team_short_name
1,9987,673,KRC Genk,GEN


In [2]:
%%sql
DROP VIEW IF EXISTS best_teams;

CREATE VIEW best_teams AS
SELECT t.* FROM team t
    INNER JOIN Match m ON m.home_team_api_id = t.team_api_id
    INNER JOIN best_leagues l ON l.id = m.league_id;
    
SELECT COUNT(*) FROM best_teams

 * sqlite:///database.sqlite
Done.
Done.
Done.


COUNT(*)
14585
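
Note that the join in `best_teams` returns one row per home match rather than one row per team, which is why the count equals the number of matches played in the five leagues. If one row per team is wanted, `SELECT DISTINCT` is one possible option. The sketch below illustrates the difference on a hypothetical three-team toy database; the view names `teams_per_match` and `teams_distinct` and all inserted values are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Team (id INTEGER PRIMARY KEY, team_api_id INTEGER UNIQUE,
                       team_long_name TEXT);
    CREATE TABLE Match (id INTEGER PRIMARY KEY, league_id INTEGER,
                        home_team_api_id INTEGER, away_team_api_id INTEGER);
    CREATE TABLE best_leagues (id INTEGER PRIMARY KEY, name TEXT);

    INSERT INTO Team VALUES (1, 10, 'Team A'), (2, 20, 'Team B'), (3, 30, 'Team C');
    INSERT INTO best_leagues VALUES (100, 'Some League');
    -- Team A hosts twice, Team B once; Team C only hosts in a league outside the view.
    INSERT INTO Match VALUES (1, 100, 10, 20), (2, 100, 10, 20),
                             (3, 100, 20, 10), (4, 999, 30, 10);

    -- Same shape as the best_teams view: one row per home match.
    CREATE VIEW teams_per_match AS
    SELECT t.* FROM Team t
        INNER JOIN Match m ON m.home_team_api_id = t.team_api_id
        INNER JOIN best_leagues l ON l.id = m.league_id;

    -- Variant with DISTINCT: one row per team.
    CREATE VIEW teams_distinct AS
    SELECT DISTINCT t.* FROM Team t
        INNER JOIN Match m ON m.home_team_api_id = t.team_api_id
        INNER JOIN best_leagues l ON l.id = m.league_id;
""")
rows_per_match = conn.execute("SELECT COUNT(*) FROM teams_per_match").fetchone()[0]
rows_distinct = conn.execute("SELECT COUNT(*) FROM teams_distinct").fetchone()[0]
print(rows_per_match, rows_distinct)
```

Whether the duplicates matter depends on the later queries: aggregations over the per-match view are implicitly weighted by the number of home matches.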


**Creating a VIEW for the players of the best teams**

I will analyze the players who played for teams in the best leagues (the main 5 leagues).

In [11]:
%%sql

SELECT * FROM player LIMIT 0;

 * sqlite:///database.sqlite
Done.


id,player_api_id,player_name,player_fifa_api_id,birthday,height,weight


In [3]:
%%sql
SELECT * FROM match 
WHERE season = '2011/2012' LIMIT 5;

 * sqlite:///database.sqlite
Done.


id,country_id,league_id,season,stage,date,match_api_id,home_team_api_id,away_team_api_id,home_team_goal,away_team_goal,home_player_X1,home_player_X2,home_player_X3,home_player_X4,home_player_X5,home_player_X6,home_player_X7,home_player_X8,home_player_X9,home_player_X10,home_player_X11,away_player_X1,away_player_X2,away_player_X3,away_player_X4,away_player_X5,away_player_X6,away_player_X7,away_player_X8,away_player_X9,away_player_X10,away_player_X11,home_player_Y1,home_player_Y2,home_player_Y3,home_player_Y4,home_player_Y5,home_player_Y6,home_player_Y7,home_player_Y8,home_player_Y9,home_player_Y10,home_player_Y11,away_player_Y1,away_player_Y2,away_player_Y3,away_player_Y4,away_player_Y5,away_player_Y6,away_player_Y7,away_player_Y8,away_player_Y9,away_player_Y10,away_player_Y11,home_player_1,home_player_2,home_player_3,home_player_4,home_player_5,home_player_6,home_player_7,home_player_8,home_player_9,home_player_10,home_player_11,away_player_1,away_player_2,away_player_3,away_player_4,away_player_5,away_player_6,away_player_7,away_player_8,away_player_9,away_player_10,away_player_11,goal,shoton,shotoff,foulcommit,card,cross,corner,possession,B365H,B365D,B365A,BWH,BWD,BWA,IWH,IWD,IWA,LBH,LBD,LBA,PSH,PSD,PSA,WHH,WHD,WHA,SJH,SJD,SJA,VCH,VCD,VCA,GBH,GBD,GBA,BSH,BSD,BSA
757,1,1,2011/2012,1,2011-07-29 00:00:00,1032692,1773,8635,2,1,1,2,4,6,8,3,5,7,3,5,7,1,3,4,7,1,3,5,7,9,4,6,1,3,3,3,3,7,7,7,10,10,10,1,3,3,3,7,7,7,7,7,10,10,37993,37865,37051,45840,179059,37981.0,38791,37963,38777,45865,68114,38391,38389,208493,149150,40536,38253,114333,178249,265123.0,46552,181276,,,,,,,,,7.0,4.0,1.5,6.5,4.1,1.42,5.0,3.7,1.5,5.5,3.6,1.5,,,,5.5,3.75,1.57,6.0,3.6,1.57,7.0,4.0,1.5,6.0,3.75,1.5,6.5,4.0,1.44
758,1,1,2011/2012,1,2011-07-30 00:00:00,1032693,9998,9985,1,1,1,1,3,5,7,9,3,4,8,6,5,1,2,4,6,8,6,5,4,3,5,7,1,3,3,3,3,3,7,7,7,7,11,1,3,3,3,3,6,8,6,10,10,10,39153,39977,181574,166577,38906,36849.0,21753,37128,149258,26224,78902,38797,129462,245653,164229,33620,38969,17276,119117,38382.0,248689,46335,,,,,,,,,5.0,3.5,1.73,4.75,3.5,1.65,3.8,3.3,1.75,4.5,3.5,1.62,,,,4.33,3.5,1.75,4.5,3.5,1.73,4.75,3.6,1.75,4.5,3.4,1.72,4.5,3.6,1.67
759,1,1,2011/2012,1,2011-07-30 00:00:00,1032694,9987,9993,3,1,1,2,4,6,8,2,4,6,8,4,6,1,2,4,6,8,3,5,7,3,5,7,1,3,3,3,3,7,7,7,7,10,10,1,3,3,3,3,7,7,7,10,10,10,91929,94462,38368,148314,109331,104411.0,39498,169200,43158,42153,38794,36873,57078,38800,174363,27508,38784,163613,38371,33622.0,166679,14487,,,,,,,,,1.44,4.33,7.0,1.45,3.95,6.25,1.5,3.7,5.0,1.44,3.75,6.0,,,,1.4,4.33,7.0,1.44,4.2,6.5,1.44,4.5,7.0,1.45,4.0,6.25,1.44,4.0,6.5
760,1,1,2011/2012,1,2011-07-30 00:00:00,1032695,9991,9984,0,1,1,1,3,5,7,9,3,5,7,4,6,1,2,4,6,8,3,5,7,3,5,7,1,3,3,3,3,3,7,7,7,10,10,1,3,3,3,3,7,7,7,10,10,10,37854,12473,114368,178096,37440,,33662,26771,166618,148329,12574,36835,38342,243250,37047,38789,27110,166670,188231,277766.0,38251,209855,,,,,,,,,1.57,3.8,6.0,1.55,3.85,5.0,1.55,3.5,4.8,1.57,3.5,5.0,,,,1.6,3.75,5.0,1.5,3.9,6.0,1.53,4.0,6.0,1.55,3.75,5.5,1.53,3.75,5.5
761,1,1,2011/2012,1,2011-07-30 00:00:00,1032696,9994,10000,0,0,1,3,4,7,1,3,5,7,9,4,6,1,1,3,5,7,9,3,5,7,4,6,1,3,3,3,7,7,7,7,7,10,10,1,3,3,3,3,3,7,7,7,10,10,30934,94030,25791,166675,95609,38290.0,67898,30910,42706,104406,104404,37900,37100,41005,46877,80678,37886,131530,208984,,208852,240044,,,,,,,,,2.2,3.3,3.3,2.15,3.25,3.05,2.1,3.1,3.0,2.0,3.2,3.2,,,,2.0,3.3,3.4,2.2,3.2,3.2,2.2,3.3,3.3,2.15,3.25,3.1,2.2,3.3,2.88


In [25]:
%%sql
DROP VIEW IF EXISTS best_players;

CREATE VIEW best_players AS
WITH pl AS
(
SELECT home_player_1, home_player_2, home_player_3, home_player_4, home_player_5, home_player_6,
        home_player_7, home_player_8, home_player_9, home_player_10, home_player_11, away_player_1,
        away_player_2, away_player_3, away_player_4, away_player_5, away_player_6, away_player_7,
        away_player_8, away_player_9, away_player_10, away_player_11
    FROM Match 
    INNER JOIN best_leagues ON best_leagues.id = Match.league_id 
)

SELECT id, player_api_id, player_name, player_fifa_api_id, date(birthday),  substr(birthday, 1,4) AS birth_year,
        substr(birthday, 6,2) AS birth_month, substr(birthday, 9,2) AS birth_day, height, weight

        FROM player WHERE player_api_id IN (SELECT home_player_1 FROM pl)
                        OR player_api_id IN (SELECT home_player_2 FROM pl)
                        OR player_api_id IN (SELECT home_player_3 FROM pl)
                        OR player_api_id IN (SELECT home_player_4 FROM pl)
                        OR player_api_id IN (SELECT home_player_5 FROM pl)
                        OR player_api_id IN (SELECT home_player_6 FROM pl)
                        OR player_api_id IN (SELECT home_player_7 FROM pl)
                        OR player_api_id IN (SELECT home_player_8 FROM pl)
                        OR player_api_id IN (SELECT home_player_9 FROM pl)
                        OR player_api_id IN (SELECT home_player_10 FROM pl)
                        OR player_api_id IN (SELECT home_player_11 FROM pl)
                        
                        OR player_api_id IN (SELECT away_player_1 FROM pl)
                        OR player_api_id IN (SELECT away_player_2 FROM pl)
                        OR player_api_id IN (SELECT away_player_3 FROM pl)
                        OR player_api_id IN (SELECT away_player_4 FROM pl)
                        OR player_api_id IN (SELECT away_player_5 FROM pl)
                        OR player_api_id IN (SELECT away_player_6 FROM pl)
                        OR player_api_id IN (SELECT away_player_7 FROM pl)
                        OR player_api_id IN (SELECT away_player_8 FROM pl)
                        OR player_api_id IN (SELECT away_player_9 FROM pl)
                        OR player_api_id IN (SELECT away_player_10 FROM pl)
                        OR player_api_id IN (SELECT away_player_11 FROM pl)
                        ;
                        
SELECT COUNT(*) FROM best_players;

 * sqlite:///database.sqlite
Done.
Done.
Done.


COUNT(*)
6102
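
The 22 `OR ... IN (...)` conditions could alternatively be written by stacking the lineup columns into a single column with `UNION` and joining once. This is a sketch of that alternative, not the query used in this notebook. It uses a hypothetical toy `Match` table with only two player slots per side and omits the league filter; in the real database the slots would run from 1 to 11.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Player (id INTEGER PRIMARY KEY, player_api_id INTEGER UNIQUE,
                         player_name TEXT);
    CREATE TABLE Match (id INTEGER PRIMARY KEY, league_id INTEGER,
                        home_player_1 INTEGER, home_player_2 INTEGER,
                        away_player_1 INTEGER, away_player_2 INTEGER);
    INSERT INTO Player VALUES (1, 101, 'P1'), (2, 102, 'P2'), (3, 103, 'P3'),
                              (4, 104, 'P4'), (5, 105, 'P5');
    -- One lineup slot is NULL, as happens in the real Match table.
    INSERT INTO Match VALUES (1, 100, 101, 102, 103, NULL),
                             (2, 100, 101, 103, 102, 104);
""")

# Build one SELECT per lineup column and stack them with UNION,
# which also removes duplicate player ids.
slots = [f"{side}_player_{n}" for side in ("home", "away") for n in (1, 2)]
union_sql = "\n    UNION\n".join(
    f"    SELECT {slot} AS player_api_id FROM Match" for slot in slots
)
query = f"""
WITH lineup AS (
{union_sql}
)
SELECT p.player_name
FROM Player p
INNER JOIN lineup ON lineup.player_api_id = p.player_api_id
ORDER BY p.player_name;
"""
names = [row[0] for row in conn.execute(query)]
print(names)
```

The inner join drops the NULL slot automatically, and a player who never appears in a lineup (here `P5`) is excluded.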


### Testing the newly created views

**Best_players**

In [26]:
%%sql
SELECT * FROM best_players LIMIT 5;

 * sqlite:///database.sqlite
Done.


id,player_api_id,player_name,player_fifa_api_id,date(birthday),birth_year,birth_month,birth_day,height,weight
9677,2984,Sergio Aragones,111106,1977-02-01,1977,2,1,182.88,176
6856,5440,Marjan Petkovic,122118,1979-05-22,1979,5,22,185.42,185
4784,11316,Jean-Louis Leca,153275,1985-09-21,1985,9,21,180.34,165
1232,11319,Benoit Costil,158121,1987-07-03,1987,7,3,187.96,190
9895,11320,Steeve Elana,111163,1980-07-11,1980,7,11,187.96,187


**Best_teams**

In [27]:
%%sql
SELECT * FROM best_teams LIMIT 5;

 * sqlite:///database.sqlite
Done.


id,team_api_id,team_fifa_api_id,team_long_name,team_short_name
3457,10260,11,Manchester United,MUN
3459,9825,1,Arsenal,ARS
3461,8472,106,Sunderland,SUN
3463,8654,19,West Ham United,WHU
3465,10252,2,Aston Villa,AVL


**Best_leagues**

In [28]:
%%sql
SELECT * FROM best_leagues;

 * sqlite:///database.sqlite
Done.


id,country_id,name
1729,1729,Premier League
4769,4769,League 1
7809,7809,Bundesliga
10257,10257,Serie A
21518,21518,La Liga


<a id='Wrangling Data End'></a>
## END OF WRANGLING PROCEDURE

As a result of the wrangling, 3 views were created to narrow the data down to the part relevant to my analysis. New columns such as birth_year, birth_month and birth_day were added to facilitate future calculations. The league names were changed to their popular names (what most people call them).

No major problems such as null data were identified at this stage; according to the source, the dataset contains no null data. No structural or spelling issues were spotted by visual inspection. As the dataset is very large, spelling issues in the data may still surface during the analysis and will be addressed as they appear.
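
Since several fields in the `Match` sample rows printed above are empty, a per-column NULL count is a cheap way to check this point. The following is a minimal sketch under stated assumptions: the helper `null_counts` is hypothetical, and the three toy rows only stand in for the real table.

```python
import sqlite3


def null_counts(conn, table):
    """Return {column_name: number_of_NULL_values} for one table."""
    quoted_table = '"' + table.replace('"', '""') + '"'
    # Column names are the second field of each PRAGMA table_info row.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({quoted_table})")]
    select_list = ", ".join(
        'SUM("{0}" IS NULL)'.format(col.replace('"', '""')) for col in columns
    )
    totals = conn.execute(f"SELECT {select_list} FROM {quoted_table}").fetchone()
    # SUM over an empty table yields NULL, so fall back to 0.
    return {col: (n or 0) for col, n in zip(columns, totals)}


# Toy data (an assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Team_Attributes (id INTEGER PRIMARY KEY,
                                  buildUpPlaySpeed INTEGER,
                                  buildUpPlayDribbling INTEGER);
    INSERT INTO Team_Attributes VALUES (1, 60, NULL), (2, 52, 48), (3, 47, 41);
""")
result = null_counts(conn, "Team_Attributes")
print(result)
```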


<a id='Exploratory Data Analysis'></a>
# **3. Exploratory Data Analysis**

**Exploring the attributes table**

In [33]:
%%sql
SELECT * FROM Team_attributes LIMIT 10;

 * sqlite:///database.sqlite
Done.


id,team_fifa_api_id,team_api_id,date,buildUpPlaySpeed,buildUpPlaySpeedClass,buildUpPlayDribbling,buildUpPlayDribblingClass,buildUpPlayPassing,buildUpPlayPassingClass,buildUpPlayPositioningClass,chanceCreationPassing,chanceCreationPassingClass,chanceCreationCrossing,chanceCreationCrossingClass,chanceCreationShooting,chanceCreationShootingClass,chanceCreationPositioningClass,defencePressure,defencePressureClass,defenceAggression,defenceAggressionClass,defenceTeamWidth,defenceTeamWidthClass,defenceDefenderLineClass
1,434,9930,2010-02-22 00:00:00,60,Balanced,,Little,50,Mixed,Organised,60,Normal,65,Normal,55,Normal,Organised,50,Medium,55,Press,45,Normal,Cover
2,434,9930,2014-09-19 00:00:00,52,Balanced,48.0,Normal,56,Mixed,Organised,54,Normal,63,Normal,64,Normal,Organised,47,Medium,44,Press,54,Normal,Cover
3,434,9930,2015-09-10 00:00:00,47,Balanced,41.0,Normal,54,Mixed,Organised,54,Normal,63,Normal,64,Normal,Organised,47,Medium,44,Press,54,Normal,Cover
4,77,8485,2010-02-22 00:00:00,70,Fast,,Little,70,Long,Organised,70,Risky,70,Lots,70,Lots,Organised,60,Medium,70,Double,70,Wide,Cover
5,77,8485,2011-02-22 00:00:00,47,Balanced,,Little,52,Mixed,Organised,53,Normal,48,Normal,52,Normal,Organised,47,Medium,47,Press,52,Normal,Cover
6,77,8485,2012-02-22 00:00:00,58,Balanced,,Little,62,Mixed,Organised,45,Normal,70,Lots,55,Normal,Organised,40,Medium,40,Press,60,Normal,Cover
7,77,8485,2013-09-20 00:00:00,62,Balanced,,Little,45,Mixed,Organised,40,Normal,50,Normal,55,Normal,Organised,42,Medium,42,Press,60,Normal,Cover
8,77,8485,2014-09-19 00:00:00,58,Balanced,64.0,Normal,62,Mixed,Organised,56,Normal,68,Lots,57,Normal,Organised,41,Medium,42,Press,60,Normal,Cover
9,77,8485,2015-09-10 00:00:00,59,Balanced,64.0,Normal,53,Mixed,Organised,51,Normal,72,Lots,63,Normal,Free Form,49,Medium,45,Press,63,Normal,Cover
10,614,8576,2010-02-22 00:00:00,60,Balanced,,Little,40,Mixed,Organised,45,Normal,35,Normal,55,Normal,Organised,30,Deep,70,Double,30,Narrow,Offside Trap


<a id='Attribute Classes'></a>

## Classes of Attributes 

Let's take a closer look at the attributes and their classes.

I will extract the value range of each attribute class.

### BuildUpPlaySpeed

In [45]:
%%sql
SELECT buildUpPlaySpeedClass, MIN(buildUpPlaySpeed) AS Min_Points, MAX(buildUpPlaySpeed) AS Max_Points
    FROM team_attributes
    GROUP BY buildUpPlaySpeedClass;


 * sqlite:///database.sqlite
Done.


buildUpPlaySpeedClass,Min_Points,Max_Points
Balanced,34,66
Fast,67,80
Slow,20,33


### BuildUpPlayDribbling

In [44]:
%%sql

SELECT buildUpPlayDribblingClass, MIN(buildUpPlayDribbling) AS Min_Points, MAX(buildUpPlayDribbling) AS Max_Points
    FROM team_attributes
    GROUP BY buildUpPlayDribblingClass;

 * sqlite:///database.sqlite
Done.


buildUpPlayDribblingClass,Min_Points,Max_Points
Little,24,33
Lots,67,77
Normal,34,66


### BuildUpPlayPassing

In [46]:
%%sql

SELECT buildUpPlayPassingClass, MIN(buildUpPlayPassing) AS Min_Points, MAX(buildUpPlayPassing) AS Max_Points
    FROM team_attributes
    GROUP BY buildUpPlayPassingClass;

 * sqlite:///database.sqlite
Done.


buildUpPlayPassingClass,Min_Points,Max_Points
Long,67,80
Mixed,34,66
Short,20,33


### ChanceCreationPassingClass

In [54]:
%%sql

SELECT chanceCreationPassingClass, MIN(chanceCreationPassing) AS Min_Points, MAX(chanceCreationPassing) AS Max_Points
    FROM team_attributes
    GROUP BY chanceCreationPassingClass;

 * sqlite:///database.sqlite
Done.


chanceCreationPassingClass,Min_Points,Max_Points
Normal,34,66
Risky,67,80
Safe,21,33


### ChanceCreationCrossingClass

In [56]:
%%sql

SELECT chanceCreationCrossingClass, MIN(chanceCreationCrossing) AS Min_Points, MAX(chanceCreationCrossing) AS Max_Points
    FROM team_attributes
    GROUP BY chanceCreationCrossingClass;


 * sqlite:///database.sqlite
Done.


chanceCreationCrossingClass,Min_Points,Max_Points
Little,20,33
Lots,67,80
Normal,34,66


### ChanceCreationShootingClass

In [57]:
%%sql

SELECT chanceCreationShootingClass, MIN(chanceCreationShooting) AS Min_Points, MAX(chanceCreationShooting) AS Max_Points
    FROM team_attributes
    GROUP BY chanceCreationShootingClass;

 * sqlite:///database.sqlite
Done.


chanceCreationShootingClass,Min_Points,Max_Points
Little,22,33
Lots,67,80
Normal,34,66


### DefencePressureClass

In [58]:
%%sql

SELECT defencePressureClass, MIN(defencePressure) AS Min_Points, MAX(defencePressure) AS Max_Points
    FROM team_attributes
    GROUP BY defencePressureClass;

 * sqlite:///database.sqlite
Done.


defencePressureClass,Min_Points,Max_Points
Deep,23,33
High,67,72
Medium,34,66


### DefenceAggressionClass

In [59]:
%%sql

SELECT defenceAggressionClass, MIN(defenceAggression) AS Min_Points, MAX(defenceAggression) AS Max_Points
    FROM team_attributes
    GROUP BY defenceAggressionClass;

 * sqlite:///database.sqlite
Done.


defenceAggressionClass,Min_Points,Max_Points
Contain,24,33
Double,67,72
Press,34,66


### DefenceTeamWidthClass

In [60]:
%%sql

SELECT defenceTeamWidthClass, MIN(defenceTeamWidth) AS Min_Points, MAX(defenceTeamWidth) AS Max_Points
    FROM team_attributes
    GROUP BY defenceTeamWidthClass;


 * sqlite:///database.sqlite
Done.


defenceTeamWidthClass,Min_Points,Max_Points
Narrow,29,33
Normal,34,66
Wide,67,73


<a id='Attribute Classes Summary'></a>

## Summary of Classes of Attributes

Before continuing, I summarize how FIFA defines each attribute, to understand how it shapes a team's style of play and how it relates to victories and defeats.

**Important Note:** In football, better statistics do not always lead to more victories, and some attributes weigh more heavily than others. External factors such as luck or mistakes by players and referees during a game are not captured in this data. For the sake of this analysis, I will trust the attribute values to define the best teams, at least in these terms.


**buildUpPlaySpeed:**

Defines the speed at which attacks are put together:

| Values | Category | 
| --- | --- |
| 1 - 33 | SLOW |
| 34 - 66 | BALANCED |
| 67 - 100 | FAST |

**buildUpPlayPassing:**

Affects passing distance and support from teammates:

| Values | Category | 
| --- | --- |
| 1 - 33 | SHORT |
| 34 - 66 | MIXED |
| 67 - 100 | LONG |

**buildUpPlayDribbling:**

Defines the creativity of the players in one-on-one situations:

| Values | Category | 
| --- | --- |
| 1 - 33 | LITTLE |
| 34 - 66 | NORMAL |
| 67 - 100 | LOTS |

**ChanceCreationPassingClass:**

Amount of risk in pass decision and run support:

| Values | Category | 
| --- | --- |
| 1 - 33 | SAFE |
| 34 - 66 | NORMAL |
| 67 - 100 | RISKY |

**ChanceCreationCrossingClass:**

The tendency / frequency of crosses into the box:

| Values | Category | 
| --- | --- |
| 1 - 33 | LITTLE |
| 34 - 66 | NORMAL |
| 67 - 100 | LOTS |

**ChanceCreationShootingClass:**

The tendency / frequency of shots taken:

| Values | Category | 
| --- | --- |
| 1 - 33 | LITTLE |
| 34 - 66 | NORMAL |
| 67 - 100 | LOTS |

**DefencePressureClass**

Defines how high up the pitch the team will start pressuring:

| Values | Category | 
| --- | --- |
| 1 - 33 | DEEP |
| 34 - 66 | MEDIUM |
| 67 - 100 | HIGH |

**DefenceAggressionClass:**

Defines the team approach to tackling the ball possessor:

| Values | Category | 
| --- | --- |
| 1 - 33 | CONTAIN |
| 34 - 66 | PRESS |
| 67 - 100 | DOUBLE |

**DefenceTeamWidthClass:**

Defines how much the team shifts to the ball side. A narrower width means the team tends to cover central positions, while wider teams tend to cover the wings/sides more.

| Values | Category | 
| --- | --- |
| 1 - 33 | NARROW |
| 34 - 66 | NORMAL |
| 67 - 100 | WIDE |

[Source](https://www.fifplay.com/fifa-17-tactics/)
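
All of the tables above share the same cut-offs (1 - 33, 34 - 66, 67 - 100), so the mapping from a numeric score to its class can be sketched with one small helper. This is only an illustration: the function name `attribute_class` and the label tuples are hypothetical and are not part of the dataset or of FIFA's tooling.

```python
from typing import Sequence


def attribute_class(score: int, labels: Sequence[str]) -> str:
    """Map a 1-100 attribute score to one of three class labels.

    `labels` is (low, middle, high), e.g. ("SLOW", "BALANCED", "FAST").
    The 33/66 cut-offs follow the tables in this section.
    """
    if not 1 <= score <= 100:
        raise ValueError(f"score must be between 1 and 100, got {score}")
    low, middle, high = labels
    if score <= 33:
        return low
    if score <= 66:
        return middle
    return high


# Hypothetical label tuples taken from the tables above.
SPEED_LABELS = ("SLOW", "BALANCED", "FAST")
WIDTH_LABELS = ("NARROW", "NORMAL", "WIDE")

print(attribute_class(33, SPEED_LABELS))  # upper edge of the lowest band
print(attribute_class(34, SPEED_LABELS))  # lower edge of the middle band
print(attribute_class(67, WIDTH_LABELS))  # lower edge of the highest band
```

The band edges agree with the observed `defenceTeamWidthClass` ranges shown earlier (Narrow up to 33, Normal 34 to 66, Wide from 67).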

<a id='Ranking Teams'></a>

## Ranking Teams

![](./images/leagues.jpg)
### Ranking Leagues according to their attributes 

In the next analysis, I rank the leagues by attribute for each season, to see how the ranking varies over the years.

#### Leagues with best buildUpPlaySpeed

In [102]:
%%sql
SELECT l.name AS League, m.Season AS Season,
    ROUND(AVG(a.buildUpPlaySpeed),2) AS Avg_PlaySpeed,
    MAX(a.buildUpPlaySpeed) AS Max_PlaySpeed,
    MIN(a.buildUpPlaySpeed) AS Min_PlaySpeed,
    RANK()
        OVER (PARTITION BY Season ORDER BY ROUND(AVG(a.buildUpPlaySpeed),2) DESC) AS Ranking
FROM best_leagues l
    INNER JOIN Match m ON m.league_id = l.id
    INNER JOIN Team t ON t.team_api_id = m.home_team_api_id
    INNER JOIN Team_Attributes a ON a.team_api_id = t.team_api_id
GROUP BY League, Season;



 * sqlite:///database.sqlite
Done.


League,Season,Avg_PlaySpeed,Max_PlaySpeed,Min_PlaySpeed,Ranking
Premier League,2008/2009,56.77,77,25,1
Bundesliga,2008/2009,56.51,78,31,2
Serie A,2008/2009,56.01,78,26,3
League 1,2008/2009,53.57,70,30,4
La Liga,2008/2009,47.7,71,20,5
Bundesliga,2009/2010,56.39,78,31,1
Premier League,2009/2010,56.12,77,25,2
Serie A,2009/2010,55.3,78,26,3
League 1,2009/2010,53.47,70,30,4
La Liga,2009/2010,47.2,70,20,5


<a id='Top 5 Teams per Attribute'></a>

### Creating a top 5 teams per Attribute per league

**Top 5 teams with the highest Play Passing score per league**

Ranking the teams in each league by their average `buildUpPlayPassing` score. Following the classes summarized above, this score describes passing distance and teammate support rather than pass accuracy: a higher score means a longer passing game.

In [143]:
%%sql
WITH stats AS(
SELECT l.name AS League, t.team_long_name AS team_name, ROUND(AVG(a.buildUpPlayPassing),2) AS PlayPassing_Score,
    RANK() 
        OVER (PARTITION BY l.name ORDER BY ROUND(AVG(a.buildUpPlayPassing),2) DESC ) AS Ranking
    FROM best_leagues l
        INNER JOIN Match m ON m.league_id = l.id
        INNER JOIN Team t ON t.team_api_id = m.home_team_api_id
        INNER JOIN Team_Attributes a ON a.team_api_id = t.team_api_id
    GROUP BY League, team_name

)

-- ORDER BY Ranking makes each top 5 deterministic; LIMIT alone does not guarantee which rows are kept
SELECT * FROM (SELECT * FROM stats WHERE League = 'Bundesliga' ORDER BY Ranking LIMIT 5)
UNION ALL
SELECT * FROM (SELECT * FROM stats WHERE League = 'League 1' ORDER BY Ranking LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM stats WHERE League = 'La Liga' ORDER BY Ranking LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM stats WHERE League = 'Premier League' ORDER BY Ranking LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM stats WHERE League = 'Serie A' ORDER BY Ranking LIMIT 5 )
;

 * sqlite:///database.sqlite
Done.


League,team_name,PlayPassing_Score,Ranking
Bundesliga,SV Darmstadt 98,77.0,1
Bundesliga,1. FC Köln,61.17,2
Bundesliga,DSC Arminia Bielefeld,59.33,3
Bundesliga,Eintracht Braunschweig,56.0,4
Bundesliga,FC St. Pauli,55.67,5
League 1,OGC Nice,61.83,1
League 1,Grenoble Foot 38,60.0,2
League 1,Toulouse FC,55.33,3
League 1,AJ Auxerre,55.17,4
League 1,Valenciennes FC,54.0,5


**Top 5 teams with the highest buildUpPlayDribbling score per league**

Ranking the teams in each league by their average 1 on 1 (dribbling) score.

In [144]:
%%sql
WITH stats AS(
SELECT l.name AS League, t.team_long_name AS team_name, ROUND(AVG(a.buildUpPlayDribbling),2) AS PlayDribbling_Score,
    RANK() 
        OVER (PARTITION BY l.name ORDER BY ROUND(AVG(a.buildUpPlayDribbling),2) DESC ) AS Ranking
    FROM best_leagues l
        INNER JOIN Match m ON m.league_id = l.id
        INNER JOIN Team t ON t.team_api_id = m.home_team_api_id
        INNER JOIN Team_Attributes a ON a.team_api_id = t.team_api_id
    GROUP BY League, team_name

)

-- ORDER BY Ranking makes each top 5 deterministic; LIMIT alone does not guarantee which rows are kept
SELECT * FROM (SELECT * FROM stats WHERE League = 'Bundesliga' ORDER BY Ranking LIMIT 5)
UNION ALL
SELECT * FROM (SELECT * FROM stats WHERE League = 'League 1' ORDER BY Ranking LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM stats WHERE League = 'La Liga' ORDER BY Ranking LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM stats WHERE League = 'Premier League' ORDER BY Ranking LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM stats WHERE League = 'Serie A' ORDER BY Ranking LIMIT 5 )
;

 * sqlite:///database.sqlite
Done.


League,team_name,PlayDribbling_Score,Ranking
Bundesliga,Fortuna Düsseldorf,65.5,1
Bundesliga,SpVgg Greuther Fürth,60.0,2
Bundesliga,1. FC Nürnberg,60.0,2
Bundesliga,Hamburger SV,58.5,4
Bundesliga,VfB Stuttgart,58.0,5
League 1,Olympique de Marseille,69.5,1
League 1,Stade Rennais FC,62.0,2
League 1,FC Sochaux-Montbéliard,62.0,2
League 1,FC Nantes,61.5,4
League 1,SC Bastia,61.0,5
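
The "top N per league" pattern used in the two queries above can also be written without one `UNION ALL` branch per league, by filtering on the window rank. The sketch below runs against a made-up `team_scores` table (hypothetical names and values, not the real dataset) and assumes an SQLite build with window-function support (3.25 or later), which the `RANK() OVER` queries in this notebook already rely on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE team_scores (League TEXT, team_name TEXT, score INTEGER);
-- Made-up scores, only to exercise the query.
INSERT INTO team_scores VALUES
    ('Bundesliga', 'X', 70), ('Bundesliga', 'Y', 60), ('Bundesliga', 'Z', 50),
    ('La Liga', 'P', 65), ('La Liga', 'Q', 65), ('La Liga', 'R', 40);
""")

TOP_N = 2  # the notebook uses 5; 2 keeps the toy output short

top_teams = conn.execute("""
    WITH stats AS (
        SELECT League, team_name, score,
               RANK() OVER (PARTITION BY League ORDER BY score DESC) AS Ranking
        FROM team_scores
    )
    SELECT League, team_name, score, Ranking
    FROM stats
    WHERE Ranking <= ?               -- keep the best N ranks in every league
    ORDER BY League, Ranking, team_name
""", (TOP_N,)).fetchall()

for row in top_teams:
    print(row)
```

Because `RANK()` gives tied teams the same rank, a league can return more than `TOP_N` rows when there is a tie at the cut-off; `ROW_NUMBER()` would cap the result at exactly `TOP_N` rows per league instead.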


<a id='Ranking Teams on Match'></a>
## Identifying Winners and Losers


**Creating a view with the teams that won and lost each match**

New columns: 

`who_lose`: team_api_id of the team that lost the game (or `'draw'` when the match was drawn)

`who_won`: team_api_id of the team that won the game (or `'draw'` when the match was drawn)

In [164]:
%%sql
DROP VIEW IF EXISTS win_loss;

CREATE VIEW win_loss AS
SELECT home_team_api_id, away_team_api_id, league_id,
    CASE WHEN home_team_goal > away_team_goal THEN home_team_api_id
    WHEN away_team_goal > home_team_goal THEN away_team_api_id
    ELSE 'draw'
    END who_won,
    CASE WHEN home_team_goal > away_team_goal THEN away_team_api_id
    WHEN away_team_goal > home_team_goal THEN home_team_api_id
    ELSE 'draw'
    END who_lose
    
    FROM Match;

 * sqlite:///database.sqlite
Done.
Done.


[]
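
To see what the view produces, the same `CASE` logic can be run on a tiny in-memory `Match` table. The three rows below are invented for illustration and only the columns used by the view are created; the real table has many more.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Match (
    home_team_api_id INTEGER, away_team_api_id INTEGER, league_id INTEGER,
    home_team_goal INTEGER, away_team_goal INTEGER
);
-- Invented results: home win for team 1, a draw, home win for team 3.
INSERT INTO Match VALUES (1, 2, 10, 3, 1), (2, 1, 10, 0, 0), (3, 1, 10, 2, 1);

CREATE VIEW win_loss AS
SELECT home_team_api_id, away_team_api_id, league_id,
    CASE WHEN home_team_goal > away_team_goal THEN home_team_api_id
         WHEN away_team_goal > home_team_goal THEN away_team_api_id
         ELSE 'draw' END AS who_won,
    CASE WHEN home_team_goal > away_team_goal THEN away_team_api_id
         WHEN away_team_goal > home_team_goal THEN home_team_api_id
         ELSE 'draw' END AS who_lose
FROM Match;
""")

# One row per match: winner id and loser id, or 'draw' in both columns.
results = conn.execute(
    "SELECT who_won, who_lose FROM win_loss ORDER BY home_team_api_id"
).fetchall()
print(results)

# Wins per team, leaving out the drawn matches.
wins = conn.execute("""
    SELECT who_won, COUNT(*) AS Wins
    FROM win_loss
    WHERE who_won <> 'draw'
    GROUP BY who_won
    ORDER BY who_won
""").fetchall()
print(wins)
```

Drawn matches carry the text `'draw'` in both columns, so they never match a `team_api_id` in a later inner join and are left out of win and loss counts.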



**Creating a view with the ranking of teams according to their victories**

`Columns:` team_api_id, League, Team, Wins, ranking

In [16]:
%%sql
DROP VIEW IF EXISTS victories_ranking;

CREATE VIEW victories_ranking AS
SELECT t.team_api_id, l.name AS League, t.team_long_name AS Team, COUNT(w.who_won) AS Wins, 
RANK()
    OVER(ORDER BY COUNT(w.who_won) DESC) AS ranking
FROM win_loss w
    JOIN best_leagues l ON l.id = w.league_id
    JOIN best_teams t ON t.team_api_id = w.who_won
GROUP BY w.who_won
ORDER BY Wins DESC;

 * sqlite:///database.sqlite
Done.
Done.


[]

**Creating a view with the ranking of teams according to their losses**

`Columns:` team_api_id, League, Team, Losses, ranking

In [17]:
%%sql
DROP VIEW IF EXISTS losses_ranking;

CREATE VIEW losses_ranking AS
SELECT t.team_api_id, l.name AS League, t.team_long_name AS Team, COUNT(w.who_lose) AS Losses,
RANK()
    OVER(ORDER BY COUNT(w.who_lose) DESC) AS ranking
FROM win_loss w
    JOIN best_leagues l ON l.id = w.league_id
    JOIN best_teams t ON t.team_api_id = w.who_won
GROUP BY w.who_lose
ORDER BY Losses DESC;

 * sqlite:///database.sqlite
Done.
Done.


[]

<a id='Teams with more Wins'></a>
## Ranking of the teams with the most victories

![](./images/top_clubs_2.jpg)
[Image taken from Trollfootball](https://www.trollfootball.me/videos/view/current-uefa-coefficient-ranking-for-top-10-clubs)

### **Top 10 teams with the most victories from the best leagues**

In [18]:
%%sql
SELECT * FROM victories_ranking LIMIT 10;

 * sqlite:///database.sqlite
Done.


team_api_id,League,Team,Wins,ranking
8634,La Liga,FC Barcelona,35568,1
8633,La Liga,Real Madrid CF,34656,2
10260,Premier League,Manchester United,29184,3
9885,Serie A,Juventus,28539,4
8455,Premier League,Chelsea,26752,5
9847,League 1,Paris Saint-Germain,26600,6
8456,Premier League,Manchester City,26600,6
9823,Bundesliga,FC Bayern Munich,26248,8
9825,Premier League,Arsenal,25840,9
9906,La Liga,Atlético Madrid,25384,10


<a id='Teams with more Losses'></a>
### Ranking of the teams with the most losses

**Top 10 teams with the most losses from the best leagues**

In [19]:
%%sql
SELECT * FROM losses_ranking LIMIT 10;

 * sqlite:///database.sqlite
Done.


team_api_id,League,Team,Losses,ranking
7878,La Liga,Granada CF,17366,1
7943,Serie A,Sassuolo,17159,2
8197,Premier League,Leicester City,16606,3
8191,Premier League,Burnley,16606,3
7878,La Liga,Granada CF,16492,5
7943,Serie A,Sassuolo,15563,6
8524,Serie A,Atalanta,15334,7
8165,Bundesliga,1. FC Nürnberg,15147,8
7878,La Liga,Granada CF,15105,9
7943,Serie A,Sassuolo,15039,10
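
The counts in both rankings are much larger than the number of league matches a club can play in the eight seasons covered, and the losses ranking lists the same club more than once. One possible explanation, stated here as an assumption because `best_teams` is defined outside this section, is that the team table holds several rows per club, so every match is counted once per row, and that the team table is joined on a different column than the one used for grouping. The toy example below (invented tables and values) reproduces the inflation and shows one way to get one exact row per club: aggregate first, then join to a de-duplicated team list on the same column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE win_loss (who_won INTEGER, who_lose INTEGER);
-- Invented results: team 1 beats team 2 twice, team 2 beats team 3 once.
INSERT INTO win_loss VALUES (1, 2), (1, 2), (2, 3);

-- Hypothetical team table with three rows per club
-- (for example, one per attribute snapshot).
CREATE TABLE teams (team_api_id INTEGER, team_long_name TEXT);
INSERT INTO teams VALUES
    (1, 'A'), (1, 'A'), (1, 'A'),
    (2, 'B'), (2, 'B'), (2, 'B'),
    (3, 'C'), (3, 'C'), (3, 'C');
""")

# Joining before counting repeats every match once per duplicate team row.
inflated = conn.execute("""
    SELECT t.team_long_name, COUNT(w.who_lose) AS Losses
    FROM win_loss w
        JOIN teams t ON t.team_api_id = w.who_lose
    GROUP BY w.who_lose
    ORDER BY w.who_lose
""").fetchall()

# Counting first, then joining to distinct teams on the grouped column,
# gives one row per club with the true number of losses.
exact = conn.execute("""
    SELECT t.team_long_name, l.Losses
    FROM (SELECT who_lose, COUNT(*) AS Losses
          FROM win_loss GROUP BY who_lose) l
        JOIN (SELECT DISTINCT team_api_id, team_long_name FROM teams) t
            ON t.team_api_id = l.who_lose
    ORDER BY l.who_lose
""").fetchall()

print("inflated:", inflated)
print("exact:   ", exact)
```

Since the inflation factor depends on how many duplicate rows each club has, inflated counts are not guaranteed to preserve the ordering of the true counts.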


<a id='Attributes Analysis'></a>

## Differences in attributes between teams with the most wins and the most losses

### **Top 5 Teams per League with the most losses**


In [14]:
%%sql
-- The CTE counts losses, so it is named accordingly
WITH losses AS (
SELECT l.name AS League, t.team_long_name AS Team, COUNT(w.who_lose) AS Loss,
        a.buildUpPlaySpeedClass AS speed_class, a.buildUpPlayDribblingClass AS dribbling_class,
        a.buildUpPlayPassingClass AS passing_class, a.chanceCreationPassingClass AS creation_passing_class,
        a.chanceCreationShootingClass AS creation_shooting_class, a.defencePressureClass AS pressure_class,
        a.defenceAggressionClass AS agression_class, a.defenceTeamWidthClass AS team_width_class,
        a.defenceDefenderLineClass AS defender_line_class, a.buildUpPlayPositioningClass AS possition_class
    
FROM win_loss w
    JOIN best_leagues l ON l.id = w.league_id
    JOIN best_teams t ON t.team_api_id = w.who_lose
    JOIN Team_Attributes a ON a.team_api_id = t.team_api_id
GROUP BY w.who_lose
ORDER BY Loss DESC
)

-- ORDER BY inside each branch makes the top 5 explicit, as in the wins query
SELECT * FROM (SELECT * FROM losses WHERE League = 'Bundesliga' ORDER BY Loss DESC LIMIT 5)
UNION ALL
SELECT * FROM (SELECT * FROM losses WHERE League = 'League 1' ORDER BY Loss DESC LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM losses WHERE League = 'La Liga' ORDER BY Loss DESC LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM losses WHERE League = 'Premier League' ORDER BY Loss DESC LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM losses WHERE League = 'Serie A' ORDER BY Loss DESC LIMIT 5 )

 * sqlite:///database.sqlite
Done.


League,Team,Loss,speed_class,dribbling_class,passing_class,creation_passing_class,creation_shooting_class,pressure_class,agression_class,team_width_class,defender_line_class,possition_class
Bundesliga,Hannover 96,101184,Balanced,Little,Long,Normal,Normal,High,Double,Wide,Cover,Organised
Bundesliga,Hamburger SV,92208,Balanced,Little,Mixed,Normal,Normal,Medium,Press,Normal,Cover,Organised
Bundesliga,VfB Stuttgart,91392,Balanced,Little,Mixed,Normal,Normal,Medium,Press,Normal,Cover,Organised
Bundesliga,TSG 1899 Hoffenheim,88944,Balanced,Little,Mixed,Normal,Normal,Medium,Press,Normal,Cover,Organised
Bundesliga,SV Werder Bremen,87312,Balanced,Little,Mixed,Normal,Normal,Medium,Press,Wide,Cover,Organised
League 1,OGC Nice,106704,Balanced,Little,Mixed,Normal,Normal,Deep,Press,Normal,Cover,Organised
League 1,FC Lorient,105792,Balanced,Little,Long,Safe,Normal,High,Press,Wide,Offside Trap,Organised
League 1,Toulouse FC,101232,Balanced,Little,Long,Normal,Normal,Deep,Press,Narrow,Cover,Organised
League 1,Stade Rennais FC,93024,Balanced,Little,Mixed,Normal,Normal,Medium,Press,Normal,Cover,Organised
League 1,AS Saint-Étienne,92112,Balanced,Little,Mixed,Normal,Normal,Deep,Contain,Narrow,Offside Trap,Organised


### **Top 5 Teams per League with the most wins**


In [15]:
%%sql
WITH wins AS (
SELECT l.name AS League, t.team_long_name AS Team, COUNT(w.who_won) AS Win,
        a.buildUpPlaySpeedClass AS speed_class, a.buildUpPlayDribblingClass AS dribbling_class,
        a.buildUpPlayPassingClass AS passing_class, a.chanceCreationPassingClass AS creation_passing_class,
        a.chanceCreationShootingCLass AS creation_shooting_class, a.defencePressureClass AS pressure_class,
        a.defenceAggressionClass AS agression_class, a.defenceTeamWidthClass AS team_width_class,
        a.defenceDefenderLineClass AS defender_line_class, a.buildUpPlayPositioningClass AS possition_class
FROM win_loss w
    JOIN best_leagues l ON l.id = w.league_id
    JOIN best_teams t ON t.team_api_id = w.who_won
    JOIN Team_Attributes a ON a.team_api_id = t.team_api_id
GROUP BY w.who_won
ORDER BY Win DESC
)

SELECT * FROM (SELECT * FROM wins WHERE League = 'Bundesliga' ORDER BY Win DESC LIMIT 5)
UNION ALL
SELECT * FROM (SELECT * FROM wins WHERE League = 'League 1' ORDER BY Win DESC LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM wins WHERE League = 'La Liga' ORDER BY Win DESC LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM wins WHERE League = 'Premier League' ORDER BY Win DESC LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM wins WHERE League = 'Serie A' ORDER BY Win DESC LIMIT 5 )

 * sqlite:///database.sqlite
Done.


League,Team,Win,speed_class,dribbling_class,passing_class,creation_passing_class,creation_shooting_class,pressure_class,agression_class,team_width_class,defender_line_class,possition_class
Bundesliga,FC Bayern Munich,157488,Balanced,Little,Mixed,Normal,Lots,High,Press,Normal,Cover,Free Form
Bundesliga,Borussia Dortmund,128112,Fast,Little,Mixed,Normal,Lots,Medium,Double,Normal,Cover,Organised
Bundesliga,Bayer 04 Leverkusen,111792,Balanced,Little,Mixed,Normal,Normal,Medium,Press,Normal,Cover,Organised
Bundesliga,FC Schalke 04,103632,Balanced,Little,Mixed,Normal,Lots,High,Press,Wide,Cover,Organised
Bundesliga,VfL Wolfsburg,95472,Balanced,Little,Mixed,Normal,Lots,Medium,Press,Normal,Cover,Organised
League 1,Paris Saint-Germain,159600,Balanced,Little,Mixed,Normal,Normal,High,Double,Wide,Offside Trap,Organised
League 1,Olympique Lyonnais,139536,Balanced,Little,Mixed,Normal,Lots,Medium,Press,Normal,Cover,Organised
League 1,LOSC Lille,134064,Balanced,Little,Mixed,Normal,Normal,Medium,Press,Normal,Cover,Organised
League 1,Olympique de Marseille,130416,Balanced,Little,Mixed,Normal,Lots,Medium,Press,Normal,Cover,Organised
League 1,Girondins de Bordeaux,114912,Balanced,Little,Mixed,Normal,Normal,Medium,Press,Normal,Cover,Organised


<a id='Summary Attributes Analysis'></a>
## Summary: Teams with more wins and losses

Surprisingly, there are not many differences in playstyle between the teams with the most wins and the teams with the most losses, so I will need to look at other parameters. However, there are some differences to take into account:

### Similarities

* `Speed class`: Mostly Balanced - Allows the team to defend, attack, and control the midfield effectively. It depends mostly on the experience and quality of the players.

* `Dribbling class`: Mostly Little - Reflects the creativity of players in 1 on 1 situations.

* `Creation Passing Class`: Mostly Normal - Describes the amount of risk in passing decisions. Normal is neither Risky nor Safe, more like a combination of both. A Risky class drags players out of position, which sometimes leaves the team susceptible to counterattacks; on the other hand, it can be effective against high pressure because of the space the team can exploit.

* `Pressure Class`: A mix of Medium, High, and Deep - Defines how high up the pitch the team pressures the opponent.

* `Aggression Class`: Mostly Press - Together with pressure, it can have the most impact on the game.

* `Team Width Class`: Mostly Normal, with some Wide - Defines how the team positions itself on the pitch when defending. It has nothing to do with how the team attacks.

* `Defender Line Class`: Mostly Cover

### Differences

* `Creation Shooting Class`: Teams with the most losses are mostly in the "Normal" category. Teams with the most wins, somewhat predictably, are mostly in the "Lots" category, meaning a higher creation shooting class. In this class the players position themselves more on the edge of the box to shoot, while a low shooting class means the players run further into the box before shooting. This does not always determine whether a team wins more; sometimes it simply reflects the playstyle of the team. For example, Barcelona usually has a low shooting class and is one of the teams with the most wins.


* `Passing class`: Teams with the most losses tend to have a "Long" passing class, whereas teams with the most wins fall mostly in the "Mixed" and "Short" categories. Teams in the "Mixed" category use a combination of short and long passing. With short passing, the players come closer to offer a pass. With long passing, they stay further away to start their runs for the passes. The latter has some effect on the defensive position: when players stay further away from each other, they leave space for the opposition to counterattack.


* `Positioning Class`: Even though "Organised" predominates in both groups, the "Free Form" category appears among the teams with the most wins. "Organised" is the safer option for the team: the central defenders keep their position better, and the midfielders are better positioned defensively. In "Free Form", however, the teams are more creative; the central defenders are more dynamic and make more runs, but they, together with the midfielders, can get caught out of position more easily and are more prone to counterattacks, relying mostly on talent and teamwork to defend certain game situations.


**Note**: Even though this comparison brought out some differences and particularities in the playstyles that can lead teams to more victories, a different approach may give better and more concise conclusions.

<a id='Attributes vs Team Victory'></a>

## Attributes vs Team Victory

![](./images/victory.png)
[Image taken from 90min](https://www.90min.com/posts/the-20-most-valuable-football-clubs-in-europe-ranked-01e9fv5d9yvy)

In this section, I compare how the different attribute classes relate to victories, using a different approach than before. Previously, I visually compared the teams with the most victories and their attributes, trying to find a relationship between victory and attribute class. However, that approach can be heavily biased for some attributes. Teams with the most victories are usually the teams with the most economic power, in other words, where the best players are. So individual talent sometimes makes the difference in winning a game, regardless of the strategy the team adopts. Other groups of teams are more similar in economic power and player quality, and for those teams playstyle may play a bigger role in achieving victories. With the following approach, I try to reveal that relationship by grouping teams by the category of each attribute class and comparing the average number of wins and losses of the teams in each category.

####  Speed Class

In [28]:
%%sql
SELECT a.buildUpPlaySpeedClass AS 'Speed Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
    ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
        FROM Team_Attributes a
            INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
            INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
        GROUP BY a.buildUpPlaySpeedClass
        ORDER BY "Rate of Wins vs Losses" DESC;



 * sqlite:///database.sqlite
Done.


Speed Class,Average of Wins,Average of Losses,Rate of Wins vs Losses
Fast,15154.0,6224.0,2.43
Slow,11151.0,7360.0,1.52
Balanced,6335.0,8657.0,0.73


#### Dribbling Class

In [32]:
%%sql
SELECT a.buildUpPlayDribblingClass AS 'Dribbling Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
    ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
        FROM Team_Attributes a
            INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
            INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
        GROUP BY a.buildUpPlayDribblingClass
        ORDER BY "Rate of Wins vs Losses" DESC;

 * sqlite:///database.sqlite
Done.


Dribbling Class,Average of Wins,Average of Losses,Rate of Wins vs Losses
Normal,7868.0,8263.0,0.95
Little,7437.0,8334.0,0.89
Lots,2109.0,10227.0,0.21


#### Passing Class

In [31]:
%%sql
SELECT a.buildUpPlayPassingClass AS 'Passing Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
    ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
        FROM Team_Attributes a
            INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
            INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
        GROUP BY a.buildUpPlayPassingClass
        ORDER BY "Rate of Wins vs Losses" DESC;

 * sqlite:///database.sqlite
Done.


Passing Class,Average of Wins,Average of Losses,Rate of Wins vs Losses
Short,11448.0,6552.0,1.75
Mixed,7564.0,8362.0,0.9
Long,2861.0,9389.0,0.3


#### Positioning Class

In [33]:
%%sql
SELECT a.buildUpPlayPositioningClass AS 'Positioning Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
    ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
        FROM Team_Attributes a
            INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
            INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
        GROUP BY a.buildUpPlayPositioningClass
        ORDER BY "Rate of Wins vs Losses" DESC;

 * sqlite:///database.sqlite
Done.


Positioning Class,Average of Wins,Average of Losses,Rate of Wins vs Losses
Free Form,22723.0,5731.0,3.96
Organised,6684.0,8476.0,0.79


#### Creation Passing Class

In [34]:
%%sql
SELECT a.chanceCreationPassingClass AS 'Creation Passing Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
    ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
        FROM Team_Attributes a
            INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
            INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
        GROUP BY a.chanceCreationPassingClass
        ORDER BY "Rate of Wins vs Losses" DESC;

 * sqlite:///database.sqlite
Done.


Creation Passing Class,Average of Wins,Average of Losses,Rate of Wins vs Losses
Normal,7678.0,8209.0,0.94
Risky,6838.0,8988.0,0.76
Safe,5313.0,8427.0,0.63


#### Creation Crossing Class

In [35]:
%%sql
SELECT a.chanceCreationCrossingClass AS 'Creation Crossing Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
    ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
        FROM Team_Attributes a
            INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
            INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
        GROUP BY a.chanceCreationCrossingClass
        ORDER BY "Rate of Wins vs Losses" DESC;

 * sqlite:///database.sqlite
Done.


Creation Crossing Class,Average of Wins,Average of Losses,Rate of Wins vs Losses
Normal,7587.0,8125.0,0.93
Lots,7312.0,9226.0,0.79
Little,6294.0,9115.0,0.69


#### Creation Shooting Class

In [36]:
%%sql
SELECT a.chanceCreationShootingClass AS 'Creation Shooting Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
    ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
        FROM Team_Attributes a
            INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
            INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
        GROUP BY a.chanceCreationShootingClass
        ORDER BY "Rate of Wins vs Losses" DESC;

 * sqlite:///database.sqlite
Done.


Creation Shooting Class,Average of Wins,Average of Losses,Rate of Wins vs Losses
Little,12413.0,2780.0,4.46
Lots,13209.0,7484.0,1.77
Normal,6584.0,8491.0,0.78


#### Creation Positioning Class

In [37]:
%%sql
SELECT a.chanceCreationPositioningClass AS 'Creation Positioning Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
    ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
        FROM Team_Attributes a
            INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
            INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
        GROUP BY a.chanceCreationPositioningClass
        ORDER BY "Rate of Wins vs Losses" DESC;

 * sqlite:///database.sqlite
Done.


Creation Positioning Class,Average of Wins,Average of Losses,Rate of Wins vs Losses
Free Form,19970.0,6328.0,3.16
Organised,6476.0,8501.0,0.76


#### Defence Pressure Class

In [38]:
%%sql
SELECT a.defencePressureClass AS 'Defence Pressure Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
    ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
        FROM Team_Attributes a
            INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
            INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
        GROUP BY a.defencePressureClass
        ORDER BY "Rate of Wins vs Losses" DESC;

 * sqlite:///database.sqlite
Done.


Defence Pressure Class,Average of Wins,Average of Losses,Rate of Wins vs Losses
Medium,7575.0,8340.0,0.91
Deep,7710.0,8608.0,0.9
High,4922.0,7324.0,0.67


#### Defence Aggression Class

In [39]:
%%sql
SELECT a.defenceAggressionClass AS 'Defence Aggression Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
    ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
        FROM Team_Attributes a
            INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
            INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
        GROUP BY a.defenceAggressionClass
        ORDER BY "Rate of Wins vs Losses" DESC;

 * sqlite:///database.sqlite
Done.


Defence Aggression Class,Average of Wins,Average of Losses,Rate of Wins vs Losses
Double,7882.0,7560.0,1.04
Press,7588.0,8354.0,0.91
Contain,5373.0,9026.0,0.6


#### Defence Team Width Class

In [40]:
%%sql
SELECT a.defenceTeamWidthClass AS 'Defence Team Width Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
    ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
        FROM Team_Attributes a
            INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
            INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
        GROUP BY a.defenceTeamWidthClass
        ORDER BY "Rate of Wins vs Losses" DESC;

 * sqlite:///database.sqlite
Done.


Defence Team Width Class,Average of Wins,Average of Losses,Rate of Wins vs Losses
Normal,7854.0,8150.0,0.96
Wide,6143.0,9347.0,0.66
Narrow,2028.0,9872.0,0.21


#### Defence Line Class

In [41]:
%%sql
SELECT a.defenceDefenderLineClass AS 'Defence Line Class',ROUND(AVG(v.Wins)) AS 'Average of Wins',
    ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
        FROM Team_Attributes a
            INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
            INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
        GROUP BY a.defenceDefenderLineClass
        ORDER BY "Rate of Wins vs Losses" DESC;

 * sqlite:///database.sqlite
Done.


Defence Line Class,Average of Wins,Average of Losses,Rate of Wins vs Losses
Offside Trap,7757.0,8279.0,0.94
Cover,7486.0,8339.0,0.9
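
The queries in this section differ only in the class column they group by, so they can be generated from one template. The sketch below is a hypothetical helper (`win_loss_rate_by` is not part of the notebook) run against small invented tables that stand in for `Team_Attributes`, `victories_ranking`, and `losses_ranking`; the rate is `AVG(Wins) / AVG(Losses)`, as in the cells above.

```python
import sqlite3

# Class columns that may be interpolated into the SQL text (whitelist).
CLASS_COLUMNS = {"buildUpPlaySpeedClass", "buildUpPlayPassingClass"}


def win_loss_rate_by(conn, class_column):
    """Average wins, average losses and their ratio for each category."""
    if class_column not in CLASS_COLUMNS:
        raise ValueError(f"unknown class column: {class_column}")
    sql = f"""
        SELECT a.{class_column},
               ROUND(AVG(v.Wins)), ROUND(AVG(l.Losses)),
               ROUND(AVG(v.Wins) / AVG(l.Losses), 2) AS rate
        FROM Team_Attributes a
            INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
            INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
        GROUP BY a.{class_column}
        ORDER BY rate DESC
    """
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Invented stand-ins for the notebook's table and views.
CREATE TABLE Team_Attributes (team_api_id INTEGER,
    buildUpPlaySpeedClass TEXT, buildUpPlayPassingClass TEXT);
CREATE TABLE victories_ranking (team_api_id INTEGER, Wins INTEGER);
CREATE TABLE losses_ranking (team_api_id INTEGER, Losses INTEGER);
INSERT INTO Team_Attributes VALUES
    (1, 'Fast', 'Short'), (2, 'Balanced', 'Mixed'), (3, 'Balanced', 'Short');
INSERT INTO victories_ranking VALUES (1, 20), (2, 10), (3, 30);
INSERT INTO losses_ranking VALUES (1, 10), (2, 20), (3, 20);
""")

by_speed = win_loss_rate_by(conn, "buildUpPlaySpeedClass")
by_passing = win_loss_rate_by(conn, "buildUpPlayPassingClass")
print(by_speed)
print(by_passing)
```

The column name is checked against a whitelist because identifiers cannot be passed as bound parameters; extending `CLASS_COLUMNS` with the remaining class columns would cover the other attributes.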


<a id='Attributes vs Team Victory Summary'></a>

## Summary of Attributes per Victory

* `Play Speed Class`: Teams in the "Fast" category achieved the highest rate of wins over losses. Surprisingly, teams in the "Slow" category obtained a higher rate than those in the "Balanced" category. This contradicts the previous analysis, which showed that among the 25 teams with the most victories (top 5 for each of the 5 leagues), 24 had a "Balanced" category and only 1 had a "Fast" category. So, for the most powerful teams (economically speaking, as I discussed in the introduction of this section), the speed of play did not have the impact on victories that it may have had for other teams.

* `Dribbling Class`: The rate for the "Lots" category is surprisingly low: teams in this category had, on average, about five times as many losses as victories (a rate of 0.21). In other words, teams whose players show high creativity in 1 on 1 situations had a higher probability of losing. 

* `Passing Class`: The "Short" category had the highest rate of wins vs losses. "Long" had the lowest. 

* `Positioning Class`: "Free Form" has a considerably higher rate (3.96, against 0.79 for "Organised"). This reaffirms what was discussed previously: some of the teams with the most wins play with a "Free Form" positioning class.

* `Creation Passing Class`: This class doesn't seem to have much impact on the wins and losses of a team. 

* `Creation Crossing Class`: This class doesn't seem to have much impact on the wins and losses of a team. 

* `Creation Shooting Class`: Unlike in the previous analysis, here the "Little" category has about 4.5 times as many victories as losses (a rate of 4.46). So, teams with this type of strategy have a higher probability of achieving victory. 

* `Defence Pressure Class`: Doesn't seem to carry any weight in the outcome on its own.

* `Defence Aggression Class`: Doesn't seem to have a high impact on victory by itself. Teams in the "Contain" category seem to have the lowest chance of victory.

* `Defence Team Width Class`: Teams in the "Narrow" category seem to have the lowest probability of achieving victories.

* `Defence Line Class`: Neither category seems to have an impact on victory by itself. Both categories have a similar rate of wins vs losses.

**Final Notes:**

With this analysis, I confirm that attributes such as `Creation Shooting Class`, `Passing Class`, and `Positioning Class` each have an impact, by themselves, on a team's chances of victory. The `Speed Class` attribute also has an impact on this probability. On the other hand, the "Lots" and "Narrow" categories of the `Dribbling Class` and `Defence Team Width Class` attributes have a negative impact on a team's probability of achieving victory.

<a id='Combination of Attributes vs Team Victory'></a>

## Combination of Attributes vs Victory

![](./images/victory2.png)
[Image taken from Hindustan Times](https://www.hindustantimes.com/football/record-breaker-cristiano-ronaldo-lauded-for-most-beautiful-goal/story-MFs5SB0d8cs8k7ideZ5ZnN.html)

In this section, I take the attributes most relevant to achieving victory and look for the combination of categories with the highest probability of victory.



In [44]:
%%sql
SELECT a.buildUpPlaySpeedClass AS 'Speed Class', a.buildUpPlayDribblingClass AS 'Dribbling Class',
    a.buildUpPlayPassingClass AS 'Passing Class', a.buildUpPlayPositioningClass AS 'Positioning Class',
    a.chanceCreationShootingClass AS 'Creation Shooting Class', a.defenceTeamWidthClass AS 'Defence Team Width Class',
    ROUND(AVG(v.Wins)) AS 'Average of Wins', ROUND(AVG(l.Losses)) AS "Average of Losses", 
    ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses", RANK() OVER(ORDER BY ROUND(AVG(v.Wins)/AVG(l.Losses),2) DESC) AS "Ranking"
    
        FROM Team_Attributes a
            INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
            INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
            
        GROUP BY a.buildUpPlaySpeedClass, a.buildUpPlayDribblingClass, a.buildUpPlayPassingClass, a.buildUpPlayPositioningClass,
                a.chanceCreationShootingClass, a.defenceTeamWidthClass
        ORDER BY "Rate of Wins vs Losses" DESC;


 * sqlite:///database.sqlite
Done.


Speed Class,Dribbling Class,Passing Class,Positioning Class,Creation Shooting Class,Defence Team Width Class,Average of Wins,Average of Losses,Rate of Wins vs Losses,Ranking
Balanced,Normal,Short,Free Form,Little,Normal,26600.0,2546.0,10.45,1
Balanced,Normal,Short,Organised,Normal,Normal,26600.0,2546.0,10.45,1
Fast,Little,Short,Organised,Normal,Normal,11832.0,2023.0,5.85,3
Slow,Little,Mixed,Organised,Normal,Normal,13060.0,2304.0,5.67,4
Slow,Little,Mixed,Organised,Lots,Normal,14288.0,2584.0,5.53,5
Fast,Little,Mixed,Organised,Lots,Normal,17379.0,3292.0,5.28,6
Balanced,Little,Mixed,Free Form,Lots,Normal,26752.0,5994.0,4.46,7
Fast,Little,Mixed,Free Form,Lots,Normal,26752.0,5994.0,4.46,7
Fast,Normal,Mixed,Organised,Lots,Normal,26752.0,5994.0,4.46,7
Fast,Little,Short,Organised,Lots,Wide,18632.0,5389.0,3.46,10


#### Best possible combination



In [46]:
%%sql
SELECT a.buildUpPlaySpeedClass AS 'Speed Class', a.buildUpPlayDribblingClass AS 'Dribbling Class',
    a.buildUpPlayPassingClass AS 'Passing Class', a.buildUpPlayPositioningClass AS 'Positioning Class',
    a.chanceCreationPassingClass AS 'Creation Passing Class', a.chanceCreationCrossingClass AS 'Creation Crossing Class',
    a.chanceCreationShootingClass AS 'Creation Shooting Class', a.defencePressureClass AS 'Defence Pressure',
    a.defenceAggressionClass AS 'Defence Aggression Class', a.defenceTeamWidthClass AS 'Defence Team Width Class',
    a.defenceDefenderLineClass AS 'Defender Line Class',
    ROUND(AVG(v.Wins)) AS 'Average of Wins', ROUND(AVG(l.Losses)) AS "Average of Losses", 
    ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses",
    RANK() OVER(ORDER BY ROUND(AVG(v.Wins)/AVG(l.Losses),2) DESC) AS "Ranking"
    
        FROM Team_Attributes a
            INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
            INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
            
        GROUP BY a.buildUpPlaySpeedClass, a.buildUpPlayDribblingClass, a.buildUpPlayPassingClass, a.buildUpPlayPositioningClass,
                a.chanceCreationPassingClass, a.chanceCreationCrossingClass, a.chanceCreationShootingClass,
                a.defencePressureClass, a.defenceAggressionClass, a.defenceTeamWidthClass, a.defenceDefenderLineClass
        ORDER BY "Rate of Wins vs Losses" DESC
        LIMIT 10;

 * sqlite:///database.sqlite
Done.


| Speed Class | Dribbling Class | Passing Class | Positioning Class | Creation Passing Class | Creation Crossing Class | Creation Shooting Class | Defence Pressure | Defence Aggression Class | Defence Team Width Class | Defender Line Class | Average of Wins | Average of Losses | Rate of Wins vs Losses | Ranking |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Balanced | Normal | Short | Free Form | Normal | Normal | Little | Medium | Press | Normal | Cover | 26600.0 | 2546.0 | 10.45 | 1 |
| Balanced | Normal | Short | Organised | Safe | Normal | Normal | Medium | Press | Normal | Cover | 26600.0 | 2546.0 | 10.45 | 1 |
| Fast | Little | Mixed | Organised | Normal | Lots | Lots | Medium | Press | Normal | Cover | 26600.0 | 2546.0 | 10.45 | 1 |
| Fast | Little | Mixed | Organised | Risky | Normal | Lots | Medium | Press | Normal | Cover | 26600.0 | 2546.0 | 10.45 | 1 |
| Fast | Little | Mixed | Organised | Risky | Normal | Lots | High | Press | Normal | Cover | 11832.0 | 2023.0 | 5.85 | 5 |
| Fast | Little | Short | Organised | Risky | Normal | Normal | Medium | Double | Normal | Offside Trap | 11832.0 | 2023.0 | 5.85 | 5 |
| Fast | Normal | Mixed | Organised | Normal | Normal | Normal | Medium | Press | Normal | Cover | 11832.0 | 2023.0 | 5.85 | 5 |
| Fast | Normal | Mixed | Organised | Risky | Normal | Normal | Medium | Press | Normal | Cover | 11832.0 | 2023.0 | 5.85 | 5 |
| Slow | Little | Mixed | Organised | Normal | Normal | Normal | Medium | Press | Normal | Cover | 11832.0 | 2023.0 | 5.85 | 5 |
| Balanced | Little | Mixed | Organised | Normal | Normal | Normal | Medium | Press | Normal | Cover | 14288.0 | 2584.0 | 5.53 | 10 |


#### Teams which employ the best attributes

In [49]:
%%sql
SELECT v.* FROM victories_ranking v 
    INNER JOIN Team_Attributes a ON a.team_api_id = v.team_api_id
    WHERE a.buildUpPlaySpeedClass = 'Balanced' AND a.buildUpPlayDribblingClass = 'Normal' AND
    a.buildUpPlayPassingClass = 'Short' AND (a.buildUpPlayPositioningClass = 'Free Form' OR a.buildUpPlayPositioningClass = 'Organised') AND
    (a.chanceCreationShootingClass = 'Little' OR a.chanceCreationShootingClass = 'Normal') AND a.defenceTeamWidthClass = 'Normal';

 * sqlite:///database.sqlite
Done.


| team_api_id | League | Team | Wins | ranking |
|---|---|---|---|---|
| 9825 | Premier League | Arsenal | 25840 | 9 |
| 9825 | Premier League | Arsenal | 25840 | 9 |
| 8634 | La Liga | FC Barcelona | 35568 | 1 |
| 9823 | Bundesliga | FC Bayern Munich | 26248 | 8 |
| 8535 | Serie A | Fiorentina | 19800 | 23 |
| 8456 | Premier League | Manchester City | 26600 | 6 |
| 8456 | Premier League | Manchester City | 26600 | 6 |
| 10167 | Serie A | Parma | 8362 | 65 |


In [50]:
%%sql
SELECT COUNT(*) FROM best_teams;

 * sqlite:///database.sqlite
Done.


| COUNT(*) |
|---|
| 14585 |


<a id='Combination of Attributes vs Team Victory'></a>

## Analysis of Results: Combined Attributes vs Victory

* Combining the most influential attributes yields win-to-loss rates of up to 10.45 for the best combination and as low as 0 for the worst, reaffirming the influence on team victory found in the previous analysis.


* Notice that adding the remaining attributes does not raise the top win-to-loss rate, which stays at 10.45. This suggests that, beyond the most influential attributes, the others carry little weight in a team's chances of victory.


* There are 720 possible combinations of the 6 most influential attributes above, and `best_teams` contains 14585 rows.


* There are 6 teams that use the combinations with the highest win-to-loss rate. All of them are in the top 65 teams by number of victories, and 4 are in the top 10, including the number 1 team: Barcelona.
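
A figure such as the number of possible combinations can be cross-checked by multiplying the number of distinct classes found in each attribute column, and then comparing it with the number of combinations that actually occur. The following is a minimal sketch of that check using Python's standard `sqlite3` module. The table `team_attrs`, its two columns, and its rows are made up for illustration and stand in for the real `Team_Attributes` columns.

```python
import sqlite3
from math import prod

conn = sqlite3.connect(":memory:")
# Toy stand-in for Team_Attributes with only two class columns; rows are made up.
conn.execute("CREATE TABLE team_attrs (speed_class TEXT, passing_class TEXT)")
conn.executemany(
    "INSERT INTO team_attrs VALUES (?, ?)",
    [("Slow", "Short"), ("Fast", "Short"), ("Balanced", "Mixed"), ("Fast", "Long")],
)

# Fixed list of column names (hypothetical); never taken from user input.
columns = ["speed_class", "passing_class"]

# Number of distinct classes per attribute column.
distinct_counts = [
    conn.execute(f"SELECT COUNT(DISTINCT {col}) FROM team_attrs").fetchone()[0]
    for col in columns
]

# Upper bound on combinations: the product of the per-column class counts.
possible = prod(distinct_counts)

# Combinations that actually occur in the data.
observed = conn.execute(
    "SELECT COUNT(*) FROM "
    "(SELECT DISTINCT speed_class, passing_class FROM team_attrs)"
).fetchone()[0]

print(distinct_counts, possible, observed)
```

In this toy data each column has 3 classes, so 9 combinations are possible, but only 4 of them appear in the rows.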

<a id='Best Players'></a>

# Best Players

![](./images/players3.jpg)
[Image taken from defensacentral](https://www.defensacentral.com/imagen/1506617155-premios-uefa)

### Rating of the best players

### Creating a ranking of players according to their overall rating


In [82]:
%%sql
DROP VIEW IF EXISTS player_ranking;

CREATE VIEW player_ranking AS 
SELECT p.player_api_id, p.player_name AS 'Player', ROUND(AVG(a.overall_rating),2) AS 'Rating', 
RANK() OVER(ORDER BY ROUND(AVG(a.overall_rating),2) DESC) AS 'Ranking'
    FROM Player p 
        INNER JOIN Player_attributes a ON p.player_api_id = a.player_api_id
    GROUP BY p.player_api_id, p.player_name
    ORDER BY ROUND(AVG(a.overall_rating),2) DESC;

 * sqlite:///database.sqlite
Done.
Done.


[]

In [83]:
%%sql
SELECT * FROM player_ranking LIMIT 10;

 * sqlite:///database.sqlite
Done.


| player_api_id | Player | Rating | Ranking |
|---|---|---|---|
| 30981 | Lionel Messi | 92.19 | 1 |
| 30893 | Cristiano Ronaldo | 91.28 | 2 |
| 30924 | Franck Ribery | 88.46 | 3 |
| 30955 | Andres Iniesta | 88.32 | 4 |
| 35724 | Zlatan Ibrahimovic | 88.29 | 5 |
| 30834 | Arjen Robben | 87.84 | 6 |
| 39854 | Xavi Hernandez | 87.64 | 7 |
| 30829 | Wayne Rooney | 87.22 | 8 |
| 30657 | Iker Casillas | 86.95 | 9 |
| 30894 | Philipp Lahm | 86.73 | 10 |


### Players with some of the best attributes 

In [92]:
%%sql
SELECT 'Player with higher potential' AS 'Attribute', r.Player AS 'Player Name', 
    r.Ranking AS 'Ranking of Rating', a.potential AS 'Value'
    FROM player_ranking r
    JOIN Player_Attributes a ON a.player_api_id = r.player_api_id
    WHERE a.potential = (SELECT MAX(potential) FROM Player_Attributes)
    GROUP BY r.Player
    
UNION

SELECT 'Player with best heading_accuracy' AS 'Attribute', r.Player AS 'Player Name', 
    r.Ranking AS 'Ranking of Rating', a.heading_accuracy AS 'Value'
    FROM player_ranking r
    JOIN Player_Attributes a ON a.player_api_id = r.player_api_id
    WHERE a.heading_accuracy = (SELECT MAX(heading_accuracy) FROM Player_Attributes)
    GROUP BY r.Player
    
UNION

SELECT 'Player with best dribbling' AS 'Attribute', r.Player AS 'Player Name', 
    r.Ranking AS 'Ranking of Rating', a.dribbling AS 'Value'
    FROM player_ranking r
    JOIN Player_Attributes a ON a.player_api_id = r.player_api_id
    WHERE a.dribbling = (SELECT MAX(dribbling) FROM Player_Attributes)
    GROUP BY r.Player
    
UNION

SELECT 'Player with best ball control' AS 'Attribute', r.Player AS 'Player Name', 
    r.Ranking AS 'Ranking of Rating', a.ball_control AS 'Value'
    FROM player_ranking r
    JOIN Player_Attributes a ON a.player_api_id = r.player_api_id
    WHERE a.ball_control = (SELECT MAX(ball_control) FROM Player_Attributes)
    GROUP BY r.Player
    
UNION

SELECT 'Player with higher free kick accuracy' AS 'Attribute', r.Player AS 'Player Name', 
    r.Ranking AS 'Ranking of Rating', a.free_kick_accuracy AS 'Value'
    FROM player_ranking r
    JOIN Player_Attributes a ON a.player_api_id = r.player_api_id
    WHERE a.free_kick_accuracy = (SELECT MAX(free_kick_accuracy) FROM Player_Attributes)
    GROUP BY r.Player
    
UNION

SELECT 'Player with more interceptions' AS 'Attribute', r.Player AS 'Player Name', 
    r.Ranking AS 'Ranking of Rating', a.interceptions AS 'Value'
    FROM player_ranking r
    JOIN Player_Attributes a ON a.player_api_id = r.player_api_id
    WHERE a.interceptions = (SELECT MAX(interceptions) FROM Player_Attributes)
    GROUP BY r.Player
    
UNION

SELECT 'Player with more penalties' AS 'Attribute', r.Player AS 'Player Name', 
    r.Ranking AS 'Ranking of Rating', a.penalties AS 'Value'
    FROM player_ranking r
    JOIN Player_Attributes a ON a.player_api_id = r.player_api_id
    WHERE a.penalties = (SELECT MAX(penalties) FROM Player_Attributes)
    GROUP BY r.Player    
    
UNION

SELECT 'Player with more sliding tackles' AS 'Attribute', r.Player AS 'Player Name', 
    r.Ranking AS 'Ranking of Rating', a.sliding_tackle AS 'Value'
    FROM player_ranking r
    JOIN Player_Attributes a ON a.player_api_id = r.player_api_id
    WHERE a.sliding_tackle = (SELECT MAX(sliding_tackle) FROM Player_Attributes)
    GROUP BY r.Player  

 * sqlite:///database.sqlite
Done.


| Attribute | Player Name | Ranking of Rating | Value |
|---|---|---|---|
| Player with best ball control | Lionel Messi | 1 | 97 |
| Player with best ball control | Ronaldinho | 70 | 97 |
| Player with best dribbling | Cristiano Ronaldo | 2 | 97 |
| Player with best dribbling | Lionel Messi | 1 | 97 |
| Player with best dribbling | Ronaldinho | 70 | 97 |
| Player with best heading_accuracy | Nikola Zigic | 2911 | 98 |
| Player with higher free kick accuracy | Juninho Pernambucano,20 | 429 | 97 |
| Player with higher potential | Lionel Messi | 1 | 97 |
| Player with more interceptions | Andrea Pirlo | 29 | 96 |
| Player with more interceptions | Timmy Simons | 1132 | 96 |

<a id='Best Players Summary'></a>

### **Analysis of Results for players**

As with team attributes, some attributes influence a player's rating more than others. Offensive players also tend to get higher ratings than players in other positions because of the influence they have on their teams' victories. Other features depend on the team's style of play.

Due to the limitations of SQLite, a more thorough analysis is difficult. An analysis similar to the one in the teams section would repeat the same procedure and would require very time-consuming queries because of the schema of this particular database.
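
One way to reduce that repetition is to generate the per-attribute queries from the host language rather than writing one `UNION` branch per attribute. The following is a minimal sketch of the idea using Python's standard `sqlite3` module, not the notebook's `%%sql` magic. The table `player_attrs`, the helper `top_players`, and all rows are hypothetical and only illustrate the pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy stand-in for Player_Attributes joined with player names; values are made up.
conn.execute(
    "CREATE TABLE player_attrs (player TEXT, dribbling INTEGER, penalties INTEGER)"
)
conn.executemany(
    "INSERT INTO player_attrs VALUES (?, ?, ?)",
    [("P1", 90, 70), ("P2", 97, 80), ("P3", 97, 95), ("P1", 85, 72)],
)

# Whitelist of column names. Identifiers cannot be bound as SQL parameters,
# so they are only ever taken from this fixed list.
ATTRIBUTES = ["dribbling", "penalties"]


def top_players(conn, attribute):
    """Return the distinct players whose value equals the column maximum."""
    if attribute not in ATTRIBUTES:
        raise ValueError(f"unexpected attribute: {attribute}")
    query = f"""
        SELECT DISTINCT player, {attribute}
        FROM player_attrs
        WHERE {attribute} = (SELECT MAX({attribute}) FROM player_attrs)
        ORDER BY player
    """
    return conn.execute(query).fetchall()


# One query per attribute, built from the same template.
results = {attr: top_players(conn, attr) for attr in ATTRIBUTES}
for attr, rows in results.items():
    print(attr, rows)
```

Adding another attribute then only requires extending the `ATTRIBUTES` list.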

<a id='Analysis of Results'></a>
# 4. Analysis of Results

Throughout the analysis of the football database I have used SQLite features and techniques such as views, window functions, nested queries, and common table expressions, among others. The results have been analyzed at the end of each section. A quick summary follows.

* Team attributes do not always decide whether a team wins. The hint is that not all of the teams with the most wins use the same playstyle or strategy.

* Some attributes carry more weight in a team's victories than others. Of the roughly 13 attribute classes, 6 showed a higher impact on achieving victory.

* Attributes were combined to analyze the win-to-loss rate that each combination brought to its teams. The best combination by this criterion gave a rate of over 10 victories per defeat. 4 of the 6 teams that used this strategy were in the top 10 teams by number of victories, including the number 1 team: Barcelona.

* After analyzing the best players by overall rating, the results showed that all of these players belong to the top teams in their leagues. So, as stated throughout the analysis, beyond the playstyle factor, victory also comes from the individual talent of a team's players, and that individual talent enables the best playstyle for winning a game. As in every sport, the economically strongest teams are, most of the time, the ones with the best players.

<a id='Limitations'></a>
# 5. Limitations

* SQLite has no stored procedures or variables and no built-in statistical functions such as standard deviation, which limits the analysis of the data and makes the queries repetitive and excessively long in many cases.

* The schema of the database was a bit messy. I was not able to connect players with their teams, and to join some tables I had to go through many intermediary tables.

* Some values are missing.

* Many features are taken from the FIFA video game, which makes them unreliable for realistic analysis and conclusions.

* Valuable information about players is missing, such as position, goals, assists, recoveries, and minutes played, which could have made the analysis of the players richer.

* The team attributes table has many columns with no clear purpose.
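
The missing standard deviation can be worked around from the host language. Python's standard `sqlite3` module lets a connection register a user-defined aggregate with `create_aggregate`. The following is a minimal sketch under that assumption. It uses `sqlite3` directly and not the notebook's `%%sql` magic, and the class `SampleStdDev`, the SQL name `stddev`, and the `ratings` table with its values are all made up for illustration.

```python
import math
import sqlite3


class SampleStdDev:
    """Sample standard deviation as a SQLite aggregate (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def step(self, value):
        if value is None:  # skip NULLs, as the built-in aggregates do
            return
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)

    def finalize(self):
        if self.n < 2:
            return None  # undefined for fewer than two values
        return math.sqrt(self.m2 / (self.n - 1))


conn = sqlite3.connect(":memory:")
# Register the aggregate under the (hypothetical) SQL name "stddev".
conn.create_aggregate("stddev", 1, SampleStdDev)

conn.execute("CREATE TABLE ratings (overall_rating INTEGER)")
conn.executemany(
    "INSERT INTO ratings VALUES (?)",
    [(2,), (4,), (4,), (4,), (5,), (5,), (7,), (9,), (None,)],
)

spread = conn.execute("SELECT stddev(overall_rating) FROM ratings").fetchone()[0]
print(round(spread, 3))
```

The aggregate only exists on the connection that registered it, so it has to be registered again after every reconnect.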

<a id='Future Ideas'></a>
# 6. Future Ideas

* Analyze the performance of the teams through the seasons and find the ones who improved the most and the ones whose performance worsened over the seasons.

* Develop a machine learning model that predicts whether a team wins or loses a game based on the playstyle and strategy of both teams.

* Add more data related to the players to enrich the analysis of each player.
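
As a starting point for the first idea, the season-over-season change could be computed with the `LAG()` window function. The following is a minimal sketch using Python's standard `sqlite3` module. It assumes a per-season summary table that does not exist in the database: `season_wins`, its columns, and its rows are hypothetical and would first have to be derived from `Match`. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical per-season summary table; names and numbers are made up.
conn.execute("CREATE TABLE season_wins (team TEXT, season TEXT, wins INTEGER)")
conn.executemany(
    "INSERT INTO season_wins VALUES (?, ?, ?)",
    [
        ("Team A", "2008/2009", 20),
        ("Team A", "2009/2010", 25),
        ("Team B", "2008/2009", 18),
        ("Team B", "2009/2010", 12),
    ],
)

rows = conn.execute(
    """
    SELECT team,
           season,
           wins,
           -- Difference from the same team's previous season; NULL for the first one.
           wins - LAG(wins) OVER (PARTITION BY team ORDER BY season) AS change
    FROM season_wins
    ORDER BY team, season
    """
).fetchall()

for row in rows:
    print(row)
```

Season labels in the `2008/2009` format sort correctly as text, which is why a plain `ORDER BY season` is enough here.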