This notebook demonstrates advanced SQL techniques like CASE statements, subqueries, WITH clauses, and window functions through analysis of the European Soccer Database. The soccer database contains team, match, and player data for top European leagues from 2008-2016.

After importing the necessary libraries, such as pandas and sqlite3, the notebook builds SQL queries to explore the database. It applies conditional logic with CASE statements, nested subqueries, common table expressions defined with the WITH clause, and analytic functions such as ranking.

The goals are to:
- gain insights into the soccer data, and
- demonstrate effective use of advanced SQL constructs for complex analysis.

The notebook serves both as a practical guide for learning SQL and as a showcase of these techniques on a real-world sports dataset.
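As a quick preview of two of the constructs named above, here is a minimal sketch that combines a CASE expression with a WITH clause. It assumes a made-up in-memory table, `toy_match`, whose goal columns mirror those of the Match table; the rows and the `demo` connection are hypothetical and are not taken from the soccer database.

```python
import sqlite3

# Hypothetical in-memory table: the column names mirror Match,
# but the rows are invented purely for illustration.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE toy_match ("
    "id INTEGER PRIMARY KEY, home_team_goal INTEGER, away_team_goal INTEGER)"
)
demo.executemany(
    "INSERT INTO toy_match (home_team_goal, away_team_goal) VALUES (?, ?)",
    [(2, 1), (0, 0), (1, 3), (4, 2)],
)

# The WITH clause names an intermediate result (labelled);
# CASE turns each scoreline into a categorical outcome.
outcome_counts = demo.execute("""
    WITH labelled AS (
        SELECT CASE
                   WHEN home_team_goal > away_team_goal THEN 'home win'
                   WHEN home_team_goal < away_team_goal THEN 'away win'
                   ELSE 'draw'
               END AS outcome
        FROM toy_match
    )
    SELECT outcome, COUNT(*) AS n_matches
    FROM labelled
    GROUP BY outcome
    ORDER BY outcome;
""").fetchall()

print(outcome_counts)
```

The same pattern should carry over to the real Match table by swapping the table name, since only the two goal columns are used.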

In [1]:
import numpy as np
import pandas as pd
import sqlite3
import matplotlib.pyplot as plt

conn = sqlite3.connect('/kaggle/input/soccer/database.sqlite')

In [2]:
tables = pd.read_sql("""SELECT *
                        FROM sqlite_master
                        WHERE type='table';""", conn)
tables


Unnamed: 0,type,name,tbl_name,rootpage,sql
0,table,sqlite_sequence,sqlite_sequence,4,"CREATE TABLE sqlite_sequence(name,seq)"
1,table,Player_Attributes,Player_Attributes,11,"CREATE TABLE ""Player_Attributes"" (\n\t`id`\tIN..."
2,table,Player,Player,14,CREATE TABLE `Player` (\n\t`id`\tINTEGER PRIMA...
3,table,Match,Match,18,CREATE TABLE `Match` (\n\t`id`\tINTEGER PRIMAR...
4,table,League,League,24,CREATE TABLE `League` (\n\t`id`\tINTEGER PRIMA...
5,table,Country,Country,26,CREATE TABLE `Country` (\n\t`id`\tINTEGER PRIM...
6,table,Team,Team,29,"CREATE TABLE ""Team"" (\n\t`id`\tINTEGER PRIMARY..."
7,table,Team_Attributes,Team_Attributes,2,CREATE TABLE `Team_Attributes` (\n\t`id`\tINTE...


In [3]:
match_table = pd.read_sql("""SELECT *
                        FROM Match;""", conn)
match_table


Unnamed: 0,id,country_id,league_id,season,stage,date,match_api_id,home_team_api_id,away_team_api_id,home_team_goal,...,SJA,VCH,VCD,VCA,GBH,GBD,GBA,BSH,BSD,BSA
0,1,1,1,2008/2009,1,2008-08-17 00:00:00,492473,9987,9993,1,...,4.00,1.65,3.40,4.50,1.78,3.25,4.00,1.73,3.40,4.20
1,2,1,1,2008/2009,1,2008-08-16 00:00:00,492474,10000,9994,0,...,3.80,2.00,3.25,3.25,1.85,3.25,3.75,1.91,3.25,3.60
2,3,1,1,2008/2009,1,2008-08-16 00:00:00,492475,9984,8635,0,...,2.50,2.35,3.25,2.65,2.50,3.20,2.50,2.30,3.20,2.75
3,4,1,1,2008/2009,1,2008-08-17 00:00:00,492476,9991,9998,5,...,7.50,1.45,3.75,6.50,1.50,3.75,5.50,1.44,3.75,6.50
4,5,1,1,2008/2009,1,2008-08-16 00:00:00,492477,7947,9985,1,...,1.73,4.50,3.40,1.65,4.50,3.50,1.65,4.75,3.30,1.67
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
25974,25975,24558,24558,2015/2016,9,2015-09-22 00:00:00,1992091,10190,10191,1,...,,,,,,,,,,
25975,25976,24558,24558,2015/2016,9,2015-09-23 00:00:00,1992092,9824,10199,1,...,,,,,,,,,,
25976,25977,24558,24558,2015/2016,9,2015-09-23 00:00:00,1992093,9956,10179,2,...,,,,,,,,,,
25977,25978,24558,24558,2015/2016,9,2015-09-22 00:00:00,1992094,7896,10243,0,...,,,,,,,,,,


In [4]:
detailed_matches = pd.read_sql("""
SELECT Match.id,
       Country.name AS country_name,
       League.name AS league_name,
       season,
       date,
       HT.team_long_name AS home_team,
       AT.team_long_name AS away_team,
       home_team_goal,
       away_team_goal
FROM Match
LEFT JOIN Country ON Country.id = Match.country_id
LEFT JOIN League ON League.id = Match.league_id
-- Team is joined twice: once for the home side, once for the away side
LEFT JOIN Team AS HT ON HT.team_api_id = Match.home_team_api_id
LEFT JOIN Team AS AT ON AT.team_api_id = Match.away_team_api_id
-- Filter on the source column; using a SELECT alias in WHERE is a SQLite extension
WHERE Country.name = 'England'
  AND season = '2015/2016'
ORDER BY date
LIMIT 10;""", conn)
detailed_matches

Unnamed: 0,id,country_name,league_name,season,date,home_team,away_team,home_team_goal,away_team_goal
0,4390,England,England Premier League,2015/2016,2015-08-08 00:00:00,Bournemouth,Aston Villa,0,1
1,4391,England,England Premier League,2015/2016,2015-08-08 00:00:00,Chelsea,Swansea City,2,2
2,4392,England,England Premier League,2015/2016,2015-08-08 00:00:00,Everton,Watford,2,2
3,4393,England,England Premier League,2015/2016,2015-08-08 00:00:00,Leicester City,Sunderland,4,2
4,4394,England,England Premier League,2015/2016,2015-08-08 00:00:00,Manchester United,Tottenham Hotspur,1,0
5,4396,England,England Premier League,2015/2016,2015-08-08 00:00:00,Norwich City,Crystal Palace,1,3
6,4389,England,England Premier League,2015/2016,2015-08-09 00:00:00,Arsenal,West Ham United,0,2
7,4395,England,England Premier League,2015/2016,2015-08-09 00:00:00,Newcastle United,Southampton,2,2
8,4397,England,England Premier League,2015/2016,2015-08-09 00:00:00,Stoke City,Liverpool,0,1
9,4398,England,England Premier League,2015/2016,2015-08-10 00:00:00,West Bromwich Albion,Manchester City,0,3


Let's think like a football fan!

I want to see which matches were played in a particular league during a particular season, and I probably want to see the scores as well.

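To get the desired result, we have to join the Match table with the Country table, then with the League table, and then with the Team table. The Team table is joined twice, once for the home team and once for the away team.

Some fields in different tables share the same name (for example, name in both Country and League), so they are renamed with AS in the query.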
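The join chain and the AS renaming can be tried in isolation on a tiny stand-in schema. The sketch below is an assumption-laden miniature: the tables keep only the columns needed for the joins, and the country, league, and team names ('Toyland', 'Red FC', and so on) are invented. It uses only the standard-library sqlite3 module.

```python
import sqlite3

# Hypothetical miniature of the schema; all rows are invented.
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE Country (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE League (id INTEGER PRIMARY KEY, country_id INTEGER, name TEXT);
    CREATE TABLE Team (id INTEGER PRIMARY KEY, team_api_id INTEGER,
                       team_long_name TEXT);
    CREATE TABLE "Match" (id INTEGER PRIMARY KEY, country_id INTEGER,
                          league_id INTEGER, home_team_api_id INTEGER,
                          away_team_api_id INTEGER);
    INSERT INTO Country VALUES (1, 'Toyland');
    INSERT INTO League VALUES (1, 1, 'Toyland First Division');
    INSERT INTO Team VALUES (1, 100, 'Red FC'), (2, 200, 'Blue United');
    INSERT INTO "Match" VALUES (1, 1, 1, 100, 200);
""")

# Country.name and League.name would both come back as "name" without AS.
# Team is joined twice under two aliases so each side gets its own name.
cur = demo.execute("""
    SELECT m.id,
           c.name AS country_name,
           l.name AS league_name,
           ht.team_long_name AS home_team,
           awt.team_long_name AS away_team
    FROM "Match" AS m
    LEFT JOIN Country AS c ON c.id = m.country_id
    LEFT JOIN League AS l ON l.id = m.league_id
    LEFT JOIN Team AS ht ON ht.team_api_id = m.home_team_api_id
    LEFT JOIN Team AS awt ON awt.team_api_id = m.away_team_api_id;
""")
columns = [d[0] for d in cur.description]  # column names after renaming
rows = cur.fetchall()

print(columns)
print(rows)
```

Dropping the two AS clauses on the name columns would leave two result columns both called name, which is the ambiguity the renaming avoids.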
