## 2021: Week 28 - It's Coming Rome

'55 years of hurt, Never stopped me dreaming!'

It was another night of pain for England fans on Sunday evening when England lost yet another penalty shootout, this time in the European Championship final. This seems to have been a common outcome in the tournaments England have taken part in over the years, but does the data agree?

The challenge this week is to analyse all of the penalty shootouts in the World Cup and European Championships (Euros) since 1976.

### Input
The data comes from Wikipedia (World Cup & Euros) and is split across two sheets.

### Requirements
- Input Data
- Determine what competition each penalty was taken in
- Clean any fields, correctly format the date the penalty was taken, & group the two German countries (eg, West Germany & Germany)
- Rank the countries on the following: 
    - Shootout win % (exclude teams who have never won a shootout)
    - Penalties scored %
- What is the most and least successful time to take a penalty? (What penalty number are you most likely to score or miss?)
- Output the Data

### Outputs
3 Outputs:
1. Win % Rankings (5 fields, 26 rows)
2. Scored % Rankings (5 fields, 34 rows)
3. Penalty Position Rankings (6 fields, 9 rows)

In [128]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

### Input Data

In [129]:
data = pd.read_excel("./data/InternationalPenalties.xlsx", sheet_name=["WorldCup", "Euros"])

In [130]:
world_cup = data["WorldCup"].copy()
euros = data["Euros"].copy()

### Determine what competition each penalty was taken in

In [131]:
world_cup.columns

Index(['No.', 'Penalty Number ', 'Event Year', 'Winner', 'Full Time Score',
       'Loser', 'Winning Team GK', 'Winning team Taker', 'Losing team Taker',
       'Losing Team GK', 'Round', 'Date'],
      dtype='object')

In [132]:
world_cup["Penalty Number "].value_counts()

1    30
2    30
3    30
4    30
5    24
6     2
Name: Penalty Number , dtype: int64

In [133]:
world_cup.head()

Unnamed: 0,No.,Penalty Number,Event Year,Winner,Full Time Score,Loser,Winning Team GK,Winning team Taker,Losing team Taker,Losing Team GK,Round,Date
0,1,1,1982,West Germany,3–3,France,Schumacher,Kaltz Penalty scored,Penalty scored Giresse,Ettori,Semi-finals,2021-07-08
1,1,2,1982,West Germany,3–3,France,Schumacher,Breitner Penalty scored,Penalty scored Amoros,Ettori,Semi-finals,2021-07-08
2,1,3,1982,West Germany,3–3,France,Schumacher,Stielike Penalty missed,Penalty scored Rocheteau,Ettori,Semi-finals,2021-07-08
3,1,4,1982,West Germany,3–3,France,Schumacher,Littbarski Penalty scored,Penalty missed Six,Ettori,Semi-finals,2021-07-08
4,1,5,1982,West Germany,3–3,France,Schumacher,Rummenigge Penalty scored,Penalty scored Platini,Ettori,Semi-finals,2021-07-08


In [134]:
euros.columns

Index(['No.', 'Penalty Number', 'Event Year', 'Winner', 'Full Time Score',
       'Loser', 'Winning team GK', 'Winning team Taker', 'Losing team Taker',
       'Losing team GK', 'Round', 'Date'],
      dtype='object')
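
The two sheets do not share identical headers: the World Cup sheet has `'Penalty Number '` with a trailing space and `'Winning Team GK'` with a capital T, while the Euros sheet has `'Penalty Number'` and `'Winning team GK'`. A minimal sketch of one way to align them is shown below. The helper `normalise_columns` and the empty demo frames are hypothetical and not part of the original notebook, and lower-casing is only one possible convention (later cells here use the original mixed-case names).

```python
import pandas as pd

# Hypothetical empty frames that mimic the header differences between the sheets.
world_cup_demo = pd.DataFrame(
    columns=["No.", "Penalty Number ", "Winning Team GK", "Losing Team GK"]
)
euros_demo = pd.DataFrame(
    columns=["No.", "Penalty Number", "Winning team GK", "Losing team GK"]
)


def normalise_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose column names are stripped and lower-cased."""
    out = df.copy()
    out.columns = out.columns.str.strip().str.lower()
    return out


world_cup_clean = normalise_columns(world_cup_demo)
euros_clean = normalise_columns(euros_demo)

# After normalisation both frames expose the same headers in the same order.
same_headers = list(world_cup_clean.columns) == list(euros_clean.columns)
print(same_headers)
```

With matching headers, the two sheets can be stacked without pandas creating duplicate, half-empty columns.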

In [135]:
euros["Penalty Number"].value_counts()

1    22
2    22
3    22
4    22
5    19
6     6
7     3
8     2
9     2
Name: Penalty Number, dtype: int64
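
The requirement to determine the competition for each penalty can be met by tagging each sheet before stacking them. The sketch below assumes the column headers already match; the `Competition` column name, the labels, and the two-row demo frames are illustrative assumptions rather than the notebook's own data.

```python
import pandas as pd

# Hypothetical two-row stand-ins for the two sheets.
world_cup_demo = pd.DataFrame(
    {"No.": [1, 1], "Penalty Number": [1, 2], "Winner": ["West Germany"] * 2}
)
euros_demo = pd.DataFrame(
    {"No.": [1, 1], "Penalty Number": [1, 2], "Winner": ["Czechoslovakia"] * 2}
)

# Map a competition label to each frame, then add it as a column while stacking.
sheets = {"World Cup": world_cup_demo, "Euros": euros_demo}
penalties = pd.concat(
    [df.assign(Competition=name) for name, df in sheets.items()],
    ignore_index=True,
)
print(penalties)
```

Keeping the label as a column, rather than in two separate frames, lets every later cleaning step run once on the combined data.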

In [136]:
# Split each taker string on whitespace, giving one column per token
winner_penalty = world_cup["Winning team Taker"].str.split().apply(pd.Series)

In [137]:
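# Tokens 1 and 2 hold the outcome only when the taker's name is a single word
# (e.g. "Kaltz Penalty scored")
winner_penalty["Winner Penalty type"] = winner_penalty[1] + " " + winner_penalty[2]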

In [138]:
winner_penalty = winner_penalty.drop([0, 1, 2, 3, 4, 5], axis=1)
winner_penalty

Unnamed: 0,Winner Penalty type
0,Penalty scored
1,Penalty scored
2,Penalty missed
3,Penalty scored
4,Penalty scored
...,...
141,Penalty scored
142,Penalty missed
143,Penalty scored
144,Penalty scored
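
Picking the outcome by token position works for rows like `Kaltz Penalty scored`, but it relies on the taker's name being exactly one token. A regular expression that looks for the outcome text itself does not depend on name length. The sketch below is an alternative, not the notebook's method; the `takers` series is demo data and `First Last` is a hypothetical multi-word name.

```python
import pandas as pd

takers = pd.Series(
    [
        "Kaltz Penalty scored",       # single-word name, as in the rows above
        "Stielike Penalty missed",
        "First Last Penalty scored",  # hypothetical multi-word name
        None,                         # no kick taken in this round
    ]
)

pattern = r"Penalty (?:scored|missed)"

# Capture the outcome wherever it appears in the string; missing rows stay NaN.
outcome = takers.str.extract(f"({pattern})", expand=False)

# Whatever remains after removing the outcome is the taker's name.
name = takers.str.replace(pattern, "", regex=True).str.strip()

# Boolean flag that keeps missing kicks as NaN instead of forcing them to False.
scored = outcome.map({"Penalty scored": True, "Penalty missed": False})

print(pd.DataFrame({"name": name, "outcome": outcome, "scored": scored}))
```

The same pattern applies to the losing side, where the outcome precedes the name.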


In [139]:
world_cup = world_cup.join(winner_penalty, how="left")
world_cup

Unnamed: 0,No.,Penalty Number,Event Year,Winner,Full Time Score,Loser,Winning Team GK,Winning team Taker,Losing team Taker,Losing Team GK,Round,Date,Winner Penalty type
0,1,1,1982,West Germany,3–3,France,Schumacher,Kaltz Penalty scored,Penalty scored Giresse,Ettori,Semi-finals,2021-07-08,Penalty scored
1,1,2,1982,West Germany,3–3,France,Schumacher,Breitner Penalty scored,Penalty scored Amoros,Ettori,Semi-finals,2021-07-08,Penalty scored
2,1,3,1982,West Germany,3–3,France,Schumacher,Stielike Penalty missed,Penalty scored Rocheteau,Ettori,Semi-finals,2021-07-08,Penalty missed
3,1,4,1982,West Germany,3–3,France,Schumacher,Littbarski Penalty scored,Penalty missed Six,Ettori,Semi-finals,2021-07-08,Penalty scored
4,1,5,1982,West Germany,3–3,France,Schumacher,Rummenigge Penalty scored,Penalty scored Platini,Ettori,Semi-finals,2021-07-08,Penalty scored
...,...,...,...,...,...,...,...,...,...,...,...,...,...
141,30,1,2018,Croatia,2–2,Russia,Subašić,Brozović Penalty scored,Penalty missed Smolov,Akinfeev,Quarter-finals,2021-07-07,Penalty scored
142,30,2,2018,Croatia,2–2,Russia,Subašić,Kovačić Penalty missed,Penalty scored Dzagoev,Akinfeev,Quarter-finals,2021-07-07,Penalty missed
143,30,3,2018,Croatia,2–2,Russia,Subašić,Modrić Penalty scored,Penalty missed Fernandes,Akinfeev,Quarter-finals,2021-07-07,Penalty scored
144,30,4,2018,Croatia,2–2,Russia,Subašić,Vida Penalty scored,Penalty scored Ignashevich,Akinfeev,Quarter-finals,2021-07-07,Penalty scored


In [140]:
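# On the losing side the outcome comes first (e.g. "Penalty scored Giresse"),
# so tokens 0 and 1 hold it regardless of how long the name is
loser_penalty = world_cup["Losing team Taker"].str.split().apply(pd.Series)
loser_penalty["Loser Penalty type"] = loser_penalty[0] + " " + loser_penalty[1]
# Drop the raw token columns, keeping only the combined outcome
loser_penalty = loser_penalty.drop([0, 1, 2, 3, 4], axis=1)
loser_penalty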

Unnamed: 0,Loser Penalty type
0,Penalty scored
1,Penalty scored
2,Penalty scored
3,Penalty missed
4,Penalty scored
...,...
141,Penalty missed
142,Penalty scored
143,Penalty missed
144,Penalty scored


In [141]:
world_cup = world_cup.join(loser_penalty, how="left")
world_cup

Unnamed: 0,No.,Penalty Number,Event Year,Winner,Full Time Score,Loser,Winning Team GK,Winning team Taker,Losing team Taker,Losing Team GK,Round,Date,Winner Penalty type,Loser Penalty type
0,1,1,1982,West Germany,3–3,France,Schumacher,Kaltz Penalty scored,Penalty scored Giresse,Ettori,Semi-finals,2021-07-08,Penalty scored,Penalty scored
1,1,2,1982,West Germany,3–3,France,Schumacher,Breitner Penalty scored,Penalty scored Amoros,Ettori,Semi-finals,2021-07-08,Penalty scored,Penalty scored
2,1,3,1982,West Germany,3–3,France,Schumacher,Stielike Penalty missed,Penalty scored Rocheteau,Ettori,Semi-finals,2021-07-08,Penalty missed,Penalty scored
3,1,4,1982,West Germany,3–3,France,Schumacher,Littbarski Penalty scored,Penalty missed Six,Ettori,Semi-finals,2021-07-08,Penalty scored,Penalty missed
4,1,5,1982,West Germany,3–3,France,Schumacher,Rummenigge Penalty scored,Penalty scored Platini,Ettori,Semi-finals,2021-07-08,Penalty scored,Penalty scored
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
141,30,1,2018,Croatia,2–2,Russia,Subašić,Brozović Penalty scored,Penalty missed Smolov,Akinfeev,Quarter-finals,2021-07-07,Penalty scored,Penalty missed
142,30,2,2018,Croatia,2–2,Russia,Subašić,Kovačić Penalty missed,Penalty scored Dzagoev,Akinfeev,Quarter-finals,2021-07-07,Penalty missed,Penalty scored
143,30,3,2018,Croatia,2–2,Russia,Subašić,Modrić Penalty scored,Penalty missed Fernandes,Akinfeev,Quarter-finals,2021-07-07,Penalty scored,Penalty missed
144,30,4,2018,Croatia,2–2,Russia,Subašić,Vida Penalty scored,Penalty scored Ignashevich,Akinfeev,Quarter-finals,2021-07-07,Penalty scored,Penalty scored
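
Both ranking requirements (scored % per country and per penalty number) are easier to compute when each row describes one kick rather than one round with two kicks. A possible reshape is sketched below, using rows 2 and 3 of the table above as demo data; the `Team`, `Outcome` and `Scored` column names are assumptions introduced for illustration.

```python
import pandas as pd

# Two rounds copied from the 1982 semi-final rows shown above.
wide = pd.DataFrame(
    {
        "No.": [1, 1],
        "Penalty Number": [3, 4],
        "Winner": ["West Germany", "West Germany"],
        "Loser": ["France", "France"],
        "Winner Penalty type": ["Penalty missed", "Penalty scored"],
        "Loser Penalty type": ["Penalty scored", "Penalty missed"],
    }
)

keys = ["No.", "Penalty Number"]

# Build one frame per side with a shared schema, then stack them.
winner_side = wide[keys + ["Winner", "Winner Penalty type"]].rename(
    columns={"Winner": "Team", "Winner Penalty type": "Outcome"}
)
loser_side = wide[keys + ["Loser", "Loser Penalty type"]].rename(
    columns={"Loser": "Team", "Loser Penalty type": "Outcome"}
)
long = pd.concat([winner_side, loser_side], ignore_index=True)

# Rounds where one side did not need to shoot have no outcome; drop those rows.
long = long.dropna(subset=["Outcome"])
long["Scored"] = long["Outcome"].eq("Penalty scored")

# Percentage of kicks scored per team.
scored_pct = long.groupby("Team")["Scored"].mean().mul(100)
print(scored_pct)
```

Grouping the same long frame by `Penalty Number` instead of `Team` gives the scored % per position in the shootout.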


### Clean any fields, correctly format the date the penalty was taken, & group the two German countries (eg, West Germany & Germany)

In [142]:
world_cup.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 146 entries, 0 to 145
Data columns (total 14 columns):
 #   Column               Non-Null Count  Dtype         
---  ------               --------------  -----         
 0   No.                  146 non-null    int64         
 1   Penalty Number       146 non-null    int64         
 2   Event Year           146 non-null    object        
 3   Winner               146 non-null    object        
 4   Full Time Score      146 non-null    object        
 5   Loser                146 non-null    object        
 6   Winning Team GK      146 non-null    object        
 7   Winning team Taker   141 non-null    object        
 8   Losing team Taker    138 non-null    object        
 9   Losing Team GK       146 non-null    object        
 10  Round                146 non-null    object        
 11  Date                 146 non-null    datetime64[ns]
 12  Winner Penalty type  141 non-null    object        
 13  Loser Penalty type   138 non-null  

In [143]:
# Strip any commas from the year text, then keep only the integer year
world_cup["Event Year"] = pd.to_datetime(world_cup["Event Year"].str.replace(",", "")).dt.year
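
The `Date` values in the outputs above carry the year 2021 even for the 1982 tournament, which suggests only the day and month are reliable. One way to format the date correctly is to rebuild it from `Event Year` plus the day and month of `Date`. This is a sketch on a two-row demo frame with values copied from the tables above, and it assumes `Event Year` is already an integer.

```python
import pandas as pd

demo = pd.DataFrame(
    {
        "Event Year": [1982, 2018],
        "Date": pd.to_datetime(["2021-07-08", "2021-07-07"]),
    }
)

# pd.to_datetime can assemble dates from a frame with year/month/day columns.
demo["Date"] = pd.to_datetime(
    pd.DataFrame(
        {
            "year": demo["Event Year"],
            "month": demo["Date"].dt.month,
            "day": demo["Date"].dt.day,
        }
    )
)
print(demo)
```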

In [162]:
# Trim stray whitespace so team names compare equal across rows
world_cup["Winner"] = world_cup["Winner"].str.strip()
world_cup["Loser"] = world_cup["Loser"].str.strip()

In [168]:
world_cup.loc[world_cup["Winner"] == "West Germany", "Winner"] = "Germany"
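
The line above only rewrites the `Winner` column. If the same grouping is needed on both team columns (for example on the Euros sheet), a mapping applied to both at once avoids repeating the `.loc` assignment. This is a sketch: `team_aliases` is a hypothetical name, and the second demo row is invented to show the `Loser` case.

```python
import pandas as pd

demo = pd.DataFrame(
    {
        "Winner": ["West Germany", "Croatia"],
        "Loser": ["France", "West Germany"],  # hypothetical losing row
    }
)

# Old name -> grouped name; further historical names could be added here.
team_aliases = {"West Germany": "Germany"}

team_cols = ["Winner", "Loser"]
demo[team_cols] = demo[team_cols].replace(team_aliases)
print(demo)
```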

In [170]:
world_cup[world_cup["Loser"] == "West Germany"]

Unnamed: 0,No.,Penalty Number,Event Year,Winner,Full Time Score,Loser,Winning Team GK,Winning team Taker,Losing team Taker,Losing Team GK,Round,Date,Winner Penalty type,Loser Penalty type


In [169]:
world_cup[world_cup["Winner"] == "Germany"]

Unnamed: 0,No.,Penalty Number,Event Year,Winner,Full Time Score,Loser,Winning Team GK,Winning team Taker,Losing team Taker,Losing Team GK,Round,Date,Winner Penalty type,Loser Penalty type
0,1,1,1982,Germany,3–3,France,Schumacher,Kaltz Penalty scored,Penalty scored Giresse,Ettori,Semi-finals,2021-07-08,Penalty scored,Penalty scored
1,1,2,1982,Germany,3–3,France,Schumacher,Breitner Penalty scored,Penalty scored Amoros,Ettori,Semi-finals,2021-07-08,Penalty scored,Penalty scored
2,1,3,1982,Germany,3–3,France,Schumacher,Stielike Penalty missed,Penalty scored Rocheteau,Ettori,Semi-finals,2021-07-08,Penalty missed,Penalty scored
3,1,4,1982,Germany,3–3,France,Schumacher,Littbarski Penalty scored,Penalty missed Six,Ettori,Semi-finals,2021-07-08,Penalty scored,Penalty missed
4,1,5,1982,Germany,3–3,France,Schumacher,Rummenigge Penalty scored,Penalty scored Platini,Ettori,Semi-finals,2021-07-08,Penalty scored,Penalty scored
5,1,6,1982,Germany,3–3,France,Schumacher,Hrubesch Penalty scored,Penalty missed Bossis,Ettori,Semi-finals,2021-07-08,Penalty scored,Penalty missed
11,3,1,1986,Germany,0–0,Mexico,Schumacher,Allofs Penalty scored,Penalty scored Negrete,Larios,Quarter-finals,2021-06-21,Penalty scored,Penalty scored
12,3,2,1986,Germany,0–0,Mexico,Schumacher,Brehme Penalty scored,Penalty missed Quirarte,Larios,Quarter-finals,2021-06-21,Penalty scored,Penalty missed
13,3,3,1986,Germany,0–0,Mexico,Schumacher,Matthäus Penalty scored,Penalty missed Servín,Larios,Quarter-finals,2021-06-21,Penalty scored,Penalty missed
14,3,4,1986,Germany,0–0,Mexico,Schumacher,Littbarski Penalty scored,,Larios,Quarter-finals,2021-06-21,Penalty scored,
