# Who will be Crowned the Best Player?

![](https://img.fifa.com/image/upload/t_l4/trtfnhjhunj6o0tgh1by.jpg)

### The players selected for analysis are those who feature in all five seasons (2016-2020). The winner is decided across six categories:
1. **Goals**
2. **Minutes Per On-Target Shot**
3. **Minutes Per Goal**
4. **Goals-xG**
5. **Target Accuracy**
6. **Shots Efficiency**

**Disclaimer:** The best player here is determined over the whole 2016-2020 period, not on the basis of any individual season, and the result rests entirely on the information provided in the dataset.

In [None]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

In [None]:
df = pd.read_csv("../input/top-football-leagues-scorers/Data.csv")
df.head()

In [None]:
x = df["Player Names"].value_counts() == 5
players_5years = x[x].index
players_5years
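
The filter above keeps only the names that occur exactly five times, which corresponds to one row per season if each player appears at most once per year. A minimal sketch on made-up data (the toy frame and the names `A`, `B`, `C` are hypothetical):

```python
import pandas as pd

# Hypothetical toy data: one row per player per season.
toy = pd.DataFrame({
    "Player Names": ["A"] * 5 + ["B"] * 3 + ["C"] * 5,
    "Year": list(range(2016, 2021)) + [2016, 2017, 2018] + list(range(2016, 2021)),
})

counts = toy["Player Names"].value_counts()
# Keep only the names that occur exactly five times.
five_year_players = sorted(counts[counts == 5].index)
print(five_year_players)
```

If a player could appear more than once in a single season, counting distinct years per player would be a safer check than counting rows.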

In [None]:
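# .copy() gives an independent frame, so adding columns later does not trigger SettingWithCopyWarning
df_topplayers = df[df['Player Names'].isin(players_5years)].copy()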
df_topplayers.head()

In [None]:
df_topplayers["Goals_xG_Diff"] = df_topplayers['Goals'] - df_topplayers['xG']
df_topplayers["Target_Accuracy_Per_Game"] = (df_topplayers['On Target Per Avg Match']/df_topplayers['Shots Per Avg Match'])*100
df_topplayers["Shots_to_Goal_conversion"] = (df_topplayers['Goals']/df_topplayers['OnTarget'])*100
df_topplayers.head()
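
To make the three derived columns concrete, here is a small worked example on a single made-up season row. The numbers are hypothetical; only the column names and formulas come from the cell above.

```python
import pandas as pd

# One hypothetical season row, using the same column names as the dataset.
row = pd.DataFrame({
    "Goals": [30],
    "xG": [25.5],
    "OnTarget": [60],
    "Shots Per Avg Match": [5.0],
    "On Target Per Avg Match": [2.5],
})

# Goals scored above (positive) or below (negative) expectation.
row["Goals_xG_Diff"] = row["Goals"] - row["xG"]
# Share of shots per match that are on target, in percent.
row["Target_Accuracy_Per_Game"] = row["On Target Per Avg Match"] / row["Shots Per Avg Match"] * 100
# Share of on-target shots that became goals, in percent.
row["Shots_to_Goal_conversion"] = row["Goals"] / row["OnTarget"] * 100

print(row[["Goals_xG_Diff", "Target_Accuracy_Per_Game", "Shots_to_Goal_conversion"]])
```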

In [None]:
# Select several columns with a list (double brackets); tuple-style selection fails on recent pandas
df_top = df_topplayers.groupby("Player Names")[["Matches_Played", "Goals", "Mins", "OnTarget"]].sum().reset_index()
df_top.head()

## 1. Goals
Goals win games, so the most basic and obvious measure of a player's contribution is the number of goals he scored over the period.

In [None]:
plt.figure(figsize = (12,8))
df_top_by_goals = df_top.sort_values(by=['Goals'], ascending=False)
ax = sns.barplot(x='Goals', y='Player Names', data=df_top_by_goals)
ax.set_xlabel('Goals')

## Goal Winner : 1. Lionel Messi (5pts)
2. Robert Lewandowski (4pts)
3. Cristiano Ronaldo (3pts)
4. Ciro Immobile (2pts)
5. Luis Suarez (1pt)
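
The points scheme used in every category (5 points for first place down to 1 point for fifth) can be sketched as a small helper. The function name `award_points` and the sample totals are hypothetical, and ties are not handled specially.

```python
import pandas as pd

def award_points(scores: pd.Series, higher_is_better: bool = True, top_n: int = 5) -> pd.Series:
    """Give top_n points to the best entry, top_n - 1 to the next, down to 1."""
    ranked = scores.sort_values(ascending=not higher_is_better).head(top_n)
    return pd.Series(range(top_n, top_n - len(ranked), -1), index=ranked.index)

# Hypothetical goal totals, for illustration only.
goals = pd.Series({"P1": 120, "P2": 150, "P3": 90, "P4": 110, "P5": 95, "P6": 80})
points = award_points(goals)
print(points)
```

For metrics where a lower value is better, such as minutes per goal, the same helper would be called with `higher_is_better=False`.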

In [None]:
df_top["Mins_per_goal"] = df_top['Mins']/df_top['Goals']
df_top["Mins_per_Target"] = df_top['Mins']/df_top['OnTarget']
df_top.head()

## 2. Minutes Per On-Target Shot

A shot on target has a good chance of finding the back of the net, so hitting the target frequently is a highly desirable skill in football. This metric measures how many minutes a player needs, on average, per on-target shot, so lower is better. Let's see who comes out on top.

In [None]:
df_top_by_mins_shot = df_top.sort_values(by=['Mins_per_Target'], ascending=False)

import plotly.graph_objects as go

fig = go.Figure(go.Bar(
            x=df_top_by_mins_shot['Mins_per_Target'],
            y=df_top_by_mins_shot['Player Names'],
            orientation='h'))

fig.show()

## Minutes Per On-Target Shot Winner : 1. Lionel Messi (5pts)
2. Cristiano Ronaldo (4pts)
3. Robert Lewandowski (3pts)
4. Ciro Immobile (2pts)
5. Luis Suarez (1pt)

## 3. Minutes Per Goal
This parameter is crucial in determining how frequently a player finds the back of the net. Any value below 90 minutes corresponds to an expectancy of at least one goal per game. Allowing for an average of about 4 minutes of stoppage time per match, we can take the length of one game to be around 94 minutes.
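
As a quick arithmetic check of the 94-minute reasoning, the sketch below converts minutes per goal into expected goals per match. The totals are made up and the helper name `minutes_per_goal` is hypothetical.

```python
MATCH_MINUTES = 94  # 90 minutes plus the roughly 4 added minutes assumed in the text

def minutes_per_goal(minutes: float, goals: int) -> float:
    """Average minutes played for each goal scored."""
    if goals == 0:
        raise ValueError("minutes per goal is undefined for a player with no goals")
    return minutes / goals

# Hypothetical five-year totals for one player.
mpg = minutes_per_goal(14100, 150)
goals_per_match = MATCH_MINUTES / mpg
print(mpg, goals_per_match)
```

A player at or below 94 minutes per goal therefore averages at least one goal per match under this assumption.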

In [None]:
df_mins_per_goal = df_top.sort_values(by=['Mins_per_goal'], ascending=False)

fig = go.Figure(go.Bar(
            x=df_mins_per_goal['Mins_per_goal'],
            y=df_mins_per_goal['Player Names'],
            # one color value per bar: range(1, n + 1) so the last bar is not left without a value
            marker=dict(color=[10 * i for i in range(1, len(df_mins_per_goal) + 1)],
                     colorscale='viridis'),
            orientation='h'))

fig.show()

### Observations from the graph above:

Since Lewandowski and Messi have minutes-per-goal figures of 91 and 94 respectively, one can expect each of them to score roughly one goal per game on average. A single goal is especially valuable in a tactically defensive game where goals are scarce.

## Minutes Per Goal Winner : 1. Robert Lewandowski (5pts)
2. Lionel Messi (4pts)
3. Cristiano Ronaldo (3pts)
4. Luis Suarez (2pts)
5. Ciro Immobile (1pt)

**Aggregation** is not the only way to determine the best of the best: all the top scorers have scored plenty of goals, and a player may dominate one year and then score far fewer the next. **Consistency** over the years is also important in determining who has been the best. By using some of the parameters that usually take a back seat to goals and goals per game, we can take a more holistic approach. The following graphs shed some light on those parameters.

## 4. Goals-xG Score 

The following graph plots the difference between Goals and Expected Goals (xG) for each of the 5 years. The winner of this category is the player who has most outperformed expectations over the period.
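
The per-player sum and mean of this difference can also be computed in one step with a `groupby`, as in the hedged sketch below. The toy data is hypothetical; only the column names follow the dataset.

```python
import pandas as pd

# Hypothetical season rows for two players.
toy = pd.DataFrame({
    "Player Names": ["A", "A", "A", "B", "B", "B"],
    "Year": [2018, 2019, 2020] * 2,
    "Goals": [20, 25, 30, 18, 22, 15],
    "xG": [18.0, 26.0, 25.0, 20.0, 21.0, 17.0],
})
toy["Goals_xG_Diff"] = toy["Goals"] - toy["xG"]

# Total and average over- or under-performance per player.
summary = toy.groupby("Player Names")["Goals_xG_Diff"].agg(["sum", "mean"])
print(summary)
```

A positive sum means the player scored more than the chances suggested; a negative sum means he scored fewer.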

In [None]:
fig = go.Figure()
Goals_diff_sum = {}
Goals_diff_mean = {}
for i in players_5years:
    df_player = df_topplayers[df_topplayers['Player Names'].isin([i])]
    Goals_diff_sum[i] = df_player['Goals_xG_Diff'].sum()
    Goals_diff_mean[i] = df_player['Goals_xG_Diff'].mean()
    fig.add_trace(go.Scatter(x=df_player['Year'], y=df_player["Goals_xG_Diff"],
                    mode='lines+markers',
                    name=i))
fig.update_layout(
    autosize=False,
    width=900,
    height=600,)

fig.show()

### Observations from the graph above:
1. Immobile and Aspas, who have never featured as star strikers at any of the big clubs, have performed beyond expectations over the period.
2. Robert Lewandowski, who was the best player of 2020, has not been the most consistent in meeting expectations over this period.

In [None]:
df1 = pd.Series(Goals_diff_sum).sort_values(ascending = False).reset_index()
df2 = pd.Series(Goals_diff_mean).sort_values(ascending = False).reset_index()

df_final = pd.merge(df1, df2, on="index")
df_final.columns = ["Player", "Sum","Mean"]

df_final

## Goals-xG Winner : 1. Lionel Messi (5pts)
2. Ciro Immobile (4pts)
3. Timo Werner (3pts)
4. Iago Aspas (2pts)
5. Andrea Belotti (1pt)

## 5. Target Accuracy
Target accuracy is the share of a player's shots per match that are on target, expressed as a percentage. Given the finishing skills of all these goal scorers, a higher target accuracy should mean a higher probability of scoring. The winner of this category is decided on a final score, calculated as the mean of the average year-on-year percentage change in target accuracy and the mean target accuracy over the 5-year period.
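
The final score described above can be worked through by hand for a single player. The sketch below uses made-up accuracy values and assumes the rows are ordered by season, since `pct_change` compares each value with the one before it.

```python
import pandas as pd

# Hypothetical target accuracy (%) for one player, ordered by season.
accuracy = pd.Series([40.0, 50.0, 50.0, 40.0, 44.0],
                     index=[2016, 2017, 2018, 2019, 2020])

# Year-on-year changes: +25%, 0%, -20%, +10% (the first season has no predecessor).
avg_pct_change = accuracy.pct_change().mean() * 100
mean_accuracy = accuracy.mean()
score = (avg_pct_change + mean_accuracy) / 2
print(avg_pct_change, mean_accuracy, score)
```

Five seasons yield four year-on-year changes, so the average percentage change is taken over four values while the mean accuracy is taken over five.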

In [None]:
Average_TA_deviation = {}
Mean_TA = {}
fig = go.Figure()

for i in players_5years:
    df_player = df_topplayers[df_topplayers['Player Names'].isin([i])]
    Average_TA_deviation[i] = df_player['Target_Accuracy_Per_Game'].pct_change().mean()*100
    Mean_TA[i] = df_player['Target_Accuracy_Per_Game'].mean()
    fig.add_trace(go.Scatter(x=df_player['Year'], y=df_player["Target_Accuracy_Per_Game"],
                    mode='lines+markers',
                    name=i))
fig.update_layout(
    autosize=False,
    width=900,
    height=600,)

fig.show()

### Observations from the graph above:
1. Cristiano Ronaldo and Iago Aspas may not have the highest shots-on-target percentage every year, but they have been the most consistent over the period.
2. Veterans by football standards, like Ciro Immobile and Fabio Quagliarella, find it difficult to hit the target on a consistent basis, but that could also be because of the sides they play in, which may provide them with fewer chances due to a lack of creativity in midfield.
3. Andrej Kramaric edges out the rest as the winner of this category, based on the highest mean target accuracy and a decent positive percentage change over the period.

In [None]:
df1 = pd.Series(Average_TA_deviation).sort_values(ascending = False).reset_index()
df2 = pd.Series(Mean_TA).sort_values(ascending = False).reset_index()


df_final = pd.merge(df1, df2, on="index")
df_final.columns = ["Player","Average Percentage Change", "Mean Target Accuracy"]
df_final['Score'] = (df_final["Average Percentage Change"] + df_final["Mean Target Accuracy"])/2
df_final = df_final.sort_values(by = "Score", ascending = False)
df_final

## Target Accuracy Winner : 1. Andrej Kramaric (5pts)
2. Iago Aspas (4pts)
3. Lionel Messi (3pts)
4. Cristiano Ronaldo (2pts)
5. Robert Lewandowski (1pt)

## 6. Shots Efficiency
This category compares how well each player **converts shots on target into goals**. The winner of this category is decided on a final score, calculated as the mean of the average year-on-year percentage change in shots efficiency and the mean shots efficiency over the 5-year period.
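
As an alternative to looping over players, the same score could be computed for everyone at once with a `groupby`. This is only a sketch on hypothetical data, not the approach used in this notebook, and it sorts by year first so that the percentage changes run in chronological order.

```python
import pandas as pd

# Hypothetical season rows for two players.
toy = pd.DataFrame({
    "Player Names": ["A", "A", "A", "B", "B", "B"],
    "Year": [2018, 2019, 2020, 2018, 2019, 2020],
    "Goals": [10, 15, 20, 12, 12, 9],
    "OnTarget": [40, 50, 50, 30, 40, 36],
})
toy["Shots_to_Goal_conversion"] = toy["Goals"] / toy["OnTarget"] * 100

# Chronological order matters for pct_change.
toy = toy.sort_values(["Player Names", "Year"])
grouped = toy.groupby("Player Names")["Shots_to_Goal_conversion"]

summary = pd.DataFrame({
    "Average Percentage Change": grouped.apply(lambda s: s.pct_change().mean() * 100),
    "Mean Shot Efficiency": grouped.mean(),
})
# Final score: mean of the two columns, as described in the text.
summary["Score"] = summary.mean(axis=1)
summary = summary.sort_values("Score", ascending=False)
print(summary)
```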

In [None]:
Shots_efficiency_Avg_PC = {}
Mean_SE = {}

fig = go.Figure()

for i in players_5years:
    df_player = df_topplayers[df_topplayers['Player Names'].isin([i])]
    Shots_efficiency_Avg_PC[i] = df_player['Shots_to_Goal_conversion'].pct_change().mean()*100
    Mean_SE[i] = df_player['Shots_to_Goal_conversion'].mean()
    fig.add_trace(go.Scatter(x=df_player['Year'], y=df_player["Shots_to_Goal_conversion"],
                    mode='lines+markers',
                    name=i))

fig.update_layout(
    autosize=False,
    width=900,
    height=600,)

fig.show()


### Observations from the graph above:

1. Fabio Quagliarella has gradually increased his shot efficiency from 17.14 percent in 2016 to 55.55 percent in 2020, and he has done so at 37 years old, an age considered the twilight of a football career.
2. Lionel Messi, widely regarded as the GOAT (Greatest of All Time), has declined in shots-to-goal conversion, so much so that he sits at the bottom of the list.
3. All the so-called "underrated players" like Quagliarella, Immobile, Kramaric, and Belotti have a better shots-to-goal conversion efficiency than the two absolute titans of the football world, Ronaldo and Messi.

In [None]:
df1 = pd.Series(Shots_efficiency_Avg_PC).sort_values(ascending = False).reset_index()
df2 = pd.Series(Mean_SE).sort_values(ascending = False).reset_index()


df_final = pd.merge(df1, df2, on="index")
df_final.columns = ["Player","Average Percentage Change", "Mean Shot Efficiency"]
df_final['Score'] = (df_final["Average Percentage Change"] + df_final["Mean Shot Efficiency"])/2
df_final = df_final.sort_values(by = "Score", ascending = False)
df_final

## Shots Efficiency Winner : 1. Ciro Immobile (5pts)
2. Fabio Quagliarella (4pts)
3. Robert Lewandowski (3pts)
4. Andrej Kramaric (2pts)
5. Andrea Belotti (1pt)

# And the Best Player Award Goes to...

 ![](https://www.fcbarcelona.com/fcbarcelona/photo/2019/12/02/abbdc187-9e69-4cdb-8636-541d6b3a7b59/mini__R5I4403.JPG)

## 1. Lionel Messi (22/30)
2. Robert Lewandowski (16/30)
3. Ciro Immobile (14/30)
4. Cristiano Ronaldo (12/30)
5. Andrej Kramaric (7/30)
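
The final totals can be reproduced by summing the category points. The sketch below re-enters the six top-five lists from the sections above by hand (best first) and awards 5 points for first place down to 1 for fifth; the variable names are hypothetical.

```python
from collections import Counter

# Top-five finishers per category, best first, as listed in the sections above.
category_top5 = {
    "Goals": ["Lionel Messi", "Robert Lewandowski", "Cristiano Ronaldo",
              "Ciro Immobile", "Luis Suarez"],
    "Minutes per on-target shot": ["Lionel Messi", "Cristiano Ronaldo", "Robert Lewandowski",
                                   "Ciro Immobile", "Luis Suarez"],
    "Minutes per goal": ["Robert Lewandowski", "Lionel Messi", "Cristiano Ronaldo",
                         "Luis Suarez", "Ciro Immobile"],
    "Goals-xG": ["Lionel Messi", "Ciro Immobile", "Timo Werner",
                 "Iago Aspas", "Andrea Belotti"],
    "Target accuracy": ["Andrej Kramaric", "Iago Aspas", "Lionel Messi",
                        "Cristiano Ronaldo", "Robert Lewandowski"],
    "Shots efficiency": ["Ciro Immobile", "Fabio Quagliarella", "Robert Lewandowski",
                         "Andrej Kramaric", "Andrea Belotti"],
}

totals = Counter()
for ranking in category_top5.values():
    for position, player in enumerate(ranking):
        totals[player] += 5 - position  # 5 points for first place, 1 for fifth

for player, pts in totals.most_common(5):
    print(f"{player}: {pts}/30")
```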
