# The Progressors: Class of 2021/22

Finishers are responsible for scoring goals. Creators are responsible for getting the ball to the finishers (assisting the finisher). I would like to suggest another key type of player in a side's attack: the progressors. The progressor's primary responsibility is to move the ball towards the goal. Without getting the ball into advanced positions on the pitch, there is no chance of scoring goals. This role is integral. You could describe the entire game of football simply as the repeated attempt to get the ball as near to the opponent's goal as possible. 

There are fundamentally two ways of progressing the ball: passing and carrying. There are progressors in the middle of the pitch (that move the ball into the final third) and progressors in more advanced positions (that move the ball into the penalty box). 

Goals and assists are the standard metrics we use to judge finishers and creators respectively. Thanks to the development of advanced metrics, we also have simple, publicly available metrics to measure progressor output. The below graphs illustrate the progression metrics for the top progressors in Europe's big five leagues and illuminate some of the ways in which progressors may differ from each other. There are four metrics covered:

* Number of Passes into the Final Third
* Number of Carries into the Final Third
* Number of Passes into the 18 Yard Box
* Number of Carries into the 18 Yard Box

All data has been downloaded from FBref. For the purposes of this season review I have opted to include metrics that have *not* been adjusted per 90. This is because we are specifically looking at who had the best progression output over the season: we are focusing on which players progressed the ball the most for their side, rather than on a player's rate of progression. This is much like how the Golden Boot winner is the player who scored the most goals over the season, regardless of how many minutes he played. Nonetheless, I have included per 90 metrics, which can be viewed by hovering over a point if you're interested.
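To make the distinction between season totals and per 90 rates concrete, here is a minimal sketch. The two players and their numbers are made up for illustration and are not FBref data; the column names simply mirror the ones used later in this notebook.

```python
import pandas as pd

# Hypothetical numbers, not FBref data: two players with different minutes
df = pd.DataFrame({
    "Player": ["Player A", "Player B"],
    "Pass3rd": [200, 120],   # season total of passes into the final third
    "90s": [35.0, 15.0],     # minutes played divided by 90
})

# Rate of progression: season total divided by the number of 90s played
df["Pass3rd p90"] = (df["Pass3rd"] / df["90s"]).round(3)

# The leader by total and the leader by rate need not be the same player
top_by_total = df.sort_values("Pass3rd", ascending=False).iloc[0]["Player"]
top_by_rate = df.sort_values("Pass3rd p90", ascending=False).iloc[0]["Player"]
print(top_by_total, top_by_rate)
```

In this toy example Player A leads on total output while Player B leads on rate, which is why the choice between the two matters for a season review.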

For the purposes of these graphs, we are looking at the top 50 progressors. The bar chart at the end contains the top 25 progressors when taking into account all four metrics.

In [171]:
# Only pandas and Plotly Express are used in the cells below
import pandas as pd
import plotly.express as px

In [172]:
# Read the CSV files exported from FBref.
# The Player column contains a backslash; keep only the text before it.
carry18_df = pd.read_csv('carry18box.csv')
carry18_df['Player'] = carry18_df['Player'].str.split("\\").str[0]
carry18_df = carry18_df[["Player", "Squad", "Comp", "CPA", "90s"]]
carry18_df = carry18_df.rename(columns={'CPA': 'Carry18'})

carry3rd_df = pd.read_csv('carry3rd.csv')
carry3rd_df['Player'] = carry3rd_df['Player'].str.split("\\").str[0]
carry3rd_df = carry3rd_df[["Player", "Squad", "Comp", "1/3", "90s"]]
carry3rd_df = carry3rd_df.rename(columns={'1/3': 'Carry3rd'})

pass18_df = pd.read_csv('pass18box.csv')
pass18_df['Player'] = pass18_df['Player'].str.split("\\").str[0]
pass18_df = pass18_df[["Player", "Squad", "Comp", "xA", "PPA"]]
pass18_df = pass18_df.rename(columns={'PPA': 'Pass18'})

pass3rd_df = pd.read_csv('pass3rd.csv')
pass3rd_df['Player'] = pass3rd_df['Player'].str.split("\\").str[0]
pass3rd_df = pass3rd_df[["Player", "Squad", "Comp", "1/3"]]
pass3rd_df = pass3rd_df.rename(columns={'1/3': 'Pass3rd'})

# Join carries and passes for each zone on the shared identifying columns
final3rd_df = pd.merge(carry3rd_df, pass3rd_df, on=["Player", "Squad", "Comp"])
box18_df = pd.merge(carry18_df, pass18_df, on=["Player", "Squad", "Comp"])

pd.options.mode.chained_assignment = None

## Moving into the final third

João Cancelo (top right) sticks out as a particularly impressive final third progressor, with a high number of both passes and carries into this area. His Manchester City teammate Aymeric Laporte (bottom right) is also an accomplished progressor, but one with a distinct preference for passing the ball into that zone rather than carrying it. Another noteworthy Manchester City progressor is Bernardo Silva (far upper left), who carries the ball into the final third a lot, but passes it into that zone less often than his peers.

Outside of the Premier League champions, West Ham's young captain, Declan Rice, produces a significant number of progressions into the final third, as does the perennial Leo Messi. In the bottom right, you might also notice two La Liga veterans, Kroos and Busquets, who still shoulder significant responsibility for progression, but almost exclusively through passing.
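One way to put a number on the "passer versus carrier" distinction described above is the share of a player's progressions that come from passes. The sketch below uses made-up totals rather than FBref data, and both the `label_style` helper and its 0.4/0.6 cut-offs are arbitrary assumptions for illustration, not part of the analysis in this notebook.

```python
import pandas as pd

# Hypothetical totals, not FBref data
df = pd.DataFrame({
    "Player": ["Balanced", "Passer", "Carrier"],
    "Passes": [150, 240, 60],
    "Carries": [150, 30, 140],
})

# Share of a player's final third progressions that come from passing
df["Pass share"] = df["Passes"] / (df["Passes"] + df["Carries"])

def label_style(share, low=0.4, high=0.6):
    """Label a player by pass share; the cut-offs are arbitrary assumptions."""
    if share >= high:
        return "passer"
    if share <= low:
        return "carrier"
    return "balanced"

df["Style"] = df["Pass share"].apply(label_style)
print(df[["Player", "Pass share", "Style"]])
```

On the scatter plot, a high pass share corresponds to the bottom right and a low one to the upper left.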

In [173]:
# Re-merge so the cell starts from a clean table
final3rd_df = pd.merge(carry3rd_df, pass3rd_df)
# Season totals and per 90 rates for progressions into the final third
final3rd_df['total3rd'] = final3rd_df['Carry3rd'] + final3rd_df['Pass3rd']
final3rd_df['cp90'] = final3rd_df['Carry3rd'] / final3rd_df['90s']
final3rd_df['pp90'] = final3rd_df['Pass3rd'] / final3rd_df['90s']
final3rd_df.sort_values(by='total3rd', inplace=True, ascending=False)
# Keep only the top 50; copy so the changes below do not touch final3rd_df
prog3_df = final3rd_df.head(50).copy()
prog3_df['cp90'] = prog3_df['cp90'].round(decimals=3)
prog3_df['pp90'] = prog3_df['pp90'].round(decimals=3)
prog3_df.rename(columns={'Carry3rd': 'Carries', 'Pass3rd': 'Passes', 'total3rd': 'Total', 'cp90': 'Carries p90', 'pp90': 'Passes p90'}, inplace=True)
fig = px.scatter(
    prog3_df,
    x="Passes",
    y="Carries",
    color="Total",
    hover_data=['Player', 'Squad', 'Comp', 'Passes p90', 'Carries p90'],
    color_continuous_scale=px.colors.sequential.Rainbow,
    title="Big 5 2021/22: passes into final 3rd vs carries into final 3rd",
)
fig.show()

## Progression in advanced positions

Vinícius Júnior's outstanding breakthrough season is reflected here. His ability to progress the ball into the penalty box is unmatched by any other player. He has registered a respectable number of passes into the box, but his primary method of progression into the box is through carries. Two other elite carriers of the ball into the box are Kylian Mbappé and Jack Grealish, with the latter particularly disinclined to pass the ball into the box. 

Standout final third progressor João Cancelo also excels at advanced progression. Although he produced an impressive number of final third carries, when it comes to progressing the ball into the box, Cancelo prefers to pass it (you'll find him on the far right of the chart). Current Ballon d'Or winner Leo Messi and Ballon d'Or hopeful Karim Benzema can also be found in this area of the chart, as can Trent Alexander-Arnold, Benjamin Bourigeaud, and Bruno Fernandes.

Perhaps more surprisingly, Gerard Deulofeu, a La Masia graduate who has never found his place in the Spanish national team, had an outstanding season as a progressor for Serie A side Udinese (he's at the upper right of the chart and is second only to Vini Jr for progression into the box).

In [174]:
# Season totals and per 90 rates for progressions into the 18 yard box
box18_df['total18'] = box18_df['Pass18'] + box18_df['Carry18']
box18_df['cp9018'] = box18_df['Carry18'] / box18_df['90s']
box18_df['pp9018'] = box18_df['Pass18'] / box18_df['90s']
box18_df.sort_values(by='total18', inplace=True, ascending=False)
# Keep only the top 50; copy so the changes below do not touch box18_df
prog18_df = box18_df.head(50).copy()
prog18_df['cp9018'] = prog18_df['cp9018'].round(decimals=3)
prog18_df['pp9018'] = prog18_df['pp9018'].round(decimals=3)
prog18_df.rename(columns={'Carry18': 'Carries', 'Pass18': 'Passes', 'total18': 'Total', 'cp9018': 'Carries p90', 'pp9018': 'Passes p90'}, inplace=True)
fig = px.scatter(
    prog18_df,
    x="Passes",
    y="Carries",
    color="Total",
    hover_data=['Player', 'Squad', 'Comp', 'Passes p90', 'Carries p90'],
    color_continuous_scale=px.colors.sequential.Rainbow,
    title="Big 5 2021/22: Passes into Box vs Carries into Box",
)
fig.show()

## The relationship between progression and creation

Progression is a different skill to creation. Progressors are skilled at moving the ball into dangerous areas of the pitch. Creators are skilled at finding a teammate in a good location to shoot from. Because progressors often operate high up the pitch, they tend to be involved in creation and finishing. The below graph plots each player's total progressions into the box against their xA (expected assists: the assists they were expected to provide based on the chances they created for their teammates). 

The player with the most progressions into the box, Vinícius Júnior, had a middling xA output compared to the other top progressors. This is not an indication of quality so much as style - perhaps Vini Jr takes shots himself once he progresses into the box, or provides balls a few steps earlier in the creation of a shot. Kylian Mbappé, another prolific box progressor, has one of the highest xA totals, suggesting to me that Mbappé is as much a creator as a progressor.

Salah and Alexander-Arnold also rank high for xA, though Salah has notably more progressions into the box. Again, there will be varied reasons for this. Alexander-Arnold's progressions might be more effective. Another theory is that Salah, the EPL Golden Boot winner, tends to finish off his own progressions (many of his goals come from him dribbling into space and shooting). The previous chart showed that Salah is a carrier, whilst Alexander-Arnold is a passer (or a crosser!).

In [175]:
# Total progressions into the box against expected assists for the same top 50
fig = px.scatter(
    prog18_df,
    x="Total",
    y="xA",
    color="xA",
    hover_data=['Player', 'Squad', 'Comp', 'Passes p90', 'Carries p90'],
    color_continuous_scale=px.colors.sequential.Rainbow,
    title="Big 5 2021/22: Passes & Carries into the 18yard Box vs xA",
)
fig.show()

## The best progressors all over the place

Let's combine the progression metrics for the final third and the 18 yard box to see which players progress the ball most often. A few insights are clear from the graph below.

Firstly, the era of the fullback has truly begun, with João Cancelo and Trent Alexander-Arnold both providing excellent progression for their sides. We can see clearly that, despite the criticism he has received, Messi still has it. Declan Rice justifies the gigantic price tag West Ham have placed on him through his progression output this season. 

Vinícius Júnior's progression output is so similar to Mbappé's that it seems hilarious that Real Madrid wanted both of them at their club (the management at every elite club in Europe must be grateful that Mbappé to Madrid never materialised!). 

In [176]:
# Combine both zones; the merge uses every column the two tables share
progtot_df = pd.merge(box18_df, final3rd_df)
progtot_df['progtot'] = progtot_df['Carry18'] + progtot_df['Pass18'] + progtot_df['Pass3rd'] + progtot_df['Carry3rd']
progtot_df.sort_values(by=['progtot'], inplace=True, ascending=False)
# Keep only the top 25; copy so the changes below do not touch progtot_df
progtot25_df = progtot_df.head(25).copy()
progtot25_df['Progressions p90'] = (progtot25_df['progtot'] / progtot25_df['90s']).round(decimals=3)
progtot25_df.rename(columns={'Carry18': 'Carries into 18yard Box', 'Pass18': 'Passes into 18yard Box', 'Pass3rd': 'Passes into Final 3rd', 'Carry3rd': 'Carries into Final 3rd'}, inplace=True)
fig = px.bar(
    progtot25_df,
    x="Player",
    y=["Carries into 18yard Box", "Passes into 18yard Box", "Carries into Final 3rd", "Passes into Final 3rd"],
    hover_data=['Progressions p90'],
    color_discrete_sequence=px.colors.qualitative.Dark2,
    title="Big 5 Leagues: All Progressions: Top 25",
)
fig.show()

## The 2021/22 Progressor Award goes to...

Such an award would require a much more rigorous analysis than I have provided here. We have looked exclusively at the quantity of progressions made. I believe the next step in this work would be to investigate the quality of progressions made. I have a few ideas for this, including weighting progressions according to whether or not they were part of a shot- or goal-ending sequence. Maybe next year. Nonetheless, there are a few players that I would like to highlight as particularly accomplished progressors.
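As a rough illustration of the weighting idea mentioned above, here is a minimal sketch. Everything in it is an assumption: the event-level table, the `ended_in_shot` flag, and the weights are hypothetical, and the FBref season totals used in this notebook do not contain this information, so it would have to come from event data.

```python
import pandas as pd

# Hypothetical event-level rows: one row per progression, with a flag for
# whether the possession it belonged to ended in a shot
events = pd.DataFrame({
    "Player": ["A", "A", "A", "B", "B"],
    "ended_in_shot": [True, False, False, True, True],
})

# Assumed weights: a progression counts double if the sequence ended in a shot
SHOT_WEIGHT = 2.0
BASE_WEIGHT = 1.0
events["weight"] = events["ended_in_shot"].map({True: SHOT_WEIGHT, False: BASE_WEIGHT})

# Raw count of progressions next to the weighted total for each player
summary = events.groupby("Player").agg(
    progressions=("weight", "size"),
    weighted=("weight", "sum"),
)
print(summary)
```

In this toy data, player A makes more progressions than player B, but the two end up level once shot-ending sequences are weighted more heavily.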

### João Cancelo

A top progressor, both into the final third and the penalty box, Cancelo is the man to beat. He tops our table for overall progressions and he can both carry and pass. This season, Cancelo was one of the most important attacking threats in a side with the best choice of elite attackers in Europe (surely?). And he did it all from fullback!

### Vinícius Júnior

Vini Jr truly had a season to remember: he was a key player in Real's successful La Liga campaign and scored the winner in the Champions League final. Vini Jr was also the best around at getting the ball into the box. A natural goalscorer like him knows that putting the ball in the box is key to winning titles.

### Kylian Mbappé

Mbappé has been doing it for so long that it's easy to take him for granted. No matter how you cut it, he's one of the best in the world. He gets the ball into the box almost as much as Vini Jr and, considering he is a typical forward, he has decent involvement in final third progression too. He gets a mention for his high xA, indicative of a progressor-creator, a two-in-one threat. 