# Question
> How many players should a defence rush to minimise EPA?

After initially exploring all the data made available in this year's competition, we considered a few different ideas for our Notebook. When exploring one idea we discovered a simple, yet significant, outcome. Could the [Conventional Wisdom](https://en.wikipedia.org/wiki/Conventional_wisdom) of defensive play callers truly be this different to the reality presented by the data?

The analysis has been written as a narrative, so it is recommended you don't just skip to the end.

Can you spot the <a href="https://en.wikipedia.org/wiki/Easter_egg_(media)">Easter Egg</a> associated with the colour scheme?


# Summary

The reality presented by the data does not agree with the Conventional Wisdom of rushing 4 players the majority of the time. Although change should be cautious and gradual, defensive play callers should significantly change how often they use each number of rushers: they should rush 4 players less often and show a greater preference for rushing 5 and 6.

# Assumptions

For our analysis we rely upon the [EPA (Expected Points Added)](https://www.advancedfootballanalytics.com/index.php/home/stats/stats-explained/expected-points-and-epa-explained) value for each play. The power of this metric is that it allows us to [control](https://en.wikipedia.org/wiki/Controlling_for_a_variable) for other variables that we are not considering in our analysis. It also allows us to classify each play as either a 'success' for the defence (i.e. a negative EPA-value) or a 'failure' (i.e. a positive EPA-value). As the EPA-values are supplied by the NFL with the dataset, we are confident in their accuracy. However, this accuracy is a fundamental assumption of our analysis.

It is also important to remember these conclusions are drawn from the dataset of *passing plays* only (as this was the only data supplied by the NFL). The results of the analysis may change if the data associated with rushing plays is also included. 

# Beginning of Analysis

This notebook only uses the data in `plays.csv`. To ensure a high enough sample size, the data is filtered to only keep plays with 3, 4, 5 or 6 rushers.

The following dictionaries are created from this data:
* **rushers_used** - storing the percentage of plays called with each number of rushers (3, 4, 5, or 6)
* **epas** - storing a list of the EPA (Expected Points Added) on each play, organised by number of rushers used

Create a plot for the percentages that each number of rushers is used:

In [None]:
import pandas as pd
from collections import Counter
import matplotlib.pyplot as plt
import statistics 
import random
import numpy as np
import operator
import matplotlib
matplotlib.use("Agg")
from matplotlib.animation import FFMpegWriter
from IPython.display import Video, Image

In [None]:
# Create a dictionary of nice NFL colors for plots
colours = {
    "JAX" : "#0092ab",
    "NE" : "#012756",
    "NO" : "#c7aa44",
    "CLE" : "#733b16",
    "DEN" : "#f96b21",
    "SEA" : "#306086",
    "BAL" : "#32347e",
    "WSH" : "#97191a",
    "CAR" : "#008ed0",
    "NYJ" : "#035936",
    "PHI" : "#005760",
    "LV" : "#abaaa8",
    "PIT" : "#fec401",
    "MIA" : "#007877",
    "ATL" : "#cb1023"
}

In [None]:
df = pd.read_csv("../input/nfl-big-data-bowl-2021/plays.csv")

# Filter the dataframe to only keep plays with numbers of rushers with a high sample size
rushers = [3,4,5,6]
df = df[df.numberOfPassRushers.isin(rushers)]

# Calculate percentage plays with each number of rushers
counts = Counter(df.numberOfPassRushers)
rushers_used = {rush : counts[rush] / len(df) * 100.0 for rush in sorted(counts)}

# Create dictionary storing EPA-values for every play, by number of rushers
epas = {rush : [] for rush in rushers}
for epa_i, rush_i in zip(list(df.epa), list(df.numberOfPassRushers)):
    epas[rush_i].append(epa_i)

In [None]:
# Define helper function (adapted from https://matplotlib.org/3.1.1/gallery/lines_bars_and_markers/barchart.html#sphx-glr-gallery-lines-bars-and-markers-barchart-py)
def autolabel(rects, horizontal=False, percentageLabel = False):
    """Attach a text label to each bar in *rects*, displaying its size."""
    for rect in rects:
        if horizontal:
            if percentageLabel:
                width = str(round(rect.get_width(),1)) + "%"
                ax.annotate('{}'.format(width),
                        xy=(rect.get_width(), rect.get_y() + rect.get_height() / 5),
                        xytext=(19, 0),  # 19 points horizontal offset
                        textcoords="offset points",
                        ha='center', va='bottom')
            else:
                width = round(rect.get_width(),1)
                ax.annotate('{}'.format(width),
                            xy=(rect.get_width(), rect.get_y() + rect.get_height() / 5),
                            xytext=(14, 0),  # 14 points horizontal offset
                            textcoords="offset points",
                            ha='center', va='bottom')
        else:
            if percentageLabel:
                height = str(round(rect.get_height(),1)) + '%'
            else:
                height = round(rect.get_height(),1)
            
            ax.annotate('{}'.format(height),
                        fontsize=8,
                        xy=(rect.get_x() + rect.get_width() / 2, rect.get_height()),
                        xytext=(0, 3),  # 3 points vertical offset
                        textcoords="offset points",
                        ha='center', va='bottom')

In [None]:
# Plot percentages that each number of rushers is used
x = np.arange(len(rushers))  # the label locations
width = 0.75  # the width of the bars
fig, ax = plt.subplots(figsize=(10, 2))
rects = ax.barh(x, list(rushers_used.values()), width, color = colours["JAX"])

# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_ylabel('Rushers', fontsize=14)
ax.set_xlabel('Play Call %', fontsize=14)
ax.set_yticks(x)
ax.set_yticklabels(rushers)

# Hide spines
for spine in ['right', 'top', 'bottom']:
    ax.spines[spine].set_visible(False)

# Remove ticks
plt.tick_params(axis='x', which='both',bottom=False,top=False,labelbottom=False)
plt.tick_params(axis='y', which='both',left=False)

autolabel(rects, horizontal=True, percentageLabel=True)
fig.tight_layout()

plt.savefig("rushers_used.png")
plt.close(fig)
Image("rushers_used.png")  

From this plot we can infer the Conventional Wisdom of defensive play callers on how many players to rush. They have a strong preference for rushing 4 players, followed by 5, 3 then 6 players. 

Now create a plot of the distributions of EPA-values (as histograms - noting the different scales on y-axis), for each number of rushers:

In [None]:
# Plot the EPA-value distributions
fig, axs = plt.subplots(2,2, figsize=(16, 9))

for i, rush in enumerate(rushers):
    
    ax = axs.flat[i]

    ax.hist(epas[rush], bins=150, color=colours["BAL"])
    ax.set_title(str(rush) + " Rushers", fontsize=16)
    ax.set_xlim([-12,9])
    ax.set_xlabel("EPA", fontsize=14)
    ax.set_ylabel("Number of Plays", fontsize=14)

plt.tight_layout()
plt.subplots_adjust(wspace=0.2, hspace=0.3)

plt.savefig("epas.png")
plt.close(fig)
Image("epas.png")  

An interesting feature of these distributions is the trough around the 0 EPA-value, which broadly splits two peaks. For a defence, an EPA-value above 0 indicates a 'failure' and an EPA-value below 0 represents a 'success'. Hence, a shallower trough around 0 indicates a 'safer' play call and a deeper trough a more 'risky' one. As the troughs deepen for rushing more players, this follows the conventional wisdom that the more players rushed, the greater the ***risk***.

But how much greater ***risk*** is there? 

It is worth re-emphasising an important premise:

> A play with an EPA-value less than 0 is deemed a 'success'

> A play with an EPA-value greater than 0 is deemed a 'failure'

We can take our analysis further by separating each distribution in two: a 'success' distribution and a 'failure' distribution. This allows us to consider the following:
1. For each number of rushers, the percentage of defensive plays that are a 'success' (i.e. have negative EPA-values)
2. The mean EPA-values for 'successful' and 'failure' defensive plays
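
As a compact cross-check of the two quantities listed above, the same split can be sketched with pandas `groupby`. This is a minimal illustration on invented toy data, not the notebook's own implementation: the `toy` frame, its values, and the `summary` name are assumptions made for this example, with only the column names borrowed from `plays.csv`.

```python
import pandas as pd

# Toy data: the column names mirror plays.csv, but every value is invented
toy = pd.DataFrame({
    "numberOfPassRushers": [4, 4, 4, 4, 5, 5],
    "epa": [-1.0, 2.0, -0.5, 0.0, -2.0, 1.5],
})

# Plays with an EPA of exactly 0 fall into neither group
success = toy[toy["epa"] < 0].groupby("numberOfPassRushers")["epa"]
failure = toy[toy["epa"] > 0].groupby("numberOfPassRushers")["epa"]

n_success = success.size()
n_failure = failure.size()
# fill_value=0 keeps a rusher count that only appears in one of the two groups
total = n_success.add(n_failure, fill_value=0)

summary = pd.DataFrame({
    "success %": 100 * n_success.reindex(total.index, fill_value=0) / total,
    "mean - success": success.mean(),
    "mean - failure": failure.mean(),
})
print(summary.round(2))
```

In this toy frame, 2 of the 3 non-zero plays with 4 rushers are successes, so the 'success %' for 4 rushers is about 66.7.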

In [None]:
# Create dictionary to separate out distributions
epas_sep = {rush : {"success" : [], "failure" : []} for rush in rushers}
for rush in rushers:
    for epa_i in epas[rush]:
        if epa_i < 0:
            epas_sep[rush]["success"].append(epa_i)
        elif epa_i > 0:
            epas_sep[rush]["failure"].append(epa_i)
            
# Create dictionary to store mean EPA-values of successes and failures, along with percentage plays successful
epas_metric = {rush : {"mean - success" : [], "mean - failure" : [], "success %" : []} for rush in rushers}
for rush in rushers:
    epas_metric[rush]["mean - success"] = statistics.mean(epas_sep[rush]["success"])
    epas_metric[rush]["mean - failure"] = statistics.mean(epas_sep[rush]["failure"])
    epas_metric[rush]["success %"] = 100 * len(epas_sep[rush]["success"]) / (len(epas_sep[rush]["success"]) + len(epas_sep[rush]["failure"]))

            
# Plot the separated EPA-value distributions
fig, axs = plt.subplots(2,2, figsize=(18, 10))

for i, rush in enumerate(rushers):
    
    ax = axs.flat[i]
    bins_success = int(150 * len(epas_sep[rush]["success"]) / (len(epas_sep[rush]["success"]) + len(epas_sep[rush]["failure"])))
    ax.axvline(statistics.mean(epas_sep[rush]["success"]), color=colours["MIA"], linestyle='dashed', linewidth=1, label='Success Mean')
    ax.axvline(statistics.mean(epas_sep[rush]["failure"]), color=colours["ATL"], linestyle='dashed', linewidth=1, label='Failure Mean')
    ax.hist(epas_sep[rush]["success"], bins=bins_success, color=colours["NYJ"])
    ax.hist(epas_sep[rush]["failure"], bins=150-bins_success, color=colours["WSH"])
    ax.set_title(str(rush) + " Rushers", fontsize=16)
    ax.set_xlim([-12,9])
    ax.set_xlabel("EPA", fontsize=14)
    ax.set_ylabel("Number of Plays", fontsize=14)
    ax.legend()
    
    # Calculate success % (i.e. ratio areas of distributions)
    text = "Successful Plays: " + str(round(100*len(epas_sep[rush]["success"]) / (len(epas_sep[rush]["success"]) + len(epas_sep[rush]["failure"])),1)) + " %"
    text += "\nSuccess Mean: " + str(round(statistics.mean(epas_sep[rush]["success"]),2)) + " points"
    text += "\nFailure Mean: " + str(round(statistics.mean(epas_sep[rush]["failure"]),2)) + " points"
    props = dict(facecolor=colours["NO"], alpha=0.5)
    ax.text(0.05, 0.95, text, transform=ax.transAxes,
        verticalalignment='top', bbox=props, fontsize=14)
    
    
plt.tight_layout()
plt.subplots_adjust(wspace=0.2, hspace=0.3)

plt.savefig("epas_sep.png")
plt.close(fig)
Image("epas_sep.png") 

The Monte Carlo Analysis later in this notebook will give us two sets of data for comparison:
* **rushers_used** - percentage of actual plays **called** with each number of rushers (3, 4, 5, or 6)
* **monte** - from the Monte Carlo Analysis, the percentage of samples in which each number of rushers (3, 4, 5, or 6) had the smallest EPA-value (implying it's the best choice)

Before we discuss these plots, it may be worth reminding yourself of the play call percentage plot above, which shows that 4 players are rushed the majority of the time, followed by 5, 3, then 6 players. This is also indicated by the y-axis limits of the EPA-distribution plots.

From the above plots, we can draw some initial conclusions:

**Primary**:
* The percentage of plays which are a 'success' increases with increasing number of rushers
* As the number of rushers is increased from 4 to 6, the mean EPA-value of the successful plays improves - i.e. the ***reward*** increases with increasing numbers of rushers
* However, as the number of rushers is increased, the mean EPA-value of the failure plays worsens - i.e. the ***risk*** increases with increasing numbers of rushers

**Secondary** (i.e. less interesting):
* Plays which rush 3 or 4 players have similar mean EPA-values, with 4-rushers having a higher 'success' rate. This supports the conventional wisdom of rushing 4 players more frequently than 3

These initial conclusions imply that the Conventional Wisdom of defensive play callers, in rushing 4 players the majority of the time, needs to be reconsidered. Let's deepen our analysis further.

### Monte Carlo - Analysis 1

Using the [Monte Carlo Method](https://en.wikipedia.org/wiki/Monte_Carlo_method), perform an analysis to calculate the percentage of samples in which each number of rushers has the smallest EPA-value.

For example, take a random EPA-value sample from each of the 4 distributions and compare the values. Whichever number of rushers has the smallest value is the best for that sample. Repeat 400,000 times*. From this we can calculate the percentage of samples in which each number of rushers yields the best result (i.e. has the smallest EPA).

\* a sensitivity analysis was carried out, determining that 400,000 samples was sufficient for convergence within 0.1%.

**N.B.** the power of using the Monte Carlo Method is that it takes full account of the highly non-standard EPA distributions.
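
The sensitivity check mentioned above is not shown in this notebook. One way such a check could be sketched is to repeat the sampling at increasing sample counts and watch how much the percentages move. Everything here is an assumption made for illustration: `toy_epas` holds synthetic normal draws rather than real EPA-values, and `best_percentages` is a hypothetical helper that uses NumPy sampling instead of the `random` module.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the EPA lists in `epas`; the shapes are invented
toy_epas = {
    3: rng.normal(0.05, 1.4, 2000),
    4: rng.normal(0.00, 1.5, 10000),
    5: rng.normal(-0.05, 1.6, 3000),
    6: rng.normal(-0.10, 1.8, 800),
}
rush_counts = sorted(toy_epas)

def best_percentages(n_samples, rng):
    """Percentage of samples in which each rusher count has the smallest EPA."""
    # One column of random draws per rusher count, one row per sample
    draws = np.column_stack(
        [rng.choice(toy_epas[r], size=n_samples) for r in rush_counts]
    )
    winners = draws.argmin(axis=1)
    return 100 * np.bincount(winners, minlength=len(rush_counts)) / n_samples

# Compare successive sample sizes: a small change suggests convergence
previous = None
for n in [1_000, 10_000, 100_000, 400_000]:
    current = best_percentages(n, rng)
    if previous is not None:
        change = np.abs(current - previous).max()
        print(n, np.round(current, 2), "max change:", round(float(change), 2))
    previous = current
```

A stricter check would repeat each sample size with several seeds and look at the spread, since a single pair of runs can agree by chance.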


The video shows the Monte Carlo Analysis as it progresses, with the final results being presented in the plot below.

**N.B.** For the benefit of viewing, the video frames correspond to a logarithmic progression of samples.

In [None]:
# Using the Monte Carlo method (i.e. randomly sampling values from the EPA distributions), calculate the percentage that each rush number has the smallest EPA

N_samples = 400000

# Create dictionary to store percentages resulting from Monte Carlo simulations
monte = {rush : None for rush in rushers}

# Store all sample results (i.e. the number of rushers which had smallest EPA for each N_samples)
samples = []

# Create figure for animation and final plot
fig, ax = plt.subplots(figsize=(8, 4))
x = np.arange(len(rushers))  # the label locations
width = 0.35  # the width of the bars

# Create FFMpegWriter object for animation
writer = FFMpegWriter(fps=10)

# Keep frames on logarithmic scale (as more interesting at start)
n_frames = 200
frames = [int(k) for k in np.logspace(1, np.log10(N_samples), n_frames, endpoint=True)]

with writer.saving(fig, "monte.mp4", 100):
    
    # Monte Carlo Simulation
    for i in range(N_samples):

        # Store current samples EPA-value for each number of rushers
        sample = {rush : 0 for rush in rushers}
        for rush in rushers:
            # Randomly sample distribution
            sample[rush] = random.sample(epas[rush],1)[0]

        # Keep number rushers which had smallest EPA 
        samples.append([rush for rush, epa in sorted(sample.items(), key=lambda item: item[1])][0])

        # Create frame for animation
        if i+1 in frames:

            # Get percentages for each number of rushers (0% if it has not yet been best)
            counts = Counter(samples)
            monte = {rush : counts[rush] / len(samples) * 100.0 for rush in rushers}

            # Clear previous frame
            ax.clear()
            
            # Add bars
            rects1 = ax.barh(x + width/2, list(rushers_used.values()), width, label='Reality', color=colours["SEA"])
            rects2 = ax.barh(x - width/2, list(monte.values()), width, label='Monte Carlo', color=colours["MIA"])
            autolabel(rects1, horizontal=True, percentageLabel=True)
            autolabel(rects2, horizontal=True, percentageLabel=True)

            # Add some text for labels, title and custom x-axis tick labels, etc.
            ax.set_ylabel('Rushers', fontsize=14)
            ax.set_xlabel('% of calls or samples', fontsize=14)
            ax.set_yticks(x)
            ax.set_yticklabels(rushers)
            ax.legend(bbox_to_anchor=(0.67,0.75),loc="lower left")
            ax.set_xlim([0,125])

            # Hide spines
            for spine in ['right', 'top', 'bottom']:
                ax.spines[spine].set_visible(False)

            # Remove ticks
            ax.tick_params(axis='x', which='both',bottom=False,top=False,labelbottom=False)
            ax.tick_params(axis='y', which='both',left=False)

            # Create text to annotate
            text = "Sample count: " + str(i+1) + "\n\nEPA-values:\n"
            text += "3 rushers = " + str(round(sample[3],2))
            text += "\n4 rushers = " + str(round(sample[4],2))
            text += "\n5 rushers = " + str(round(sample[5],2))
            text += "\n6 rushers = " + str(round(sample[6],2)) + "\n"
            text += r"$\bf{Best}$ $\bf{is}$ $\bf{" + str(samples[len(samples)-1]) +  r"}$ $\bf{rushers}$"
            ax.text(0.67, 0.7, text, transform=ax.transAxes, verticalalignment='top')

            fig.tight_layout()

            # Write frame to animation
            writer.grab_frame()

plt.close(fig)
Video("monte.mp4")    

In [None]:
# Replot final values
fig, ax = plt.subplots(figsize=(8, 4))

rects1 = ax.barh(x + width/2, list(rushers_used.values()), width, label='Reality', color=colours["SEA"])
rects2 = ax.barh(x - width/2, list(monte.values()), width, label='Monte Carlo', color=colours["MIA"])

# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_ylabel('Rushers', fontsize=14)
ax.set_xlabel('% of calls or samples', fontsize=14)
ax.set_yticks(x)
ax.set_yticklabels(rushers)
ax.legend(fontsize=12)
ax.set_xlim([0,70])

# Hide spines
for spine in ['right', 'top', 'bottom']:
    ax.spines[spine].set_visible(False)

# Remove ticks
plt.tick_params(axis='x', which='both',bottom=False,top=False,labelbottom=False)
plt.tick_params(axis='y', which='both',left=False)

autolabel(rects1, horizontal=True, percentageLabel=True)
autolabel(rects2, horizontal=True, percentageLabel=True)

fig.tight_layout()

plt.savefig("monte.png")
plt.close(fig)
Image("monte.png") 

These results show a stark contrast between the actual number of rushers used by defensive play callers and their respective success.

The results of the Monte Carlo analysis agree with our initial conclusions from earlier. They imply that defensive play callers ***should*** have a preference for **rushing more players**. 

These results imply that the ***reward*** increases with increasing numbers of players that are rushed. But, as discussed earlier, we also need to consider the ***risk***.


### Monte Carlo - Analysis 2

To investigate ***risk***, we need to consider a few additional results from our Monte Carlo Analysis.

Namely, the percentage of samples that each number of rushers has the:
* **smallest EPA-value** (i.e. best out of 4) - **previously the only result considered**
* **second-smallest EPA-value** (i.e. 2nd best out of 4)
* **third-smallest EPA-value** (i.e. 3rd best out of 4)
* **fourth-smallest**, or largest, EPA-value (i.e. 4th best out of 4)

For example, to calculate the **second-smallest EPA-value** result, take a random EPA sample from each of the 4 distributions and compare the values. Whichever number of rushers has the ***second-smallest*** value is the ***second-best*** for that sample. Repeat 400,000 times, then calculate the percentage of samples in which each number of rushers yields the ***second-best*** result (i.e. the ***second-smallest*** EPA).

Considering all 4 of the results above allows us to investigate the balance between the ***reward*** and ***risk*** associated with each number of rushers. The choice of how many players to rush isn't just about achieving the best outcome, but also about avoiding the bad outcomes. 
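
The ranking procedure described above can also be sketched in vectorised form with NumPy, sorting every sample's four draws at once. This is an illustrative alternative rather than the notebook's method: `toy_epas` is synthetic, and `draws`, `order` and `place_pct` are names introduced for this example only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented EPA samples standing in for `epas`
toy_epas = {
    3: rng.normal(0.1, 1.0, 500),
    4: rng.normal(0.0, 1.2, 500),
    5: rng.normal(-0.1, 1.5, 500),
    6: rng.normal(-0.2, 2.0, 500),
}
rush_counts = sorted(toy_epas)
n_samples = 50_000

# One column of random draws per rusher count, one row per sample
draws = np.column_stack(
    [rng.choice(toy_epas[r], size=n_samples) for r in rush_counts]
)

# order[s, p] = column index of the rusher count in place p for sample s
# (place 0 = smallest EPA, place 3 = largest EPA)
order = draws.argsort(axis=1)

# place_pct[p][r] = % of samples in which rusher count r finished in place p
place_pct = {
    place: dict(zip(
        rush_counts,
        100 * np.bincount(order[:, place], minlength=len(rush_counts)) / n_samples,
    ))
    for place in range(len(rush_counts))
}

for place, pct in place_pct.items():
    print(place, {r: round(float(v), 1) for r, v in pct.items()})
```

Each number of rushers takes exactly one place per sample, so its percentages across the four places sum to 100, which is a useful sanity check on any implementation.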

In [None]:
# Using the Monte Carlo method (i.e. randomly sampling values from the EPA distributions), calculate the percentage that each rush number has the smallest, second-smallest, third-smallest, and worst EPA-value

# Create dictionary of dictionaries to store percentages resulting from Monte Carlo simulations
monte_place = {place : {rush : None for rush in rushers} for place in range(4)}

# Store all sample results
samples = {place : [] for place in range(4)}

# Monte Carlo Simulation
for _ in range(N_samples):

    # Store current sample EPA for each number rushers
    sample = {rush : 0 for rush in rushers}
    for rush in rushers:
        # Randomly sample distribution
        sample[rush] = random.sample(epas[rush],1)[0]

    # Keep number rushers which had EPA corresponding to 'place'
    for place in range(4):
        samples[place].append([rush for rush, epa in sorted(sample.items(), key=lambda item: item[1])][place])

# Get percentages for each number of rushers
for place in range(4):
    samples_percentages = [(i, Counter(samples[place])[i] / len(samples[place]) * 100.0) for i in Counter(samples[place])]
    monte_place[place] = {rush : percent for rush, percent in sorted(samples_percentages, key=lambda item: item[0])}

Now plot the results of the second Monte Carlo Analysis. (For clarity, the 'Reality' of rushers used is not included in this plot).

In [None]:
fig, ax = plt.subplots(figsize=(12, 6))

x = np.arange(len(rushers))  # the label locations
width = 0.175  # the width of the bars

rects1 = ax.barh(x + 1.5*width, list(monte_place[0].values()), width, label='Best (1st of 4)', color=colours["DEN"])
rects2 = ax.barh(x + .5*width, list(monte_place[1].values()), width, label='Second-best (2nd of 4)', color=colours["NYJ"])
rects3 = ax.barh(x - .5*width, list(monte_place[2].values()), width, label='Third-best (3rd of 4)', color=colours["ATL"])
rects4 = ax.barh(x - 1.5*width, list(monte_place[3].values()), width, label='Worst (4th of 4)', color=colours["CAR"])


# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_xlabel('% of samples', fontsize=14)
ax.set_ylabel('Rushers', fontsize=14)
ax.set_yticks(x)
ax.set_yticklabels(rushers)
ax.legend(fontsize=12)
ax.set_xlim([0,45])

# Hide spines
for spine in ['right', 'top', 'bottom']:
    ax.spines[spine].set_visible(False)

# Remove ticks
plt.tick_params(axis='x', which='both',bottom=False,top=False,labelbottom=False)
plt.tick_params(axis='y', which='both',left=False)

autolabel(rects1, horizontal=True, percentageLabel=True)
autolabel(rects2, horizontal=True, percentageLabel=True)
autolabel(rects3, horizontal=True, percentageLabel=True)
autolabel(rects4, horizontal=True, percentageLabel=True)

fig.tight_layout()

plt.savefig("monte_place.png")
plt.close(fig)
Image("monte_place.png") 

At first glance, we might naively conclude that this plot supports the Conventional Wisdom of defensive play callers. It appears to indicate that, although rushing 6 players gives the best outcome the highest percentage of the time, it also gives the lowest percentages for being the 2nd and 3rd best, and the highest percentage for being the worst. On the other hand, although rushing 4 players has approximately the joint-lowest probability of achieving the best outcome, it has the highest percentages for being the 2nd and 3rd best, and the lowest percentage for being the worst.

The issue with the above conclusion is that it ignores the interdependency between the results. Comparing the percentage of samples in which each number of rushers was the 2nd best is not useful; what is useful is comparing the percentage in which each number of rushers was the ***2nd best or better***. It is the **cumulative** percentages which are important for drawing conclusions.

Hence, we will plot the results in a cumulative form. To investigate both ***risk*** and ***reward*** we need to plot the results cumulatively in both 'directions'.
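
To make the two 'directions' concrete, here is a small worked example for a single number of rushers. The placement percentages are invented for illustration and are not results from the analysis.

```python
import numpy as np

# Invented placement percentages for one rusher count: [1st, 2nd, 3rd, 4th]
place_pct = np.array([30.0, 20.0, 20.0, 30.0])

# Reward: "this place or better" accumulates from the best place downwards
reward = np.cumsum(place_pct)

# Risk: "this place or worse" accumulates from the worst place upwards
risk = np.cumsum(place_pct[::-1])[::-1]

print(reward.tolist())  # [30.0, 50.0, 70.0, 100.0]
print(risk.tolist())    # [100.0, 70.0, 50.0, 30.0]
```

Here the 'Second-best or better' value is 30 + 20 = 50, while the 'Second-best or worse' value is 20 + 20 + 30 = 70. The 'Worst or better' and 'Best or worse' values are always 100, so they act as a check rather than carrying information.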

In [None]:
fig, axs = plt.subplots(2,1, figsize=(9, 9))

x = np.arange(len(rushers))  # the label locations
width = 0.175  # the width of the bars

# Reward plot
ax = axs[0]
rects1 = ax.bar(x - 1.5*width, list(monte_place[0].values()), width, label='Best (1st)',color=colours["BAL"])
rects2 = ax.bar(x - .5*width, [a+b for a,b in zip(list(monte_place[1].values()),list(monte_place[0].values()))], width, label='Second-best or better (1st or 2nd)',color=colours["JAX"])
rects3 = ax.bar(x + .5*width, [a+b+c for a,b,c in zip(list(monte_place[2].values()),list(monte_place[1].values()),list(monte_place[0].values()))], width, label='Third-best or better (1st, 2nd or 3rd)',color=colours["PHI"])
rects4 = ax.bar(x + 1.5*width,[a+b+c+d for a,b,c,d in zip(list(monte_place[3].values()),list(monte_place[2].values()),list(monte_place[1].values()),list(monte_place[0].values()))], width, label='Worst or better (1st, 2nd, 3rd or 4th)',color=colours["MIA"])

autolabel(rects1, percentageLabel=True)
autolabel(rects2, percentageLabel=True)
autolabel(rects3, percentageLabel=True)
autolabel(rects4, percentageLabel=True)

# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_ylabel('% of samples',fontsize=14)
ax.set_xlabel('Rushers',fontsize=14)
ax.set_xticks(x)
ax.set_xticklabels(rushers)
ax.set_yticks([0,20,40,60,80,100])
ax.set_yticklabels([0,20,40,60,80,100])
ax.legend(ncol=2,loc="upper center",frameon=False,fontsize=10)
ax.set_title("Reward",fontsize=16)
ax.set_ylim([0,135])

# Hide spines
for spine in ['right', 'left', 'top']:
    ax.spines[spine].set_visible(False)

# Remove ticks
ax.tick_params(axis='x', which='both',bottom=False,top=False)
ax.tick_params(axis='y', which='both',left=False, labelleft=False)

# Risk plot
ax = axs[1]
rects1 = ax.bar(x - 1.5*width,[a+b+c+d for a,b,c,d in zip(list(monte_place[0].values()),list(monte_place[1].values()),list(monte_place[2].values()),list(monte_place[3].values()))], width, label='Best or worse (1st, 2nd, 3rd or 4th)',color=colours["DEN"])
rects2 = ax.bar(x - .5*width, [a+b+c for a,b,c in zip(list(monte_place[1].values()),list(monte_place[2].values()),list(monte_place[3].values()))], width, label='Second-best or worse (2nd, 3rd or 4th)',color=colours["CLE"])
rects3 = ax.bar(x + .5*width, [a+b for a,b in zip(list(monte_place[2].values()),list(monte_place[3].values()))], width, label='Third-best or worse (3rd or 4th)',color=colours["WSH"])
rects4 = ax.bar(x + 1.5*width, list(monte_place[3].values()), width, label='Worst (4th)',color=colours["ATL"])

autolabel(rects1, percentageLabel=True)
autolabel(rects2, percentageLabel=True)
autolabel(rects3, percentageLabel=True)
autolabel(rects4, percentageLabel=True)

# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_ylabel('% of samples',fontsize=14)
ax.set_xlabel('Rushers',fontsize=14)
ax.set_xticks(x)
ax.set_xticklabels(rushers)
ax.set_yticks([0,20,40,60,80,100])
ax.set_yticklabels([0,20,40,60,80,100])
ax.legend(ncol=2,loc="upper center",frameon=False,fontsize=10)
ax.set_title("Risk",fontsize=16)
ax.set_ylim([0,135])

# Hide spines
for spine in ['right', 'left', 'top']:
    ax.spines[spine].set_visible(False)

# Remove ticks
ax.tick_params(axis='x', which='both',bottom=False,top=False)
ax.tick_params(axis='y', which='both',left=False, labelleft=False)


fig.tight_layout()
plt.subplots_adjust(hspace=0.4)

plt.savefig("risk_reward.png")
plt.close(fig)
Image("risk_reward.png") 

Hopefully you are not too confused at this point! If you are, it may be worth going back a few steps for a quick refresh.

So what does this final plot tell us? 

Begin by considering the **'Reward'** subplot. As with the first Monte Carlo analysis, the *'Best'* bars show that the ***reward*** increases with increased numbers of rushers. But the additional results of this analysis allow us to look deeper than this. From the *'Second-best or better'* bars, we can also see that the trend of increasing ***reward*** with more rushers continues. This trend flattens off (and reverses slightly) when considering the *'Third-best or better'* bars, which demonstrates the higher ***risk*** associated with higher numbers of rushers.

We can consider risk further with the second **'Risk'** subplot. As would be expected from Conventional Wisdom, if we just consider the *'Worst'* bars, they show that rushing 4 players poses the lowest ***risk***, followed by 3 rushers, with 5 and 6 rushers being approximately the same. However, if we now consider the *'Third-best or worse'* and *'Second-best or worse'* bars we can see that the opposite trend exists: *the higher the number of rushers, the lower the* ***risk***.

Hence, we have shown that:
* Rushing more players gives a higher chance of achieving the ***best*** outcome
* The Conventional Wisdom of rushing 4 players the majority of the time gives the highest chance of avoiding the ***worst*** outcome
* **But**, rushing more players reduces the risk of the ***second-worst*** and ***third-worst*** outcomes (a bit mind-bending to think about, eh?)

We have demonstrated there is a huge gap between how often defensive play callers use each number of rushers and the reality of their impact on the game. Hence, we draw a quite radical conclusion:

> Defensive play callers should significantly change how often they use each number of rushers. They should rush 4 players less often and show a greater preference for rushing 5 and 6.

However, it is important to remember a key principle: **Equilibrium**. What we have found is the **gradient**, the direction in which change needs to occur. Defensive play callers should undoubtedly look to rush more players more often, but this change should be gradual. For example, the rarity of rushing 6 players lends it the advantage of surprise; a significant increase in rushing 6 players will reduce the surprise element and so could reduce its effectiveness. 

Understanding the balance between ***risk*** and ***reward*** will allow defensive play callers to make more informed decisions on how many players to rush. Rushing more players increases volatility, but on average has the greatest impact on swinging the scoreboard in the team's favour. Defensive play callers can combine this knowledge with their current appetite for risk (depending on the game situation) to make better decisions and help win football games!

# About the Authors

Ben and Arthur met in their first year of university, studying Engineering at the University of Cambridge. They began following the NFL avidly in 2016, supporting the New York Giants and the Chicago Bears respectively. No thanks to their teams, their enjoyment increased as their knowledge of the game deepened. They decided the NFL's Big Data Bowl 2021 was a great opportunity to have some fun and do something very different to their day jobs.