# Domain knowledge and adding context through annotations

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In this notebook we will touch on how a bit of domain knowledge and the addition of plot annotations can improve the utility of our data visualizations. We again return to the LeBron James shot selection data.

In [None]:
df = pd.read_csv("../../data/lebron_james_taken_shots.csv")
rookie = df.loc[df.season=='2003-04'].copy()
nineteenth = df.loc[df.season=='2020-21'].copy()

For each of the `rookie` and `nineteenth` `DataFrame`s, make a bivariate KDE plot using `seaborn`'s `jointplot` (with `kind="kde"`), placing the `'x'` variable on the horizontal axis and the `'y'` variable on the vertical axis.

Use the following arguments as well:
- `fill=True`,
- `thresh=0`,
- `levels=5`,
- `cmap="Greys"`.

Store each resulting `JointGrid` object in a variable and then call:
- `variable.ax_marg_x.remove()`,
- `variable.ax_marg_y.remove()`.

<i> Note: you may need to install the `contourpy` package for your code to work.</i>

In [None]:
## Code here



In [None]:
## Code here



What you have created is a heatmap of the shot distributions for LeBron James's rookie and 19$^\text{th}$ seasons in the NBA. The darker the region, the more shots he has taken there.

While we should be able to tell that the plots look different, it may be difficult to interpret how they are different, particularly if we are not basketball fans. We could provide some additional context to these two plots by annotating the charts with additional court information. Each shot is taken on a court with the following demarcating lines.

<img src="nba_court.png" width="40%"></img>

The small circle at the bottom denotes the hoop that players are trying to get the ball into with each shot. The larger arc is the line that demarcates where a made shot is worth 2 or 3 points. A shot made from above the arc is worth 3 points, while a shot made from below the arc is worth 2 points.
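
To make the 2-versus-3 rule concrete, here is a rough sketch that classifies a shot location by its coordinates. It assumes the coordinate system used by `draw_court` in this notebook: the hoop sits at `(0, 0)`, one unit is a tenth of a foot, the arc is 23'9" (237.5 units) from the hoop, and the straight corner lines sit 220 units to either side. The function name `shot_value` is hypothetical, and the rule is an approximation that ignores where the shooter's feet are.

```python
import math

ARC_RADIUS = 237.5  # 23 ft 9 in, in tenths of a foot
CORNER_X = 220.0    # horizontal position of the straight corner lines

def shot_value(x, y):
    """Approximate point value of a made shot taken from (x, y)."""
    beyond_corner = abs(x) > CORNER_X
    beyond_arc = math.hypot(x, y) > ARC_RADIUS
    return 3 if (beyond_corner or beyond_arc) else 2

print(shot_value(0, 10))     # close to the hoop
print(shot_value(0, 240))    # straight on, just above the arc
print(shot_value(-225, 50))  # left corner, outside the straight line
```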

The code written below provides a function that uses `matplotlib`'s `Patch` object to draw the demarcating lines of an NBA court. `draw_court` takes in an `Axes` object and then draws the lines on that object.

In [None]:
from matplotlib.patches import Circle, Rectangle, Arc

def draw_court(ax=None, color='black', lw=2, outer_lines=False):
    # If an axes object isn't provided to plot onto, just get current one
    if ax is None:
        ax = plt.gca()

    # Create the various parts of an NBA basketball court

    # Create the basketball hoop
    # Diameter of a hoop is 18" so it has a radius of 9" (0.75 ft), which is
    # 7.5 in our coordinate system, where one unit is a tenth of a foot
    hoop = Circle((0, 0), radius=7.5, linewidth=lw, color=color, fill=False)

    # Create backboard
    backboard = Rectangle((-30, -7.5), 60, -1, linewidth=lw, color=color)

    # The paint
    # Create the outer box of the paint, width=16ft, height=19ft
    outer_box = Rectangle((-80, -47.5), 160, 190, linewidth=lw, color=color,
                          fill=False)
    # Create the inner box of the paint, width=12ft, height=19ft
    inner_box = Rectangle((-60, -47.5), 120, 190, linewidth=lw, color=color,
                          fill=False)

    # Create free throw top arc
    top_free_throw = Arc((0, 142.5), 120, 120, theta1=0, theta2=180,
                         linewidth=lw, color=color, fill=False)
    # Create free throw bottom arc
    bottom_free_throw = Arc((0, 142.5), 120, 120, theta1=180, theta2=0,
                            linewidth=lw, color=color, linestyle='dashed')
    # Restricted Zone, it is an arc with 4ft radius from center of the hoop
    restricted = Arc((0, 0), 80, 80, theta1=0, theta2=180, linewidth=lw,
                     color=color)

    # Three point line
    # Create the side 3pt lines, they are 14ft long before they begin to arc
    corner_three_a = Rectangle((-220, -47.5), 0, 140, linewidth=lw,
                               color=color)
    corner_three_b = Rectangle((220, -47.5), 0, 140, linewidth=lw, color=color)
    # 3pt arc - center of arc will be the hoop, arc is 23'9" away from hoop
    # I just played around with the theta values until they lined up with the 
    # threes
    three_arc = Arc((0, 0), 475, 475, theta1=22, theta2=158, linewidth=lw,
                    color=color)

    # Center Court
    center_outer_arc = Arc((0, 422.5), 120, 120, theta1=180, theta2=0,
                           linewidth=lw, color=color)
    center_inner_arc = Arc((0, 422.5), 40, 40, theta1=180, theta2=0,
                           linewidth=lw, color=color)

    # List of the court elements to be plotted onto the axes
    court_elements = [hoop, backboard, outer_box, inner_box, top_free_throw,
                      bottom_free_throw, restricted, corner_three_a,
                      corner_three_b, three_arc, center_outer_arc,
                      center_inner_arc]

    if outer_lines:
        # Draw the half court line, baseline and sidelines
        # (named `boundary` so it does not shadow the `outer_lines` flag)
        boundary = Rectangle((-250, -47.5), 500, 470, linewidth=lw,
                             color=color, fill=False)
        court_elements.append(boundary)

    # Add the court elements onto the axes
    for element in court_elements:
        ax.add_patch(element)

    return ax

In [None]:
## Demonstrating draw_court
fig,ax = plt.subplots(figsize=(6,8))

draw_court(ax=ax)

plt.xlim(-250,250)
plt.ylim(-100,500)

Now remake your two heatmaps from above. This time, pass the `.ax_joint` attribute of each `JointGrid` to `draw_court` to draw the court lines on top of your heatmap.

In [None]:
## Code here



In [None]:
## Code here



Do you see the difference these annotations make? Describe the differences between LeBron's rookie and 19$^\text{th}$ season shot distributions in terms of the court.

How did the addition of some domain knowledge and plot annotations assist in your analysis?

#### Write here




-----------------------------------------------
This notebook was written for the Erdős Institute by Matthew Osborne, Ph.D., 2023.

Any potential redistributors must seek and receive permission from Matthew Tyler Osborne, Ph.D. prior to redistribution. Redistribution of the material contained in this repository is conditional on acknowledgement of Matthew Tyler Osborne, Ph.D.'s original authorship and sponsorship of the Erdős Institute as subject to the license (see License.md)