## Who guards the guards?

NBA teams value lockdown defenders -- players who can keep an opposing team's star from controlling the game. Although some defensive specialists are widely recognized around the league and play extensively despite limited offensive games, others are overlooked.

In this notebook, I'll use publicly released data from the NBA to identify the players most frequently tasked with challenging defensive assignments -- limited to guards for now -- and look at some related questions, like:
- Does top-defender-dom persist from year to year?
- How are teams with two important offensive players defended? How do teams with two top defenders assign them?
- How do these matchups change in the playoffs?

### What's our universe of "important offensive players"?

We're going to start with a Usage leaderboard. Usage reflects the proportion of a team's possessions, while a given player is on the floor, that end with that player (by shooting the ball, getting to the free-throw line, or turning it over). Because turnovers are most likely when a player is dribbling or passing, the turnover component makes Usage a reasonable proxy for players who spend the most time handling the ball, even if they don't shoot as often themselves.
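
For reference, one common box-score formulation of usage rate (the one published by Basketball-Reference) can be sketched as below. It is not necessarily the exact calculation behind the `USG_PCT` column on stats.nba.com; the function name, argument names and the sample line are my own, made up for illustration.

```python
def usage_rate(fga, fta, tov, minutes, team_fga, team_fta, team_tov, team_minutes):
    """Estimate the percentage of team plays a player uses while on the floor."""
    # 0.44 is the usual estimate of the share of free throws that end a possession
    player_plays = fga + 0.44 * fta + tov
    team_plays = team_fga + 0.44 * team_fta + team_tov
    # team_minutes / 5 is the court time available to each of the five lineup slots
    return 100 * player_plays * (team_minutes / 5) / (minutes * team_plays)


# made-up single-game line: 20 FGA, 8 FTA, 4 TOV in 36 of the game's 48 minutes
example_usage = usage_rate(20, 8, 4, 36, 88, 25, 14, 240)
print(round(example_usage, 1))
```

A player who stays on the floor all game and accounts for every team shot, free throw and turnover comes out at 100 under this formula, which is a quick sanity check on the scaling.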

We could consider incorporating Assist Ratio, which is the proportion of possessions for which that player receives credit for an assist (the last pass leading directly to a made shot), but assists are noisier than the components of usage. For one, assists are awarded subjectively by official scorekeepers, who judge whether that last pass was sufficiently proximate to the shot -- scorekeepers are tied to an arena and have been shown to award more assists to the home team. In addition, two passes of equal quality will not be treated identically, because an assist is only awarded if the shot is made; a pass that leads to a miss (or to a shooting foul) earns nothing. As a result, Assist Ratio depends on whether the game is at home or on the road, and on the shooting ability of a player's teammates (and, to a smaller extent, on the skill of the defenders guarding those teammates). So we'll set it aside for now.

In addition, we'll focus on Guards for now -- we want a relatively homogeneous pool of offensive players so that a standout defender is likely to be matched up against most or all of them. In particular, a player who can match up against a point guard could also handle other perimeter players but not necessarily centers.

In [1]:
# Our first step will be to pull a leaderboard for Usage from stats.nba.com and turn it into a pandas dataframe.
# Here, I'm following the workflow helpfully laid out by Greg Reda (http://www.gregreda.com/2015/02/15/web-scraping-finding-the-api/)
# and Savvas Tjortjoglou (http://savvastjortjoglou.com/nba-shot-sharts.html), who used it to obtain other sets of stats from the same site.

import requests
import pandas as pd
import seaborn as sns
%matplotlib inline

In [7]:
# we'll save the URL as a string first

# this gets us regular-season data from 2018-19 in JSON format;
# the MeasureType=Advanced parameter gets us the Usage stat, among others
usage_url = 'https://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country='+ \
                '&DateFrom=&DateTo=&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height='+ \
                '&LastNGames=0&LeagueID=00&Location=&MeasureType=Advanced&Month=0&OpponentTeamID=0'+ \
                '&Outcome=&PORound=0&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerExperience='+ \
                '&PlayerPosition=G&PlusMinus=N&Rank=N&Season=2018-19&SeasonSegment=&SeasonType=Regular+Season'+ \
                '&ShotClockRange=&StarterBench=&TeamID=0&TwoWay=0&VsConference=&VsDivision=&Weight='
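
An alternative to concatenating the query string by hand is to keep the parameters in a dict and let the standard library encode them, which makes it easier to change one value (say, the season) later. The sketch below lists the same parameters as the URL above; the names `build_stats_url`, `empty_params` and `usage_params` are my own.

```python
from urllib.parse import urlencode

base_url = 'https://stats.nba.com/stats/leaguedashplayerstats'

# parameters that are sent empty in the URL above
empty_params = dict.fromkeys([
    'College', 'Conference', 'Country', 'DateFrom', 'DateTo', 'Division',
    'DraftPick', 'DraftYear', 'GameScope', 'GameSegment', 'Height', 'Location',
    'Outcome', 'PlayerExperience', 'SeasonSegment', 'ShotClockRange',
    'StarterBench', 'VsConference', 'VsDivision', 'Weight',
], '')

# parameters that carry a value in the URL above
usage_params = {
    'LastNGames': 0, 'LeagueID': '00', 'MeasureType': 'Advanced', 'Month': 0,
    'OpponentTeamID': 0, 'PORound': 0, 'PaceAdjust': 'N', 'PerMode': 'PerGame',
    'Period': 0, 'PlayerPosition': 'G', 'PlusMinus': 'N', 'Rank': 'N',
    'Season': '2018-19', 'SeasonType': 'Regular Season', 'TeamID': 0, 'TwoWay': 0,
}


def build_stats_url(**overrides):
    """Return the full request URL, optionally overriding individual parameters."""
    params = {**empty_params, **usage_params, **overrides}
    # urlencode turns the space in 'Regular Season' into '+', as in the URL above
    return base_url + '?' + urlencode(params)


usage_url_2018 = build_stats_url()
usage_url_2017 = build_stats_url(Season='2017-18')
```

The parameters come out in a different order than in the hand-built string, which should not matter for a query string, though I have not confirmed that against the server.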

The server won't accept a request sent with the default headers from requests.get(), so we need to send the headers it sees when I load the page manually in a browser.
I'm not sure how this squares with the TOS for the NBA Stats site, so I'll try to send as few GET requests as possible -- no more than I would generate by browsing around the full site.

In [None]:
http_headers = {'Accept': 'application/json', 'x-nba-stats-token': 'true', 'X-NewRelic-ID': 'VQECWF5UChAHUlNTBwgBVw==',
                'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131',
                'x-nba-stats-origin': 'stats', 'Referer': 'https://stats.nba.com/players/advanced/?sort=USG_PCT&dir=-1&CF=GP*G*5:MIN*G*20&Season=2018-19&SeasonType=Regular%20Season'}

# a timeout keeps the cell from hanging if the server never answers
usage_output = requests.get(usage_url, headers=http_headers, timeout=30)
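
One way to keep the number of GET requests down is to save each response to disk the first time and read it back on later runs. This is a minimal sketch, not part of the original workflow; `cached_json` and the file name in the usage note are hypothetical.

```python
import json
from pathlib import Path


def cached_json(cache_path, fetch):
    """Return the JSON stored at cache_path, calling fetch() only if the file is missing."""
    path = Path(cache_path)
    if path.exists():
        return json.loads(path.read_text())
    payload = fetch()  # the only place a network request can happen
    path.write_text(json.dumps(payload))
    return payload
```

In this notebook the call might look like `cached_json('usage_2018-19.json', lambda: requests.get(usage_url, headers=http_headers, timeout=30).json())`. Deleting the file forces a fresh request.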


In [None]:
# now take that JSON output and turn it into a dataframe
result_set = usage_output.json()['resultSets'][0]  # parse the response once
headers1 = result_set['headers']
players1 = result_set['rowSet']

usage_df = pd.DataFrame(players1, columns=headers1)

usage_df.head()

In [None]:
# it's not a leaderboard yet, so we'll need to filter to eliminate random cases (guys who rarely play)
# and then pare down based on a threshold -- say 20 or 25%
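
A minimal sketch of that filter, run here on a made-up frame so it works without the request above. The column names `GP`, `MIN` and `USG_PCT` are what I expect the Advanced leaderboard to return (with `USG_PCT` as a fraction rather than a percentage), and the games and minutes cutoffs are placeholders; it is worth checking `usage_df.columns` and `usage_df.describe()` before relying on either.

```python
import pandas as pd


def usage_leaderboard(df, min_games=20, min_minutes=20, min_usage=0.25):
    """Keep regular rotation players at or above a usage threshold, sorted high to low."""
    regulars = df[(df['GP'] >= min_games) & (df['MIN'] >= min_minutes)]
    leaders = regulars[regulars['USG_PCT'] >= min_usage]
    return leaders.sort_values('USG_PCT', ascending=False).reset_index(drop=True)


# made-up rows, for illustration only
toy_df = pd.DataFrame({
    'PLAYER_ID': [1, 2, 3, 4, 5],
    'PLAYER_NAME': ['Guard A', 'Guard B', 'Guard C', 'Guard D', 'Guard E'],
    'GP': [78, 5, 70, 65, 72],
    'MIN': [36.1, 30.0, 12.4, 33.0, 34.5],
    'USG_PCT': [0.31, 0.35, 0.28, 0.19, 0.33],
})

toy_leaders = usage_leaderboard(toy_df)
print(toy_leaders[['PLAYER_NAME', 'USG_PCT']])
```

Guard B drops out for too few games, Guard C for too few minutes, and Guard D for low usage. On the real data the call would be `usage_leaderboard(usage_df)`.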

### Who guards those players?

In [None]:
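# now we take the list of important offensive players' player IDs
# (usage_df stands in for the filtered leaderboard until the filter above is written;
#  PLAYER_ID is the ID column I expect in the stats.nba.com output -- check usage_df.columns)
offensive_list = usage_df['PLAYER_ID'].tolist()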

# and use it as the source for a new query to stats.nba.com to get the list of players they matched up against
# is there any overlap (two-way players)?
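
The matchup query itself isn't written yet, so the sketch below starts one step later. It assumes the matchup data has already been collected into a long table with one row per offensive player/defender pair. The column names (`OFF_PLAYER_ID`, `DEF_PLAYER_ID`, `POSS`), the 0.5 cutoff and all of the numbers are hypothetical.

```python
import pandas as pd


def tough_assignment_share(matchups, offensive_ids):
    """Per defender: share of matchup possessions spent on the listed offensive players."""
    total = matchups.groupby('DEF_PLAYER_ID')['POSS'].sum()
    tough = (matchups[matchups['OFF_PLAYER_ID'].isin(offensive_ids)]
             .groupby('DEF_PLAYER_ID')['POSS'].sum())
    # defenders who never guarded a listed player get a share of 0
    share = tough.reindex(total.index, fill_value=0) / total
    return share.sort_values(ascending=False)


# made-up matchup rows: players 1 and 2 are the "important offensive players"
toy_matchups = pd.DataFrame({
    'OFF_PLAYER_ID': [1, 2, 9, 1, 9, 2],
    'DEF_PLAYER_ID': [10, 10, 10, 11, 11, 1],
    'POSS': [60, 30, 10, 20, 80, 50],
})
toy_offensive_list = [1, 2]

shares = tough_assignment_share(toy_matchups, toy_offensive_list)

# two-way players: on the offensive list and also above the (placeholder) defender cutoff
top_defenders = shares[shares >= 0.5].index
two_way = set(toy_offensive_list) & set(top_defenders)
```

A raw share favors defenders with few total possessions, so in practice a minimum-possessions filter on `total` would probably be needed before ranking.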

### Is this consistent from year to year?

In [None]:
# we quickly repeat the same exercise but for the 2017-18 season (retaining almost all defenders)
# pair the years against each other by defensive player
# plot pairwise in a scatterplot
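
The pairing step could look something like the sketch below, assuming each season has already been reduced to one number per defender (a Series indexed by player ID). The values are made up, and `pair_seasons` is a hypothetical helper.

```python
import pandas as pd


def pair_seasons(season_a, season_b, labels=('2017-18', '2018-19')):
    """Align two per-defender Series by player ID, keeping defenders present in both."""
    return pd.concat([season_a, season_b], axis=1, keys=labels, join='inner')


# made-up per-defender values; defender 13 only appears in the first season, 14 only in the second
toy_2017 = pd.Series({10: 0.80, 11: 0.25, 12: 0.40, 13: 0.10})
toy_2018 = pd.Series({10: 0.90, 11: 0.20, 12: 0.55, 14: 0.30})

paired = pair_seasons(toy_2017, toy_2018)
r = paired['2017-18'].corr(paired['2018-19'])  # Pearson correlation between seasons
print(paired)

# in the notebook, the scatterplot would be something like:
# sns.scatterplot(data=paired, x='2017-18', y='2018-19')
```

The inner join silently drops defenders who appear in only one season, so it is worth reporting how many were lost alongside the correlation.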

### Elite teammates

In [None]:
# identify cases in the single-season data where two players from the same team are both
# 1) important offensive players or 2) defensive standouts
# do their matchups look different from others?
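
Finding the teams with two qualifying players is a small step once a leaderboard exists. A sketch on made-up rows follows; `TEAM_ABBREVIATION` is the team column I expect in the stats.nba.com output, and the team codes here are placeholders.

```python
import pandas as pd


def teams_with_pairs(players, team_col='TEAM_ABBREVIATION'):
    """Return only the rows whose team appears at least twice in the table."""
    return players[players.duplicated(team_col, keep=False)]


# made-up leaderboard rows, for illustration only
toy_leaderboard = pd.DataFrame({
    'PLAYER_NAME': ['Guard A', 'Guard B', 'Guard C', 'Guard D'],
    'TEAM_ABBREVIATION': ['AAA', 'BBB', 'AAA', 'CCC'],
})

toy_pairs = teams_with_pairs(toy_leaderboard)
print(toy_pairs)
```

The same helper works for either list: pass it the high-usage leaderboard for case 1, or a table of standout defenders for case 2.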

### The Playoffs

In [None]:
# return to stats.nba.com to pull playoff data (probably 2017-18 for now)
# look to see if the following patterns hold:
# - proportion of high-usage players (since rotations shorten)
# - ability of defenders to retain their matchups (more switching)
# - new names (Iguodala)?