# Which jersey number is the "dunk-iest"? 

### In other words, which jersey number is worn by the players responsible for a plurality of dunks?

## Step 1

### Box scores don't capture stats on dunks (at least, most don't). But we do have this information in our play-by-play data. Let's find [this dunk](https://www.youtube.com/watch?v=QqWmpUF6HHg) to get an idea of the data we'll need.



In [0]:
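# pandas' BigQuery connector (requires the pandas-gbq package to be installed)
from pandas.io import gbq
# Replace with your own Google Cloud project ID
project_id = '[YOUR_PROJECT_ID]'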


In [4]:
FGCU_vs_Georgetown_q = """
SELECT
  scheduled_date,
  away_market,
  home_market,
  elapsed_time_sec,
  team_market,
  player_full_name,
  event_type,
  shot_type
FROM
  `bigquery-public-data.ncaa_basketball.mbb_pbp_ncaa`
WHERE
  season = 2012
  AND away_market = "Florida Gulf Coast"
  AND home_market = "Georgetown"
  AND elapsed_time_sec > 2270
  AND elapsed_time_sec < 2300
-- Grouping by every selected column collapses duplicate rows (same effect as SELECT DISTINCT)
GROUP BY
  scheduled_date,
  away_market,
  home_market,
  elapsed_time_sec,
  team_market,
  player_full_name,
  event_type,
  shot_type
ORDER BY
  elapsed_time_sec ASC
"""

FGCU_vs_Georgetown = gbq.read_gbq(query=FGCU_vs_Georgetown_q, dialect='standard', project_id=project_id)
FGCU_vs_Georgetown

Requesting query... ok.
Job ID: job_Fyo6swEhrsi7YZaKhMn-ZrcqMuj5
Query running...
Query done.
Cache hit.

Retrieving results...
Got 12 rows.

Total time taken 0.5 s.
Finished at 2018-03-16 08:03:48.


|  | scheduled_date | away_market | home_market | elapsed_time_sec | team_market | player_full_name | event_type | shot_type |
|---|---|---|---|---|---|---|---|---|
| 0 | 2013-03-22T00:00:00 | Florida Gulf Coast | Georgetown | 2271 | Georgetown | PORTER JR.,OTTO | REBOUND |  |
| 1 | 2013-03-22T00:00:00 | Florida Gulf Coast | Georgetown | 2271 | Georgetown | TRAWICK,JABRIL | MISS | 3PTR |
| 2 | 2013-03-22T00:00:00 | Florida Gulf Coast | Georgetown | 2276 | Georgetown | PORTER JR.,OTTO | MISS | JUMPER |
| 3 | 2013-03-22T00:00:00 | Florida Gulf Coast | Georgetown | 2276 | Georgetown | BOWEN,AARON | REBOUND |  |
| 4 | 2013-03-22T00:00:00 | Florida Gulf Coast | Georgetown | 2279 | Georgetown | BOWEN,AARON | GOOD | TIPIN |
| 5 | 2013-03-22T00:00:00 | Florida Gulf Coast | Georgetown | 2286 | Florida Gulf Coast | FIELER,CHASE | GOOD | DUNK |
| 6 | 2013-03-22T00:00:00 | Florida Gulf Coast | Georgetown | 2286 | Florida Gulf Coast | COMER,BRETT | ASSIST |  |
| 7 | 2013-03-22T00:00:00 | Florida Gulf Coast | Georgetown | 2292 | Georgetown | BOWEN,AARON | GOOD | FT |
| 8 | 2013-03-22T00:00:00 | Florida Gulf Coast | Georgetown | 2292 | Florida Gulf Coast | MURRAY,EDDIE | SUB |  |
| 9 | 2013-03-22T00:00:00 | Florida Gulf Coast | Georgetown | 2292 | Florida Gulf Coast | GRAF,DAJUAN | SUB |  |


## Step 2

### To find dunks, it looks like we'll need plays where event_type = "GOOD" and shot_type = "DUNK". So, which jersey number gets the most dunks?
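
As a quick sanity check on that filter, here is a minimal pure-Python sketch that applies the same condition to a few of the rows returned in Step 1. The `plays` list and the `is_made_dunk` helper are illustrative names only (they are not part of the dataset or of pandas); the sketch simply mirrors what a condition like `event_type = "GOOD" AND shot_type = "DUNK"` selects.

```python
# A few plays copied from the Step 1 output:
# (player_full_name, event_type, shot_type). None stands in for an empty shot_type.
plays = [
    ("BOWEN,AARON", "REBOUND", None),
    ("BOWEN,AARON", "GOOD", "TIPIN"),
    ("FIELER,CHASE", "GOOD", "DUNK"),
    ("COMER,BRETT", "ASSIST", None),
    ("BOWEN,AARON", "GOOD", "FT"),
]


def is_made_dunk(event_type, shot_type):
    """Hypothetical helper: True only when the play is a made dunk."""
    return event_type == "GOOD" and shot_type == "DUNK"


# Collect the players credited with a made dunk in this small sample.
made_dunks = []
for player, event_type, shot_type in plays:
    if is_made_dunk(event_type, shot_type):
        made_dunks.append(player)

print(made_dunks)
```

Only the Chase Fieler play passes both conditions; the tip-in and the free throw are made shots but not dunks, so they are excluded.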


In [5]:
dunks_made_q = """
SELECT
  SAFE_CAST(jersey_num AS INT64) AS jersey,
  COUNTIF(event_type = "GOOD"
    AND shot_type = "DUNK") AS dunks_made
FROM
  `bigquery-public-data.ncaa_basketball.mbb_pbp_ncaa`
WHERE
  home_division_alias = "D1"
  AND away_division_alias = "D1"
  AND SAFE_CAST(jersey_num AS INT64) IS NOT NULL
GROUP BY
  jersey
ORDER BY
  dunks_made DESC
"""

dunks_made = gbq.read_gbq(query=dunks_made_q, dialect='standard', project_id=project_id)
dunks_made

Requesting query... ok.
Job ID: job_3hL3MmXGipozBTT78CxXH5psEX8I
Query running...
Query done.
Processed: 489.9 MB
Standard price: $0.00 USD

Retrieving results...
Got 39 rows.

Total time taken 1.97 s.
Finished at 2018-03-16 08:03:51.


|  | jersey | dunks_made |
|---|---|---|
| 0 | 1 | 8806 |
| 1 | 5 | 8620 |
| 2 | 23 | 8532 |
| 3 | 21 | 7978 |
| 4 | 24 | 7051 |
| 5 | 0 | 6823 |
| 6 | 2 | 6644 |
| 7 | 15 | 6543 |
| 8 | 32 | 6454 |
| 9 | 4 | 6238 |


## Step 3

### Looks like number 1, with 8,806 made dunks. Neat. But which jersey number makes the most field goals overall?

In [6]:
shots_made_q = """
SELECT
  SAFE_CAST(jersey_num AS INT64) AS jersey,
  COUNTIF(event_type = "GOOD"
    AND shot_type = "DUNK") AS dunks_made,
  COUNTIF(event_type = "MISS"
    AND shot_type = "DUNK") AS dunks_missed,
  COUNTIF(event_type = "GOOD"
    AND shot_type != "FT") AS shots_made,
  COUNTIF(event_type = "MISS"
    AND shot_type != "FT") AS shots_missed
FROM
  `bigquery-public-data.ncaa_basketball.mbb_pbp_ncaa`
WHERE
  home_division_alias = "D1"
  AND away_division_alias = "D1"
  AND SAFE_CAST(jersey_num AS INT64) IS NOT NULL
GROUP BY
  jersey
ORDER BY
  shots_made DESC
"""

shots_made = gbq.read_gbq(query=shots_made_q, dialect='standard', project_id=project_id)
shots_made

Requesting query... ok.
Job ID: job_CliORTBNANQFn7htYPfjSMqpdfkO
Query running...
Query done.
Processed: 489.9 MB
Standard price: $0.00 USD

Retrieving results...
Got 39 rows.

Total time taken 2.22 s.
Finished at 2018-03-16 08:03:54.


|  | jersey | dunks_made | dunks_missed | shots_made | shots_missed |
|---|---|---|---|---|---|
| 0 | 1 | 8806 | 1099 | 173493 | 237358 |
| 1 | 3 | 6003 | 773 | 156844 | 219777 |
| 2 | 5 | 8620 | 1004 | 148208 | 201356 |
| 3 | 2 | 6644 | 880 | 136920 | 188938 |
| 4 | 0 | 6823 | 846 | 119356 | 157367 |
| 5 | 11 | 5614 | 659 | 113218 | 156022 |
| 6 | 23 | 8532 | 1010 | 109808 | 140576 |
| 7 | 10 | 4806 | 567 | 106534 | 145931 |
| 8 | 4 | 6238 | 792 | 106506 | 141164 |
| 9 | 21 | 7978 | 966 | 99052 | 120652 |


## Step 4

### Also number 1! But these are just totals. Which jersey number converts the highest percentage of its dunk attempts?
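
The Step 3 output already contains the ingredients for a conversion rate, since attempts are just makes plus misses. The sketch below works through that arithmetic for three jersey numbers from that table. `conversion_rate` is a hypothetical helper written for this illustration; its zero-attempt guard plays the same role as an `IF(... > 0, ..., 0)` wrapper around a division in SQL.

```python
# (dunks_made, dunks_missed) for three jersey numbers, copied from the Step 3 output.
step3_dunks = {
    1: (8806, 1099),
    23: (8532, 1010),
    21: (7978, 966),
}


def conversion_rate(made, missed):
    """Hypothetical helper: made / attempts, or 0 when there were no attempts."""
    attempts = made + missed
    return made / attempts if attempts > 0 else 0


for jersey, (made, missed) in step3_dunks.items():
    print(jersey, round(conversion_rate(made, missed), 4))
```

Even within this small sample, #23 and #21 convert a slightly higher share of their dunk attempts than #1 does, which hints that the leader in totals need not lead in percentage.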

In [7]:
shots_pct_q = """
SELECT
  SAFE_CAST(jersey_num AS INT64) AS jersey,
  COUNTIF(shot_type = "DUNK") AS dunks_att,
  COUNTIF(event_type = "GOOD"
    AND shot_type = "DUNK") AS dunks_made,
  IF(COUNTIF(shot_type = "DUNK")>0,
    COUNTIF(event_type = "GOOD"
      AND shot_type = "DUNK") / COUNTIF(shot_type = "DUNK"),
    0) AS dunks_made_pct,
  COUNTIF(shot_type != "FT") AS shots_att,
  COUNTIF(event_type = "GOOD"
    AND shot_type != "FT") AS shots_made,
  IF(COUNTIF(shot_type != "FT")>0,
    COUNTIF(event_type = "GOOD"
      AND shot_type != "FT") / COUNTIF(shot_type != "FT"),
    0) AS shots_made_pct
FROM
  `bigquery-public-data.ncaa_basketball.mbb_pbp_ncaa`
WHERE
  home_division_alias = "D1"
  AND away_division_alias = "D1"
  AND SAFE_CAST(jersey_num AS INT64) IS NOT NULL
GROUP BY
  jersey
ORDER BY
  dunks_made_pct DESC
"""

shots_pct = gbq.read_gbq(query=shots_pct_q, dialect='standard', project_id=project_id)
shots_pct

Requesting query... ok.
Job ID: job_iUEjYFTvABAwn4UMijNQNRQRhGyA
Query running...
Query done.
Processed: 489.9 MB
Standard price: $0.00 USD

Retrieving results...
Got 39 rows.

Total time taken 2.1 s.
Finished at 2018-03-16 08:03:57.


|  | jersey | dunks_att | dunks_made | dunks_made_pct | shots_att | shots_made | shots_made_pct |
|---|---|---|---|---|---|---|---|
| 0 | 53 | 278 | 259 | 0.931655 | 6281 | 3039 | 0.48384 |
| 1 | 51 | 206 | 190 | 0.92233 | 4776 | 2203 | 0.461265 |
| 2 | 52 | 763 | 693 | 0.908257 | 15798 | 7444 | 0.471199 |
| 3 | 40 | 1927 | 1750 | 0.908147 | 32606 | 15529 | 0.476262 |
| 4 | 31 | 2993 | 2713 | 0.906448 | 82182 | 37198 | 0.45263 |
| 5 | 42 | 2520 | 2283 | 0.905952 | 54682 | 26858 | 0.491167 |
| 6 | 24 | 7806 | 7051 | 0.90328 | 206448 | 91015 | 0.440862 |
| 7 | 35 | 4693 | 4237 | 0.902834 | 85055 | 39912 | 0.469249 |
| 8 | 32 | 7163 | 6454 | 0.901019 | 178428 | 81283 | 0.455551 |
| 9 | 55 | 1743 | 1569 | 0.900172 | 37983 | 17689 | 0.465708 |


##### Note: Some jersey numbers appear to have been recorded incorrectly: "6", "7", and "99" are not valid numbers in NCAA basketball. Across 50,000+ games, a few bad values are bound to slip in.

### Number "1" isn't even near the top!

## Step 5

### So which jersey number attempts and makes the most dunks relative to the total field goals (FGs) it attempts and makes?

In [8]:
dunks_shots_q = """
SELECT
  SAFE_CAST(jersey_num AS INT64) AS jersey,
  COUNTIF(shot_type = "DUNK") AS dunks_att,
  COUNTIF(shot_type != "FT") AS FGs_att,
  IF(COUNTIF(shot_type != "FT")>0,
    COUNTIF(shot_type = "DUNK") / COUNTIF(shot_type != "FT"),
    0) AS dunk_att_pct,
  COUNTIF(event_type = "GOOD"
    AND shot_type = "DUNK") AS dunks,
  COUNTIF(event_type = "GOOD"
    AND shot_type != "FT") AS FGs,
  IF(COUNTIF(event_type = "GOOD"
      AND shot_type != "FT")>0,
    COUNTIF(event_type = "GOOD"
      AND shot_type = "DUNK") / COUNTIF(event_type = "GOOD"
      AND shot_type != "FT"),
    0) AS dunk_pct
FROM
  `bigquery-public-data.ncaa_basketball.mbb_pbp_ncaa`
WHERE
  home_division_alias = "D1"
  AND away_division_alias = "D1"
GROUP BY
  jersey
ORDER BY
  dunk_pct DESC
"""

dunks_shots = gbq.read_gbq(query=dunks_shots_q, dialect='standard', project_id=project_id)
dunks_shots

Requesting query... ok.
Job ID: job_iiyd9jBmfYXFrbm7LBVmAA-ZVFee
Query running...
Query done.
Processed: 489.9 MB
Standard price: $0.00 USD

Retrieving results...
Got 40 rows.

Total time taken 2.26 s.
Finished at 2018-03-16 08:04:00.


|  | jersey | dunks_att | FGs_att | dunk_att_pct | dunks | FGs | dunk_pct |
|---|---|---|---|---|---|---|---|
| 0 | 54.0 | 614 | 9230 | 0.066522 | 547 | 4766 | 0.114771 |
| 1 | 40.0 | 1927 | 32606 | 0.0591 | 1750 | 15529 | 0.112692 |
| 2 | 35.0 | 4693 | 85055 | 0.055176 | 4237 | 39912 | 0.106159 |
| 3 | 44.0 | 4245 | 75543 | 0.056193 | 3805 | 36258 | 0.104942 |
| 4 | 43.0 | 1024 | 19006 | 0.053878 | 920 | 9016 | 0.102041 |
| 5 | 50.0 | 1344 | 25385 | 0.052945 | 1194 | 12325 | 0.096876 |
| 6 | 52.0 | 763 | 15798 | 0.048297 | 693 | 7444 | 0.093095 |
| 7 | 55.0 | 1743 | 37983 | 0.045889 | 1569 | 17689 | 0.088699 |
| 8 | 45.0 | 1692 | 35615 | 0.047508 | 1502 | 17011 | 0.088296 |
| 9 | 41.0 | 1273 | 27537 | 0.046229 | 1139 | 12917 | 0.088178 |


## So players wearing #1 get the most dunks.

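### By our original question, "#1" is the answer, but there's more to it than that. You could argue that "#53" (the highest dunk conversion rate, about 93.2%) or "#54" (the largest share of made field goals that are dunks, about 11.5%) is the dunk-iest number. It all depends on how you look at it.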
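
To make the "depends on how you look at it" point concrete, here is a small sketch that ranks the three candidate jersey numbers under each definition, using only figures from the outputs above. The `stats` layout, the three metric functions, and the `leader` helper are all illustrative names chosen for this example, and the comparison covers just these three jerseys rather than the full result sets.

```python
# jersey: (dunks_made, dunks_attempted, field_goals_made), copied from the outputs above.
# Attempts for #1 are dunks_made + dunks_missed from Step 3 (8806 + 1099 = 9905).
stats = {
    1: (8806, 9905, 173493),
    53: (259, 278, 3039),
    54: (547, 614, 4766),
}


def total_dunks(made, attempted, fgs_made):
    return made


def dunk_conversion_rate(made, attempted, fgs_made):
    return made / attempted


def dunk_share_of_made_fgs(made, attempted, fgs_made):
    return made / fgs_made


def leader(stats, metric):
    """Hypothetical helper: the jersey number with the highest value of `metric`."""
    return max(stats, key=lambda jersey: metric(*stats[jersey]))


for metric in (total_dunks, dunk_conversion_rate, dunk_share_of_made_fgs):
    print(metric.__name__, "->", leader(stats, metric))
```

Each definition picks a different winner among the three: #1 on raw totals, #53 on conversion rate, and #54 on the share of made field goals that are dunks.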