# Project Overview

This is the second problem in the coding assessment for Big League Advance. The goal is to create a table of conditional probabilities detailing each team's chances of landing each draft pick in the NBA Lottery. In this scenario, the lottery has expanded to 16 teams, with the top five picks determined by the lottery drawing.

# Importing Basic Libraries

In [1]:
#These are the libraries I typically use in my analysis so I find it easier to import them all at once
#If I need more libraries I will import them as needed

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
plt.style.use('fivethirtyeight')
%matplotlib inline

# Setting Up the Problem

The first thing I will do is create a set of lists. I will then fill in the blank lists with various probabilities for each pick and then combine all of the lists into a dataframe.

In [2]:
#Here I am creating a simple list with the team numbers
teams =       [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]

#Here is a list with the number of chances each team gets. The sum of these chances is 1000
chances =     [114,113,112,111,99,89,79,69,59,49,39,29,19,9,6,4]

#Here is a list for the first pick. Probability was simple for this one. Simply team chances/total chances
firstpick =   [0.114, 0.113, 0.112, 0.111, 0.099, 0.089, 0.079, 0.069, 0.059, 0.049, 0.039, 0.029, 0.019, 0.009, 0.006, 0.004]

#Here is a blank list for the second pick. Will be filled in with conditional probabilities
secondpick =  [0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.]

#Here is a blank list for the third pick. Will be filled in with conditional probabilities
thirdpick =   [0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.]

#Here is a blank list for the fourth pick. Will be filled in with conditional probabilities
fourthpick =  [0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.]

#Here is a blank list for the fifth pick. Will be filled in with conditional probabilities
fifthpick =   [0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.]

#Here I am creating an 11x16 series of lists full of zeros to fill in with values after the top 5 lottery picks are chosen
postpicks = [[0]*16 for _ in range(11)]
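
The first-pick list above was typed out by hand. As a minimal sketch, the same values can be derived from the chances list, which also makes the 1000-chance total easy to verify. The name `firstpick_derived` is introduced here only for illustration and is not part of the notebook.

```python
# Chances per team, copied from the list above
chances = [114, 113, 112, 111, 99, 89, 79, 69, 59, 49, 39, 29, 19, 9, 6, 4]

# Total number of chances in the lottery
total_chances = sum(chances)

# First-pick probability is team chances / total chances
# (firstpick_derived is an illustrative name, not part of the original notebook)
firstpick_derived = [c / total_chances for c in chances]

print(total_chances)
print(firstpick_derived[:4])
```

Deriving the list this way keeps the probabilities in sync with the chances if the lottery weights ever change.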

# Calculations for Probabilities of Each Pick

1.) The probability for getting the first pick is simply chances/total chances. For example, Team 1 has 114 chances and there are 1000 total chances. So P(1st Pick|Worst Record) = 114/1000 = .114

2.) The probability of getting the second pick, given who picked first, is Pn/(1000-Pi), where Pn is the number of chances held by the team picking second and Pi is the number of chances held by the team that picked first. For example, if Team 2 picked first, then the probability that Team 1 gets the second pick is 114/(1000-113). However, because we do not KNOW who picked first, we have to sum over every scenario where Team X selects first and Team Y selects second. That is calculated by Py * Sum[Px/(1-Px)], where Px and Py are first-pick probabilities (chances/1000) and the sum runs over every team x other than team y.

3.) The probability of getting the third pick, given who picked first and second, is Pn/(1000-Pi-Pk), where Pn is the number of chances held by the team picking third, and Pi and Pk are the chances held by the two teams that took picks 1 and 2. As before, we have to iterate over all scenarios where two different teams get the first two picks.

4.) We will do the same process for picks four and five. After that is finished, picks 6-16 are determined by worst remaining record.
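
To make step 2 concrete, here is a small sketch that computes the second-pick probability for one team by summing over every possible first-pick winner. The helper name `second_pick_probability` is an assumption made for this illustration; the notebook's own loop appears further down.

```python
# Chances per team, copied from the setup cell
chances = [114, 113, 112, 111, 99, 89, 79, 69, 59, 49, 39, 29, 19, 9, 6, 4]

def second_pick_probability(team, chances):
    """P(team picks second), summed over every other team picking first."""
    total = sum(chances)
    prob = 0.0
    for first, first_chances in enumerate(chances):
        if first == team:
            continue  # a team cannot pick both first and second
        # Probability that this other team wins the first pick
        p_first = first_chances / total
        # Probability that `team` wins the second pick once those chances are removed
        p_second_given_first = chances[team] / (total - first_chances)
        prob += p_first * p_second_given_first
    return prob

# Index 0 is Team 1
team1_second = second_pick_probability(0, chances)
all_second = [second_pick_probability(t, chances) for t in range(len(chances))]

print(team1_second)
print(sum(all_second))
```

The value for Team 1 should agree with the first entry of the second-pick list printed later in the notebook (about 0.1105), and the probabilities across all 16 teams should sum to 1 up to floating-point error.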

In [3]:
#Before setting up the lottery picks, I am creating a function that I will call in the middle of iterating over each pick
#This function will be used to calculate picks 6 through 16


def sixtosixteen(probs,postpicks,restprob):
    #Range is 11 since there will be 11 picks remaining after the top 5
    for z in range(11):
        #Many values will be zero since post lotto picks are determined solely by remaining worst record
        nonzeros = [r for r, e in enumerate(probs) if e !=0]
        team = min(nonzeros)
        postpicks[z][team] = postpicks[z][team] + restprob
        probs[team] = 0
    return postpicks
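
To see what this function does for a single scenario, here is a scaled-down sketch with four teams and two post-lottery picks. `assign_remaining` is a hypothetical variant that sizes its loop from the inputs instead of hard-coding 11, and the numbers are made up for illustration.

```python
def assign_remaining(probs, postpicks, restprob):
    # Teams whose entry is non-zero have not won a lottery pick yet
    remaining = [team for team, p in enumerate(probs) if p != 0]
    # Lower index means worse record, so remaining teams pick in index order
    for slot, team in enumerate(remaining):
        postpicks[slot][team] += restprob
    return postpicks

# Toy scenario: teams 0 and 2 already won the lottery picks
probs = [0.0, 0.3, 0.0, 0.1]
postpicks = [[0.0] * 4 for _ in range(2)]

# 0.25 is a made-up probability for this particular scenario
postpicks = assign_remaining(probs, postpicks, 0.25)

print(postpicks)
# prints [[0.0, 0.25, 0.0, 0.0], [0.0, 0.0, 0.0, 0.25]]
```

Each call adds the scenario's probability to exactly one team per remaining pick, so accumulating over every lottery outcome builds up the full distribution for the later picks.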

In [4]:
#I am going to iterate over several lists that I created above
#The range is 16 for 16 teams
for i in range(16):
    #Here I am removing the chances of the team that picked first (iterates over each scenario)
    eliminate_1 = firstpick[:i]+[0.]+firstpick[i+1:]
    #Here I am totalling the sum of the probabilities of each instance where a team was removed
    totalprob = sum(eliminate_1)
    #Here is the actual calculation for the conditional probability of each team 
    condprob = [(firstpick[i]*(x/totalprob)) for x in eliminate_1]
    #Here I am filling in the blank list for second pick with the conditional probabilities
    secondpick = [x+y for x,y in zip(secondpick,condprob)]
    for j in range(16):
        #Here I am setting up the instances of the first two picks being done. i picked first and j picks second
        if i != j:
            #Here we are removing j, after i was already removed
            eliminate_2 = eliminate_1[:j]+[0.]+eliminate_1[j+1:]
            #Here I am totalling the sum of the probabilities of each instance where two teams were removed
            totalprob = sum(eliminate_2)
            #Here is the actual calculation for the conditional probability of each team
            condprob2 = [(condprob[j]*(x/totalprob)) for x in eliminate_2]
            #Here I am filling in the blank list for third pick with the conditional probabilities
            thirdpick = [x+y for x,y in zip(thirdpick,condprob2)]
            for k in range(16):
                #Here I am setting up the instances of the first three picks being done. i and j were picks 1 and 2, and k is pick 3
                if j != k:
                    #Here we are removing k, after i and j were already removed
                    eliminate_3 = eliminate_2[:k]+[0.]+eliminate_2[k+1:]
                    #Here I am totalling the sum of the probabilities of each instance where three teams were removed
                    totalprob = sum(eliminate_3)
                    #Here is the actual calculation for the conditional probability of each team
                    condprob3 = [(condprob2[k]*(x/totalprob)) for x in eliminate_3]
                    #Here I am filling in the blank list for fourth pick with the conditional probabilities
                    fourthpick = [x+y for x,y in zip(fourthpick,condprob3)]   
                    for m in range(16):
                        #Here I am setting up the instances of the first four picks being done. i, j, and k were picks 1, 2, and 3, and m is pick 4
                        if k != m:
                            #Here we are removing m, after i,j, and k were already removed
                            eliminate_4 = eliminate_3[:m]+[0.]+eliminate_3[m+1:]
                            #Here I am totalling the sum of the probabilities of each instance where four teams were removed
                            totalprob = sum(eliminate_4)
                            #Here is the actual calculation for the conditional probability of each team
                            condprob4 = [(condprob3[m]*(x/totalprob)) for x in eliminate_4]
                            #Here I am filling in the blank list for fifth pick with the conditional probabilities
                            fifthpick = [x+y for x,y in zip(fifthpick,condprob4)]
                            for p in range(16):
                                #Here p is the fifth pick. Once it is set, picks 6-16 follow from the remaining records
                                if (i!= p) and (j!= p) and (k!= p) and (m!= p):
                                    restprob = condprob4[p]
                                    eliminate_5 = eliminate_4[:p]+[0.]+eliminate_4[p+1:]
                                    postpicks = sixtosixteen(eliminate_5,postpicks,restprob)

#Here I am printing out the lists for picks 2-5
print(secondpick)
print(thirdpick)
print(fourthpick)
print(fifthpick)

#Here I am printing the sum of conditional probabilities for each pick. They should each equal 1. 
print(sum(secondpick))
print(sum(thirdpick))
print(sum(fourthpick))
print(sum(fifthpick))

[0.11046816539648968, 0.10964293389150861, 0.10881483649539843, 0.10798388287933292, 0.09779311603057485, 0.0889993169632134, 0.07994095605882101, 0.07062655840344596, 0.0610642866990324, 0.05126196031621248, 0.041227073157537895, 0.030966810416907364, 0.020488064313952566, 0.009797448875787242, 0.0065499056937587345, 0.004374684408026337]
[0.1065373303971223, 0.1058931936520546, 0.10524385839395109, 0.10458931994097552, 0.09632773500108896, 0.08886742904494047, 0.08088411382279348, 0.072380311526806, 0.06336026693716985, 0.05382962554011003, 0.04379516357955705, 0.03326456145786907, 0.02224621341173762, 0.010749067614519209, 0.007207949862083267, 0.004823859817221329]
[0.10212448692362917, 0.10166735974274657, 0.10120334840716977, 0.10073241398658515, 0.09452774193034068, 0.0885483225689449, 0.08180291366397417, 0.07426899742757236, 0.06592974580675945, 0.05677359375941363, 0.04679372502853851, 0.03598752964897639, 0.02435606887828487, 0.011903568116122927, 0.008008686847560276, 0.005
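
The nested loops above can also be written recursively, which avoids repeating the same block for each pick. The following is a sketch rather than the notebook's method; `lottery_pick_table` and its parameters are names introduced here. It works from the integer chances, so the denominator at each level is simply the number of chances still in play.

```python
def lottery_pick_table(weights, n_picks):
    """Return table[pick][team] = P(team receives that lottery pick)."""
    n = len(weights)
    table = [[0.0] * n for _ in range(n_picks)]

    def recurse(depth, prob, remaining, remaining_total):
        # prob is the probability of the sequence of winners chosen so far
        for team in remaining:
            p = prob * weights[team] / remaining_total
            table[depth][team] += p
            if depth + 1 < n_picks:
                # Remove this winner's chances before drawing the next pick
                next_remaining = [t for t in remaining if t != team]
                recurse(depth + 1, p, next_remaining, remaining_total - weights[team])

    recurse(0, 1.0, list(range(n)), sum(weights))
    return table

# Chances per team, copied from the setup cell
chances = [114, 113, 112, 111, 99, 89, 79, 69, 59, 49, 39, 29, 19, 9, 6, 4]

table = lottery_pick_table(chances, 5)
row_sums = [sum(row) for row in table]

print(table[1][:4])
print(row_sums)
```

Row 1 of the result should agree with the second-pick list printed above, and every row should sum to 1 up to floating-point error.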

In [5]:
#Here I am printing out the lists for picks 6-16

print(postpicks)

[[0.469748500140766, 0.273959687126413, 0.14774425983451964, 0.07130598531453737, 0.029955432696845814, 0.007286134886921827, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0.19898052259756555, 0.25441802088221555, 0.22915174091738647, 0.17890173563285072, 0.10339999897680466, 0.035147980993184016, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0.07399237523831563, 0.15560472669477476, 0.22585265929678758, 0.24782788664978628, 0.20078256672165992, 0.09593978539867601, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0.02332950667529628, 0.078714529750046, 0.16377797879000183, 0.2524570991395996, 0.2865523700232242, 0.1951685156218349, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0.006640840915682166, 0.03270689374296664, 0.09571829387835687, 0.20587069807758324, 0.3301662265628834, 0.3288970468225329, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0.001628625260691197, 0.011285821818687834, 0.04585996470444641, 0.13803594899338165, 0.3188260098322746, 0.4843636293905233, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0.0003299303912972046, 0.00315644098160

In [6]:
#Here I am converting picks 6-16 to a dataframe

last_picks = pd.DataFrame(postpicks)

In [7]:
#Here I am transposing the dataframe

last_picks = last_picks.T

In [8]:
#Here is a brief look at the picks 6-16 dataframe

last_picks.head()

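| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.469749 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1 | 0.27396 | 0.198981 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2 | 0.147744 | 0.254418 | 0.073992 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 3 | 0.071306 | 0.229152 | 0.155605 | 0.02333 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4 | 0.029955 | 0.178902 | 0.225853 | 0.078715 | 0.006641 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |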


In [9]:
#Here I am adding column names to the last picks dataframe

last_picks.columns =['Sixth Pick', 'Seventh Pick', 'Eighth Pick', 'Ninth Pick', 'Tenth Pick', 'Eleventh Pick', 
                     'Twelfth Pick', 'Thirteenth Pick', 'Fourteenth Pick', 'Fifteenth Pick', 'Sixteenth Pick']

In [10]:
#Here is a look at the dataframe after adding column names

last_picks.head()

| | Sixth Pick | Seventh Pick | Eighth Pick | Ninth Pick | Tenth Pick | Eleventh Pick | Twelfth Pick | Thirteenth Pick | Fourteenth Pick | Fifteenth Pick | Sixteenth Pick |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.469749 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1 | 0.27396 | 0.198981 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2 | 0.147744 | 0.254418 | 0.073992 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 3 | 0.071306 | 0.229152 | 0.155605 | 0.02333 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4 | 0.029955 | 0.178902 | 0.225853 | 0.078715 | 0.006641 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |


In [11]:
#Here I am converting the lottery pick lists into a dataframe

lottery_picks = pd.DataFrame(
    {'Team': teams,
     'Chances': chances,
     'First Pick': firstpick,
     'Second Pick': secondpick,
     'Third Pick': thirdpick,
     'Fourth Pick': fourthpick,
     'Fifth Pick': fifthpick
    })

In [12]:
#Here is a brief look at the newly created dataframe

lottery_picks.head()

| | Team | Chances | First Pick | Second Pick | Third Pick | Fourth Pick | Fifth Pick |
|---|---|---|---|---|---|---|---|
| 0 | 1 | 114 | 0.114 | 0.110468 | 0.106537 | 0.102124 | 0.097122 |
| 1 | 2 | 113 | 0.113 | 0.109643 | 0.105893 | 0.101667 | 0.096856 |
| 2 | 3 | 112 | 0.112 | 0.108815 | 0.105244 | 0.101203 | 0.096583 |
| 3 | 4 | 111 | 0.111 | 0.107984 | 0.104589 | 0.100732 | 0.096302 |
| 4 | 5 | 99 | 0.099 | 0.097793 | 0.096328 | 0.094528 | 0.092286 |


In [13]:
#Here I am combining the lotto picks and post lotto picks into one dataframe

total_probabilities = pd.concat([lottery_picks, last_picks], axis=1, join='inner')

In [14]:
#Here is the finished dataframe

total_probabilities.head()

| | Team | Chances | First Pick | Second Pick | Third Pick | Fourth Pick | Fifth Pick | Sixth Pick | Seventh Pick | Eighth Pick | Ninth Pick | Tenth Pick | Eleventh Pick | Twelfth Pick | Thirteenth Pick | Fourteenth Pick | Fifteenth Pick | Sixteenth Pick |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 114 | 0.114 | 0.110468 | 0.106537 | 0.102124 | 0.097122 | 0.469749 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1 | 2 | 113 | 0.113 | 0.109643 | 0.105893 | 0.101667 | 0.096856 | 0.27396 | 0.198981 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2 | 3 | 112 | 0.112 | 0.108815 | 0.105244 | 0.101203 | 0.096583 | 0.147744 | 0.254418 | 0.073992 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 3 | 4 | 111 | 0.111 | 0.107984 | 0.104589 | 0.100732 | 0.096302 | 0.071306 | 0.229152 | 0.155605 | 0.02333 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4 | 5 | 99 | 0.099 | 0.097793 | 0.096328 | 0.094528 | 0.092286 | 0.029955 | 0.178902 | 0.225853 | 0.078715 | 0.006641 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
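
One quick consistency check on this table: Team 1 has the worst record, so if it misses all five lottery picks it must pick sixth. Its sixth-pick probability should therefore equal one minus the sum of its first five entries. The sketch below uses the rounded values shown in the first row; the variable names are illustrative.

```python
# Team 1's probabilities for picks 1-5, rounded as displayed in the table
team1_top_five = [0.114, 0.110468, 0.106537, 0.102124, 0.097122]

# If Team 1 misses the lottery entirely, it can only pick sixth
team1_sixth = 1 - sum(team1_top_five)

print(team1_sixth)
```

The result comes out to roughly 0.4697, in line with the Sixth Pick value in the first row of the table.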


In [15]:
#Here I am exporting the dataframe to a csv file

total_probabilities.to_csv("DraftProbabilityTable.csv")

# Sources

1.) https://math.la.asu.edu/~rich/puzzles/prob007sb.html
    
2.) http://www.ajuronline.org/uploads/Volume%202/Issue%203/23E-FlorkeArt.pdf

3.) http://www.celticshub.com/2017/05/16/watch-lottery-love-math/

4.) https://squared2020.com/2017/09/30/how-nba-draft-lottery-probabilities-are-constructed/