
## **Neural Network NBA Player Classification**

In this project we use a training data set of basketball players to classify a set of basketball prospects by their expected standing as potential NBA players. We build the model with scikit-learn's MLPRegressor, a neural network regressor, and round its numeric predictions back to categories.

The 4 potential categories are:
  1. Role Player
  2. Contributor
  3. Franchise Player
  4. Superstar

The data set has already been cleaned.

Variables in our data set include:

Player_Name - Player's Name

Position_ID - ID that represents the position the player plays

Shots - Total number of Shots taken over season

Makes - Total number of Makes in season

Points - Total number of Points in season

Assists - Total number of Assists in season

Blocks - Total number of Blocks in season

Fouls - Total number of Fouls in season

Years_Exp - Years of playing

Team_Value - Contains one of the four categories for player classification. This will be the dependent variable of our model.



In [5]:
from google.colab import drive
drive.mount('/content/drive/', force_remount=True)

Mounted at /content/drive/


In [3]:
import pandas as pd
import numpy as np

Read in the Players training data and the Prospects data for scoring.

In [6]:
players = pd.read_csv('/content/drive/My Drive/Colab Notebooks/Players.csv')
prospects = pd.read_csv('/content/drive/My Drive/Colab Notebooks/Prospects.csv')

In [7]:
players.head()

Unnamed: 0,Player_Name,Position_ID,Shots,Makes,Points,Assists,Blocks,Fouls,Years_Exp,Team_Value
0,Joel Nelson,5,1128,373,1383,138,28,30,8,Contributor
1,Kurt Walters,5,970,456,1656,114,29,58,10,Franchise Player
2,Woodrow Patrick,3,1116,413,1195,108,36,60,12,Franchise Player
3,Delbert Lowe,3,1443,506,1787,112,35,70,13,Contributor
4,Cedric Barton,4,2340,843,1076,131,22,63,13,Franchise Player


In [8]:
print(players['Team_Value'].value_counts())

Team_Value
Contributor         86
Role Player         79
Franchise Player    59
Superstar           39
Name: count, dtype: int64


We need to convert the 4 categories in the Team_Value column to numerical values. We'll do this with nested calls to the np.where function.

In [9]:
players['Team_Value'] = np.where(players['Team_Value'] == 'Role Player', 1,
np.where(players['Team_Value'] == 'Contributor', 2,
np.where(players['Team_Value'] == 'Franchise Player', 3, 4)))
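The same encoding can also be written as a dictionary lookup. This is a minimal sketch, assuming the four labels shown in the value counts above; the `team_value_codes` name and the small `sample` Series are illustrative and not part of the notebook.

```python
import pandas as pd

# Hypothetical mapping name; the codes follow the ordering used in the notebook.
team_value_codes = {
    "Role Player": 1,
    "Contributor": 2,
    "Franchise Player": 3,
    "Superstar": 4,
}

# Illustrative stand-in for players['Team_Value'].
sample = pd.Series(["Contributor", "Superstar", "Role Player", "Franchise Player"])

# Series.map looks each label up in the dictionary. Labels missing from the
# dictionary become NaN instead of falling into the last category.
encoded = sample.map(team_value_codes)
print(encoded.tolist())
```

One difference worth noting: the nested np.where assigns 4 to any value that is not one of the first three labels, while the dictionary lookup surfaces unexpected labels as missing values.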

Now let's take a look at the Team_Value column again.

In [10]:
print(players['Team_Value'].value_counts())

Team_Value
2    86
1    79
3    59
4    39
Name: count, dtype: int64


In [None]:
players.info()
prospects.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 263 entries, 0 to 262
Data columns (total 10 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   Player_Name  263 non-null    object
 1   Position_ID  263 non-null    int64 
 2   Shots        263 non-null    int64 
 3   Makes        263 non-null    int64 
 4   Points       263 non-null    int64 
 5   Assists      263 non-null    int64 
 6   Blocks       263 non-null    int64 
 7   Fouls        263 non-null    int64 
 8   Years_Exp    263 non-null    int64 
 9   Team_Value   263 non-null    int64 
dtypes: int64(9), object(1)
memory usage: 20.7+ KB
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 59 entries, 0 to 58
Data columns (total 9 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   Player_Name  59 non-null     object
 1   Position_ID  59 non-null     int64 
 2   Shots        59 non-null     int64 
 3   Makes        59 non-null     int64 
 4  

Import the MLPRegressor class from scikit-learn.

In [11]:
from sklearn.neural_network import MLPRegressor as nnet

We need to separate the independent variables from the dependent variable.

All variables except Player_Name and Team_Value will be the Independent predictor variables.

Team_Value will be the Dependent variable.

In [12]:
IndVars = players[['Position_ID', 'Shots', 'Makes', 'Points', 'Assists', 'Blocks', 'Fouls', 'Years_Exp']]
DepVar = players['Team_Value']

Now let's create our model. We'll set the maximum number of iterations to 10000 to give the optimizer plenty of room to converge, and use a single hidden layer with 7 neurons (hidden_layer_sizes=7).

In [13]:
playersModel = nnet(max_iter=10000, random_state=0, hidden_layer_sizes=7)
playersModel.fit(IndVars, DepVar)
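In scikit-learn, `hidden_layer_sizes` gives the number of neurons in each hidden layer, so the integer 7 describes one hidden layer of 7 neurons. The sketch below checks this on synthetic data; the random arrays are placeholders, not the Players data, and the short `max_iter` is chosen only to keep the example fast.

```python
import warnings

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 8))      # 8 synthetic predictors
y = rng.integers(1, 5, size=40)   # synthetic targets coded 1 to 4

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # a short run may warn about convergence
    model = MLPRegressor(hidden_layer_sizes=(7,), max_iter=200, random_state=0)
    model.fit(X, y)

# One weight matrix per connection between layers:
# input -> hidden and hidden -> output.
weight_shapes = [w.shape for w in model.coefs_]
print(weight_shapes)
print(model.n_layers_)  # counts the input, hidden, and output layers
```

Seven hidden layers of 7 neurons each would instead be written as `hidden_layer_sizes=(7, 7, 7, 7, 7, 7, 7)`.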

Now let's apply our model to the Prospects data set, excluding the first column (Player_Name).

In [14]:
ProspectsPredictions = playersModel.predict(prospects.iloc[:, 1:9])
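Positional slicing depends on the column order of the CSV file. A possible alternative is to select the predictors by name, so that training and scoring are guaranteed to use the same columns. In this sketch the `feature_cols` name and the one-row `demo` frame are illustrative assumptions.

```python
import pandas as pd

# Hypothetical list of predictor names, matching the columns used for training.
feature_cols = ["Position_ID", "Shots", "Makes", "Points",
                "Assists", "Blocks", "Fouls", "Years_Exp"]

# One-row stand-in for the prospects frame.
demo = pd.DataFrame([{
    "Player_Name": "Gary Price", "Position_ID": 5, "Shots": 1060, "Makes": 403,
    "Points": 1411, "Assists": 145, "Blocks": 29, "Fouls": 32, "Years_Exp": 8,
}])

by_name = demo[feature_cols]       # selection by column name
by_position = demo.iloc[:, 1:9]    # selection by position, as in the notebook
print(by_name.equals(by_position))
```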

Now we need to convert all the predicted scores for the Team_Value category back into their categorical format.

Then we'll display the dataframe to get a look at the results.

In [17]:
prospects['PredScore'] = np.around(ProspectsPredictions, 0).tolist()
prospects.loc[prospects['PredScore'] <= 1.0, 'Predicted_Team_Value'] = 'Role Player'
prospects.loc[prospects['PredScore'] == 2.0, 'Predicted_Team_Value'] = 'Contributor'
prospects.loc[prospects['PredScore'] == 3.0, 'Predicted_Team_Value'] = 'Franchise Player'
prospects.loc[prospects['PredScore'] >= 4.0, 'Predicted_Team_Value'] = 'Superstar'
prospects.drop('PredScore', axis=1, inplace=True)
pd.set_option('display.max_rows', None)
prospects.head(25)

Unnamed: 0,Player_Name,Position_ID,Shots,Makes,Points,Assists,Blocks,Fouls,Years_Exp,Predicted_Team_Value
0,Gary Price,5,1060,403,1411,145,29,32,8,Contributor
1,Raul Little,5,931,488,1739,122,31,61,11,Superstar
2,Roman Richards,3,1194,425,1267,114,38,63,13,Franchise Player
3,Geoffrey Lloyd,3,1486,531,1858,119,36,76,14,Franchise Player
4,Jesus Huff,4,2504,885,1098,138,22,64,14,Role Player
5,Jan Becker,4,70,46,52,147,30,70,2,Superstar
6,John Mcguire,3,1540,685,2237,147,53,51,13,Superstar
7,Robert Holloway,5,1918,948,1228,116,52,33,11,Superstar
8,Herbert Watkins,2,578,164,175,123,37,24,9,Franchise Player
9,Stewart Chavez,5,1989,1004,1347,146,38,31,14,Superstar
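The conversion back to labels can also be expressed as round, clamp, and look up. This is a sketch only: the `raw_scores` values are made up to include predictions outside the 1 to 4 range, and the `labels` dictionary is an illustrative name.

```python
import numpy as np
import pandas as pd

labels = {1: "Role Player", 2: "Contributor", 3: "Franchise Player", 4: "Superstar"}

# Made-up regression outputs, including values below 1 and above 4.
raw_scores = np.array([0.3, 1.8, 2.9, 4.2, 5.7])

# Round to the nearest code, then clamp into the valid range 1..4.
codes = np.clip(np.rint(raw_scores), 1, 4).astype(int)

# Translate each numeric code back to its category label.
predicted = pd.Series(codes).map(labels)
print(predicted.tolist())
```

Clamping plays the same role as the `<= 1.0` and `>= 4.0` comparisons in the cell above: a regressor is free to predict values outside the range of the training codes.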


Now let's answer some questions using our results:

1. How many players are classified as Superstars?
2. What is the minimum number of shots a player must make to be considered a superstar?
3. For how many years on Average have superstar players played?
4. How many superstar players have less than 10 years of experience?

# **1. How many players are classified as Superstars?**


To find this we'll use the value_counts() method to find the number in each category.

In [None]:
prospects['Predicted_Team_Value'].value_counts()

Predicted_Team_Value,count
Franchise Player,17
Contributor,14
Superstar,14
Role Player,14


# *Answer: 14 players are classified as Superstars*
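If only one category is of interest, a boolean comparison can be summed directly. A small sketch, using a toy column rebuilt from the counts in the value_counts() output above (the `toy` name is illustrative):

```python
import pandas as pd

# Toy column rebuilt from the category counts reported above.
toy = pd.Series(
    ["Franchise Player"] * 17 + ["Contributor"] * 14
    + ["Superstar"] * 14 + ["Role Player"] * 14,
    name="Predicted_Team_Value",
)

# True counts as 1 when summed, so this counts the matching rows.
n_superstars = int((toy == "Superstar").sum())
print(n_superstars)
```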

# **2. What is the minimum number of shots a player must make to be considered a superstar?**

To find this we can filter the Superstar players out of the prospects data into a new dataframe.

Then we can use either the .min() method or the .describe() method on the Makes column to find the minimum.

In [18]:
Supers = prospects[prospects['Predicted_Team_Value'] == 'Superstar']

In [19]:
Supers

Unnamed: 0,Player_Name,Position_ID,Shots,Makes,Points,Assists,Blocks,Fouls,Years_Exp,Predicted_Team_Value
1,Raul Little,5,931,488,1739,122,31,61,11,Superstar
5,Jan Becker,4,70,46,52,147,30,70,2,Superstar
6,John Mcguire,3,1540,685,2237,147,53,51,13,Superstar
7,Robert Holloway,5,1918,948,1228,116,52,33,11,Superstar
9,Stewart Chavez,5,1989,1004,1347,146,38,31,14,Superstar
11,Drew Kelley,2,794,468,1487,127,27,41,6,Superstar
15,Fernando Rowe,4,2334,1237,4187,145,29,44,10,Superstar
19,Lance Goodwin,3,857,335,353,136,44,54,13,Superstar
32,Lynn Williams,5,559,230,564,152,37,32,9,Superstar
48,Pete Ingram,4,1516,957,1999,113,22,70,7,Superstar


0.6571428571428571


# *Answer: The minimum number of Makes is 46 in one season*
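A minimal sketch of how such a minimum can be computed with .min(), and how .idxmin() identifies the player behind it. The three rows are copied from the Supers output above and are not the full set of Superstars.

```python
import pandas as pd

# Three rows copied from the Supers output; not the full set of Superstars.
subset = pd.DataFrame({
    "Player_Name": ["Raul Little", "Jan Becker", "John Mcguire"],
    "Shots": [931, 70, 1540],
    "Makes": [488, 46, 685],
})

min_makes = subset["Makes"].min()

# idxmin returns the index label of the row holding the minimum.
min_row = subset.loc[subset["Makes"].idxmin()]
print(min_makes, min_row["Player_Name"])
```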

This answer suggests that the raw number of Makes is not a strong predictor on its own, since a player with only 46 makes was still classified as a Superstar. The ratio of Makes to Shots taken may be more influential in the model.

Let's add a new column to our data set that shows the make percentage of each player as well.

In [24]:
Supers.insert(4, 'MakePercentage', Supers['Makes']/Supers['Shots'])

In [25]:
Supers

Unnamed: 0,Player_Name,Position_ID,Shots,Makes,MakePercentage,Points,Assists,Blocks,Fouls,Years_Exp,Predicted_Team_Value
1,Raul Little,5,931,488,0.524168,1739,122,31,61,11,Superstar
5,Jan Becker,4,70,46,0.657143,52,147,30,70,2,Superstar
6,John Mcguire,3,1540,685,0.444805,2237,147,53,51,13,Superstar
7,Robert Holloway,5,1918,948,0.494265,1228,116,52,33,11,Superstar
9,Stewart Chavez,5,1989,1004,0.504776,1347,146,38,31,14,Superstar
11,Drew Kelley,2,794,468,0.589421,1487,127,27,41,6,Superstar
15,Fernando Rowe,4,2334,1237,0.529991,4187,145,29,44,10,Superstar
19,Lance Goodwin,3,857,335,0.390898,353,136,44,54,13,Superstar
32,Lynn Williams,5,559,230,0.411449,564,152,37,32,9,Superstar
48,Pete Ingram,4,1516,957,0.631266,1999,113,22,70,7,Superstar
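The ratio can also be added without modifying the filtered frame in place. A hedged sketch using `DataFrame.assign`, with two rows copied from the table above as stand-in data:

```python
import pandas as pd

# Two rows copied from the Supers output as stand-in data.
subset = pd.DataFrame({
    "Player_Name": ["Raul Little", "Jan Becker"],
    "Shots": [931, 70],
    "Makes": [488, 46],
})

# assign returns a new frame and leaves the original untouched.
with_pct = subset.assign(MakePercentage=subset["Makes"] / subset["Shots"])
print(with_pct["MakePercentage"].round(6).tolist())
```

Unlike `insert`, `assign` appends the new column at the end, so the column order differs from the table shown above.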


Now let's see what the minimum make percentage is for Superstars.

In [26]:
Supers['MakePercentage'].min()

0.33386837881219905

The minimum make percentage is about 33%. While make percentage is probably influential, another variable such as Years_Exp might be a better predictor of Superstar status.

# **3. For how many years on average have Superstar players played?**

In [28]:
Supers['Years_Exp'].mean()

9.428571428571429

**Superstars have an average of about 9.43 years of experience. This is consistent with the idea from the last question that years of experience tend to be higher for Superstars.**

# **4. How many Superstar players have fewer than 10 years of experience?**

In [35]:
print(Supers[Supers['Years_Exp'] < 10].count())

Player_Name             6
Position_ID             6
Shots                   6
Makes                   6
MakePercentage          6
Points                  6
Assists                 6
Blocks                  6
Fouls                   6
Years_Exp               6
Predicted_Team_Value    6
dtype: int64


There are 6 Superstars with fewer than 10 years of experience.
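Calling .count() on the filtered frame repeats the same number for every column. A single number can be obtained by summing the boolean mask instead. The sketch below uses only the Years_Exp values of the ten Superstar rows displayed earlier, so its result is 4 rather than the full-data answer of 6.

```python
import pandas as pd

# Years_Exp for the ten Superstar rows displayed earlier (a partial listing).
years = pd.Series([11, 2, 13, 11, 14, 6, 10, 13, 9, 7], name="Years_Exp")

# Each True in the mask counts as 1 when summed.
n_under_10 = int((years < 10).sum())
print(n_under_10)
```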

Let's see who those players are:

In [36]:
print(Supers[Supers['Years_Exp'] < 10])

         Player_Name  Position_ID  Shots  Makes  MakePercentage  Points  \
5        Jan  Becker            4     70     46        0.657143      52   
11      Drew  Kelley            2    794    468        0.589421    1487   
32    Lynn  Williams            5    559    230        0.411449     564   
48      Pete  Ingram            4   1516    957        0.631266    1999   
49  Randall  Parsons            1   1159    754        0.650561    1428   
52  Michael  Padilla            2   1079    699        0.647822     898   

    Assists  Blocks  Fouls  Years_Exp Predicted_Team_Value  
5       147      30     70          2            Superstar  
11      127      27     41          6            Superstar  
32      152      37     32          9            Superstar  
48      113      22     70          7            Superstar  
49      121      29     41          7            Superstar  
52      119      45     59          8            Superstar  
