# NBA All-Star Weekend Challenge

Your task: answer the following questions using the dataset on the NBA All-Star Game from 2000 to 2016.

Become more familiar with the dataset <a href="https://www.kaggle.com/fmejia21/nba-all-star-game-20002016">here</a>

Once completed, commit this notebook to GitHub and submit the link to the Google Classroom assignment.

<a href="https://classroom.google.com/u/2/c/NDc4MzEzMjI5Nzla/a/NTE3OTYxNzM2OTNa/details">Google Classroom</a>

# What is the average weight of all players who played during this time?

In [9]:
import pandas as pd
df = pd.read_csv('files/AllStars.csv')

avg_wt = df['WT'].mean()

print(f'The average weight of all players during this time is: {round(avg_wt,2)}lbs')


The average weight of all players during this time is: 228.75lbs


# What team is represented the most during this time?

In [60]:
# Group rows by team, count the rows in each group, and sort descending so the
# team with the most selections is first. Each row is one selection, so a player
# picked in several years is counted once per year.
most_rep_team = df.groupby(['Team'], as_index=False).count().sort_values('Player', ascending=False).reset_index()

# .iloc[0] is the top-ranked team after the sort.
print(f"The team most represented during this time was the: {most_rep_team.iloc[0]['Team']} with {most_rep_team.iloc[0]['Player']} players.")

The team most represented during this time was the: Miami Heat with 28 players.
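
The group, count, and sort steps above can also be written with `value_counts()`. The sketch below is a minimal illustration on a made-up `toy` DataFrame, not the real `AllStars.csv`; the rows and the names `toy`, `team_counts`, `top_team`, and `top_count` are assumptions introduced here.

```python
import pandas as pd

# Toy stand-in for AllStars.csv; these rows are invented for illustration only.
toy = pd.DataFrame({
    'Player': ['Player A', 'Player B', 'Player C', 'Player D', 'Player E'],
    'Team': ['Miami Heat', 'Miami Heat', 'Chicago Bulls', 'Miami Heat', 'Chicago Bulls'],
})

# value_counts() counts rows per team and sorts the result in descending order.
team_counts = toy['Team'].value_counts()

# idxmax() returns the label with the largest count, max() the count itself.
top_team = team_counts.idxmax()
top_count = team_counts.max()

print(f'{top_team}: {top_count} selections')
```

On the real data, replacing `toy` with `df` should give the same ranking as the `groupby` version, since both count one row per selection.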


# What draft class is represented the most during this time?

In [74]:
# Add a 'Short Draft Class' column holding only the first 4 characters of
# 'NBA Draft Status', i.e. the draft year.
df['Short Draft Class'] = df['NBA Draft Status'].str.slice(0,4)
# Group by draft year, count the rows in each group, and sort descending so the
# most-represented class is first (same approach as the team ranking).
ranked_drafts = df.groupby('Short Draft Class').count().sort_values('Player', ascending=False).reset_index()

print(f"The draft class represented most during this time was the class of: {ranked_drafts.iloc[0]['Short Draft Class']} with {ranked_drafts.iloc[0]['Player']} players coming from that class.")

The draft class represented most during this time was the class of: 1996 with 63 players coming from that class.
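
Slicing the first four characters works as long as every 'NBA Draft Status' value starts with the year. If that is not guaranteed, a regular expression is a more defensive option. This is a sketch on made-up rows; the exact string format shown in `toy` is an assumption, as are the names `draft_year`, `class_counts`, `top_class`, and `top_class_count`.

```python
import pandas as pd

# Toy stand-in; the draft-status strings are an assumed format, not real rows.
toy = pd.DataFrame({
    'Player': ['Player A', 'Player B', 'Player C'],
    'NBA Draft Status': ['2003 Rnd 1 Pick 1', '1996 Rnd 1 Pick 13', '1996 Rnd 1 Pick 15'],
})

# Pull out the first run of four digits; rows without one become NaN
# instead of yielding an arbitrary 4-character slice.
draft_year = toy['NBA Draft Status'].str.extract(r'(\d{4})', expand=False)

# Count rows per draft year (NaN values are dropped by default).
class_counts = draft_year.value_counts()
top_class = class_counts.idxmax()
top_class_count = class_counts.max()

print(f'Class of {top_class}: {top_class_count} selections')
```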


# Are foreign players more prevalent in the All-Star Game during this time?

In [82]:
# Add a 'Foreign' column; 'TEST' is only a placeholder that the next two lines overwrite.
df['Foreign'] = 'TEST'

# Players whose 'Nationality' is 'United States' get 'No'; everyone else gets 'Yes'.
df.loc[df['Nationality'] == 'United States', 'Foreign'] = 'No'
df.loc[df['Nationality'] != 'United States', 'Foreign'] = 'Yes'
# Same group-and-count approach as the earlier answers.
foreign_check = df.groupby(['Foreign']).count().sort_values('Player', ascending=False).reset_index()

print(f"Foreign players are not more prevalent in the All-Star game, as American players outnumber them {foreign_check.iloc[0]['Player']} to {foreign_check.iloc[1]['Player']}")

Foreign players are not more prevalent in the All-Star game, as American players outnumber them 365 to 74
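
The same comparison can be made without a helper column by working with the boolean mask directly. The sketch below uses made-up rows in a `toy` DataFrame; `is_foreign`, `foreign_count`, and `foreign_share` are names introduced here for illustration.

```python
import pandas as pd

# Toy stand-in for AllStars.csv; the rows are invented for illustration only.
toy = pd.DataFrame({
    'Player': ['Player A', 'Player B', 'Player C', 'Player D'],
    'Nationality': ['United States', 'Germany', 'United States', 'Spain'],
})

# True for every row whose nationality is not 'United States'.
# Note: a missing (NaN) nationality also compares as "not equal" and would
# be counted as foreign, which matches the behavior of the cell above.
is_foreign = toy['Nationality'] != 'United States'

# Summing booleans counts the True values; the mean is the share of True values.
foreign_count = int(is_foreign.sum())
foreign_share = is_foreign.mean()

print(f'{foreign_count} foreign selections ({foreign_share:.1%} of all selections)')
```

Reporting the share alongside the raw counts makes the "more prevalent" question easier to answer at a glance.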


# How often are the Western All-Stars voted in by fans?

In [119]:
# Look at only the Western Conference players, who fall into 3 selection categories:
# Fan Vote Selection, Coaches Selection, and Replacement Selection.
western_types = [
    'Western All-Star Fan Vote Selection',
    'Western All-Star Coaches Selection',
    'Western All-Star Replacement Selection',
]
western_selection_df = df.loc[df['Selection Type'].isin(western_types)].reset_index()
# Group by selection type, then get the count of each group.
western_players = western_selection_df.groupby(['Selection Type']).count()['Player']
# Look up the fan-vote count by label rather than by position, then compute the % of fan-voted selections.
fan_votes = western_players['Western All-Star Fan Vote Selection']
total = western_players.sum()
print(f'Historically, All-Stars from the Western Conference are voted in by fans {fan_votes} out of {total} times, or {round((fan_votes/total)*100,2)}% of the time.')


Historically, All-Stars from the Western Conference are voted in by fans 85 out of 219 times, or 38.81% of the time.


# How many times does Steph Curry make the All-Star game during this time period?

In [143]:
# A player's name appears once for every year he was selected, so filtering on
# the name and counting the rows gives his number of selections
# (no one else in the league shares his name).
steph_curry_selections = df.loc[df['Player'] == 'Stephen Curry'].groupby('Player').count().reset_index().iloc[0]['Year']
print(f'From 2000 to 2016 Steph Curry made the All-Star game: {steph_curry_selections} times.')

From 2000 to 2016 Steph Curry made the All-Star game: 4 times.


# How many Shooting Guards have made the All-Star game during this time?

In [154]:
# Try a different counting approach: add a helper 'count' column of 1s, then
# count it per position and look up 'SG'. This counts selections (rows), so a
# repeat All-Star is counted once per year.
df['count'] = 1
allstar_sg = df.groupby(['Pos']).count()['count']['SG']

print(f'From 2000 to 2016, there has been a total of {allstar_sg} Shooting Guards.')

From 2000 to 2016, there has been a total of 57 Shooting Guards.
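
Because each row is one selection, the count above is the number of Shooting Guard selections. If the question is read as "how many distinct players", `nunique()` gives that figure instead. The sketch below uses made-up rows; `toy`, `sg_rows`, `sg_selections`, and `sg_players` are names introduced here for illustration.

```python
import pandas as pd

# Toy stand-in for AllStars.csv; the rows are invented for illustration only.
toy = pd.DataFrame({
    'Year': [2015, 2016, 2016, 2016],
    'Player': ['Player A', 'Player A', 'Player B', 'Player C'],
    'Pos': ['SG', 'SG', 'SG', 'PF'],
})

# Keep only the Shooting Guard rows.
sg_rows = toy[toy['Pos'] == 'SG']

# Number of rows = number of selections (repeat All-Stars counted each year).
sg_selections = len(sg_rows)

# Number of unique names = number of distinct players.
sg_players = sg_rows['Player'].nunique()

print(f'{sg_selections} SG selections by {sg_players} distinct players')
```

The same distinction applies to the per-team counts, which also count one row per selection.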


# How many Chicago Bulls players have made the All-Star Game during this time period?

In [156]:
bulls_allstar = df.groupby(['Team']).count()['count']['Chicago Bulls']
print(f'During this time period there have been {bulls_allstar} players from the Chicago Bulls that have made the All-Star Game.')

During this time period there have been 12 players from the Chicago Bulls that have made the All-Star Game.
