# What Makes a Top 10 NBA Draft Pick?
**Analyzing Player Attributes from Top 10 NBA First Round Draft Picks**

Authors: William Miller, Matthew Muccio

## Project Abstract

Every June, the National Basketball Association (NBA) holds a draft in which each of the thirty teams has an opportunity to select two top prospects to join its organization. With only two rounds in the draft, and therefore only two chances per team, it is crucial that a team does proper research, scouting, and analysis so that its draft picks have a significant impact on its odds of winning a championship.

Using the NBA data API for all players currently in the league, we will examine top 10 draft picks and see how they compare to all players in the league. Then we will use various individual player attributes to determine which properties correlate with being a top NBA draft pick. After reading through our data analysis, we hope that you will understand the importance of the research and scouting that NBA teams undertake when selecting top draft picks.

### Project Outline

1. Project Introduction
  - A. Libraries and Dependencies
  - B. Data Sources
  - C. Importing and Examining the Dataset
  - D. Getting Top 10 Draft Picks
  - E. Various Data Trends of Top 10 Draft Picks
2. EDA of Top 10 Draft Picks
  - A. Top 10 Draft Picks by Position
  - B. How many current NBA players are top 10 picks?
  - C. How many current NBA players are not top 10 picks?
  - D. Top 10 Draft Picks by Size
  - E. Top 10 Draft Picks by Place of Origin
3. Analysis of Top 10 Draft Picks' Attributes
  - A. What Attributes Matter Most? 
  - B. Comparing Attributes
  - C. Visualizing Key Attributes
4. Finding Key Attributes Using Multiple Linear Regression
  - A. Null Hypothesis Testing
  - B. Using scikit-learn and statsmodels for the Regression Model
5. Predicting the Ideal Draft Pick Based on Player Attributes with ML
  - A. Training and Testing
6. Project Conclusion
  - A. Closing Statement About Attributes
  - B. Closing Statement About Ideal Draft Pick Prediction

****

# 1. Introduction

## 1A. Libraries and Dependencies

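- Matplotlib - pyplot: plotting interface used to draw and customize our figures.
- Pandas: loading, cleaning, and organizing the player data in dataframes.
- scikit-learn: linear regression models and train/test splitting utilities.
- Seaborn: statistical visualizations built on top of Matplotlib.
- statsmodels - api: regression models with detailed statistical summaries.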

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn import linear_model
from sklearn import model_selection
from statsmodels import api as sm
```

## 1B. Data Sources

The dataset used in our analysis includes information on almost 500 current NBA players. The NBA releases an updated version of this data every day. It contains information such as player names, height, weight, college, draft number, and country of origin. We will look only at the players who were drafted in the top 10 of their class.

The dataset can be found [here](http://data.nba.net/10s/prod/v1/2018/players.json). It comes directly from the NBA website.

## 1C. Importing and Examining the Dataset

We load the JSON into a Pandas dataframe and keep only the columns we will use for analysis. These include:

- First Name
- Last Name
- Position
- Height (Feet)
- Height (Inches)
- Weight (Pounds)
- Date of Birth (Year)
- Date of Birth (Month)
- Date of Birth (Day)
- NBA Debut Year
- Number of Years in NBA
- College
- Last Affiliation (College or Location)
- Country
- Draft Round Number
- Draft Pick Number
- Draft Year
- Team
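
The loading step described above can be sketched roughly as follows. This is a minimal illustration, not the exact notebook code: the JSON layout (`league` → `standard`, a nested `draft` object, string-valued fields such as `dateOfBirthUTC`) and all output column names are assumptions, and the two players in `sample_payload` are made up so the example runs without a network call.

```python
import pandas as pd

# Hypothetical payload mimicking the ASSUMED layout of players.json.
sample_payload = {
    "league": {
        "standard": [
            {
                "firstName": "Alex", "lastName": "Example", "pos": "G",
                "heightFeet": "6", "heightInches": "3", "weightPounds": "190",
                "dateOfBirthUTC": "1995-03-14", "nbaDebutYear": "2016",
                "yearsPro": "2", "collegeName": "Sample State",
                "lastAffiliation": "Sample State/USA", "country": "USA",
                "teamId": "1",
                "draft": {"roundNum": "1", "pickNum": "4", "seasonYear": "2016"},
            },
            {
                "firstName": "Sam", "lastName": "Undrafted", "pos": "F",
                "heightFeet": "6", "heightInches": "8", "weightPounds": "225",
                "dateOfBirthUTC": "1993-11-02", "nbaDebutYear": "2015",
                "yearsPro": "3", "collegeName": "",
                "lastAffiliation": "Elsewhere", "country": "Canada",
                "teamId": "2",
                "draft": {"roundNum": "", "pickNum": "", "seasonYear": ""},
            },
        ]
    }
}

# Assumed JSON field -> readable column name (names chosen for illustration).
COLUMNS = {
    "firstName": "first_name", "lastName": "last_name", "pos": "position",
    "heightFeet": "height_feet", "heightInches": "height_inches",
    "weightPounds": "weight_pounds", "nbaDebutYear": "debut_year",
    "yearsPro": "years_pro", "collegeName": "college",
    "lastAffiliation": "last_affiliation", "country": "country",
    "draft.roundNum": "draft_round", "draft.pickNum": "draft_pick",
    "draft.seasonYear": "draft_year", "teamId": "team_id",
}
BIRTH_COLUMNS = ["birth_year", "birth_month", "birth_day"]
NUMERIC_COLUMNS = BIRTH_COLUMNS + [
    "height_feet", "height_inches", "weight_pounds", "debut_year",
    "years_pro", "draft_round", "draft_pick", "draft_year",
]


def players_to_frame(payload: dict) -> pd.DataFrame:
    """Flatten the player records and keep only the analysis columns."""
    # json_normalize turns the nested "draft" object into "draft.*" columns.
    df = pd.json_normalize(payload["league"]["standard"])
    # Split "YYYY-MM-DD" into separate year, month, and day columns.
    dob = df["dateOfBirthUTC"].str.split("-", expand=True)
    df["birth_year"], df["birth_month"], df["birth_day"] = dob[0], dob[1], dob[2]
    df = df.rename(columns=COLUMNS)[list(COLUMNS.values()) + BIRTH_COLUMNS]
    # Values arrive as strings; empty strings (e.g. undrafted) become NaN.
    df[NUMERIC_COLUMNS] = df[NUMERIC_COLUMNS].apply(pd.to_numeric, errors="coerce")
    return df


players = players_to_frame(sample_payload)
print(players[["first_name", "last_name", "draft_round", "draft_pick"]])
```

With the real feed, `sample_payload` would be replaced by the parsed JSON from the dataset URL; the field names should be checked against the actual response before relying on them.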

Since each player's team is stored as a numeric ID instead of a team name, we need to convert the IDs to team names. To do this, we use another page of data that the NBA provides. The data page can be found [here](http://data.nba.net/) and includes information on all teams associated with the NBA, including their team IDs. We can use this data to match each team ID in our dataframe with an actual team name.
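
One way to do this ID-to-name matching is a simple lookup with `Series.map`. The sketch below uses made-up team IDs and names; the `teamId` and `fullName` field names in the teams table and the `team_id` column in the player table are assumptions for illustration.

```python
import pandas as pd

# Hypothetical player table with team IDs stored as strings.
players = pd.DataFrame({
    "last_name": ["Example", "Sample", "Nobody"],
    "team_id": ["1", "2", ""],  # empty string: no current team
})

# Hypothetical teams table (field names are assumed, not confirmed).
teams = pd.DataFrame({
    "teamId": ["1", "2"],
    "fullName": ["Team One", "Team Two"],
})

# Build an ID -> name lookup and apply it to every player.
id_to_name = teams.set_index("teamId")["fullName"]
players["team"] = players["team_id"].map(id_to_name)

print(players)
```

IDs with no match in the teams table come back as `NaN`, which makes unmatched players easy to spot before continuing.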

## 1D. Getting Top 10 Draft Picks

Since our project focuses on Top 10 draft picks, we will organize our data to isolate only players who were drafted within the top ten of their class.
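
A minimal sketch of this filter is shown below. It assumes numeric `draft_round` and `draft_pick` columns (hypothetical names) in which undrafted players are `NaN`, and it treats "top 10" as first-round picks numbered 1 through 10.

```python
import pandas as pd

# Hypothetical players: two top 10 picks, a later first-round pick,
# a second-round pick, and an undrafted player.
players = pd.DataFrame({
    "last_name": ["A", "B", "C", "D", "E"],
    "draft_round": [1, 1, 1, 2, None],
    "draft_pick": [3, 10, 14, 35, None],
})


def top_ten_picks(df: pd.DataFrame) -> pd.DataFrame:
    """Return only the players drafted in round 1 with pick 1 through 10."""
    # Comparisons against NaN are False, so undrafted players drop out.
    is_top_ten = (df["draft_round"] == 1) & df["draft_pick"].between(1, 10)
    return df[is_top_ten].reset_index(drop=True)


top10 = top_ten_picks(players)
print(top10)
```

Keeping the filter in a small function makes it easy to build the complementary group (everyone who is not a top 10 pick) from the same mask later in the analysis.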

## 1E. Various Data Trends