# Notebook - WNBA Playoffs Qualification Prediction
The objective of this project is to develop a machine learning model that predicts which WNBA teams will qualify for the playoffs in the next season, based on data from previous seasons.

Authors:
- 
- 
- Pedro Gomes

## Step 1: Data Analysis
We started by importing all data from the .csv files into DataFrames:

In [48]:
import pandas as pd

awards_players = pd.read_csv('dataset/awards_players.csv')
coaches = pd.read_csv('dataset/coaches.csv')
players = pd.read_csv('dataset/players.csv')
players_teams = pd.read_csv('dataset/players_teams.csv')
series_post = pd.read_csv('dataset/series_post.csv')
teams = pd.read_csv('dataset/teams.csv')
teams_post = pd.read_csv('dataset/teams_post.csv')

After importing the .csv files, we looked at each table to see what data we had available to work with. We also took notes on each table's attributes to make the data easier to analyse and understand. Below is an explanation of each attribute of each table, followed by a sample of rows taken from the table.
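
A quick overview of each table can also be produced programmatically. The sketch below is only an illustration: `summarize_table` is a hypothetical helper that is not part of the original notebook, and the small DataFrame stands in for one of the tables loaded above.

```python
import pandas as pd


def summarize_table(name: str, df: pd.DataFrame) -> dict:
    """Return basic facts about a table: size, constant columns, missing values."""
    return {
        "table": name,
        "rows": len(df),
        "columns": df.shape[1],
        # Columns with a single distinct value carry no information.
        "constant_columns": [
            col for col in df.columns if df[col].nunique(dropna=False) <= 1
        ],
        "missing_values": int(df.isna().sum().sum()),
    }


# Tiny stand-in table with illustrative values (not real data).
sample = pd.DataFrame({
    "playerID": ["a01w", "b01w", "c01w"],
    "year": [1, 2, 2],
    "lgID": ["WNBA", "WNBA", "WNBA"],
})

summary = summarize_table("sample", sample)
print(summary)
```

Applied to the real tables, the same helper could be called once per DataFrame, for example `summarize_table("coaches", coaches)`.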

## Table: Awards-Players

This table represents an association between a player and an award she received.

### Attribute Specification

| Attribute Name | Description |
| -- | -- |
|playerID|Player identifier|
|award|Name of the award|
|year|Year in which the player received the award|
|lgID|Identifier of the league in which the award was given|

### Table Sample

In [49]:
awards_players.head(15)

Unnamed: 0,playerID,award,year,lgID
0,thompti01w,All-Star Game Most Valuable Player,1,WNBA
1,leslili01w,All-Star Game Most Valuable Player,2,WNBA
2,leslili01w,All-Star Game Most Valuable Player,3,WNBA
3,teaslni01w,All-Star Game Most Valuable Player,4,WNBA
4,swoopsh01w,All-Star Game Most Valuable Player,6,WNBA
5,douglka01w,All-Star Game Most Valuable Player,7,WNBA
6,fordch01w,All-Star Game Most Valuable Player,8,WNBA
7,cashsw01w,All-Star Game Most Valuable Player,10,WNBA
8,coopemi01w,Coach of the Year,1,WNBA
9,hugheda99w,Coach of the Year,2,WNBA


### Table Analysis

In [50]:
print(awards_players.describe())

            year
count  95.000000
mean    5.789474
std     2.747900
min     1.000000
25%     3.500000
50%     7.000000
75%     8.000000
max    10.000000


## Table: Coaches

Represents an association between a coach, a league year and a team, together with the stats gathered for that combination.

### Attribute Specification
| Attribute Name | Description |
|--|--|
|coachID| Indicates which coach the stats refer to|
|year| Indicates the year the stats refer to|
|tmID| Indicates the team the stats refer to|
|lgID| Indicates the league the stats refer to|
|stint| Period of time that a player, coach, or other individual spends with a particular team or in the league itself|
|won| Number of matches won by the team in the specified year|
|lost| Number of matches lost by the team in the specified year|
|post_wins| Number of wins during playoffs|
|post_losses| Number of losses during playoffs|

### Table Sample


In [51]:
coaches.head(15)

Unnamed: 0,coachID,year,tmID,lgID,stint,won,lost,post_wins,post_losses
0,adamsmi01w,5,WAS,WNBA,0,17,17,1,2
1,adubari99w,1,NYL,WNBA,0,20,12,4,3
2,adubari99w,2,NYL,WNBA,0,21,11,3,3
3,adubari99w,3,NYL,WNBA,0,18,14,4,4
4,adubari99w,4,NYL,WNBA,0,16,18,0,0
5,adubari99w,5,NYL,WNBA,1,7,9,0,0
6,adubari99w,6,WAS,WNBA,0,16,18,0,0
7,adubari99w,7,WAS,WNBA,0,18,16,0,2
8,adubari99w,8,WAS,WNBA,1,0,4,0,0
9,aglerbr99w,1,MIN,WNBA,0,15,17,0,0


### Table Analysis

In [52]:
print(coaches.describe())

             year       stint         won        lost   post_wins  post_losses
count  162.000000  162.000000  162.000000  162.000000  162.000000   162.000000
mean     5.314815    0.364198   14.672840   14.623457    1.166667     1.172840
std      2.896715    0.693861    6.403445    5.678789    1.953656     1.316782
min      1.000000    0.000000    0.000000    2.000000    0.000000     0.000000
25%      3.000000    0.000000   10.000000   11.000000    0.000000     0.000000
50%      5.000000    0.000000   16.000000   15.000000    0.000000     0.000000
75%      8.000000    0.000000   18.750000   18.000000    1.000000     2.000000
max     10.000000    2.000000   28.000000   30.000000    7.000000     5.000000
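
Since `won` and `lost` are raw counts, a ratio may be easier to compare across coaches with different numbers of games. The following is a minimal sketch, assuming a derived column named `win_ratio` (a hypothetical name, not taken from the notebook) and using three rows copied from the sample above.

```python
import numpy as np
import pandas as pd

# Rows copied from the Coaches sample above (subset of columns).
coaches_sample = pd.DataFrame({
    "coachID": ["adamsmi01w", "adubari99w", "adubari99w"],
    "year": [5, 4, 5],
    "tmID": ["WAS", "NYL", "NYL"],
    "won": [17, 16, 7],
    "lost": [17, 18, 9],
})

# Total games coached in each row.
games = coaches_sample["won"] + coaches_sample["lost"]

# Share of games won; a row with no games would yield NaN instead of an error.
coaches_sample["win_ratio"] = coaches_sample["won"] / games.replace(0, np.nan)
print(coaches_sample)
```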


## Table: Players

Associates a series of stats with a player.

### Attribute Specification

|Attribute|Description|
|--|--|
|bioID| Indicates the player the stats refer to.|
|pos| Position the player plays in (for example C, F, G or F-C in the sample below)|
|firstseason| (all values appear as 0)|
|lastseason| (all values appear as 0)|
|height| Player's height|
|weight| Player's weight|
|college| College the player attended|
|collegeOther| Another college the player attended|
|birthDate| Player's date of birth|
|deathDate| Player's date of death ("0000-00-00" in case the player is still alive)|

### Table Sample

In [53]:
players.head(15)

Unnamed: 0,bioID,pos,firstseason,lastseason,height,weight,college,collegeOther,birthDate,deathDate
0,abrahta01w,C,0,0,74.0,190,George Washington,,1975-09-27,0000-00-00
1,abrossv01w,F,0,0,74.0,169,Connecticut,,1980-07-09,0000-00-00
2,adairje01w,C,0,0,76.0,197,George Washington,,1986-12-19,0000-00-00
3,adamsda01w,F-C,0,0,73.0,239,Texas A&M,Jefferson College (JC),1989-02-19,0000-00-00
4,adamsjo01w,C,0,0,75.0,180,New Mexico,,1981-05-24,0000-00-00
5,adamsmi01w,,0,0,0.0,0,,,0000-00-00,0000-00-00
6,adubari99w,,0,0,0.0,0,,,0000-00-00,0000-00-00
7,aglerbr99w,,0,0,0.0,0,,,0000-00-00,0000-00-00
8,aguilel01w,G,0,0,67.0,165,George Washington,,1976-10-15,0000-00-00
9,ajavoma01w,G,0,0,68.0,160,Rutgers,,1986-05-07,0000-00-00


### Table Analysis

In [54]:
print(players.describe())

       firstseason  lastseason      height      weight
count        893.0       893.0  893.000000  893.000000
mean           0.0         0.0   65.500560  145.415454
std            0.0         0.0   20.940425   61.275703
min            0.0         0.0    0.000000    0.000000
25%            0.0         0.0   68.000000  140.000000
50%            0.0         0.0   72.000000  162.000000
75%            0.0         0.0   75.000000  180.000000
max            0.0         0.0   80.000000  254.000000
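
The sample and the summary above suggest that this table uses placeholders for missing data: a height or weight of 0, and the date "0000-00-00". The sketch below shows one possible way to turn those placeholders into proper missing values and to drop the two all-zero columns. It is an assumption about a reasonable cleaning step, not the method used in the notebook, and `clean_players` is a hypothetical helper.

```python
import numpy as np
import pandas as pd

# Two rows copied from the Players sample above (subset of columns).
players_sample = pd.DataFrame({
    "bioID": ["abrahta01w", "adamsmi01w"],
    "firstseason": [0, 0],
    "lastseason": [0, 0],
    "height": [74.0, 0.0],
    "weight": [190, 0],
    "birthDate": ["1975-09-27", "0000-00-00"],
    "deathDate": ["0000-00-00", "0000-00-00"],
})


def clean_players(df: pd.DataFrame) -> pd.DataFrame:
    """Replace placeholder values with NaN/NaT and drop constant columns."""
    out = df.copy()
    # A height or weight of 0 is treated as "unknown".
    for col in ["height", "weight"]:
        out[col] = out[col].replace(0, np.nan)
    # "0000-00-00" is not a valid date, so it is coerced to NaT.
    for col in ["birthDate", "deathDate"]:
        out[col] = pd.to_datetime(out[col], format="%Y-%m-%d", errors="coerce")
    # firstseason and lastseason are 0 for every row, so they add nothing.
    return out.drop(columns=["firstseason", "lastseason"])


cleaned = clean_players(players_sample)
print(cleaned)
```

Treating the placeholders as missing keeps them from distorting statistics such as the mean height, which the zeros pull down in the summary above.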


## Table: Players-Teams
Associates a player with a year, the team they played in that year and a set of stats on their performance.

### Attribute Specification

|Attribute|Description|
|--|--|
|playerID| Identifies a player|
|year| Year the player played in the team|
|stint| Stint of the player with the team in the specified year (same notion as in the Coaches table)|
|tmID| Identifies the team the player played in|
|lgID| Identifies the league the stats refer to|
|GP| Games Played|
|GS| Games Started|
|minutes| Minutes Played|
|points| Points Scored|
|oRebounds| Offensive Rebounds|
|dRebounds| Defensive Rebounds|
|rebounds| Total Rebounds|
|assists| Assists|
|steals| Steals|
|blocks| Blocks|
|turnovers| Turnovers|
|PF| Personal Fouls|
|fgAttempted| Field Goals Attempted|
|fgMade| Field Goals Made|
|ftAttempted| Free Throws Attempted|
|ftMade| Free Throws Made|
|threeAttempted| Three Point Field Goals Attempted|
|threeMade| Three Point Field Goals Made|
|dq| Disqualifications|
|PostGP| Games Played in the Playoffs|
|PostGS| Games Started in the Playoffs|
|PostMinutes| Minutes Played in the Playoffs|
|PostPoints| Points Scored in the Playoffs|
|PostoRebounds| Offensive Rebounds in the Playoffs|
|PostdRebounds| Defensive Rebounds in the Playoffs|
|PostRebounds| Total Rebounds in the Playoffs|
|PostAssists| Assists in the Playoffs|
|PostSteals| Steals in the Playoffs|
|PostBlocks| Blocks in the Playoffs|
|PostTurnovers| Turnovers in the Playoffs|
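|PostPF| Personal Fouls in the Playoffs|
|PostfgAttempted| Field Goals Attempted in the Playoffs|
|PostfgMade| Field Goals Made in the Playoffs|
|PostftAttempted| Free Throws Attempted in the Playoffs|
|PostftMade| Free Throws Made in the Playoffs|
|PostthreeAttempted| Three Point Field Goals Attempted in the Playoffs|
|PostthreeMade| Three Point Field Goals Made in the Playoffs|
|PostDQ| Disqualifications in the Playoffs|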

### Table Sample

In [55]:
players_teams.head(15)

Unnamed: 0,playerID,year,stint,tmID,lgID,GP,GS,minutes,points,oRebounds,...,PostBlocks,PostTurnovers,PostPF,PostfgAttempted,PostfgMade,PostftAttempted,PostftMade,PostthreeAttempted,PostthreeMade,PostDQ
0,abrossv01w,2,0,MIN,WNBA,26,23,846,343,43,...,0,0,0,0,0,0,0,0,0,0
1,abrossv01w,3,0,MIN,WNBA,27,27,805,314,45,...,0,0,0,0,0,0,0,0,0,0
2,abrossv01w,4,0,MIN,WNBA,30,25,792,318,44,...,1,8,8,22,6,8,8,7,3,0
3,abrossv01w,5,0,MIN,WNBA,22,11,462,146,17,...,2,3,7,23,8,4,2,8,2,0
4,abrossv01w,6,0,MIN,WNBA,31,31,777,304,29,...,0,0,0,0,0,0,0,0,0,0
5,abrossv01w,7,0,MIN,WNBA,34,2,724,263,44,...,0,0,0,0,0,0,0,0,0,0
6,abrossv01w,8,0,MIN,WNBA,34,29,843,345,53,...,0,0,0,0,0,0,0,0,0,0
7,abrossv01w,9,0,CON,WNBA,6,0,107,34,3,...,0,3,8,24,11,4,2,5,0,0
8,adamsjo01w,4,0,MIN,WNBA,10,0,96,33,10,...,0,0,0,0,0,0,0,0,0,0
9,aguilel01w,3,0,UTA,WNBA,28,0,141,43,0,...,0,0,1,0,0,0,0,0,0,0


### Table Analysis

In [56]:
print(players_teams.describe())

              year        stint           GP           GS      minutes   
count  1876.000000  1876.000000  1876.000000  1876.000000  1876.000000  \
mean      5.326226     0.113539    24.320896    12.438166   501.269190   
std       2.905475     0.422574    10.460614    13.641697   359.566117   
min       1.000000     0.000000     1.000000     0.000000     0.000000   
25%       3.000000     0.000000    17.000000     0.000000   165.000000   
50%       5.000000     0.000000    29.000000     5.000000   459.000000   
75%       8.000000     0.000000    32.000000    29.000000   826.250000   
max      10.000000     3.000000    34.000000    34.000000  1234.000000   

            points    oRebounds    dRebounds     rebounds      assists  ...   
count  1876.000000  1876.000000  1876.000000  1876.000000  1876.000000  ...  \
mean    176.261727    24.388060    54.334755    78.722814    39.031983  ...   
std     161.983839    23.325974    48.347088    69.210226    40.147037  ...   
min       0.00000
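
The made/attempted pairs in this table (`fgMade`/`fgAttempted`, `ftMade`/`ftAttempted`, `threeMade`/`threeAttempted`) can be combined into shooting percentages. The sketch below is only an illustration: the rows are made-up values, `fg_pct` is a hypothetical column name, and nothing here is taken from the notebook's own feature engineering.

```python
import numpy as np
import pandas as pd

# Illustrative rows only; the numbers are invented for the example.
pt_sample = pd.DataFrame({
    "playerID": ["p1", "p2"],
    "fgAttempted": [100, 0],
    "fgMade": [45, 0],
})

# Players with zero attempts get NaN rather than a division-by-zero result.
attempts = pt_sample["fgAttempted"].replace(0, np.nan)
pt_sample["fg_pct"] = pt_sample["fgMade"] / attempts
print(pt_sample)
```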

## Table: Series-Post

### Table Sample

In [57]:
series_post.head(15)

Unnamed: 0,year,round,series,tmIDWinner,lgIDWinner,tmIDLoser,lgIDLoser,W,L
0,1,FR,A,CLE,WNBA,ORL,WNBA,2,1
1,1,FR,B,NYL,WNBA,WAS,WNBA,2,0
2,1,FR,C,LAS,WNBA,PHO,WNBA,2,0
3,1,FR,D,HOU,WNBA,SAC,WNBA,2,0
4,1,CF,E,HOU,WNBA,LAS,WNBA,2,0
5,1,CF,F,NYL,WNBA,CLE,WNBA,2,1
6,1,F,G,HOU,WNBA,NYL,WNBA,2,0
7,2,FR,A,CHA,WNBA,CLE,WNBA,2,1
8,2,FR,B,NYL,WNBA,MIA,WNBA,2,1
9,2,FR,C,LAS,WNBA,HOU,WNBA,2,0


### Table Analysis

In [58]:
print(series_post.describe())

           year          W          L
count  70.00000  70.000000  70.000000
mean    5.50000   2.071429   0.614286
std     2.89302   0.259399   0.572127
min     1.00000   2.000000   0.000000
25%     3.00000   2.000000   0.000000
50%     5.50000   2.000000   1.000000
75%     8.00000   2.000000   1.000000
max    10.00000   3.000000   2.000000
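
Because every playoff series lists a winner and a loser, this table can be used to recover which teams reached the playoffs in a given year. The following is a minimal sketch, assuming that every team appearing in a series qualified that year; the `playoff` column is a hypothetical label, and the notebook may derive its target differently. The rows are the year 1 series from the sample above.

```python
import pandas as pd

# Year 1 series copied from the Series-Post sample above (subset of columns).
series_sample = pd.DataFrame({
    "year": [1, 1, 1, 1, 1, 1, 1],
    "round": ["FR", "FR", "FR", "FR", "CF", "CF", "F"],
    "tmIDWinner": ["CLE", "NYL", "LAS", "HOU", "HOU", "NYL", "HOU"],
    "tmIDLoser": ["ORL", "WAS", "PHO", "SAC", "LAS", "CLE", "NYL"],
})

# Stack winners and losers into one (year, tmID) list, then remove the
# duplicates created by teams that played more than one series.
winners = series_sample[["year", "tmIDWinner"]].rename(columns={"tmIDWinner": "tmID"})
losers = series_sample[["year", "tmIDLoser"]].rename(columns={"tmIDLoser": "tmID"})
playoff_teams = (
    pd.concat([winners, losers])
    .drop_duplicates()
    .sort_values(["year", "tmID"])
    .reset_index(drop=True)
)
playoff_teams["playoff"] = 1  # hypothetical label column
print(playoff_teams)
```

A table like this could then be joined to per-team, per-year data on `year` and `tmID`, with teams missing from it treated as not having qualified.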
