# Data Mining Project - WNBA Playoffs Prediction - G24

## Business Understanding

#### Our data
Basketball tournaments are usually split into two parts. First, all teams play each other, aiming to win as many games as possible. Then, at the end of this regular season, a predetermined number of teams with the most wins qualify for the playoffs, where they play a series of knockout matches for the trophy.

Data on players, teams, coaches, games and several other metrics were gathered over 10 seasons and arranged in this dataset. The goal is to use this data to predict which teams will qualify for the playoffs in the next season.



#### Competition Format
The 12 teams in the WNBA are split into an Eastern Conference and a Western Conference. WNBA fixtures begin with preseason games in May before each team plays 20 home games and 20 road games during the regular season.

The aim for every team is to qualify for the Playoffs, which begin in September each year.

The eight WNBA teams with the best regular-season records qualify for the Playoffs, regardless of conference. Higher seeds match up with lower seeds, so the top seed faces the eighth seed, the second seed faces the seventh seed, and so on.
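
As a rough illustration of the seeding rule described above, the following sketch ranks teams by regular-season wins and pairs the top eight. The team names and win totals are made up for illustration, `playoff_matchups` is a hypothetical helper, and real tie-breaking rules are ignored.

```python
def playoff_matchups(records, n_qualifiers=8):
    """Return first-round pairings (higher seed, lower seed) from a {team: wins} mapping."""
    # Sort by wins, descending; ties are broken alphabetically (an arbitrary simplification).
    ranked = sorted(records, key=lambda team: (-records[team], team))
    seeds = ranked[:n_qualifiers]
    # Seed 1 faces seed 8, seed 2 faces seed 7, and so on.
    return [(seeds[i], seeds[-(i + 1)]) for i in range(n_qualifiers // 2)]


# Hypothetical win totals for 12 teams (not taken from the dataset).
example_records = {
    "A": 30, "B": 28, "C": 27, "D": 25, "E": 24, "F": 22,
    "G": 20, "H": 19, "I": 15, "J": 12, "K": 10, "L": 8,
}
matchups = playoff_matchups(example_records)
print(matchups)
```

The same ranking step, applied to each season's actual records, would also give the binary "qualified or not" label that the prediction task targets.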

The first round consists of best-of-three series. The semifinals and finals are both best-of-five, meaning a team needs three wins to claim the series.
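
The number of wins that decides a series follows directly from its length. A minimal sketch of that arithmetic, with `wins_needed` as a hypothetical helper name:

```python
def wins_needed(best_of):
    """Number of wins that clinches a best-of-N series (N must be odd)."""
    if best_of < 1 or best_of % 2 == 0:
        raise ValueError("best_of must be a positive odd number")
    return best_of // 2 + 1


for round_name, best_of in [("first round", 3), ("semifinals", 5), ("finals", 5)]:
    print(f"{round_name}: best-of-{best_of}, {wins_needed(best_of)} wins to advance")
```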

## Database Connection

We host our database on a free service. The database engine is MySQL.

In [1]:
# DB Credentials
import json

with open("config.json") as config_file:
    config = json.load(config_file)

host = config["db_host"]
user = config["db_user"]
password = config["db_password"]
database = config["db_database"]
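
The cell above assumes that `config.json` contains the four keys it reads. Below is a minimal sketch of a loader that fails early with a clear message when a key is missing. The `load_db_config` helper and the `REQUIRED_KEYS` constant are hypothetical and not part of the original notebook.

```python
import json

# Keys read by the credentials cell of this notebook.
REQUIRED_KEYS = ("db_host", "db_user", "db_password", "db_database")


def load_db_config(path="config.json"):
    """Read the JSON config and check that every required key is present."""
    with open(path) as config_file:
        config = json.load(config_file)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config file is missing keys: {', '.join(missing)}")
    return config
```

Keeping the credentials in a separate file, as the notebook does, also makes it easy to exclude them from version control.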

In [2]:
import mysql.connector

connection = mysql.connector.connect(
    host=host,
    user=user,
    password=password,
    database=database
)

cursor = connection.cursor()

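def execute(query):
    # Run a write statement (INSERT/UPDATE/DELETE) and commit it.
    # Such statements produce no result set, so return the affected row count
    # instead of calling fetchall().
    cursor.execute(query)
    connection.commit()
    return cursor.rowcount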

def fetch(query):
    cursor.execute(query)
    return cursor.fetchall()

SELECT = "SELECT * FROM " # + table_name
INSERT = "INSERT INTO " # + table_name + " VALUES " + values
UPDATE = "UPDATE " # + table_name + " SET " + column_name + " = " + value
DELETE = "DELETE FROM " # + table_name + " WHERE " + column_name + " = " + value
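
Because the templates above are completed by string concatenation, the table name ends up directly in the SQL text. One possible safeguard, sketched here as an assumption rather than as part of the original notebook, is to accept only known table names. `build_select` and `KNOWN_TABLES` are hypothetical names.

```python
# Table names of the relations loaded in this notebook.
KNOWN_TABLES = {
    "awards_players", "coaches", "players", "players_teams",
    "series_post", "teams", "teams_post",
}


def build_select(table_name):
    """Return a SELECT * statement, refusing table names outside the known set."""
    if table_name not in KNOWN_TABLES:
        raise ValueError(f"unknown table: {table_name!r}")
    return f"SELECT * FROM {table_name}"


print(build_select("coaches"))
```

For values (as opposed to identifiers), `mysql.connector` cursors accept `%s` placeholders in the query, with the values passed as a tuple in the second argument to `cursor.execute`, which avoids building them into the string at all.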

The data about the players, teams and coaches consists of the following relations:

    awards_players (96 objects) - each record describes an award or prize received by a player across the 10 seasons,
    coaches (163 objects) - each record describes a coach who managed a team during the time period,
    players (894 objects) - each record contains the details of one player,
    players_teams (1877 objects) - each record describes the performance of a player for a team they played for,
    series_post (71 objects) - each record describes the result of a series,
    teams (143 objects) - each record describes the performance of a team in a season,
    teams_post (81 objects) - each record describes the results of a team in the post-season.


In [3]:
awards_players = fetch(SELECT + "awards_players")  # awards and prizes received by players across 10 seasons
coaches = fetch(SELECT + "coaches")  # coaches who managed the teams during the time period
players = fetch(SELECT + "players")  # details of all players
players_teams = fetch(SELECT + "players_teams")  # performance of each player for each team they played for
series_post = fetch(SELECT + "series_post")  # series results
teams = fetch(SELECT + "teams")  # performance of the teams in each season
teams_post = fetch(SELECT + "teams_post")  # results of each team in the post-season
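
Since the relation descriptions state a row count for each table, a quick sanity check after loading can confirm that nothing was lost on the way to the database. The sketch below uses stand-in data so that it runs without a connection. `count_mismatches` and `EXPECTED_COUNTS` are hypothetical names.

```python
# Row counts as listed in the relation descriptions of this notebook.
EXPECTED_COUNTS = {
    "awards_players": 96, "coaches": 163, "players": 894, "players_teams": 1877,
    "series_post": 71, "teams": 143, "teams_post": 81,
}


def count_mismatches(tables, expected=EXPECTED_COUNTS):
    """Return {name: (expected, actual)} for every table whose row count differs."""
    return {
        name: (expected[name], len(rows))
        for name, rows in tables.items()
        if name in expected and len(rows) != expected[name]
    }


# Stand-in data: one table with the expected size, one that is a row short.
demo_tables = {"series_post": [()] * 71, "teams_post": [()] * 80}
print(count_mismatches(demo_tables))
```

In the notebook itself, the dictionary passed in would map each table name to the list returned by `fetch`.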

## Data Understanding

In [None]:
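# Column metadata is available via cursor.description right after a query;
# `coaches` itself is a plain list of tuples and has no .description attribute.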

for coach in coaches:
    print(coach)

('adamsmi01w', 5, 'WAS', 'WNBA', 0, 17, 17, 1, 2)
('adubari99w', 1, 'NYL', 'WNBA', 0, 20, 12, 4, 3)
('adubari99w', 2, 'NYL', 'WNBA', 0, 21, 11, 3, 3)
('adubari99w', 3, 'NYL', 'WNBA', 0, 18, 14, 4, 4)
('adubari99w', 4, 'NYL', 'WNBA', 0, 16, 18, 0, 0)
('adubari99w', 5, 'NYL', 'WNBA', 1, 7, 9, 0, 0)
('adubari99w', 6, 'WAS', 'WNBA', 0, 16, 18, 0, 0)
('adubari99w', 7, 'WAS', 'WNBA', 0, 18, 16, 0, 2)
('adubari99w', 8, 'WAS', 'WNBA', 1, 0, 4, 0, 0)
('aglerbr99w', 1, 'MIN', 'WNBA', 0, 15, 17, 0, 0)
('aglerbr99w', 2, 'MIN', 'WNBA', 0, 12, 20, 0, 0)
('aglerbr99w', 3, 'MIN', 'WNBA', 1, 6, 13, 0, 0)
('aglerbr99w', 9, 'SEA', 'WNBA', 0, 22, 12, 1, 2)
('aglerbr99w', 10, 'SEA', 'WNBA', 0, 20, 14, 1, 2)
('allenso99w', 1, 'SAC', 'WNBA', 0, 21, 11, 0, 2)
('allenso99w', 2, 'SAC', 'WNBA', 1, 6, 6, 0, 0)
('bibbyhe01w', 6, 'LAS', 'WNBA', 1, 13, 15, 0, 0)
('boguemu01w', 6, 'CHA', 'WNBA', 2, 3, 7, 0, 0)
('boguemu01w', 7, 'CHA', 'WNBA', 0, 11, 23, 0, 0)
('bouceje01w', 8, 'SAC', 'WNBA', 0, 19, 15, 1, 2)
...
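
The tuples printed above can be hard to read without column names. The sketch below attaches names to a few of the rows shown and computes a regular-season win rate for each. The column names (`coach_id`, `year`, `team_id`, `league_id`, `stint`, `won`, `lost`, `post_wins`, `post_losses`) are assumptions inferred from the values. The source does not list them, so they should be checked against the actual schema.

```python
from collections import namedtuple

# Assumed column order; verify against the real table schema before relying on it.
CoachRow = namedtuple(
    "CoachRow",
    "coach_id year team_id league_id stint won lost post_wins post_losses",
)

# Three rows copied from the printed output.
sample_rows = [
    ("adamsmi01w", 5, "WAS", "WNBA", 0, 17, 17, 1, 2),
    ("adubari99w", 1, "NYL", "WNBA", 0, 20, 12, 4, 3),
    ("adubari99w", 8, "WAS", "WNBA", 1, 0, 4, 0, 0),
]


def win_rate(row):
    """Regular-season win fraction, or None when no games are recorded."""
    games = row.won + row.lost
    return row.won / games if games else None


for row in map(CoachRow._make, sample_rows):
    print(row.coach_id, row.year, row.team_id, win_rate(row))
```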

In [5]:
connection.close()