NBA data insights

Task description

Create a Python 3.* program that parses the NBA statistics provided in the attached files
Dump the statistics into a MySQL database in a normalized format
Create a user facing functionality to retrieve the following data points:
1. The best player in terms of productivity for each week of the selected season (each point/rebound/assist counts the same)
2. Prediction of a match result between two teams (the prediction model is up to you to create; the more interesting, the better)
The program can be web facing (Flask) or command line only
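As an illustration of data point 1, here is a minimal sketch of the weekly productivity ranking, where productivity is points + rebounds + assists with equal weight. The row keys (`player`, `game_date`, `pts`, `reb`, `ast`), the function name, and the use of ISO calendar weeks are assumptions made for this example; they are not taken from the project's schema.

```python
from collections import defaultdict
from datetime import date


def best_player_per_week(rows):
    """Return {(iso_year, iso_week): (player, productivity)} for each week.

    Each row is a dict with hypothetical keys:
    player, game_date (YYYY-MM-DD), pts, reb, ast.
    """
    # (year, week, player) -> summed points + rebounds + assists
    totals = defaultdict(int)
    for row in rows:
        iso = date.fromisoformat(row["game_date"]).isocalendar()
        year, week = iso[0], iso[1]
        totals[(year, week, row["player"])] += row["pts"] + row["reb"] + row["ast"]

    # Keep the highest-scoring player per week (first seen wins on ties).
    best = {}
    for (year, week, player), score in totals.items():
        current = best.get((year, week))
        if current is None or score > current[1]:
            best[(year, week)] = (player, score)
    return best


# Made-up sample rows, only to show the call.
sample = [
    {"player": "A", "game_date": "2003-12-29", "pts": 20, "reb": 5, "ast": 5},
    {"player": "B", "game_date": "2003-12-30", "pts": 25, "reb": 10, "ast": 2},
    {"player": "A", "game_date": "2004-01-02", "pts": 10, "reb": 2, "ast": 1},
    {"player": "B", "game_date": "2004-01-06", "pts": 8, "reb": 3, "ast": 4},
]
weekly_best = best_player_per_week(sample)
```

In the real project the same aggregation would more likely be a SQL `GROUP BY` over the normalized tables; the sketch only fixes the definition of productivity.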

Project Skeleton

As a first approach, our ML models will be consumed through a REST API or through the CLI. Based on this idea, the file structure is as follows:

├─ requirements.txt   <- Python library dependencies
├─ README.md          <- The top-level README for this project.
├─ makefile           <- Shortcuts
├─ src                <- Implemented python modules
├─ models             <- AI generated models
├─ eda                <- Generated notebooks for exploratory data analysis
└─ data               <- Used data

How to replicate the environment?

Python Version: Python 3.8.*
Environment:

Replicate the Python environment with requirements.txt

# using pip
$ pip install -r requirements.txt

# using conda
$ conda create --name <env_name> --file requirements.txt

Export environment variables

export MYSQL_USER=?
export MYSQL_PASSWORD=?
export MYSQL_ROOT_PASSWORD=?
export MYSQL_DATABASE=?
export MYSQL_PORT=?
export MYSQL_HOST=?

NOTE: These variables are defined in database.conf file
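For illustration, the exported variables could be read from Python and checked before connecting. This is a minimal sketch: the function name `build_mysql_url` and the generic `mysql://` URL form are assumptions for this example, and the project's own connection code may differ.

```python
import os
from urllib.parse import quote_plus

# Variables the sketch needs; MYSQL_ROOT_PASSWORD is only used by the container.
REQUIRED = ("MYSQL_USER", "MYSQL_PASSWORD", "MYSQL_DATABASE", "MYSQL_PORT", "MYSQL_HOST")


def build_mysql_url(env=None):
    """Build a connection URL from the MYSQL_* variables (hypothetical helper)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    # Escape characters such as '@' that would break the URL.
    user = quote_plus(env["MYSQL_USER"])
    password = quote_plus(env["MYSQL_PASSWORD"])
    host = env["MYSQL_HOST"]
    port = int(env["MYSQL_PORT"])
    database = env["MYSQL_DATABASE"]
    return f"mysql://{user}:{password}@{host}:{port}/{database}"
```

Failing early with the list of missing names makes a forgotten `export` easier to spot than a connection error later on.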

Database

Copy the database.conf file to src/mysql_db folder

Create a mysql container

$ docker-compose --file src/mysql_db/docker-compose.yml up  --build -d

Dump data to Database

$ cd src/server
$ python populate_database.py

MAKE shortcuts

    $  make start-db
    $  make drop-db

Server

Run flask server application

$ export FLASK_APP=src/server/server.py
$ flask run

Postman collection with examples: https://www.getpostman.com/collections/5d81d74ebf90f6a7649b

MAKE shortcuts
```
    $  make run-server
```

NOTE: The contents of the models and data folders, and the database.conf file, are available on request: rocio.x.linares95@gmail.com.

Prediction Models

The goal of this model is to predict whether the home team of a game is going to win:

```
INPUT:{
        "GAME_ID": [<autogenerated_str>],     #  "20400425"
        "GAME_DATE_EST": [<date>],            #  "2003-12-30"
        "TEAM_ID_home": [<team_id>],          #  1610612759
        "TEAM_ID_away": [<team_id>],          #  1610612747
    }
```

```
OUTPUT:{
        "HOME_TEAM_WINS_PREDICTION": <prediction>     #  1 - YES | 0 - NO
    }
```
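As a sketch of how a client might build the input shown above and read the output, the helpers below validate the date and wrap each field in a list, as in the INPUT example. The function names are hypothetical, and the assumption that `HOME_TEAM_WINS_PREDICTION` arrives as a single 0 or 1 in a JSON body is made for illustration only.

```python
import json
from datetime import date


def build_prediction_input(game_id, game_date, home_team_id, away_team_id):
    """Build the INPUT dictionary; every field is a one-element list."""
    date.fromisoformat(game_date)  # raises ValueError unless YYYY-MM-DD
    return {
        "GAME_ID": [str(game_id)],
        "GAME_DATE_EST": [game_date],
        "TEAM_ID_home": [int(home_team_id)],
        "TEAM_ID_away": [int(away_team_id)],
    }


def home_team_wins(response_body):
    """Interpret the OUTPUT body: 1 means the home team is predicted to win."""
    # Assumption: the prediction is a scalar 0 or 1, not a list.
    prediction = json.loads(response_body)["HOME_TEAM_WINS_PREDICTION"]
    return int(prediction) == 1


payload = build_prediction_input("20400425", "2003-12-30", 1610612759, 1610612747)
request_body = json.dumps(payload)
```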

Features

Two sets of features were tested. All features were extracted from ranking and game statistics.

40 features
41 features (extended)

See more details in this notebook: nba_features_extraction.ipynb

Experiment #1. Sklearn Classifiers Benchmarking

Sklearn tested classifiers:

Naive Bayes - Bernoulli
Nearest Neighbors
Decision Tree
Random Forest
Neural Net
AdaBoost
Stochastic Gradient Descent - log
Stochastic Gradient Descent - modified_huber

Classifiers benchmark:

See more details in these notebooks: nba_sklearn_model.ipynb, nba_sklearn_model_extended.ipynb
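The benchmarking idea can be sketched as a loop that fits each classifier on the same training split and ranks them by hold-out accuracy. This is not the notebooks' code: the `benchmark` function, the split parameters, and the `MajorityClassifier` baseline are assumptions for this example. Any object with `fit` and `predict` methods can be passed in, which includes scikit-learn estimators.

```python
import random


class MajorityClassifier:
    """Baseline that always predicts the most frequent training label."""

    def fit(self, X, y):
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def benchmark(classifiers, X, y, test_fraction=0.25, seed=0):
    """Return {name: accuracy} on a shared hold-out split, best first."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_test = max(1, int(len(X) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]

    scores = {}
    for name, clf in classifiers.items():
        clf.fit(X_train, y_train)
        predicted = clf.predict(X_test)
        correct = sum(int(p == t) for p, t in zip(predicted, y_test))
        scores[name] = correct / len(y_test)
    return dict(sorted(scores.items(), key=lambda item: item[1], reverse=True))
```

Using one split for every classifier keeps the comparison fair; a baseline such as the majority predictor gives a floor that the real models should beat.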

Experiment #2. Pytorch Classifier Model

See more details in these notebooks: nba_pytorch_model.ipynb, nba_pytorch_model_extended.ipynb
