<div align="center" style="font-size: 24px;">
  <strong>NBA Game Predictions for the 2022-2023 Season</strong>
</div>


### Introduction:
Step into the captivating world of NBA predictions for the thrilling 2022-2023 season! This dataset serves as your passport to a realm where basketball meets data science, offering a comprehensive look at every aspect of the game.


Imagine this dataset as a treasure trove of insights, meticulously compiled to provide a panoramic view of team performances, pre-game predictions, and the nail-biting outcomes that define the NBA season. Within its digital pages lie a myriad of statistical gems waiting to be unearthed.


At its core, this dataset is a testament to the marriage of sports and analytics. Through sophisticated metrics like ELO ratings, which gauge team strength based on historical performance, and CARMELO predictions, which forecast player contributions, you'll gain a deeper understanding of the dynamics at play on the court.


But it's not just about numbers – it's about the stories they tell. Each data point represents a moment of triumph, a missed opportunity, or a game-changing play. Whether you're a seasoned analyst looking to fine-tune your predictive models or a curious enthusiast eager to explore the intricacies of basketball analytics, this dataset offers something for everyone.


So, immerse yourself in the drama of buzzer-beaters, underdog victories, and championship aspirations. Uncover the patterns that define success, dissect the strategies that lead to victory, and marvel at the sheer complexity of the game. With this dataset as your guide, the world of NBA predictions is yours to explore – one stat at a time.

### Loading the Dataset:

1. **Import Pandas**: This line brings in the pandas library and assigns it the alias `pd`, which we'll use to access its functions and features.

2. **Load Dataset**: Here, we use `pd.read_csv(url)` to load the NBA game predictions dataset from the provided URL into a DataFrame named `nba_data`.

3. **Display the DataFrame**: Finally, evaluating `nba_data` on its own displays the DataFrame. Because it has 1,320 rows, the notebook shows a truncated view with the first and last few rows, offering an initial glimpse into the dataset's contents.

In [1]:
# Import the required library.

import pandas as pd

# Load the dataset from GitHub.

url = "https://raw.githubusercontent.com/GangasrinivasKatraji/NBA-Game-Predictions/main/Dataset/nba_elo_latest.csv"

nba_data = pd.read_csv(url)

# Display the dataset (the notebook truncates it to the first and last rows).

nba_data



Unnamed: 0,date,season,neutral,playoff,team1,team2,elo1_pre,elo2_pre,elo_prob1,elo_prob2,...,carm-elo2_post,raptor1_pre,raptor2_pre,raptor_prob1,raptor_prob2,score1,score2,quality,importance,total_rating
0,2022-10-18,2023,0,,BOS,PHI,1657.639749,1582.247327,0.732950,0.267050,...,,1693.243079,1641.876729,0.670612,0.329388,126,117,96,13,55
1,2022-10-18,2023,0,,GSW,LAL,1660.620307,1442.352444,0.862011,0.137989,...,,1615.718147,1472.173711,0.776502,0.223498,123,109,67,20,44
2,2022-10-19,2023,0,,IND,WAS,1399.201934,1440.077372,0.584275,0.415725,...,,1462.352663,1472.018225,0.599510,0.400490,107,114,37,28,33
3,2022-10-19,2023,0,,DET,ORL,1393.525172,1366.089249,0.675590,0.324410,...,,1308.969909,1349.865183,0.563270,0.436730,113,109,3,1,2
4,2022-10-19,2023,0,,ATL,HOU,1535.408152,1351.164973,0.837022,0.162978,...,,1618.256817,1283.328356,0.917651,0.082349,117,107,24,1,13
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1315,2023-06-01,2023,0,f,DEN,MIA,1649.957428,1640.358994,0.652693,0.347307,...,,1706.318434,1648.857733,0.736836,0.263164,104,93,97,100,99
1316,2023-06-04,2023,0,f,DEN,MIA,1656.989505,1633.326917,0.670812,0.329188,...,,1716.477363,1646.757209,0.744086,0.255914,108,111,97,99,98
1317,2023-06-07,2023,0,f,MIA,DEN,1641.650915,1648.665508,0.630711,0.369289,...,,1650.416984,1703.748040,0.522547,0.477453,94,109,97,100,99
1318,2023-06-09,2023,0,f,MIA,DEN,1623.302940,1667.013482,0.580306,0.419694,...,,1647.613213,1716.743939,0.499753,0.500247,95,108,97,100,99
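Displaying `nba_data` on its own gives the truncated view above. To see only the first five rows, `DataFrame.head()` can be used instead. The sketch below is a minimal illustration that runs without network access: it assumes a small inline sample copied from the rows above, with a reduced set of columns chosen only for brevity.

```python
import io

import pandas as pd

# Inline sample copied from the output above (subset of columns, for illustration only).
sample_csv = """date,season,neutral,playoff,team1,team2,elo1_pre,elo2_pre,elo_prob1,score1,score2
2022-10-18,2023,0,,BOS,PHI,1657.639749,1582.247327,0.732950,126,117
2022-10-18,2023,0,,GSW,LAL,1660.620307,1442.352444,0.862011,123,109
2022-10-19,2023,0,,IND,WAS,1399.201934,1440.077372,0.584275,107,114
2022-10-19,2023,0,,DET,ORL,1393.525172,1366.089249,0.675590,113,109
2022-10-19,2023,0,,ATL,HOU,1535.408152,1351.164973,0.837022,117,107
2023-06-01,2023,0,f,DEN,MIA,1649.957428,1640.358994,0.652693,104,93
"""

# parse_dates turns the date column into datetime64 instead of plain strings.
sample = pd.read_csv(io.StringIO(sample_csv), parse_dates=["date"])

# head() returns the first five rows by default.
first_five = sample.head()
print(first_five)
```

Parsing `date` at load time is optional, but it makes later filtering or grouping by date simpler than working with the `object` dtype.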


### Data Summary:

So, this dataset gives you the lowdown on NBA games. You've got stuff like when the game went down, which season it was, and whether it was on neutral turf or part of the playoffs. They've also listed the home and away teams, their ratings before and after the game, and even the probability of each team winning based on those ratings.

Three rating systems appear in the columns: **ELO**, **CARMELO**, and **RAPTOR**, each used to predict game outcomes (in this file the CARMELO columns are empty, as the null counts further down show). You'll also see the teams' scores, a measure of the game's quality, and how important it was.

Basically, this dataset's a goldmine if you're into dissecting NBA games, figuring out who's likely to win, and diving deep into game stats and significance.

Here is a breakdown of the columns included in the dataset:

- **Date**: Indicates the date when the game was played.

- **Season**: Denotes the NBA season year during which the game occurred.

- **Neutral**: A binary indicator (0 or 1) signifying whether the game was played on neutral ground.

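- **Playoff**: Indicates whether the game was a playoff match. It is blank (null) for regular-season games and holds a letter code for playoff games, for example `f` in the final rows shown above.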

- **Team1**: An abbreviation representing the home team.

- **Team2**: An abbreviation representing the away team.

- **ELO1_pre**: The home team's ELO rating before the game.

- **ELO2_pre**: The away team's ELO rating before the game.

- **ELO_prob1**: The probability of the home team winning based on ELO ratings.

- **ELO_prob2**: The probability of the away team winning based on ELO ratings.

- **ELO1_post**: The home team's ELO rating after the game.

- **ELO2_post**: The away team's ELO rating after the game.

- **CARMELO1_pre** (column `carm-elo1_pre`): The home team's CARMELO rating before the game. All six CARMELO columns are entirely null in this file.

- **CARMELO2_pre** (column `carm-elo2_pre`): The away team's CARMELO rating before the game.

- **CARMELO_prob1** (column `carm-elo_prob1`): The probability of the home team winning based on CARMELO ratings.

- **CARMELO_prob2** (column `carm-elo_prob2`): The probability of the away team winning based on CARMELO ratings.

- **CARMELO1_post** (column `carm-elo1_post`): The home team's CARMELO rating after the game.

- **CARMELO2_post** (column `carm-elo2_post`): The away team's CARMELO rating after the game.

- **RAPTOR1_pre**: The home team's RAPTOR rating before the game.

- **RAPTOR2_pre**: The away team's RAPTOR rating before the game.

- **RAPTOR_prob1**: The probability of the home team winning based on RAPTOR ratings.

- **RAPTOR_prob2**: The probability of the away team winning based on RAPTOR ratings.

- **Score1**: The score of the home team.

- **Score2**: The score of the away team.

- **Quality**: A measure of the game's quality derived from pre-game ELO ratings.

- **Importance**: A measure of the game's significance.

- **Total_rating**: A composite rating for the game. In the rows shown above it equals the average of `quality` and `importance`, with halves rounded up (for example, 96 and 13 give 55).
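To make the link between the ELO ratings and `elo_prob1` concrete, here is a minimal sketch of the standard Elo win-probability formula. The 100-point home-court bonus is an assumption: the dataset does not document it, but it reproduces the `elo_prob1` values in the rows shown earlier. The function name `elo_win_probability` is hypothetical, and other games may involve adjustments not captured here.

```python
def elo_win_probability(elo_home, elo_away, home_advantage=100.0):
    """Standard Elo expected score for the home team.

    home_advantage is an assumed bonus in Elo points; pass 0.0 to ignore it.
    """
    diff = elo_home + home_advantage - elo_away
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


# (team1, team2, elo1_pre, elo2_pre, elo_prob1) copied from the first rows of the dataset.
games = [
    ("BOS", "PHI", 1657.639749, 1582.247327, 0.732950),
    ("GSW", "LAL", 1660.620307, 1442.352444, 0.862011),
    ("IND", "WAS", 1399.201934, 1440.077372, 0.584275),
]

for home, away, elo_home, elo_away, listed in games:
    computed = elo_win_probability(elo_home, elo_away)
    print(f"{home} vs {away}: computed={computed:.6f}, listed={listed:.6f}")
```

Under this assumption the away team's probability is simply one minus the home team's, which matches how `elo_prob1` and `elo_prob2` sum to 1 in each row.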

In [2]:
# Import the NBADataAnalyzer class from the Src.datasummary module
from Src.datasummary import NBADataAnalyzer

# Assuming nba_data is your dataset
# Initialize an instance of NBADataAnalyzer with the dataset
analyzer = NBADataAnalyzer(nba_data)
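
The `Src.datasummary` module is not shown here, so its implementation is unknown. For readers who want to follow along without it, the sketch below is a hypothetical stand-in that wraps ordinary pandas calls behind the same method names used in the cells that follow. The class name `SimpleNBADataAnalyzer` and every method body are assumptions, not the project's actual code.

```python
import pandas as pd


class SimpleNBADataAnalyzer:
    """Hypothetical stand-in for NBADataAnalyzer, built on plain pandas calls."""

    def __init__(self, data: pd.DataFrame):
        self.data = data

    def get_info(self):
        # Prints column names, non-null counts, and dtypes.
        self.data.info()

    def get_first_rows(self, n: int = 5) -> pd.DataFrame:
        return self.data.head(n)

    def get_shape(self) -> tuple:
        return self.data.shape

    def get_missing_values(self) -> pd.Series:
        # Number of null entries per column.
        return self.data.isnull().sum()

    def get_descriptive_stats(self) -> pd.DataFrame:
        # count, mean, std, min, quartiles, and max for numeric columns.
        return self.data.describe()


# Tiny demo frame (made-up subset of columns) to show the calls.
demo = pd.DataFrame(
    {"team1": ["BOS", "GSW"], "score1": [126, 123], "playoff": [None, None]}
)
demo_analyzer = SimpleNBADataAnalyzer(demo)
print(demo_analyzer.get_shape())
print(demo_analyzer.get_missing_values())
```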


Getting the Info of the Dataset:

In [3]:
analyzer.get_info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1320 entries, 0 to 1319
Data columns (total 27 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   date            1320 non-null   object 
 1   season          1320 non-null   int64  
 2   neutral         1320 non-null   int64  
 3   playoff         90 non-null     object 
 4   team1           1320 non-null   object 
 5   team2           1320 non-null   object 
 6   elo1_pre        1320 non-null   float64
 7   elo2_pre        1320 non-null   float64
 8   elo_prob1       1320 non-null   float64
 9   elo_prob2       1320 non-null   float64
 10  elo1_post       1320 non-null   float64
 11  elo2_post       1320 non-null   float64
 12  carm-elo1_pre   0 non-null      float64
 13  carm-elo2_pre   0 non-null      float64
 14  carm-elo_prob1  0 non-null      float64
 15  carm-elo_prob2  0 non-null      float64
 16  carm-elo1_post  0 non-null      float64
 17  carm-elo2_post  0 non-null      f

Displaying the first five rows of the Dataset:

In [4]:
analyzer.get_first_rows()

Unnamed: 0,date,season,neutral,playoff,team1,team2,elo1_pre,elo2_pre,elo_prob1,elo_prob2,...,carm-elo2_post,raptor1_pre,raptor2_pre,raptor_prob1,raptor_prob2,score1,score2,quality,importance,total_rating
0,2022-10-18,2023,0,,BOS,PHI,1657.639749,1582.247327,0.73295,0.26705,...,,1693.243079,1641.876729,0.670612,0.329388,126,117,96,13,55
1,2022-10-18,2023,0,,GSW,LAL,1660.620307,1442.352444,0.862011,0.137989,...,,1615.718147,1472.173711,0.776502,0.223498,123,109,67,20,44
2,2022-10-19,2023,0,,IND,WAS,1399.201934,1440.077372,0.584275,0.415725,...,,1462.352663,1472.018225,0.59951,0.40049,107,114,37,28,33
3,2022-10-19,2023,0,,DET,ORL,1393.525172,1366.089249,0.67559,0.32441,...,,1308.969909,1349.865183,0.56327,0.43673,113,109,3,1,2
4,2022-10-19,2023,0,,ATL,HOU,1535.408152,1351.164973,0.837022,0.162978,...,,1618.256817,1283.328356,0.917651,0.082349,117,107,24,1,13


Getting the shape of the Dataset:

In [5]:
analyzer.get_shape()

(1320, 27)

Finding the Null values of the Dataset:

In [6]:
analyzer.get_missing_values()

date                 0
season               0
neutral              0
playoff           1230
team1                0
team2                0
elo1_pre             0
elo2_pre             0
elo_prob1            0
elo_prob2            0
elo1_post            0
elo2_post            0
carm-elo1_pre     1320
carm-elo2_pre     1320
carm-elo_prob1    1320
carm-elo_prob2    1320
carm-elo1_post    1320
carm-elo2_post    1320
raptor1_pre          0
raptor2_pre          0
raptor_prob1         0
raptor_prob2         0
score1               0
score2               0
quality              0
importance           0
total_rating         0
dtype: int64
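
Raw counts are easier to judge as a share of the rows: `playoff` is missing in 1230 of 1320 games (about 93%), and the six `carm-elo` columns are missing in every row. A minimal sketch of that calculation in plain pandas follows. It uses a made-up toy frame rather than the real data, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Toy frame imitating the pattern above: one mostly-empty column, one fully-empty column.
toy = pd.DataFrame(
    {
        "playoff": [np.nan, np.nan, np.nan, "f"],
        "elo1_pre": [1657.6, 1660.6, 1399.2, 1650.0],
        "carm-elo1_pre": [np.nan] * 4,
    }
)

# isna() gives a boolean frame; the column mean is the fraction of missing values.
missing_share = toy.isna().mean()

# Columns with no data at all.
fully_empty = missing_share[missing_share == 1.0].index.tolist()

print(missing_share)
print(fully_empty)
```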

Descriptive statistics (including the STD):

In [7]:
analyzer.get_descriptive_stats()

Unnamed: 0,season,neutral,elo1_pre,elo2_pre,elo_prob1,elo_prob2,elo1_post,elo2_post,carm-elo1_pre,carm-elo2_pre,...,carm-elo2_post,raptor1_pre,raptor2_pre,raptor_prob1,raptor_prob2,score1,score2,quality,importance,total_rating
count,1320.0,1320.0,1320.0,1320.0,1320.0,1320.0,1320.0,1320.0,0.0,0.0,...,0.0,1320.0,1320.0,1320.0,1320.0,1320.0,1320.0,1320.0,1320.0,1320.0
mean,2023.0,0.001515,1511.867655,1511.311315,0.627059,0.372941,1510.664651,1512.514319,,,...,,1503.864076,1499.075101,0.603891,0.396109,115.630303,113.030303,50.511364,32.458333,41.744697
std,0.0,0.03891,88.962661,89.571409,0.151149,0.151149,89.560581,89.306155,,,...,,116.590601,116.514412,0.187301,0.187301,11.991075,12.00192,27.217232,29.408726,24.238657
min,2023.0,0.0,1264.103229,1257.300726,0.181341,0.066293,1257.300726,1271.086891,,,...,,955.234235,945.804075,0.027498,0.017256,80.0,79.0,0.0,0.0,0.0
25%,2023.0,0.0,1461.568345,1460.86653,0.533438,0.263124,1460.990452,1460.476484,,,...,,1445.29216,1432.346974,0.490977,0.257375,108.0,105.0,29.0,9.0,21.0
50%,2023.0,0.0,1524.876508,1523.136739,0.640681,0.359319,1521.917992,1525.368845,,,...,,1523.778627,1522.906299,0.61785,0.38215,116.0,113.0,52.0,24.0,43.5
75%,2023.0,0.0,1576.453207,1577.82554,0.736876,0.466562,1576.753366,1577.66317,,,...,,1585.715871,1579.358239,0.742625,0.509023,124.0,121.0,73.0,49.0,57.0
max,2023.0,1.0,1705.343075,1719.448667,0.933707,0.818659,1705.343075,1719.448667,,,...,,1733.775148,1728.915073,0.982744,0.972502,175.0,176.0,99.0,100.0,100.0
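
One detail in the `std` row: `elo_prob1` and `elo_prob2` report the same value (0.151149). That is what you would expect when the two probabilities sum to 1 for each game, because subtracting a column from a constant does not change its spread. A small sketch, using the first five `elo_prob1` values from the dataset and assuming `elo_prob2 = 1 - elo_prob1`:

```python
import pandas as pd

# First five elo_prob1 values from the dataset.
probs = pd.DataFrame(
    {"elo_prob1": [0.732950, 0.862011, 0.584275, 0.675590, 0.837022]}
)

# Assumption for this sketch: the away probability is the complement of the home one.
probs["elo_prob2"] = 1.0 - probs["elo_prob1"]

# Series.std() uses the sample standard deviation (ddof=1), the same as describe().
std1 = probs["elo_prob1"].std()
std2 = probs["elo_prob2"].std()
print(std1, std2)
```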


Removing the null values of the Dataset:

In [8]:
analyzer.remove_null_values()

In [9]:
analyzer.get_missing_values()

date              0.0
season            0.0
neutral           0.0
playoff           0.0
team1             0.0
team2             0.0
elo1_pre          0.0
elo2_pre          0.0
elo_prob1         0.0
elo_prob2         0.0
elo1_post         0.0
elo2_post         0.0
carm-elo1_pre     0.0
carm-elo2_pre     0.0
carm-elo_prob1    0.0
carm-elo_prob2    0.0
carm-elo1_post    0.0
carm-elo2_post    0.0
raptor1_pre       0.0
raptor2_pre       0.0
raptor_prob1      0.0
raptor_prob2      0.0
score1            0.0
score2            0.0
quality           0.0
importance        0.0
total_rating      0.0
dtype: float64

Now that the null values have been removed, the dataset is ready for exploratory data analysis.
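
One caveat worth checking before moving on: the six `carm-elo` columns are null in all 1320 rows, so if `remove_null_values` is a plain row-wise `dropna()` (an assumption, since its implementation is not shown), every row would be dropped. The sketch below shows that effect on a made-up toy frame, along with one possible alternative: drop the columns that are entirely empty and fill the missing `playoff` values. The `"regular"` label is a hypothetical placeholder, not a value from the dataset.

```python
import numpy as np
import pandas as pd

# Toy frame with the same null pattern as the real data.
toy = pd.DataFrame(
    {
        "playoff": [np.nan, np.nan, "f"],
        "team1": ["BOS", "GSW", "DEN"],
        "score1": [126, 123, 104],
        "carm-elo1_pre": [np.nan, np.nan, np.nan],
    }
)

# Row-wise dropna removes every row, because each row has at least one null.
naive = toy.dropna()

# Alternative: drop only the columns that contain no data at all...
cleaned = toy.dropna(axis=1, how="all").copy()

# ...and label regular-season games instead of discarding them.
cleaned["playoff"] = cleaned["playoff"].fillna("regular")

print(len(naive))
print(cleaned.shape)
print(cleaned.isna().sum().sum())
```

Running `analyzer.get_shape()` after the cleaning step is a quick way to confirm how many rows actually remain.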