# Ranking America's Best Sports Teams (Pt. I)

As a data enthusiast with a deep passion for sports, I’ve always been fascinated by the intersection of analytics and athletic performance. Having followed soccer for the past 15 years, I recently developed a keen interest in U.S. sports. While certain teams—such as the Kansas City Chiefs in the NFL—have dominated their respective leagues in recent years, I wanted to take a broader approach. Rather than focusing on individual franchises, I set out to analyze how states as a whole have performed across the four major professional leagues: the NFL, NBA, MLB, and NHL.

This project is the first step toward identifying the best-performing states, based on how their teams have done across the four major leagues. This part focuses on acquiring and preparing the underlying data.

**The table below shows the average win percentage of all teams within each state (or Canadian province, for some NHL teams), along with the number of conference titles won over the last five years.**

In [19]:
import pandas as pd
from IPython.display import display


performanceByState = pd.read_csv('sports_performance.csv')
performanceByState = performanceByState.sort_values(by='StateWinPCT', ascending=False).reset_index(drop=True)

display(performanceByState)

Unnamed: 0,State,NHL_WinPCT,NHL_Teams,NHL_ConferenceTitles,NBA_WinPCT,NBA_Teams,NBA_ConferenceTitles,MLB_WinPCT,MLB_Teams,MLB_ConferenceTitles,NFL_WinPCT,NFL_Teams,NFL_ConferenceTitles,StateWinPCT,TotalConferenceTitles
0,Wisconsin,,,,63.01,1.0,1.0,55.79,1.0,0.0,64.29,1.0,0.0,61.03,1.0
1,Missouri,53.03,1.0,0.0,,,,47.67,2.0,0.0,78.57,1.0,4.0,59.76,4.0
2,New York,50.44,3.0,0.0,53.41,2.0,0.0,54.24,2.0,1.0,73.49,1.0,0.0,57.89,1.0
3,Alberta,56.8,2.0,1.0,,,,,,,,,,56.8,1.0
4,Manitoba,55.64,1.0,0.0,,,,,,,,,,55.64,0.0
5,Minnesota,58.42,1.0,0.0,52.86,1.0,0.0,50.28,1.0,0.0,58.33,1.0,0.0,54.97,0.0
6,Massachusetts,62.73,1.0,0.0,66.03,1.0,2.0,49.86,1.0,0.0,39.29,1.0,0.0,54.48,2.0
7,Pennsylvania,43.7,2.0,0.0,60.11,1.0,0.0,47.74,2.0,1.0,60.71,2.0,2.0,53.07,3.0
8,Maryland,,,,,,,49.72,1.0,0.0,55.67,2.0,0.0,52.69,0.0
9,Colorado,63.68,1.0,1.0,64.03,1.0,1.0,40.74,1.0,0.0,41.67,1.0,0.0,52.53,2.0


## Data Acquisition Guide:

This section explains how the data for the table above was gathered. The process is illustrated using the NBA as an example, and the same approach can be applied to the other three leagues.

#### Step I: Get the NBA Team and Location Details using `nba_api`

In [21]:
# pip install nba_api

In [22]:
import nba_api
import pandas as pd
import numpy as np
pd.set_option('display.max_columns', None)

In [25]:
from nba_api.stats.static import teams

teams_df = pd.DataFrame(teams.get_teams())
teams_df.head()

Unnamed: 0,id,full_name,abbreviation,nickname,city,state,year_founded
0,1610612737,Atlanta Hawks,ATL,Hawks,Atlanta,Georgia,1949
1,1610612738,Boston Celtics,BOS,Celtics,Boston,Massachusetts,1946
2,1610612739,Cleveland Cavaliers,CLE,Cavaliers,Cleveland,Ohio,1970
3,1610612740,New Orleans Pelicans,NOP,Pelicans,New Orleans,Louisiana,2002
4,1610612741,Chicago Bulls,CHI,Bulls,Chicago,Illinois,1966


The `stats.static.teams` module of `nba_api` provides a `get_teams()` function that returns a list of all NBA teams, along with each team's city and state.

In [27]:
# Keep only the columns we are interested in
teams_df = teams_df[['id', 'full_name', 'nickname', 'city', 'state']]

#### Step II: Get the Win Percentages of all NBA teams across the last 5 years

The `leaguestandings` endpoint returns the league table for a given season, along with several other team statistics.

In [31]:
from nba_api.stats.endpoints import leaguestandings
leaguestandings.LeagueStandings(season = '2024-25').get_data_frames()[0].head()

Unnamed: 0,LeagueID,SeasonID,TeamID,TeamCity,TeamName,Conference,ConferenceRecord,PlayoffRank,ClinchIndicator,Division,DivisionRecord,DivisionRank,WINS,LOSSES,WinPCT,LeagueRank,Record,HOME,ROAD,L10,Last10Home,Last10Road,OT,ThreePTSOrLess,TenPTSOrMore,LongHomeStreak,strLongHomeStreak,LongRoadStreak,strLongRoadStreak,LongWinStreak,LongLossStreak,CurrentHomeStreak,strCurrentHomeStreak,CurrentRoadStreak,strCurrentRoadStreak,CurrentStreak,strCurrentStreak,ConferenceGamesBack,DivisionGamesBack,ClinchedConferenceTitle,ClinchedDivisionTitle,ClinchedPlayoffBirth,EliminatedConference,EliminatedDivision,AheadAtHalf,BehindAtHalf,TiedAtHalf,AheadAtThird,BehindAtThird,TiedAtThird,Score100PTS,OppScore100PTS,OppOver500,LeadInFGPCT,LeadInReb,FewerTurnovers,PointsPG,OppPointsPG,DiffPointsPG,vsEast,vsAtlantic,vsCentral,vsSoutheast,vsWest,vsNorthwest,vsPacific,vsSouthwest,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec,PreAS,PostAS
0,0,22024,1610612739,Cleveland,Cavaliers,East,26-6,1,,Central,8-1,1,40,9,0.816,,40-9,24-3,16-6,6-4,8-2,7-3,0-0,2-3,29-4,10,W 10,7,W 7,15,3,3,W 3,1,W 1,4,W 4,0.0,0.0,0,0,0,0,0,34-3,3-6,3-0,35-0,5-9,0-0,40-8,35-9,19-6,32-0,23-1,23-6,122.6,111.8,10.8,26-6,9-2,8-1,9-3,14-3,5-1,5-0,4-2,10-5,1-0,,,,,,,,5-0,12-3,12-1,40-9,
1,0,22024,1610612760,Oklahoma City,Thunder,West,22-8,1,,Northwest,7-1,1,39,9,0.813,1.0,39-9,21-3,17-6,7-3,9-1,7-3,0-0,0-3,31-1,12,W 12,6,W 6,15,2,2,W 2,-1,L 1,2,W 2,0.0,0.0,0,0,0,0,0,30-3,8-6,1-0,36-1,3-6,0-2,38-8,21-9,22-7,32-3,16-3,32-9,117.2,104.5,12.6,17-1,6-0,4-1,7-0,22-8,7-1,8-2,7-5,10-4,2-0,,,,,,,,4-0,11-4,12-1,39-9,
2,0,22024,1610612738,Boston,Celtics,East,25-9,2,,Atlantic,6-2,1,35,15,0.7,,35-15,16-9,19-6,7-3,5-5,7-3,3-2,6-3,22-4,7,W 7,7,W 7,7,2,1,W 1,3,W 3,3,W 3,5.5,0.0,0,0,0,0,0,27-5,6-10,2-0,30-6,4-6,1-3,35-11,24-15,16-8,23-2,22-4,25-8,117.3,108.5,8.8,25-9,6-2,11-4,8-3,10-6,3-1,3-3,4-2,10-6,1-0,,,,,,,,4-1,12-2,8-6,35-15,
3,0,22024,1610612763,Memphis,Grizzlies,West,19-12,2,,Southwest,8-4,1,34,16,0.68,,34-16,21-5,13-11,9-1,8-2,5-5,0-0,4-3,25-5,8,W 8,2,W 2,6,2,6,W 6,1,W 1,3,W 3,6.0,0.0,0,0,0,0,0,26-4,8-12,0-0,29-4,4-12,1-0,34-16,29-16,13-11,32-3,24-10,17-5,123.5,115.6,7.9,15-4,5-3,5-1,5-0,19-12,7-2,4-6,8-4,9-5,2-0,,,,,,,,3-3,10-4,10-4,34-16,
4,0,22024,1610612745,Houston,Rockets,West,19-10,3,,Southwest,9-2,2,32,17,0.653,2.0,32-17,15-8,17-8,5-5,5-5,7-3,2-1,6-4,16-5,7,W 7,7,W 7,5,3,-1,L 1,-2,L 2,-3,L 3,7.5,1.5,0,0,0,0,0,27-5,5-11,0-1,28-5,4-12,0-0,31-11,25-16,18-12,19-1,26-9,16-9,113.7,108.8,5.0,13-7,4-3,5-2,4-2,19-10,5-4,5-4,9-2,11-4,0-2,,,,,,,,3-2,11-4,7-5,32-17,


The API returns several team statistics for the season, including:
* Team details - Name, Division
* Team's Conference Record and Division Record
* Season Statistics - Wins, Losses, League Rank, Home and Away Record
* Team's Record against each conference and division (East, Atlantic, Central, ...)
* Aggregated Game Statistics:
    * Matches where the team was ahead at halftime or end of the third quarter
    * Matches where the team scored 100+ points
    * Matches where the team had fewer turnovers than the opponent  
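
Several of these fields (for example `ConferenceRecord` and `DivisionRecord`) come back as `"W-L"` strings rather than numbers. The sketch below shows one way such a column could be split into numeric wins and losses. It uses a small hand-built frame with values copied from the output above, and the new column names (`DivisionWins`, `DivisionLosses`, `DivisionWinPCT`) are illustrative assumptions, not part of the API.

```python
import pandas as pd

# Toy frame; the DivisionRecord values are copied from the standings output above.
standings = pd.DataFrame({
    "TeamName": ["Cavaliers", "Thunder", "Celtics"],
    "DivisionRecord": ["8-1", "7-1", "6-2"],
})

# Split each "W-L" string into two integer columns.
split = standings["DivisionRecord"].str.split("-", expand=True).astype(int)
standings["DivisionWins"] = split[0]
standings["DivisionLosses"] = split[1]

# Derive a win percentage from the parsed values (hypothetical column name).
standings["DivisionWinPCT"] = standings["DivisionWins"] / (
    standings["DivisionWins"] + standings["DivisionLosses"]
)
print(standings)
```

The analysis in this notebook only needs `WINS`, `LOSSES`, and `ClinchedConferenceTitle`, so this parsing step is optional.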

In [32]:
# Get the league standings for the last 5 seasons, keeping only the required columns
cols = ['SeasonID', 'TeamID', 'TeamCity', 'TeamName', 'WINS', 'LOSSES', 'PlayoffRank', 'DivisionRecord', 'WinPCT', 'ClinchedConferenceTitle']

# Collect one dataframe per season and concatenate once at the end.
# Concatenating onto an empty dataframe can trigger a FutureWarning in recent pandas versions.
seasons = ['2020-21', '2021-22', '2022-23', '2023-24', '2024-25']
season_frames = []
for season in seasons:
    ls = leaguestandings.LeagueStandings(season=season)
    season_frames.append(ls.get_data_frames()[0][cols])

nba_last5 = pd.concat(season_frames, axis=0)
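
The season labels above are typed by hand. If the window needs to move forward in a later update, they could instead be generated. The helper below is a hypothetical convenience function (`season_labels` is not part of `nba_api`); it only builds strings in the `'2024-25'` style used by the `season` argument above.

```python
def season_labels(last_start_year: int, count: int) -> list[str]:
    """Return `count` labels like '2024-25', ending with the season that starts in `last_start_year`."""
    first_start_year = last_start_year - count + 1
    return [
        f"{year}-{(year + 1) % 100:02d}"  # e.g. 2024 -> '2024-25', 1999 -> '1999-00'
        for year in range(first_start_year, last_start_year + 1)
    ]

seasons = season_labels(2024, 5)
print(seasons)  # ['2020-21', '2021-22', '2022-23', '2023-24', '2024-25']
```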

In [34]:
nba_last5.sample(5)

Unnamed: 0,SeasonID,TeamID,TeamCity,TeamName,WINS,LOSSES,PlayoffRank,DivisionRecord,WinPCT,ClinchedConferenceTitle
27,22023,1610612759,San Antonio,Spurs,22,60,14,3-13,0.268,0
29,22020,1610612766,Charlotte,Hornets,33,39,0,8-4,0.458,0
14,22021,1610612739,Cleveland,Cavaliers,44,38,8,10-6,0.537,0
6,22020,1610612755,Philadelphia,76ers,49,23,0,10-2,0.681,1
3,22020,1610612743,Denver,Nuggets,47,25,0,9-3,0.653,0


In [36]:
nba_last5['GP'] = nba_last5['WINS'] + nba_last5['LOSSES']

In [37]:
# Aggregate the Records at a Team Level
nba_last5_combined = nba_last5.groupby(['TeamName', 'TeamID']).agg(
    {
        'WINS' : 'sum',
        'LOSSES' : 'sum',
        'GP': 'sum',
        'ClinchedConferenceTitle' : 'sum'
    }
)

nba_last5_combined.reset_index(inplace = True)

#Calculate the Win Pct over the last 5 years
nba_last5_combined['WinPCT'] = nba_last5_combined['WINS'] / nba_last5_combined['GP']
nba_last5_combined.head()

Unnamed: 0,TeamName,TeamID,WINS,LOSSES,GP,ClinchedConferenceTitle,WinPCT
0,76ers,1610612755,220,146,366,1,0.601093
1,Bucks,1610612749,230,136,366,1,0.628415
2,Bulls,1610612741,177,191,368,0,0.480978
3,Cavaliers,1610612739,205,162,367,0,0.558583
4,Celtics,1610612738,243,125,368,1,0.660326
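
Note that the win percentage here is pooled: total wins divided by total games played. That is not the same as averaging each season's `WinPCT`, because seasons differ in length (the 2020-21 records in the sample above total 72 games, not 82). The sketch below illustrates the difference with one 49-23 season, as in the sample, and one made-up 41-41 season. The team and the second season are hypothetical.

```python
import pandas as pd

# Two seasons for one hypothetical team: a 72-game season and an 82-game season.
team_seasons = pd.DataFrame({
    "TeamName": ["Example", "Example"],
    "WINS": [49, 41],
    "LOSSES": [23, 41],
})
team_seasons["GP"] = team_seasons["WINS"] + team_seasons["LOSSES"]
team_seasons["WinPCT"] = team_seasons["WINS"] / team_seasons["GP"]

# Pooled: weights every game equally (90 wins / 154 games).
pooled = team_seasons["WINS"].sum() / team_seasons["GP"].sum()

# Mean of seasons: weights every season equally, regardless of length.
mean_of_seasons = team_seasons["WinPCT"].mean()

print(round(pooled, 4), round(mean_of_seasons, 4))  # 0.5844 0.5903
```

Pooling also handles a season that is still in progress more gracefully, since a partial season contributes only the games actually played.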


#### Step III: Combine the Win Percentage of the Team with the Location Details Dataframe

In [42]:
nba_final = pd.merge(left = nba_last5_combined, right = teams_df, left_on = 'TeamID', right_on = 'id')
nba_final = nba_final[['full_name', 'city', 'state', 'WinPCT', 'ClinchedConferenceTitle']]
nba_final.columns = ['Team', 'City', 'State', 'WinPCT', 'ConferenceTitle']

In [44]:
nba_final

Unnamed: 0,Team,City,State,WinPCT,ConferenceTitle
0,Philadelphia 76ers,Philadelphia,Pennsylvania,0.601093,1
1,Milwaukee Bucks,Milwaukee,Wisconsin,0.628415,1
2,Chicago Bulls,Chicago,Illinois,0.480978,0
3,Cleveland Cavaliers,Cleveland,Ohio,0.558583,0
4,Boston Celtics,Boston,Massachusetts,0.660326,1
5,Los Angeles Clippers,Los Angeles,California,0.577657,0
6,Memphis Grizzlies,Memphis,Tennessee,0.559783,0
7,Atlanta Hawks,Atlanta,Georgia,0.5,0
8,Miami Heat,Miami,Florida,0.567123,1
9,Charlotte Hornets,Charlotte,North Carolina,0.372603,0
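
`pd.merge` performs an inner join by default, so any team whose `TeamID` has no match in the location dataframe would be dropped without warning. A quick check along the following lines can confirm that nothing was lost. This is a sketch on toy data: the first two IDs are copied from the outputs above, and the ID `999` is a made-up value added to show what an unmatched row looks like.

```python
import pandas as pd

# Aggregated results; 999 is a made-up ID with no matching location row.
combined = pd.DataFrame({
    "TeamID": [1610612755, 1610612749, 999],
    "WinPCT": [0.601093, 0.628415, 0.5],
})
locations = pd.DataFrame({
    "id": [1610612755, 1610612749],
    "full_name": ["Philadelphia 76ers", "Milwaukee Bucks"],
    "state": ["Pennsylvania", "Wisconsin"],
})

checked = pd.merge(
    combined,
    locations,
    left_on="TeamID",
    right_on="id",
    how="left",             # keep every aggregated team, matched or not
    validate="one_to_one",  # raise if either key column contains duplicates
    indicator=True,         # adds a `_merge` column: 'both' or 'left_only'
)

# Teams that found no location row.
unmatched = checked.loc[checked["_merge"] == "left_only", "TeamID"].tolist()
print(unmatched)  # [999]
```

An empty `unmatched` list means the inner join used above kept every team.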


The same steps were then applied to MLB, NHL, and NFL using their respective Python libraries. Once the data was collected, the win percentages of all teams within each state were averaged per league, and those league-level averages were in turn averaged to give `StateWinPCT` (for example, Wisconsin's 61.03 is the mean of 63.01, 55.79, and 64.29). This produced the final table shown at the top, which ranks states by their performance across the four major leagues over the last five years.
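
The state-level step is not shown in this notebook, so the following is only a sketch of how it might look, not the exact code used. The first part aggregates one league by state using made-up teams and states. The second part combines league-level percentages into a single figure, using Wisconsin's values from the table at the top. The output column names mirror that table.

```python
import pandas as pd

# Part 1: per-state aggregation for one league (teams and states are made up).
nba = pd.DataFrame({
    "Team": ["Team A", "Team B", "Team C"],
    "State": ["State X", "State X", "State Y"],
    "WinPCT": [0.60, 0.50, 0.40],
    "ConferenceTitle": [1, 0, 0],
})
nba_by_state = nba.groupby("State").agg(
    NBA_WinPCT=("WinPCT", "mean"),
    NBA_Teams=("Team", "count"),
    NBA_ConferenceTitles=("ConferenceTitle", "sum"),
)
# Express the win percentage on a 0-100 scale, as in the final table.
nba_by_state["NBA_WinPCT"] = (nba_by_state["NBA_WinPCT"] * 100).round(2)

# Part 2: average the league-level figures for a state.
# Values are Wisconsin's row from the table at the top; it has no NHL team.
league_pcts = pd.DataFrame(
    {
        "NHL_WinPCT": [float("nan")],
        "NBA_WinPCT": [63.01],
        "MLB_WinPCT": [55.79],
        "NFL_WinPCT": [64.29],
    },
    index=["Wisconsin"],
)
# mean(axis=1) skips missing leagues by default.
league_pcts["StateWinPCT"] = league_pcts.mean(axis=1).round(2)

print(nba_by_state)
print(league_pcts)
```

Because each league counts equally in the second step, a state with a single strong team in one league can rank above states with many teams, which is worth keeping in mind when reading the table.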