# Web Scraping the Last 20 Regular Seasons of NBA Player Stats

This small scraping project demonstrates how to scrape the last 20 regular seasons of NBA player stats from the ESPN website and export them to a CSV file for further analysis.

First, we need to import the Beautiful Soup, urllib.request, pandas and csv packages.

In [310]:
import bs4 as bs        
import urllib.request    
import csv 
import pandas as pd

Looking at the ESPN website, I found that https://www.espn.com/nba/teams links to each team's stats page. We can scrape that page to collect the stats URL of every team.

In [311]:
domain_url = 'https://www.espn.com' # Prepended to each team's relative URL to form the URL we want to scrape

the_url = 'https://www.espn.com/nba/teams' # This is the page we are scraping team URLs from

all_team_stat_url = []
req = urllib.request.Request(the_url)
sauce = urllib.request.urlopen(req).read()
soup = bs.BeautifulSoup(sauce, 'lxml')
team_divs = soup.find_all('div', attrs = {'class' : 'pl3'}) # search the page once instead of on every iteration
for team_idx in range(30): # There are 30 NBA teams in total
    stat_href = team_divs[team_idx].find_all('a', attrs = {'class':'AnchorLink'})[1].get('href')
    stat_href_split = stat_href.rsplit('/',1)[0] # dropping the team name after the team abbreviation
    for year in range(2001, 2021): # seasons ending in 2001 through 2020
        season_url = '/season/{}'.format(year)
        team_stat_url = domain_url + stat_href_split + season_url # each team URL that we are going to scrape
        all_team_stat_url.append(team_stat_url)
print('Each team url:\n')    
print(*all_team_stat_url, sep = '\n')


Each team url:

https://www.espn.com/nba/team/stats/_/name/bos/season/2001
https://www.espn.com/nba/team/stats/_/name/bos/season/2002
https://www.espn.com/nba/team/stats/_/name/bos/season/2003
https://www.espn.com/nba/team/stats/_/name/bos/season/2004
https://www.espn.com/nba/team/stats/_/name/bos/season/2005
https://www.espn.com/nba/team/stats/_/name/bos/season/2006
https://www.espn.com/nba/team/stats/_/name/bos/season/2007
https://www.espn.com/nba/team/stats/_/name/bos/season/2008
https://www.espn.com/nba/team/stats/_/name/bos/season/2009
https://www.espn.com/nba/team/stats/_/name/bos/season/2010
https://www.espn.com/nba/team/stats/_/name/bos/season/2011
https://www.espn.com/nba/team/stats/_/name/bos/season/2012
https://www.espn.com/nba/team/stats/_/name/bos/season/2013
https://www.espn.com/nba/team/stats/_/name/bos/season/2014
https://www.espn.com/nba/team/stats/_/name/bos/season/2015
https://www.espn.com/nba/team/stats/_/name/bos/season/2016
https://www.espn.com/nba/team/stats/_/na
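
The URL construction can also be written as a small pure function, which makes it checkable without touching the network. This is only a sketch: `build_season_urls` and its parameters are hypothetical names, and the sample href (in particular the `boston-celtics` segment) is an assumed example of what the teams page links to, inferred from the URL pattern printed above.

```python
def build_season_urls(stat_href, first_year=2001, last_year=2020,
                      domain_url='https://www.espn.com'):
    """Return one stats URL per season for a single team."""
    # Drop the trailing team-name segment, keeping the team abbreviation.
    base = stat_href.rsplit('/', 1)[0]
    return ['{}{}/season/{}'.format(domain_url, base, year)
            for year in range(first_year, last_year + 1)]


# Assumed example href; the real value comes from the scraped teams page.
urls = build_season_urls('/nba/team/stats/_/name/bos/boston-celtics')
print(urls[0])
print(len(urls))
```

Calling it once per team and extending `all_team_stat_url` with the result would give the same 600 URLs as the nested loop.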

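Both the cell above and the scraping loop that follows download pages with urllib.request.urlopen, and one failed request out of hundreds would stop the whole run. A retrying fetch helper might look like the sketch below. `fetch_html`, its parameters, the 30-second timeout and the User-Agent value are all assumptions and not part of the original notebook. With it, `sauce = fetch_html(url)` would stand in for the two request lines.

```python
import time
import urllib.error
import urllib.request


def fetch_html(url, opener=urllib.request.urlopen, retries=3, pause=1.0):
    """Return the raw page bytes, trying up to `retries` times on URLError."""
    # The User-Agent value is an assumption; some sites reject the default one.
    req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    for attempt in range(1, retries + 1):
        try:
            with opener(req, timeout=30) as response:
                return response.read()
        except urllib.error.URLError:
            if attempt == retries:
                raise  # give up after the last attempt
            time.sleep(pause)  # wait before trying again
```
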
Once we have all the team URLs, we scrape the player stats for each team in each season and append them to the scraped_data dictionary.

In [312]:
stats_we_are_scraping = ["Season","Team", "Name", "GP", "GS","MIN", "PTS","OR", "DR", 
                          "REB", "AST", "STL", "BLK", "TO", "PF", "AST/TO", "PER"]  # declaring all the stats we are scraping

scraped_data = {key : [] for key in stats_we_are_scraping} # one empty list per field, collected in a single dictionary

for page_num, url in enumerate(all_team_stat_url, start = 1): 
    req = urllib.request.Request(url)
    sauce = urllib.request.urlopen(req).read()
    soup = bs.BeautifulSoup(sauce, 'lxml')
    
    season = soup.find('h1', attrs = {'class':'headline headline__h1 dib'}).text.split('Stats ')[1]
    team_name = soup.find('span', attrs = {'class':'db pr3 nowrap'}).text + ' ' + soup.find('span', attrs = {'class':'db fw-bold'}).text
    name_list = soup.find_all('a', attrs = {'class': 'AnchorLink', 'tabindex':'0', 'data-player-uid': True})
    print("Scraping:", team_name, "Players {} Regular Season stats -- ".format(season), page_num, "/", len(all_team_stat_url))
    
    for player_idx in range(len(name_list) // 2): # only the first half of the matched links is used for player names
        scraped_data['Season'].append(season + ' Season')
        scraped_data['Team'].append(team_name)
        scraped_data['Name'].append(name_list[player_idx].text)
        all_stats = soup.find_all('tr', attrs = {'class':'Table__TR Table__TR--sm Table__even','data-idx' : str(player_idx)})[1].find_all('span')
        scraped_data['GP'].append(int(all_stats[0].text)) #changing str to int to make it sortable
        scraped_data['GS'].append(int(all_stats[1].text))
        scraped_data['MIN'].append(float(all_stats[2].text))
        scraped_data['PTS'].append(float(all_stats[3].text))
        scraped_data['OR'].append(float(all_stats[4].text))
        scraped_data['DR'].append(float(all_stats[5].text))
        scraped_data['REB'].append(float(all_stats[6].text))
        scraped_data['AST'].append(float(all_stats[7].text))
        scraped_data['STL'].append(float(all_stats[8].text))
        scraped_data['BLK'].append(float(all_stats[9].text))
        scraped_data['TO'].append(float(all_stats[10].text))
        scraped_data['PF'].append(float(all_stats[11].text))
        scraped_data['AST/TO'].append(float(all_stats[12].text))
        scraped_data['PER'].append(float(all_stats[13].text))
print('Done!')



Scraping: Boston Celtics Players 2000-01 Regular Season stats --  1 / 600
Scraping: Boston Celtics Players 2001-02 Regular Season stats --  2 / 600
Scraping: Boston Celtics Players 2002-03 Regular Season stats --  3 / 600
Scraping: Boston Celtics Players 2003-04 Regular Season stats --  4 / 600
Scraping: Boston Celtics Players 2004-05 Regular Season stats --  5 / 600
Scraping: Boston Celtics Players 2005-06 Regular Season stats --  6 / 600
Scraping: Boston Celtics Players 2006-07 Regular Season stats --  7 / 600
Scraping: Boston Celtics Players 2007-08 Regular Season stats --  8 / 600
Scraping: Boston Celtics Players 2008-09 Regular Season stats --  9 / 600
Scraping: Boston Celtics Players 2009-10 Regular Season stats --  10 / 600
Scraping: Boston Celtics Players 2010-11 Regular Season stats --  11 / 600
Scraping: Boston Celtics Players 2011-12 Regular Season stats --  12 / 600
Scraping: Boston Celtics Players 2012-13 Regular Season stats --  13 / 600
Scraping: Boston Celtics Players 2

Scraping: Chicago Bulls Players 2009-10 Regular Season stats --  110 / 600
Scraping: Chicago Bulls Players 2010-11 Regular Season stats --  111 / 600
Scraping: Chicago Bulls Players 2011-12 Regular Season stats --  112 / 600
Scraping: Chicago Bulls Players 2012-13 Regular Season stats --  113 / 600
Scraping: Chicago Bulls Players 2013-14 Regular Season stats --  114 / 600
Scraping: Chicago Bulls Players 2014-15 Regular Season stats --  115 / 600
Scraping: Chicago Bulls Players 2015-16 Regular Season stats --  116 / 600
Scraping: Chicago Bulls Players 2016-17 Regular Season stats --  117 / 600
Scraping: Chicago Bulls Players 2017-18 Regular Season stats --  118 / 600
Scraping: Chicago Bulls Players 2018-19 Regular Season stats --  119 / 600
Scraping: Chicago Bulls Players 2019-20 Regular Season stats --  120 / 600
Scraping: Cleveland Cavaliers Players 2000-01 Regular Season stats --  121 / 600
Scraping: Cleveland Cavaliers Players 2001-02 Regular Season stats --  122 / 600
Scraping: Cle

Scraping: Denver Nuggets Players 2016-17 Regular Season stats --  217 / 600
Scraping: Denver Nuggets Players 2017-18 Regular Season stats --  218 / 600
Scraping: Denver Nuggets Players 2018-19 Regular Season stats --  219 / 600
Scraping: Denver Nuggets Players 2019-20 Regular Season stats --  220 / 600
Scraping: Minnesota Timberwolves Players 2000-01 Regular Season stats --  221 / 600
Scraping: Minnesota Timberwolves Players 2001-02 Regular Season stats --  222 / 600
Scraping: Minnesota Timberwolves Players 2002-03 Regular Season stats --  223 / 600
Scraping: Minnesota Timberwolves Players 2003-04 Regular Season stats --  224 / 600
Scraping: Minnesota Timberwolves Players 2004-05 Regular Season stats --  225 / 600
Scraping: Minnesota Timberwolves Players 2005-06 Regular Season stats --  226 / 600
Scraping: Minnesota Timberwolves Players 2006-07 Regular Season stats --  227 / 600
Scraping: Minnesota Timberwolves Players 2007-08 Regular Season stats --  228 / 600
Scraping: Minnesota Timb

Scraping: Golden State Warriors Players 2018-19 Regular Season stats --  319 / 600
Scraping: Golden State Warriors Players 2019-20 Regular Season stats --  320 / 600
Scraping: LA Clippers Players 2000-01 Regular Season stats --  321 / 600
Scraping: LA Clippers Players 2001-02 Regular Season stats --  322 / 600
Scraping: LA Clippers Players 2002-03 Regular Season stats --  323 / 600
Scraping: LA Clippers Players 2003-04 Regular Season stats --  324 / 600
Scraping: LA Clippers Players 2004-05 Regular Season stats --  325 / 600
Scraping: LA Clippers Players 2005-06 Regular Season stats --  326 / 600
Scraping: LA Clippers Players 2006-07 Regular Season stats --  327 / 600
Scraping: LA Clippers Players 2007-08 Regular Season stats --  328 / 600
Scraping: LA Clippers Players 2008-09 Regular Season stats --  329 / 600
Scraping: LA Clippers Players 2009-10 Regular Season stats --  330 / 600
Scraping: LA Clippers Players 2010-11 Regular Season stats --  331 / 600
Scraping: LA Clippers Players 2

Scraping: Charlotte Hornets Players 2006-07 Regular Season stats --  427 / 600
Scraping: Charlotte Hornets Players 2007-08 Regular Season stats --  428 / 600
Scraping: Charlotte Hornets Players 2008-09 Regular Season stats --  429 / 600
Scraping: Charlotte Hornets Players 2009-10 Regular Season stats --  430 / 600
Scraping: Charlotte Hornets Players 2010-11 Regular Season stats --  431 / 600
Scraping: Charlotte Hornets Players 2011-12 Regular Season stats --  432 / 600
Scraping: Charlotte Hornets Players 2012-13 Regular Season stats --  433 / 600
Scraping: Charlotte Hornets Players 2013-14 Regular Season stats --  434 / 600
Scraping: Charlotte Hornets Players 2014-15 Regular Season stats --  435 / 600
Scraping: Charlotte Hornets Players 2015-16 Regular Season stats --  436 / 600
Scraping: Charlotte Hornets Players 2016-17 Regular Season stats --  437 / 600
Scraping: Charlotte Hornets Players 2017-18 Regular Season stats --  438 / 600
Scraping: Charlotte Hornets Players 2018-19 Regular 

Scraping: Houston Rockets Players 2013-14 Regular Season stats --  534 / 600
Scraping: Houston Rockets Players 2014-15 Regular Season stats --  535 / 600
Scraping: Houston Rockets Players 2015-16 Regular Season stats --  536 / 600
Scraping: Houston Rockets Players 2016-17 Regular Season stats --  537 / 600
Scraping: Houston Rockets Players 2017-18 Regular Season stats --  538 / 600
Scraping: Houston Rockets Players 2018-19 Regular Season stats --  539 / 600
Scraping: Houston Rockets Players 2019-20 Regular Season stats --  540 / 600
Scraping: Memphis Grizzlies Players 2000-01 Regular Season stats --  541 / 600
Scraping: Memphis Grizzlies Players 2001-02 Regular Season stats --  542 / 600
Scraping: Memphis Grizzlies Players 2002-03 Regular Season stats --  543 / 600
Scraping: Memphis Grizzlies Players 2003-04 Regular Season stats --  544 / 600
Scraping: Memphis Grizzlies Players 2004-05 Regular Season stats --  545 / 600
Scraping: Memphis Grizzlies Players 2005-06 Regular Season stats -
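
The row lookup in the loop takes the second match (`[1]`) for each `data-idx`. A plausible reading is that the page holds one table of names and one table of numbers whose rows share the same class and index. The sketch below reproduces that idea on a tiny hand-written snippet. The markup is a hypothetical simplification and not ESPN's real HTML, `stats_for` is an illustrative helper, and `html.parser` is used instead of `lxml` only so that nothing extra needs to be installed.

```python
import bs4 as bs

ROW = 'Table__TR Table__TR--sm Table__even'

# Hypothetical, heavily simplified markup: a names table and a stats table
# whose rows share the same class and data-idx values.
html = '''
<table>
  <tr class="{row}" data-idx="0"><td><a>Paul Pierce</a></td></tr>
  <tr class="{row}" data-idx="1"><td><a>Antoine Walker</a></td></tr>
</table>
<table>
  <tr class="{row}" data-idx="0"><td><span>82</span></td><td><span>82</span></td><td><span>38.0</span></td></tr>
  <tr class="{row}" data-idx="1"><td><span>81</span></td><td><span>81</span></td><td><span>41.9</span></td></tr>
</table>
'''.format(row=ROW)

soup = bs.BeautifulSoup(html, 'html.parser')


def stats_for(idx):
    """Return the stat strings for the player at position idx."""
    matches = soup.find_all('tr', attrs={'class': ROW, 'data-idx': str(idx)})
    # matches[0] is the row in the names table, matches[1] the row in the stats table.
    return [span.text for span in matches[1].find_all('span')]


gp, gs, minutes = stats_for(1)
print(int(gp), int(gs), float(minutes))
```

If the real page is laid out differently, the `[1]` index is the part to revisit first.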

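The log shows that the page for `/season/2001` is labelled 2000-01 and the one for `/season/2020` is labelled 2019-20, so the year in the URL is the year the season ends. If the headline text ever changes and the `split('Stats ')` step breaks, the label can be derived from the URL year instead. `season_label` is a hypothetical helper, sketched here under that assumption.

```python
def season_label(end_year):
    """Map the year used in the URL to a label such as '2000-01'."""
    return '{}-{:02d}'.format(end_year - 1, end_year % 100)


print(season_label(2001))
print(season_label(2020))
```
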
Once all the data is in the scraped_data dictionary, we create a DataFrame with pandas and export it to a CSV file.

In [316]:
df = pd.DataFrame(scraped_data)
df.to_csv('01-20 Regular Season NBA Players Stats.csv') #export stats to a csv file
df.index += 1 #setting index starting from 1 
df.head(30)

Unnamed: 0,Season,Team,Name,GP,GS,MIN,PTS,OR,DR,REB,AST,STL,BLK,TO,PF,AST/TO,PER
1,2000-01 Season,Boston Celtics,Paul Pierce,82,82,38.0,25.3,1.1,5.2,6.4,3.1,1.7,0.8,3.2,3.1,1.0,0.0
2,2000-01 Season,Boston Celtics,Antoine Walker,81,81,41.9,23.4,1.9,7.0,8.9,5.5,1.7,0.6,3.7,3.1,1.5,0.0
3,2000-01 Season,Boston Celtics,Bryant Stith,78,74,32.1,9.7,0.8,2.8,3.6,2.2,1.2,0.2,1.2,2.3,1.9,0.0
4,2000-01 Season,Boston Celtics,Kenny Anderson,33,28,25.7,7.5,0.5,1.7,2.2,4.1,1.3,0.1,1.6,1.9,2.6,0.0
5,2000-01 Season,Boston Celtics,Vitaly Potapenko,82,7,23.2,7.5,2.5,3.5,6.0,0.8,0.6,0.3,1.3,2.8,0.6,0.0
6,2000-01 Season,Boston Celtics,Eric Williams,81,11,21.5,6.6,0.8,1.8,2.6,1.4,0.8,0.2,0.9,2.2,1.5,0.0
7,2000-01 Season,Boston Celtics,Tony Battie,40,25,21.1,6.5,1.8,4.0,5.8,0.4,0.7,1.5,0.9,3.2,0.4,0.0
8,2000-01 Season,Boston Celtics,Milt Palacio,58,6,19.7,5.9,0.4,1.3,1.8,2.6,0.8,0.0,1.4,1.4,1.9,0.0
9,2000-01 Season,Boston Celtics,Doug Overton,7,1,20.6,5.4,0.4,1.7,2.1,2.7,0.6,0.0,1.9,2.1,1.5,0.0
10,2000-01 Season,Boston Celtics,Chris Carr,35,0,8.8,4.8,0.3,0.9,1.3,0.3,0.1,0.1,0.5,1.3,0.6,0.0

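A column labelled `Unnamed: 0` is what pandas produces when a CSV that was written with its index is read back without `index_col`. Whether that is where the header in the table above comes from is a guess. The sketch below shows the round trip on two rows taken from that table, using an in-memory buffer in place of a file. The variable names are illustrative, and shifting the index before writing is a choice made here so that the file is 1-based as well.

```python
import io

import pandas as pd

sample = pd.DataFrame({
    'Name': ['Paul Pierce', 'Antoine Walker'],
    'GP': [82, 81],
    'PTS': [25.3, 23.4],
})
sample.index += 1  # shift first, so the written file starts at 1 too

buffer = io.StringIO()  # stands in for a file on disk
sample.to_csv(buffer)

buffer.seek(0)
plain = pd.read_csv(buffer)                 # the index comes back as a regular column
buffer.seek(0)
indexed = pd.read_csv(buffer, index_col=0)  # the index is restored as row labels

print(list(plain.columns))
print(list(indexed.columns))
```

Passing `index=False` to `to_csv` is the alternative when the row numbers are not needed in the file at all.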

We can now run some analysis on this DataFrame. For instance, we can find the most efficient player in the league by PER. (Please note: I count only players who appeared in more than 20 games, which gives a more representative efficiency ranking.)

In [314]:
df_gp20 = df[df['GP'] > 20]
df_gp20.sort_values('PER', ascending = False).head()

Unnamed: 0,Season,Team,Name,GP,GS,MIN,PTS,OR,DR,REB,AST,STL,BLK,TO,PF,AST/TO,PER
2088,2008-09 Season,Cleveland Cavaliers,LeBron James,81,81,37.7,28.4,1.3,6.3,7.6,7.2,1.7,1.1,3.0,1.7,2.4,31.76
3329,2019-20 Season,Milwaukee Bucks,Giannis Antetokounmpo,57,57,30.9,29.6,2.3,11.5,13.7,5.8,1.0,1.0,3.7,3.0,1.6,31.71
7572,2012-13 Season,Miami Heat,LeBron James,76,76,37.9,26.8,1.3,6.8,8.0,7.3,1.7,0.9,3.0,1.4,2.4,31.67
5188,2015-16 Season,Golden State Warriors,Stephen Curry,79,79,34.2,30.1,0.9,4.6,5.4,6.7,2.1,0.2,3.3,2.0,2.0,31.56
2104,2009-10 Season,Cleveland Cavaliers,LeBron James,76,76,39.0,29.7,0.9,6.4,7.3,8.6,1.6,1.0,3.4,1.6,2.5,31.19

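PER is 0.0 in every 2000-01 row shown earlier, which suggests a placeholder for a missing value rather than a real rating. That is an inference from the tables and has not been checked against ESPN. Sorting in descending order is unaffected, but a mean or a "lowest PER" query would be skewed. One way to handle it, sketched on a few rows from the tables above, is to turn the zeros into missing values first.

```python
import pandas as pd

sample = pd.DataFrame({
    'Season': ['2000-01 Season', '2000-01 Season', '2008-09 Season', '2019-20 Season'],
    'Name': ['Paul Pierce', 'Doug Overton', 'LeBron James', 'Giannis Antetokounmpo'],
    'GP': [82, 7, 81, 57],
    'PER': [0.0, 0.0, 31.76, 31.71],
})

qualified = sample[sample['GP'] > 20].copy()  # same games-played rule as above
# Assumption: a PER of exactly 0 means "not available", so mask it as NaN.
qualified['PER'] = qualified['PER'].mask(qualified['PER'] == 0)

ranked = qualified.dropna(subset=['PER']).sort_values('PER', ascending=False)
print(ranked[['Name', 'PER']])
```

Keeping the masked rows in `qualified` and dropping them only for the ranking leaves the other stats of those players available for later queries.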

We can also find out which players averaged a double-double. This season, only LeBron James averaged a double-double in PTS and AST.

In [315]:
df_pts_reb = df_gp20.loc[(df_gp20['PTS'] > 10) & (df_gp20['REB'] > 10)] # Players who averaged a double-double in PTS and REB
df_pts_ast = df_gp20.loc[(df_gp20['PTS'] > 10) & (df_gp20['AST'] > 10)] # Players who averaged a double-double in PTS and AST

pd.concat([df_pts_reb, df_pts_ast]) # Combining both DataFrames (DataFrame.append was removed in pandas 2.0)


Unnamed: 0,Season,Team,Name,GP,GS,MIN,PTS,OR,DR,REB,AST,STL,BLK,TO,PF,AST/TO,PER
104,2006-07 Season,Boston Celtics,Al Jefferson,69,60,33.6,16.0,3.4,7.5,11.0,1.3,0.7,1.5,2.0,3.4,0.6,19.90
509,2000-01 Season,New York Knicks,Marcus Camby,63,63,33.8,12.0,3.1,8.4,11.5,0.8,1.0,2.2,1.0,3.3,0.8,0.00
526,2001-02 Season,New York Knicks,Marcus Camby,29,29,34.7,11.1,3.1,8.0,11.1,1.1,1.2,1.7,1.4,3.7,0.8,0.00
576,2004-05 Season,New York Knicks,Kurt Thomas,80,80,35.7,11.5,2.1,8.3,10.4,2.0,0.9,1.0,1.2,3.9,1.6,14.19
615,2006-07 Season,New York Knicks,David Lee,58,12,29.8,10.7,3.4,7.0,10.4,1.8,0.8,0.4,1.6,2.7,1.1,20.31
626,2007-08 Season,New York Knicks,Zach Randolph,69,68,32.5,17.6,2.7,7.6,10.3,2.0,0.9,0.2,2.7,2.8,0.8,18.37
644,2008-09 Season,New York Knicks,David Lee,81,74,34.9,16.0,3.2,8.6,11.7,2.1,1.0,0.3,1.9,3.2,1.2,19.07
663,2009-10 Season,New York Knicks,David Lee,81,81,37.3,20.2,2.8,8.9,11.7,3.6,1.0,0.5,2.3,3.2,1.6,22.28
725,2012-13 Season,New York Knicks,Tyson Chandler,66,66,32.8,10.4,4.1,6.6,10.7,0.9,0.6,1.1,1.3,2.8,0.7,18.88
811,2017-18 Season,New York Knicks,Enes Kanter,71,71,25.8,14.1,3.8,7.1,11.0,1.5,0.5,0.5,1.7,2.6,0.9,24.02
