# Scraping Using BeautifulSoup
I am still new to web scraping, so I wrote this notebook to practice and to document each step for myself. The same result could be reached more efficiently, but the goal here is learning and reference.

In [2]:
import bs4 as bs
import urllib.request
import html5lib  # not called directly here; pandas.read_html can use it as a parser backend
import pandas as pd

In [3]:
guards_advanced = urllib.request.urlopen("https://rotogrinders.com/pages/nba-advanced-player-stats-guards-181885").read()
guards_touches = urllib.request.urlopen("https://rotogrinders.com/pages/nba-advanced-player-stats-guards-touches-201726").read()
forwards_advanced = urllib.request.urlopen("https://rotogrinders.com/pages/nba-advanced-player-stats-forwards-181887").read()
forwards_touches = urllib.request.urlopen("https://rotogrinders.com/pages/nba-advanced-player-stats-forwards-touches-201728").read()
centers_advanced = urllib.request.urlopen("https://rotogrinders.com/pages/nba-advanced-player-stats-centers-181888").read()
centers_touches = urllib.request.urlopen("https://rotogrinders.com/pages/nba-advanced-player-stats-centers-touches-201727").read()

In [4]:
guards_advanced = bs.BeautifulSoup(guards_advanced, 'lxml')
guards_touches = bs.BeautifulSoup(guards_touches, 'lxml')
forwards_advanced = bs.BeautifulSoup(forwards_advanced, 'lxml')
forwards_touches = bs.BeautifulSoup(forwards_touches, 'lxml')
centers_advanced = bs.BeautifulSoup(centers_advanced, 'lxml')
centers_touches = bs.BeautifulSoup(centers_touches, 'lxml')
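
The six fetch-and-parse statements above all follow one pattern, so they could be folded into a loop over a dictionary of URLs. The sketch below is one possible arrangement, not the code this notebook actually ran: `PAGES`, `download`, `fetch_soups`, and the injectable `fetch` argument are hypothetical names, and the parser defaults to Python's built-in `html.parser` so the example runs without lxml (pass `parser="lxml"` to match the cells above).

```python
import urllib.request

import bs4 as bs

# URLs copied from the cells above, keyed by the variable names used there.
PAGES = {
    "guards_advanced": "https://rotogrinders.com/pages/nba-advanced-player-stats-guards-181885",
    "guards_touches": "https://rotogrinders.com/pages/nba-advanced-player-stats-guards-touches-201726",
    "forwards_advanced": "https://rotogrinders.com/pages/nba-advanced-player-stats-forwards-181887",
    "forwards_touches": "https://rotogrinders.com/pages/nba-advanced-player-stats-forwards-touches-201728",
    "centers_advanced": "https://rotogrinders.com/pages/nba-advanced-player-stats-centers-181888",
    "centers_touches": "https://rotogrinders.com/pages/nba-advanced-player-stats-centers-touches-201727",
}


def download(url):
    """Return the raw bytes of a page, using the same urlopen call as above."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def fetch_soups(pages, fetch=download, parser="html.parser"):
    """Fetch every page in `pages` and parse it into a BeautifulSoup object."""
    return {name: bs.BeautifulSoup(fetch(url), parser) for name, url in pages.items()}


def fake_fetch(url):
    """Stand-in fetcher (made-up HTML) so the demo runs offline."""
    return b"<html><body><table class='tbl data-table'></table></body></html>"


# Offline demonstration; fetch_soups(PAGES) would download the real pages.
demo_soups = fetch_soups(PAGES, fetch=fake_fetch)
```

Keeping the soups in a dictionary also avoids overwriting the raw bytes with the parsed object under the same variable name.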

## Guards Advanced

Note: This page is updated weekly on Thursdays. Players must have played at least 70 minutes this season to appear on this page.

Percentage of Workload Stats

Percentage of Workload = (Player’s Avg. Stat) / (Team’s Avg. Stat)

Other Advanced Statistics

eFG% – Adjusted field goal percentage based on the idea that 3 pointers are 50% more valuable than 2 point field goals.
TS% – True Shooting Percentage, adjusted for 3-pointers and free throws.
USG% – Percentage of offensive possessions in which a player contributes while on the floor.
O-Rt – Points Contributed to by Player Per 100 Possessions
D-Rt – Points Allowed by Player Per 100 Possessions.

Team Advanced Statistics

PF – Player Fouls Per Game
O-Rt – Points Per 100 Possessions
D-Rt – Points Allowed Per 100 Possessions.
Net-Rt – O-Rating minus D-Rating
eFG% – Adjusted field goal percentage based on the idea that 3 pointers are 50% more valuable than 2 point field goals.
TS% – True Shooting Percentage, adjusted for 3-pointers and free throws.
Ast% – Percentage of field goals made from an assist
Reb% – Percentage of missed shots rebounded by team
AST/TO – Assists Divided by Turnovers
TORatio – Team’s plays per game ending in a turnover

Player Touches Advanced Stats

TCH/G – Number of times a player touches the ball per game.
Tch/Min – Number of touches per minute played
PossTm – Average time per game where the player possesses the ball
Post – Touches in the post
Paint – Touches in the paint
PPG – Points Per Game
Pts/Tch – Points Scored Divided by Touches Per Game
FPPG – Fantasy Points Per Game
FP/Tch – Fantasy Points Per Game Divided by Touches Per Game
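
To make a few of the definitions above concrete, here is a minimal sketch of the workload ratio together with eFG% and TS%. The page does not spell out its exact formulas, so the eFG% and TS% functions use the commonly published versions (including the conventional 0.44 free-throw weight), which is an assumption on my part. The function names and the sample numbers are made up for illustration.

```python
def workload_share(player_avg, team_avg):
    """Percentage of Workload: the player's average stat over the team's average stat."""
    return player_avg / team_avg


def effective_fg_pct(fgm, fg3m, fga):
    """eFG%: each made three counts as 1.5 made field goals (assumed standard formula)."""
    return (fgm + 0.5 * fg3m) / fga


def true_shooting_pct(pts, fga, fta):
    """TS%: points per scoring attempt, with free throws weighted by 0.44 (assumed standard formula)."""
    return pts / (2 * (fga + 0.44 * fta))


# Made-up box-score numbers, only to exercise the functions.
share = workload_share(player_avg=25.0, team_avg=100.0)
efg = effective_fg_pct(fgm=10, fg3m=4, fga=20)
ts = true_shooting_pct(pts=30, fga=20, fta=10)

print(f"workload share: {share:.2%}, eFG%: {efg:.2%}, TS%: {ts:.2%}")
```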

In [5]:
#Beautiful soup objects on page
[type(item) for item in list(guards_advanced.children)]

[bs4.element.Doctype, bs4.element.Tag, bs4.element.NavigableString]

In [8]:
#Closer look at the bs4.element.Tag node (index 1 in the list above), which is the <html> tag.  Will need to extract its "body" tag.
html = list(guards_advanced.children)[1]
print(len(list(html.children)))

5
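
Picking children by position, as in `list(guards_advanced.children)[1]`, depends on how many whitespace strings sit between the tags. The small sketch below uses a made-up HTML snippet (not the rotogrinders page) and the built-in `html.parser` to show that whitespace between tags is itself a child node, and that filtering for `Tag` objects gives positions that do not shift with whitespace.

```python
import bs4 as bs

# Made-up markup: two <div> tags separated by newlines.
snippet = "<html><body>\n<div>first</div>\n<div>second</div>\n</body></html>"
soup = bs.BeautifulSoup(snippet, "html.parser")

# .children yields every direct child, including the newline strings.
body_children = list(soup.body.children)
child_types = [type(child).__name__ for child in body_children]

# Keep only Tag children so an index no longer depends on the whitespace.
tag_children = [child for child in body_children if isinstance(child, bs.element.Tag)]
tag_texts = [tag.get_text() for tag in tag_children]

print(child_types)
print(tag_texts)
```

`soup.body.find_all(recursive=False)` returns the same direct child tags without the manual filter.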


In [10]:
#Extract body
GA_body = html.body
type(GA_body)

#Extract children of body; convert iterator into list
GA_body_children = GA_body.children
print(type(GA_body_children))
GA_body_children = list(GA_body_children)
print(type(GA_body_children))
print(len(GA_body_children))

content = GA_body_children[1] #extract text from body's children
content.get_text()

<class 'list_iterator'>
<class 'list'>
5


"\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nMenu\n\n\n\nMLB\nResearch tools and content for Daily Fantasy MLB\n\n\n\n\n\nPGA\nResearch tools and content for Daily Fantasy PGA\n\n\n\n\n\n\n\n\nFanDuel\n\n\nDraftKings\n\n\nFantasyDraft\n\n\nDraft\n\n\nYahoo\n\n\n\n\n\n\n\n\nFanDuel\nDraftKings\nFantasyDraft\nDraft\nYahoo\n\n\n\n\n\n                            Featured See All Featured\n\n\n\n\n\nPlayer Projections \nDaily projections from the best MLB minds out there.\n\n\n\n\n\nDaily MLB Projected and Starting Lineups\nDaily projected/confirmed batting orders and starting pitchers.\n\n\n\n\n\nPlateIQ\nAn MLB DFS dashboard that provides all of the relevant data about a particular matchup.\n\n\n\n\n\nMLB Weather With Kevin Roth\nAll the crucial weather information you need from our Chief Meteorologist.\n\n\n\n\n\nLineup Builder\nUse our projections to build one or multiple lineups.\n\n\n\n\n\nConsensus Value Rankings\nDaily aggregated point

In [11]:
#Extracting tables
tables = guards_advanced.find_all('table')
table_info = tables[2] #this is the table that holds the info we need.
table_info = table_info.get_text()#extract the text we need
table_info[:1000]

'\n\n\n\tPlayer\t\n\tTeam\t\n Pts \n Reb \n Ast \n Stl \n Blk \n eFG% \n TS% \n USG% \n O-Rt \n D-Rt \n DRE/36\t\n PER \n\n\n\n\n Stephen Curry \n GSW \n \t23.44%\t\n \t10.77%\t\n \t20.94%\t\n \t22.86%\t\n \t2.76%\t\n \t63.00%\t\n \t66.90%\t\n \t32.60%\t\n \t124.7\t\n \t102.8\t\n \t9.17\t\n \t31.3\t\n\n\n James Harden \n HOU \n \t27.29%\t\n \t14.01%\t\n \t34.00%\t\n \t17.38%\t\n \t11.53%\t\n \t51.20%\t\n \t59.80%\t\n \t32.50%\t\n \t115\t\n \t108.3\t\n \t3.99\t\n \t25.2\t\n\n\n Damian Lillard \n POR \n \t22.17%\t\n \t8.11%\t\n \t29.31%\t\n \t12.48%\t\n \t7.42%\t\n \t49.70%\t\n \t56.00%\t\n \t31.30%\t\n \t113.1\t\n \t110.8\t\n \t1.80\t\n \t22.1\t\n\n\n Russell Westbrook \n OKC \n \t21.28%\t\n \t15.57%\t\n \t45.72%\t\n \t27.78%\t\n \t3.55%\t\n \t48.90%\t\n \t55.40%\t\n \t31.60%\t\n \t115.5\t\n \t103.2\t\n \t4.73\t\n \t27.6\t\n\n\n DeMar DeRozan \n TOR \n \t21.83%\t\n \t9.89%\t\n \t19.80%\t\n \t12.79%\t\n \t4.61%\t\n \t46.30%\t\n \t55.00%\t\n \t29.80%\t\n \t112.9\t\n \t108.2\t\n \t0.55\t\n

In [12]:
#easier way to extract the specific table I need: filter by its class attribute
table = guards_advanced.find_all('table', class_="tbl data-table")
table_info = table[0].get_text()
table_info[:1000]

'\n\n\n\tPlayer\t\n\tTeam\t\n Pts \n Reb \n Ast \n Stl \n Blk \n eFG% \n TS% \n USG% \n O-Rt \n D-Rt \n DRE/36\t\n PER \n\n\n\n\n Stephen Curry \n GSW \n \t23.44%\t\n \t10.77%\t\n \t20.94%\t\n \t22.86%\t\n \t2.76%\t\n \t63.00%\t\n \t66.90%\t\n \t32.60%\t\n \t124.7\t\n \t102.8\t\n \t9.17\t\n \t31.3\t\n\n\n James Harden \n HOU \n \t27.29%\t\n \t14.01%\t\n \t34.00%\t\n \t17.38%\t\n \t11.53%\t\n \t51.20%\t\n \t59.80%\t\n \t32.50%\t\n \t115\t\n \t108.3\t\n \t3.99\t\n \t25.2\t\n\n\n Damian Lillard \n POR \n \t22.17%\t\n \t8.11%\t\n \t29.31%\t\n \t12.48%\t\n \t7.42%\t\n \t49.70%\t\n \t56.00%\t\n \t31.30%\t\n \t113.1\t\n \t110.8\t\n \t1.80\t\n \t22.1\t\n\n\n Russell Westbrook \n OKC \n \t21.28%\t\n \t15.57%\t\n \t45.72%\t\n \t27.78%\t\n \t3.55%\t\n \t48.90%\t\n \t55.40%\t\n \t31.60%\t\n \t115.5\t\n \t103.2\t\n \t4.73\t\n \t27.6\t\n\n\n DeMar DeRozan \n TOR \n \t21.83%\t\n \t9.89%\t\n \t19.80%\t\n \t12.79%\t\n \t4.61%\t\n \t46.30%\t\n \t55.00%\t\n \t29.80%\t\n \t112.9\t\n \t108.2\t\n \t0.55\t\n
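
One detail about `class_="tbl data-table"`: when the value contains a space, BeautifulSoup compares it against the exact string of the `class` attribute, so a table whose classes appear in a different order would not match. The sketch below illustrates this on made-up markup (the three tables are hypothetical, not taken from the page) and compares it with a CSS selector, which matches on class membership regardless of order.

```python
import bs4 as bs

# Made-up tables: same two classes in different orders, plus one with a single class.
snippet = """
<table class="tbl data-table"><tr><td>stats</td></tr></table>
<table class="data-table tbl"><tr><td>reordered</td></tr></table>
<table class="tbl"><tr><td>other</td></tr></table>
"""
soup = bs.BeautifulSoup(snippet, "html.parser")

# Matches only when the attribute string is exactly "tbl data-table".
exact = soup.find_all("table", class_="tbl data-table")

# Matches any table carrying both classes, in any order.
by_selector = soup.select("table.tbl.data-table")

# A single class name matches every table that has that class.
single = soup.find_all("table", class_="tbl")

print(len(exact), len(by_selector), len(single))
```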

In [13]:
print(type(list(guards_advanced.div.children)[3]))
content = list(guards_advanced.div.children)[3]
content.prettify()

<class 'bs4.element.Tag'>


'<div class="pag ui-panel-wrapper">\n <!-- PAGE HEADER -->\n <div class="blk supermenu alt" id="user-menu">\n  <div>\n   <div class="menu-groups">\n    <div class="menu-group">\n     <ul class="lst" data-group="menu" id="profile-menu">\n     </ul>\n    </div>\n   </div>\n  </div>\n </div>\n <div class="blk supermenu alt" id="search-menu">\n  <div>\n   <div class="menu-groups">\n    <div class="menu-group">\n     <ul class="lst" data-group="menu" id="search-menu">\n      <li>\n       <form action="/" class="frm player search">\n        <input name="id" type="hidden"/>\n        <input data-hidden="id" data-role="autocomplete" data-select-action="submit" data-sport="" data-url="/players/autocomplete" name="name" placeholder="Player Search" type="text"/>\n        <div class="search-btn">\n         <input class="ui-submit" type="submit" value=""/>\n         <span class="icn-search">\n         </span>\n        </div>\n       </form>\n      </li>\n      <li>\n       <form action="/" class="fr

In [93]:
#This is the info I need!
guards_advanced_stats_table = content.find_all('table', class_='tbl data-table')
print(type(guards_advanced_stats_table))
guards_advanced_stats_table

<class 'bs4.element.ResultSet'>


[<table class="tbl data-table">
 <thead>
 <tr>
 <th style="text-align:center;">	Player	</th>
 <th style="text-align:center;">	Team	</th>
 <td style="background:#B20000; color:white; text-align:center;"> <strong>Pts</strong> </td>
 <td style="background:#B20000; color:white; text-align:center;"> <strong>Reb</strong> </td>
 <td style="background:#B20000; color:white; text-align:center;"> <strong>Ast</strong> </td>
 <td style="background:#B20000; color:white; text-align:center;"> <strong>Stl</strong> </td>
 <td style="background:#B20000; color:white; text-align:center;"> <strong>Blk</strong> </td>
 <td style="background:#000066; color:white; text-align:center;"> <strong>eFG%</strong> </td>
 <td style="background:#000066; color:white; text-align:center;"> <strong>TS%</strong> </td>
 <td style="background:#000066; color:white; text-align:center;"> <strong><span class="caps">USG</span>%</strong> </td>
 <td style="background:#000066; color:white; text-align:center;"> <strong>O-Rt</strong> </t

In [125]:
#The ResultSet holds a single table; contents[1] is its <thead>, whose text holds the column names
for item in guards_advanced_stats_table:
    col_names = item.contents[1].text


In [126]:
col_names

'\n\n\tPlayer\t\n\tTeam\t\n Pts \n Reb \n Ast \n Stl \n Blk \n eFG% \n TS% \n USG% \n O-Rt \n D-Rt \n DRE/36\t\n PER \n\n'

In [127]:
guards_advanced_col_names = col_names.split()
guards_advanced_col_names

['Player',
 'Team',
 'Pts',
 'Reb',
 'Ast',
 'Stl',
 'Blk',
 'eFG%',
 'TS%',
 'USG%',
 'O-Rt',
 'D-Rt',
 'DRE/36',
 'PER']
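
Splitting the header text on whitespace works here because none of the column names contains a space. A slightly more defensive sketch reads the header cell by cell instead. The markup below is a trimmed, made-up imitation of the table header shown earlier, and the "Fantasy Pts" column is hypothetical, added only to show where `split()` would go wrong.

```python
import bs4 as bs

# Trimmed imitation of the header row: <th> for Player/Team, <td> for the stats.
snippet = """
<table class="tbl data-table">
 <thead>
  <tr>
   <th> Player </th>
   <th> Team </th>
   <td> <strong>Pts</strong> </td>
   <td> <strong><span class="caps">USG</span>%</strong> </td>
   <td> <strong>Fantasy Pts</strong> </td>
  </tr>
 </thead>
</table>
"""
soup = bs.BeautifulSoup(snippet, "html.parser")
thead = soup.find("thead")

# One name per header cell, whether the cell is a <th> or a <td>.
header_cells = thead.find_all(["th", "td"])
col_names = [cell.get_text(strip=True) for cell in header_cells]

# The whitespace split used above breaks the two-word name into two entries.
split_names = thead.get_text().split()

print(col_names)
print(split_names)
```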

In [83]:
#pandas' read_html can also parse every table on the page into a list of DataFrames
guards_advanced_dfs = pd.read_html("https://rotogrinders.com/pages/nba-advanced-player-stats-guards-181885")

guards_advanced_stats_df = guards_advanced_dfs[2]
guards_advanced_stats_df.tail()

Unnamed: 0,Player,Team,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,Unnamed: 10,Unnamed: 11,Unnamed: 12,Unnamed: 13
176,Pablo Prigioni,LAC,1.58%,3.14%,6.91%,6.97%,0.20%,46.90%,48.20%,11.00%,103.3,103.9,0.16,10.2
177,Terry Rozier,BOS,1.05%,2.04%,2.07%,0.90%,1.05%,,,,,,-5.09,
178,Pat Connaughton,POR,0.45%,0.78%,0.51%,0.32%,0.00%,,,,,,-6.39,
179,James Young,BOS,0.34%,0.66%,0.43%,0.77%,0.26%,,,,,,-3.67,
180,Aaron Harrison,CHA,0.20%,0.42%,0.11%,0.95%,0.00%,,,,,,-2.65,


In [129]:
guards_advanced_col_names = pd.Series(guards_advanced_col_names)    
print(guards_advanced_col_names)
guards_advanced_stats_df.columns = guards_advanced_col_names
guards_advanced_stats_df.head()

0     Player
1       Team
2        Pts
3        Reb
4        Ast
5        Stl
6        Blk
7       eFG%
8        TS%
9       USG%
10      O-Rt
11      D-Rt
12    DRE/36
13       PER
dtype: object


Unnamed: 0,Player,Team,Pts,Reb,Ast,Stl,Blk,eFG%,TS%,USG%,O-Rt,D-Rt,DRE/36,PER
0,Stephen Curry,GSW,23.44%,10.77%,20.94%,22.86%,2.76%,63.00%,66.90%,32.60%,124.7,102.8,9.17,31.3
1,James Harden,HOU,27.29%,14.01%,34.00%,17.38%,11.53%,51.20%,59.80%,32.50%,115.0,108.3,3.99,25.2
2,Damian Lillard,POR,22.17%,8.11%,29.31%,12.48%,7.42%,49.70%,56.00%,31.30%,113.1,110.8,1.8,22.1
3,Russell Westbrook,OKC,21.28%,15.57%,45.72%,27.78%,3.55%,48.90%,55.40%,31.60%,115.5,103.2,4.73,27.6
4,DeMar DeRozan,TOR,21.83%,9.89%,19.80%,12.79%,4.61%,46.30%,55.00%,29.80%,112.9,108.2,0.55,21.5
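
Assigning scraped names to `.columns` fails with a length-mismatch error if the header and the table ever disagree in width. The sketch below wraps the assignment in a small helper that checks the lengths first and returns a renamed copy. `apply_column_names` is a hypothetical helper, not a pandas function, and the tiny DataFrame reuses two rows from the output above.

```python
import pandas as pd


def apply_column_names(df, names):
    """Return a copy of `df` whose column labels are replaced by `names`."""
    names = list(names)
    if len(names) != df.shape[1]:
        raise ValueError(
            f"expected {df.shape[1]} column names, got {len(names)}"
        )
    renamed = df.copy()
    renamed.columns = names
    return renamed


# Two rows from the guards table, with the placeholder label read_html produced.
raw = pd.DataFrame(
    [["Stephen Curry", "GSW", "23.44%"], ["James Harden", "HOU", "27.29%"]],
    columns=["Player", "Team", "Unnamed: 2"],
)
named = apply_column_names(raw, ["Player", "Team", "Pts"])

print(named)
```

A plain list is enough for the names; wrapping them in a `pd.Series` first is not required.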


## Guards Touches

In [84]:
#still need to extract column names first
print(type(list(guards_touches.div.children)[3]))
guards_touches_content = list(guards_touches.div.children)[3]
guards_touches_content.prettify()

<class 'bs4.element.Tag'>


'<div class="pag ui-panel-wrapper">\n <!-- PAGE HEADER -->\n <div class="blk supermenu alt" id="user-menu">\n  <div>\n   <div class="menu-groups">\n    <div class="menu-group">\n     <ul class="lst" data-group="menu" id="profile-menu">\n     </ul>\n    </div>\n   </div>\n  </div>\n </div>\n <div class="blk supermenu alt" id="search-menu">\n  <div>\n   <div class="menu-groups">\n    <div class="menu-group">\n     <ul class="lst" data-group="menu" id="search-menu">\n      <li>\n       <form action="/" class="frm player search">\n        <input name="id" type="hidden"/>\n        <input data-hidden="id" data-role="autocomplete" data-select-action="submit" data-sport="" data-url="/players/autocomplete" name="name" placeholder="Player Search" type="text"/>\n        <div class="search-btn">\n         <input class="ui-submit" type="submit" value=""/>\n         <span class="icn-search">\n         </span>\n        </div>\n       </form>\n      </li>\n      <li>\n       <form action="/" class="fr

In [85]:
guards_touches_stats_table = guards_touches_content.find_all('table', class_='tbl data-table')
print(type(guards_touches_stats_table))
guards_touches_stats_table

<class 'bs4.element.ResultSet'>


[<table class="tbl data-table">
 <thead>
 <tr>
 <th style="text-align:center;">	Player	</th>
 <th style="text-align:center;">	Team	</th>
 <th style="text-align:center;">	GP	</th>
 <th style="text-align:center;"> <span class="caps">MIN</span> </th>
 <th style="text-align:center;"> <span class="caps">TCH</span>/G	</th>
 <th style="text-align:center;">	Tch/Min	</th>
 <th style="text-align:center;">	PossTm	</th>
 <th style="text-align:center;">	Post	</th>
 <th style="text-align:center;">	Paint	</th>
 <th style="text-align:center;"> <span class="caps">PPG</span> </th>
 <th style="text-align:center;">	Pts/Tch	</th>
 <th style="text-align:center;"> <span class="caps">FPPG</span> </th>
 <th style="text-align:center;">	FP/Tch	</th>
 </tr>
 </thead>
 <tbody>
 <tr>
 <td> <a class="player-popup" href="/players/stephen-curry-1079">Stephen Curry</a> </td>
 <td> <span class="caps">GSW</span> </td>
 <td> 	9	</td>
 <td> 	31.9	</td>
 <td> 	78.00	</td>
 <td> 	2.45	</td>
 <td> 	5.20	</td>
 <td> 	1.60	</td

In [109]:
for item in guards_touches_stats_table:
    guards_touches_col_names = item.contents[1].text
type(guards_touches_col_names)

str

In [110]:
guards_touches_col_names = guards_touches_col_names.split()
guards_touches_col_names

['Player',
 'Team',
 'GP',
 'MIN',
 'TCH/G',
 'Tch/Min',
 'PossTm',
 'Post',
 'Paint',
 'PPG',
 'Pts/Tch',
 'FPPG',
 'FP/Tch']

In [111]:
guards_touches_dfs = pd.read_html("https://rotogrinders.com/pages/nba-advanced-player-stats-guards-touches-201726")

In [130]:
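#The column names came through automatically this time, apparently because every header cell in this table is a <th> (the advanced table uses <td> for its stat headers)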
guards_touches_stats_df = guards_touches_dfs[1] 
guards_touches_stats_df.tail()

Unnamed: 0,Player,Team,GP,MIN,TCH/G,Tch/Min,PossTm,Post,Paint,PPG,Pts/Tch,FPPG,FP/Tch
90,Pablo Prigioni,LAC,5,5.2,11.6,2.23,0.8,0.2,0.0,0.0,0.0,8.68,0.75
91,Terry Rozier,BOS,5,19.7,33.2,1.69,1.8,0.4,0.2,0.96,0.14,5.99,0.18
92,Pat Connaughton,POR,6,1.3,1.5,1.15,0.1,0.0,0.0,0.22,0.87,2.81,1.87
93,James Young,BOS,3,3.4,4.7,1.38,0.1,0.0,0.0,0.23,0.15,2.99,0.64
94,Aaron Harrison,CHA,2,3.7,2.5,0.68,0.1,0.0,0.0,0.0,0.0,2.1,0.84


## Forwards Advanced

In [131]:
forwards_advanced_dfs = pd.read_html("https://rotogrinders.com/pages/nba-advanced-player-stats-forwards-181887")

In [147]:
#will need to extract out the columns like I did for guards advanced 
forwards_advanced_stats_df = forwards_advanced_dfs[2]
forwards_advanced_stats_df.head()

Unnamed: 0,Player,Team,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,Unnamed: 10,Unnamed: 11,Unnamed: 12,Unnamed: 13
0,Kevin Durant,OKC,23.26%,14.98%,18.84%,11.53%,17.58%,57.30%,63.40%,30.70%,121.7,104.2,4.27,28.2
1,LeBron James,CLE,22.52%,16.13%,27.97%,20.80%,16.30%,55.10%,58.80%,31.40%,116.5,102.3,4.77,27.5
2,Anthony Davis,NO,17.58%,17.95%,6.38%,12.32%,36.55%,50.80%,55.90%,29.60%,110.1,104.4,3.86,25.0
3,Paul George,IND,22.86%,15.76%,19.22%,20.91%,7.89%,49.00%,55.80%,30.40%,106.6,101.1,2.7,20.9
4,Carmelo Anthony,NYK,19.50%,15.31%,17.78%,13.28%,8.15%,47.40%,53.00%,29.70%,106.7,106.9,0.83,20.3


In [142]:
#search the forwards page itself; `content` still points at the guards page from earlier
forwards_advanced_stats_table = forwards_advanced.find_all('table', class_='tbl data-table')
print(type(forwards_advanced_stats_table))
forwards_advanced_stats_table

<class 'bs4.element.ResultSet'>


[<table class="tbl data-table">
 <thead>
 <tr>
 <th style="text-align:center;">	Player	</th>
 <th style="text-align:center;">	Team	</th>
 <td style="background:#B20000; color:white; text-align:center;"> <strong>Pts</strong> </td>
 <td style="background:#B20000; color:white; text-align:center;"> <strong>Reb</strong> </td>
 <td style="background:#B20000; color:white; text-align:center;"> <strong>Ast</strong> </td>
 <td style="background:#B20000; color:white; text-align:center;"> <strong>Stl</strong> </td>
 <td style="background:#B20000; color:white; text-align:center;"> <strong>Blk</strong> </td>
 <td style="background:#000066; color:white; text-align:center;"> <strong>eFG%</strong> </td>
 <td style="background:#000066; color:white; text-align:center;"> <strong>TS%</strong> </td>
 <td style="background:#000066; color:white; text-align:center;"> <strong><span class="caps">USG</span>%</strong> </td>
 <td style="background:#000066; color:white; text-align:center;"> <strong>O-Rt</strong> </t

In [144]:
for item in forwards_advanced_stats_table:
    forwards_advanced_col_names = item.contents[1].text
type(forwards_advanced_col_names)

str

In [145]:
forwards_advanced_col_names = forwards_advanced_col_names.split()
forwards_advanced_col_names

['Player',
 'Team',
 'Pts',
 'Reb',
 'Ast',
 'Stl',
 'Blk',
 'eFG%',
 'TS%',
 'USG%',
 'O-Rt',
 'D-Rt',
 'DRE/36',
 'PER']

In [148]:
forwards_advanced_col_names = pd.Series(forwards_advanced_col_names)    
print(forwards_advanced_col_names)
forwards_advanced_stats_df.columns = forwards_advanced_col_names
forwards_advanced_stats_df.head()

0     Player
1       Team
2        Pts
3        Reb
4        Ast
5        Stl
6        Blk
7       eFG%
8        TS%
9       USG%
10      O-Rt
11      D-Rt
12    DRE/36
13       PER
dtype: object


Unnamed: 0,Player,Team,Pts,Reb,Ast,Stl,Blk,eFG%,TS%,USG%,O-Rt,D-Rt,DRE/36,PER
0,Kevin Durant,OKC,23.26%,14.98%,18.84%,11.53%,17.58%,57.30%,63.40%,30.70%,121.7,104.2,4.27,28.2
1,LeBron James,CLE,22.52%,16.13%,27.97%,20.80%,16.30%,55.10%,58.80%,31.40%,116.5,102.3,4.77,27.5
2,Anthony Davis,NO,17.58%,17.95%,6.38%,12.32%,36.55%,50.80%,55.90%,29.60%,110.1,104.4,3.86,25.0
3,Paul George,IND,22.86%,15.76%,19.22%,20.91%,7.89%,49.00%,55.80%,30.40%,106.6,101.1,2.7,20.9
4,Carmelo Anthony,NYK,19.50%,15.31%,17.78%,13.28%,8.15%,47.40%,53.00%,29.70%,106.7,106.9,0.83,20.3


## Forwards Touches

In [137]:
forwards_touches_dfs = pd.read_html("https://rotogrinders.com/pages/nba-advanced-player-stats-forwards-touches-201728")

In [140]:
forwards_touches_stats_df = forwards_touches_dfs[1]

In [141]:
forwards_touches_stats_df.tail()

Unnamed: 0,Player,Team,GP,MIN,TCH/G,Tch/Min,PossTm,Post,Paint,PPG,Pts/Tch,FPPG,FP/Tch
73,Jason Thompson,TOR,9,5.8,8.8,1.52,0.1,0.1,0.2,0.1,0.1,7.42,0.84
74,K.J. McDaniels,HOU,4,8.4,8.5,1.01,0.4,0.0,0.0,0.58,0.27,4.96,0.58
75,Tyler Hansbrough,CHA,2,2.8,1.5,0.54,0.0,0.0,0.0,0.0,0.0,5.41,3.61
76,Nick Collison,OKC,9,8.7,8.7,1.0,0.2,0.6,0.6,0.11,0.11,6.82,0.78
77,Udonis Haslem,MIA,9,9.4,12.6,1.34,0.3,1.0,1.0,0.26,0.18,5.18,0.41


## Centers Advanced

In [149]:
centers_advanced_dfs = pd.read_html("https://rotogrinders.com/pages/nba-advanced-player-stats-centers-181888")

In [153]:
#will need to extract out the columns again
centers_advanced_stats_df = centers_advanced_dfs[2]
centers_advanced_stats_df.head()

Unnamed: 0,Player,Team,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,Unnamed: 10,Unnamed: 11,Unnamed: 12,Unnamed: 13
0,DeMarcus Cousins,SAC,19.83%,20.50%,10.62%,13.77%,24.31%,47.70%,53.80%,35.40%,103.1,102.4,2.22,23.3
1,Brook Lopez,BKN,18.56%,16.50%,8.04%,9.25%,37.46%,51.20%,56.20%,27.50%,109.3,108.3,1.46,21.8
2,Chris Bosh,MIA,10.56%,9.20%,6.66%,5.72%,5.54%,52.00%,57.10%,25.20%,113.2,103.6,2.06,20.3
3,Nikola Vucevic,ORL,14.11%,16.10%,9.26%,8.01%,16.79%,51.10%,53.10%,26.80%,106.7,105.0,1.36,21.0
4,Jahlil Okafor,PHI,11.62%,11.03%,3.68%,3.24%,12.32%,50.90%,53.60%,27.30%,100.1,110.1,-1.19,17.2


In [154]:
#search the centers page itself; `content` still points at the guards page from earlier
centers_advanced_stats_table = centers_advanced.find_all('table', class_='tbl data-table')
print(type(centers_advanced_stats_table))
centers_advanced_stats_table

<class 'bs4.element.ResultSet'>


[<table class="tbl data-table">
 <thead>
 <tr>
 <th style="text-align:center;">	Player	</th>
 <th style="text-align:center;">	Team	</th>
 <td style="background:#B20000; color:white; text-align:center;"> <strong>Pts</strong> </td>
 <td style="background:#B20000; color:white; text-align:center;"> <strong>Reb</strong> </td>
 <td style="background:#B20000; color:white; text-align:center;"> <strong>Ast</strong> </td>
 <td style="background:#B20000; color:white; text-align:center;"> <strong>Stl</strong> </td>
 <td style="background:#B20000; color:white; text-align:center;"> <strong>Blk</strong> </td>
 <td style="background:#000066; color:white; text-align:center;"> <strong>eFG%</strong> </td>
 <td style="background:#000066; color:white; text-align:center;"> <strong>TS%</strong> </td>
 <td style="background:#000066; color:white; text-align:center;"> <strong><span class="caps">USG</span>%</strong> </td>
 <td style="background:#000066; color:white; text-align:center;"> <strong>O-Rt</strong> </t

In [155]:
for item in centers_advanced_stats_table:
    centers_advanced_col_names = item.contents[1].text
type(centers_advanced_col_names)

str

In [156]:
centers_advanced_col_names = centers_advanced_col_names.split()
centers_advanced_col_names

['Player',
 'Team',
 'Pts',
 'Reb',
 'Ast',
 'Stl',
 'Blk',
 'eFG%',
 'TS%',
 'USG%',
 'O-Rt',
 'D-Rt',
 'DRE/36',
 'PER']

In [157]:
centers_advanced_col_names = pd.Series(centers_advanced_col_names)    
print(centers_advanced_col_names)
centers_advanced_stats_df.columns = centers_advanced_col_names
centers_advanced_stats_df.head()

0     Player
1       Team
2        Pts
3        Reb
4        Ast
5        Stl
6        Blk
7       eFG%
8        TS%
9       USG%
10      O-Rt
11      D-Rt
12    DRE/36
13       PER
dtype: object


Unnamed: 0,Player,Team,Pts,Reb,Ast,Stl,Blk,eFG%,TS%,USG%,O-Rt,D-Rt,DRE/36,PER
0,DeMarcus Cousins,SAC,19.83%,20.50%,10.62%,13.77%,24.31%,47.70%,53.80%,35.40%,103.1,102.4,2.22,23.3
1,Brook Lopez,BKN,18.56%,16.50%,8.04%,9.25%,37.46%,51.20%,56.20%,27.50%,109.3,108.3,1.46,21.8
2,Chris Bosh,MIA,10.56%,9.20%,6.66%,5.72%,5.54%,52.00%,57.10%,25.20%,113.2,103.6,2.06,20.3
3,Nikola Vucevic,ORL,14.11%,16.10%,9.26%,8.01%,16.79%,51.10%,53.10%,26.80%,106.7,105.0,1.36,21.0
4,Jahlil Okafor,PHI,11.62%,11.03%,3.68%,3.24%,12.32%,50.90%,53.60%,27.30%,100.1,110.1,-1.19,17.2


## Centers Touches

In [150]:
centers_touches_dfs = pd.read_html("https://rotogrinders.com/pages/nba-advanced-player-stats-centers-touches-201727")

In [152]:
centers_touches_stats_df = centers_touches_dfs[1]
centers_touches_stats_df.head()

Unnamed: 0,Player,Team,GP,MIN,TCH/G,Tch/Min,PossTm,Post,Paint,PPG,Pts/Tch,FPPG,FP/Tch
0,Andre Drummond,DET,4,32.9,39.5,1.2,1.0,6.0,4.8,4.2,0.43,38.55,0.98
1,Al Horford,ATL,10,32.7,65.0,1.99,1.5,4.0,3.5,1.34,0.21,31.98,0.49
2,Hassan Whiteside,MIA,10,29.1,34.4,1.18,0.7,6.2,6.0,1.2,0.35,35.25,1.02
3,Dwight Howard,HOU,5,36.0,68.8,1.91,1.7,9.8,9.2,2.64,0.19,32.83,0.48
4,Jonas Valanciunas,TOR,11,27.6,54.4,1.97,1.3,6.5,5.4,1.32,0.27,27.75,0.51


## Export all stats to CSV files

In [160]:
guards_advanced_stats_df.to_csv('guards_advanced_stats.csv', index=True)
guards_touches_stats_df.to_csv('guards_touches_stats.csv', index=True)

forwards_advanced_stats_df.to_csv('forwards_advanced_stats.csv', index=True)
forwards_touches_stats_df.to_csv('forwards_touches_stats.csv', index=True)

centers_advanced_stats_df.to_csv('centers_advanced_stats.csv', index=True)
centers_touches_stats_df.to_csv('centers_touches_stats.csv', index=True)
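
The six export calls share one pattern, so they could also be written as a loop over a dictionary that maps a file stem to its DataFrame. This is only a sketch: `export_frames` and `demo_frames` are hypothetical names, the two small DataFrames reuse values from the outputs above, and the files go to a temporary directory so the example leaves nothing behind.

```python
import os
import tempfile

import pandas as pd


def export_frames(frames, directory="."):
    """Write each DataFrame to <directory>/<name>.csv and return the paths written."""
    paths = []
    for name, frame in frames.items():
        path = os.path.join(directory, f"{name}.csv")
        frame.to_csv(path, index=True)  # same index=True setting as the cell above
        paths.append(path)
    return paths


# Stand-ins for the real DataFrames built in this notebook.
demo_frames = {
    "guards_touches_stats": pd.DataFrame(
        {"Player": ["Pablo Prigioni"], "Team": ["LAC"], "GP": [5]}
    ),
    "centers_touches_stats": pd.DataFrame(
        {"Player": ["Andre Drummond"], "Team": ["DET"], "GP": [4]}
    ),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    written = export_frames(demo_frames, tmp_dir)
    # Read one file back to confirm the round trip; index_col=0 restores the index.
    reloaded = pd.read_csv(written[0], index_col=0)

print([os.path.basename(path) for path in written])
```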