# Additional Web Scraping AHA Hockey Standings 

Building on my initial web scraping, we can scrape more of the AHA website. Each team page lists per-player statistics, which would make a great dataset for analyzing players.

### Import the packages needed to web scrape; I will be using Beautiful Soup

In [96]:
import pandas as pd
import urllib.request
from bs4 import BeautifulSoup
import re

### Choose the website we want to look at, specifically the league I currently play in

In [97]:
website = "https://www.ahahockey.com/standings/twin-cities?selected_tier=316"

### Download the HTML code and start a BeautifulSoup Object

In [98]:
response = urllib.request.urlopen(website)
the_website = response.read()
soup = BeautifulSoup(the_website, "html.parser")

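The download step above can be wrapped in a small helper. This is only a sketch: the `build_request` and `fetch_html` names, the `USER_AGENT` string, and the 10-second timeout are assumptions for illustration, not something the site is known to require.

```python
import urllib.request

# Hypothetical identifier string, chosen for this example only.
USER_AGENT = "aha-standings-notebook/0.1"


def build_request(url):
    """Wrap a URL in a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_html(url, timeout=10):
    """Download a page and return its raw bytes, closing the connection afterwards."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

With this in place, `BeautifulSoup(fetch_html(website), "html.parser")` would give the same kind of object as the cell above, and a slow server would raise an error after the timeout instead of hanging the notebook.
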
### Here we find all of the links on this page

In [99]:
links = []
for link in soup.find_all('a'):
    links.append(link.get('href'))

### The individual team pages have data on all of the players on each team. Let's go to each page and download the player stats for each team as well. We start by getting the link to each of these pages.

In [100]:
team_links = []
for x in links:
    if x is None:
        print('skip')
    elif re.search('team_results', x):
        team_links.append(x)

skip

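The same filter can be written as a list comprehension. The sketch below runs on a small hand-made list of hrefs (taken from values shown in this notebook) so it works without a network connection; the `None` entry stands in for an `<a>` tag with no `href`, which is what triggered the `skip` output above.

```python
import re

# Sample hrefs; None represents a link tag that has no href attribute.
links = [
    None,
    "/standings/twin-cities",
    "/team_results/river-pirates/75/2035",
    "/team_results/barons-b3/401/2028",
]

# Keep only non-empty hrefs that point at a team results page.
team_links = [href for href in links if href and re.search("team_results", href)]
print(team_links)
```

Checking `href` for truthiness first means `re.search` never receives `None`, so no separate `skip` branch is needed.
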

### We are able to locate every team's home page with player stats

In [101]:
team_links

['/team_results/river-pirates/75/2035',
 '/team_results/barons-b3/401/2028',
 '/team_results/battle-cats/67/2029',
 '/team_results/fighting-loons-b3/739/2030',
 '/team_results/fighting-piranhas-b3/56/2031',
 '/team_results/rack-attack-b3/47/2034',
 '/team_results/venom-b3/59/2037',
 '/team_results/spitfires-b3/395/2036',
 '/team_results/nothern-horde-b3/32/2033',
 '/team_results/monster-squad/740/2032']

### Let's clean up the team names to make them easier to read in our data

In [102]:
actual_name = []
for x in team_links:
    actual_name.append(x.split('/')[2])
actual_name

['river-pirates',
 'barons-b3',
 'battle-cats',
 'fighting-loons-b3',
 'fighting-piranhas-b3',
 'rack-attack-b3',
 'venom-b3',
 'spitfires-b3',
 'nothern-horde-b3',
 'monster-squad']

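If a more readable label is wanted, the slug can be tidied a little further, and the full page URL can be built at the same time. This is a minimal sketch; `parse_team_link`, `BASE_URL`, and the title-casing rule are illustrative choices rather than part of the original notebook.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.ahahockey.com"


def parse_team_link(link):
    """Split a '/team_results/<slug>/<id>/<id>' link into slug, display name, and full URL."""
    slug = link.split("/")[2]                      # e.g. 'river-pirates'
    display_name = slug.replace("-", " ").title()  # e.g. 'River Pirates'
    return {"slug": slug, "name": display_name, "url": urljoin(BASE_URL, link)}


team = parse_team_link("/team_results/barons-b3/401/2028")
print(team["name"], team["url"])
```

Keeping the slug alongside the display name means the DataFrame can still be joined back to the original links later.
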
### Now that we have all of the links to the individual teams we can go to each of these pages. At each page we will scrape the player specific data and put it into a dataframe.

In [104]:
Play_for = []
name = []
GP = []
Goals = []
Assists = []
Points = []
SHG = []
PPG = []
PIM = []
team_counter = 0

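for team in team_links:
    table_counter = 1
    counter = 1  # position of the current cell within a player row (1-8)
    website = "https://www.ahahockey.com" + team
    response = urllib.request.urlopen(website)
    the_website = response.read()
    soup = BeautifulSoup(the_website, "html.parser")
    tables = soup.find_all('table')
    for table in tables:
        for x in table.find_all('td'):
            # Only the first table on the page is read; for later tables
            # the counter is set to a value that matches no branch below.
            if table_counter > 1:
                counter = 100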
            if counter == 1:
                name.append(x.get_text().strip())
                Play_for.append(actual_name[team_counter])
                counter += 1
            elif counter == 2:
                GP.append(x.get_text().strip())
                counter += 1
            elif counter == 3:
                Goals.append(x.get_text().strip())
                counter += 1
            elif counter == 4:
                Assists.append(x.get_text().strip())
                counter += 1
            elif counter == 5:
                Points.append(x.get_text().strip())
                counter += 1
            elif counter == 6:
                SHG.append(x.get_text().strip())
                counter += 1
            elif counter == 7:
                PPG.append(x.get_text().strip())
                counter += 1
            elif counter == 8:
                PIM.append(x.get_text().strip())
                counter = 1
        table_counter += 1
    team_counter += 1

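The counter logic above amounts to cutting a flat sequence of cell texts into rows of eight. A compact way to express that idea is sketched below, assuming every player row really does contain exactly eight cells; `cells_to_rows` and `COLUMNS` are hypothetical names, and the sample cells are copied from the output shown later in this notebook.

```python
COLUMNS = ["name", "GP", "Goals", "Assists", "Points", "SHG", "PPG", "PIM"]


def cells_to_rows(cells, team):
    """Group a flat list of cell texts into one dict per player row."""
    width = len(COLUMNS)
    rows = []
    # Step through the list one full row at a time; a trailing partial row is dropped.
    for start in range(0, len(cells) - width + 1, width):
        row = dict(zip(COLUMNS, cells[start:start + width]))
        row["Play_for"] = team
        rows.append(row)
    return rows


cells = ["#3. J. Lee", "17", "3", "8", "11", "1", "0", "22",
         "#4. E. Sureda", "16", "11", "11", "22", "0", "1", "6"]
rows = cells_to_rows(cells, "river-pirates")
print(rows[1]["name"], rows[1]["Points"])
```

In the scraping loop, `cells` could come from `[td.get_text().strip() for td in tables[0].find_all('td')]`, and `pd.DataFrame(rows)` would then build the table directly. Unlike the counter version, this sketch discards an incomplete final row instead of leaving the lists with different lengths.
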
### Turn our Lists into a pandas DataFrame

In [105]:
Hockey_data = pd.DataFrame([Play_for, name, GP, Goals, Assists, Points, SHG, PPG, PIM])
Hockey_data = Hockey_data.transpose()
Hockey_data.columns = ['Play_for', 'name', 'GP', 'Goals', 'Assists', 'Points', 'SHG', 'PPG', 'PIM']
Hockey_data

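| | Play_for | name | GP | Goals | Assists | Points | SHG | PPG | PIM |
|---|---|---|---|---|---|---|---|---|---|
| 0 | river-pirates | #3. J. Lee | 17 | 3 | 8 | 11 | 1 | 0 | 22 |
| 1 | river-pirates | #4. E. Sureda | 16 | 11 | 11 | 22 | 0 | 1 | 6 |
| 2 | river-pirates | #5. P. Rudolph | 13 | 3 | 3 | 6 | 0 | 1 | 8 |
| 3 | river-pirates | #6. B. Herdegen | 17 | 3 | 2 | 5 | 0 | 0 | 2 |
| 4 | river-pirates | #7. J. Leiviska C | 15 | 11 | 9 | 20 | 1 | 0 | 16 |
| 5 | river-pirates | #8. J. Drewes | 15 | 6 | 6 | 12 | 0 | 2 | 0 |
| 6 | river-pirates | #9. J. Dayton | 14 | 6 | 9 | 15 | 0 | 0 | 8 |
| 7 | river-pirates | #11. M. Gansmoe | 15 | 4 | 7 | 11 | 0 | 0 | 6 |
| 8 | river-pirates | #12. N. Cornell | 15 | 8 | 4 | 12 | 1 | 1 | 2 |
| 9 | river-pirates | #16. R. Collins | 17 | 3 | 7 | 10 | 0 | 0 | 0 |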

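Because every value was scraped with `get_text()`, the stat columns hold strings rather than numbers. A possible next step is to convert them before doing any arithmetic. The sketch below uses two rows copied from the output above; the `points_per_game` column is a hypothetical addition for illustration only.

```python
import pandas as pd

# Two rows copied from the scraped output, still stored as strings.
sample = pd.DataFrame({
    "Play_for": ["river-pirates", "river-pirates"],
    "name": ["#3. J. Lee", "#4. E. Sureda"],
    "GP": ["17", "16"],
    "Goals": ["3", "11"],
    "Assists": ["8", "11"],
    "Points": ["11", "22"],
    "SHG": ["1", "0"],
    "PPG": ["0", "1"],
    "PIM": ["22", "6"],
})

stat_columns = ["GP", "Goals", "Assists", "Points", "SHG", "PPG", "PIM"]

# Convert each stat column to a numeric dtype; unparseable cells become NaN.
sample[stat_columns] = sample[stat_columns].apply(pd.to_numeric, errors="coerce")

# Hypothetical derived metric, only possible once the columns are numeric.
sample["points_per_game"] = sample["Points"] / sample["GP"]
print(sample[["name", "Points", "GP", "points_per_game"]])
```

The same two conversion lines should work on `Hockey_data` as long as its column names match `stat_columns`.
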

### With more in-depth data on teams and player scoring, we can analyze this dataset further to understand different categories of players