# Scrape NHL Additional Data
In this notebook, I show how I gather the remaining data used as input to my models.

In [1]:
import requests
import numpy as np
import pandas as pd
from bs4 import BeautifulSoup

I will start from the dataframe that I built when scraping the lines for each game.

In [2]:
df = pd.read_pickle("data/Lineups16_17.pkl")

## Scraping from https://www.hockey-reference.com
**List of data to get from each game**:

(Only those marked with * have been implemented so far)

Team Data: 
* Date *     
* Score *    
* Shots     
* OT or SO win *
* Points    
* Corsi for events 

Player Data:
* +/- *
* Corsi for events 
* Corsi against events 
* Goals *
* PP Goals *
* Assists *
* Penalty Minutes *


In [3]:
page = requests.get("https://www.hockey-reference.com/leagues/NHL_2017_games.html")
soup = BeautifulSoup(page.content, 'html.parser')

In [4]:
reg_season_tag = soup.find('table',{"id":'games'})
ind_game_tags = reg_season_tag.find('tbody').find_all('tr')

In [5]:
Ngames = len(ind_game_tags)
dates     = pd.Series([np.datetime64('2009-01-01')]*Ngames)
AwayTeams = np.empty(Ngames,dtype='U3')
HomeTeams = np.empty(Ngames,dtype='U3')
AwayScore = np.zeros(Ngames,dtype=int)
HomeScore = np.zeros(Ngames,dtype=int)
lOTwin    = np.empty(Ngames,dtype='?')
lSOwin    = np.empty(Ngames,dtype='?')

In [6]:
for i in range(Ngames):
    dates[i]     = ind_game_tags[i].find('th',{'data-stat':'date_game'}).find('a').text
    AwayTeams[i] = ind_game_tags[i].find('td',{'data-stat':'visitor_team_name'}).attrs['csk'][:3]
    HomeTeams[i] = ind_game_tags[i].find('td',{'data-stat':'home_team_name'}).attrs['csk'][:3]
    AwayScore[i] = ind_game_tags[i].find('td',{'data-stat':'visitor_goals'}).text
    HomeScore[i] = ind_game_tags[i].find('td',{'data-stat':'home_goals'}).text
    lOTwin[i]    = ind_game_tags[i].find('td',{'data-stat':'overtimes'}).text == 'OT'
    lSOwin[i]    = ind_game_tags[i].find('td',{'data-stat':'overtimes'}).text == 'SO'

Let's add 'em to the dataframe, but first I will check that the dates and teams match up.

In [7]:
print(np.all(( (dates == df['Date']).all(),
               (AwayTeams == df['Away Team']).all(),
               (HomeTeams == df['Home Team']).all() )))

True


In [8]:
df['Away Score'] = AwayScore
df['Home Score'] = HomeScore
df['OT Win'] = lOTwin
df['SO Win'] = lSOwin

In [9]:
cols = df.columns.tolist()
new_cols = cols[:4] + cols[-4:] + cols[4:-4]
df = df[new_cols]

In [10]:
df.head()

Unnamed: 0,Date,Season,Away Team,Home Team,Away Score,Home Score,OT Win,SO Win,AL1-0,AL1-1,...,HL1-2,HL2-0,HL2-1,HL2-2,HL3-0,HL3-1,HL3-2,HL4-0,HL4-1,HL4-2
0,2016-10-12,16_17,STL,CHI,5,2,False,False,Paul Stastny,Alexander Steen,...,Jonathan Toews,Artemi Panarin,Patrick Kane,Artem Anisimov,Tyler Motte,Marcus Kruger,Ryan Hartman,Jordin Tootoo,Vincent Hinostroza,Nick Schmaltz
1,2016-10-12,16_17,CGY,EDM,4,7,False,False,Kris Versteeg,Johnny Gaudreau,...,Milan Lucic,Anton Slepyshev,Benoit Pouliot,Ryan Nugent-Hopkins,Patrick Maroon,Leon Draisaitl,Jesse Puljujarvi,Tyler Pitlick,Mark Letestu,Zack Kassian
2,2016-10-12,16_17,TOR,OTT,4,5,True,False,Milan Michalek,Leo Komarov,...,Mark Stone,Zack Smith,Bobby Ryan,Derick Brassard,Tom Pyatt,Jean-Gabriel Pageau,Philip Varone,Ryan Dzingel,Chris Neil,Chris Kelly
3,2016-10-12,16_17,LAK,SJS,1,2,False,False,Anze Kopitar,Dustin Brown,...,Tomas Hertl,Logan Couture,Mikkel Boedker,Joonas Donskoi,Joel Ward,Chris Tierney,Patrick Marleau,Tommy Wingels,Matt Nieto,Melker Karlsson
4,2016-10-13,16_17,MTL,BUF,4,1,False,False,Alex Galchenyuk,Brendan Gallagher,...,Evander Kane,Marcus Foligno,Johan Larsson,Tyler Ennis,Zemgus Girgensons,Hudson Fasching,Matt Moulson,Nicolas Deslauriers,Brian Gionta,Derek Grant


Before I grab all of the data, I need to build the appropriate columns in the dataframe.

Columns to add:
* Number of Goals, by position (e.g. ``Goals AL1-0``, ``Goals HL2-1``, etc.).
* Number of PP Goals, by position.
* Number of Assists, by position.
* Number of PP Assists, by position.
* Number of Shots, by position.
* +/- for each position.
* Penalty minutes by position.
* Lineup Error, by lines (i.e. fantasy sites may have errors for lineups, & for now I will just mark the lines with errors)

In [11]:
for loc in ("A","H"):
    for line in range(1,5):
        for pos in range(3):
            df['Goals {0}L{1}-{2}'.format(loc,line,pos)] = 0
            df['PP Goals {0}L{1}-{2}'.format(loc,line,pos)] = 0
            df['Assists {0}L{1}-{2}'.format(loc,line,pos)] = 0
            df['PP Assists {0}L{1}-{2}'.format(loc,line,pos)] = 0
            df['Shots {0}L{1}-{2}'.format(loc,line,pos)] = 0
            df['+/- {0}L{1}-{2}'.format(loc,line,pos)] = 0
            df['PM {0}L{1}-{2}'.format(loc,line,pos)] = 0
        df['Lineup Error {0}L{1}'.format(loc,line)] = False
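
As a sanity check on the naming scheme, here is a small sketch that enumerates the same column names without touching the dataframe. The names ``STATS`` and ``stat_columns`` are made up for this illustration and are not part of the notebook.

```python
from itertools import product

# Stat prefixes used in the column names above
STATS = ("Goals", "PP Goals", "Assists", "PP Assists", "Shots", "+/-", "PM")

def stat_columns():
    """Return (per-position stat columns, per-line error columns)."""
    # Positions look like AL1-0 ... HL4-2: location, line number, slot on the line
    positions = ["{0}L{1}-{2}".format(loc, line, pos)
                 for loc, line, pos in product("AH", range(1, 5), range(3))]
    stat_cols = ["{0} {1}".format(stat, p) for p in positions for stat in STATS]
    error_cols = ["Lineup Error {0}L{1}".format(loc, line)
                  for loc, line in product("AH", range(1, 5))]
    return stat_cols, error_cols

stat_cols, error_cols = stat_columns()
print(len(stat_cols), len(error_cols))  # 168 8
```

That is 176 new columns in total (24 positions times 7 stats, plus 8 line-level error flags).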

## The Scraping
Now, let's scrape data from https://www.hockey-reference.com for each individual game. The url for each game appears to follow the pattern ``https://www.hockey-reference.com/boxscores/YYYYMMDDNTeamAbbrev.html``, where ``YYYYMMDD`` is the date, ``N`` seems to always be 0 (maybe it has to do with the number of games by the home team that day?), and ``TeamAbbrev`` is the abbreviation of the home team.
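
A minimal sketch of that url pattern, assuming ``N`` really is always 0. The helper name ``boxscore_url`` is hypothetical and only used here.

```python
from datetime import date

BOXSCORE_URL = "https://www.hockey-reference.com/boxscores/{0}{1}{2}.html"

def boxscore_url(game_date, home_team, n=0):
    """Build the box score url from a date and the home team's abbreviation."""
    # n is the digit that has always been 0 so far (an assumption, see above)
    return BOXSCORE_URL.format(game_date.strftime("%Y%m%d"), n, home_team)

print(boxscore_url(date(2016, 10, 12), "CHI"))
# https://www.hockey-reference.com/boxscores/201610120CHI.html
```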

First, I write a function ``add_stats`` to add all of the stats to a particular position, given a tag of the appropriate player's row (mostly to make things easier to read). Then, I have to account for some names that are not the same between different sites (e.g. Alex Steen vs. Alexander Steen). I do this with a ``check_name`` function. Finally, I run the big loop.

In [12]:
def add_stats(skater,df,i,loc,line,pos):
    """
    Add the stats of:
    Goals, PP Goals, Assists, PP Assists, Shots, +/-, and Penalty Minutes
    to the dataframe using the html tag containing the row on hockey-reference.com 
    with the appropriate data.
    INPUT:
        skater: html tag of row in table with all of the players stats
        df: Dataframe to add data to
        i: index/row of dataframe to use
        loc: Team location (Home or Away)
        line: Line number [1-4]
        pos: index/position on line [0-2]
    OUTPUT:
        None (Just alters dataframe)
    """
    # Map each dataframe column prefix to the 'data-stat' attribute on the page
    stat_map = (('Goals', 'goals'), ('PP Goals', 'goals_pp'), ('Assists', 'assists'),
                ('PP Assists', 'assists_pp'), ('Shots', 'shots'),
                ('+/-', 'plus_minus'), ('PM', 'pen_min'))
    for prefix, data_stat in stat_map:
        col = '{0} {1}L{2}-{3}'.format(prefix, loc, line, pos)
        # Write through the dataframe itself (not a chained df[col].iat[i]) so the
        # update is not lost on a temporary copy; the cell text is cast to int.
        df.iat[i, df.columns.get_loc(col)] = \
            int(skater.find('td', {'data-stat': data_stat}).text)

In [13]:
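def check_name(name1,name2):
    """
    Return True if name1 and name2 refer to the same player, allowing for
    known spelling/nickname differences between sites (e.g. Alex vs. Alexander).
    """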
    retval = False
    #print("|{0}|{1}|".format(name1,name2))
    if name1.lower() == name2.lower():
        retval = True
    else:
        n1 = name1.split()
        n2 = name2.split()
        # Alex = Alexander
        if (n1[1] == n2[1] and 
            ((n1[0] == 'Alex' and n2[0] == 'Alexander') or
            (n1[0] == 'Alexander' and n2[0] == 'Alex'))):
            retval = True
        # I think someone mixed up Micheal (Micheal Ferland) for Michael
        if ((not retval) and (n1[1] == n2[1]) and
            ((n1[0] == 'Michael' and n2[0] == 'Micheal') or
            (n1[0] == 'Micheal' and n2[0] == 'Michael'))):
            retval = True
        # Mitch = Mitchell
        if ((not retval) and (n1[1] == n2[1]) and
            ((n1[0] == 'Mitch' and n2[0] == 'Mitchell') or
            (n1[0] == 'Mitchell' and n2[0] == 'Mitch'))):
            retval = True
        # Phil = Philip = Phillip
        if ((not retval) and (n1[1] == n2[1]) and 
            ((n1[0] == 'Phil' and n2[0] == 'Philip') or
            (n1[0] == 'Philip' and n2[0] == 'Phil')  or
            (n1[0] == 'Phillip' and n2[0] == 'Phil') or
            (n1[0] == 'Phil' and n2[0] == 'Phillip'))):
            retval = True
        # Pierre-Alexandre Parenteau = P.A. Parenteau
        if ((not retval) and (n1[1] == n2[1]) and 
            ((n1[0] == 'Pierre-Alexandre' and n2[0] == 'P.A.') or
            (n1[0] == 'P.A.' and n2[0] == 'Pierre-Alexandre'))):
            retval = True
        # Jon = Jonathan
        if ((not retval) and (n1[1] == n2[1]) and 
            ((n1[0] == 'Jon' and n2[0] == 'Jonathan') or
            (n1[0] == 'Jonathan' and n2[0] == 'Jon'))):
            retval = True
        # Zach = Zachary
        if ((not retval) and (n1[1] == n2[1]) and 
            ((n1[0] == 'Zach' and n2[0] == 'Zachary') or
            (n1[0] == 'Zachary' and n2[0] == 'Zach'))):
            retval = True
        # Vinnie = Vincent
        if ((not retval) and (n1[1] == n2[1]) and
            ((n1[0] == 'Vinnie' and n2[0] == 'Vincent') or
             (n1[0] == 'Vincent' and n2[0] == 'Vinnie'))):
            retval = True
        # Joel Eriksson Ek = Joel Eriksson-Ek
        if ((not retval) and
            ((name1 == 'Joel Eriksson Ek' and name2 == 'Joel Eriksson-Ek') or
             (name1 == 'Joel Eriksson-Ek' and name2 == 'Joel Eriksson Ek') )):
            retval = True
        # Chris = Christopher
        if ((not retval) and (n1[1] == n2[1]) and 
            ((n1[0] == 'Chris' and n2[0] == 'Christopher') or
             (n1[0] == 'Christopher' and n2[0] == 'Chris'))):
            retval = True
        # Matt = Matthew
        if ((not retval) and (n1[1] == n2[1]) and 
            ((n1[0] == 'Matt' and n2[0] == 'Matthew') or
             (n1[0] == 'Matthew' and n2[0] == 'Matt'))):
            retval = True
        # Jt = J.T. 
        if ((not retval) and (n1[1] == n2[1]) and 
            ((n1[0] == 'J.T.' and n2[0] == 'Jt') or
             (n1[0] == 'Jt' and n2[0] == 'J.T.'))):
            retval = True
        # Tj = T.J. 
        if ((not retval) and (n1[1] == n2[1]) and 
            ((n1[0] == 'Tj' and n2[0] == 'T.J.') or
             (n1[0] == 'T.J.' and n2[0] == 'Tj'))):
            retval = True
        # Aj = A.J.
        if ((not retval) and (n1[1] == n2[1]) and 
            ((n1[0] == 'Aj' and n2[0] == 'A.J.') or
             (n1[0] == 'A.J.' and n2[0] == 'Aj'))):
            retval = True
        #if ( (not retval) and (n2[1] == n1[1])):
            #print("Last Name Same:{0}|{1}".format(name1,name2))
    return retval
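
The long chain of ``if`` blocks could also be written as a lookup table. The sketch below is one possible alternative, not the function used in this notebook. ``NAME_GROUPS``, ``ALIAS`` and ``same_player`` are hypothetical names, and the behavior is slightly different: it ignores case everywhere, compares the whole surname, treats hyphens in surnames as spaces, and also matches Philip with Phillip.

```python
# Groups of first-name spellings treated as the same person
# (taken from the comments in check_name).
NAME_GROUPS = [
    {"alex", "alexander"},
    {"michael", "micheal"},
    {"mitch", "mitchell"},
    {"phil", "philip", "phillip"},
    {"pierre-alexandre", "p.a."},
    {"jon", "jonathan"},
    {"zach", "zachary"},
    {"vinnie", "vincent"},
    {"chris", "christopher"},
    {"matt", "matthew"},
    {"jt", "j.t."},
    {"tj", "t.j."},
    {"aj", "a.j."},
]
# Map every spelling to the index of its group
ALIAS = {name: i for i, group in enumerate(NAME_GROUPS) for name in group}

def same_player(name1, name2):
    """Return True if the two names plausibly refer to the same player."""
    first1, _, rest1 = name1.lower().partition(" ")
    first2, _, rest2 = name2.lower().partition(" ")
    # Hyphens in surnames vary between sites (Eriksson Ek vs. Eriksson-Ek)
    if rest1.replace("-", " ") != rest2.replace("-", " "):
        return False
    if first1 == first2:
        return True
    group1 = ALIAS.get(first1)
    return group1 is not None and group1 == ALIAS.get(first2)

print(same_player("Alex Steen", "Alexander Steen"))  # True
print(same_player("Chris Neil", "Chris Kelly"))      # False
```

Adding a new nickname is then a one-line change to ``NAME_GROUPS``.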

In [14]:
for i in range(len(df)):
    date = df['Date'].iat[i]
    date = date.strftime('%Y%m%d')
    HomeTeam = df['Home Team'].iat[i]
    #print("date = {0}".format(date))
    #print("Home Team = {0}".format(HomeTeam))
    try:
        soup = BeautifulSoup(open("data/hockey-ref-boxscore{0}{1}{2}.html".format(date,0,HomeTeam)), "html.parser")
    except FileNotFoundError:
        print("https://www.hockey-reference.com/boxscores/{0}{1}{2}.html".format(date,0,HomeTeam))
        page = requests.get("https://www.hockey-reference.com/boxscores/{0}{1}{2}.html".format(date,0,HomeTeam))
        soup = BeautifulSoup(page.content, 'html.parser')
        with open("data/hockey-ref-boxscore{0}{1}{2}.html".format(date,0,HomeTeam), "w") as f:
            f.write(str(soup))
    for loc in ("A","H"):
        Num_not_found = 0
        lines_not_found = []
        if loc == "A":
            team = df['Away Team'].iat[i]
        else:
            team = HomeTeam
        table_tag = soup.find('table',{'id':"{0}_skaters".format(team)}).find('tbody')
        all_skaters = table_tag.find_all('tr')
        # Loop through positions on dataframe (to ensure all are found)
        for line in range(1,5):
            for pos in range(3):
                lfound = False
                for skater in all_skaters:
                    # Ignore any line positions without a player (but don't make that game an error)
                    if df['{0}L{1}-{2}'.format(loc,line,pos)].iat[i] == "None None":
                        lfound = True
                        break
                    if check_name(skater.find('td',{'data-stat':'player'}).find('a').text,
                                  df['{0}L{1}-{2}'.format(loc,line,pos)].iat[i]):      
                        add_stats(skater,df,i,loc,line,pos)
                        lfound = True
                        break
                if not lfound:
                    #print("Can't find {0} ({1}) in game data of {2} vs {3} on {4}".format(
                            #df['{0}L{1}-{2}'.format(loc,line,pos)].iat[i],loc,
                            #df['Away Team'].iat[i],df['Home Team'].iat[i],df['Date'].iat[i]))
                    # Set Lineup Error = True
                    # (set through the dataframe so the write is not lost on a copy)
                    df.iat[i, df.columns.get_loc('Lineup Error {0}L{1}'.format(loc,line))] = True
                    Num_not_found += 1
                    lines_not_found.append('L{0}-{1}'.format(line,pos))
                    #if Num_not_found > 1:
                        #print("Number not found > 1 for {0} Team".format(loc))
                        #print("Lines with player mistakes = {0}".format(','.join(lines_not_found)))
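
The loop above only hits the website when a game's page has not been saved locally yet. A minimal sketch of that read-from-disk-or-download pattern, pulled out into a helper. ``load_cached`` and ``fake_fetch`` are hypothetical names, and the downloader is a stand-in so the example runs without network access.

```python
import tempfile
from pathlib import Path

def load_cached(path, fetch):
    """Return the text stored at path; call fetch() and save the result if missing."""
    path = Path(path)
    if path.exists():
        return path.read_text(encoding="utf-8")
    text = fetch()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return text

# Stand-in for a real download; it records how often it is called
calls = []
def fake_fetch():
    calls.append(1)
    return "<html>boxscore</html>"

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "hockey-ref-boxscore201610120CHI.html"
    first = load_cached(target, fake_fetch)   # downloads and saves
    second = load_cached(target, fake_fetch)  # served from disk

print(first == second, len(calls))  # True 1
```

In the notebook, ``fetch`` would be something like ``lambda: requests.get(url).text``, so repeated runs do not re-request pages that are already on disk.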
    
    

### Save Point!
This looks like a good place for a save point! The data does have a number of mistakes in it (see below for discussion about the mistakes).

In [15]:
print(df.columns)

Index(['Date', 'Season', 'Away Team', 'Home Team', 'Away Score', 'Home Score',
       'OT Win', 'SO Win', 'AL1-0', 'AL1-1',
       ...
       '+/- HL4-1', 'PM HL4-1', 'Goals HL4-2', 'PP Goals HL4-2',
       'Assists HL4-2', 'PP Assists HL4-2', 'Shots HL4-2', '+/- HL4-2',
       'PM HL4-2', 'Lineup Error HL4'],
      dtype='object', length=208)


In [16]:
df.to_pickle("data/FullData_Messy.pkl")

## Handling Some Messy Data 
### (In Progress)
Apart from fixing some of the names, I have not tried to handle any lineup mistakes. These seem to result from differences between the lineups on rotogrinders.com and hockey-reference.com. It looks like rotogrinders makes more of the mistakes (e.g. listing players that did not play in the game). To fix this I will use the actual nhl.com/gamecenter data. I didn't use it before because it is very similar to hockey-reference.com, and hockey-reference is a MUCH simpler site to scrape. For the nhl.com/stats site, I will have to use a different scraping tool (I am pretty sure javascript generates the table that I need). However, the nhl.com/gamecenter data splits its table into defensemen and forwards. If there is only one missing/erroneous player per team in each game, we could replace him with the correct one (and assume that the other lines stayed the same).
I may also check other fantasy sites, if I can find them.
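
The replacement rule described above can be sketched as a small pure function. This is only an illustration: ``plan_fix`` is a hypothetical helper, the names are compared with plain equality (the real comparison would need something like ``check_name``), and the player names in the usage example are placeholders.

```python
def plan_fix(lineup, dressed):
    """
    lineup:  dict mapping a position column (e.g. 'AL1-2') to the listed name
    dressed: list of forwards who actually played, from the official box score
    Returns a dict of position -> replacement name, or {} if the fix is ambiguous.
    """
    dressed_set = set(dressed)
    listed = set(lineup.values())
    # Listed players who did not actually play
    wrong_slots = [slot for slot, name in lineup.items()
                   if name != "None None" and name not in dressed_set]
    # Players who played but are not listed anywhere
    missing = [name for name in dressed if name not in listed]
    if len(wrong_slots) == 1 and len(missing) == 1:
        # Exactly one swap is possible, so assume it is the right one
        return {wrong_slots[0]: missing[0]}
    if wrong_slots and not missing:
        # The team dressed fewer forwards than listed
        return {slot: "None None" for slot in wrong_slots}
    return {}

lineup = {"AL1-0": "Player A", "AL1-1": "Player B", "AL1-2": "Player X"}
print(plan_fix(lineup, ["Player A", "Player B", "Player C"]))
# {'AL1-2': 'Player C'}
```

With two or more wrong and missing players there is no way to tell who replaced whom, so the sketch returns an empty dict and leaves the lines flagged.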

It turns out that I will also need to convert NHL team abbreviations into their names. Fun. So here is the dictionary to do that.

In [17]:
ab2name = {"ANA" : "Anaheim Ducks",
    "ARI" : "Arizona Coyotes",
    "BOS" : "Boston Bruins",
    "BUF" : "Buffalo Sabres",
    "CAR" : "Carolina Hurricanes",
    "CGY" : "Calgary Flames",
    "CHI" : "Chicago Blackhawks",
    "CBJ" : "Columbus Blue Jackets",
    "COL" : "Colorado Avalanche",
    "DAL" : "Dallas Stars",
    "DET" : "Detroit Red Wings",
    "EDM" : "Edmonton Oilers",
    "FLA" : "Florida Panthers",
    "LAK" : "Los Angeles Kings",
    "MIN" : "Minnesota Wild",
    "MTL" : "Montreal Canadiens",
    "NSH" : "Nashville Predators",
    "NJD" : "New Jersey Devils",
    "NYI" : "New York Islanders",
    "NYR" : "New York Rangers",
    "OTT" : "Ottawa Senators",
    "PHI" : "Philadelphia Flyers",
    "PHX" : "Phoenix Coyotes",
    "PIT" : "Pittsburgh Penguins",
    "SJS" : "San Jose Sharks",
    "STL" : "St. Louis Blues",
    "TBL" : "Tampa Bay Lightning",
    "TOR" : "Toronto Maple Leafs",
    "VAN" : "Vancouver Canucks",
    "VGK" : "Vegas Golden Knights",
    "WPG" : "Winnipeg Jets",
    "WSH" : "Washington Capitals"}

I also need to load in ``selenium`` to get the full html output of the nhl.com/stats page that I need. I will run a headless Firefox driver. (Note: it takes a bit of setup to actually get this running, but I am not going to dive into explaining it now.)

In [18]:
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
options = Options()
options.add_argument("--headless")
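# Newer selenium releases take the options through the `options` keyword
# (the older `firefox_options` keyword was deprecated and later removed)
driver = webdriver.Firefox(options=options)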


I can find all of the mistakes by the "Lineup Error" column.

In [19]:
n = 0
for loc in ("A","H"):
    for line in range(1,2):
        if n == 0:
            val = df["Lineup Error {0}L{1}".format(loc,line)] 
        else:
            val = val | df["Lineup Error {0}L{1}".format(loc,line)]
        n += 1
print(len(np.where(val == True)[0]))


16


The url for the actual stats for each game has a somewhat arbitrary numbering system in it. It has to do with the number of the game in the season and the type of game (preseason, regular, and playoffs I believe). It would be very complicated to attempt to work it out (and would be difficult with many games occurring on the same day). To avoid this, I will search for the game from the nhl.com/stats website (see url in code below), and pull the appropriate gamecenter url from that page.
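
A minimal sketch of the last step, assuming the game's link on the search page ends in the gamecenter number. The ``href`` value and the id ``0000000000`` are placeholders for illustration, not real values, and ``gamecenter_url`` is a hypothetical helper.

```python
from datetime import date

GC_URL = ("https://www.nhl.com/gamecenter/{0}-vs-{1}/{2}/{3}"
          "#game={3},game_state=final,game_tab=stats")

def gamecenter_url(away, home, game_date, href):
    """Build the gamecenter url from the teams, the date, and the game's link."""
    # The gamecenter number is taken to be the last path segment of the link
    gc_num = href.rstrip("/").split("/")[-1]
    return GC_URL.format(away.lower(), home.lower(),
                         game_date.strftime("%Y/%m/%d"), gc_num)

# Placeholder link; the real one comes from the scraped search page
url = gamecenter_url("STL", "CHI", date(2016, 10, 12), "/gamecenter/0000000000")
print(url)
```

Note that the date is formatted from a date object here, so the same value can be rendered as ``YYYY-MM-DD`` for the search url and ``YYYY/MM/DD`` for the gamecenter url.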


Then, after some more complicated scraping, I attempt to determine situations where I can actually know the lines, and I replace the player in the dataframe.


Sorry, this is a beast...

In [20]:
# Url to find the correct gamecenter url ({0} = date in YYYY-MM-DD)
nhl_base_url = "http://www.nhl.com/stats/team?reportType=game&dateFrom={0}&dateTo={0}&gameType=2"
# Specific Gamecenter url ( {0} = lowercase away abbrev., {1} = lowercase home abbrev.,
#                           {2} = date in YYYY/MM/DD, {3} = gamecenter id number)
nhl_gc_base_url = "https://www.nhl.com/gamecenter/{0}-vs-{1}/{2}/{3}#game={3},game_state=final,game_tab=stats"   
ind_err = np.where(df["Lineup Error"] == True)[0]
num_check = 0
num_fix = 0
for i in ind_err:
    # get date and team names from the dataframe
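    # (the date is formatted as a YYYY-MM-DD string for the urls and file names below)
    date  = df['Date'].iat[i].strftime('%Y-%m-%d')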
    ATeam = df['Away Team'].iat[i]
    HTeam = df['Home Team'].iat[i]
    print(i,date,ATeam,HTeam)
    # Scrape the nhl.com/stats site to find the gamecenter webpage
    try:
        soup = BeautifulSoup(open("data/nhl_stats_search{0}.html".format(date)), "html.parser")
    except FileNotFoundError:
        print(nhl_base_url.format(date))
        driver.get(nhl_base_url.format(date))
        content = driver.page_source
        soup = BeautifulSoup(content, "html.parser")
        with open("data/nhl_stats_search{0}.html".format(date), "w") as f:
            f.write(str(soup))
    rows = soup.find('div',{'class':"rt-tbody"}).find_all('div',{'class':"rt-tr-group"}) 
    for row in rows:
        # The columns are only organized by index
        entries = row.find_all('div',{'class':'rt-td'})
        Team1Name   = entries[1].text
        Team2Abbrev = entries[2].text[-3:]
        if ( (ab2name[ATeam] == Team1Name and HTeam == Team2Abbrev) or
             (ab2name[HTeam] == Team1Name and ATeam == Team2Abbrev) ):
            # Found the correct game!
            # Extract the gamecenter number
            gc_num = entries[2].find('a').attrs["href"].split('/')[-1]
            # Build the gamecenter url
            gc_url = nhl_gc_base_url.format(ATeam.lower(),HTeam.lower(),date.replace('-','/'),gc_num)
            break
    # Now scrape the gamecenter data and compare the forwards there and in the dataframe
    try:
        soup = BeautifulSoup(open("data/nhl_gamecenter{0}vs{1}on{2}.html".format(ATeam,HTeam,date)), "html.parser")  
    except FileNotFoundError:
        print(gc_url)
        driver.get(gc_url)
        content = driver.page_source
        soup = BeautifulSoup(content, "html.parser")
        with open("data/nhl_gamecenter{0}vs{1}on{2}.html".format(ATeam,HTeam,date), "w") as f:
            f.write(str(soup))
    lfixed = False
    for loc in ("A","H"):
        if loc == "A":
            roster = soup.find('div',{"class":"away"})
        else:
            roster = soup.find('div',{"class":"home"})
        forwards = roster.find_all("table",{"data-position":"skaters"})[1]
        
        

        # Make sure it is the forwards
        thead = forwards.find("thead")
        assert thead.find_all('th')[1].find("span").text == "Forwards"

        # Get list
        allf  = forwards.find("tbody").find_all("tr")

        ## Find which players are in here and in dataframe ##
        # First make a dictionary to save which column (position) in the dataframe is also on nhl.com
        df_on_nhl = {}
        for line in range(1,5):
            for pos in range(3):
                df_on_nhl["{0}L{1}-{2}".format(loc,line,pos)] = False
        # Also make an empty list to find which (if any) forwards are on nhl.com but not in df
        not_in_df = []
        ls = True
        for f in allf:
            # Get name (from player url)
            player_url = f.find_all("span")[1].find("a").attrs["href"]
            split_name = player_url.split("/")[-1].split("-")[0:-1]
            nhl_name  = " ".join(split_name).title()
            # In dataframe?
            lfound = False 
            for line in range(1,5):
                for pos in range(3):
                    # Check name
                    df_name = df["{0}L{1}-{2}".format(loc,line,pos)].iat[i]
                    #print(df_name,nhl_name)
                    if (check_name(nhl_name,df_name)):
                        df_on_nhl["{0}L{1}-{2}".format(loc,line,pos)] = True
                        lfound = True
                        break
                if lfound: 
                    break
            if not lfound:
                not_in_df.append(nhl_name)
        # After looping, report any forwards that are on nhl.com but not in the dataframe
        for val in not_in_df:
            print(val,"not on dataframe ({0} Team)".format(loc))
        Num_only_nhl = 0
        for df_loc, tf in df_on_nhl.items():
            if not tf:
                print(df_loc,"not on nhl")
                print(df[df_loc].iat[i],"not on nhl ({0} Team)".format(loc))
                # Save last column on dataframe where there is a mistake
                # If there is only one, this player will be replaced
                df_loc_sav = df_loc
                Num_only_nhl += 1
        # Make the replacement if there is only one missing player
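        # (and only one dataframe entry is wrong, so df_loc_sav is defined and unambiguous)
        if len(not_in_df) == 1 and Num_only_nhl == 1: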
            print("Changing df['{0}'].iat[{1}] from {2} to {3} ({4} Team)".format(
                  df_loc_sav,i,df[df_loc_sav].iat[i],not_in_df[0],loc))
            #df[df_loc_sav].iat[i] = not_in_df[0][0]+" "+not_in_df[0][1]
            lfixed = True
        elif len(not_in_df) == 0 and Num_only_nhl > 0:
            # This means they dressed fewer forwards
            for df_loc, tf in df_on_nhl.items():
                if not tf:
                    print("Changing df['{0}'].iat[{1}] from {2} to {3} ({4} Team)".format(
                  df_loc,i,df[df_loc].iat[i],"None None",loc))
                    #df[df_loc].iat[i] = "None None"
                    lfixed = True
        print()
    if lfixed:
        num_fix += 1
    num_check += 1
    print("{0}/{1}".format(num_fix,num_check))

KeyError: 'Lineup Error'
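
The ``KeyError`` above is raised because the dataframe has no single ``Lineup Error`` column; the flags were created per line (``Lineup Error AL1`` through ``Lineup Error HL4``). A minimal sketch of one way to build the index of games with any flagged line, shown on a made-up toy frame rather than the real data:

```python
import numpy as np
import pandas as pd

# Toy frame with a few per-line flag columns (values are made up)
toy = pd.DataFrame({
    "Lineup Error AL1": [False, True, False],
    "Lineup Error AL2": [False, False, False],
    "Lineup Error HL1": [False, False, False],
    "Lineup Error HL2": [False, True, True],
})

# Collect every per-line flag column and OR them together row by row
err_cols = [c for c in toy.columns if c.startswith("Lineup Error")]
any_error = toy[err_cols].any(axis=1)

# Integer positions of the games with at least one flagged line
ind_err = np.where(any_error)[0]
print(ind_err.tolist())  # [1, 2]
```

Using the same idea on ``df`` in place of ``df["Lineup Error"]`` should give the rows that the big loop needs to visit.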

In [None]:
for loc in ("A","H"):
    for line in range(1,5):
        for pos in range(3):
            print(df["{0}L{1}-{2}".format(loc,line,pos)].iat[61])

In [None]:
lA = np.empty(len(df.columns),dtype='?')
i = 0
for c in df.columns:
    lA[i] = (c[:2] == "AL")
    i +=1
print(lA)
print(df.columns[np.where(lA)])

In [None]:
for loc in ("A","H"):
    for line in range(1,5):
        for pos in range(3):
            print("{0}L{1}-{2}".format(loc,line,pos),df["{0}L{1}-{2}".format(loc,line,pos)].iat[145])