## Problem Background

Many introductory problems in this course can seem impractical because they have little connection to everyday life. Here, we try to show how coding with real-world data can be fun! In many data-focused jobs and applications, CSV files are used to store large matrices and data sets. In these files, each column generally represents one piece of data. Here, we are looking at data from the National Hockey League (NHL) that contains the past head-to-head (H2H) records of teams since the beginning of the league (with modified names). But we are not interested in just ANY team: we are interested in the best hockey team of all time, the Toronto Maple Leafs. Your goal is to read the data file containing the Leafs' H2H record against other teams and build an application that lets you search for a team and report its head-to-head record, along with the other information about these matchups contained in the CSV file.
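
Before opening the real file, here is a minimal sketch of what `csv.reader` produces. The sample text, its header names, and its numbers are made up for illustration; the real `TML_WinLoss.csv` may use different headers and values. The only thing borrowed from this notebook is the column order (team name in column 1, then games played, wins, losses, ties, overtime losses).

```python
import csv
import io

# Hypothetical sample data; NOT taken from TML_WinLoss.csv.
SAMPLE_CSV = """Rank,Team,GP,W,L,T,OL
1,Example City Owls,10,6,3,1,0
2,Sample Town Bears,8,2,5,0,1
"""

def read_rows(text):
    """Parse CSV text into a list of rows, each row a list of strings."""
    return list(csv.reader(io.StringIO(text)))

rows = read_rows(SAMPLE_CSV)
header, records = rows[0], rows[1:]   # first row is the header

print(header)
for record in records:
    # Every value comes back as a string, so convert before doing arithmetic.
    print(record[1], "games played:", int(record[2]))
```

Note that every field is read as a string, which is why the sketch calls `int()` before treating games played as a number.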

Let's read in the file:

In [None]:
import csv 

nhl_data = []
with open('TML_WinLoss.csv', 'r') as csvfile:
    nhl_stats_reader = csv.reader(csvfile)

    for row in nhl_stats_reader:
        nhl_data.append(row)

for row in nhl_data:
    print(row)

In [None]:
type(nhl_data)

With all this data, we want to be able to search for a team in this database of teams. Which data structure have we learned that might be best for this?

In [None]:
# Get the teams from the user and store them in a set
team_set = set()

team = input("Enter a team name (type exit to end): ")
while team != "exit":
    team_set.add(team)
    team = input("Enter a team name (type exit to end): ")

print(team_set)

One awkward thing about this code is that we have to write the user prompt twice, so if we want to change it, we need to remember to change it in both places. We can get around this by storing the prompt in a variable.

In [None]:
# Get the teams from the user and store them in a set
team_set = set()

prompt = "Enter a team name (type exit to end):  "

team = input(prompt)
while team != "exit":
    team_set.add(team)
    team = input(prompt)

print(team_set)

The question initially asked us to search for a team and print its H2H information against the Leafs, along with some other information. Let's write code that reports the wins, losses, ties, overtime losses, and games played. If the two teams have never played each other, print "No games played".

A pseudo-code plan might look like:

In [None]:
for team_name in team_set:
    # search for team_name in NHL data
    # if found - print out wins, losses, games played
    # else - print out "No games played"
    pass  # placeholder so this outline runs without a syntax error

It might be wise to have a function that prints everything for us nicely, since we are accessing so much data from the CSV file.

In [None]:
# Works if we know the index (row number) of the team in the CSV
def print_headtohead(data, index):
    '''
    (list of lists, int) -> None
    Print the head-to-head stats stored in row <index> of data.
    '''
    # display showing team name, games played, wins, losses, ties, overtime losses
    row = data[index]
    print("Team Name:", row[1], "Games played:", row[2],
          "Wins:", row[3], "Losses:", row[4],
          "Ties: ", row[5], "Overtime Losses: ", row[6])


# BREAKOUT SESSION

Generalize the previous function so you only need to know the name of the team instead of needing to know the index.

The function should now take in a list of lists and the string representing a team name instead of its index, without changing our print statement.


In [None]:
def print_headtohead(data, team):
    '''
    (list of lists, str) -> None
    '''
    
    ...
    
    

Test case below:

In [None]:
print_headtohead(nhl_data, "Winnipeg Jets")
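
One possible shape for the name-based version is sketched below. It is not the only answer to the breakout exercise. The helper name `find_team_index`, the function name `print_headtohead_by_name`, and the sample rows are all hypothetical; the sketch assumes the team name sits in column 1, as in the print statement above.

```python
# Hypothetical rows in the column order used by the notebook's print statement:
# [first column, team name, games played, wins, losses, ties, overtime losses]
sample_data = [
    ["Rank", "Team", "GP", "W", "L", "T", "OL"],
    ["1", "Example City Owls", "10", "6", "3", "1", "0"],
    ["2", "Sample Town Bears", "8", "2", "5", "0", "1"],
]

def find_team_index(data, team):
    """(list of lists, str) -> int or None
    Return the index of the row whose team-name column equals team,
    or None if no row matches.
    """
    for index, row in enumerate(data):
        if row[1] == team:
            return index
    return None

def print_headtohead_by_name(data, team):
    """(list of lists, str) -> None
    Look up team by name, then print its row with the same print statement.
    """
    index = find_team_index(data, team)
    if index is None:
        print("No games played")
    else:
        print("Team Name:", data[index][1], "Games played:", data[index][2],
              "Wins:", data[index][3], "Losses:", data[index][4],
              "Ties: ", data[index][5], "Overtime Losses: ", data[index][6])

print_headtohead_by_name(sample_data, "Sample Town Bears")
print_headtohead_by_name(sample_data, "Nowhere Gulls")
```

Splitting the search from the printing keeps the original print statement unchanged and gives the "No games played" case a natural home.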

# BREAKOUT SESSION ROUND 2

Now generalize it even further, so the function can take in an entire set of team names, not just one. It should print the head-to-head data for each team in the set.

In [None]:
def print_headtohead(data, team_set):
    '''
    (list of lists, set) -> None
    '''
    
    ...
    
    

Test case below:

In [None]:
print_headtohead(nhl_data, {"Winnipeg Jets", "Anaheim Ducks"})

In [None]:
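# Uses the index-based version: print_headtohead(data, index)
for team in team_set:
    # enumerate keeps the index in step with the row, even after a match
    for index, row in enumerate(nhl_data):
        if team in row:
            print_headtohead(nhl_data, index)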

In [None]:
team = list(team_set)[0]
print_headtohead(nhl_data,1)

So, how can we search through the set? One way might be:

In [None]:
'''
for team in team_set:
    print("Head to Head for: ", team)
    for lst in nhl_data:
        if team in lst:
            print_headtohead(nhl_data, team_set)
        else:
            print("No Head to Head for: ", lst[1])
'''

In [None]:
#alternate solution for looping through the set and 
#using a print_headtohead version that takes in one string at a time

nhl_team_index = 1
for team in team_set:
    print("Head to Head for: ", team)
    found_team = False        # this is the "flag"
    for lst in nhl_data:
        if lst[nhl_team_index] == team:
            print_headtohead(nhl_data, lst[nhl_team_index])
            found_team = True

    if not found_team:   # the flag was never set, so we didn't find the team
        print("No Head to Head for: ", team)


Since problems often have multiple solutions, let's think of another data structure we can use:

# Breakout Session 3

For fun, let's convert our list of lists to a dictionary.

In [None]:
# Data Structure Design #2. Store the H2H data in a dict
# key is team name : the value will be a list of stats

def nhl_to_dict(nhl_list):
    """ (list of lists) -> (dict of lists)
    creates a dict from nhl_data with key being the team name
    and the value being the list of lists of data for each team
    """
    
    ...


Test case below:

In [None]:
nhl_dict = nhl_to_dict(nhl_data[1:])
print(nhl_dict)

#print("{" + "\n".join("{!r}: {!r},".format(k, v) for k, v in nhl_dict.items()) + "}")

So let's pause and think about this solution.

What is good about it?

What is bad about it?

Brainstorm: what data structures might fix some of the problems?

In [None]:
for team in team_set:
    if team in nhl_dict:
        print("Head to Head for", team)
        for i in range (len(nhl_data)):
            if nhl_data[i][nhl_team_index] == team:
                print_headtohead(nhl_data, team)

## Programming Plan: Step 4: Translate NHL data to our data structure.

Happily, this is almost done already since we did it in figuring out the previous step. Is there anything we want to change about this code? What don't you like?

In [None]:
def nhl_to_dict(nhl_list):
    """ (list of lists) -> (dict of lists)
    creates a dict from nhl_data with key being the team name
    and the value being the list of data for each team
    """

    team_dict = {}
    for team in nhl_list:
        '''
        if team[1] in team_dict:
            team_dict[team[1]].append(team)
        else:
            team_dict[team[1]] = [team]
        '''
        team_dict[team[1]] = team

    return team_dict
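
With the data keyed by team name, a search no longer needs an inner loop over every row. The sketch below shows the idea on made-up rows. The names `rows_to_dict` and `lookup` are hypothetical, and the sketch assumes one row per team with the name in column 1.

```python
# Hypothetical rows (header already removed); NOT taken from TML_WinLoss.csv.
sample_rows = [
    ["1", "Example City Owls", "10", "6", "3", "1", "0"],
    ["2", "Sample Town Bears", "8", "2", "5", "0", "1"],
]

def rows_to_dict(rows):
    """(list of lists) -> dict of lists
    Key each row by its team name (column 1).
    """
    return {row[1]: row for row in rows}

def lookup(team_dict, team):
    """Return the stats row for team, or None when the team is not in the data."""
    return team_dict.get(team)

team_dict = rows_to_dict(sample_rows)
for name in ["Example City Owls", "Nowhere Gulls"]:
    row = lookup(team_dict, name)
    if row is None:
        print("No games played against", name)
    else:
        print(name, "-> wins:", row[3], "losses:", row[4])
```

Using `dict.get` returns `None` for a missing team instead of raising `KeyError`, so the "not found" case is handled in one place.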

## Programming Plan: Step 5: Implement search for teams in the file

How do we search for team information in the file?

In [None]:
for team in team_set:
    print(team_set)
    if team in nhl_dict:
        print("Head to Head for", team)
        print_headtohead(nhl_data, team)
        


# Select a Solution

As is common, we had to write code to figure out how to solve the problem. That is fine and normal. What we need to be careful of is assuming that our first solution is the best. Ask some questions:
* Is this the most efficient way to solve this problem?
* What are the weaknesses of the code?
* Can the code be improved?
* Are there other data structures that should be considered? For example, what if each team were represented not by a dictionary entry but by a list or a tuple?
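
As one way to explore the last question, each row could be turned into a named tuple so that fields are read by name rather than by position. This is only a sketch: the field names are invented for illustration, `first_column` stands in for whatever column 0 holds in the real file, and the conversion assumes every stat is a whole number.

```python
from collections import namedtuple

# Field names are hypothetical; the order follows the columns used by print_headtohead.
TeamRecord = namedtuple(
    "TeamRecord",
    ["first_column", "name", "games", "wins", "losses", "ties", "ot_losses"],
)

def make_record(row):
    """Convert one CSV row (list of str) into a TeamRecord with integer stats."""
    stats = (int(value) for value in row[2:7])   # assumes numeric strings
    return TeamRecord(row[0], row[1], *stats)

# Made-up row, not taken from TML_WinLoss.csv
record = make_record(["1", "Example City Owls", "10", "6", "3", "1", "0"])
print(record.name, "wins:", record.wins, "losses:", record.losses)
```

The trade-off is readability (`record.wins` instead of `row[3]`) against an extra conversion step, and tuples cannot be modified after they are created.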

Let's put it all together.

In [None]:
import csv

def print_headtohead(data, index, team_index):

    print("Team Name:",data[index][team_index], "Games played:",
          data[index][team_index+1], "Wins:", data[index][team_index+2], "Losses:", data[index][team_index+3], "Ties: ", data[index][team_index+4], "Overtime Losses: ",data[index][team_index+5])


def nhl_to_dict(nhl_list):
    """ (list of lists) -> (dict of list of lists)
    creates a dict from nhl_list with key being the team name
    and the value being the list of lists of data for each team
    """

    team_dict = {}
    for team in nhl_list:
        if team[1] in team_dict:
            team_dict[team[1]].append(team)
        else:
            team_dict[team[1]] = [team]

    return team_dict

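# read in the database
nhl_data = []
with open('TML_WinLoss.csv', 'r') as csvfile:
    nhl_stats_reader = csv.reader(csvfile)

    for row in nhl_stats_reader:
        nhl_data.append(row)

# build the dictionary only after the data has been read (skip the header row)
nhl_dict = nhl_to_dict(nhl_data[1:])
nhl_team_index = 1   # column that holds the team name

#print(nhl_dict)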



# Get the teams from the user and store them in a set

team_set = set()
prompt = "Enter a team name (type exit to end):  "

team = input(prompt)
while team != "exit":
    team_set.add(team)
    team = input(prompt)

#print(team_set)

for team in team_set:
    if team in nhl_dict:
        print("Head to Head for", team)
        for i in range (len(nhl_data)):
            if nhl_data[i][nhl_team_index] == team:
                print_headtohead(nhl_data, i, nhl_team_index)

Is there anything we might want to fix-up? Any functions that we should write?

There are a number of sections of code that could usefully go into a function in order to make the code more readable.

In [None]:
import csv

def print_headtohead(data, team_set):
    '''
    (list of lists, set) -> None
    '''
    for team in team_set:
        for row in data:
            if team in row:
                # display showing Team name, games played, wins, losses, ties, overtime losses
                print("Team Name:", row[1], "Games played:",
                      row[2], "Wins:", row[3], "Losses:", row[4], "Ties: ", row[5], "Overtime Losses: ", row[6])
                

def nhl_to_dict(nhl_list):
    """ (list of lists) -> (dict of list of lists)
    creates a dict from nhl_list with key being the team name
    and the value being the list of lists of data for each team
    """

    team_dict = {}
    for team in nhl_list:
        if team[1] in team_dict:
            team_dict[team[1]].append(team)
        else:
            team_dict[team[1]] = [team]

    return team_dict

def get_nhl_data(filename):
    '''
    (str)->list of lists of string
    Opens <filename> as a CSV file, reads in each row and returns the list of rows
    '''

    # read in the database
    nhl_data = []
    with open(filename, 'r') as file:
        nhl_stats_reader = csv.reader(file)

        for row in nhl_stats_reader:
            nhl_data.append(row)
    #print(nhl_data)
    return nhl_data


def get_team_queries():
    '''
    None -> set of strings
    Prompts user to enter team names to query database
    '''
    team_set = set()

    # store the prompt once so it only has to be changed in one place
    prompt = "Enter a team name in the format of 'City Mascot' (e.g., 'Montreal Canadiens') (type exit to end): "

    team = input(prompt)
    while team != "exit":
        team_set.add(team)
        team = input(prompt)

    return team_set

'''
def process_queries(teams, nhl_data):
    
    #(set of str, dictionary of lists of list) -> None
    #Looks up each entry in teams CSV and prints the team info or an error message
    
    for team in team_set:
        if team in nhl_dict:
            print("Head to Head for", team)
            for i in range (len(nhl_data)):
                if nhl_data[i][nhl_team_index] == team:
                    print_headtohead(nhl_data, team)
'''

                    
# *** Main code ***

# Read in NHL data
nhl_data = get_nhl_data("TML_WinLoss.csv")

# Convert to dictionary (if we want to)
#nhl_dict = nhl_to_dict(nhl_data[1:])

# Get the teams from the user and store them in a set
team_set = get_team_queries()

# Run the queries on the NHL CSV
print_headtohead(nhl_data, team_set)

#Can you write the code that would determine who won the most?


# Challenge: 
### Write the code that would determine who has the highest winning percentage against the Leafs.
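
A possible starting point for the challenge is sketched below; treat it as a hint rather than the answer. The function name `best_win_percentage`, its parameters, and the sample rows are hypothetical. It also assumes the chosen wins column counts the wins you care about: if the file's wins column holds the Leafs' wins, the opponent's wins would instead come from the losses column, so check the header row first.

```python
# Hypothetical rows (header removed); NOT taken from TML_WinLoss.csv.
sample_rows = [
    ["1", "Example City Owls", "10", "6", "3", "1", "0"],
    ["2", "Sample Town Bears", "8", "2", "5", "0", "1"],
    ["3", "Nowhere Gulls", "0", "0", "0", "0", "0"],
]

def best_win_percentage(rows, wins_col=3, games_col=2, name_col=1):
    """(list of lists, int, int, int) -> (str or None, float)
    Return the team name with the highest wins / games played ratio and
    that ratio. Rows with zero games are skipped to avoid dividing by zero.
    """
    best_name, best_pct = None, -1.0
    for row in rows:
        games = int(row[games_col])
        if games == 0:
            continue
        pct = int(row[wins_col]) / games
        if pct > best_pct:
            best_name, best_pct = row[name_col], pct
    return best_name, best_pct

name, pct = best_win_percentage(sample_rows)
print(name, "has the highest winning percentage:", round(pct * 100, 1))
```

Passing `wins_col=4` would rank teams by the losses column instead, which is the adjustment to make if the wins column turns out to belong to the Leafs.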