In [107]:
import pandas as pd

# Algorithms

Almost every time we write code that accomplishes some specific task, we are creating an algorithm. 

![Algorithms](../../images/algorithm.jpg "Algorithms")

An algorithm is a sequence of steps that accomplishes a task. It is a simple idea that has exploded in relevance with the mainstreaming of AI-based tools. For example, the following is an algorithm for finding the largest number in a list:

In [108]:
# Find largest number in a list
def find_largest_number(numbers):
    # The parameter is called "numbers" so it does not shadow the built-in list type
    largest_number = numbers[0]
    for i in range(1, len(numbers)):
        if numbers[i] > largest_number:
            largest_number = numbers[i]
    return largest_number

In [109]:
sample_list = [23,1,34,2,578,123,4,-234,5,3214,2,-124,0,1234]
largest = find_largest_number(sample_list)
print(largest)

3214


## Algorithm Construction

Every function to sort values, calculate averages, or split a string into pieces is an algorithm. Creating algorithms is something we constantly do as programmers, and it is a skill that we can improve with practice. The following are some guidelines for creating algorithms:
<ul>
<li>Start with a clear statement of what the input is and what the output is.</li>
<li>Break the problem into pieces.</li>
<li>Test each piece separately.</li>
<li>Then put the pieces together.</li>
</ul>

Thinking systematically about how we design and structure our code will help us solve problems with less frustration, generate fewer bugs, make our code more legible, and generally improve the quality of our lives and relationships.

### Inputs and Outputs
    
The core idea we focus on when building an algorithm is the starting point (input) and the ending point (output). When we write an algorithm, we are creating a recipe for how to accomplish a task. For the most part, we can relate a function to a task: we write one function for each algorithm that we need. The first thing to consider when designing an algorithm is what the inputs and outputs are. This definition sets the basic terms for what we are doing, and outlines the scope of the problem and what we need to accomplish our goal. For example, the find_largest_number function above has:
<ul>
<li> Input - a list of numbers</li>
<li> Output - the largest number in the list</li>
</ul>

This is also known as the function's signature: the name of the function, its inputs, and its outputs. It is also the "contract" that we agree to when creating our code - as long as we follow what the signature defines, we can do whatever we want inside the function. This is powerful and important in software - when we want to use a function from any library, we look at what it takes in and what it returns, and we don't generally need to care about what's in the middle. This is called "encapsulation" and it is one of the most important concepts in software engineering.
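
As a small sketch of that contract idea, the two functions below share one signature but differ inside, so any caller that relies only on the signature can swap one for the other. The names largest_by_loop and largest_by_builtin are made up for this illustration.

```python
def largest_by_loop(input_list: list) -> float:
    """Walk the list and remember the biggest value seen so far."""
    largest = input_list[0]
    for value in input_list[1:]:
        if value > largest:
            largest = value
    return largest


def largest_by_builtin(input_list: list) -> float:
    """Same inputs and outputs, but the work is delegated to Python's max()."""
    return max(input_list)


numbers = [23, 1, 34, 578, -234, 3214, 0]
# Both calls honour the same contract, so the caller cannot tell them apart.
print(largest_by_loop(numbers), largest_by_builtin(numbers))
```

Code that calls either function only needs to know what goes in and what comes out.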

<b>Note:</b> it is common in commercial software development to define the inputs and outputs of a function, then pass that definition off to others to develop - as long as the function does the job, it doesn't matter how it is done. If we have a clear idea of what the inputs and outputs must be, then we can also write tests to verify whether the function meets those definitions. This is called "test-driven development" and it is very helpful in writing good code; a set of tests can be set up to verify functionality, then whenever changes are made - such as when someone updates a file in the central repository - those tests can be triggered to ensure that the changes don't break anything.

### Defining Signatures

When we define a function, we can define the inputs and outputs in the function definition. We can also add optional indicators for what types those values are expected to be. These aren't required or enforced, but if we have a clear expectation of the types, we should formalize it here. This is called "type-hinting", as we are giving users a tip on what to do with our function. The syntax for those details is:
<ul>
<li> Input - colon then the expected data type in the argument list. </li>
<li> Output - arrow then the expected data type after the argument list. </li>
</ul>

For example, we can redefine the find_largest_number function (along with adding a docstring):
    

In [110]:
def find_largest_number(input_list: list) -> int:
    """
    A function to find the largest number in a list.
    
    Args:
        input_list: A list of numbers.
    
    Returns:
        The largest number in the list.
    """
    largest_number = input_list[0]
    for i in range(1, len(input_list)):
        if input_list[i] > largest_number:
            largest_number = input_list[i]
    return largest_number

In [146]:
import inspect
print(inspect.signature(find_largest_number))

(input_list: list) -> int
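
To see that type hints really are only hints, here is a small sketch with a made-up function called double. Python stores the hints on the function but does not check them when the code runs.

```python
def double(value: int) -> int:
    """The hints say int, but nothing enforces them at run time."""
    return value * 2


print(double(4))               # an int, as the hint suggests
print(double("ab"))            # a str also runs: the hint is documentation only
print(double.__annotations__)  # the hints are stored on the function itself
```

Tools such as editors and type checkers can read these hints and warn us, but the interpreter itself will not stop the second call.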


## Exercise

Write a function signature that takes in a list of numbers and returns the median value of the square roots of the middle number(s) (numerical middle, so the middle position if you were to sort the list). So if the list was [1, 3, 1, 2, 2], the middle value is 2 and the median is 2. If the list was [1,3,4,0], the two middle values are 1 and 3 and the median is 2. Think about what needs to be done step by step and pseudocode it in comments.

In [111]:
def middleMedian(input_list: list) -> float:
    # Sort the list
    # Find the middle index
        # If the list is even, take the  two middle numbers
        # If the list is odd, take the middle number
    # Calculate the median
    # Return the median
    pass

In [112]:
def middleMedian(input_list: list) -> float:
    """
    A function to find the middle median of a list.
    
    Args:
        input_list: A list of numbers.
    
    Returns:
        The middle median of the list.
    """
    # Sort a copy, so the caller's list is not reordered as a side effect
    sorted_list = sorted(input_list)
    mid = len(sorted_list) // 2
    if len(sorted_list) % 2 == 0:
        return (sorted_list[mid - 1] + sorted_list[mid]) / 2
    else:
        return sorted_list[mid]

In the middle of writing our code, we can also extend this concept of defining the signature first as a tool to help us plan out our code, and to accomplish part 2 - breaking problems into parts. 

### Breaking Up (Is Hard To Do)

We can make hard problems more simple by splitting them into smaller sub-problems and solving those sub-problems one at a time. If we consider how this fits with defining our functions, this works well - we can define all the steps that need to be done, without having to dwell on exactly <i>how</i> to do each detail, then fill in the details later. This is a common technique in software development, and it is called "top-down design" or "step-wise refinement".

For example, suppose we were using the find_largest_number function from above to calculate a rebate for our company's largest customer, and there are some other parts of the logic that we don't know yet. So, our fake scenario is:
<ul>
<li> We have a list of what different customers spent at our store. </li>
<li> We want to find the largest number in that list. </li>
<li> We want to give that customer a 10% rebate cheque. </li>
</ul>

We can set this up as a challenge with separate functions for each step, then fill in the details later. For example: 

In [113]:
def calculateRebate(price: float, rebate:float=.1) -> float:
    # I still need to code this function
    final_rebate = 25
    return final_rebate

def generateMessage(rebate:float) -> str:
    # I also need to finish this one. 
    message = "Your rebate is: "
    return message

def rebateCustomer(customer_sales: list, rebate:float=.1) -> float:
    # Find the largest customer from the list of sales
    largest_customer = find_largest_number(customer_sales)
    # Calculate the rebate
    rebate = calculateRebate(largest_customer, rebate)
    # Generate the message
    message = generateMessage(rebate)
    return rebate

In [114]:
rebateCustomer(sample_list)

25

In [147]:
print(inspect.signature(calculateRebate))

(price: float, rebate: float = 0.1) -> float


Here, the two new functions to calculate the rebate and generate a message are just placeholders: they don't do any work, they just return a fixed, but valid, value. This is a common technique in software development, and it is called "stubbing" - we are creating a stub of a function that we can fill in later. These placeholders allow the rest of our code to work, as they always give back some usable value; it just won't be correct, as they don't do any logic. We can set these parts aside, use the placeholders while developing the rest of the code, then come back and make them work properly later.

All the code that calls those functions can work with the stubs that we made above; when we get back to fill them in, we should see the actual results start to become correct. These functions are very simple, but we can use this idea to help with algorithms that are more complex as well; as long as our stub is capable of returning some plausible value, we can keep it there while we work out how to actually do it.

In [115]:
def calculateRebate(price: float, rebate:float=.1) -> float:
    """
    A function to calculate the rebate for a customer.
    
    Args:
        price: The price of the customer's largest purchase.
        rebate: The rebate percentage. Defaults to 0.1.
    
    Returns:
        The rebate amount.
    """
    final_rebate = price * rebate
    return final_rebate

def generateMessage(rebate:float) -> str:
    """
    A function to generate a message for the customer.
    
    Args:
        rebate: The rebate amount.
    
    Returns:
        The message.
    """
    message = "Your rebate is: " + str(rebate)
    return message


In [116]:
rebateCustomer(sample_list)

321.40000000000003

## Exercise

Break the middleMedian function into separate functions for each step.

In [117]:
# Input: List of numbers
# Output: Two values to average together for median 
def findMiddle(input_list:list):
    tmp_len = len(input_list)
    if tmp_len % 2 == 0:
        pos1 = tmp_len // 2 - 1
        pos2 = tmp_len // 2
    elif tmp_len % 2 == 1:
        pos1 = tmp_len // 2
        pos2 = tmp_len // 2
    return pos1, pos2

In [118]:
# Input: list of numbers
# Output: median of those numbers
def generateMedian(median_list:list):
    median = 25
    return median

# Input: number
# Output: square root 
def squareRoot(number:float):
    root = number/2
    return root

In [119]:
# Input: list of numbers
# Output: the median of the middle number(s)
def middleMedian(input_list:list):
    # Find the middle index
    pos1, pos2 = findMiddle(input_list)
    # Calculate the median
    median = generateMedian([input_list[pos1], input_list[pos2]])
    # Return the median
    return median

### Test Components Separately

If we have our algorithms split into their own functions, we can test each of those functions to ensure that they each work, individually. This is often easier than testing the entire output of our program to see if there's an error - we still want to do so, but if we build up the testing one part at a time, we can fix things while they are still in their simple phases. This is called "unit testing", where we test each "unit" of our code separately. Again, as long as they perform "to spec", or to the signature that we defined, we can expect them to work okay when we put them together.

#### Assert 

One tool that we can use to do some checks is the assert statement, which tests whether some condition is true and raises an error if it is not. In the sample test code, or test harness, below we check that our rebate function works as expected by calling it with some test data - a set of inputs and expected outputs. Note that the test data should cover the different scenarios we will encounter - we have some integers, floats, and a negative value, all things that we might expect to see in our data. In more complex scenarios, we would likely have much more test data, particularly data that is an edge case or that might cause an error, to ensure we fail correctly.

In [120]:
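def testCalcRebate(inputs, outputs):
    # zip pairs each input with its expected output; looking positions up with
    # index() would pick the wrong expected value if an input appeared twice
    for test_input, expected in zip(inputs, outputs):
        output = calculateRebate(test_input)
        assert output == expected, "The rebate amount is incorrect"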

In [121]:
test_rebate_inputs = [100, 50, 4.5, 0, -1.2]
test_rebate_outputs = [10, 5, 0.45, 0, -0.12]

testCalcRebate(test_rebate_inputs, test_rebate_outputs)

In a slightly more complex case, we can test the whole rebateCustomer function, where each input is a list of sales:

In [122]:
def testRebate(inputs, outputs):
    # Each input is a list of sales; zip pairs it with its expected rebate
    for test_input, expected in zip(inputs, outputs):
        output = rebateCustomer(test_input)
        assert output == expected, "The rebate amount is incorrect"

In [123]:
test_full_rebates_in = [[1,2,3,4,5], [1], [1204,24, 234], [12.5, 23, 1234.5]]
test_full_rebates_out = [0.5, 0.1, 120.4, 123.45]

testRebate(test_full_rebates_in, test_full_rebates_out)

If we introduce some error (in the test data), we can see what happens:

In [124]:
test_full_rebates_out_err = [0.5, 0.1, 120.3, 123.45]

testRebate(test_full_rebates_in, test_full_rebates_out_err)

AssertionError: The rebate amount is incorrect
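
The tests above compare floats with ==, which happened to work for this test data, but the earlier result of 321.40000000000003 shows that float arithmetic is not always exact. One possible hedge, sketched below, is the standard library's math.isclose. The helper name check_close is made up for this illustration.

```python
import math


def check_close(actual: float, expected: float) -> None:
    """Assert that two floats match to within a small relative tolerance."""
    assert math.isclose(actual, expected, rel_tol=1e-9), (
        "Expected " + str(expected) + " but got " + str(actual)
    )


rebate = 3214 * 0.1         # the same arithmetic as the rebate example above
print(rebate == 321.4)      # False: the stored value is 321.40000000000003
check_close(rebate, 321.4)  # passes: the difference is within the tolerance
```

Whether a tolerance is appropriate depends on the data; for money, it is an assumption worth stating in the test itself.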

## Exercise - Test-Driven Development

Build a test harness and some test data for the middleMedian function, then write some code to run your tests. In my examples, I haven't filled in the methods yet, so I expect the tests to fail. I'm doing "test-driven development" here: I've specified what qualifies as success entirely before writing the code to do it. This is generally good practice, as it forces us to think clearly about what is successful and what isn't before we start writing code. It can also make it easier to split development among people, and potentially even hand parts off to AI, as we now have a strict definition of success that anyone can meet.

In [None]:
def testMiddleMedian(inputs, outputs):
    # zip pairs each input list with its expected median
    for test_input, expected in zip(inputs, outputs):
        output = middleMedian(test_input)
        assert output == expected, "The median is incorrect: "+str(test_input)+" "+str(output)

In [None]:
# Test cases
test_middle_median_inputs = [[1,2,3,4,5], [1], [1204,24, 234], [12.5, 23, 1234.5]]
test_middle_median_outputs = [3, 1, 24, 23]


In [None]:
# Run the tests
testMiddleMedian(test_middle_median_inputs, test_middle_median_outputs)

AssertionError: The median is incorrect

### Basic Error Handling

One thing that we can do to make our code more robust is to add some error handling. This is a way to catch errors that might occur, and to handle them in a way that makes sense. We'll revisit this as we go, but we can start with some smarter error handling for our test setup here to introduce it.

Error handling works by a basic concept:
<ul>
<li> When some error occurs, that error is "raised" or "thrown" by the code that encounters it. </li>
<li> We can "catch" that error and do something with it. </li>
</ul>

![Try-Except](../../images/try_except.png "Try-Except")

To implement this, we can use a few special error handling keywords, try, except, else, and finally. 
<ul>
<li> The try block is where we put the code that might cause an error.</li>
<li> The except block is where we put the code to handle that error.</li>
<li> The else block is where we put the code that should run if there is no error. This works like an if-else block, where the above is the True and this is the False, with the condition being "was there an error". </li>
<li> The finally block is where we put code that should always run, regardless of whether an error occurred or not.</li>
</ul>
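
A minimal sketch of the order in which these four blocks run is shown below. The function safe_divide is a made-up example that records which blocks were visited.

```python
def safe_divide(top: float, bottom: float) -> list:
    """Return the names of the blocks that ran, in order."""
    steps = []
    try:
        steps.append("try")
        result = top / bottom  # raises ZeroDivisionError when bottom is 0
    except ZeroDivisionError:
        steps.append("except")
    else:
        steps.append("else")
    finally:
        steps.append("finally")
    return steps


print(safe_divide(10, 2))  # ['try', 'else', 'finally']
print(safe_divide(10, 0))  # ['try', 'except', 'finally']
```

Only one of except and else runs on any given pass, while finally runs every time.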

For example, we can add some error handling to our test harness above. We'll set up those four sections to capture what we are doing:
<ul>
<li> Try - the code that can generate the error goes in a "try block". </li>
<li> Except - the code that handles the error goes in an "except block". Here we are adding the error to a list of errors. Note that we can define different except blocks for different types of errors, so we can handle them differently. </li>
<li> Else - the code that should run if there is no error goes in an "else block". Here we are adding the result to a list of results. </li>
<li> Finally - the code that should always run goes in a "finally block". Here we move on to the next iteration of the loop. </li>
</ul>

Now we can capture what worked, what failed, and what errors we encountered. As a side bonus, our code no longer fully fails if we encounter an error; it captures that error and moves on (if that's possible). This is normal for programs that will be used by others or deployed for extended periods of time: we shouldn't have a program that goes full blue-screen-of-death if we find a stray negative number in our data.

In [None]:
def testCalcRebate(inputs: list, outputs: list) -> tuple:
    errors = []
    correct = []
    # zip pairs each input with its expected output, so repeated inputs are matched correctly
    for test_input, expected in zip(inputs, outputs):
        try:
            output = rebateCustomer(test_input)
            assert output == expected, "The rebate amount is incorrect"
        except AssertionError as e:
            errors.append(test_input)
        else:
            correct.append(test_input)
        finally:
            continue    
    return errors, correct

In [None]:
err, cor = testCalcRebate(test_full_rebates_in, test_full_rebates_out)
print(err)
print(cor)

[]
[[1, 2, 3, 4, 5], [1], [1204, 24, 234], [12.5, 23, 1234.5]]


In [None]:
err, cor = testCalcRebate(test_full_rebates_in, test_full_rebates_out_err)
print(err)
print(cor)

[[1204, 24, 234]]
[[1, 2, 3, 4, 5], [1], [12.5, 23, 1234.5]]


### Error Types

In the example above we generated an assertion error when things don't match, but there are many other types of errors that we can encounter. Every time we do something incorrectly, we generate some variety of error along with our code grinding to a halt. The following are some common errors that we might encounter:
<ul>
<li> AssertionError - raised when an assert statement fails. </li>
<li> AttributeError - raised when an attribute reference or assignment fails. </li>
<li> ImportError - raised when the imported module is not found. </li>
<li> IndexError - raised when the index of a sequence is out of range. </li>
<li> KeyError - raised when a key is not found in a dictionary. </li>
<li> KeyboardInterrupt - raised when the user hits the interrupt key (Ctrl+C or Delete). </li>
<li> MemoryError - raised when an operation runs out of memory. </li>
<li> NameError - raised when a variable is not found in local or global scope. </li>
<li> OverflowError - raised when the result of an arithmetic operation is too large to be represented. </li>
<li> ReferenceError - raised when a weak reference proxy is used to access a garbage collected referent. </li>
<li> RuntimeError - raised when an error does not fall under any other category. </li>
<li> StopIteration - raised by next() function to indicate that there is no further item to be returned by iterator. </li>
</ul>

For each of these error types we can either use general except blocks to deal with errors that might occur, or we can create specific except blocks to manage, and potentially correct, specific errors that we have prepared for. For example, anything that connects over the internet should be prepared to deal with a dropped connection error, anything that reads data from disk should be prepared for some error with file access, etc...

Errors can be raised by something like the assert statement above, or we can use the keyword <b>raise</b> EXCEPTION_TYPE to raise an error ourselves. We can also use the keyword <b>pass</b> in an except block to do nothing, which stops the error from halting execution; for example, if we wanted our code to just ignore any unacceptable input and continue on, we could put pass in the except block. The constructor for each error type can take a message to display, so we can use that to give some information about what happened as a string. Like any other string, this message can be customized, so if there is useful information like variable values, loop counters, or other details that might help us debug, we can include it in the message.
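
As a short sketch of raise and pass working together, the made-up function parse_price below raises a ValueError with a descriptive message, and the loop that calls it uses pass to skip any bad input.

```python
def parse_price(text: str) -> float:
    """Turn text into a price, raising a descriptive error for bad values."""
    price = float(text)  # float() itself raises ValueError for non-numeric text
    if price < 0:
        raise ValueError("Price cannot be negative, got: " + str(price))
    return price


prices = []
for raw in ["19.99", "abc", "-5", "3"]:
    try:
        prices.append(parse_price(raw))
    except ValueError:
        pass  # ignore the unacceptable input and carry on with the next one

print(prices)  # [19.99, 3.0]
```

Silently passing is convenient, but it also hides problems, so it is usually worth at least counting or logging what was skipped.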

In [149]:
# This loop is dumb, examine why...
# I just want to generate an error. 
def sortList(input_list:list) -> list:
    try:
        i = 0
        while True:
            if i >= len(input_list)-1:
                raise IndexError("The list is not sorted, so we got issues. Index: " + str(i)+" Length: "+str(len(input_list)))
            if input_list[i] > input_list[i+1]:
                input_list[i], input_list[i+1] = input_list[i+1], input_list[i]
            i += 1
    except AssertionError as e:
        print("I'm an assertion error")
    except IndexError as e:
        # I'll add a little info to the error above where it is raised. 
        # we can use this to help us!!
        print(e)
    finally:
        return input_list

In [150]:
sortList([10,9,8,7,6,5,4,3,2,1])

The list is not sorted, so we got issues. Index: 9 Length: 10


[9, 8, 7, 6, 5, 4, 3, 2, 1, 10]

In [163]:
sortList([2334,34,147,478])

The list is not sorted, so we got issues. Index: 3 Length: 4


[34, 147, 478, 2334]

## Exercise

Implement error catching and complete the bulk of the code. 

In [151]:
# Input: list of numbers
# Output: median of those numbers
def generateMedian(median_list:list) -> float:
    median = None
    try:
        num1 = median_list[0]
        num2 = median_list[1]
    except IndexError as e:
        print("The list is not long enough to calculate the median.")
    else:
        median = (num1 + num2) / 2
    finally:
        return median

# Input: number
# Output: square root 
def squareRoot(number:float) -> float:
    root = None  # returned as-is if the calculation below fails
    try:
        root = number**0.5
    except TypeError as e:
        print("The number must be a float or an int.")

    return root

### Putting Things Together

Once we have each piece working, we can assemble our code. For this, we already have a tested (and hopefully working) core algorithm - the thing that finds the largest purchase and calculates the rebate. We just need to stitch it together. For mine, I'll update the main function to add some error checking, then call the other functions that I wrote above to do the work. In reality, I'd probably want to consolidate this into a single section, but I don't need to go back to the old parts of the code that already work, like the middle number part that I wrote at the beginning.

In [152]:
# Input: list of numbers
# Output: the median of the middle number(s)
def middleMedian(input_list:list) -> float:
    # Find the middle index
    pos1, pos2 = findMiddle(input_list)
    # Calculate the median
    try:
        median = generateMedian([input_list[pos1], input_list[pos2]])
    except AssertionError as e:
        print(e)
    else:
        print("The median is: "+str(median))
    # Return the median
    return median

In [153]:
testMiddleMedian(test_middle_median_inputs, test_middle_median_outputs)


The median is: 3.0
The median is: 1.0
The median is: 24.0
The median is: 23.0


## Exercise

Build a function to construct the data for a histogram from a list of <b>float numbers</b>. The output should be:
<ul>
<li> A data structure that defines both the bins and the counts for each bin. There are many ways to do this, one of the built-in data structures is likely much better than the others. It could also be a custom data structure that you define. </li>
    <ul>
    <li> If you all are using the thinkstats/thinkplot stuff in stats, the optimal outcome would be something that could be fed directly into the Hist stuff. </li>
    <li> The output could even be a Hist object, if you want to use that. </li>
    </ul>
<li> If you can get things working, try to add some error handling to skip over any non-float values. </li>
</ul>

![Histogram](../../images/histogram.png "Histogram")

<b>We want the data in this, an indication of the range of each bin as well as the count of the items in there.</b>

The specifications for this are consciously vague; there are lots of ways to achieve the above. In particular, the specific details on how you construct and define the bins can vary, as long as you are consistent internally. There are some general standards for how data is treated in Python that may help:
<ul>
<li> Ranges are generally assumed to be inclusive of the left boundary, exclusive on the right. </li>
<li> There are functions that will do this for us, including "cut" in pandas, and "linspace" in numpy, but try to do it by hand here. </li>
</ul>

After the function is written, you can try to integrate it with the NHL stuff below, so that you can feed in any of the data columns and get a histogram of the data. Details are up to you; make it easy to get one of these histogram data structures from one of the stats.

In [164]:
def getBinStarts(min_val, max_val, num_bins):
    data_range = max_val - min_val
    bin_width = data_range / num_bins
    bin_starts = {min_val + i * bin_width:0 for i in range(num_bins)}
    return bin_starts

def histogramBuilder(input_data:list, num_bins:int=10) -> dict:
    # Skip non-numeric values, collecting the rest in a new list.
    # (Removing items from a list while looping over it can skip elements.)
    clean_data = []
    for item in input_data:
        try:
            if type(item) != int and type(item) != float:
                raise TypeError("The input data must be a list of numbers: "+str(item))
        except TypeError as e:
            #print(e)
            continue
        else:
            clean_data.append(item)

    # Find the min and max of the data, then the width of each bin
    min_val = min(clean_data)
    max_val = max(clean_data)
    bin_width = (max_val - min_val) / num_bins

    # Dictionary of {bin left edge: count}, with every count starting at 0
    bin_starts = getBinStarts(min_val, max_val, num_bins)
    edges = list(bin_starts.keys())

    for item in clean_data:
        # Bins include their left edge and exclude their right edge,
        # except the last bin, which also holds the maximum value.
        position = int((item - min_val) / bin_width) if bin_width > 0 else 0
        position = min(position, num_bins - 1)
        bin_starts[edges[position]] += 1
    return bin_starts

In [165]:
histogramBuilder([1,1,2,2.43,2,3,4,5,6,7,8,"twelve",9,10], 5)

{1.0: 5, 2.8: 2, 4.6: 2, 6.4: 2, 8.2: 2}

In [166]:
print(inspect.signature(histogramBuilder))

(input_data: list, num_bins: int = 10) -> dict


## Exercise

Try to build out the class below. We want each instance of the class to represent one NHL team from the data below. The class should have at least the following attributes:
<ul>
<li> Team name </li>
<li> Team stats - a dictionary of the stats for the team, with the stat name as the key and the value as the value. </li>
</ul>

As well as at least the following methods:
<ul>
<li> A constructor. </li>
<li> A method to calculate the team's points. The PTS column has been dropped below, write a function to calculate it. A W is worth 2 points, an OL (both data columns) is worth 1 point. </li>
<li> An overloaded __str__ method to print a sentence similar to "This team has this many points". </li>
<li> "Getter" methods to return the stats and the name. </li>
<li> <b>Bonus:</b> a "calculateRank" method that takes in a league as well as a column name, and returns the rank of the team in that column. </li>
</ul>

<b>Notes:</b>
<ul>
<li> Data structures are <i>mostly</i> pretty smart when it comes to making one from the contents of another. </li>
<li> You can call other methods from inside of a constructor. So if there was some method to "set something up", you can call that as the item is created, you don't need to copy/paste the code into the constructor. </li>
<li> Pandas has several built in methods for aggregation that may be useful. </li>
<li> The iterrows method of a dataframe can be used to loop over the rows of a dataframe. E.g. for index, row in df.iterrows(): </li>
</ul>
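
As a rough sketch of two of these notes, the made-up Player class below calls a setup method from its constructor and overloads __str__. It is deliberately not the NHL team class, and its names and stats are invented for illustration.

```python
class Player:
    """A hypothetical class, unrelated to the exercise data."""

    def __init__(self, name: str, goals: int, assists: int) -> None:
        self.name = name
        self.stats = {"G": goals, "A": assists}
        self.points = 0
        self.setPoints()  # a constructor can call other methods of the class

    def setPoints(self) -> None:
        self.points = self.stats["G"] + self.stats["A"]

    def getName(self) -> str:
        return self.name

    def __str__(self) -> str:
        # print() and str() use this method to describe the object
        return f"{self.name} has {self.points} points"


print(Player("Sample Skater", 3, 4))  # Sample Skater has 7 points
```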

<b>Note:</b> file access is a good spot for error handling. Here I created this file in a subfolder, so I can use the error catching to try the other path. 

In [167]:

try:
    df = pd.read_excel("../data/sportsref_download.xlsx", header=1)
except FileNotFoundError as e:
    df = pd.read_excel("../../data/sportsref_download.xlsx", header=1)
finally:
    print("We got the data!")

df.drop(columns={"PTS"}, inplace=True)
df = df.head(32)
df.tail()

We got the data!


Unnamed: 0,Rk,Unnamed: 1,AvAge,GP,W,L,OL,PTS%,GF,GA,...,PK%,SH,SHA,PIM/G,oPIM/G,S,S%,SA,SV%,SO
27,28.0,Vegas Golden Knights,28.2,6,2,4,0,0.333,13,20,...,80.0,1,0,9.0,8.8,208,6.3,202,0.901,0
28,29.0,Los Angeles Kings,28.2,6,1,4,1,0.25,14,20,...,58.82,0,1,7.2,10.0,218,6.4,184,0.891,0
29,30.0,Montreal Canadiens,28.3,7,1,6,0,0.143,11,25,...,64.0,0,0,7.7,7.4,192,5.7,201,0.876,0
30,31.0,Chicago Blackhawks,27.8,6,0,5,1,0.083,12,27,...,90.91,0,0,10.0,11.0,186,6.5,182,0.852,0
31,32.0,Arizona Coyotes,28.4,6,0,5,1,0.083,11,29,...,35.71,0,1,13.0,12.7,160,6.9,188,0.846,0


In [168]:
class nhlTeam():

    def __init__(self, team_name) -> None:
        self.stats = dict()
        self.name = team_name
        self.POINTS = -1
        #self.getPOINTS()
        
    def setPOINTS(self, win="W", otl="OL") -> None:
        self.POINTS = self.stats[win] * 2 + self.stats[otl]

    def loadData(self, stat_line):
        self.stats = stat_line
        self.setPOINTS()
    
    def getStats(self):
        return self.stats
    
    def getName(self):
        return self.name
    
    def calculateRank(self, league, stat):
        tmp = []
        for team in league:
            tmp.append(team.stats[stat])
        tmp.sort(reverse=True)
        return tmp.index(self.stats[stat]) + 1
    
    def __str__(self) -> str:
        return f"{self.name} have {self.POINTS} points"

This function is what I used to create a "league" of teams. You can use this, or you can create your own. Nothing in here is specifically needed; it just splits the team name from the rest of the stats, and feeds them into the NHL team objects above. The entire league is a list of objects.

In [169]:
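def buildLeague(file_path:str) -> list:
    df = pd.read_excel(file_path, header=1)
    league = []

    # iterrows yields (index, row) pairs; each row is a pandas Series
    for index, row in df.iterrows():
        # Position 1 holds the team name; .iloc makes the positional lookup explicit
        team = nhlTeam(row.iloc[1])
        # Everything from position 2 onward is a stat, keyed by column name
        tmp = dict(row.iloc[2:])
        team.loadData(tmp)
        league.append(team)
    return league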

In [170]:
my_league = buildLeague("../../data/sportsref_download.xlsx")

#### Tests

Print the teams to see their string representation. 

In [171]:
# Check the league and the __str__ method
for team in my_league:
    #print(team.getName())
    print(team)

Florida Panthers have 12 points
Carolina Hurricanes have 10 points
Edmonton Oilers have 10 points
St. Louis Blues have 10 points
Minnesota Wild have 10 points
Washington Capitals have 10 points
Buffalo Sabres have 9 points
Calgary Flames have 9 points
New York Rangers have 9 points
San Jose Sharks have 8 points
Columbus Blue Jackets have 8 points
Pittsburgh Penguins have 8 points
New York Islanders have 7 points
Vancouver Canucks have 7 points
Detroit Red Wings have 7 points
Winnipeg Jets have 7 points
Tampa Bay Lightning have 7 points
Nashville Predators have 6 points
Boston Bruins have 6 points
Dallas Stars have 6 points
New Jersey Devils have 6 points
Anaheim Ducks have 5 points
Philadelphia Flyers have 5 points
Toronto Maple Leafs have 5 points
Seattle Kraken have 5 points
Ottawa Senators have 4 points
Colorado Avalanche have 4 points
Vegas Golden Knights have 4 points
Los Angeles Kings have 3 points
Montreal Canadiens have 2 points
Chicago Blackhawks have 1 points
Arizona Coyotes 

Check the rank function.

In [172]:
my_league[7].calculateRank(my_league, "GA")

21