# Recap

## Special Methods

Each "outside" operation, like...

In [None]:
3 > 1

...has a corresponding "inside" implementation:

In [None]:
(3).__gt__(1)     # same as 3 > 1

Some prominent examples are:

 * `__init__` for initializing newly created objects
 * `__str__` for `str(...)`
 * `__repr__` for `repr(...)`

Comparison operators:

 * `__gt__` for `>`: **g**reater **t**han
 * `__lt__` for `<`: **l**ess **t**han
 * `__ge__` for `>=`: **g**reater than or **e**qual to
 * `__le__` for `<=`: **l**ess than or **e**qual to
 * `__eq__` for `==`: **eq**ual to
 * `__ne__` for `!=`: **n**ot **e**qual to

Other features:

 * `__hash__` for computing an object's hash number
 * `__len__` for `len(...)`
 * `__iter__` and `__next__` for `for x in my_object`
 * `__setattr__` for `my_object.some_attribute = a_value`
 * `__getattr__` for `my_object.some_attribute` (only called when the normal attribute lookup fails)
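
To see several of these special methods working together, here is a minimal sketch. The `Playlist` class and its attributes are made up for illustration; each comment names the "outside" operation that triggers the method.

```python
class Playlist:
    """Hypothetical example class, not part of the lecture material."""

    def __init__(self, name, songs):
        self.name = name
        self.songs = list(songs)

    def __repr__(self):                 # used by repr(...)
        return f"Playlist({self.name!r}, {self.songs!r})"

    def __str__(self):                  # used by str(...) and print(...)
        return f"{self.name}: {len(self)} song(s)"

    def __len__(self):                  # used by len(...)
        return len(self.songs)

    def __iter__(self):                 # used by for-loops; delegates to the list's iterator
        return iter(self.songs)

    def __lt__(self, other):            # used by <, and therefore by sorted(...)
        if not isinstance(other, Playlist):
            return NotImplemented       # reject unsupported types
        return len(self) < len(other)


short = Playlist("Focus", ["Intro"])
long = Playlist("Road trip", ["A", "B", "C"])

print(len(long))                        # calls long.__len__()
print(str(short))                       # calls short.__str__()
print([song for song in long])          # calls long.__iter__()
print(short < long)                     # calls short.__lt__(long)
print(sorted([long, short])[0].name)    # sorting relies on __lt__
```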


## `NotImplemented` for rich comparison operators

When implementing comparison operators, it's important to reject `other` types which are not supported for comparison:

In [None]:
class Bill:
    def __init__(self, denomination):
        self.denomination = denomination
        
    def __eq__(self, other):
        if not isinstance(other, Bill):          # only compare to other Bill objects
            return NotImplemented                # for foreign types, return NotImplemented instead
        return self.denomination == other.denomination

In [None]:
a = Bill(50)
b = Bill(100)
c = Bill(50)
print(a == b)
print(a == c)
print(a == 50)   # Bill.__eq__ returns NotImplemented; Python then tries int.__eq__ and finally falls back to an identity check: False
print(50 == a)   # int.__eq__ returns NotImplemented too, so the same fallback applies

**Important**: `NotImplemented` is not an exception. You do **not** `raise NotImplemented`! You have to `return NotImplemented`.
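
The following sketch shows what Python does with a returned `NotImplemented`, and what goes wrong when it is raised instead. The `__lt__` method is an addition made for this illustration; the variable names are arbitrary.

```python
class Bill:
    def __init__(self, denomination):
        self.denomination = denomination

    def __eq__(self, other):
        if not isinstance(other, Bill):
            return NotImplemented
        return self.denomination == other.denomination

    def __lt__(self, other):                 # added for this illustration
        if not isinstance(other, Bill):
            return NotImplemented
        return self.denomination < other.denomination


a = Bill(50)

# ==: both sides return NotImplemented, so Python falls back to an identity check
equal_to_int = (a == 50)
print(equal_to_int)

# <: there is no sensible fallback, so Python raises a TypeError on our behalf
try:
    a < 50
except TypeError as error:
    ordering_error = str(error)
    print("TypeError:", ordering_error)

# Raising NotImplemented is itself an error, because it is not an exception class
try:
    raise NotImplemented
except TypeError as error:
    raise_error = str(error)
    print("TypeError:", raise_error)
```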

## Hashing

Generally speaking, a hash is a short string or number that can easily be computed from a more complex object. In Python, hashes are integers.

 * Computing the hash of an object multiple times should always return the same number
 * Two objects that have the same hash *could* be equivalent, but they don't have to be (if it's just a hash collision: when two different objects accidentally have the same hash).
 * Two objects that are equivalent (`a == b`) *must* have the same hash.

Requirements for implementing `__hash__`:

 * `__hash__` must return an integer
 * `__eq__` must also be implemented
 * if two objects are equal (`==`), then the hashes computed by `__hash__` must be the same

In practice, implementing `__hash__` typically means:
 * Implementing `__eq__`
 * Taking the same object attributes which determine equality, adding them all to a tuple and returning `hash(on_that_tuple)`

**Important:** All the attributes that go into the tuple to be hashed must be hashable themselves (for built-in containers, that means immutable ones such as tuples or frozensets, not lists or dictionaries).
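
As a minimal sketch of this recipe, assume a made-up `Coordinate` class whose equality depends on two attributes. The same two attributes go into the tuple that is hashed, which is what allows equal objects to be used interchangeably in sets and as dictionary keys.

```python
class Coordinate:
    """Hypothetical example class used only for this illustration."""

    def __init__(self, lat, lon):
        self.lat = lat
        self.lon = lon

    def __eq__(self, other):
        if not isinstance(other, Coordinate):
            return NotImplemented
        return (self.lat, self.lon) == (other.lat, other.lon)

    def __hash__(self):
        # hash exactly the attributes that __eq__ compares
        return hash((self.lat, self.lon))


zurich_a = Coordinate(47.37, 8.54)
zurich_b = Coordinate(47.37, 8.54)          # a distinct object with equal contents

print(zurich_a == zurich_b)                 # equal ...
print(hash(zurich_a) == hash(zurich_b))     # ... therefore the hashes must match
print(len({zurich_a, zurich_b}))            # a set treats them as one element

visits = {zurich_a: 3}
print(visits[zurich_b])                     # an equal object finds the same dictionary entry
```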

In the following example, `ingredients` is a dictionary, which is not hashable, but its contents still matter for the hash. So in that case, we first need to transform the dictionary into some hashable data structure when implementing `__hash__`:

In [None]:
class Recipe():
    def __init__(self, name, ingredients):
        self.name = name
        self.ingredients = ingredients

    def __str__(self):
        return f"A recipe for {self.name}"

    def __eq__(self, other):
        if not isinstance(other, Recipe):
            return NotImplemented 
        return self.ingredients == other.ingredients    # we don't care about the name of the recipe. Two recipes with the same ingredients are equivalent

    def __hash__(self):
        return hash(frozenset(self.ingredients.items()))    # frozenset instead of tuple: equal dictionaries must hash the same, regardless of insertion order
        
i = {"Butter (grams)": 20, "Garlic, smashed (cloves)": 1, "Salt (tsp)": .25}
r1 = Recipe("Garlic butter", i)
r2 = Recipe("Allium spread", i)
print(r1 == r2)
print(hash(r1))
print(hash(r2))
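
A subtlety worth keeping in mind when turning a dictionary into something hashable: two dictionaries with the same contents compare equal even if their keys were inserted in a different order, so the hashable stand-in should not depend on order either. A minimal sketch (the two dictionaries are made up for illustration) comparing `tuple` and `frozenset` in that respect:

```python
first = {"Butter (grams)": 20, "Salt (tsp)": 0.25}
second = {"Salt (tsp)": 0.25, "Butter (grams)": 20}    # same contents, different insertion order

dicts_equal = (first == second)
print(dicts_equal)                  # dictionaries ignore insertion order when comparing

# A tuple preserves order, so the two tuples differ (and so would their hashes, in all likelihood)
tuples_equal = tuple(first.items()) == tuple(second.items())
print(tuples_equal)

# A frozenset ignores order, so equal dictionaries produce equal hashes
frozensets_equal = frozenset(first.items()) == frozenset(second.items())
hashes_equal = hash(frozenset(first.items())) == hash(frozenset(second.items()))
print(frozensets_equal, hashes_equal)
```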

Hashes see much more widespread use than you might think.

### Example 1: checking file integrity

Hashes are an easy and convenient way to check whether a file is fully intact, or whether it has been changed somehow. Many file downloads ([such as the download for Windows 11](https://www.microsoft.com/en-us/software-download/windows11)) will advertise hashes for their files. After you download the file, you can compute its hash and compare it to the advertised hash. If it differs, it means your download got corrupted somehow.
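
A minimal sketch of such an integrity check, using SHA-256 from the standard library's `hashlib` module. The helper name `sha256_of_file`, the file name, and the file contents are assumptions made for this illustration; a temporary file stands in for the download.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as file:
        while chunk := file.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as folder:
    download = Path(folder) / "download.bin"                  # stand-in for a downloaded file
    download.write_bytes(b"pretend this is an installer")

    # The hash the download page would advertise
    advertised = hashlib.sha256(b"pretend this is an installer").hexdigest()

    intact = sha256_of_file(download) == advertised
    print(intact)                                             # identical bytes, identical hash

    download.write_bytes(b"pretend this is an instaIler")     # a single corrupted byte
    corrupted_matches = sha256_of_file(download) == advertised
    print(corrupted_matches)
```

Reading the file in chunks keeps memory usage low even for multi-gigabyte downloads.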

### Example 2: storing passwords

Any decent web service will **not** actually store any passwords. At best, they will store hashes of passwords. This still enables user authentication, but it mitigates the impact of data leaks.

This first example illustrates the problem: When a user sets their password, it is stored in the "database" and if anyone ever gets access to the database, they will see all users' passwords in plain text! Since many people use the same password everywhere, this is a very, VERY bad technique.

In [None]:
# Bad! Stores passwords in plain text!

class UserDB:
    def __init__(self):
        self.users = {}

    def set_password(self, username, password):
        self.users[username] = password
    
    def login(self, username, password):
        if self.users[username] == password:
            print("Login successful")
        else:
            print("Invalid password")

db = UserDB()
db.set_password("bob89", "hunter2")
db.login("bob89", "pw")
db.login("bob89", "hunter2")
print(db.users)                     # passwords can be leaked easily!

This problem can be solved quite easily by only storing the hash of each password:

 1. The user sets their password on the website
 2. The password is hashed and only the hash is stored. The password is immediately forgotten
 3. When the user wants to log in, they enter their password, and we just check if it has the same hash as the one in the database

Because hash functions are generally non-reversible, it is not possible to determine the password (string) from a given hash.

In [None]:
# Better. Only hashes are stored:

class UserDB:
    def __init__(self):
        self.users = {}

    def set_password(self, username, password):
        self.users[username] = hash(password)
    
    def login(self, username, password):
        if self.users[username] == hash(password):
            print("Login successful")
        else:
            print("Invalid password")

db = UserDB()
db.set_password("bob89", "hunter2")
db.login("bob89", "pw")             # authentication still works correctly, because (almost all) wrong passwords have a different hash
db.login("bob89", "hunter2")        
print(db.users)                     # Password is not stored at all! Only hashes are stored.

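Note that this is **not** enough in practice. An attacker could easily precompute the hashes of millions of common passwords and store them in a lookup table (commonly called a "rainbow table", although strictly speaking that term refers to a more space-efficient variant of such a table). Then, when they gain access to the database, they can compare the stored hashes to their table and recover your password after all. (Also note that the built-in `hash` is only used for illustration here: for strings, its result changes between Python processes by default, so real systems use dedicated password-hashing functions instead.)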

In [None]:
# generate a lookup table from hash to actual password:
def generate_rainbow_table():
    res = {}
    for pw in ["password", "admin", "hunter2", "password1", "qwerty"]:
        res[hash(pw)] = pw
    return res

In [None]:
# hack into the server and get the stored hash for bob89:
print(db.users)

In [None]:
rt = generate_rainbow_table()
print(rt[db.users["bob89"]])    # attacker can simply look up the stolen hash in the rainbow table (a hard-coded number would not work, since string hashes differ per Python process)

Now the attacker knows your password, even though the server only stored its hash! They can use it on all the other websites where you used the same password. This is only possible because the attacker knows how the hash is computed. For this reason, hashes are typically "salted" on the server. In the simplest case, that could look like this:

In [None]:
class UserDB:
    def __init__(self, salt):
        self.salt = salt
        self.users = {}

    def set_password(self, username, password):
        self.users[username] = hash((password, self.salt))
    
    def login(self, username, password):
        if self.users[username] == hash((password, self.salt)):
            print("Login successful")
        else:
            print("Invalid password")

db = UserDB("secret_salt")
db.set_password("bob89", "hunter2")
db.login("bob89", "pw")             # authentication still works correctly, because the salt is applied both when saving and checking the password
db.login("bob89", "hunter2")        
print(db.users)                     # Password is not stored at all! Only hashes are stored.
try:
    print(rt[db.users["bob89"]])    # look up the salted hash in the attacker's rainbow table
except KeyError:
    print("Hash for user bob89 not found")

This at least prevents the attacker from generating rainbow tables that work in many scenarios. If they were able to obtain the secret salt, then they could still generate the rainbow table. That's why there's another technique called "adding pepper"... which is out of scope for this lecture.

**Two important side-notes**:

 1. NEVER implement your own cryptography. Re-use [tried and tested libraries](https://docs.python.org/3/library/crypto.html). If you build applications that require user management, use an existing service or implementation.
 2. If it really matters, get a cryptography expert to do things for you. Not every developer has to be a cryptography expert!
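
As a rough sketch of what relying on the standard library could look like, the following uses `hashlib.pbkdf2_hmac` with a random per-user salt and `hmac.compare_digest` for the comparison. The function names and the iteration count are assumptions made for this illustration, not a vetted configuration; for a real application, follow current security guidance or use an existing authentication service.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000    # assumed work factor for this sketch, not a recommendation


def hash_password(password, salt=None):
    """Return (salt, derived_key) for the given password."""
    if salt is None:
        salt = os.urandom(16)                       # a fresh random salt per user
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, key


def verify_password(password, salt, expected_key):
    """Re-derive the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)


salt, stored_key = hash_password("hunter2")          # what the database would store
right = verify_password("hunter2", salt, stored_key)
wrong = verify_password("pw", salt, stored_key)
print(right, wrong)
```

Unlike the built-in `hash`, the derived key is stable across Python processes, and the deliberately slow key derivation makes precomputing large lookup tables far more expensive.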

### Example 3: Blockchain 🤢 and Cryptocurrencies 🤮

While blockchain has legitimate uses, traditional cryptocurrencies require "proof of work". Without getting into the details, all this "work" involves is computing millions of hashes until you, by pure chance, find one that has specific characteristics (typically a given number of leading `0`s). Here's an illustration of how that works: 

In [None]:
import hashlib
from time import time

def proof_of_work(difficulty):
    target = '0' * difficulty
    data = "Does-not-really-matter"
    nonce = 0
    start_time = time()

    # compute hashes over and over again until we luck out
    while True:
        # By adding nonce to the data, we generate a new hash each iteration
        test_string = f"{data}{nonce}"
        # Compute hash (here we use sha256, one of many standardised hash functions)
        hash_result = hashlib.sha256(test_string.encode()).hexdigest()
        # Check if the hash starts with the required number of zeroes
        if hash_result.startswith(target):
            print(f"""Found solution after {nonce} tries, spending {time() - start_time:.2f} seconds: "{test_string}" -> {hash_result}""")
            return # finally done with this nonsense
        nonce += 1

proof_of_work(2)   # easy
proof_of_work(6)   # hard
#proof_of_work(30) # burn the planet

What an insane waste of energy!

## `try` / `except` / `finally` / `else`

Just keep these things in mind:

 * It's better to catch specific exceptions instead of using a bare `except`.
 * The handlers of a `try` statement only cover exceptions raised inside the `try` block. If another exception is raised while an `except` block is running, none of the sibling `except` blocks will catch it; it has to be caught separately.

In [None]:
try:
    print("trying...")
    raise ValueError
except ValueError:
    print("ValueError occurred")
except IndexError:
    print("IndexError occurred")
except:
    print("some other exception occurred")
else:
    print("no exceptions occurred")
finally:
    print("finally always executes")
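
To make the control flow more tangible, here is a small sketch that records which blocks run for different inputs, followed by a demonstration that an exception raised inside a handler is not caught by a sibling handler. The helper `safe_divide` is hypothetical and exists only for this illustration.

```python
def safe_divide(numerator, denominator):
    """Hypothetical helper that records which blocks ran."""
    steps = []
    try:
        result = numerator / denominator
    except ZeroDivisionError:
        steps.append("except ZeroDivisionError")
        result = None
    except TypeError:
        steps.append("except TypeError")
        result = None
    else:
        steps.append("else")            # only runs if the try block raised nothing
    finally:
        steps.append("finally")         # runs in every case
    return result, steps


ok = safe_divide(6, 3)
by_zero = safe_divide(6, 0)
bad_type = safe_divide(6, "3")
print(ok)
print(by_zero)
print(bad_type)

# An exception raised inside a handler is NOT caught by a sibling handler
try:
    try:
        raise ValueError("first problem")
    except ValueError:
        raise IndexError("second problem")    # raised while handling the ValueError
    except IndexError:
        caught_by = "sibling handler"         # never reached
except IndexError:
    caught_by = "outer handler"
print(caught_by)
```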

<p style="height:100px"></p>
<hr>
<p style="height:100px"></p>




# `type` vs `isinstance`

The `type` function gives you the exact type of an object. `isinstance`, on the other hand, checks the entire inheritance tree: it returns `True` not only if the object's type matches the given type exactly, but also if the object's type is a subclass of the given type.

In the following example, `Sub` inherits from `Middle` and `Middle` inherits from `Topmost`:

In [None]:
class Topmost:
    pass

class Middle(Topmost):
    pass

class Sub(Middle):
    pass

s = Sub()
print(s)

As such, `type(s)` will give the exact type (`Sub`):

In [None]:
type(s)

But `isinstance` will return `True` for any type that the instance inherits from. So our `s`, which is of type `Sub`, is **also** an instance of `Middle` and, eventually, `Topmost`!

In [None]:
print(isinstance(s, Sub))
print(isinstance(s, Middle))
print(isinstance(s, Topmost))
print(isinstance(s, object))   # real top-most, as Topmost still inherits from object
print(isinstance(s, str))

As a matter of fact `Topmost` is *not* the topmost class that `Sub` inherits from. Because in reality, `Topmost`, like all custom classes, inherits from `object`:

In [None]:
isinstance(s, object)

In [None]:
isinstance("hello", str)

In [None]:
isinstance("hello", object)

Note how even functions are instances of `object`:

In [None]:
isinstance(len, object)   # works for any function; the built-in len is used here because it is always available
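
The practical difference shows up as soon as you compare the exact type against a parent class. A minimal sketch, with made-up class names:

```python
class Animal:
    pass

class Dog(Animal):
    pass

rex = Dog()

exact_dog = type(rex) is Dog               # the exact type matches
exact_animal = type(rex) is Animal         # the exact type does not match the parent class
is_animal = isinstance(rex, Animal)        # but rex is still an Animal via inheritance
print(exact_dog, exact_animal, is_animal)

# The same distinction applies to built-in types: bool is a subclass of int
print(type(True) is int)
print(isinstance(True, int))

# isinstance also accepts a tuple of types and checks against each of them
number_like = isinstance(3.5, (int, float))
print(number_like)
```

This is why `isinstance` is usually the better choice in checks like the one in `__eq__`: it keeps working when someone later subclasses your class.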

# Class attributes vs instance attributes

You know that classes are like "blueprints" to make concrete objects. When we create a new instance of a class, the resulting object will have instance-specific members:

In [None]:
class Toyota:
    def __init__(self, model):
        self.model = model

    def honk(self):
        return f"HONK, I'm a {self.model} HONK"


t1 = Toyota("Yaris")                  # t1 is an *instance* of Toyota
print(t1.model)                       # t1 has a "model" attribute, which is specific to the instance t1 (it was set via self.model = model)
print(t1.honk())                      # t1 also supports the "honk" method
print(Toyota.honk(t1))                # honk is defined as a member of Toyota
print(isinstance(t1, Toyota))         # t1 is an object of type 'Toyota'
print(isinstance(Toyota, type))       # Toyota is an object of type 'type'
print(isinstance(t1, object))         # all types inherit from object eventually
print(isinstance(Toyota, object))

In [None]:
print(type(t1))
print(type(Toyota))

So `t1` is an instance of `Toyota` and we can see that it has a variety of attributes, including `model`, but also `honk` (and `__init__`, etc.):

In [None]:
dir(t1)

However, **classes are also objects**. So not only do *instances* of a class have members, so does the *class* itself. It does not have `model` as an attribute (because `model` is set on `self`, not on `Toyota`), but it does show `honk`:

In [None]:
dir(Toyota)

And we can **also** add attributes to the class (which is just another object). Notice how `some_attribute` is at the same indentation level as all the methods.

In [None]:
class Toyota:
    
    some_attribute = "I'm an attribute of the Toyota class (NOT of an instance)"
    
    def __init__(self, model):
        self.model = model

    def honk(self):
        return f"HONK, I'm a {self.model} HONK"

print(Toyota.some_attribute)      # NOT set via self.some_attribute ... but set directly on the class itself

Now, when we reflect on the `Toyota` object, we can see that it has an additional attribute:

In [None]:
dir(Toyota)

This may look like something new, but should not come as a surprise. After all, *methods* (which sit at the same level as `some_attribute`) are also members of the *class*:

In [None]:
print(Toyota.some_attribute)
print(Toyota.__init__)
print(Toyota.honk)

So really, now that you know that functions are just regular objects, it's kind of obvious that you can specify not only methods (function attributes of class objects), but also other kinds of attributes, like a string. There is not much difference between `some_attribute` and `__init__`: one is a string and one is a function, but both are attributes of the `Toyota` class object. Note that, just like the method `honk`, the attribute `some_attribute` is available both via the class itself (where it is actually defined) **and** through any specific instance of the class:

In [None]:
t1 = Toyota("Corolla")
print(Toyota.some_attribute)  # class attribute accessible via class
print(Toyota.honk(t1))        # method accessible via class
print(t1.some_attribute)      # class attribute accessible via instance
print(t1.honk())              # method accessible via instance (self implicitly added as first parameter)

The fact that we can store attributes as part of the class itself (rather than just individual instances) can come in handy. One particularly common use case is illustrated below. Imagine that every new Toyota instance should have a unique serial number. We can simply store a counter as a *class attribute* and increment it whenever a new instance is created:

In [None]:
class Toyota:
    serial_counter = 1
    def __init__(self, model):
        self.model = model
        self.serial_number = Toyota.serial_counter   # access the class attribute to set the instance attribute
        Toyota.serial_counter += 1                   # increment the class attribute by 1

    def drive(self):
        print("driving")

t1 = Toyota("Yaris")
t2 = Toyota("Yaris")
t3 = Toyota("Corolla")
t3.drive()
print(t1.serial_number)
print(t2.serial_number)
print(t3.serial_number)

Above, you can see that `self.serial_number` is an **instance attribute** while `Toyota.serial_counter` is a **class attribute**.

Note, however, that the *class* attribute can also be read through any *instance* of `Toyota`: if an attribute is not found on the instance itself, Python looks it up on the class:

In [None]:
t1 = Toyota("Yaris")
t2 = Toyota("Yaris")
print(t1.serial_number)
print(t2.serial_number)
print(Toyota.serial_counter)
print(t1.serial_counter)
print(t2.serial_counter)
# Both instances and the class itself refer to the SAME class attribute! Toyota.serial_counter only exists once, while serial_number is set for each individual instance.
print(t1.serial_counter is t2.serial_counter is Toyota.serial_counter)

### Summarizing class attributes:

 * Classes are just objects, they can have attributes.
 * Methods are function attributes of a class, but you can add other kinds of attributes as well.
 * A class attribute only exists **once** as a member of the class itself.
 * All instances of the class will refer to the same class attribute. In contrast, instance attributes are set via `self.`.
 * A common use case that you should be able to replicate is a serial counter that is incremented with each instantiation of the class.
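
One consequence of these lookup rules is easy to trip over: *assigning* to the attribute through an instance does not change the class attribute, but creates a new instance attribute that shadows it. A minimal sketch, using a made-up `Ticket` class:

```python
class Ticket:
    """Hypothetical example class."""
    counter = 0                       # class attribute, exists once on the class

    def __init__(self):
        Ticket.counter += 1           # assign via the class to update the shared value


first = Ticket()
second = Ticket()
before = (Ticket.counter, first.counter, second.counter)
print(before)                         # all three read the same class attribute

first.counter = 100                   # creates a NEW instance attribute on `first` only
after = (Ticket.counter, first.counter, second.counter)
print(after)                          # the class attribute is unchanged

print("counter" in vars(first))       # the instance now has its own entry
print("counter" in vars(second))      # this instance still falls back to the class
```

This is why the serial counter above is incremented with `Toyota.serial_counter += 1` rather than `self.serial_counter += 1`.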

# Static methods

You know how each method of a class implicitly receives `self` as the first parameter?

The following class represents a temperature sensor that might be installed in some HVAC system or factory. In a real scenario, `measure` would measure the temperature of a room or device. The sensor data is stored in an attribute `self.measurements`. A method `fahrenheit_measurements` returns the measurements, but with each value converted to Fahrenheit. The code to convert Celsius to Fahrenheit has been moved to a dedicated method `c_to_f`.

In [None]:
from random import randrange

class TempSensor:
    def __init__(self):
        self.measurements = []

    def measure(self):
        self.measurements.append(randrange(-20, 50))        # in the real world, this would gather a temperature from a real sensor, here it's just a random value

    def __str__(self):
        return f"Sensor data: {self.measurements}"

    def fahrenheit_measurements(self):
        return [self.c_to_f(m) for m in self.measurements]

    def c_to_f(self, value):
        return value * 1.8 + 32                             # self is not referred to anywhere in the method body! Compare this to the other methods, where self is always used in some way.

s = TempSensor()
for _ in range(5):
    s.measure()         # records 5 random temperatures
print(s)
print(s.fahrenheit_measurements())

In the example above, notice how `c_to_f` is a method that **only depends on parameters passed to the method** (in this case, `value`), but **`self` is not used anywhere in the method body**.

In such cases, where a method of a class does not require any knowledge of `self` and its attributes, we can annotate the method as being a `@staticmethod`:

In [None]:
from random import randrange

class TempSensor:
    def __init__(self):
        self.measurements = []

    def measure(self):
        self.measurements.append(randrange(-20, 50))

    def __str__(self):
        return f"Sensor data: {self.measurements}"

    def fahrenheit_measurements(self):
        return [self.c_to_f(m) for m in self.measurements]

    @staticmethod                                                # this method is "static", because it doesn't use 'self' anywhere in the method body. No need for the parameter.
    def c_to_f(value):                                           # a @staticmethod does NOT get self provided implicitly!
        return value * 1.8 + 32

s = TempSensor()
for _ in range(5):
    s.measure()
print(s)
print(s.fahrenheit_measurements())
print(TempSensor.__str__(s))                                     # regular methods require the first parameter to be the object (self).
print(TempSensor.c_to_f(0))                                      # static methods can be called without providing self!
print(s.c_to_f(0))                                               # you can call the static method on the object or the class; the result is the same.

You don't *have to* annotate methods that don't use `self` with `@staticmethod` (and drop the `self` parameter). You could just ignore the `self` parameter if you don't need it. However, semantically, `@staticmethod` communicates to other developers that the functionality in this method does not relate to any of the object's state (like here, any specific temperatures recorded). The method `c_to_f` is basically like a plain function, only that it "belongs" to the `TempSensor` class. You don't need to have a `TempSensor` instance to use it.

### Summarizing static methods:

 * A static method of a class does **not** take `self` as a parameter.
 * Static methods are annotated with `@staticmethod` (which does not need to be imported)
 * Static methods are typically isolated pieces of behavior that are not connected with the state (i.e., `self`) of an instance.
 * A common use case is helper functions that just convert or otherwise process values, without knowing about individual state.
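
To see what the decorator actually changes, compare a static method with a method that merely omits `self` without being declared static. This is a sketch with a made-up `Converter` class; the second method is deliberately written incorrectly.

```python
class Converter:
    """Hypothetical example class."""

    @staticmethod
    def km_to_miles(km):              # no self, declared static
        return km * 0.621371

    def miles_to_km(miles):           # no self, but NOT declared static (deliberate mistake)
        return miles / 0.621371


c = Converter()
via_instance = c.km_to_miles(10)             # works: the instance is not passed along
via_class = Converter.miles_to_km(10)        # works via the class, where it is just a plain function
print(via_instance, via_class)

try:
    c.miles_to_km(10)                 # fails: the instance is passed as `miles`, and 10 is one argument too many
    failed = False
except TypeError as error:
    failed = True
    print("TypeError:", error)
```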

# Testing

If you're not familiar with testing, you probably just "tried" your code with different inputs to see (visually) if it appears to work correctly. Let's revisit the IP validator from the exercises and assume it were an interactive application:

In [None]:
class IPValidator:
    def run(self):
        while True:
            addr = input("Please enter an IP address and hit enter: ")
            if addr == "exit":
                return
            print(f"{addr} is valid: {self.is_valid(addr)}")                          # could also call IPValidator.is_valid

    @staticmethod    
    def is_valid(ip):
        octets = ip.split(".")
        return (sum([IPValidator.is_valid_octet(octet) for octet in octets]) == 4)    # self doesn't exist, so definitely must use IPValidator.is_valid_octet
        
    @staticmethod    
    def is_valid_octet(octet):
        if not 1 <= len(octet) <= 3:
            return False
        for char in octet:
            if not char.isdigit():
                return False
        return 0 <= int(octet) <= 255
    
v = IPValidator()
v.run()

Of course, you might recognize that you don't actually have to call `run()` to test this class. You could just try using the methods individually (like in the exercise):

In [None]:
class IPValidator:
    def run(self):
        while True:
            addr = input("Please enter an IP address and hit enter: ")
            if addr == "exit":
                return
            print(f"{addr} is valid: {self.is_valid(addr)}")

    @staticmethod    
    def is_valid(ip):
        octets = ip.split(".")
        return (sum([IPValidator.is_valid_octet(octet) for octet in octets]) == 4)
        
    @staticmethod    
    def is_valid_octet(octet):
        if not 1 <= len(octet) <= 3:
            return False
        for char in octet:
            if not char.isdigit():
                return False
        return 0 <= int(octet) <= 255

print(IPValidator.is_valid("127.0.0.1"))
print(IPValidator.is_valid("asdf"))
print(IPValidator.is_valid("127.0.0.1.100"))

But of course that gets more and more tedious with more test cases. So you might just compare each test case to the expected value instead, so that you'll see "all Trues" when running your code:

In [None]:
class IPValidator:
    @staticmethod    
    def is_valid(ip):
        octets = ip.split(".")
        return (sum([IPValidator.is_valid_octet(octet) for octet in octets]) == 4)
        
    @staticmethod    
    def is_valid_octet(octet):
        if not 1 <= len(octet) <= 3:
            return False
        for char in octet:
            if not char.isdigit():
                return False
        return 0 <= int(octet) <= 255

print(IPValidator.is_valid("127.0.0.1") == True)
print(IPValidator.is_valid("asdf") == False)
print(IPValidator.is_valid("127.0.0.1.100") == False)

But this still isn't very clear when you have many test cases and things go wrong. It would be nicer if we had a way to test our code that...

 * Clearly spells out what went wrong
 * Tells us where it went wrong

For this reason, we have "unit testing". Python ships with a `unittest` module that provides these kinds of services.

To implement a unit test, you...
 1. create a new class that inherits from `unittest.TestCase`.
 2. implement one or more methods whose names start with `test` and which take no parameters (except `self`).
 3. test exactly **one** feature or behavior per test case. You may want to instantiate a new instance of the class you're testing every time.
 4. optionally provide a meaningful message for test failures

In [None]:
import unittest

class IPValidatorTests(unittest.TestCase):

    def test_home(self):
        v = IPValidator()
        self.assertTrue(v.is_valid("127.0.0.1"))         # valid

    def test_invalid_string(self):
        v = IPValidator()
        self.assertFalse(v.is_valid("asdf"))             # not an IP

    def test_five(self):
        v = IPValidator()
        self.assertFalse(v.is_valid("127.0.0.1.100"))    # one too many octets

unittest.main(argv=[''], verbosity=3, exit=False)        # these parameters are only here because of Jupyter Notebook. Normally, you could just do unittest.main()

Don't be confused by the red color of the output in Jupyter Notebook. All tests pass as indicated by "OK". We could add messages, if we wanted:

In [None]:
import unittest

class IPValidatorTests(unittest.TestCase):

    def test_home(self):
        v = IPValidator()
        self.assertTrue(v.is_valid("127.0.0.1"), "127.0.0.1 wrongly identified as invalid")

    def test_invalid_string(self):
        v = IPValidator()
        self.assertFalse(v.is_valid("asdf"), "random string 'asdf' identified as valid IP")

    def test_five(self):
        v = IPValidator()
        self.assertFalse(v.is_valid("127.0.0.1.100"), "127.0.0.1.100 wrongly identified as valid")

unittest.main(argv=[''], verbosity=3, exit=False) # these parameters are only here because of Jupyter Notebook. Normally, you could just do unittest.main()

So if we now provide a faulty implementation, the failing tests will report our messages in addition to the usual error output:

In [None]:
class IPValidator:
    @staticmethod    
    def is_valid(ip):
        return True

import unittest

class IPValidatorTests(unittest.TestCase):

    def test_home(self):
        v = IPValidator()
        self.assertTrue(v.is_valid("127.0.0.1"), "127.0.0.1 wrongly identified as invalid")

    def test_invalid_string(self):
        v = IPValidator()
        self.assertFalse(v.is_valid("asdf"), "random string 'asdf' identified as valid IP")

    def test_five(self):
        v = IPValidator()
        self.assertFalse(v.is_valid("127.0.0.1.100"), "127.0.0.1.100 wrongly identified as valid")

unittest.main(argv=[''], verbosity=3, exit=False) # these parameters are only here because of Jupyter Notebook. Normally, you could just do unittest.main()


The messages are not usually necessary. However, what's nice about testing like this is that it's easy to see whether or not the code is working as expected: the output tells you where and why a test failed.

The best thing about tests is that they can **prevent regressions**. A regression is when a feature that previously worked fine stops working correctly. This happens quite often when working on existing code. Say we want to improve our implementation by using the "more pythonic" `all` instead of `sum`. While `sum` calculates a sum of list elements, `all` just checks if each element is `True`. This seems right because we just want to check if each octet is valid. **But this misses the case where each octet is valid, but there aren't exactly 4 of them**:

In [None]:
class IPValidator:
    @staticmethod    
    def is_valid(ip):
        octets = ip.split(".")
      # return (sum([IPValidator.is_valid_octet(octet) for octet in octets]) == 4)
        return all(IPValidator.is_valid_octet(octet) for octet in octets)
        
    @staticmethod    
    def is_valid_octet(octet):
        if not 1 <= len(octet) <= 3:
            return False
        for char in octet:
            if not char.isdigit():
                return False
        return 0 <= int(octet) <= 255

import unittest
class IPValidatorTests(unittest.TestCase):
    def test_home(self):
        v = IPValidator()
        self.assertTrue(v.is_valid("127.0.0.1"))
        
    def test_invalid_string(self):
        v = IPValidator()
        self.assertFalse(v.is_valid("asdf"))
        
    def test_five(self):
        v = IPValidator()
        self.assertFalse(v.is_valid("127.0.0.1.100"), "127.0.0.1.100 wrongly identified as valid")     # This test case saves us from introducing a regression bug

unittest.main(argv=[''], verbosity=3, exit=False) 

You can also get creative and generate a lot more test cases programmatically, for example:

In [None]:
class IPValidator:
    @staticmethod    
    def is_valid(ip):
        octets = ip.split(".")
        return len(octets) == 4 and all(IPValidator.is_valid_octet(octet) for octet in octets)
        
    @staticmethod    
    def is_valid_octet(octet):
        if not 1 <= len(octet) <= 3:
            return False
        for char in octet:
            if not char.isdigit():
                return False
        return 0 <= int(octet) <= 255

import unittest
class IPValidatorTests(unittest.TestCase):
    def test_first_two_octets(self):
        v = IPValidator()
        for a in range(256):
            for b in range(256):
                self.assertTrue(v.is_valid(f"{a}.{b}.0.255"))  # gets called 65,536 times (256 * 256)

unittest.main(argv=[''], verbosity=3, exit=False) 

In this example, this is hardly necessary. You're probably better off testing edge cases. However, in some scenarios, generating test values like this can make a lot of sense.
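
When looping over several inputs inside one test method, `unittest`'s `subTest` context manager reports each failing input separately instead of stopping at the first one. The sketch below applies it to a handful of edge cases. The module-level `is_valid` function is a simplified stand-in for `IPValidator.is_valid` so that the block runs on its own, and the tests are run through a `TextTestRunner` (rather than `unittest.main`) so that only this test class is executed.

```python
import io
import unittest


def is_valid(ip):
    """Simplified stand-in for IPValidator.is_valid."""
    octets = ip.split(".")
    return len(octets) == 4 and all(
        o.isascii() and o.isdigit() and len(o) <= 3 and int(o) <= 255 for o in octets
    )


class EdgeCaseTests(unittest.TestCase):
    def test_valid_edge_cases(self):
        for ip in ["0.0.0.0", "255.255.255.255", "1.2.3.4"]:
            with self.subTest(ip=ip):          # a failure is reported together with the ip value
                self.assertTrue(is_valid(ip))

    def test_invalid_edge_cases(self):
        for ip in ["256.0.0.1", "1.2.3", "1.2.3.4.5", "1..2.3", "a.b.c.d", ""]:
            with self.subTest(ip=ip):
                self.assertFalse(is_valid(ip))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(EdgeCaseTests)
result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=2).run(suite)
print(result.testsRun, result.wasSuccessful())
```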

The `unittest` framework also supports a couple of additional features. For example, you can implement the `setUp` method to execute some code *before each test method is run*:

In [None]:
class IPValidator:
    @staticmethod    
    def is_valid(ip):
        octets = ip.split(".")
        return len(octets) == 4 and all(IPValidator.is_valid_octet(octet) for octet in octets)
        
    @staticmethod    
    def is_valid_octet(octet):
        if not 1 <= len(octet) <= 3:
            return False
        for char in octet:
            if not char.isdigit():
                return False
        return 0 <= int(octet) <= 255
        
import unittest
class IPValidatorTests(unittest.TestCase):

    def setUp(self):                     # the setUp method is called before each test method execution!
        self.v = IPValidator()
        
    def test_home(self):
        self.assertTrue(self.v.is_valid("127.0.0.1")) # now we refer to self.v
        
    def test_invalid_string(self):
        self.assertFalse(self.v.is_valid("asdf"))
        
    def test_five(self):
        self.assertFalse(self.v.is_valid("127.0.0.1.100"), "127.0.0.1.100 wrongly identified as valid") 

unittest.main(argv=[''], verbosity=3, exit=False)

There is also `tearDown`, which is called after each test method ends, as well as the class methods `setUpClass` and `tearDownClass`, which are executed once before and once after all the tests of a test class have run.

### Assertion methods

In these examples, we only used `assertTrue` and `assertFalse`, but there are [many more assertion methods](https://docs.python.org/3/library/unittest.html#assert-methods).

| Assertion Method | Equivalent To | Since Python |
|-|-|-|
| `assertEqual(a, b)` | `a == b` | - |
| `assertNotEqual(a, b)` | `a != b` | - |
| `assertTrue(x)` | `bool(x) is True` | - |
| `assertFalse(x)` | `bool(x) is False` | - |
| `assertIs(a, b)` | `a is b` | 3.1 |
| `assertIsNot(a, b)` | `a is not b` | 3.1 |
| `assertIsNone(x)` | `x is None` | 3.1 |
| `assertIsNotNone(x)` | `x is not None` | 3.1 |
| `assertIn(a, b)` | `a in b` | 3.1 |
| `assertNotIn(a, b)` | `a not in b` | 3.1 |
| `assertIsInstance(a, b)` | `isinstance(a, b)` | 3.2 |
| `assertNotIsInstance(a, b)` | `not isinstance(a, b)` | 3.2 |
| `assertAlmostEqual(a, b)` | `round(a-b, 7) == 0` | - |
| `assertNotAlmostEqual(a, b)` | `round(a-b, 7) != 0` | - |
| `assertGreater(a, b)` | `a > b` | 3.1 |
| `assertGreaterEqual(a, b)` | `a >= b` | 3.1 |
| `assertLess(a, b)` | `a < b` | 3.1 |
| `assertLessEqual(a, b)` | `a <= b` | 3.1 |
| `assertRegex(s, r)` | `r.search(s)` | 3.1 |
| `assertNotRegex(s, r)` | `not r.search(s)` | 3.2 |
| `assertCountEqual(a, b)` | *a* and *b* have the same elements in the same number, regardless of their order | 3.2 |
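
A few of the methods from the table in action. The test class and its values are made up for illustration, and the tests are run through a `TextTestRunner` so that only this class is executed.

```python
import io
import unittest


class AssertionExamples(unittest.TestCase):
    def test_equality(self):
        self.assertEqual(2 + 2, 4)
        self.assertNotEqual("a", "b")

    def test_floats(self):
        # 0.1 + 0.2 is not exactly 0.3, so assertEqual would fail here
        self.assertAlmostEqual(0.1 + 0.2, 0.3)

    def test_membership_and_identity(self):
        self.assertIn("b", ["a", "b", "c"])
        self.assertIsNone({}.get("missing"))
        self.assertIsInstance(3, int)

    def test_same_elements_any_order(self):
        self.assertCountEqual([1, 2, 2, 3], [3, 2, 1, 2])

    def test_regex(self):
        self.assertRegex("127.0.0.1", r"^\d+\.\d+\.\d+\.\d+$")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AssertionExamples)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.testsRun, result.wasSuccessful())
```

Choosing the most specific assertion pays off when a test fails: `assertEqual(a, b)` prints both values, whereas `assertTrue(a == b)` can only report that `False` is not true.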

Long story short:

 * When you've "tried" the same inputs to your function, the same clicks in your frontend, the same call to your backend, more than 5 times: write a unit test
 * Unit tests will **save** you time.
 * To write unit tests, create a class inheriting from `unittest.TestCase` and implement methods starting with `def test`
 * Each test method should test exactly one thing

You have several options for running unit tests for the exercises in your programming environment.

 1. Configure your IDE. Here, again it will be important to have the correct *working directory* (CWD) set to the folder that **contains** the `task` folder.
 1. On the command line, navigate to the folder containing `task` and then run one of these commands:

To run a specific test file:

 ```bash
 python -m unittest task/tests.py
 ```
 
 or if you add `unittest.main()` at the end of the test file, you can also run it as a module:
 
 ```bash
 python -m task.tests
 ```

To run all test files contained in the task folder:

 ```bash
 python -m unittest discover -v task
 ```

### A couple more testing terms

#### Test-Driven Development (TDD)

The idea of writing the tests **first**, and only then the actual implementation.

#### Black-box testing

When you do not have access to the implementation to see how it actually works, you're performing black-box testing. You only get to try the implementation, call the methods, etc., but you cannot see inside the implementation. This is common practice in large companies, where testers are responsible for testing, without caring about how the code is implemented.

If working with compiled code, where the source is not available, black-box testing may also be the only option.

#### White-box testing

More regular testing where you, the tester, can also read the actual implementation code.

#### Fuzz testing

In fuzz testing, the values used to test the implementation are generated (semi-)randomly, similar to how we generated inputs programmatically in `IPValidatorTests.test_first_two_octets`.
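
A very simple form of this can be sketched with the standard library's `random` module. The `is_valid` function is again a simplified stand-in for `IPValidator.is_valid`, and the generator functions are assumptions made for this illustration; dedicated fuzzing tools are far more sophisticated.

```python
import random


def is_valid(ip):
    """Simplified stand-in for IPValidator.is_valid."""
    octets = ip.split(".")
    return len(octets) == 4 and all(
        o.isascii() and o.isdigit() and len(o) <= 3 and int(o) <= 255 for o in octets
    )


rng = random.Random(42)    # fixed seed, so that a failing input can be reproduced


def random_valid_ip():
    """Build an address from four random octets between 0 and 255."""
    return ".".join(str(rng.randrange(256)) for _ in range(4))


def random_garbage(length=8):
    """Build a random string from characters that tend to confuse validators."""
    alphabet = "0123456789.abc "
    return "".join(rng.choice(alphabet) for _ in range(length))


# Every generated address is valid by construction, so none should be rejected
rejected = [ip for ip in (random_valid_ip() for _ in range(1000)) if not is_valid(ip)]
print(rejected)

# Garbage input must never crash the validator, whatever it returns
crashes = 0
for _ in range(1000):
    try:
        is_valid(random_garbage())
    except Exception:
        crashes += 1
print(crashes)
```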

You will be practicing some of these in the exercises!