In [1]:
import csv
import numpy as np

import inspect
import hashlib
def _hash(s):
    return hashlib.blake2b(bytes(str(s), encoding='utf8'), digest_size=5).hexdigest()

## Coding test

This coding test evaluates your knowledge of Python programming, NumPy, command line usage and git. It is composed of **4 exercises**, some of which are divided into parts. Note that **each part is evaluated independently**, so if you are stuck or cannot solve a given part, you can move on to the next one.

## Exercise 1 (4 points)

This exercise is a set of 4 questions, each of which is worth 1 point. The questions are split into two parts.

### Part I (questions 1-2)

Consider that you are using a Unix-based machine via a terminal with the working directory `/users/ldssa`. The file tree below illustrates the contents and structure of that directory. **Consider these contents to answer Questions 1-2**.


```text
/users/ldssa
├── students.csv
├── instructors.csv
└── bootcamp
    ├── notes.txt
    └── learning_units.txt
```

#### Question 1
Consider that you execute the command `cd .`, followed by the command `pwd`. Write the output of the second command in a string in the variable `pwd_output`.

In [2]:
# YOUR CODE HERE


pwd_output = '/users/ldssa'

In [3]:
assert _hash(pwd_output) == "dfbdb9d70f", "Wrong output."

#### Question 2
Consider that you execute the command `rm students.csv`. How many CSV files exist in `/users/ldssa` after you execute the command? Write the answer as an integer in the variable `rm_num_files`.

In [4]:
# YOUR CODE HERE


rm_num_files = 1

In [5]:
assert _hash(rm_num_files) == "36c76eba33", "Wrong number of files."

### Part II (questions 3-4)

Answer the following questions about Python programming.

#### Question 3
Create a Python list of 1000 elements, where the $i$th element is $i^2$, i.e. where the first element is $1^2 = 1$, the second element is $2^2 = 4$ and so forth. Assign the result to the variable `squares`.

In [6]:
# YOUR CODE HERE
squares = [i**2 for i in range(1,1001)]

In [7]:
assert _hash(squares) == "fcdc82b722", "Incorrect list."

#### Question 4

Create a dictionary whose keys are integers 0-6 and the corresponding values are strings with the names of the days of the week, Monday-Sunday (in this order and format). Assign the result to the variable `weekdays`.

In [8]:
# YOUR CODE HERE
weekdays = {0:"Monday", 1:"Tuesday", 2:"Wednesday", 3:"Thursday" , 4:"Friday" , 5:"Saturday" , 6:"Sunday"}

In [9]:
assert _hash(weekdays) == "8f1a6ff1f0", "Incorrect dictionary."

## Exercise 2

Consider a CSV file that stores information about the European population. This file has seven columns:
* `contry_name`: the name of the country;
* `population`: the population of the country;
* `female_fraction`: the fraction of female population in the country;
* `male_life_expectancy`: the life expectancy of male individuals in the country;
* `female_life_expectancy`: the life expectancy of female individuals in the country;
* `birth_rate`: the birth rate in the country, i.e. total number of births per 1000 individuals per year;
* `death_rate`: the death rate in the country, i.e. total number of deaths per 1000 individuals per year.

The file `europe_population.csv` is an example of such a file and was included in the zip file you downloaded with this notebook.

### Part I (2 points)

Implement a function that reads a file with the same format as the `europe_population.csv` file, and stores the data in a list of dictionaries with the following structure:

```
[
    {
        "name": name of the country (type: str),
        "population": the population of the country (type: int),
        "female_fraction": the fraction of female population (type: float),
        "male_life_expectancy": male life expectancy (type: float),
        "female_life_expectancy": female life expectancy (type: float),
        "birth_rate": birth rate (type: float),
        "death_rate": death rate (type: float),
    },
]

```

The function should:
1. be called `read_population_data`;
2. receive an argument called `file_path`, the path to the file that the function should read the data from;
3. read data from each country in the format specified above;
4. return the list that was created.

Remember to inspect the contents of the file before writing the function, as it may contain header information that should be skipped.

To read the csv file, you shall use Python's [`csv`](https://docs.python.org/3/library/csv.html) module, which we already imported at the top of the Notebook. **Do not use [pandas](https://pandas.pydata.org/) or any other library not included in the coding test requirements file**.
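
As a minimal sketch of how `csv.reader` deals with a header row, the snippet below parses an in-memory string instead of a file. The two-column sample and the names `sample`, `header` and `rows` are made up for illustration and do not reflect the contents of `europe_population.csv`.

```python
import csv
import io

# Hypothetical two-column sample; the real file has more columns.
sample = "name,value\nA,1\nB,2\n"

reader = csv.reader(io.StringIO(sample))
header = next(reader)  # consume the header row so it is not parsed as data

# Every field is read as a string, so numeric columns need explicit conversion.
rows = [{"name": name, "value": int(value)} for name, value in reader]

print(header)
print(rows)
```

The same pattern applies to a real file: open it, wrap the file object in `csv.reader`, skip the header with `next`, and convert each field to the required type.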

In [10]:
def read_population_data(file_path):
    """
    Reads the file in file_path, parses it and returns the data in a list of dictionaries.

    Parameters:
    file_path (str): Path to the input file to be parsed.

    Returns:
    countries_parsed_data (list): Countries dataset stored as a list of dictionaries.
    """
    
    data = []
    
    # Open the file and read its contents
    # newline='' is the setting recommended by the csv module documentation
    with open(file_path, 'r', newline='') as file:
        reader = csv.reader(file)
        
        # Skip the header
        next(reader)
        
        # For each row in the CSV, create a dictionary and add it to the data list
        for row in reader:
            country_data = {
                "name": row[0],
                "population": int(row[1]),
                "female_fraction": float(row[2]),
                "male_life_expectancy": float(row[3]),
                "female_life_expectancy": float(row[4]),
                "birth_rate": float(row[5]),
                "death_rate": float(row[6]),
            }
            data.append(country_data)
    
    return data



    

In [11]:
population_data = read_population_data("europe_population.csv")

assert isinstance(population_data, list)
assert all(isinstance(country_data, dict) for country_data in population_data)
assert len(population_data) == 8

country_data = population_data[4]
assert country_data["name"] == "Portugal"
assert isinstance(country_data["population"], int)
assert country_data["population"] == 10297000
assert country_data["female_fraction"] == 0.5282
assert country_data["male_life_expectancy"] == 78.0
assert country_data["female_life_expectancy"] == 84.1
assert country_data["birth_rate"] == 8.2
assert country_data["death_rate"] == 12.0

from more_tests import test_exercise_2_part_I
test_exercise_2_part_I(read_population_data)

### Part II a (2 points)

Consider now that we want to use our dataset to answer some questions about the European population, for example "which countries have an average life expectancy larger than X?".

To answer this question, we note that the average life expectancy can be approximated by the weighted average of the male and female life expectancies, where the weights in the average are the fraction of male and female individuals in the population, respectively. In other words, the average life expectancy is given by `female_fraction * female_life_expectancy + (1 - female_fraction) * male_life_expectancy`.
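
The weighted average can be checked by hand on a single country. The sketch below uses made-up values (they are not taken from the dataset) to show how the two life expectancies are combined.

```python
# Hypothetical values for a single country (not taken from the dataset).
female_fraction = 0.52
female_life_expectancy = 84.0
male_life_expectancy = 78.0

# Weight each life expectancy by the share of the population it applies to.
average_life_expectancy = (
    female_fraction * female_life_expectancy
    + (1 - female_fraction) * male_life_expectancy
)

# 0.52 * 84.0 + 0.48 * 78.0 = 43.68 + 37.44 = 81.12 (up to floating-point rounding)
print(round(average_life_expectancy, 2))
```

The result always lies between the male and female life expectancies, closer to the value of the larger group.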

Implement a function called `average_life_expectancy_larger_than` that:
1. receives as input a list like the one created in Part I of this exercise and a value for threshold X;
2. computes the average life expectancy for each country in the list;
3. determines which countries have an average life expectancy larger than the provided threshold;
4. returns a list of dictionaries with the same format as the one created in Part I of this exercise with only the countries with an average life expectancy larger than the provided threshold.

In [12]:
def average_life_expectancy_larger_than(countries_data, threshold):
    """
    Determines the countries with an average life expectancy larger than a given threshold.

    Parameters:
    countries_data (list): List with the countries dataset.
    threshold (float): Threshold.

    Returns:
    valid_countries_data (list): List with the countries whose average life expectancy is larger than the threshold.
    """
    result_list = []
    # YOUR CODE HERE

    for country in countries_data:
        # Weighted average of the female and male life expectancies
        female_fraction = country["female_fraction"]
        life_exp = (
            female_fraction * country["female_life_expectancy"]
            + (1 - female_fraction) * country["male_life_expectancy"]
        )
        if life_exp > threshold:
            result_list.append(country)

    return result_list
    

        
    

In [13]:
population_data = [
    {"name": "Austria", "population": 8917000, "female_fraction": 0.5077, "male_life_expectancy": 78.9, "female_life_expectancy": 83.6, "birth_rate": 9.4, "death_rate": 10.3},
    {"name": "Belgium", "population": 11544000, "female_fraction": 0.5059, "male_life_expectancy": 78.6, "female_life_expectancy": 83.1, "birth_rate": 9.9, "death_rate": 11.0},
    {"name": "France", "population": 67380000, "female_fraction": 0.5167, "male_life_expectancy": 79.2, "female_life_expectancy": 85.3, "birth_rate": 10.9, "death_rate": 9.9},
]

valid_countries_data = average_life_expectancy_larger_than(population_data, 81.0)

assert isinstance(valid_countries_data, list)
assert len(valid_countries_data) == 2

for country_name in ["Austria", "France"]:
    assert country_name in [country_data["name"] for country_data in valid_countries_data]

empty_countries_data = average_life_expectancy_larger_than(population_data, 85.0)

assert isinstance(empty_countries_data, list)
assert len(empty_countries_data) == 0

all_countries_data = average_life_expectancy_larger_than(population_data, 0)

assert isinstance(all_countries_data, list)
assert len(all_countries_data) == len(population_data)

### Part II b (2 points)

Consider now that you want to estimate the difference in a country's population in a year, given its current population and birth and death rates. The population difference in a year can be estimated by using the formula `population * (birth_rate - death_rate) / 1000`, given that the birth and death rates are given in number of births and deaths per 1000 individuals per year.

Create a function called `estimate_population_difference` that computes that difference for a country with a given name. If the country is not in the list, the function should return `None`.


This function should:
1. receive as input a list like the one we created in Part II a of this exercise and the name of the country for which to estimate the population difference;
2. return the estimate of the population difference of that country as an integer.
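
A short worked example of the formula, assuming a made-up country and a hypothetical helper named `population_difference`. It also shows one way to turn the estimate into an integer: `int()` drops the fractional part, which is consistent with the values expected in the test cell of this part (for example `-8025` for Austria, whose unrounded estimate is about `-8025.3`).

```python
import math

def population_difference(population, birth_rate, death_rate):
    """Yearly population change; both rates are given per 1000 individuals."""
    return population * (birth_rate - death_rate) / 1000

# Hypothetical country: 1,000,000 inhabitants, 10.5 births and 9.0 deaths per 1000.
growth = population_difference(1_000_000, 10.5, 9.0)
print(int(growth))

# For a negative estimate, int() and math.floor() disagree:
# int() drops the fractional part, math.floor() rounds towards minus infinity.
shrinkage = -8025.3
print(int(shrinkage), math.floor(shrinkage))
```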

In [14]:
def estimate_population_difference(countries_data, country_name):
    """
    Estimates the population difference for a given country.

    Parameters:
    countries_data (list): List with the countries dataset.
    country_name (str): Name of the country to estimate the population difference for.

    Returns:
    population_difference (int): Population difference.
    """
    
    for country in countries_data:
        if country["name"] == country_name:
            population_diff = country["population"] * (country["birth_rate"] - country["death_rate"]) / 1000
            return int(population_diff)
    
    # Return None if country_name is not found in countries_data
    return None

In [15]:
population_data = [
    {"name": "Austria", "population": 8917000, "female_fraction": 0.5077, "male_life_expectancy": 78.9, "female_life_expectancy": 83.6, "birth_rate": 9.4, "death_rate": 10.3},
    {"name": "Belgium", "population": 11544000, "female_fraction": 0.5059, "male_life_expectancy": 78.6, "female_life_expectancy": 83.1, "birth_rate": 9.9, "death_rate": 11.0},
    {"name": "France", "population": 67380000, "female_fraction": 0.5167, "male_life_expectancy": 79.2, "female_life_expectancy": 85.3, "birth_rate": 10.9, "death_rate": 9.9},
]

austria_population_difference = estimate_population_difference(population_data, "Austria")
assert isinstance(austria_population_difference, int)
assert austria_population_difference == -8025

belgium_population_difference = estimate_population_difference(population_data, "Belgium")
assert isinstance(belgium_population_difference, int)
assert belgium_population_difference == -12698

france_population_difference = estimate_population_difference(population_data, "France")
assert isinstance(france_population_difference, int)
assert france_population_difference == 67380

none_population_difference = estimate_population_difference(population_data, "Portugal")
assert none_population_difference is None

## Exercise 3

In this exercise, we'll use object-oriented programming concepts to model [Pokemons](https://en.wikipedia.org/wiki/Pok%C3%A9mon).

In case you're not familiar with Pokemon, they are wild (imaginary) creatures. These are some examples.

<img src="./images/pokemons.png" style="width:800px">

When they're living in the wild, Pokemon can be captured by Pokemon Trainers. Once they belong to a Trainer, they will obey the Trainer's commands. They're usually sent out to non-lethal battles against other Pokémon, in order to gain experience and level up.

<img src="./images/battle.png" style="width:300px">

When they reach certain levels, they can undergo a form of metamorphosis and transform into a similar but stronger species of Pokémon: this is called evolution.

<img src="./images/charmander_evolves.jpg" style="width:350px">

### Part I (4 points)

Your first assignment is to implement a class that represents a `Pokemon`.

You'll need to store the following information about a Pokemon:
* `name`: the Pokemon's name.
* `max_health`: the number of health points that this Pokemon has with full health.
* `speed`: a measure of how fast this Pokemon is. Faster Pokemons usually attack first in battles.
* `hp`: current number of health points. Pokemons may lose health points during battles.
* `level`: Pokemon's current level. This is a measure of the Pokemon's experience. Pokemons in higher levels have more chances of winning battles.

Some additional information:
* All the stats points described above (`max_health`, `speed`, `hp`, `level`) should be measured with non-negative integers.
* When a Pokemon is born, its `level` is always 1 and its `hp` is always the same as `max_health`, but its `name`, `max_health` and `speed` vary from Pokemon to Pokemon.


Our Pokemon class should implement 4 methods, described below.

##### 1. Method `is_knocked_out`

During battles, Pokemon take damage, which translates into losing health points.
Method `is_knocked_out` receives no arguments, and checks whether the Pokemon is knocked out by checking if the Pokemon's `hp` is equal to 0.
This method should return a bool.


##### 2. Method `level_up`

When a Pokemon wins a battle, it will level up.
Method `level_up` receives no arguments, and doesn't return anything.

This method should:
* Increase the Pokemon's `level` by one.
* Increase the Pokemon's `max_health` by 20 points.
* Increase the Pokemon's `speed` by 10%. But speed must be an integer, so round it down.
* It shouldn't change the Pokemon's `hp`!
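
The speed rule can be sketched in isolation. Both helpers below are hypothetical and only illustrate the "increase by 10%, then round down" step. For non-negative values, `int()` rounds down, and the integer-only variant avoids floating point altogether.

```python
def speed_after_level_up(speed):
    """Increase speed by 10% and round down to an integer (hypothetical helper)."""
    return int(speed * 1.10)

def speed_after_level_up_exact(speed):
    """Same rule using integer arithmetic only."""
    return speed * 11 // 10

# Sample speeds chosen for illustration.
for speed in (5, 10, 19):
    print(speed, speed_after_level_up(speed), speed_after_level_up_exact(speed))
```

Note that a speed of 5 stays at 5 after one level-up, because 5.5 is rounded down.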


##### 3. Method `take_damage`

As explained above, during battles, Pokemon take damage, which translates into losing health points.
Method `take_damage` receives as argument the integer `damage_points`, and doesn't return anything.

This method should decrease the Pokemon's `hp` by `damage_points`. Make sure that the Pokemon's `hp` doesn't fall below 0.


##### 4. Method `attack_damage`

This method calculates the number of damage points that our Pokemon's attack will inflict on an enemy Pokemon, during a battle. More experienced Pokemon inflict more damage and have higher chances of having successful attacks than less experienced Pokemon. 
Method `attack_damage` receives as argument an `enemy_pokemon` (which is another instance of the Pokemon class) and returns an integer representing the number of damage points.

This method should compute the damage points in the following way:
* Create a variable `level_diff` that stores the difference in levels of the two Pokemon. If our Pokemon is more experienced, `level_diff` should be positive. If our Pokemon is less experienced, `level_diff` should be 0.
* Create a variable `max_level` that stores the maximum level between the levels of the two Pokemon.
* Create a variable called `p_success`, that represents the probability of success of your Pokemon's attack, and set it to: `0.5 + level_diff / (2 * max_level)`
* Create a variable called `attack_success`, that represents whether the attack was successful or not. Calculate its value by drawing one sample from a binomial distribution: use numpy's [binomial function](https://numpy.org/doc/stable/reference/random/generated/numpy.random.binomial.html), with parameters `n=1` and `p=p_success`. This function will output a 0 (which means the attack was not successful) or a 1 (which means the attack was successful). 
* Create a variable called `damage_points`. Calculate its value by multiplying your Pokemon's `level` by `attack_success`.
* Return `damage_points`.
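
To get a feeling for the success probability, the sketch below evaluates the `p_success` formula for a few level pairs. The helper `success_probability` is hypothetical and leaves out the random draw, so its output is deterministic.

```python
def success_probability(own_level, enemy_level):
    """Probability that an attack succeeds, following the steps above."""
    level_diff = max(own_level - enemy_level, 0)  # never negative
    max_level = max(own_level, enemy_level)
    return 0.5 + level_diff / (2 * max_level)

# (own level, enemy level) pairs chosen for illustration only.
for own, enemy in [(1, 1), (2, 1), (1, 2), (4, 1)]:
    print(own, enemy, success_probability(own, enemy))
```

For levels of at least 1, the probability is 0.5 when our Pokemon is not more experienced than the enemy, and it grows towards (but never reaches) 1 as the level advantage increases.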


In [16]:
class Pokemon:
    def __init__(self, name, max_health, speed):
        self.name = name
        self.max_health = max_health
        self.speed = speed
        self.hp = max_health
        self.level = 1

    
    def is_knocked_out(self):
        return self.hp == 0

    
    def level_up(self):
        self.level += 1
        self.max_health += 20
        self.speed = int(self.speed * 1.10) 

    
    def take_damage(self, damage_points):
        self.hp -= damage_points
        if self.hp < 0:
            self.hp = 0

    
    def attack_damage(self, enemy_pokemon):
        level_diff = self.level - enemy_pokemon.level
        if level_diff < 0:
            level_diff = 0
        
        max_level = max(self.level, enemy_pokemon.level)
        p_success = 0.5 + level_diff / (2 * max_level)
        attack_success = np.random.binomial(n=1, p=p_success)
        damage_points = self.level * attack_success
        
        return damage_points

# YOUR CODE HERE


In [17]:
score = 0

try:
    pika = Pokemon(name='Pikachu', max_health=20, speed=5)
    assert pika.name == 'Pikachu'
    assert pika.max_health == 20
    assert pika.speed == 5
    assert pika.hp == pika.max_health
    assert pika.level == 1
    assert not pika.is_knocked_out()
except AssertionError:
    pass
else:
    score += 1

try:
    pika.level_up()
    assert pika.max_health == 40
    assert pika.hp == 20
    assert pika.speed == 5
    assert pika.level == 2
except AssertionError:
    pass
else:
    score += 1


try:
    enemy = Pokemon(name='Squirtle', max_health=10, speed=10)
    np.random.seed(42)
    assert pika.attack_damage(enemy) == 2
    assert pika.attack_damage(enemy) == 0
    assert pika.attack_damage(enemy) == 2

    np.random.seed(13)
    assert pika.attack_damage(enemy) == 0
    assert pika.attack_damage(enemy) == 2
    assert pika.attack_damage(enemy) == 0
except AssertionError:
    pass
else:
    score += 1

try:
    pika.take_damage(5)
    assert pika.hp == 15

    pika.take_damage(20)
    assert pika.hp == 0
    assert pika.is_knocked_out()
except AssertionError:
    pass
else:
    score += 1

if score == 0:
    raise AssertionError("Not enough correct answers to score points :(")

print(f"Your score is {score}/4")

Your score is 4/4


### Part II (2 points)

Now we'll implement a battle between two Pokemon.

Write a function called `battle` that receives as arguments `p1` and `p2`, both instances of the Pokemon class, and returns the Pokemon that wins the battle, the Pokemon that loses the battle, and how many rounds were fought (in this order!).

A battle is a sequence of rounds. In each round, one Pokemon attacks the other once. The faster Pokemon starts the battle, i.e. attacks in round 1 (if they have the same speed, `p1` attacks first). The slower Pokemon attacks in round 2, the faster Pokemon in round 3, and so on.
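
The turn order described above can be sketched without any Pokemon objects. The helper `attack_order` is hypothetical and only works with the two speeds. It returns the labels `"p1"` and `"p2"` to show who attacks in each round.

```python
def attack_order(p1_speed, p2_speed, n_rounds):
    """Return which Pokemon attacks in rounds 1..n_rounds (hypothetical helper)."""
    # The faster Pokemon opens the battle; p1 opens when speeds are tied.
    first, second = ("p1", "p2") if p1_speed >= p2_speed else ("p2", "p1")
    # Odd rounds belong to the opener, even rounds to the other Pokemon.
    return [first if round_number % 2 == 1 else second
            for round_number in range(1, n_rounds + 1)]

print(attack_order(5, 10, 4))
print(attack_order(7, 7, 3))
```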

In an attack, the attacker inflicts as many damage points as indicated by the `attack_damage` method. Remember that this method has a random component, so you have to explicitly call it every time the Pokemon attacks.
The defender may suffer damage points. In order to record that, you should use the `take_damage` method.

The battle ends as soon as one of the Pokemon is knocked out.

Grader Tip: If you see a `KeyboardInterrupt` error on the grader feedback, that is because your cell is taking too long to run, which is probably due to an infinite loop.

In [18]:
def battle(p1, p2):
    """
    Represents a battle between two Pokemon, in which the Pokemon take turns attacking.

    Parameters:
    p1 (Pokemon): A Pokemon fighting in the battle.
    p2 (Pokemon): The other Pokemon fighting in the battle.

    Returns:
    winner (Pokemon): The winner Pokemon
    loser (Pokemon): The loser Pokemon
    rounds_fought (int): The number of rounds fought
    """
   
    # The faster Pokemon attacks first; on a speed tie, p1 attacks first
    if p1.speed >= p2.speed:
        first_attacker, second_attacker = p1, p2
    else:
        first_attacker, second_attacker = p2, p1

    rounds_fought = 0

    
    while not p1.is_knocked_out() and not p2.is_knocked_out():
        
        # First Pokemon attacks
        damage_points = first_attacker.attack_damage(second_attacker)
        second_attacker.take_damage(damage_points)
        rounds_fought += 1
        
        # Check if the second Pokemon got knocked out
        if second_attacker.is_knocked_out():
            return first_attacker, second_attacker, rounds_fought
        
        # Second Pokemon attacks
        damage_points = second_attacker.attack_damage(first_attacker)
        first_attacker.take_damage(damage_points)
        rounds_fought += 1
        
        # Check if the first Pokemon got knocked out
        if first_attacker.is_knocked_out():
            return second_attacker, first_attacker, rounds_fought
 



In [19]:
pika = Pokemon(name='Pikachu', max_health=20, speed=5)
squirtle = Pokemon(name='Squirtle', max_health=10, speed=10)

np.random.seed(19)
winner, loser, rounds = battle(pika, squirtle)

assert loser.is_knocked_out()
assert not winner.is_knocked_out()
assert winner.name == "Pikachu"
assert loser.name == "Squirtle"
assert winner.hp == 15
assert loser.hp == 0
assert rounds == 28

from more_tests import test_exercise_3_II
test_exercise_3_II(battle, Pokemon)

print("Answer is correct. Good Job!")

Answer is correct. Good Job!


## Exercise 4

Square matrices (*i.e.*, matrices with the same number of rows and columns) are called **symmetric** if, for all `i` and `j`, the element at row index `i` and column index `j` equals the element at row index `j` and column index `i`. The entries of such a matrix are thus symmetric with respect to the main diagonal, i.e. the matrix is equal to its [transpose](https://en.wikipedia.org/wiki/Transpose).

An example of such a matrix is shown below.

<code>
[[ 2, -1,  0],
 [-1,  1,  3],
 [ 0,  3,  0]]
</code>
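
The definition can also be checked entry by entry. The sketch below does so in plain Python on nested lists, using a hypothetical helper `is_symmetric_elementwise` that assumes its input is square. It is meant to illustrate the definition rather than to replace a NumPy-based solution.

```python
def is_symmetric_elementwise(matrix):
    """Check a[i][j] == a[j][i] for every pair of indices of a square matrix."""
    n = len(matrix)
    # Only pairs above the main diagonal need to be compared.
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(i + 1, n))

example = [
    [2, -1, 0],
    [-1, 1, 3],
    [0, 3, 0],
]
print(is_symmetric_elementwise(example))

example[0][1] = 1  # break the symmetry in a single entry
print(is_symmetric_elementwise(example))
```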

#### Part I (2 points)

Write a function named `is_symmetric` that receives as argument a numpy array representing a square matrix, and returns a boolean indicating whether that matrix is symmetric or not.

**Hint:** the numpy functions `np.transpose` and `np.array_equal` may be useful in this exercise.

In [20]:
def is_symmetric(matrix):
    """
    Checks if a matrix is symmetric.

    Parameters:
    matrix (np.ndarray): Matrix.

    Returns:
    is_matrix_symmetric (bool): Whether the matrix is symmetric.
    """
    
    # YOUR CODE HERE
    return np.array_equal(matrix, np.transpose(matrix))

In [21]:
symmetric_matrix = np.array([
    [ 2, -1,  0],
    [-1,  1,  3],
    [ 0,  3,  0],
])

nonsymmetric_matrix = np.array([
    [ 2,  1,  0],
    [-1,  1,  3],
    [ 0,  3,  0],
])

for matrix in [symmetric_matrix, nonsymmetric_matrix]:
    matrix_is_symmetric = is_symmetric(matrix)
    assert isinstance(matrix_is_symmetric, bool)

assert is_symmetric(symmetric_matrix)
assert not is_symmetric(nonsymmetric_matrix)

#### Part II (2 points)

To quantify how close a matrix is to being symmetric, one can compute the [Frobenius norm](https://en.wikipedia.org/wiki/Matrix_norm#Frobenius_norm) of the difference between the matrix and its transpose. The Frobenius norm of a matrix $A$ is defined as the square root of the sum of the squares of all elements of that matrix $a_{ij}$,

$\lVert A \rVert_\mathrm{F} = \sqrt{ \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij}^2 }$
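
As a worked example of the definition, the sketch below computes the norm by hand for a small non-symmetric matrix, using plain Python lists and a hypothetical helper `frobenius_norm`. For NumPy arrays, `np.linalg.norm` with `ord='fro'` computes the same quantity.

```python
import math

def frobenius_norm(matrix):
    """Square root of the sum of the squares of all entries."""
    return math.sqrt(sum(entry ** 2 for row in matrix for entry in row))

matrix = [
    [2, 1, 0],
    [-1, 1, 3],
    [0, 3, 0],
]
n = len(matrix)

# Difference between the matrix and its transpose, entry by entry.
difference = [[matrix[i][j] - matrix[j][i] for j in range(n)] for i in range(n)]

print(difference)
print(frobenius_norm(difference))
```

Only the two entries that break the symmetry contribute, so the norm is the square root of 2² + (-2)² = 8, roughly 2.83. A symmetric matrix gives a norm of 0.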

Implement a function called `get_symmetry_frobenius_norm` that, given a square matrix, computes the difference between the matrix and its transpose and returns the Frobenius norm of that resulting matrix.

**Hint:** the numpy functions `np.transpose` and `np.linalg.norm` may be useful in this exercise.

In [22]:
def get_symmetry_frobenius_norm(matrix):
    """
    Computes the Frobenius norm of the difference between a matrix and its transpose.

    Parameters:
    matrix (np.ndarray): Square matrix.

    Returns:
    symmetry_frobenius_norm (float): The Frobenius norm of the difference between the input square matrix and its transpose.
    """
    
    difference = matrix - np.transpose(matrix)
    
    symmetry_frobenius_norm = np.linalg.norm(difference, ord='fro')
    
    return symmetry_frobenius_norm

In [23]:
symmetric_matrix = np.array([
    [ 2, -1,  0],
    [-1,  1,  3],
    [ 0,  3,  0],
])

nonsymmetric_matrix = np.array([
    [ 2,  1,  0],
    [-1,  1,  3],
    [ 0,  3,  0],
])

for matrix in [symmetric_matrix, nonsymmetric_matrix]:
    symmetry_frobenius_norm = get_symmetry_frobenius_norm(matrix)
    assert isinstance(symmetry_frobenius_norm, float)

np.testing.assert_almost_equal(get_symmetry_frobenius_norm(symmetric_matrix), 0, decimal=1)
np.testing.assert_almost_equal(get_symmetry_frobenius_norm(nonsymmetric_matrix), 2.82, decimal=1)