# Warmup Exercises - Exercises

*General Hints*

* Always make sure to execute all the code cells in your notebook (even if you are not required to implement code yourself).
* In general, you do not need to change code lines which are already implemented.
* Sections where you are required to write code are always marked with ``### YOUR SOLUTION HERE`` and ``### END OF SOLUTION``.
* Always try to be aware of the state of your kernel (which imports were executed, which variables and functions are defined).
* When in doubt, a kernel reset ("restart") might be a good idea. (Remember to execute all cells once again after a reset.)
* Don't jump to opening the solutions too quickly. Mindfully try to solve the exercises first (consult lecture materials or ask questions if necessary).

### Python Basics

To get warmed up with our technology stack and environment, we will start with a few very basic exercises.

The purpose is to demonstrate, or to remind you of, basic Python syntax and functionality.
It is also an opportunity to familiarize yourself with the format of our tutorial exercises.

In the next code cell, declare the following variables:


| name | type | value |
| --- | --- | --- |
| year | ``int`` | 2024 |
| name | ``str`` | *[your name]* |
| awake | ``bool`` | True |

When you are done, the code cell should output "Well done!".

In [None]:
### YOUR SOLUTION HERE
year = 2024
name = "Keanu"
awake = True
### END OF SOLUTION

assert type(year) is int and year == 2024
assert type(name) is str
assert type(awake) is bool and awake
print("Well done!")

Well done!


Note that Python is a *dynamically typed* programming language, i.e. the type of a variable is determined at runtime.
Furthermore, a variable is not bound to a single type: the same name can be reassigned to a value of a different type (unlike in statically typed languages such as Java or C++).

In [2]:
year = 2024
year = "year of the dragon"

Still, Python is *strongly typed*, i.e. every value has a type, and that type matters when performing operations like ``+``: a ``str`` and an ``int`` are not implicitly converted into one another.

Consequently, the following line results in a ``TypeError``.
It can be very helpful to be familiar with error messages and stack traces to speed up debugging processes during code development.

When encountering an error message, analyze the message text (e.g. ``TypeError: can only concatenate str (not "int") to str``) and be sure to note which line of your code triggered the error message (e.g. line 1: ``----> 1 year + 2024``, this information can always be found in the error message's stack trace).

In [3]:
year + 2024

TypeError: can only concatenate str (not "int") to str
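
The behaviour described above can be reproduced in a small sketch that catches the error instead of letting it stop the cell. The variable names ``message`` and ``combined`` are illustrative and not part of the exercise.

```python
year = "year of the dragon"

# Adding an int to a str raises a TypeError.
# Catching it lets us inspect the message without aborting the cell.
try:
    year + 2024
except TypeError as err:
    message = str(err)
    print("Caught:", message)

# An explicit conversion with str() resolves the type mismatch.
combined = year + " " + str(2024)
print(combined)
```

Converting explicitly keeps the intent visible: the reader can see that ``2024`` is meant to be treated as text here.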

Create a dictionary (``dict``) named ``date`` with the keys ``day``, ``month``, ``year`` and set its values according to today's date.

In [6]:
import datetime

### YOUR SOLUTION HERE
date = {
    "day": 29,
    "month": 11,
    "year": 2025
}
### END OF SOLUTION

assert date["day"] == datetime.date.today().day
assert date["month"] == datetime.date.today().month
assert date["year"] == datetime.date.today().year
print("Good job!")

Good job!
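
A hard-coded date only passes the assertions on the day it was written. As an alternative sketch (not required by the exercise), the values can be read from ``datetime.date.today()``. The variable ``today`` and the key ``"weekday"`` are assumptions made for illustration.

```python
import datetime

today = datetime.date.today()
date = {"day": today.day, "month": today.month, "year": today.year}

# Values are read with square brackets.
print(date["year"])

# .get() returns a default instead of raising a KeyError for missing keys.
print(date.get("weekday", "not set"))

# Iterating over .items() yields key/value pairs in insertion order.
for key, value in date.items():
    print(key, "=", value)
```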


Next, implement the function ``say_hello()`` which takes a list of strings as parameter. For each entry in the list, the function is supposed to print the line ``"Hello [entry]"`` to the console.

In [7]:
from typing import List

def say_hello(names: List[str]):
    ### YOUR SOLUTION HERE
    for name in names:
        print("Hello " + name)
    ### END OF SOLUTION

say_hello(["Anne", "Laura", "Fabian"])

Hello Anne
Hello Laura
Hello Fabian
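
Functions that only print are hard to check with ``assert``. One possible variation is to build the greetings first and print them afterwards. The helper name ``build_greetings`` is hypothetical and not part of the exercise.

```python
from typing import List


def build_greetings(names: List[str]) -> List[str]:
    """Return one greeting per name (hypothetical helper for illustration)."""
    # An f-string inserts the value of name into the text.
    return [f"Hello {name}" for name in names]


greetings = build_greetings(["Anne", "Laura", "Fabian"])
for line in greetings:
    print(line)
```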


### Jupyter Notebook Basics

Jupyter Notebooks are interactive documents that combine code, text, and visualizations. They are widely used for data analysis, prototyping, and sharing results in engineering and scientific applications.

**Key features:**
- Notebooks are organized into cells. There are two main types:
  - **Code cells:** For writing and running Python code.
  - **Markdown cells:** For formatted text, explanations, and instructions.
- You can run a cell by selecting it and pressing `Shift+Enter`.
- Outputs (results, plots, errors) appear directly below the code cell.
- You can restart the kernel (the Python interpreter) to reset all variables and imports.

**Example:**
- The cell below shows a simple Python calculation.

In [8]:
# This is a code cell. You can run it with Shift+Enter.
a = 5
b = 3
print("Sum:", a + b)

Sum: 8


You do not need to explicitly use Python's `print` function to display your results below the executed code cell.

By default, the value of the last expression in your code cell will be displayed:

In [9]:
# This is a code cell. You can run it with Shift+Enter.
a = 5
b = 3
a + b

8

**Tips:**
- Use markdown cells to add explanations, headings, or instructions.
- You can edit any cell by double-clicking it.
- Save your notebook regularly to avoid losing work.
- If you encounter errors, read the message and check which line caused the problem.

For more information, see the [Jupyter documentation](https://jupyter.org/documentation).

### Packages: Numpy and Pandas 

In Jupyter Notebooks, lines starting with ``!`` are not interpreted as Python code, but passed to the system shell.

Using this syntax, Python packages can be installed directly from Jupyter Notebooks.
The following lines check whether the Python packages ``numpy`` and ``pandas`` are already available, and add them to your environment if they are not.
If you are working in an Anaconda environment, the packages should already be pre-installed.

In [1]:
try:
    import numpy as np
except ImportError:
    !pip install numpy
    import numpy as np

try:
    import pandas as pd
except ImportError:
    !pip install pandas
    import pandas as pd
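
The availability check can also be done without importing the package. The following is a minimal sketch using the standard library's ``importlib.util.find_spec``. The helper name ``is_installed`` is an assumption made for illustration.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found in the current environment."""
    return importlib.util.find_spec(package_name) is not None


# Report the status of the two packages used in this section.
for package in ["numpy", "pandas"]:
    print(package, "available:", is_installed(package))
```

Unlike the ``!pip`` lines above, this sketch is plain Python and also runs outside a notebook.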

Now that ``numpy`` and ``pandas`` should be available and imported, we will make use of their data analysis functionalities.

If you are interested, check out the package documentation here: [NumPy](https://numpy.org/doc/2.0/index.html), [Pandas](https://pandas.pydata.org/docs/).

Use ``numpy`` to instantiate a ``5x5`` unit matrix and a matrix of shape ``5x2`` with randomized numeric entries.
Subsequently, multiply the two matrices and print the resulting matrix to the screen.

**Hints:** 
* Arbitrary numpy arrays can be initialized using ``np.array()``
* Unit matrices can more easily be created using the ``numpy``-function ``identity()``
* The submodule ``np.random`` provides functions for the creation of random values and arrays
* The ``numpy``-function ``matmul()`` performs matrix multiplication. (Alternatively, use the operator ``@``)

In [3]:
### YOUR SOLUTION HERE
a = np.identity(5)         # 5x5 unit (identity) matrix
b = np.random.rand(5, 2)   # 5x2 matrix with random entries in [0, 1)
print(np.matmul(a, b))     # equals b, because a is the identity matrix
### END OF SOLUTION

[5x2 array identical to ``b``; the values differ on every run]
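
Multiplying by a unit matrix leaves the other matrix unchanged, which can be checked numerically. The sketch below assumes a seeded generator from ``np.random.default_rng`` so that repeated runs give the same numbers. The names ``rng``, ``identity`` and ``product`` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded so that the run is reproducible
identity = np.identity(5)            # 5x5 unit matrix
b = rng.random((5, 2))               # 5x2 matrix with entries in [0, 1)

# The @ operator is equivalent to np.matmul for 2-D arrays.
product = identity @ b

print(product.shape)
print(np.allclose(product, b))
```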

Now we will briefly introduce the package ``pandas`` and its core data class ``DataFrame``.

The following code cell defines the data used for demonstration purposes.

In [4]:
data_dict = {
    "name": ["Mario", "Luigi", "Yoshi", "Peach"],
    "age": [26, 28, 20, 23],
    "color": ["red", "green", "green", "pink"],
    "human": [True, True, False, True]
}

A ``DataFrame`` can be created from a python dict using ``pd.DataFrame.from_dict()``.

Create an instance of the class ``DataFrame`` containing the data from ``data_dict`` and use its method ``info()`` to print some basic information on the instance.

In [6]:
### YOUR SOLUTION HERE
df = pd.DataFrame.from_dict(data_dict)
df.info()
### END OF SOLUTION

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   name    4 non-null      object
 1   age     4 non-null      int64 
 2   color   4 non-null      object
 3   human   4 non-null      bool  
dtypes: bool(1), int64(1), object(2)
memory usage: 232.0+ bytes
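
For a dict that maps column names to lists, the plain ``pd.DataFrame()`` constructor should give the same result as ``pd.DataFrame.from_dict()``. The following sketch compares the two. The names ``df_a`` and ``df_b`` are illustrative.

```python
import pandas as pd

data_dict = {
    "name": ["Mario", "Luigi", "Yoshi", "Peach"],
    "age": [26, 28, 20, 23],
    "color": ["red", "green", "green", "pink"],
    "human": [True, True, False, True],
}

df_a = pd.DataFrame.from_dict(data_dict)
df_b = pd.DataFrame(data_dict)

# equals() compares shape, column labels, dtypes and values.
print(df_a.equals(df_b))

# shape is a (rows, columns) tuple.
print(df_a.shape)
```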


Using ``DataFrame``'s method ``head(x)`` the first ``x`` entries of the data can be printed to the screen. Try it!

In [7]:
### YOUR SOLUTION HERE
df.head(2)
### END OF SOLUTION

    name  age  color  human
0  Mario   26    red   True
1  Luigi   28  green   True


The method ``describe()`` delivers statistical information of a ``DataFrame``'s numeric columns.

In [8]:
### YOUR SOLUTION HERE
df.describe()
### END OF SOLUTION

         age
count   4.00
mean   24.25
std     3.50
min    20.00
25%    22.25
50%    24.50
75%    26.50
max    28.00
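
The ``std`` value of 3.5 reported above is the *sample* standard deviation, since pandas divides by ``n - 1`` by default (``ddof=1``). A short sketch with the same four ages shows the difference to the population variant. The variable names are illustrative.

```python
import pandas as pd

ages = pd.Series([26, 28, 20, 23], name="age")

sample_std = ages.std()            # pandas default: ddof=1, divides by n - 1
population_std = ages.std(ddof=0)  # divides by n instead

print(sample_std)
print(population_std)
```

The population value is smaller because the same sum of squared deviations is divided by 4 instead of 3.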


Single columns of the dataframe can be accessed via ``[]`` using the column name (`str`) as argument.

When a single column is accessed, an object of type ``pd.Series`` is returned.
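
The return type depends on what is passed to ``[]``: a single column name gives a ``pd.Series``, while a list of column names gives a ``pd.DataFrame``, even if the list has only one entry. The small frame below is an illustrative stand-in for the tutorial data.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Mario", "Luigi"], "age": [26, 28]})

single = df["age"]    # column name as str  -> Series
double = df[["age"]]  # list of column names -> DataFrame

print(type(single).__name__)
print(type(double).__name__)
```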

Use the ``[]``-operator to extract a series of type ``bool`` which indicates for each data instance whether its entry in the dataframe's ``age``-column is smaller than 25.
Save that series in a variable with name ``age_filter``.

In [16]:
### YOUR SOLUTION HERE
age_filter = df["age"]<25
### END OF SOLUTION

assert type(age_filter) is pd.Series
assert len(age_filter) == 4

Data filtering based on a series can also be performed using ``[]``.
Alternatively, the method ``get()`` can be used.

Extract a dataframe which only contains those entries with an age smaller than ``25``, i.e. filter the data instances in ``df`` according to ``age_filter``.
Store the result in a variable named ``filtered_data``.

In [None]:
### YOUR SOLUTION HERE
filtered_data = df[age_filter]
### END OF SOLUTION

assert type(filtered_data) is pd.DataFrame
assert len(filtered_data) == 2
print(age_filter)
print(filtered_data)

0    False
1    False
2     True
3     True
Name: age, dtype: bool
    name  age  color  human
2  Yoshi   20  green  False
3  Peach   23   pink   True
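
Boolean series can be combined before filtering. The sketch below assumes the same demonstration data and uses ``.loc`` to select rows and columns in one step. The name ``young_humans`` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Mario", "Luigi", "Yoshi", "Peach"],
    "age": [26, 28, 20, 23],
    "color": ["red", "green", "green", "pink"],
    "human": [True, True, False, True],
})

# & combines two boolean series element-wise.
# The parentheses around the comparison are required.
mask = (df["age"] < 25) & df["human"]

# .loc[rows, columns] filters rows by the mask and keeps only the listed columns.
young_humans = df.loc[mask, ["name", "age"]]
print(young_humans)
```

Python's ``and`` keyword does not work here, because it expects a single truth value rather than one value per row.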


Remember, you can display the variable value, e.g., the dataframe in `filtered_data`, by using it as the last Python statement in your code cell:

In [23]:
filtered_data

    name  age  color  human
2  Yoshi   20  green  False
3  Peach   23   pink   True
