[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/burghoffdavid/bads/blob/master/exercises/1_ex_python.ipynb) 

# Exercises on Python programming
We covered a lot of concepts in the first [tutorial on Python programming](https://github.com/Humboldt-WI/bads/blob/master/tutorials/1_nb_python_intro.ipynb). Solving the exercises allows you to test your familiarity with these concepts.  

In [4]:
import numpy as np

## Variables, assignments, and comparisons

1. Create two variables $a$ and $b$ and assign values of $3$ and $4.5$.

In [3]:
a = 3
b = 4.5
print(a, b)

3 4.5


2. Query the type of variable $a$.

In [4]:
type(a)

int

3. Check whether variable $b$ is a text variable (of type `str`).

In [5]:
print(isinstance(b, str))
# isinstance() checks whether an object is an instance of a class and returns True or False; here it checks whether b is an instance of the str class.

False


4. Calculate $a^2 + \frac{1}{b}$, $\sqrt{a*b}$, and $\log_2(a)$.

In [None]:
a **2 + 1/b

9.222222222222221

In [None]:
np.sqrt(a*b)

3.6742346141747673

In [None]:
np.log2(a)

1.584962500721156

## Matrix algebra
Create three additional variables as follows:

 $$ A = \left( \begin{matrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 10 \end{matrix} \right) \quad
  B = \left( \begin{matrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{matrix} \right)  \quad
  y = \left( \begin{matrix} 1 \\ 2 \\ 3 \end{matrix} \right) $$

Perform the following operations. Note that mathematical operators like `*` might not behave the way you need them to. Wasn't there a powerful library for all sorts of numerical computations, including classic linear algebra?

Calculate  

  1. $a*A$

In [None]:
A = np.array([1,2,3,4,5,6,7,8,10]).reshape(3,3)
print(A)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8 10]]


In [None]:
B = np.array([1,4,7,2,5,8,3,6,9]).reshape(3,3)
print(B)

[[1 4 7]
 [2 5 8]
 [3 6 9]]


In [None]:
y = np.array([1,2,3]).reshape(3,1)
print(y)

[[1]
 [2]
 [3]]
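
Item 1 asks for $a*A$, which the cells above do not compute. A minimal sketch, assuming `a = 3` from the first section and redefining `A` so that the block runs on its own (the name `scaled` is only illustrative):

```python
import numpy as np

a = 3  # value assigned in the first section
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 10]])

# Multiplying a NumPy array by a scalar scales every entry
scaled = a * A
print(scaled)
```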


  2. $A*B$

In [None]:
np.dot(A,B)

array([[ 14,  32,  50],
       [ 32,  77, 122],
       [ 53, 128, 203]])
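
The warning about `*` can be made concrete. The sketch below, which redefines `A` and `B` so it is self-contained, contrasts the element-wise product with the matrix product; the `@` operator is an alternative spelling of `np.dot` for two-dimensional arrays.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 10]])
B = np.array([[1, 4, 7], [2, 5, 8], [3, 6, 9]])

elementwise = A * B      # multiplies matching entries, not a matrix product
matrix_product = A @ B   # row-by-column product, same result as np.dot(A, B)

print(elementwise)
print(matrix_product)
```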

  3. Compute the inverse of matrix $A$ and store the result in a variable `invA`. Any ideas how to get Python to invert a matrix? Hint: NumPy is your friend.

In [None]:
invA = np.linalg.inv(A)
print(invA)

[[-0.66666667 -1.33333333  1.        ]
 [-0.66666667  3.66666667 -2.        ]
 [ 1.         -2.          1.        ]]


  4. Multiply $A$ and $invA$ and verify that the result is the identity matrix (i.e., 1s on the diagonal and 0s everywhere else). You'll probably find that it isn't exactly, because computers usually make very small rounding errors when handling real numbers. The reason is interesting, but you'll have to look it up if you're interested.

In [None]:
print(np.dot(A,invA))
print(np.eye(3))

[[ 1.00000000e+00 -4.44089210e-16 -1.11022302e-16]
 [ 4.44089210e-16  1.00000000e+00 -2.22044605e-16]
 [ 4.44089210e-16  8.88178420e-16  1.00000000e+00]]
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
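
Because of these rounding errors, comparing the two printed matrices by eye is tedious. One possible way to do the check programmatically is `np.allclose`, which compares arrays up to a small tolerance. This is a sketch that rebuilds `A` and `invA`; the names `product` and `is_identity` are illustrative.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 10]])
invA = np.linalg.inv(A)

product = A @ invA
# Compare against the identity matrix up to a small numerical tolerance
is_identity = np.allclose(product, np.eye(3))
print(is_identity)
```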


  5. The transpose of matrix $B$

In [None]:
B.transpose()

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

  6. Fill the first row of matrix $B$ with ones

In [None]:
B[0] = np.ones(3)
print(B)

[[1 1 1]
 [2 5 8]
 [3 6 9]]


  7. Calculate the ordinary least squares estimator $\beta$ (i.e., a standard regression):
$$ \beta = (A^{\top}A)^{-1}A^{\top} y $$
Run a web search for "Python matrix transpose" to get help on how to transpose a matrix.

In [None]:
np.dot(np.linalg.inv(np.dot(A.transpose(), A)), (np.dot(A.transpose(), y)))

array([[-3.33333333e-01],
       [ 6.66666667e-01],
       [-5.68434189e-14]])
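
The formula can also be evaluated without forming the inverse explicitly. The following sketch, an alternative to the cell above rather than a replacement for it, solves the normal equations $(A^{\top}A)\beta = A^{\top}y$ with `np.linalg.solve`, which is generally the numerically safer route. `A` and `y` are redefined so the block is self-contained.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 10]])
y = np.array([1, 2, 3]).reshape(3, 1)

# Solve (A^T A) beta = A^T y instead of inverting A^T A
beta = np.linalg.solve(A.T @ A, A.T @ y)
print(beta)
```

Up to rounding, the result should agree with the values printed above.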

## Indexing
1. Look at the values of variables $A$, $B$, and $y$ from the last exercise.

In [None]:
print(A)
print(B)
print(y)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8 10]]
[[1 1 1]
 [2 5 8]
 [3 6 9]]
[[1]
 [2]
 [3]]


2. Access the second element in the third row of $A$ and the first element in the second row of $B$, and compute their product

In [None]:
A[2, 1] * B[1, 0]

16

3. Multiply the first row of $A$ and the third column of $B$

In [None]:
print(B[:, 2])
print(A[0])
np.dot(A[0], B[:, 2])

[1 8 9]
[1 2 3]


44

4. Access the elements of y that are greater than 1 (without looking up their position manually)

In [None]:
for element in y:
    if element > 1:
        print(element)

[2]
[3]
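
The loop works, but NumPy also supports Boolean indexing, which selects the matching elements in one step. A minimal sketch, with `y` redefined so it runs on its own (`mask` and `selected` are illustrative names):

```python
import numpy as np

y = np.array([1, 2, 3]).reshape(3, 1)

mask = y > 1        # Boolean array with the same shape as y
selected = y[mask]  # one-dimensional array of the entries where mask is True
print(selected)
```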


5. Access the elements of $A$ in the second column for which the values in the first column are greater than or equal to 4

In [None]:
for index in range(0,3):
    if A[index,0] >= 4:
        print(A[index,1])

5
8
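
The same selection can be written without a loop by building a Boolean mask from the first column and using it to pick rows of the second column. This is a sketch with `A` redefined locally; `rows` and `second_col` are illustrative names.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 10]])

rows = A[:, 0] >= 4      # True for rows whose first entry is at least 4
second_col = A[rows, 1]  # second-column values of exactly those rows
print(second_col)
```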


6. Access the 4th row of A. If this returns an error message, use Google to investigate the problem and find out what went wrong.

In [None]:
A[3]
# Index out of bounds: A has only three rows (valid indices 0 to 2), so there is no 4th row

IndexError: index 3 is out of bounds for axis 0 with size 3

## Custom functions
For many statistical applications it is practical to standardize variable values. One way to standardize is *centering and scaling*. In simple words, we make the variables comparable by reducing them to the same scale.

Start by implementing a custom function. Your function should take an argument **x**. To keep things simple, we expect x to always be a numeric vector (and not text or a matrix, for example). In the body of the function, calculate the mean and standard deviation of **x**. Store the results in variables **mu** and **std**, respectively. Then, for each element in the vector, subtract the mean and divide by the standard deviation.
$$ x_{new} = \frac{x-\mu}{std}$$
Make sure your function **returns** the standardized vector (i.e., $x_{new}$ in the equation) as its result. You might want to import `NumPy` for calculating the mean and standard deviation.

In [2]:
def standardize(x):
    # Convert the input (e.g., a plain Python list) into a NumPy array
    np_x = np.array(x)
    print(np_x.dtype)
    # np.isnan only works on numeric arrays and raises a TypeError otherwise,
    # so it serves here as a simple test of whether the input is numeric
    try:
        np.isnan(np_x)
    except TypeError:
        # Stop here and report the problem instead of attempting the calculation
        return 'Error, vector is not numeric, it is of type {}'.format(np_x.dtype)

    mu = np.mean(np_x)
    std = np.std(np_x)
    print(mu)
    print(std)

    # NumPy arithmetic is applied element-wise, so no explicit loop is needed
    return (np_x - mu) / std


You should always test your functions. Create a vector **a** with the elements (-100, -25, -10, 0, 10, 25, 100) and check if your function produces the correct result.     

In [None]:
a = np.array([-100, -25, -10, 0, 10, 25, 100])
print(standardize(a))

int64
0.0
55.35599077142161
[-1.80648921 -0.4516223  -0.18064892  0.          0.18064892  0.4516223
  1.80648921]


In [5]:
b = np.array([1,2,3])
standardize(b)

int64
2.0
0.816496580927726


array([-1.22474487,  0.        ,  1.22474487])

*Optional*: Create a vector **b** with elements ("1", "2", "3") and check the function. Let's include a simple check in the function and give feedback. Before doing any calculations, use an `if` statement and `type()` to check whether the input is a numeric vector. There are many ways to code the condition *x is numeric* in Python. Run a quick web search and use a simple approach. If the input is not numeric, skip the computations and print the message "input not numeric".

In [None]:
c = ['ssdas', '2', 2, 10.5]
standardize(c)

<U5


'Error, vector is not numeric, it is of type <U5'
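
The task text asks for an explicit `if` check and the message "input not numeric". One possible way to code the condition, shown here as a sketch rather than the reference solution, is to inspect the dtype of the converted array with `np.issubdtype`. The names `is_numeric_vector` and `standardize_checked` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def is_numeric_vector(x):
    """Hypothetical helper: True if x converts to a numeric NumPy array."""
    return np.issubdtype(np.asarray(x).dtype, np.number)

def standardize_checked(x):
    """Hypothetical variant of standardize with an explicit input check."""
    if not is_numeric_vector(x):
        print("input not numeric")
        return None
    values = np.asarray(x, dtype=float)
    return (values - values.mean()) / values.std()

b = ["1", "2", "3"]
result_text = standardize_checked(b)         # prints the message, returns None
result_num = standardize_checked([1, 2, 3])  # numeric input is standardized
print(result_num)
```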

## Data structures 
Say you want to keep track of the members of the four houses of the famous Hogwarts School of Witchcraft and Wizardry. What might be a suitable data structure? We create a dictionary named **hogwarts** and use the names of the houses as keys. Then, the values associated with those keys could be any type that supports storing a set of strings, i.e., the names of the members. Draw on your knowledge of Python dictionaries and lists to implement such a data structure. Populate the dictionary with the following data, and feel free to add more characters if you wish.

- Gryffindor: I'm sure you know many members of that house
- Hufflepuff: notable members include Newt Scamander, Cedric Diggory, and Nymphadora Tonks
- Ravenclaw: here we've got, e.g., Luna Lovegood, Cho Chang, and Filius Flitwick
- Slytherin: Draco Malfoy, Vincent Crabbe, Gregory Goyle, and of course the one that must not be named

In [None]:
hogwarts = {
    'gryffindor': ["Harry Potter", "Hermione Granger", "Ron Weasley"],
    'hufflepuff': ["Newt Scamander", "Cedric Diggory", "Nymphadora Tonks"],
    'ravenclaw': ["Luna Lovegood", "Cho Chang", "Filius Flitwick"],
    'slytherin': ["Draco Malfoy", "Vincent Crabbe", "Gregory Goyle", "the one that must not be named"],
}
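
To illustrate how such a dictionary of lists is used, the sketch below works on a deliberately small stand-in dictionary (not the full data above): it looks up one house, appends a member, and counts members per house. The name `counts` is illustrative.

```python
# Small stand-in for the full dictionary, for illustration only
hogwarts = {
    "gryffindor": ["Harry Potter"],
    "slytherin": ["Draco Malfoy", "Vincent Crabbe"],
}

# Look up the members of one house by its key
print(hogwarts["slytherin"])

# The values are lists, so a new member can be appended in place
hogwarts["gryffindor"].append("Neville Longbottom")

# Iterate over houses and count their members
counts = {house: len(members) for house, members in hogwarts.items()}
print(counts)
```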

Dictionaries are really useful. Still, the data structure above is limited. We can only store the name of a witch or wizard. Wouldn't it be cool to be able to store more information, something like her/his favored charm, best friend, pet, etc.?

Think about how we could realize this functionality. Well, we could create yet another dictionary in which we use a witch's/wizard's name as key and as value some other data structure in which we can store all the details we like. To our knowledge, the names of witches / wizards are unique in the Harry Potter universe, so that names could serve as (unique) keys; nerd alert. Still, a dictionary of dictionaries sounds pretty complicated. In fact, the task we described above is a perfect use case for custom classes. They allow us to store any piece of information about a person in one place.

Create a custom class wizard that facilitates storing the following properties:
- First name
- Last name
- Pet
- Pet name
- Patronus shape


Also implement a method `tell_pet()` that prints an output of the following format:
*"Harry Potter's owl is called Hedwig."*

Implement one more method `expecto_patronum()`. Calling that method for Harry would produce the output (print):
*"A stag appears."*

In [None]:
class wizard():
    def __init__(self, first_name, last_name, pet, pet_name, patronus_shape):
        self.first_name = first_name
        self.last_name = last_name
        self.pet = pet
        self.pet_name = pet_name
        self.patronus_shape = patronus_shape
    def tell_pet(self):
        # Prints, e.g., "Harry Potter's owl is called Hedwig."
        print("{} {}'s {} is called {}.".format(self.first_name, self.last_name, self.pet, self.pet_name))
    def expecto_patronum(self):
        # Prints, e.g., "A stag appears." for a patronus shape of "stag"
        print("A {} appears.".format(self.patronus_shape))

Update your dictionary with houses and their members. Instead of storing a list of names as values, your new dictionary should store a list of instances of the class wizard. Note that you need to create these instances first. So you need to create an instance of class wizard for Harry, another one for Ron, Malfoy, etc. In case your knowledge of the Potter universe is a bit shaky, just invent the data you need. Just in case, [here is a little refresher of the expecto patronum spell](https://www.insider.com/harry-potter-characters-patronus-2018-11).

In [None]:
harry = wizard("Harry", "Potter", "owl", "Hedwig", "stag")
hermione = wizard("Hermione", "??", "??", "??", "??")
print(harry.first_name)

hogwarts = {'gryffindor': [harry, hermione]}
print(hogwarts)

Harry
{'gryffindor': [<__main__.wizard object at 0x7fa6ba573580>, <__main__.wizard object at 0x7fa6ba573310>]}
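
Printing the dictionary shows only memory addresses because the class does not define how its instances should be displayed. One common remedy is a `__repr__` method. The sketch below uses a reduced, hypothetical class `Wizard` with only the name attributes, purely to illustrate the idea; it is not part of the exercise solution.

```python
class Wizard:
    """Hypothetical, reduced variant of the wizard class with a readable repr."""

    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    def __repr__(self):
        # Used when the object is displayed inside a list or dictionary
        return "Wizard({!r}, {!r})".format(self.first_name, self.last_name)

harry = Wizard("Harry", "Potter")
hogwarts = {"gryffindor": [harry]}
text = str(hogwarts)
print(text)
```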


## Well done!!!