# Exercise 1: Type conversions

In the lecture, we discussed the basic built-in data types: integers, floating-point numbers, booleans, and strings. Python allows us to convert one type to another using the following functions:

- [`int()`](https://docs.python.org/3/library/functions.html#int) converts its argument to an integer.
- [`float()`](https://docs.python.org/3/library/functions.html#float) converts its argument to a floating-point number. 
- [`bool()`](https://docs.python.org/3/library/functions.html#bool) converts its argument to a boolean.
- [`str()`](https://docs.python.org/3/library/stdtypes.html#str) converts its argument to a string.

These conversions mostly work in an intuitive fashion, with some exceptions.

1.  Define a string variable `s` with the value `'1.1'`. Convert this variable to an integer, a float, and a boolean.
2.  Define the string variables `s1`, `s2`, and `s3` with values `'True'`, `'False'`, and `''` (empty string), respectively. Convert each of these to a boolean. Can you guess the conversion rule?
3.  Define a floating-point variable `x` with the value `0.9`. Convert this variable to an integer, a boolean, and a string.
4.  Define the integer variables `i1` and `i2` with values `0` and `2`, respectively. Convert each of these variables to a boolean.
5.  Define the boolean variables `b1` and `b2` with values `True` and `False`, respectively. Convert each of them to an integer.
6.  NumPy arrays cannot be converted using `int()`, `float()`, etc. Instead, we have to use the 
    method [`astype()`](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.astype.html) 
    and pass the desired data type (e.g., `int`, `float`, `bool`) as an argument.
    Create a NumPy array called `arr` with elements `[0.0, 0.5, 1.0]` and convert it to an integer
    and a boolean type.


In [15]:
import numpy as np

### Solution to exercise 1


In [6]:
# 1
s = '1'
s_int = int(s)
s_float = float(s)
s_bool = bool(s)

print(s, s_int, s_float, s_bool)

t = '1.0'
# t_int = int(t)  # Raises ValueError: int() only parses strings that look like integers, and '1.0' contains a decimal point
t_float = float(t)
t_bool = bool(t)

print(t, t_float, t_bool)

1 1 1.0 True
1.0 1.0 True
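
The exercise text asks for the value `'1.1'`, which behaves the same way as `'1.0'` above. The following is a minimal sketch of that case, assuming we want the integer part of the number; the `try`/`except` block and the name `s_int_via_float` are only there for illustration.

```python
s = '1.1'

# int() only parses strings that look like integer literals, so this fails
try:
    s_int = int(s)
except ValueError as err:
    s_int = None
    print(f'int({s!r}) failed: {err}')

s_float = float(s)               # parses the decimal number
s_bool = bool(s)                 # True, since the string is non-empty
s_int_via_float = int(s_float)   # convert to float first, then truncate

print(s_float, s_bool, s_int_via_float)
```

Going through `float()` first discards the fractional part, so this is only appropriate when truncation is acceptable.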


In [9]:
# 2
s1 = 'True'
s2 = 'False'
s3 = ''

s1_bool = bool(s1)
s2_bool = bool(s2)
s3_bool = bool(s3)

print(s1_bool, s2_bool, s3_bool)

# Conversion rule: any non-empty string converts to True (even 'False'); only the empty string converts to False

# Sanity check for the conversion rule: a single space is still a non-empty string
s4 = ' '
s4_bool = bool(s4)
s4_bool

True True False


True
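
Because `bool()` only checks whether a string is empty, it cannot be used to read the words `'True'` and `'False'`. One possible workaround is sketched below; `parse_bool` is a hypothetical helper written for this example, not a built-in function.

```python
def parse_bool(text):
    """Hypothetical helper: map the strings 'True'/'False' to booleans."""
    cleaned = text.strip().lower()
    if cleaned == 'true':
        return True
    if cleaned == 'false':
        return False
    raise ValueError(f'Cannot interpret {text!r} as a boolean')


print(bool('False'))        # True: the string is non-empty
print(parse_bool('False'))  # False: the content of the string is inspected
```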

In [21]:
# 3
x = .9
x_int = int(x)
x_bool = bool(x)
x_str = str(x)

print(f'x is {x}, as an integer {x_int}, as boolean {x_bool}, as string {x_str}')

x is 0.9, as an integer 0, as boolean True, as string 0.9
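
Note that `int()` does not round: it truncates toward zero. A short sketch comparing it with the built-in `round()`; the variable `y` is an extra value added for illustration.

```python
x = 0.9
y = -0.9

print(int(x), int(y))      # 0 0: int() truncates toward zero
print(round(x), round(y))  # 1 -1: round() goes to the nearest integer
print(bool(0.0), bool(y))  # False True: only zero converts to False
```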


In [14]:
# 4
i1 = 0
i2 = 2
i1_bool = bool(i1)
i2_bool = bool(i2)

print(f'0 becomes {i1_bool}, 2 becomes {i2_bool}')

0 becomes False, 2 becomes True


In [13]:
# 5
b1 = True
b2 = False
b1_int = int(b1)
b2_int = int(b2)

print(f'True becomes {b1_int}, False becomes {b2_int}')

True becomes 1, False becomes 0
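
The results of parts 4 and 5 fit together: in Python, `bool` is a subclass of `int`, so booleans can be used directly in arithmetic. A small sketch, where the list `flags` is made up for this example:

```python
flags = [True, False, True, True]

print(isinstance(True, int))  # True: bool is a subclass of int
print(sum(flags))             # 3: counts the True entries
print(True + True)            # 2
print(bool(-1))               # True: any non-zero integer converts to True
```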


In [16]:
# 6
arr = np.array([0.0, 0.5, 1.0])
arr_int = arr.astype(int)
arr_bool = arr.astype(bool)

print(f'{arr} becomes {arr_int} in integers, {arr_bool} in boolean')

[0.  0.5 1. ] becomes [0 0 1] in integers, [False  True  True] in boolean
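
As with `int()`, converting a floating-point array with `astype(int)` discards the fractional part. If rounding is intended, one option is to round first and convert afterwards. The sketch below uses different example values than the exercise to make the difference visible.

```python
import numpy as np

arr = np.array([0.4, 0.6, 1.7, -1.7])

arr_trunc = arr.astype(int)            # fractional parts are discarded
arr_round = np.round(arr).astype(int)  # round first, then convert

print(arr_trunc)
print(arr_round)
print(arr_trunc.dtype, arr.astype(bool).dtype)
```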


***
# Exercise 2: Working with strings

Strings in Python are full-fledged objects, i.e., they contain both the character data as well as additional functionality implemented via functions or so-called _methods_.
The official [documentation](https://docs.python.org/3/library/stdtypes.html#string-methods) 
provides a comprehensive list of these methods. For our purposes, the most important are:

- [`str.lower()`](https://docs.python.org/3/library/stdtypes.html#str.lower) 
    and 
  [`str.upper()`](https://docs.python.org/3/library/stdtypes.html#str.upper)
  convert the string to lower or upper case, respectively.
- [`str.strip()`](https://docs.python.org/3/library/stdtypes.html#str.strip) 
  removes any leading or trailing whitespace characters from a string.
- [`str.count()`](https://docs.python.org/3/library/stdtypes.html#str.count)
  returns the number of occurrences of a substring within a string.
- [`str.startswith()`](https://docs.python.org/3/library/stdtypes.html#str.startswith) 
    and 
  [`str.endswith()`](https://docs.python.org/3/library/stdtypes.html#str.endswith) 
  check whether a string starts or ends with a given substring.

Moreover, strings are also sequences and as such support indexing in the 
same way as lists or tuples.
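
As a quick illustration of the methods listed above, here is a sketch using an unrelated example string (the variable names `word` and `clean` are chosen for this example only):

```python
word = '  Hello, World  '

clean = word.strip()                                   # 'Hello, World'
print(clean.lower(), clean.upper())
print(clean.count('l'))                                # 3
print(clean.startswith('Hello'), clean.endswith('!'))  # True False
print(clean[0], clean[-1], clean[0:5])                 # H d Hello
```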

Create a string variable with the value 
```python
s = '  NHH Norwegian School of Economics  '
```
and perform the following tasks:

1. Strip the surrounding spaces from the string using `strip()`.
2. Count the number of `'H'` in the string.
3. Modify your code so that it is case-insensitive, i.e., both instances of 
   `'h'` and `'H'` are counted.
4. Reverse the string, i.e., the last character should come first, and so on.
5. Create a new string which contains every 2nd letter from the original.
6. Select the last character from this new string using at least two different methods.


### Solution to exercise 2

In [47]:
# 1
s = '  NHH Norwegian School of Economics  '
s = s.strip()
print(f'#1: {s}')

# 2
n_H = s.count('H')
print(f'#2: Number of H is {n_H}')

# 3
n_Hh = s.lower().count('h')
print(f'#3: Number of H, case-insensitive, is {n_Hh}')

# 4 (A slice with both bounds omitted and a step of -1 walks through the whole string backwards)
s_rev = s[::-1]
print(f'#4: S reversed is: {s_rev}')

# 5 (Every second character; spaces are counted as characters here)
s_2 = s[::2]
print(f'#5: Every second letter of s: {s_2}')

# 6
print(f'#6: Last character is {s_2[len(s_2)-1]} or {s_2[-1]}')

#1: NHH Norwegian School of Economics
#2: Number of H is 2
#3: Number of H, case-insensitive, is 3
#4: S reversed is: scimonocE fo loohcS naigewroN HHN
#5: Every second letter of s: NHNreinSho fEoois
#6: Last character is s or s
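
The solution to part 5 treats spaces as ordinary characters. If "letter" is read literally, one possible variant removes the spaces first. The sketch below shows this, together with an alternative way to reverse a string; the names `letters_only` and `every_second_letter` are introduced for this example only.

```python
s = 'NHH Norwegian School of Economics'

# Variant of task 5: remove spaces first, then take every second letter
letters_only = s.replace(' ', '')
every_second_letter = letters_only[::2]

# Alternative to slicing for task 4: reversed() combined with str.join()
s_rev = ''.join(reversed(s))

print(every_second_letter)
print(s_rev == s[::-1])
```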


***
# Exercise 3: Summing lists and arrays

In this exercise, we investigate an additional difference between built-in lists and NumPy arrays: performance.
You are asked to investigate performance differences for different implementations of the `sum()` function.

1. Create a list `lst` and a NumPy array `arr`, each of them containing the sequence 
   of ten values `0, 1, 2, ..., 9`.

   *Hint*: You can use the list constructor [`list()`](https://www.w3schools.com/python/ref_func_list.asp)
   and combine it with the [`range()`](https://docs.python.org/3/library/functions.html#func-range)
   function which returns an object representing a range of integers.

   *Hint:* You should create the NumPy array using 
   [`np.arange()`](https://numpy.org/doc/stable/reference/generated/numpy.arange.html).

2. We want to compute the sum of integers contained in `lst` and `arr`. Use 
   the built-in function [`sum()`](https://www.w3schools.com/python/ref_func_sum.asp)
   to sum elements of a list.
   For the NumPy array, use the NumPy function 
   [`np.sum()`](https://numpy.org/doc/stable/reference/generated/numpy.sum.html).

3. You are interested in benchmarking which summing function is faster.
    Repeat the steps from above, but use the line magic
    [`%timeit`](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-timeit)
    to time the execution of a statement as follows:

    ```python
    %timeit statement
    ```

4.  Recreate the list and array to contain 100 integers starting from 0,
    and rerun the benchmark.

5.  Recreate the list and array to contain 10,000 integers starting from 0,
    and rerun the benchmark.


What do you conclude about the relative performance of built-in lists 
vs. NumPy arrays?

### Solution to exercise 3

In [None]:
# 1
lst = list(range(10))
arr = np.arange(10)
print(f'#1: lst: {lst}, arr: {arr}')

# 2
print(f'#2: Sum list: {sum(lst)}, arr: {np.sum(arr)}')

# 3
%timeit sum(lst)
%timeit np.sum(arr)


#1: lst: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], arr: [0 1 2 3 4 5 6 7 8 9]
#2: Sum list: 45, arr: 45
65 ns ± 0.473 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
1.22 μs ± 9.58 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [44]:
# 4
n = 100

lst = list(range(n))
arr = np.arange(n)

%timeit sum(lst)
%timeit np.sum(arr)

340 ns ± 2.29 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
1.25 μs ± 13.7 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [46]:

# 5
n = 10_000

lst = list(range(n))
arr = np.arange(n)

%timeit sum(lst)
%timeit np.sum(arr)

27.9 μs ± 767 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
2.1 μs ± 16.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


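For very small inputs, the built-in `sum()` is faster than `np.sum()` because it avoids NumPy's fixed call overhead: with 10 elements it takes about 65 ns compared to 1.22 μs. With 100 elements the built-in function is still ahead (340 ns vs. 1.25 μs), but its run time grows roughly in proportion to the number of elements, while the run time of `np.sum()` barely changes. With 10,000 elements, `np.sum()` is clearly faster (2.1 μs vs. 27.9 μs), i.e., it scales much better.
The built-in `sum()` processes the list one Python object at a time, which makes it relatively slow per element, whereas `np.sum()` runs a compiled loop over the array's data.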
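
The `%timeit` magic only works in IPython and Jupyter. Outside a notebook, a similar comparison can be sketched with the standard-library `timeit` module. The number of repetitions (`repeats`) is an arbitrary choice for this example, and the absolute timings will differ from those above since the `lambda` wrapper adds some overhead.

```python
import timeit

import numpy as np

for n in (10, 100, 10_000):
    lst = list(range(n))
    arr = np.arange(n)

    # Both approaches must agree on the result before we compare speed
    assert sum(lst) == np.sum(arr)

    # timeit() returns the total time in seconds for `repeats` calls;
    # convert it to microseconds per call
    repeats = 1_000
    t_list = timeit.timeit(lambda: sum(lst), number=repeats) / repeats * 1e6
    t_arr = timeit.timeit(lambda: np.sum(arr), number=repeats) / repeats * 1e6

    print(f'n={n:>6}: sum(lst) {t_list:8.3f} µs, np.sum(arr) {t_arr:8.3f} µs')
```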