# Containers in Python and numpy Arrays

---

This notebook shows the basic concept of containers in Python and gives an introduction to 1d `numpy` arrays as a good example of a container type.

---

## 1. Container types in Python

Besides the basic scalar types, Python provides another class of types, the *container* types. They include strings, lists, tuples, dictionaries and sets. `numpy`-arrays, which we will show later, are containers too, although they come from the `numpy` library rather than from Python itself.

All container types share similar features and operations. In principle, a *container* is a collection of elements, and in most containers each element has a unique identifier, e.g. a number that can be used as an index to a certain element. All container types have a so-called *length*, which you obtain with the built-in function `len` and which gives you the number of elements stored in the container. However, the different containers have special features:

 * `numpy`-arrays: have indices, homogeneous element types
 * strings: have indices, are immutable
 * Python lists: have indices, inhomogeneous element types allowed
 * Python tuples: similar to lists, but immutable
 * dictionaries: have names (keys) as indices
 * sets: unordered, no indices, no duplicate elements possible
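
As a small illustration of the list above, the following sketch creates one example of each container type and applies `len` to it. The variable name `containers` and the example values are chosen only for this illustration.

```python
import numpy as np

# One small example of each container type named in the list above.
containers = {
    "numpy array": np.array([1.0, 2.0, 3.0]),
    "string": "abc",
    "list": [1, "two", 3.0],
    "tuple": (1, "two", 3.0),
    "dictionary": {"a": 1, "b": 2, "c": 3},
    "set": {1, 2, 2, 3},        # the duplicate 2 is stored only once
}

# len() works the same way for every container
for name, c in containers.items():
    print(name, len(c))

# tuples (and strings) are immutable: item assignment raises a TypeError
try:
    containers["tuple"][0] = 99
except TypeError as err:
    print("TypeError:", err)
```

Note that the set reports three elements although four values were written down, because the duplicate is dropped.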

---

## 2. `numpy`-Arrays

As said, a `numpy`-array is a container type whose elements all have the same scalar (mostly numerical) type. A list of types can be found [here](https://numpy.org/doc/stable/user/basics.types.html). The basic idea behind `numpy`-arrays is to provide very fast vectorized operations for numerical calculations.

**Note:** In contrast to Python's built-in integer type, which can hold arbitrarily large numbers, the `numpy`-integers are limited to fixed bit sizes (64bit, 32bit, etc.).
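
A minimal sketch of what this size limit means in practice. It uses `np.iinfo` to query the range of an integer type and deliberately picks the small `np.int8` type so that the limit is easy to reach; the variable names are only illustrative.

```python
import numpy as np

# Python's built-in int grows as needed
big = 2**100
print(big)

# numpy integers have a fixed range that depends on the bit size
print(np.iinfo(np.int8).min, np.iinfo(np.int8).max)
print(np.iinfo(np.int64).max)

# leaving the range inside an array wraps around silently
small = np.array([127], dtype=np.int8)
one = np.array([1], dtype=np.int8)
wrapped = small + one
print(wrapped)
```

The last line shows the trap: the sum does not become `128` but wraps around to the smallest `int8` value, and no error is raised.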

To handle `numpy`-arrays we need to talk about the so-called *metadata* of an array, which mainly describes the array's data structure:

In [2]:
import numpy as np

# numpy-array creation from a list of numbers:
a = np.array([1.0, 2.0, 3.0, 4.0]) # np.array is a type-conversion function
print(type(a))     # the type is numpy-array
print(a.dtype)     # the data-type object.
print(a.ndim)      # number of array dimensions
print(a.shape)     # shape of an array (important mainly for multi-dimensional arrays)
print(len(a))      # 'length of the array'. This corresponds to the number of
                   # array elements for a one-dimensional array!

<class 'numpy.ndarray'>
float64
1
(4,)
4


---

## 2. Creation of arrays

Here are a few manual creation methods for `numpy`-arrays:

In [7]:
a = np.array([1,2,3,4])   # automatic guessing of the number type
print(a, a.dtype)

b = np.array([1,2,3,4], dtype=np.float64)
print(b, b.dtype)

[1 2 3 4] int64
[1. 2. 3. 4.] float64
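
To make the automatic guessing of the number type a bit more concrete, here is a short sketch. The names `c` and `d` are only used for this illustration.

```python
import numpy as np

c = np.array([1, 2, 3.5])                   # one float makes the whole array float64
d = np.array([1, 2, 3]).astype(np.float64)  # explicit conversion of an existing array
print(c, c.dtype)
print(d, d.dtype)
```

Because all elements of an array share one type, a single float in the input list is enough to turn every element into a float.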


Creating an array of numbers between limits:

In [33]:
a = np.arange(0,20,2)   # the interval is open on the right side!

print(a, a.dtype)

# or

b = np.linspace(0,10,11, dtype=np.int32)  # the interval is closed!
print(b, b.dtype)

[ 0  2  4  6  8 10 12 14 16 18] int64
[ 0  1  2  3  4  5  6  7  8  9 10] int32
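
The difference between the two functions can be seen most easily when both cover the same interval. The following sketch uses the interval from 0 to 1 as an arbitrary example.

```python
import numpy as np

r = np.arange(0.0, 1.0, 0.25)    # start, stop (excluded), step
s = np.linspace(0.0, 1.0, 5)     # start, stop (included), number of points
print(r)
print(s)
print(len(r), len(s))
```

`np.arange` takes a step size and leaves out the upper limit, while `np.linspace` takes the number of points and includes the upper limit.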


Creating an array with `0`s:

In [19]:
a = np.zeros(5, dtype=np.float64) # create 5 float zeros!
print(a, a.dtype)

[0. 0. 0. 0. 0.] float64


---

## 3. Accessing elements

`numpy`-array elements can be addressed by an index. The index always runs from `0` to `len(array)-1`. Indices outside this range raise an error.

Individual elements can be accessed with square brackets (similar to C):

In [22]:
a = np.arange(0,10,1)
print(a)
print(a[0])    # first element
print(a[1])    # second element
print(a[len(a)-1])  # last element, written the long way
print(a[-1])   # last element

[0 1 2 3 4 5 6 7 8 9]
0
1
9
9


**Note:** Negative indices count from the end of the array. Please always use this notation instead of `len(a)-1`!
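
What happens with an index outside the valid range can be sketched as follows. The `try`/`except` block is only there so that the cell keeps running after the error; the variable `caught` is an illustrative helper.

```python
import numpy as np

a = np.arange(0, 10, 1)

caught = None
try:
    a[10]                     # valid indices are 0..9 and -10..-1
except IndexError as err:
    caught = err
    print("IndexError:", err)

print(a[-10])                 # the most negative valid index is the first element
```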

You can also modify individual elements using the same notation on the left side of an assignment:

In [34]:
a = np.arange(0,10,1)
a[2] = a[-1] + a[1]   

print(a)

[ 0  1 10  3  4  5  6  7  8  9]


A special feature of the container types in Python is that you can address multiple elements of a container at the same time. This is called **slicing**:

In [35]:
a = np.arange(0,10,1)

print(a[1:3])     # the second and third element
print(a[1:2])     # the second element only
print(a[4:])      # everything, starting at the fifth element
print(a[:-1])     # everything except the last element
print(a[::2])     # every second element, starting with the first
print(a[-2:1:-1]) # reverse, from the second-to-last down to the third element
print(a[::-1])    # everything reverse

[1 2]
[1]
[4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8]
[0 2 4 6 8]
[8 7 6 5 4 3 2]
[9 8 7 6 5 4 3 2 1 0]


These are some basic **slicing** operations. You can find a more formal set of slicing rules [at the end of the notebook](#formal_slicing).

Sliced arrays can also be used on the left side of an assignment:

In [41]:
a = np.arange(0,10,1)

a[1:3] = 1                       # change multiple elements to one value
print(a)

# or

a[5:8] = np.array([100,200,300]) # change multiple elements with an array of the same length
print(a)

[0 1 1 3 4 5 6 7 8 9]
[  0   1   1   3   4 100 200 300   8   9]
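
If the array on the right side does not have the same length as the slice, the assignment fails. A minimal sketch, where the variable `error` only serves to keep the cell running:

```python
import numpy as np

a = np.arange(0, 10, 1)

error = None
try:
    a[5:8] = np.array([100, 200])   # three target elements, but only two values
except ValueError as err:
    error = err
    print("ValueError:", err)

print(a)   # the array is unchanged
```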


Container types also provide a very easy and fail-safe way to visit all elements:

In [None]:
a = np.arange(0,10,1)

for num in a:      # similar to bash in Linux/Unix
    print(num)

You can use the `for`-loop to get all elements `in` the container `a`. You don't need to manage a separate running variable, as you may know it from other programming languages or as you would need with a `while`-loop:

In [36]:
a = np.arange(0,10,1)

i = 0              # trap 1
while i < len(a):  # trap 2
    print(a[i])
    i = i + 1      # trap 3

0
1
2
3
4
5
6
7
8
9


Same results, but `3` possible points of failure ;-)
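
If you need the index in addition to the element, the built-in function `enumerate` gives you both without a hand-made counter. A short sketch, where the list `pairs` is only collected for illustration:

```python
import numpy as np

a = np.arange(0, 10, 2)

pairs = []
for i, num in enumerate(a):   # i is the index, num the element
    pairs.append((i, int(num)))
    print(i, num)
```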

---

## 4. Array operations

`numpy`-arrays provide some special vectorized operations, which make array operations very easy to use and very efficient:

In [1]:
import numpy as np

x = np.array([1,2,3,4])
y = np.array([5,6,7,8])

print(x+10)   # add a constant
print(x*10)   # multiply with constant
print(x+y)    # add two arrays with the same length
print(x/y)    # divide two arrays with the same length
print(x+2*y)  # combined operations

[11 12 13 14]
[10 20 30 40]
[ 6  8 10 12]
[0.2        0.33333333 0.42857143 0.5       ]
[11 14 17 20]


All operations are done element-wise, so if you use two or more arrays, all of them must have the same length!
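
A minimal sketch of what happens when the lengths do not match. The arrays are arbitrary examples, and the variable `error` only keeps the cell running.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([5, 6, 7])      # one element short

error = None
try:
    print(x + y)
except ValueError as err:
    error = err
    print("ValueError:", err)
```

Single numbers are the exception: as in `x + 10`, a scalar is combined with every element of the array.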

The `numpy`-library also provides numerical functions which can be used on numbers (as shown last week) or on arrays:

In [5]:
x = 2
y = np.array([1,2,3,4])
print(np.sqrt(x))
print(np.sqrt(y))

z = np.linspace(0.0, 2*np.pi, 50)
z2 = np.sin(z)

print(z2)
print(np.sum(z2))   # should be close to zero

1.4142135623730951
[1.         1.41421356 1.73205081 2.        ]
[ 0.00000000e+00  1.27877162e-01  2.53654584e-01  3.75267005e-01
  4.90717552e-01  5.98110530e-01  6.95682551e-01  7.81831482e-01
  8.55142763e-01  9.14412623e-01  9.58667853e-01  9.87181783e-01
  9.99486216e-01  9.95379113e-01  9.74927912e-01  9.38468422e-01
  8.86599306e-01  8.20172255e-01  7.40277997e-01  6.48228395e-01
  5.45534901e-01  4.33883739e-01  3.15108218e-01  1.91158629e-01
  6.40702200e-02 -6.40702200e-02 -1.91158629e-01 -3.15108218e-01
 -4.33883739e-01 -5.45534901e-01 -6.48228395e-01 -7.40277997e-01
 -8.20172255e-01 -8.86599306e-01 -9.38468422e-01 -9.74927912e-01
 -9.95379113e-01 -9.99486216e-01 -9.87181783e-01 -9.58667853e-01
 -9.14412623e-01 -8.55142763e-01 -7.81831482e-01 -6.95682551e-01
 -5.98110530e-01 -4.90717552e-01 -3.75267005e-01 -2.53654584e-01
 -1.27877162e-01 -2.44929360e-16]
-3.0044051106072847e-16


You will sometimes see two different ways of doing the same thing, e.g. to sum all array elements:

In [6]:
a = np.array([1,2,3,4])

print(np.sum(a))

# or

print(a.sum())

10
10


The first one is less confusing than the second one, and not every `np.`-function is also implemented as a method of the array. However, you can use an array itself in a code cell to find out which functions are implemented:

In [7]:
arr = np.array([1,2,3,4])
# to look for implemented functions, write the name of the defined array followed by a . and then press TAB
#arr.

**Important**: Mathematical operations on arrays, such as `x + y` or `np.sqrt(x)`, create a new array as a result!
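
This can be checked directly. The sketch below uses `np.shares_memory` to test whether two arrays use the same data; the arrays themselves are arbitrary examples.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = x * 2               # the result is stored in a new array

print(x)                # x itself is unchanged
print(y)
print(y is x)
print(np.shares_memory(x, y))
```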

---

## 5. Example for using `numpy`

After the introduction of the `numpy` module and its arrays, I want to show a nice example that uses vectorized operations to calculate the derivative of a function numerically.

The derivative of a function $f(x)$ is defined as:

$$ f'(x) = \lim_{h \rightarrow 0} \frac{f(x+h)-f(x)}{h}$$

If the function is given at discrete values $f = f_1, f_2, f_3, \ldots, f_n$, which all have the same distance $h$, then the derivative can be approximated by:

$$ f' = f'_1, f'_2, f'_3, \ldots, f'_{n-1} $$ with

$$ f'_i = \frac{f_{i+1} - f_i}{h} $$

If one chooses a very small $h$, the numerical result approaches the true derivative.
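
The effect of the step size can be sketched with a function whose derivative is known. The example below assumes $f(x) = \sin(x)$ at the arbitrarily chosen point $x_0 = 1$ and compares the forward difference quotient with the exact value $\cos(x_0)$.

```python
import numpy as np

x0 = 1.0
exact = np.cos(x0)                 # analytic derivative of sin at x0

errors = []
for h in [1e-1, 1e-2, 1e-3, 1e-4]:
    approx = (np.sin(x0 + h) - np.sin(x0)) / h   # forward difference quotient
    errors.append(abs(approx - exact))
    print(h, approx, errors[-1])
```

Each time $h$ shrinks by a factor of ten, the error shrinks by roughly the same factor.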

### How to calculate the numerical derivative?

The simplest idea is to calculate each element of $f'$ one by one, but `numpy` arrays provide a better solution.

Let's have a look at the array $f = f_1, f_2, f_3,\ldots$:

<center> <img src="figs/elements.png" style="" /> </center>

Now use the **same** array again, but write its elements in a second row shifted by one position to the left, so that every element sits directly below its predecessor:

<center> <img src="figs/diff_elements.png" /> </center>

Now we have $n-1$ pairs of elements. Subtracting the upper row from the lower one pair by pair (and dividing by $h$) gives exactly the numerical derivative.

In Python we can see this method in a full example:

In [48]:
%matplotlib inline

import numpy as np

import matplotlib.pyplot as plt



x = np.linspace(0.,10,1000)

#f = np.sin(x)
#f = 1/(x+1)
f = x**3-5*x**2+1

# calculate the derivative
f_prime = (f[1:]-f[:-1])/(x[1]-x[0])

#print(f.shape)
#print(f_prime.shape)

# adjust the x array, calculate the value of the middle
x_prime = (x[1:] + x[:-1])/2
#print(x_prime.shape)


# plot the results

#fig, ax = plt.subplots()
#ax.plot(x, f, label='f(x)')
#ax.plot(x_prime, f_prime, label='f\'(x)')
#ax.legend(loc='upper right')
#ax.set_xlabel('x')
#ax.set_ylabel('y')
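
As a hedged cross-check of the cell above, the following sketch repeats the calculation, compares it with `np.diff` (which computes the same difference of neighbouring elements) and with the analytic derivative $3x^2 - 10x$ of the chosen polynomial. The names `f_prime_diff`, `exact` and `max_error` are only used for this illustration.

```python
import numpy as np

x = np.linspace(0., 10, 1000)
f = x**3 - 5*x**2 + 1

f_prime = (f[1:] - f[:-1]) / (x[1] - x[0])   # same formula as in the cell above
x_prime = (x[1:] + x[:-1]) / 2

# np.diff(f) computes f[1:] - f[:-1]
f_prime_diff = np.diff(f) / np.diff(x)

exact = 3*x_prime**2 - 10*x_prime            # analytic derivative at the midpoints
max_error = np.max(np.abs(f_prime - exact))
print(f_prime.shape, x_prime.shape)
print(max_error)
```

The derivative array has one element less than `f`, which is why the `x` values are moved to the midpoints between neighbouring points.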

---

## 6. Traps with arrays

There are some traps with arrays:

In [51]:
# values in variable a should be copied into b
a = 1234
b = a

# change variable a
a = 9876

print(a, b)

9876 1234


but:

In [52]:
import numpy as np

# values in variable a should be copied into b
a = np.array([1,2,3,4])
b = a

# change a value in variable a
a[0] = 100

print(a, b)

[100   2   3   4] [100   2   3   4]


**Note**: `b=a` does not copy the array; `b` is just a second name for the same array, so a change made through `a` is also visible through `b`!

In [None]:
import numpy as np

# values in variable a should be copied into b
a = np.array([1,2,3,4])

# make a real copy
b = a.copy()

# arithmetic also creates a new array, but a * 1. converts the values to float
#b = a * 1.

# change a value in variable a
a[0] = 100

print(a, b)
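
Whether two names refer to the same data can be checked instead of guessed. The sketch below uses the `is` operator and `np.shares_memory`; it also shows that a slice of a `numpy`-array shares its data with the original array. The names `c` and `s` are only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = a            # second name for the same array
c = a.copy()     # independent copy
s = a[1:3]       # a slice is a view on the data of a

a[1] = 100
print(b is a, np.shares_memory(a, b))
print(c is a, np.shares_memory(a, c))
print(np.shares_memory(a, s))
print(b, c, s)
```

Only the copy `c` keeps the old values; `b` and the slice `s` both show the change made through `a`.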

<a id='formal_slicing'></a>
# Appendix: Slicing Rules

You absolutely need to master the `Python` slicing rules. Besides `numpy`-arrays, they are essential for many other Python containers such as lists or strings.

Many students have difficulties performing or understanding certain slicing operations. I therefore give a *formal* summary of the slicing rules in this appendix.

The following applies to a large number of `Python` containers such as lists, strings, tuples and `numpy`-arrays. We just talk about *arrays* for all these container types, and we use the following `numpy`-array `x` as a concrete example.

In [None]:
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

- An individual element $i$ is accessed with the syntax `x[i]`. $i$ can take the non-negative values $i\in [0, n-1]$, where $n$ is the number of elements in the array. `x[0]` accesses the first and `x[n-1]` the last element of the array. If $i$ is negative, $i\in [-n, -1]$, the element `x[n+i]` is accessed, so `x[-1]` is the last element.

In [None]:
# examples for single element array access:
print(x[1], x[-1], x[3])
print(x[10])  # invalid index - python raises an error
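
The rule for negative indices can be verified for every valid value. A minimal sketch, where the list `matches` is only collected for illustration:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
n = len(x)

matches = []
for i in range(-n, 0):          # all valid negative indices -n, ..., -1
    matches.append(x[i] == x[n + i])
print(all(matches))
print(x[-1], x[n - 1])
```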

- To access multiple array elements simultaneously and to work on a *subarray*, we need to use an array slice. The basic slice syntax is `x[i:j:k]`. $i$ is the starting index, $j$ is the stopping index, and $k$ is the step $(k\neq0)$. This selects the $m$ elements with index values $i, i + k, \dots, i + (m - 1) k$, where $m = q + (r\neq0)$, i.e. $q$ plus one if $r$ is nonzero, and $q$ and $r$ are the quotient and remainder obtained by dividing $j - i$ by $k$: $j - i = q k + r$, so that $i + (m - 1) k < j$ for a positive step $k$.

**Note: Slicing operations always include the starting index $i$ BUT exclude the stopping index $j$!**

In [None]:
print(x[1:7:2])
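
The formula for the number of selected elements $m$ can be tried out with a small helper. `slice_length` is a hypothetical function written only for this sketch; it assumes that $i$ and $j$ are non-negative indices inside the array and that the slice is not empty.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

def slice_length(i, j, k):
    """Number of elements m selected by x[i:j:k] (hypothetical helper).

    Assumes i and j are non-negative indices inside the array and that
    the slice is not empty.
    """
    q, r = divmod(j - i, k)     # j - i = q*k + r
    return q + (r != 0)         # one extra element if the remainder is nonzero

results = []
for i, j, k in [(1, 7, 2), (1, 8, 2), (0, 10, 3), (7, 3, -1), (8, 1, -2)]:
    m = slice_length(i, j, k)
    results.append((m, len(x[i:j:k])))
    print((i, j, k), x[i:j:k], m)
```

For each triple the computed $m$ agrees with the length of the slice that `numpy` returns.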

- Negative $i$ and $j$ are interpreted as $n + i$ and $n + j$ where $n$ is the number of elements in the array. Negative $k$ makes stepping go towards smaller indices.

In [None]:
print(x[-2:10], x[-3:3:-1])

- Assume $n$ is the number of elements in the array. Then, if $i$ is not given it defaults to $0$ for $k > 0$ and $n - 1$ for $k < 0$. If $j$ is not given it defaults to $n$ for $k > 0$ and $-n-1$ for $k < 0$. If $k$ is not given it defaults to $1$. Note that `::` is the same as `:` and means select all elements.

In [None]:
print(x[5:], x[::-2])
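
The defaults can be made visible by writing them out explicitly. The sketch below also shows why the missing stop index of a reversed slice cannot simply be written as `-1`: an explicit `-1` always means the last element.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
n = len(x)

print(x[::-1])              # defaults filled in automatically
print(x[n-1:-n-1:-1])       # the same slice with explicit start and stop
print(x[n-1:-1:-1])         # empty: -1 is read as n - 1, the last element
print(x[0:n:1])             # explicit form of x[:] for a positive step
```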

- *Remark:* A slicing operation `x[i:j:k]` *always* returns a *subarray*, while accessing a single element with `x[i]` returns a single object of the element type. Note carefully the output of the following example:

In [None]:
print(x[3])   # accessing the fourth element
print(x[3:4]) # accessing a *subarray* containing only the fourth element!