# Numpy Arrays - Introduction
Numpy arrays are Python's container type specialised for multi-dimensional numerical calculations and data visualization. They are provided by the `numpy` module.

- They form an array of *homogeneous* (typically numerical) data
- A list of numerical types available for numpy-arrays is [here](http://docs.scipy.org/doc/numpy/user/basics.types.html)
- They consist of the data-array and information on the structure of the array (metadata on memory layout)
- Besides the array-type the `numpy`-module contains *vectorized* functions allowing *very fast* manipulation of `numpy`-arrays.

In [None]:
import numpy as np

# numpy-array creation from a list:
a = np.array([1.0, 2.0, 3.0, 4.0]) # np.array is a type-conversion function
print(type(a))     # the type is numpy-array
print(a.dtype)     # the data-type object.
print(a.ndim)      # number of array dimensions
print(a.shape)     # shape of an array (interesting mainly for multi-dimensional arrays)
print(a.strides)   # The number of bytes from one element to the next
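
The metadata become more interesting for multi-dimensional arrays. The following is a minimal sketch (the names `m` and `t` are only illustrative) showing how `shape` and `strides` describe the memory layout of a two-dimensional array, and that a transposed array reuses the same data with swapped metadata:

```python
import numpy as np

m = np.zeros((3, 4), dtype=np.float64)  # 3 rows, 4 columns, 8 bytes per element
print(m.shape)     # (3, 4)
print(m.itemsize)  # 8
print(m.strides)   # (32, 8): 32 bytes to the next row, 8 bytes to the next column

t = m.T            # the transpose shares the data, only the metadata change
print(t.shape)     # (4, 3)
print(t.strides)   # (8, 32)
```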

## Remark on data-types of arrays
Especially when you load data from external sources and you want to save them again, it is important to *exactly* control the type of your data!

After a command such as
   ```
   a = np.array([1.0, 2.0, 3.0, 4.0])
   ```
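the *type* of `a` is `np.float64` on all common platforms, because Python floats are always double precision (it does not switch to `np.float32` on 32-bit machines).
What *can* be machine dependent is the default *integer* type: `np.array([1, 2, 3, 4])` may be `np.int32` or `np.int64`, depending on the platform and the `numpy` version!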

If the type is important, you should specify it explicitly with `dtype` such as
   ```
   # single precision (4-byte) float numbers
   a_single = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)
   
   # double precision (8-byte) float numbers
   a_double = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float64)
   ```
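
To check which type an array actually has, you can inspect `dtype` and `itemsize`. The sketch below also uses `astype` to convert an existing array; the name `a_converted` is only illustrative:

```python
import numpy as np

a_single = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)
a_double = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float64)

print(a_single.dtype, a_single.itemsize)  # float32 4
print(a_double.dtype, a_double.itemsize)  # float64 8

# astype returns a converted copy; the original array keeps its type
a_converted = a_double.astype(np.float32)
print(a_converted.dtype)                  # float32
print(a_double.dtype)                     # float64
```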


## Basic Array creation

In [None]:
import numpy as np

# there are many possibilities to create a numpy-array
a = np.array([1,2,3,4])       # conversion of a numerical list to a numpy-array
b = np.arange(0.0, 1.0, 0.1)  # array between two limits with a given distance
                              # between array elements. The array is a half-open
                              # interval (the upper limit is not included)!
print(b)
c = np.linspace(0.0, 1.0, 10) # array between two limits with a given number of
                              # array elements. Both limits are contained in the
                              # array
print(c)
d = np.zeros(10)              # array of 10 elements with 0 
print(d)
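
The difference between `np.arange` and `np.linspace` is easy to overlook. As a small sketch (the name `e` is only illustrative): with `endpoint=False`, `np.linspace` also produces a half-open interval and reproduces the `np.arange` result from above. For non-integer steps the `numpy` documentation generally recommends `np.linspace`, because the number of elements is then given explicitly and does not depend on rounding.

```python
import numpy as np

b = np.arange(0.0, 1.0, 0.1)                   # step given, upper limit excluded
e = np.linspace(0.0, 1.0, 10, endpoint=False)  # number given, upper limit excluded
print(len(b), len(e))
print(np.allclose(b, e))

c = np.linspace(0.0, 1.0, 10)  # upper limit included: the spacing is 1/9, not 0.1
print(c[1] - c[0])
```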

## Slicing operations
Slicing of (one-dimensional) numpy arrays uses the same syntax as slicing of lists and other Python containers.

In [None]:
import numpy as np

a = np.arange(0, 11, 1)
print(a)
print(a[5])    # access 6th element (zero-based arrays!)
print(a[2:6])  # access third up to the sixth element
print(a[1::2]) # access every other element starting from the second
print(a[:-1])  # access all elements except the last

## Note on memory management of `numpy`-arrays
`numpy`-arrays behave much like lists with respect to memory management: assigning an array to a new variable does not copy the data, both names refer to the same memory. An important difference is that slices of numpy arrays share memory with the original array, whereas slices of lists are copies of the original list data!

In [None]:
import numpy as np

l = list(range(7))
n = np.arange(0, 7, 1)
print(l, n)

# n1 and l1 share the same memory with n and l!
# modifications on l1 and n1 also lead to
# modifications in l and n:
l1 = l
n1 = n

l1[0] = 10
n1[0] = 10
print(l, n)

# slices behave differently for lists and numpy-arrays:
ls = l[1:4] # uses different memory than l!
ns = n[1:4] # uses same memory as n!

print(ls, ns)
ls[0] = 5  # does not modify l
ns[0] = 5  # modifies also n
print(l, ls)
print(n, ns)

### Exercises
- Create a numpy array of 10 zeros of 'int32' data-type.
- Create an array with 10 logarithmically spaced elements between 1 and 100.
- Explain the result of the following cell.

In [None]:
import numpy as np

a = np.arange(0,11, 1)
print("Array a:", a)
b = a[2:5]    # array slicing creates 'views' of an existing array!
print("Array b:", b)
b[0] = 100
print("Array b:", b)
print("Array a:", a)

c = np.arange(0,11, 1)
print("Array c:", c)
d = c[2:5].copy()  # we create an explicit copy here!
print("Array d:", d)
d[0] = 100
print("Array d:", d)
print("Array c:", c)

## Array operations
`numpy`-arrays can be combined with mathematical operations and they behave as you would expect them to. Mathematical operations between arrays are performed *elementwise*.

In [None]:
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
print(x + y)   # element-wise addition
print(x * y)   # element-wise multiplication
print(x + 2 * y)  # more complex manipulation
print(x**y)
print(y > 4)    # element-wise comparison resulting in
                # a bool-array!

Application of high-performance, vectorized functions:

In [None]:
x = np.linspace(0.0, 2.0 * np.pi, 50)
y = np.sin(x)
print(y)

**Manipulate numpy-arrays with vector operations whenever possible!** When somebody tells you that Python is about 10 times slower than C, they are usually talking about element-wise array manipulation with explicit loops!

In [None]:
%%timeit
# fast vector operations
x = np.linspace(0.0, 2.0 * np.pi, 100)
y = np.sin(x)

In [None]:
%%timeit
# C-like element-wise array manipulation
x = np.linspace(0.0, 2.0 * np.pi, 100)
y = np.zeros(len(x))

for i in range(len(x)):
    y[i] = np.sin(x[i])
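
Outside of Jupyter, where the `%%timeit` magic is not available, a similar comparison can be sketched with the `timeit` module from the standard library. The function names `sin_vectorized` and `sin_loop` are only illustrative, and the measured times depend on your machine:

```python
import timeit
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 100)

def sin_vectorized(x):
    """Apply np.sin to the whole array at once."""
    return np.sin(x)

def sin_loop(x):
    """Apply np.sin element by element in a Python loop."""
    y = np.zeros(len(x))
    for i in range(len(x)):
        y[i] = np.sin(x[i])
    return y

# both variants compute the same values
print(np.allclose(sin_vectorized(x), sin_loop(x)))

# total time in seconds for 1000 calls of each variant
t_vec = timeit.timeit(lambda: sin_vectorized(x), number=1000)
t_loop = timeit.timeit(lambda: sin_loop(x), number=1000)
print(f"vectorized: {t_vec:.4f} s, loop: {t_loop:.4f} s")
```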

### Exercise:
- Calculate the `scalar` product of two one-dimensional numpy arrays: $s=\sum_{i=1}^Na_ib_i$.

  **Hint**: The function `np.sum` calculates the sum of the elements of an array.
- Use the function `random_sample` within the module `numpy.random` to create a numpy array $x$ of 1000 uniformly distributed random numbers within $[-1;1)$.
- Calculate the mean $\bar{x}=\frac 1N\sum_{i=1}^N x_i$ and the standard deviation
$s_x=\sqrt{\frac 1{N-1}\sum_{i=1}^N(x_i-\bar{x})^2}$ of the elements in $x$. Does the result meet your expectations?

  You can use the function `numpy.sum` **but not** the functions `numpy.mean` and `numpy.std` for this exercise!


In [None]:
# your solutions here

## Basic plotting of numpy-Arrays
We will cover matplotlib in more detail later but we give some basic commands here for simple plots!

In [None]:
%matplotlib inline
# The previous line is needed so that matplotlib plots
# appear within the Jupyter documents. It is sufficient to
# give it once within a document.
import numpy as np
import matplotlib.pyplot as plt

# matplotlib plots numpy-array values!
x = np.linspace(0.0, 2.0 * np.pi, 100)
y = np.sin(x)

# Note that you can use LaTeX in labels, titles
# etc.
plt.xlabel(r"$x$")
plt.ylabel(r"$y$")
plt.title(r"The $\sin(x)$ function")

# a simple x-y plot
plt.plot(x, y)


### Worked example and exercise
I will walk you through a very simple method to estimate derivatives of functions given at discrete points.

Write Python code to estimate the derivative $\frac{\rm d}{{\rm d}x}\sin(x)$ with $x\in[0, 2\pi]$ and plot the result. Create another plot showing the difference between your estimated derivative and the function $\cos(x)$.
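
As a starting point, here is a minimal sketch of one possible method, the forward difference $f'(x_i)\approx\frac{f(x_{i+1})-f(x_i)}{x_{i+1}-x_i}$, applied to the different function $f(x)=x^2$. The choice of the forward difference is an assumption; the method discussed in the lecture may differ. Note how slicing computes all differences at once, without a loop:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 11)   # grid with spacing h = 0.1
y = x**2                        # sample function, the exact derivative is 2x

# forward difference: (y[i+1] - y[i]) / (x[i+1] - x[i]) for all i at once
dydx = (y[1:] - y[:-1]) / (x[1:] - x[:-1])

# the estimate belongs to the left grid points x[:-1] and therefore
# has one element less than x
error = dydx - 2.0 * x[:-1]
print(error)   # for x**2 the error equals the grid spacing h
```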

In [None]:
# your solution here

## Fancy indexing and masking
Slicing does not provide all the functionality needed to extract sub-arrays. For instance, a $\log$-function should only be applied to elements larger than zero. We would therefore like to act on array elements that meet more complex conditions.

### Fancy indexing: explicit array access with a subset of indices

In [None]:
import numpy as np

x = np.arange(1, 5, 1)
print(x)
ind = np.array([0, 2, 3]) # indices of elements we would like to extract
b = x[ind]
print(b)

### Boolean indexing: array access with a bool *mask-array*

In [None]:
# Note that boolean indexing is rarely done explicitly
# but usually indirectly via masking (see below). We show the
# explicit boolean indexing for demonstration purposes here.
import numpy as np

x = np.arange(1, 5, 1)
print(x)
# we access indices that are 'True' in a boolean array
# of the same size as x:
ind = np.array([True, False, True, True])
b = x[ind]
print(b)

### General Masking

In [None]:
import numpy as np
import numpy.random as nr

x = nr.randint(-10, 10, 10)
print(x)
mask1 = (x > 0)  # mask is a bool array
y = x[mask1]     # extract the values from x where mask = True
print(y)
mask2 = (x > 0) & (x < 4)  # combined mask (and condition)
mask3 = (x < -5) | ( x > 5) # combined mask (or condition)
print(x[mask2])
print(x[mask3])
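
The $\log$ example from the beginning of this section can be sketched with a mask as follows. The array values and the name `positive` are only illustrative:

```python
import numpy as np

x = np.array([-2.0, 0.0, 0.5, 1.0, 4.0])
positive = (x > 0)           # bool mask of the elements where log is defined

y = np.log(x[positive])      # log is only applied to the extracted elements
print(x[positive])
print(y)
```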

## Important Notes
- In contrast to slicing, fancy indexing and masking **always** return **copies** of the original array!

In [None]:
import numpy as np

a = np.arange(0, 11, 1)

b = a[::2] # get every second element of a
print(b)

# create the 'same' array with fancy indexing:
c = a[np.array([0, 2, 4, 6, 8, 10])]
print(c)

# create again the 'same' array with masking:
d = a[a%2 == 0]
print(d)

# only a modification in b also modifies a!
b[0] = 5
print(a, b)

c[1] = 100
print(a, c)

d[2] = 1000
print(a, d)
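
If you are unsure whether an operation returned a view or a copy, you can ask `numpy` directly. The following short sketch uses `np.shares_memory` on the three variants from the cell above:

```python
import numpy as np

a = np.arange(0, 11, 1)
b = a[::2]                               # slice
c = a[np.array([0, 2, 4, 6, 8, 10])]     # fancy indexing
d = a[a % 2 == 0]                        # masking

print(np.shares_memory(a, b))  # True: b is a view of a
print(np.shares_memory(a, c))  # False: c is a copy
print(np.shares_memory(a, d))  # False: d is a copy
```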

- Fancy indexing and masking can be used *on the left side of an assignment* as well!

In [None]:
import numpy as np

a = np.arange(0, 11, 1)

ind = np.array([0, 2, 4])
a[ind] = 1000
print(a)

a = np.arange(0, 11, 1)
a[a%2 == 0] = 1000
print(a)

### Exercises:
- Give a Python command which multiplies all positive numbers in an integer-array by 2. Negative numbers should remain unchanged. The modification should happen in place, i.e. the original array itself is modified.
- Write a Python function `my_sign` which calculates the signum function `sgn(x)` of an integer numpy-array:
  $$\text{sgn}(x)=\left\{\begin{array}{lr}-1 & x < 0 \\ 0 & x = 0 \\ 1 & x > 0\\\end{array}\right.$$
  The function should return a *new* array and leave the original one unchanged.