# Introduction to numpy

Credits: V. A. Sole, ESRF Software Group

## Summary

* introduction
* numpy generalities
* numpy arrays
    * creating arrays
    * saving and loading arrays
    * using arrays
* numpy modules

# Introduction
## Python basic operators

* `+`: addition
* `-`: subtraction
* `*`: multiplication
* `/`: division
* `x//y`: integer part of x/y
* `**`: exponentiation
* `abs(x)`: absolute value of x
* `x%y`: remainder of x/y

In [None]:
a = 2
b = 3

In [None]:
a + b

In [None]:
a - b

In [None]:
a / b

In [None]:
a // b

In [None]:
b**a

In [None]:
abs(-a)

In [None]:
2%3

## Python basic high level data types

* numbers: 10, 10.0, 1.0e+01, (10.0+3j)
* strings: "Hello world"
* bytes: b"Hello world"
* lists: ['abc', 3, 'x']
* tuples: ('abc', 3, 'x')
* dictionaries: {'key1': 'abc', 'key2': 3, 'key3': 'x'}

In [None]:
1.0e+01 + 1

In [None]:
type("Hello world")

In [None]:
type(b"Hello world")

In [None]:
print({'key1': 'abc', 'key2': 3, 'key3': 'x'})

## Exercise 1
If `a = [1, 2, 3]`, what are the results of:

* `2 * a[2]` ?
* `2 * a` ?

Combine operations and data types and comment your findings.

In [None]:
a = [1, 2, 3]
print(2*a[2])
print(2*a)

## Conclusion

Without additional libraries, Python is almost useless for scientific computing: as the exercise above shows, lists do not support element-wise arithmetic.
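
To make this concrete, here is a minimal sketch of what scaling a series of values looks like with plain lists. The names `values`, `repeated` and `scaled` are only illustrative.

```python
# Plain Python lists have no element-wise arithmetic.
values = [1.0, 2.0, 3.0]

# The * operator repeats the list instead of scaling its elements.
repeated = 2 * values

# Scaling each element requires an explicit comprehension (or loop).
scaled = [2 * v for v in values]

print(repeated)  # [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
print(scaled)    # [2.0, 4.0, 6.0]
```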

# Numpy generalities

numpy is THE library providing number-crunching capabilities to Python.
It extends Python by providing tools for:

* Treatment of multi-dimensional data
* Access to optimized linear algebra libraries
* Encapsulation of C and Fortran code


# Numpy arrays

## the numpy ndarray objects

The (nd)array object is:

* a collection of elements of the same type. Can be any type of data.
* implemented in memory as a true table optimized for performance
* handled as any other Python object
* multidimensional, meaning:

    * dimensions can be modified, flexible indexing
    * internal optimization for 1D, 2D and 3D
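
The "same type" property can be observed directly. The following is a small sketch, with illustrative variable names, showing how numpy chooses one common element type and how values are converted to fit it.

```python
import numpy

# Mixing integers and a float: numpy picks one common type for all elements.
mixed = numpy.array([1, 2, 3.5])
print(mixed.dtype)   # float64

# Assigning a float into an integer array truncates it to fit the array type.
ints = numpy.array([1, 2, 3], dtype=numpy.int64)
ints[0] = 7.9
print(ints[0])       # 7

# Number of dimensions and shape of a 2D array
image = numpy.zeros((3, 4))
print(image.ndim, image.shape)  # 2 (3, 4)
```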


In [None]:
import numpy

## creating arrays
### from numpy.array

In [None]:
? numpy.array

create an array from a list of values

In [None]:
arr = numpy.array([1, 2, 3, 5, 7, 11, 13, 17])
print(arr)

In [None]:
arr.reshape(2, 4)

create a 2D array from nested lists of values (the nesting defines the dimensions)

In [None]:
numpy.array([[1, 2, 3, 5], [7, 11, 13, 17]])

specifying the type of element

In [None]:
numpy.array([[1, 2, 3, 5], [7, 11, 13, 17]], dtype=numpy.float64)

### from dedicated functions

In [None]:
numpy.empty((2, 4))

In [None]:
numpy.zeros((2, 4))

In [None]:
numpy.ones((2, 4))

In [None]:
numpy.arange(start=0, stop=10, step=1)

In [None]:
numpy.linspace(start=1, stop=10, num=10, endpoint=True)

In [None]:
numpy.identity(4)
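
`numpy.arange` and `numpy.linspace` are easy to confuse. The sketch below uses arbitrary values to contrast them: `arange` takes a step and excludes the stop value, while `linspace` takes a number of points and includes the stop value by default.

```python
import numpy

# arange uses a step and excludes the stop value.
steps = numpy.arange(start=0, stop=1, step=0.25)
print(steps.tolist())   # [0.0, 0.25, 0.5, 0.75]

# linspace uses a number of points and includes the stop value by default.
points = numpy.linspace(start=0, stop=1, num=5)
print(points.tolist())  # [0.0, 0.25, 0.5, 0.75, 1.0]
```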

### types of elements

#### traditional types

As we said previously, numpy arrays can deal with any kind of python object:

* python integers and real numbers (int, float)
* complex numbers (complex)
* character strings
* any python object (object)
* numpy integers and real numbers (numpy.int32, numpy.int64, numpy.float32, numpy.float64 ...)
* numpy complex (numpy.complex64: corresponds to two 32 bits floats, one for real part and one for imaginary)


You can specify the type when constructing or loading an array, using the `dtype` parameter

In [None]:
print(numpy.arange(2, dtype=numpy.float64))
print(numpy.arange(2, dtype=numpy.int32))
print(numpy.arange(2, dtype=numpy.complex128))

!!! Types should be specified to ensure portability !!!
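
The reason is that default types depend on the platform. A minimal sketch, with illustrative variable names:

```python
import numpy

# Without dtype, the integer type follows the platform default
# (for example int64 on most 64-bit Linux systems).
default = numpy.arange(3)
print(default.dtype)

# With an explicit dtype, the element size is the same everywhere.
explicit = numpy.arange(3, dtype=numpy.int32)
print(explicit.dtype, explicit.itemsize)  # int32 4
```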

#### arrays of object

numpy arrays can contain any type of object

In [None]:
a = dict({'key1': 0})
b = dict({'key2': 1})
c = dict({'key3': 2})
numpy.array([a, b, c])

#### record arrays

They allow access to the data using named fields.
Imagine your data as a spreadsheet: the field names would be the column headings.

In [None]:
img = numpy.zeros(
    (2,2),
    {
        'names': ('r','g','b'),
        'formats': (numpy.float32, numpy.float32, numpy.float32)
    })
img['r']
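
To follow the spreadsheet analogy, here is a small sketch of a table with two named columns. The field names `name` and `mass` and the values are made up for illustration.

```python
import numpy

# Hypothetical table with one named field per "column".
table = numpy.zeros(3, dtype=[('name', 'U10'), ('mass', numpy.float64)])
table['name'] = ['H', 'He', 'Li']
table['mass'] = [1.008, 4.0026, 6.94]

print(table['mass'])        # one column, as a regular float array
print(table[1])             # one row, with all its fields
print(table['mass'].max())  # 6.94
```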

## save and load arrays
### using a numpy file (.npy) - binary file

In [None]:
a = numpy.arange(start=0, stop=10, step=1, dtype=numpy.int32)
numpy.save('data.npy', a)

In [None]:
numpy.load('data.npy')

### using text file

In [None]:
numpy.savetxt('myarray.txt', a, fmt='%d')  # write integers as integers so they can be read back as int32

In [None]:
numpy.loadtxt('myarray.txt', dtype=numpy.int32)

Note:

* Each row in the text file must have the same number of values.
* Several other options exist. Here are the main parameters of the function signature (recent numpy versions accept further keyword arguments, such as `encoding` and `max_rows`):

`loadtxt(fname, dtype=<class 'float'>,
         comments='#', delimiter=None,
         converters=None, skiprows=0,
         usecols=None, unpack=False, ndmin=0)`
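
As an illustration of some of these options, the sketch below reads a small comma-separated table from an in-memory buffer instead of a file. The content of the buffer is invented for the example.

```python
import io
import numpy

# Hypothetical file content: one header line, then comma-separated columns.
text = io.StringIO(
    "index,x,y\n"
    "0,1.5,10.0\n"
    "1,2.5,20.0\n"
    "2,3.5,30.0\n"
)

# Skip the header, split on commas, keep only columns 1 and 2,
# and return one array per column.
x, y = numpy.loadtxt(text, delimiter=',', skiprows=1, usecols=(1, 2), unpack=True)
print(x.tolist())  # [1.5, 2.5, 3.5]
print(y.tolist())  # [10.0, 20.0, 30.0]
```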

## Exercise 2

Use python as a simple calculator and try the basic operations
on different arrays of numbers (integers, floats, ...).

In [None]:
import numpy
a = [1, 2, 3]
b = numpy.array(a)

* do you remember the result of 2 * a[2] ?
* do you remember the result of 2 * a ?

* what is the result of 2 * b[2] ?

In [None]:
2 * b[2]

* what is the result of 2 * b ?

In [None]:
2 * b

* what is the result of b / 2.0 ?

In [None]:
b / 2.0

## using arrays

### indexing

One can select elements as with any other Python sequence:

* Indexing starts at 0 for each array dimension
* Indexes can be negative: x[-1] is the same as x[len(x) - 1]

With basic indexing and slicing, the output is a view that refers to the data of the original array, and it is usually not contiguous in memory.

In [None]:
a = numpy.array([
    (1, 2, 3, 4),
    (5, 6, 7, 8),
    (9, 10, 11, 12)
    ])

In [None]:
a[1, 2]

In [None]:
a[0:2, 2]

In [None]:
a[2]  # all the elements of the third row

In [None]:
a[2, :]  # same as previous assuming a has at least two dimensions

In [None]:
a[0, -1]  # last element of the first row

In [None]:
a[0:2, 0:4:2]  # slicing allowed

In [None]:
a[0:2, :] = 5  # assignment is also possible

The indexing argument can be a list or an array:

In [None]:
a = numpy.arange(10., 18.)
print(a)
a[[2, 5]]

The indexing argument can be a boolean array:

In [None]:
a[a>13]
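
One practical difference between these kinds of indexing is whether the result shares memory with the original array. The following sketch, with illustrative variable names, shows that slicing gives a view while indexing with a list gives a copy, and that a boolean mask can also be used for assignment.

```python
import numpy

a = numpy.arange(10., 18.)

# Basic slicing returns a view: writing to it modifies the original array.
view = a[2:5]
view[0] = -1.
print(a[2])        # -1.0

# Indexing with a list (or a boolean mask) returns a copy.
picked = a[[2, 5]]
picked[0] = 100.
print(a[2])        # still -1.0

# A boolean mask can also be used on the left-hand side of an assignment.
a[a > 15] = 0.
print(a.tolist())  # [10.0, 11.0, -1.0, 13.0, 14.0, 15.0, 0.0, 0.0]
```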

## Exercise 3

if:

* x = numpy.arange(10)
* y = numpy.arange(1, 11)

then:

* Calculate the element-wise difference between x and y.
* Provide an expression to calculate the difference x[i+1]-x[i] for all the elements of the 1D array.

#### Solution

In [None]:
import exercicesolution
import inspect
print('element wise difference solution')
print(inspect.getsource(exercicesolution.ex3_1))
exercicesolution.ex3_1()

In [None]:
print('difference of X[i+1]-X[i]')
print(inspect.getsource(exercicesolution.ex3_2))
exercicesolution.ex3_2()

### array attributes

In [None]:
a = numpy.array([[3, 2], [8, 12]])

**dtype**: Identifies the type of the elements of the array

In [None]:
a.dtype

In [None]:
a.dtype.char

**shape**: Tuple containing the array dimensions. It is a Read and Write attribute.

In [None]:
a.shape

In [None]:
numpy.shape(a)

**T**: transposed view of the array

In [None]:
a.T

also available as:

In [None]:
a.transpose()
numpy.transpose(a)

### advanced attributes

* itemsize: Size in bytes of a single item, also the size of dtype
* size: Total number of elements in the ndarray: prod(shape)
* strides: Tuple of bytes to step in each dimension when traversing an array
* flags: Information about the contiguity of the data in the buffer
* nbytes: Size in bytes occupied by the buffer in memory: size*itemsize
* ndim: Number of dimensions of the ndarray: len(shape)
* data: The read/write buffer actually containing the data
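
The relations between these attributes can be checked on a small example. This is only a sketch on an arbitrary 3x4 array of 32-bit floats.

```python
import numpy

a = numpy.zeros((3, 4), dtype=numpy.float32)

print(a.itemsize)  # 4 bytes per float32 element
print(a.size)      # 12 elements = 3 * 4
print(a.nbytes)    # 48 bytes = size * itemsize
print(a.ndim)      # 2 = len(a.shape)
print(a.strides)   # (16, 4): 16 bytes to the next row, 4 to the next column

# The transposed view shares the buffer; only the strides are swapped.
print(a.T.strides)                # (4, 16)
print(a.flags['C_CONTIGUOUS'])    # True
print(a.T.flags['C_CONTIGUOUS'])  # False
```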


### Methods

There are methods associated with the arrays:

* a.reshape() Returns an array with a different shape (a view of the same data whenever possible)
* a.min() Returns the minimum of the array
* a.max() Returns the maximum of the array
* a.sort() Sorts an array in-place: returns None
* a.sum() Returns the sum of the elements of the array
* a.sum(axis=None, dtype=None, out=None) Performs the sum along a specified axis

There are also functions associated with the module (see `dir(numpy)`). Many operations are available in both forms:

In [None]:
b = a.copy()
b = numpy.copy(a)
b = numpy.array(a, copy=True)
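
The `axis` argument and the method/function duality can be sketched on the small array used earlier in this section:

```python
import numpy

a = numpy.array([[3, 2], [8, 12]])

print(a.sum())                 # 25, sum of all elements
print(a.sum(axis=0).tolist())  # [11, 14], sum down each column
print(a.sum(axis=1).tolist())  # [5, 20], sum along each row

# The same operation as a module-level function
column_sums = numpy.sum(a, axis=0)

# a.sort() works in-place and returns None;
# numpy.sort() leaves a untouched and returns a sorted copy.
sorted_copy = numpy.sort(a, axis=1)
print(sorted_copy.tolist())    # [[2, 3], [8, 12]]
```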

### Views

Here `c` is a new object, but it points to the same buffer as `a`:

In [None]:
a = numpy.arange(10.)
a.shape = (2, 5)
c = a.T
print(a[1, 2])
c[2, 1] = 10
print(a[1, 2])

Another example:

In [None]:
b = a[:]
b.shape = -1
print(b.shape)  # -1 lets numpy compute the length from the number of elements
print(a.shape)
print(a[0, 0])
b[0] = 25
print(a[0, 0])
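
When in doubt about whether two arrays share a buffer, `numpy.shares_memory` can tell. A minimal sketch, with illustrative variable names:

```python
import numpy

a = numpy.arange(10.).reshape(2, 5)

view = a.T          # same buffer, different shape and strides
copy = a.T.copy()   # new buffer holding the same values

print(numpy.shares_memory(a, view))  # True
print(numpy.shares_memory(a, copy))  # False

# A view does not own its data, a copy does.
print(view.flags['OWNDATA'])  # False
print(copy.flags['OWNDATA'])  # True
```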

### Exercise 4

Perform a 2x2 binning of an image.

* 1D binning:
    1. generate an array with 100 elements stored in increasing order from 0 to 99
    2. perform the binning of the 1D array such that: 1 2 3 4 -> 1+2 3+4

* 2D binning:
    * generate a 100x100 array with elements in increasing order
    * perform a 2x2 binning

For example, a 2x2 binning turns the following 4x4 image:

| 1  | 2  | 3  | 4  |
|----|----|----|----|
| 5  | 6  | 7  | 8  |
| 9  | 10 | 11 | 12 |
| 13 | 14 | 15 | 16 |

into:

| 1+2+5+6    | 3+4+7+8     |
|------------|-------------|
| 9+10+13+14 | 11+12+15+16 |

#### Solution

In [None]:
import exercicesolution
import inspect
print('1D array binning')
print(inspect.getsource(exercicesolution.ex4_1))
rawdata, bind = exercicesolution.ex4_1()
print(rawdata)
print(bind)

In [None]:
import exercicesolution
import inspect
print('2D array binning')
print(inspect.getsource(exercicesolution.ex4_2))
rawdata, bind = exercicesolution.ex4_2()
print(rawdata)
print(bind)

In [None]:
import exercicesolution
import inspect
print('alternative 2D array binning')
print(inspect.getsource(exercicesolution.ex4_2_alt))
rawdata, bind = exercicesolution.ex4_2_alt()
print(rawdata)
print(bind)
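
For reference, here is one possible approach based on `reshape` and `sum`. It is only a sketch and is not necessarily the method implemented in `exercicesolution`; the variable names are illustrative.

```python
import numpy

# 1D: pair up neighbours by reshaping to (n/2, 2), then sum over the pairs.
raw1d = numpy.arange(100)
binned1d = raw1d.reshape(50, 2).sum(axis=1)

# 2D: split each axis into (n/2, 2) and sum over both "pair" axes.
raw2d = numpy.arange(100 * 100).reshape(100, 100)
binned2d = raw2d.reshape(50, 2, 50, 2).sum(axis=(1, 3))

print(binned1d[:3].tolist())  # [1, 5, 9]
print(binned2d.shape)         # (50, 50)
print(binned2d[0, 0])         # 202 = 0 + 1 + 100 + 101
```

This approach assumes that each dimension is a multiple of the bin size.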