<img style="float: right;" width="120" src="../Images/supplier-logo.png">
<img style="float: left; margin-top: 0" width="80" src="../Images/client-logo.png">
<br><br><br>


# numpy 
<img style="float: right;" src="../Images/numpy_logo.png">

NumPy is the fundamental package for scientific computing with Python. <br>

It contains among other things:

- a powerful N-dimensional array object

- sophisticated (broadcasting) functions

- tools for integrating C/C++ and Fortran code

- useful linear algebra, Fourier transform, and random number capabilities

Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. <br>
Arbitrary data-types can be defined, allowing NumPy to seamlessly and speedily integrate with a wide variety of databases.

# numpy Arrays

- In NumPy, each dimension is called an **axis**.

- The number of axes is called the **rank**.

- An array's list of axis lengths is called the **shape** of the array.

- The rank is equal to the shape's **length**.

- The size of an array is the total number of elements, which is the product of all axis lengths

## import the numpy package

Convention is to alias it as `np`

In [None]:
import numpy as np

## np.zeros

create an array containing any number of zeros

In [None]:
np.zeros(5)

create a 2D array (i.e. a matrix) by providing a tuple with the desired number of rows and columns.


In [None]:
a = np.zeros((3,4))
a

For the above 3x4 matrix 
- rank is 2 (it is 2-dimensional).
- first axis has length 3
- the second has length 4.
- shape is (3, 4).

**NOTE**<br>
The rank is equal to the shape's length.<br>
The size of an array is the total number of elements, which is the product of all axis lengths (e.g. 3*4=12)

In [None]:
print(a.shape)
print(a.ndim)
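
The relationships in the note above can be checked directly. This is a minimal sketch using the same 3x4 array; `math.prod` comes from the Python standard library (3.8+) and is not part of the original notebook.

```python
import math
import numpy as np

a = np.zeros((3, 4))

rank = a.ndim      # number of axes
shape = a.shape    # tuple of axis lengths
size = a.size      # total number of elements

print(rank == len(shape))        # the rank equals the shape's length
print(size == math.prod(shape))  # the size equals the product of axis lengths
```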


## N-dimensional arrays
You can also create an N-dimensional array of arbitrary rank. <br>
For example, a 3D array (rank=3), with shape (2,3,4):

In [None]:
np.zeros((2,3,4))

## Type of an array

All numpy arrays are of type `numpy.ndarray` (irrespective of their size, shape or rank)

In [None]:
print(type(np.zeros((3,4))))
print(type(a))

## np.ones

Create an array full of 1's

In [None]:
np.ones((3,4))

## np.full
Creates an array of the given shape initialized with the given value. 

In [None]:
np.full((3,4), 23 / 4)

##  np.empty 
Creates an uninitialized array 

**NOTE**<br>
its content is not predictable, as it is whatever is in memory at that point

In [None]:
np.empty((3, 4))

## np.array
Initialize an ndarray using a regular Python list (here, a list of lists). 

In [None]:
l1 = [1,2,3,4,]
l2 = [10,20,30,40]
np.array([l1, l2])

##  np.arange
You can create an `ndarray` using NumPy's `arange` function, which is similar to python's built-in `range` function:

Two forms:
- `np.arange(x, y)` - start at `x`, continue up to but not including `y`<br>
- `np.arange(x, y, step)` - start at `x`, continue up to but not including `y`, in steps of `step`
- the arguments can be floats as well as integers


In [None]:
print (np.arange(1, 5))
print (np.arange(-8, -3))
print (np.arange(10, 15))
print (np.arange(1.0, 5.0))
print (np.arange(10, 25, 3))
print (np.arange(1, 5, 0.5))

However, when dealing with floats, the exact number of elements in the array is not always predictable. 
For example, consider this:

In [None]:
print(np.arange(0, 5/3, 1/3)) # depending on floating point errors, the max value is 4/3 or 5/3.
print(np.arange(0, 5/3, 0.333333333))
print(np.arange(0, 5/3, 0.333333334))

## np.linspace
It's generally preferable to use the `linspace` function instead of `arange` when working with floats. <br>
The `linspace` function returns an array containing a specific number of points evenly distributed between two values 

**NOTE**
- the maximum value is *included*, contrary to `arange`

In [None]:
np.linspace(0, 5/3, 6)
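
As a small sketch of how `linspace` keeps the number of points under control, the example below uses two of its optional parameters, `retstep` and `endpoint`. The variable names are illustrative only.

```python
import numpy as np

# retstep=True also returns the spacing between consecutive points
points, step = np.linspace(0, 5/3, 6, retstep=True)
print(len(points), step)

# endpoint=False excludes the stop value, as arange does
half_open = np.linspace(0, 5/3, 5, endpoint=False)
print(half_open)
```

In both calls the number of elements is exactly the count that was requested, regardless of floating point rounding.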

## np.random.rand and np.random.randn
A number of functions are available in NumPy's random module to create ndarrays initialized with random values.<br>
For example, here is a 3x4 matrix initialized with random floats between 0 and 1 (uniform distribution):


In [None]:
np.random.rand(3,4)

Here's a 3x4 matrix containing random floats sampled from a univariate normal distribution (Gaussian distribution) of mean 0 and variance 1:



In [None]:
np.random.randn(3,4)

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

plt.hist(np.random.rand(100000), density=True, bins=100, histtype="step", color="blue", label="rand")
plt.hist(np.random.randn(100000), density=True, bins=100, histtype="step", color="red", label="randn")
plt.axis([-2.5, 2.5, 0, 1.1])
plt.legend(loc = "upper left")
plt.title("Random distributions")
plt.xlabel("Value")
plt.ylabel("Density")
plt.show()
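
Newer NumPy releases also provide a `Generator` interface, created with `np.random.default_rng`. The sketch below assumes a NumPy version that includes it (1.17 or later) and shows the rough equivalents of `rand` and `randn`. The seed value is arbitrary and only makes the run repeatable.

```python
import numpy as np

# Hypothetical seed, chosen only so that the run is repeatable
rng = np.random.default_rng(42)

uniform = rng.random((3, 4))          # floats in [0, 1), like np.random.rand(3, 4)
normal = rng.standard_normal((3, 4))  # mean 0, variance 1, like np.random.randn(3, 4)

print(uniform)
print(normal)
```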

## np.fromfunction

Used to initialize an ndarray from a function

In [None]:
def fn2Dn(x, y):
    return x + y 

np.fromfunction(fn2Dn, (3,8))

In [None]:
def fn3Dn(x, y, z):
    return x + y * z

np.fromfunction(fn3Dn, (3, 2, 8))

NumPy first creates three ndarrays (one per dimension), each of shape (3, 2, 8).<br>
Each array has values equal to the coordinate along a specific axis. <br>
For example, all elements in the z array are equal to their z-coordinate (0 to 7 along the last axis). <br>
The function is then evaluated on these arrays, giving the result:

	[[[0., 0., 0., 0., 0., 0., 0., 0.],
	  [0., 1., 2., 3., 4., 5., 6., 7.]],

	 [[1., 1., 1., 1., 1., 1., 1., 1.],
	  [1., 2., 3., 4., 5., 6., 7., 8.]],

	 [[2., 2., 2., 2., 2., 2., 2., 2.],
	  [2., 3., 4., 5., 6., 7., 8., 9.]]]

So the terms x, y and z in the expression x + y * z above are in fact ndarrays.<br>
The point is that the function `fn3Dn` is only called once, instead of once per element. <br>
This makes initialization very efficient.
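
The single call and the coordinate arrays can be observed directly. The sketch below is an illustration that records each call in a list (the `calls` list is an addition, not part of the original function) and uses `np.indices` to build the same kind of coordinate arrays.

```python
import numpy as np

calls = []  # one entry is appended per call to fn3Dn

def fn3Dn(x, y, z):
    calls.append(1)
    return x + y * z

result = np.fromfunction(fn3Dn, (3, 2, 8))

# np.indices builds one coordinate array per axis
x, y, z = np.indices((3, 2, 8))

print(len(calls))                         # how many times the function ran
print(x.shape, y.shape, z.shape)          # each has the full output shape
print(np.array_equal(result, x + y * z))  # same values as the fromfunction result
```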
    
    

# Array Data

## dtype
NumPy's ndarrays are also efficient in part because all their elements must have the same type (usually numbers). <br>
Use the `dtype` attribute to check what the data type is 

In [None]:
c = np.arange(1, 5)
print(c.dtype, c)

c = np.arange(1.0, 5.0)
print(c.dtype, c)

Instead of letting NumPy guess what data type to use, you can set it explicitly when creating an array by setting the dtype parameter:

In [None]:
d = np.arange(1, 5, dtype=np.complex64)
print(d.dtype, d)

Common data types include <br>
`int8`, `int16`, `int32`, `int64`, `uint8|16|32|64`, `float16|32|64` and `complex64|128`. 

Full list here  https://docs.scipy.org/doc/numpy-1.10.1/user/basics.types.html


## itemsize

returns the size (in bytes) of each item:

In [None]:
e = np.arange(1, 5, dtype=np.complex64)
e.itemsize
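
To put `itemsize` in context, this short sketch relates it to the total buffer size via the `nbytes` attribute and lists the item size of a few other dtypes.

```python
import numpy as np

e = np.arange(1, 5, dtype=np.complex64)

# complex64 stores two 32-bit floats, so each item takes 8 bytes
print(e.itemsize)
# nbytes is the total buffer size: size * itemsize
print(e.nbytes, e.size * e.itemsize)

# item sizes for a few other dtypes
for dt in (np.int8, np.int32, np.float64, np.complex128):
    print(np.dtype(dt).name, np.dtype(dt).itemsize)
```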

## data buffer

An array's data is actually stored in memory as a flat (one dimensional) byte buffer.

It is available via the data attribute.

In [None]:
f = np.array([[1,2],[1000, 2000]], dtype=np.int32)
f.data
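
One way to inspect the raw bytes is `tobytes()`, which returns a copy of the buffer. In the sketch below the dtype is written as `"<i4"` (little-endian 32-bit integer) instead of `np.int32`. That choice is an assumption, made only so that the byte order is the same on every machine.

```python
import numpy as np

# "<i4" fixes the byte order to little-endian (an assumption made for
# reproducibility); np.int32 would use the machine's native byte order
f = np.array([[1, 2], [1000, 2000]], dtype="<i4")

raw = f.tobytes()  # copy of the buffer, laid out row by row
print(len(raw))    # 4 items * 4 bytes each
print(raw)
```

The 2D array is stored as one flat run of 16 bytes, with the two rows one after the other.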

## Reshaping an array

### shape
- set the array's `shape` attribute

- the change happens in place

- the array size must remain the same

In [None]:
g = np.arange(24)
print(g)
print("Rank:", g.ndim)

g.shape = (6, 4)
print(g)
print("Rank:", g.ndim)

### reshape
- returns a new ndarray object pointing at the same data
- This means that modifying one array will also modify the other

In [None]:
g2 = g.reshape(4,6)
print(g2)
print("Rank:", g2.ndim)


Set the item at row 1, col 2 to 999 

In [None]:
g2[1, 2] = 999
g2

The corresponding element in `g` has been modified.

In [None]:
g
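
When an independent array is wanted instead, one option is to call `.copy()` on the reshaped array. The sketch below uses `np.shares_memory` to check which arrays share data, and also shows `-1` as a placeholder for an inferred axis length. The names `g3` and `g4` are illustrative.

```python
import numpy as np

g = np.arange(24)
g2 = g.reshape(4, 6)         # shares data with g
g3 = g.reshape(4, 6).copy()  # independent copy

print(np.shares_memory(g, g2))
print(np.shares_memory(g, g3))

g3[1, 2] = 999               # does not touch g
print(g[8])

# -1 lets NumPy infer one axis length from the array size
g4 = g.reshape(2, -1)
print(g4.shape)
```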

### ravel

returns a 1D array pointing to the same data whenever possible (a copy is made only if needed)

In [None]:
g.ravel()
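
For comparison, `flatten` always returns a copy. The sketch below contrasts the two on a contiguous array, the case in which `ravel` can return a view.

```python
import numpy as np

g = np.arange(24).reshape(6, 4)

flat_view = g.ravel()    # a view here, since g is contiguous in memory
flat_copy = g.flatten()  # always a copy

flat_view[0] = -1        # also changes g
flat_copy[1] = -2        # g is unaffected

print(g[0, 0], g[0, 1])
print(np.shares_memory(g, flat_view), np.shares_memory(g, flat_copy))
```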

# Array arithmetic

- standard arithmetic operators (+, -, *, /, //, **, etc.) 
- they apply elementwise:

In [None]:
a = np.array([14, 23, 32, 41])
b = np.array([5,  4,  3,  2])
print("a + b  =", a + b)
print("a - b  =", a - b)
print("a * b  =", a * b)
print("a / b  =", a / b)
print("a // b  =", a // b)
print("a % b  =", a % b)
print("a ** b =", a ** b)

Note that the multiplication is *not* a matrix multiplication. 

The arrays must have the same shape. 

If they do not, NumPy will apply the *broadcasting rules*.
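
To make the difference between elementwise and matrix multiplication concrete, here is a small sketch with two 2x2 arrays. The `@` operator (equivalent to `np.matmul`) performs the matrix product.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

elementwise = A * B  # multiplies matching elements
matrix = A @ B       # matrix product, same as np.matmul(A, B)

print(elementwise)
print(matrix)
```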

# Broadcasting

In general, when NumPy expects arrays of the same shape but finds that this is not the case, it applies the so-called *broadcasting* rules:

## First rule
*If the arrays do not have the same rank, then a 1 will be prepended to the smaller ranking arrays until their ranks match.*

In [None]:
h = np.arange(5).reshape(1, 1, 5)
h

Now add a 1D array of shape `(5,)` to this 3D array of shape `(1,1,5)`. 

NumPy applies the first rule of broadcasting:

In [None]:
h + [10, 20, 30, 40, 50]  # same as: h + [[[10, 20, 30, 40, 50]]]

## Second rule
*Arrays with a 1 along a particular dimension act as if they had the size of the array with the largest shape along that dimension.* 

*The value of the array element is repeated along that dimension.*

In [None]:
k = np.arange(6).reshape(2, 3)
k

Now add a 2D array of shape `(2,1)` to this 2D `ndarray` of shape `(2, 3)`. 

NumPy will apply the second rule of broadcasting:

In [None]:
k + [[100], [200]]  # same as: k + [[100, 100, 100], [200, 200, 200]]

Combining rules 1 & 2, we can do this:

In [None]:
k + [100, 200, 300]  # after rule 1: [[100, 200, 300]], and after rule 2: [[100, 200, 300], [100, 200, 300]]

And also, very simply:

In [None]:
k + 1000  # same as: k + [[1000, 1000, 1000], [1000, 1000, 1000]]

## Third rule
*After rules 1 & 2, the sizes of all arrays must match.*

In [None]:
try:
    k + [33, 44]
except ValueError as e:
    print(e)

Broadcasting rules are used in many NumPy operations, not just arithmetic operations.

For more information, check the documentation https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
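
The three rules can also be explored without doing any arithmetic. The sketch below uses `np.broadcast_shapes` (assumed available, which means NumPy 1.20 or later) to compute the resulting shape, and `np.broadcast_to` to show the expanded array. The variable names are illustrative.

```python
import numpy as np

# Shapes taken from the examples in this section
s1 = np.broadcast_shapes((1, 1, 5), (5,))  # rule 1
s2 = np.broadcast_shapes((2, 3), (2, 1))   # rule 2
s3 = np.broadcast_shapes((2, 3), (3,))     # rules 1 and 2
print(s1, s2, s3)

# broadcast_to shows the array NumPy effectively operates on
row = np.broadcast_to([100, 200, 300], (2, 3))
print(row)

# Rule 3: incompatible shapes raise a ValueError
try:
    np.broadcast_shapes((2, 3), (2,))
    compatible = True
except ValueError:
    compatible = False
print(compatible)
```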

## Upcasting
When trying to combine arrays with different `dtype`s, NumPy will *upcast* to a type capable of handling all possible values (regardless of what the *actual* values are).

In [None]:
k1 = np.arange(0, 5, dtype=np.uint8)
print(k1.dtype, k1)

In [None]:
k2 = k1 + np.array([5, 6, 7, 8, 9], dtype=np.int8)
print(k2.dtype, k2)

Note that `int16` is required to represent all *possible* `int8` and `uint8` values (from -128 to 255), even though in this case a uint8 would have sufficed.

In [None]:
k3 = k1 + 1.5
print(k3.dtype, k3)
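
The dtype that NumPy will pick can be queried ahead of time with `np.result_type`. The sketch below also illustrates a related caveat: upcasting depends only on the dtypes involved, so adding two `uint8` arrays stays in `uint8` and values past 255 wrap around.

```python
import numpy as np

print(np.result_type(np.uint8, np.int8))     # int16
print(np.result_type(np.uint8, np.float64))  # float64

# Both operands are uint8, so no upcasting happens and 250 + 10 wraps around
big = np.array([250], dtype=np.uint8) + np.array([10], dtype=np.uint8)
print(big.dtype, big)
```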

# Conditional operators

The conditional operators also apply elementwise:

In [None]:
m = np.array([20, -5, 30, 40])
m < [15, 16, 35, 36]

And using broadcasting:

In [None]:
m < 25  # equivalent to m < [25, 25, 25, 25]

This is most useful in conjunction with boolean indexing (discussed below).

In [None]:
m[m < 25]
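
Boolean masks can be combined and reused in a few common ways. This is a short sketch on the same array `m`; the names `selected` and `clipped` are illustrative.

```python
import numpy as np

m = np.array([20, -5, 30, 40])

# Combine masks with & (and), | (or), ~ (not); the parentheses are required
selected = m[(m > 0) & (m < 35)]
print(selected)

# np.where picks elementwise between two alternatives
clipped = np.where(m < 25, 0, m)
print(clipped)

# Assigning through a mask modifies the array in place
m[m < 0] = 0
print(m)
```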