# What is Numpy

NumPy is the fundamental package for scientific computing with Python. 
It is a package that provides high-performance vector, matrix and higher-dimensional data structures for Python. 
It is implemented in C and Fortran, so when calculations are **vectorized** (expressed as whole-array operations rather than explicit Python loops), performance is very good.

So, in a nutshell:

* a powerful Python extension for N-dimensional arrays
* a tool for integrating C/C++ and Fortran code
* designed for scientific computation: linear algebra and signal analysis

If you are a MATLAB&reg; user, we recommend reading [Numpy for MATLAB Users](http://www.scipy.org/NumPy_for_Matlab_Users) and [Benefit of Open Source Python versus commercial packages](http://www.scipy.org/NumPyProConPage). 

I'm a supporter of the **Open Science Movement**, so I humbly suggest taking a look at the [Science Code Manifesto](http://sciencecodemanifesto.org/).

NumPy reference: [https://numpy.org/devdocs/reference/index.html#reference](https://numpy.org/devdocs/reference/index.html#reference)

# Getting Started with Numpy Arrays

NumPy's main object is the **homogeneous** ***multidimensional array***. It is a table of elements (usually numbers), all of the same type. 

In NumPy, dimensions are called **axes**. 

The number of axes is called **rank**. 

The most important attributes of an ndarray object are:

* **ndarray.ndim**     - the number of axes (dimensions) of the array. 
* **ndarray.shape**    - the dimensions of the array. For a matrix with n rows and m columns, shape will be (n,m). 
* **ndarray.size**     - the total number of elements of the array. 
* **ndarray.dtype**    - an object describing the type of the elements in the array. numpy.int32, numpy.int16, and numpy.float64 are some examples. 
* **ndarray.itemsize** - the size in bytes of each element of the array. For example, elements of type float64 have itemsize 8 (=64/8). 
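
As a quick sketch of these attributes, the snippet below inspects a three-dimensional array. The array `a` and its shape are purely illustrative choices, not taken from the rest of this notebook.

```python
import numpy as np

# Illustrative 3-D array: 2 blocks, each with 3 rows and 4 columns
a = np.zeros((2, 3, 4))

print(a.ndim)      # 3 axes
print(a.shape)     # (2, 3, 4)
print(a.size)      # 24 elements in total (2 * 3 * 4)
print(a.dtype)     # float64
print(a.itemsize)  # 8 bytes per element
```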

To use `numpy`, you first need to import the module, for example:

In [2]:
import numpy as np  # naming import convention

### Terminology Assumption

In the `numpy` package the terminology used for vectors, matrices and higher-dimensional data sets is *array*. 

### Reference Documentation

* On the web: [http://docs.scipy.org/](http://docs.scipy.org/)

* Interactive help:

In [2]:
np.array?

In [3]:
np.con*?

## Numpy Array Object

`NumPy` has a multidimensional array object called ndarray. It consists of two parts as follows:
   
   * The actual data
   * Some metadata describing the data
    
    
Many array operations (reshaping, transposing, slicing) leave the raw data untouched. The only aspect that changes is the metadata.
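
The split between data and metadata can be observed directly. The following is a minimal sketch (the variable names `a`, `b` and `c` are illustrative): reshaping and transposing give new array objects with different shape and stride metadata, yet all of them refer to the same underlying data buffer.

```python
import numpy as np

a = np.arange(6)       # one block of data: 0, 1, 2, 3, 4, 5
b = a.reshape(2, 3)    # same data, different shape metadata
c = b.T                # transpose: same data again, different strides

print(np.shares_memory(a, b), np.shares_memory(a, c))
print(b.shape, b.strides)
print(c.shape, c.strides)

b[0, 0] = 99   # writing through the reshaped view...
print(a[0])    # ...is visible in the original array
```

Because no data is copied, such operations are cheap even for very large arrays.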

<img src="https://github.com/leriomaggio/numpy_ep2015/blob/master/images/ndarray_with_details.png?raw=1" />

## Creating `numpy` arrays

There are a number of ways to initialize new numpy arrays, for example:

* from a Python list or tuple
* using functions that are dedicated to generating numpy arrays, such as `arange`, `linspace`, etc.

### From lists

For example, to create new vector and matrix arrays from Python lists we can use the `numpy.array` function.

In [4]:
# a vector: the argument to the array function is a Python list
v = np.array([1,2,3,4])
v

array([1, 2, 3, 4])

In [5]:
# a matrix: the argument to the array function is a nested Python list
M = np.array([[1, 2], [3, 4]])
M

array([[1, 2],
       [3, 4]])

The `v` and `M` objects are both of the type `ndarray` that the `numpy` module provides.

In [6]:
print('Type of v: ', type(v))
print('Type of M: ', type(M))

Type of v:  <class 'numpy.ndarray'>
Type of M:  <class 'numpy.ndarray'>


The only difference between the `v` and `M` arrays is their shape. 

To inspect it, we can use the `numpy.shape` function:

In [7]:
print('Shape of v: ', np.shape(v))
print('Shape of M: ', np.shape(M))

Shape of v:  (4,)
Shape of M:  (2, 2)


Alternatively, we can get information about the shape of an array by using the `ndarray.shape` **property**:

In [8]:
v.shape, M.shape

((4,), (2, 2))

Similarly, we can get information about the **size** of the two `ndarrays`, namely the *total number of elements* in the array.

In [15]:
print('Size of v:', v.size)
print('Size of M:', M.size)

Size of v: 4
Size of M: 4


#### More properties of the `numpy array`

In [45]:
v.itemsize # bytes per element

8

In [16]:
M.itemsize # bytes per element

8

In [44]:
v.nbytes # number of bytes

32

In [17]:
M.nbytes # number of bytes

32

In [14]:
v.ndim

1

In [18]:
M.ndim # number of dimensions

2

## Using array-generating functions

For larger arrays it is impractical to initialize the data manually, using explicit Python lists. 

Instead we can use one of the many **functions** in `numpy` that generate arrays of different forms. 

Some of the more common are: 

* `np.arange`; 
* `np.linspace`; 
* `np.logspace`; 
* `np.mgrid`;
* `np.random.rand`;
* `np.diag`;
* `np.zeros`;
* `np.ones`;
* `np.empty`;
* `np.tile`.
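
As a quick illustration of one entry in this list, `np.mgrid` builds coordinate grids from slice notation. This is a minimal sketch; the grid size and the names `x` and `y` are arbitrary.

```python
import numpy as np

# Slice notation start:stop defines the range along each axis
x, y = np.mgrid[0:3, 0:2]

print(x)  # row index, repeated across the columns
print(y)  # column index, repeated down the rows
```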

### `np.arange`

In [51]:
x = np.arange(10) 
print(x)

[0 1 2 3 4 5 6 7 8 9]


In [52]:
x = np.arange(10,12) 
print(x)

[10 11]


In [21]:
# create a range
x = np.arange(0, 10, 1) # arguments: start, stop, step
print(x)

[0 1 2 3 4 5 6 7 8 9]


In [53]:
x = np.arange(10,5,-1) 
print(x)

[10  9  8  7  6]


In [22]:
x = np.arange(-1, 1, 0.1)  # floating point step-wise range generation
print(x)

[-1.00000000e+00 -9.00000000e-01 -8.00000000e-01 -7.00000000e-01
 -6.00000000e-01 -5.00000000e-01 -4.00000000e-01 -3.00000000e-01
 -2.00000000e-01 -1.00000000e-01 -2.22044605e-16  1.00000000e-01
  2.00000000e-01  3.00000000e-01  4.00000000e-01  5.00000000e-01
  6.00000000e-01  7.00000000e-01  8.00000000e-01  9.00000000e-01]
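
With a non-integer step, rounding error can creep in (note the `-2.22044605e-16` where `0` would be expected), and the number of elements is not always obvious in advance. A common alternative, sketched here under the assumption that the number of points is known, is to ask `np.linspace` for that many points and exclude the end point.

```python
import numpy as np

# 20 evenly spaced points from -1 up to (but not including) 1
x = np.linspace(-1, 1, 20, endpoint=False)

print(x.size)
print(x)
```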


### `np.linspace` and `np.logspace`

In [26]:
# using linspace, both end points **ARE included**
np.linspace(0, 10, 25)

array([ 0.        ,  0.41666667,  0.83333333,  1.25      ,  1.66666667,
        2.08333333,  2.5       ,  2.91666667,  3.33333333,  3.75      ,
        4.16666667,  4.58333333,  5.        ,  5.41666667,  5.83333333,
        6.25      ,  6.66666667,  7.08333333,  7.5       ,  7.91666667,
        8.33333333,  8.75      ,  9.16666667,  9.58333333, 10.        ])
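
`np.logspace` works like `np.linspace`, except that the points are evenly spaced on a logarithmic scale: the first two arguments are exponents of the base (10 by default). A minimal sketch, with illustrative variable names:

```python
import numpy as np

# 4 points from 10**0 to 10**3 (base 10 is the default)
p10 = np.logspace(0, 3, 4)
print(p10)

# 5 points from 2**0 to 2**4
p2 = np.logspace(0, 4, 5, base=2)
print(p2)
```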

### `np.random.rand` & `np.random.randn`

In [0]:
# uniform random numbers in [0, 1)
np.random.rand(5,5)

array([[ 0.33658948,  0.28564552,  0.73183017,  0.7395105 ,  0.66427382],
       [ 0.25942094,  0.43844615,  0.48250402,  0.24063916,  0.90171053],
       [ 0.51114245,  0.49587249,  0.61832302,  0.71996951,  0.22064571],
       [ 0.38625609,  0.44313367,  0.74975323,  0.57600147,  0.80771956],
       [ 0.84511666,  0.6064582 ,  0.62365173,  0.62766319,  0.80129396]])

In [0]:
# standard normal distributed random numbers
np.random.randn(5,5)

array([[ 0.65782724,  0.65168367,  0.58525852,  0.33781734, -0.00700978],
       [ 0.61574011,  0.59150639, -0.33797592, -0.2509655 ,  0.77237429],
       [-0.15693266, -0.38377945, -0.28140147,  0.90558314,  0.25437408],
       [-1.136108  ,  2.43964939,  0.28583627, -0.27540796, -0.57253111],
       [-0.79080395,  0.50525127,  2.1113386 , -0.33769711, -0.64914575]])
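
Random arrays differ on every run, which makes results hard to reproduce. One possible approach, assuming a NumPy version that provides `np.random.default_rng` (1.17 or later), is to draw from a seeded generator. The seed value `42` and the variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # the seed value is arbitrary

u = rng.random((5, 5))           # uniform samples in [0, 1)
n = rng.standard_normal((5, 5))  # standard normal samples

# Re-creating the generator with the same seed reproduces the same numbers
rng2 = np.random.default_rng(seed=42)
print(np.array_equal(u, rng2.random((5, 5))))
```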

### `np.diag`

In [33]:
# a diagonal matrix
np.diag([1,2,3])

array([[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]])

In [37]:
# diagonal with offset from the main diagonal
np.diag([1,2,3], k=1) 

array([[0, 1, 0, 0],
       [0, 0, 2, 0],
       [0, 0, 0, 3],
       [0, 0, 0, 0]])

### `np.eye`

In [38]:
# a diagonal matrix with ones on the main diagonal - identity matrix
np.eye(3)  # 3 is the number of rows (and columns)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

### `np.zeros` and `np.ones`

In [42]:
np.zeros(3)

array([0., 0., 0.])

In [43]:
np.zeros((3,3))

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [40]:
np.ones((3, 3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

### DIY

***Try by yourself*** the following commands:

    np.zeros((3,4))
    np.ones((3,4))
    np.empty((2,3))
    np.eye(5)
    np.diag(np.arange(5))
    np.tile(np.array([[6, 7], [8, 9]]), (2, 2))

## So, why is it useful then?

So far the `numpy.ndarray` looks awfully much like a Python list (or nested list). 

*Why not simply use Python lists for computations instead of creating a new array type?*

There are several reasons:

* Python lists are very general. 
    - They can contain any kind of object. 
    - They are dynamically typed. 
    - They do not support mathematical functions such as matrix and dot multiplications, etc. 
    - Implementing such functions for Python lists would not be very efficient because of the dynamic typing.
    
    
* Numpy arrays are **statically typed** and **homogeneous**. 
    - The type of the elements is determined when the array is created.
    
    
* Numpy arrays are memory efficient.
    - Because of the static typing, mathematical functions such as multiplication and addition of `numpy` arrays can be implemented efficiently in a compiled language (C and Fortran are used).
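
The difference in behaviour is easy to see with a small example. The sketch below (with illustrative names `lst` and `arr`) applies the same operators to a list and to an array built from it.

```python
import numpy as np

lst = [1, 2, 3]
arr = np.array(lst)

print(lst * 2)    # list repetition: [1, 2, 3, 1, 2, 3]
print(arr * 2)    # element-wise multiplication: [2 4 6]
print(arr + arr)  # element-wise addition: [2 4 6]
print(arr @ arr)  # dot product: 1*1 + 2*2 + 3*3 = 14
```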

In [0]:
L = range(1000)

In [55]:
%timeit [i**2 for i in L]

1000 loops, best of 3: 259 µs per loop


In [0]:
a = np.arange(1000)

In [57]:
%timeit a**2

The slowest run took 32.76 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.33 µs per loop


# Basic Data Type

You may have noticed that, in some instances, array elements are
displayed with a trailing dot (e.g. `2.` vs `2`). This is due to a
difference in the data-type used:

In [58]:
a = np.array([1, 2, 3])
print(a.dtype)
print(type(a))


int64
<class 'numpy.ndarray'>


In [0]:
b = np.array([1., 2., 3.])
b.dtype

dtype('float64')

### Note

Different data-types allow us to store data more compactly in memory,
but most of the time we simply work with floating point numbers. Note
that, in the example above, NumPy auto-detects the data-type from the
input.

You can explicitly specify which data-type you want:

In [0]:
c = np.array([1, 2, 3], dtype=float)
c.dtype

dtype('float64')

The **default** data type of array-creation functions such as `np.ones` is floating point:

In [0]:
a = np.ones((3, 3))
a.dtype

dtype('float64')
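
To make the points about auto-detection and compact storage concrete, here is a small sketch. The array length of 1000 and the variable names are arbitrary.

```python
import numpy as np

# One float is enough to make the whole array float64
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)

# Same number of elements, different storage cost
for dt in (np.float64, np.float32, np.int8):
    z = np.zeros(1000, dtype=dt)
    print(z.dtype, z.itemsize, z.nbytes)
```

The last loop prints 8000, 4000 and 1000 bytes respectively for the same 1000 elements.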

## Basic Data Types

    bool             | This stores boolean (True or False) as a byte

    int_             | This is a platform integer (normally either int32 or int64)
    int8             | This is an integer ranging from -128 to 127
    int16            | This is an integer ranging from -32768 to 32767
    int32            | This is an integer ranging from -2 ** 31 to 2 ** 31 -1
    int64            | This is an integer ranging from -2 ** 63 to 2 ** 63 -1
    
    uint8            | This is an unsigned integer ranging from 0 to 255
    uint16           | This is an unsigned integer ranging from 0 to 65535
    uint32           | This is an unsigned integer ranging from 0 to 2 ** 32 - 1
    uint64           | This is an unsigned integer ranging from 0 to 2 ** 64 - 1

    float16          | This is a half precision float with sign bit, 5 bits exponent, and 10 bits mantissa
    float32          | This is a single precision float with sign bit, 8 bits exponent, and 23 bits mantissa
    float64 or float | This is a double precision float with sign bit, 11 bits exponent, and 52 bits mantissa
    complex64        | This is a complex number represented by two 32-bit floats (real and imaginary components)
    complex128       | This is a complex number represented by two 64-bit floats (real and imaginary components)
    (or complex)
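
The ranges and bit layouts in this table do not need to be memorised: `np.iinfo` and `np.finfo` report them for integer and floating point types respectively. A minimal sketch:

```python
import numpy as np

for dt in (np.int8, np.int16, np.uint8, np.uint16):
    info = np.iinfo(dt)
    print(dt.__name__, info.min, info.max)

for dt in (np.float16, np.float32, np.float64):
    info = np.finfo(dt)
    print(dt.__name__, info.bits, info.nmant)  # total bits, mantissa bits
```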


## Conversions and Type Casting

In [0]:
np.float64(42)  # int to float

42.0

In [0]:
np.int8(42.0)  # float to int8

42

In [0]:
np.bool_(42)  # int to bool

True

In [59]:
np.bool_(1)   # "special" int to bool

True

In [61]:
np.bool_(42.0)  # float to bool

True

In [0]:
np.float64(True)  # bool to float

1.0

In [0]:
np.float64(False)

0.0

In [62]:
np.arange(7, dtype=np.uint16)

array([0, 1, 2, 3, 4, 5, 6], dtype=uint16)

In [63]:
int(42.0 + 1.j)  # complex to int (np.int was just an alias for the built-in int)

TypeError: can't convert complex to int

In [0]:
float(42.0 + 1.j)  # complex to float (np.float was just an alias for the built-in float)

TypeError: can't convert complex to float

In [0]:
float(42.0 + 0.j)  # complex to float, even with a zero imaginary part

TypeError: can't convert complex to float

In [65]:
cn = np.complex128(42.0)  # Btw, you can convert a float to a complex..
print(cn)

(42+0j)


In [66]:
# Extracting the Real part..
cn.real

42.0

In [67]:
# .. and the Imaginary part
cn.imag

0.0
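
The conversions in this section act on single values. For whole arrays, the usual counterpart is the `astype` method, which returns a converted copy. The following is a minimal sketch; the sample values are arbitrary.

```python
import numpy as np

f = np.array([1.7, -2.3, 0.0, 42.0])

print(f.astype(np.int32))  # fractional part is dropped (truncation toward zero)
print(f.astype(bool))      # zero -> False, anything else -> True

z = np.array([42.0 + 1.0j, 3.0 - 2.0j])
print(z.real)  # real parts as a float64 array
print(z.imag)  # imaginary parts as a float64 array
```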

## Numerical Types and Representation

The **numerical dtype** of an array should be selected very carefully, as it directly affects the numerical representation of elements, that is: 

   * the number of **bytes** used; 
   * the *numerical range*.

So, then: **What happens if I try to represent a number that is out of range?**

Let's have a go with **integers**, i.e., `int8` and `uint8`

In [69]:
x = np.zeros(4, 'int8')  # Integer ranging from -128 to 127
x

array([0, 0, 0, 0], dtype=int8)

In [70]:
x[0] = 127
x

array([127,   0,   0,   0], dtype=int8)

In [71]:
x[0] = 128
x

array([-128,    0,    0,    0], dtype=int8)

In [72]:
x[1] = 129
x

array([-128, -127,    0,    0], dtype=int8)

In [73]:
x[2] = 257  # i.e. (128 x 2) + 1
x

array([-128, -127,    1,    0], dtype=int8)

In [76]:
x[3] = 260  # i.e. (128 x 2) + 4
x

array([-128, -127,    1,    4], dtype=int8)
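
Depending on the NumPy version, assigning an out-of-range Python integer directly (as in `x[0] = 128`) may emit a warning or raise an error instead of wrapping silently. The same wrap-around can be reproduced by casting an existing array with `astype`, as in this sketch, which also covers `uint8`. The variable names are illustrative.

```python
import numpy as np

values = np.array([127, 128, 129, 257, 260])

# int8 holds -128..127, so larger values wrap around modulo 256
wrapped = values.astype(np.int8)
print(wrapped)

# uint8 holds 0..255, so negative values wrap around as well
uwrapped = np.array([-1, 255, 256, 257]).astype(np.uint8)
print(uwrapped)

# np.iinfo reports the representable range, useful for checking before casting
info = np.iinfo(np.int8)
print(info.min, info.max)
```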