# What is NumPy

NumPy is the fundamental package for scientific computing with Python. 
It is a package that provides high-performance vector, matrix and higher-dimensional data structures for Python. 
It is implemented in C and Fortran so when calculations are **vectorized**, performance is very good.

So, in a nutshell:

* a powerful Python extension for N-dimensional arrays
* a tool for integrating C/C++ and Fortran code
* designed for scientific computation: linear algebra and signal analysis

If you are a MATLAB&reg; user, we recommend reading [Numpy for MATLAB Users](http://www.scipy.org/NumPy_for_Matlab_Users) and [Benefit of Open Source Python versus commercial packages](http://www.scipy.org/NumPyProConPage). 

I'm a supporter of the **Open Science Movement**, so I humbly suggest you take a look at the [Science Code Manifesto](http://sciencecodemanifesto.org/).

In [None]:
import numpy as np

In [None]:
a = np.arange(10).reshape(2, 5)
b = np.arange(20).reshape(5, 4)

In [None]:
a @ b  # Python 3.5+
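
As a small illustration of the cells above: the `@` operator performs matrix multiplication, so the inner dimensions of the two operands must match. The sketch below repeats the same `a` and `b` and checks the shape of the result; the comparison against `np.dot` is added here only as a cross-check.

```python
import numpy as np

a = np.arange(10).reshape(2, 5)   # shape (2, 5)
b = np.arange(20).reshape(5, 4)   # shape (5, 4)

c = a @ b                         # matrix product, shape (2, 4)
print(c.shape)

# For 2-D arrays, np.dot computes the same matrix product
print(np.array_equal(c, np.dot(a, b)))
```

Swapping the operands (`b @ a`) would raise a `ValueError`, because the inner dimensions (4 and 2) no longer agree.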

# Getting Started with Numpy Arrays

NumPy's main object is the **homogeneous** ***multidimensional array***. It is a table of elements (usually numbers), all of the same type. 

In NumPy, dimensions are called **axes**. 

The number of axes is called **rank**. 

The most important attributes of an ndarray object are:

* **ndarray.ndim**     - the number of axes (dimensions) of the array. 
* **ndarray.shape**    - the dimensions of the array. For a matrix with n rows and m columns, shape will be (n,m). 
* **ndarray.size**     - the total number of elements of the array. 
* **ndarray.dtype**    - an object describing the type of the elements in the array; numpy.int32, numpy.int16, and numpy.float64 are some examples. 
* **ndarray.itemsize** - the size in bytes of each element of the array. For example, elements of type float64 have itemsize 8 (=64/8). 
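
The attributes listed above can be inspected directly on any array. The following is a minimal sketch; the array `arr` and its shape are hypothetical and chosen only for illustration.

```python
import numpy as np

# A hypothetical 3-D array used only for illustration
arr = np.zeros((2, 3, 4), dtype=np.float64)

print(arr.ndim)      # number of axes
print(arr.shape)     # length of each axis
print(arr.size)      # total number of elements (2 * 3 * 4)
print(arr.dtype)     # element type
print(arr.itemsize)  # bytes per element (64 bits / 8)
```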

To use `numpy`, you need to import the module, for example:

In [None]:
import numpy as np  # naming import convention

### Terminology Assumption

In the `numpy` package the terminology used for vectors, matrices and higher-dimensional data sets is *array*. 

### Reference Documentation

* On the web: [http://docs.scipy.org/](http://docs.scipy.org/)

* Interactive help:

In [None]:
np.array?

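If you're looking for something, you can search the docstrings by keyword (note that `np.lookfor` was removed in NumPy 2.0, so this cell requires an older release):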

In [None]:
np.lookfor('create array')

In [None]:
np.con*?

#### Help is your friend

Whenever in doubt, there is the `help` function to the rescue

In [None]:
# For example, try 
help(np.ndarray)

## Numpy Array Object

`NumPy` has a multidimensional array object called ndarray. It consists of two parts as follows:
   
   * The actual data
   * Some metadata describing the data
    
    
The majority of array operations leave the raw data untouched. The only aspect that changes is the metadata.

<img src="resources/imgs/ndarray_with_details.png" />
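
The claim that most operations only change the metadata can be checked with a short sketch. It assumes a contiguous array, for which `reshape` returns a view; the names `data`, `view` and `transposed` are illustrative only.

```python
import numpy as np

data = np.arange(6)          # 1-D array with 6 elements
view = data.reshape(2, 3)    # same buffer, different shape metadata

print(np.shares_memory(data, view))   # the two arrays use the same memory

view[0, 0] = 100             # writing through the view...
print(data[0])               # ...changes the original array as well

transposed = view.T          # transposing only swaps shape and strides
print(view.strides, transposed.strides)
```

Because no data is copied, such operations are cheap, but a change made through one array is visible through the other.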

## Creating `numpy` arrays

There are a number of ways to initialize new numpy arrays, for example:

* from Python lists or tuples
* using functions that are dedicated to generating numpy arrays, such as `arange`, `linspace`, etc.

### From lists

For example, to create new vector and matrix arrays from Python lists we can use the `numpy.array` function.

In [None]:
v = np.array([1,2,3,4])
v

In [None]:
M = np.array([[1, 2], [3, 4]])
M

The `v` and `M` objects are both of the type `ndarray` that the `numpy` module provides.

In [None]:
print('Type of v: ', type(v))
print('Type of M: ', type(M))

The only difference between the `v` and `M` arrays is their shape. 

To inspect the shape of an array, we can use the `numpy.shape` function:

In [None]:
print('Shape of v: ', np.shape(v))
print('Shape of M: ', np.shape(M))

Alternatively, we can get information about the shape of an array by using the `ndarray.shape` **property**:

In [None]:
v.shape, M.shape

Similarly, we can get information about the **size** of the two `ndarrays`, namely the *total number of elements* in the array.

In [None]:
print('Size of v:', v.size)
print('Size of M:', M.size)

#### More properties of the `numpy array`

In [None]:
M.itemsize # bytes per element

In [None]:
M.nbytes # number of bytes

In [None]:
M.ndim # number of dimensions
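
These properties are related: the total number of bytes is the number of elements times the bytes per element. In the sketch below the dtype is fixed explicitly, which is an assumption made so the numbers do not depend on the platform's default integer type.

```python
import numpy as np

# dtype fixed explicitly for the example; the default integer type is platform dependent
M = np.array([[1, 2], [3, 4]], dtype=np.int64)

print(M.size, M.itemsize, M.nbytes)
print(M.nbytes == M.size * M.itemsize)
```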

## Using array-generating functions

For larger arrays it is impractical to initialize the data manually, using explicit Python lists. 

Instead we can use one of the many **functions** in `numpy` that generate arrays of different forms. 

Some of the more common are: 

* `np.arange`; 
* `np.linspace`; 
* `np.logspace`; 
* `np.mgrid`;
* `np.random.rand`;
* `np.diag`;
* `np.zeros`;
* `np.ones`;
* `np.empty`;
* `np.tile`.

### `np.arange`

In [None]:
x = np.arange(0, 10, 1) 
print(x)

In [None]:
# floating point step-wise range generation
x = np.arange(-1, 1, 0.1)  
print(x)
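
With a non-integer step, the number of elements returned by `np.arange` depends on floating-point rounding, so when the number of points matters, `np.linspace` is often the safer choice. The following sketch assumes we want 21 points between -1 and 1, a count chosen here only for illustration.

```python
import numpy as np

# Ask for exactly 21 evenly spaced points from -1 to 1 (both included);
# retstep=True also returns the spacing that was used
x, step = np.linspace(-1, 1, 21, retstep=True)

print(len(x))
print(step)
print(x[0], x[-1])
```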

### `np.linspace` and `np.logspace`

In [None]:
# using linspace, both end points **ARE included**
np.linspace(0, 10, 25)

In [None]:
np.logspace(0, np.e**2, 10, base=np.e)
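
Note that the first two arguments of `np.logspace` are *exponents*, not the end points themselves, so the cell above runs from `e**0` up to `e**(e**2)`. A minimal sketch of the relationship to `np.linspace`, using smaller illustrative exponents (0 to 2):

```python
import numpy as np

exponents = np.linspace(0, 2, 5)            # evenly spaced exponents
samples = np.logspace(0, 2, 5, base=np.e)   # from e**0 to e**2

# logspace is equivalent to raising the base to linearly spaced exponents
print(np.allclose(samples, np.e ** exponents))
print(samples[0], samples[-1])
```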

### `np.mgrid`

In [None]:
x, y = np.mgrid[0:5, 0:5]  # similar to meshgrid in MATLAB

In [None]:
x

In [None]:
y
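
To make the layout of the two grids explicit, the sketch below uses a smaller, non-square grid (an illustrative choice) and compares the result with `np.meshgrid` using `indexing='ij'`.

```python
import numpy as np

x, y = np.mgrid[0:3, 0:4]

print(x.shape, y.shape)   # both grids have shape (3, 4)
print(x[:, 0])            # x varies along the first axis (rows)
print(y[0, :])            # y varies along the second axis (columns)

# np.meshgrid with indexing='ij' builds the same grids
xm, ym = np.meshgrid(np.arange(3), np.arange(4), indexing='ij')
print(np.array_equal(x, xm) and np.array_equal(y, ym))
```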

### `np.random.rand` & `np.random.randn`

In [None]:
# uniform random numbers in [0, 1)
np.random.rand(5,5)

In [None]:
# standard normal distributed random numbers
np.random.randn(5,5)
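
Random arrays differ on every run unless the generator is seeded. The sketch below assumes the same legacy `np.random` interface used in the cells above; the seed value 42 is arbitrary.

```python
import numpy as np

np.random.seed(42)                 # arbitrary seed chosen for this example
u = np.random.rand(5, 5)           # uniform samples in [0, 1)
n = np.random.randn(5, 5)          # samples from the standard normal distribution

print(u.shape, n.shape)
print(u.min() >= 0.0 and u.max() < 1.0)

np.random.seed(42)                 # resetting the seed replays the same sequence
print(np.array_equal(u, np.random.rand(5, 5)))
```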

### `np.diag`

In [None]:
# a diagonal matrix
np.diag([1,2,3])

In [None]:
# diagonal with offset from the main diagonal
np.diag([1,2,3], k=1) 
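
`np.diag` works in both directions: given a 1-D sequence it builds a matrix, and given a 2-D array it extracts a diagonal. A short sketch (the name `D` is illustrative):

```python
import numpy as np

D = np.diag([1, 2, 3], k=1)    # 4x4 matrix, values on the first superdiagonal
print(D.shape)

print(np.diag(D, k=1))         # given a 2-D array, np.diag extracts a diagonal
print(np.diag(D))              # the main diagonal is all zeros here
```

Note that the offset `k=1` makes the result one row and one column larger than the input sequence.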

### `np.eye`

In [None]:
# the identity matrix: ones on the main diagonal, zeros elsewhere
np.eye(3)  # 3 is the number of rows (and, by default, columns)

### `np.zeros` and `np.ones`

In [None]:
np.zeros((3,3))

In [None]:
np.ones((3, 3))

### DIY

***Try by yourself*** the following commands:

    np.zeros((3,4))
    np.ones((3,4))
    np.empty((2,3))
    np.eye(5)
    np.diag(np.arange(5))
    np.tile(np.array([[6, 7], [8, 9]]), (2, 2))

In [None]:
np.zeros((3, 4))

In [None]:
np.tile(np.array([[6, 7], [8, 9]]), (2,3))

In [None]:
np.empty((2,3))
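
Unlike `np.zeros`, `np.empty` does not initialize the memory it allocates, so the values printed above are arbitrary and may differ between runs. A minimal sketch of the usual pattern, filling the array before reading it; `np.full` is shown only for comparison.

```python
import numpy as np

buf = np.empty((2, 3))    # contents are arbitrary: memory is not initialized
buf[:] = 7.0              # so assign every element before reading the values
print(buf)

full = np.full((2, 3), 7.0)   # np.full allocates and fills in one step
print(np.array_equal(buf, full))
```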

## So, why is it useful then?

So far the `numpy.ndarray` looks awfully much like a Python **list** (or **nested list**). 

*Why not simply use Python lists for computations instead of creating a new array type?*

There are several reasons:

* Python lists are very general. 
    - They can contain any kind of object. 
    - They are dynamically typed. 
    - They do not support mathematical functions such as matrix and dot multiplications, etc. 
    - Implementing such functions for Python lists would not be very efficient because of the dynamic typing.
    
    
* Numpy arrays are **statically typed** and **homogeneous**. 
    - The type of the elements is determined when the array is created.
    
    
* Numpy arrays are memory efficient.
    - Because of the static typing, mathematical functions such as multiplication and addition of `numpy` arrays can be implemented efficiently in a compiled language (C and Fortran are used).
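
The static typing described above can be observed directly: once an array is created, values assigned to it are converted to its dtype. This is a minimal sketch; the variable names are illustrative.

```python
import numpy as np

ints = np.array([1, 2, 3])        # integer dtype inferred from the values
ints[0] = 7.9                     # the float is truncated to fit the integer dtype
print(ints, ints.dtype)

try:
    ints[1] = 'hello'             # cannot be converted to an integer
except ValueError as err:
    print('ValueError:', err)

floats = ints.astype(np.float64)  # changing the type requires a new array
print(floats.dtype)
```

A Python list would happily store the float and the string side by side, which is exactly the generality that makes fast numerical code hard to write for lists.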

In [None]:
import numpy as np

In [None]:
L = range(1000)

In [None]:
%timeit [i**2 for i in L]

In [None]:
a = np.arange(1000)

In [None]:
%timeit a**2

In [None]:
%timeit [element**2 for element in a]
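
The `%timeit` magic is only available in IPython and Jupyter. Outside a notebook, a comparable measurement can be sketched with the standard `timeit` module. The repetition count of 200 is an arbitrary choice, and the absolute timings depend on the machine.

```python
import timeit
import numpy as np

L = range(1000)
a = np.arange(1000)

# Time each approach; absolute numbers depend on the machine
t_list = timeit.timeit(lambda: [i**2 for i in L], number=200)
t_vec = timeit.timeit(lambda: a**2, number=200)
t_loop = timeit.timeit(lambda: [element**2 for element in a], number=200)

print(f'list comprehension over range: {t_list:.4f} s')
print(f'vectorized a**2:               {t_vec:.4f} s')
print(f'Python loop over the array:    {t_loop:.4f} s')

# All three approaches compute the same values
squares = a**2
print(squares.tolist() == [i**2 for i in L])
```

The vectorized expression is typically the fastest, while looping over a numpy array element by element in Python gives up the benefit of the compiled implementation.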