# Numpy Basic Concepts

## 1 What is NumPy?

NumPy is the fundamental package for scientific computing with Python. It is:

* a powerful Python extension for N-dimensional arrays
* a tool for integrating C/C++ and Fortran code
* designed for scientific computation: linear algebra and signal analysis

If you are a MATLAB&reg; user, we recommend reading [Numpy for MATLAB Users](http://www.scipy.org/NumPy_for_Matlab_Users) and [Benefit of Open Source Python versus commercial packages](http://www.scipy.org/NumPyProConPage). For an idea of the open source approach to science, we suggest the [Science Code Manifesto](http://sciencecodemanifesto.org/).

### 1.1 Documentation and reference:

* [Numpy Reference guide](http://docs.scipy.org/doc/numpy/reference/)
* [SciPy Reference](http://docs.scipy.org/doc/scipy/reference/)
* [Scipy Topical Software](http://www.scipy.org/Topical_Software)
* [Numpy Functions by Category](http://www.scipy.org/Numpy_Functions_by_Category)
* [Numpy Example List With Doc](http://www.scipy.org/Numpy_Example_List_With_Doc)  

Let's start by checking the Numpy version used in this Notebook:

In [1]:
import numpy as np
print ('numpy version: ', np.__version__)

('numpy version: ', '1.12.0')


## 2 Array Creation

NumPy's main object is the homogeneous ***multidimensional array***. It is a table of elements (usually numbers), all of the same type. In Numpy dimensions are called ***axes***. The number of axes is called ***rank***. The most important attributes of an ndarray object are:

* **ndarray.ndim**     - the number of axes (dimensions) of the array. 
* **ndarray.shape**    - the dimensions of the array. For a matrix with n rows and m columns, shape will be (n,m). 
* **ndarray.size**     - the total number of elements of the array. 
* **ndarray.dtype**    - numpy.int32, numpy.int16, and numpy.float64 are some examples. 
* **ndarray.itemsize** - the size in bytes of each element of the array. For example, elements of type float64 have itemsize 8 (=64/8) 

In [2]:
a = np.array([[0,1,2,3], [4,5,6,7], [8,9,10,11]])
rows, cols = np.shape(a)
print ('Rows:{0:03d} ; Cols:{1:03d}'.format(rows, cols))

Rows:003 ; Cols:004


**Try by yourself**   the following commands *(type or paste the commands in the cell below)*:

    a.ndim                  # Number of dimensions
    print(a.dtype.name)     # Type of data
    a.itemsize              # Size in bytes of elements
    a.size                  # Number of elements in the array

In [7]:
a.ndim
print(a.dtype.name)
print(a.itemsize)
print(a.size)
print(a)

int64
8
12
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


The type of the array can be specified at creation time:

In [8]:
b = np.array([[2,3], [6,7]], dtype=np.complex64)
print (b)

[[ 2.+0.j  3.+0.j]
 [ 6.+0.j  7.+0.j]]
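As a small illustration of how the chosen `dtype` affects storage, the sketch below builds the same values with two different element types and compares `itemsize` and `nbytes`. The variable names are only illustrative.

```python
import numpy as np

# Same values, different element types
small = np.array([1, 2, 3], dtype=np.int16)
big = np.array([1, 2, 3], dtype=np.float64)

print(small.itemsize)  # 2 bytes per element
print(big.itemsize)    # 8 bytes per element
print(small.nbytes)    # 6 bytes in total (3 elements * 2 bytes)
print(big.nbytes)      # 24 bytes in total (3 elements * 8 bytes)
```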


### 2.1 Array creation functions

Often, the elements of an array are originally unknown, but its size is known. Hence, **NumPy** offers several functions to create arrays with initial placeholder content.

The function `zeros` creates an array full of zeros, the function `ones` creates an array full of ones, and the function `empty` creates an array without initializing its entries, so its content is arbitrary and depends on the state of the memory. By default, the dtype of the created array is float64.  
***Try by yourself*** the following commands:

    np.zeros((3,4))
    np.ones((3,4))
    np.empty((2,3))
    np.eye(3)
    np.diag(np.arange(5))
    np.tile(np.array([[6, 7], [8, 9]]), (2, 2))

In [19]:
np.tile(np.array([[6,7], [8,9]]), (3,2))

array([[6, 7, 6, 7],
       [8, 9, 8, 9],
       [6, 7, 6, 7],
       [8, 9, 8, 9],
       [6, 7, 6, 7],
       [8, 9, 8, 9]])
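
The creation functions listed above can be combined in one short sketch. It only inspects shapes and dtypes, because the content of an array returned by `empty` is not predictable. The variable names are illustrative.

```python
import numpy as np

z = np.zeros((3, 4))             # 3x4 array of 0.0 (float64 by default)
o = np.ones((3, 4), dtype=int)   # the dtype can be overridden at creation time
e = np.empty((2, 3))             # uninitialized: do not rely on its content
i3 = np.eye(3)                   # 3x3 identity matrix
d = np.diag(np.arange(5))        # 5x5 matrix with 0..4 on the main diagonal

print(z.shape)
print(z.dtype)
print(o.dtype)
print(e.shape)
print(d.shape)
```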

`zeros_like`, `ones_like` and `empty_like` can be used to create arrays with the same shape and type as a given one:

In [17]:
np.zeros_like(b)

array([[ 0.+0.j,  0.+0.j],
       [ 0.+0.j,  0.+0.j]], dtype=complex64)

### 2.2 Sequences and reshaping

Arrays can be created with ***linspace*** (evenly spaced numbers), ***logspace*** (numbers evenly spaced on a logarithmic scale) or ***arange*** and then reshaped into matrix form. **mgrid** is similar to "meshgrid" in MATLAB.

In [20]:
np.logspace(1,5,3)

array([  1.00000000e+01,   1.00000000e+03,   1.00000000e+05])
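
To make the relation between the two functions explicit, the sketch below rebuilds the `logspace` result from `linspace` (the default base of `logspace` is 10) and contrasts `arange`, which excludes the stop value, with `linspace`, which includes it by default.

```python
import numpy as np

exponents = np.linspace(1, 5, 3)     # 1, 3, 5
by_hand = 10.0 ** exponents          # 10^1, 10^3, 10^5
direct = np.logspace(1, 5, 3)        # same values in a single call

print(np.allclose(by_hand, direct))  # True

print(np.arange(0, 10, 2))           # 0, 2, 4, 6, 8 (stop value excluded)
print(np.linspace(0, 10, 6))         # 0, 2, 4, 6, 8, 10 (stop value included)
```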

In [33]:
x = np.arange(4).reshape(2,2)
print(x)

[[0 1]
 [2 3]]


In [34]:
# Use a list comprehension to create a matrix
c = np.array([[10*j+i for i in range(3)] for j in range(4)])
print (c)

[[ 0  1  2]
 [10 11 12]
 [20 21 22]
 [30 31 32]]


Use *'newaxis'* to add a dimension (for example, to turn a row vector into a column vector):

In [35]:
d = np.linspace(0, 12, 5)
print (d)
print (d[:, np.newaxis])       # make into a column vector

[  0.   3.   6.   9.  12.]
[[  0.]
 [  3.]
 [  6.]
 [  9.]
 [ 12.]]
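
A quick way to see what `newaxis` does is to look at the resulting shapes. The sketch below also combines a column and a row view through broadcasting. The names `col`, `row` and `table` are only illustrative.

```python
import numpy as np

d = np.linspace(0, 12, 5)

col = d[:, np.newaxis]   # shape (5, 1): column vector
row = d[np.newaxis, :]   # shape (1, 5): row vector
print(col.shape)
print(row.shape)

# newaxis gives the same result as an explicit reshape
print(np.array_equal(col, d.reshape(-1, 1)))  # True

# Adding a column to a row broadcasts to a full 5x5 table of sums
table = col + row
print(table.shape)
```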


In [40]:
X, Y = np.mgrid[0:4, 0:6] # similar to meshgrid in MATLAB
X

array([[0, 0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3, 3]])

In [37]:
Y

array([[0, 1, 2, 3, 4, 5],
       [0, 1, 2, 3, 4, 5],
       [0, 1, 2, 3, 4, 5],
       [0, 1, 2, 3, 4, 5]])
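
For readers who prefer an explicit function call, the sketch below compares `mgrid` with `np.meshgrid`. With `indexing='ij'` the two give identical arrays, while the default `indexing='xy'` returns the transposed layout.

```python
import numpy as np

X, Y = np.mgrid[0:4, 0:6]

# Matrix-style ('ij') indexing reproduces the mgrid result exactly
Xm, Ym = np.meshgrid(np.arange(4), np.arange(6), indexing='ij')
print(np.array_equal(X, Xm) and np.array_equal(Y, Ym))  # True

# The default Cartesian ('xy') indexing swaps the first two axes
Xc, Yc = np.meshgrid(np.arange(4), np.arange(6))
print(Xc.shape)                   # (6, 4)
print(np.array_equal(Xc, X.T))    # True
```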

### 2.3 Sparse Matrices

We can create and manipulate sparse matrices as follows:

In [None]:
from scipy import sparse
X = np.random.random((5, 6)) # Create an array with many zeros
X[X < 0.85] = 0
print (X)
X_csr = sparse.csr_matrix(X) # turn X into a csr (Compressed-Sparse-Row) matrix
print (X_csr)

In [None]:
print (X_csr.toarray())       # convert back to a dense array

There are several other sparse formats that can be useful for various problems:

- `CSC` (compressed sparse column)
- `BSR` (block sparse row)
- `COO` (coordinate)
- `DIA` (diagonal)
- `DOK` (dictionary of keys)

The ``scipy.sparse`` submodule also has a lot of functions for sparse matrices
including linear algebra, sparse solvers, graph algorithms, and much more.
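
As a minimal sketch of how these formats relate, the example below converts one small matrix between CSR, CSC and COO and multiplies it by a dense vector. The matrix values are made up for illustration.

```python
import numpy as np
from scipy import sparse

dense = np.array([[0., 0., 3.],
                  [4., 0., 0.],
                  [0., 0., 5.]])

csr = sparse.csr_matrix(dense)   # compressed sparse row
csc = csr.tocsc()                # compressed sparse column
coo = csr.tocoo()                # coordinate format: (row, col, value) triplets

print(csr.nnz)                   # number of stored entries: 3
print(coo.row)                   # row index of each stored entry
print(coo.col)                   # column index of each stored entry
print(coo.data)                  # value of each stored entry

v = np.array([1., 2., 3.])
print(csr.dot(v))                # sparse matrix times dense vector
```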

### 2.4 Random Numbers

In [41]:
np.random.rand(4,5) # uniform random numbers in [0, 1)

array([[ 0.57897879,  0.2182849 ,  0.71412905,  0.85105183,  0.2979727 ],
       [ 0.24514884,  0.14669688,  0.18785645,  0.74472225,  0.14061606],
       [ 0.47525048,  0.87718709,  0.64131193,  0.82729023,  0.95650693],
       [ 0.77609161,  0.66450096,  0.19051333,  0.02897472,  0.29840524]])

In [42]:
np.random.randn(4,5) # standard normal distributed random numbers

array([[-0.27192225, -1.06377723, -1.51300418, -0.3642558 , -0.24201287],
       [ 1.71445128,  1.44929237,  0.1770945 , -1.48684098, -0.65529943],
       [-2.4301121 , -0.83981084,  1.89093333, -0.3442916 ,  1.1391274 ],
       [ 0.26653064, -1.498004  , -1.99776708, -0.08857519, -0.74023335]])
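
The numbers above change at every run. When reproducible results are needed, one option is a seeded `np.random.RandomState`, as sketched below. The seed value 42 and the scaling constants are arbitrary choices for illustration.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence
rng1 = np.random.RandomState(42)
rng2 = np.random.RandomState(42)

u1 = rng1.rand(2, 3)
u2 = rng2.rand(2, 3)
print(np.array_equal(u1, u2))   # True

# Standard normal samples can be shifted and scaled:
# here mean 10 and standard deviation 2
n = 10.0 + 2.0 * rng1.randn(1000)
print(n.shape)
```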

### 2.5 Casting

Forced casts:

In [43]:
a = np.array([1.7, 1.2, 1.6])
b = a.astype(int)           # <-- truncates to integer
b

array([1, 1, 1])

Rounding:

In [44]:
a = np.array([1.2, 1.5, 1.6, 2.5, 3.5, 4.5])
b = np.around(a)
print (b)                     # still floating-point
c = np.around(a).astype(int)
print (c)

[ 1.  2.  2.  2.  4.  4.]
[1 2 2 2 4 4]
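
Note in the output above that 2.5 and 4.5 are rounded down while 1.5 and 3.5 are rounded up: `np.around` rounds exact halves to the nearest even value. The sketch below compares it with the other rounding functions on a few sample values, including negative ones where truncation and `floor` differ.

```python
import numpy as np

a = np.array([-1.5, -0.5, 0.5, 1.5, 2.5])

print(np.around(a))    # halves go to the nearest even value: -2, -0, 0, 2, 2
print(np.floor(a))     # round towards minus infinity:       -2, -1, 0, 1, 2
print(np.ceil(a))      # round towards plus infinity:        -1, -0, 1, 2, 3
print(np.trunc(a))     # drop the fractional part:           -1, -0, 0, 1, 2
print(a.astype(int))   # casting truncates like trunc:       -1,  0, 0, 1, 2
```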


## 3 Linear Algebra

In [45]:
# Transpose
print (x.T)

[[0 2]
 [1 3]]


Explore the methods available on numpy arrays (type `x.` and press TAB). ***Try by yourself:*** 
    
    x.min()
    x.max()
    x.mean()
    x.cumsum()

In [46]:
x.min()

0

In [47]:
print (x*5)         # Scalar expansion
print (x+3)

[[ 0  5]
 [10 15]]
[[3 4]
 [5 6]]
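
Scalar expansion is a special case of broadcasting, which also applies to arrays of compatible shapes. A minimal sketch, with illustrative names `row` and `col`:

```python
import numpy as np

x = np.arange(4).reshape(2, 2)   # [[0, 1], [2, 3]]
row = np.array([10, 20])         # shape (2,)
col = np.array([[10], [20]])     # shape (2, 1)

print(x + row)   # row is added to every row of x:       [[10, 21], [12, 23]]
print(x + col)   # col is added to every column of x:    [[10, 11], [22, 23]]
```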


In [49]:
print(x)
print(x.T)

print (x*x.T)       # Elementwise product
print (np.dot(x,x.T))  # Dot (matrix) product

[[0 1]
 [2 3]]
[[0 2]
 [1 3]]
[[0 2]
 [2 9]]
[[ 1  3]
 [ 3 13]]


### 3.1 Determinant of a square matrix

The `scipy.linalg.det()` function computes the determinant of a square matrix:

In [50]:
from scipy import linalg
arr = np.array([[1, 2],
               [3, 4]])
linalg.det(arr)

-2.0

### 3.2 Inverse of a square matrix

The `scipy.linalg.inv()` function computes the inverse of a square matrix:

In [51]:
print (linalg.inv(arr))

[[-2.   1. ]
 [ 1.5 -0.5]]
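
A computed inverse can be checked by multiplying it with the original matrix. The sketch below also shows `linalg.solve`, which solves a linear system without forming the inverse explicitly. The right-hand side `rhs` is a made-up example.

```python
import numpy as np
from scipy import linalg

arr = np.array([[1, 2],
                [3, 4]])
iarr = linalg.inv(arr)

# arr times its inverse should be the identity, up to rounding error
print(np.allclose(np.dot(arr, iarr), np.eye(2)))   # True

# Solve arr * sol = rhs directly
rhs = np.array([5, 6])
sol = linalg.solve(arr, rhs)
print(np.allclose(np.dot(arr, sol), rhs))          # True
```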


### 3.3 Advanced Linear Algebra

In **Scipy** many advanced operations are available (check the Scipy Reference), for example singular-value decomposition (SVD):

In [54]:
arr = np.arange(9).reshape((3, 3)) + np.diag([1, 0, 1])
uarr, spec, vharr = linalg.svd(arr)
print(linalg.svd(arr))

(array([[-0.1617463 , -0.98659196,  0.02178164],
       [-0.47456365,  0.09711667,  0.87484724],
       [-0.86523261,  0.13116653, -0.48390895]]), array([ 14.88982544,   0.45294236,   0.29654967]), array([[-0.45513179, -0.54511245, -0.70406496],
       [ 0.20258033,  0.70658087, -0.67801525],
       [-0.86707339,  0.45121601,  0.21115836]]))


The resulting singular values (the spectrum of the array) are:

In [55]:
spec

array([ 14.88982544,   0.45294236,   0.29654967])
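
As a check on the decomposition, the original matrix can be rebuilt from the three factors. The sketch below places the singular values on the diagonal of a matrix and multiplies the factors back together. The names `sarr` and `rebuilt` are illustrative.

```python
import numpy as np
from scipy import linalg

arr = np.arange(9).reshape((3, 3)) + np.diag([1, 0, 1])
uarr, spec, vharr = linalg.svd(arr)

sarr = np.diag(spec)                  # singular values as a diagonal matrix
rebuilt = uarr.dot(sarr).dot(vharr)   # U * S * Vh

print(np.allclose(rebuilt, arr))      # True
```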

## 5 Slicing: Indexing for MATLAB<sup>&reg;</sup> Users

For MATLAB<sup>&reg;</sup> users: in Python, as in many other languages, indexing starts from **zero** and not from one as in MATLAB.

Remember: slices (indexed subarrays) are views onto the memory of the original array, so if you modify a slice, you modify the original array. In other words, a slice behaves like a pointer into the original array rather than an independent copy.

In [56]:
b = np.arange(8).reshape(2,4)
print (b)

[[0 1 2 3]
 [4 5 6 7]]
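
The view behaviour of slices can be verified with a short sketch: writing through a slice changes the original array, while writing to an explicit copy does not. The names `view` and `dup` are illustrative.

```python
import numpy as np

a = np.arange(6)

view = a[1:4]          # basic slice: shares memory with a
view[0] = 100
print(a[1])            # 100, the original array changed

dup = a[1:4].copy()    # explicit copy: independent memory
dup[0] = -1
print(a[1])            # still 100, the original array is untouched

print(np.may_share_memory(a, view))   # True
print(np.may_share_memory(a, dup))    # False
```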


### 5.1 Indexing single elements

***Try by yourself:***

    print(b[0,0])
    print(b[-1,-1])  # Last element
    print(b[:,1])    # column number 1 (second column)

In [58]:
# Indexing single elements
print(b[0,0])
print(b[:,1])

0
[1 5]


<img src="files/utilities/numpy_array.jpg" > *Figure 01*

### 5.2 Indexing by rows and columns

In [61]:
# With reference to Figure 01:
a = np.array([[10*j+i for i in range(6)] for j in range(6)])
print(a)

[[ 0  1  2  3  4  5]
 [10 11 12 13 14 15]
 [20 21 22 23 24 25]
 [30 31 32 33 34 35]
 [40 41 42 43 44 45]
 [50 51 52 53 54 55]]


***Try by yourself:***

    print(a[0,3:5])     # Orange
    print(a[4:,4:])     # Blue
    print(a[:, 2])      # Red
    print(a[2::2, ::2]) # Green

In [62]:
#Indexing multiple elements
print(a[0,3:5])

[3 4]
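
For reference, the sketch below runs all four slices from the exercise on the same 6x6 array and notes the values each one selects.

```python
import numpy as np

a = np.array([[10*j + i for i in range(6)] for j in range(6)])

print(a[0, 3:5])      # row 0, columns 3 and 4:         [3, 4]
print(a[4:, 4:])      # last two rows and columns:      [[44, 45], [54, 55]]
print(a[:, 2])        # whole column 2:                 [2, 12, 22, 32, 42, 52]
print(a[2::2, ::2])   # rows 2 and 4, columns 0, 2, 4:  [[20, 22, 24], [40, 42, 44]]
```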


To make an independent copy of an array, use 'copy':

In [None]:
c = np.array(a, copy=True)

## 6 File Input / Output

Numpy has special functions for:

* Load/Save text files: `numpy.loadtxt()`/`numpy.savetxt()`
* Clever loading of text/csv files: `numpy.genfromtxt()`/`numpy.recfromcsv()`
* Fast and efficient, but numpy-specific, binary format: `numpy.save()`/`numpy.load()`
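
A minimal round trip through the text and binary formats listed above might look as follows. The file names and the temporary directory are assumptions made only for this sketch.

```python
import os
import tempfile
import numpy as np

data = np.arange(6, dtype=float).reshape(2, 3)
tmpdir = tempfile.mkdtemp()   # scratch directory for the example files

# Text format: human readable, here with a comma as the delimiter
txt_path = os.path.join(tmpdir, 'data.txt')
np.savetxt(txt_path, data, delimiter=',')
from_txt = np.loadtxt(txt_path, delimiter=',')

# Binary .npy format: numpy-specific, keeps dtype and shape
npy_path = os.path.join(tmpdir, 'data.npy')
np.save(npy_path, data)
from_npy = np.load(npy_path)

print(np.allclose(from_txt, data))      # True
print(np.array_equal(from_npy, data))   # True
```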

In particular, SciPy (through `scipy.io`) can load and save native MATLAB<sup>&reg;</sup> files:

In [67]:
from scipy import io as spio
spio.savemat('test.mat', {'c': c}, oned_as='row') # savemat expects a dictionary
data = spio.loadmat('test.mat')
data['c']

array([[1, 2, 2, 2, 4, 4]])

---

Visit [www.add-for.com](<http://www.add-for.com/IT>) for more tutorials and updates.

This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.