## NumPy Basics Tutorial ##
by Briane Paul Samson

Welcome to our first tutorial in the COMET Data Science Workshops. This notebook will walk you through some basic NumPy functions and operations that you can use in any data analysis and data science project.


----------


From [numpy.org][1]:


> *NumPy is the fundamental package for scientific computing with Python. It contains among other things:*
>
> - *a powerful N-dimensional array object*
> - *sophisticated (broadcasting) functions*
> - *tools for integrating C/C++ and Fortran code*
> - *useful linear algebra, Fourier transform, and random number capabilities*
>
> *Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.*

Source: http://nbviewer.jupyter.org/github/twistedhardware/mltutorial/blob/master/notebooks/IPython-Tutorial/4%20-%20Numpy%20Basics.ipynb

  [1]: http://www.numpy.org/

## Importing NumPy ##

In [None]:
import numpy as np

## Creating Arrays ##
NumPy lets you create and operate on homogeneous multidimensional arrays, typically faster than doing the same work with standard Python lists.

**ndarray** is NumPy's array class; the **np.array** function is the usual way to create one. A dimension is called an *axis*, and the number of axes (the array's **ndim**) is sometimes called its *rank*.
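As a minimal sketch of the terms above, the snippet below builds a small two-axis array and inspects it. The name `m` and its values are made up for illustration.

```python
import numpy as np

# Hypothetical 2 x 3 array used only to illustrate the terminology.
m = np.array([[1, 2, 3], [4, 5, 6]])

print(isinstance(m, np.ndarray))  # True: np.array returns an ndarray
print(m.ndim)                     # 2: one axis for rows, one for columns
print(m.shape)                    # (2, 3)
```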

To create an array from scratch, we can use **np.arange** with the following syntax:

**np.arange([start,] stop[, step,], dtype=None)**

In [None]:
np.arange(10)

In [None]:
np.arange(1, 10)

In [None]:
np.arange(1, 10, 2)

In [None]:
np.arange(1, 20, 3, dtype=np.float64)

Alternatively, use **np.array** to build an array from a Python list or another sequence.


In [None]:
x = np.array([1, 3, 5, 7])
x

In [None]:
type(x)

In [None]:
np.array([(1, 3, 5), (7, 9, 11), (13, 15, 17)]).ndim

In [None]:
np.zeros((2, 3), dtype=np.int16)

In [None]:
np.ones((3, 3), dtype=np.int16)

In [None]:
np.empty((2, 2, 4))

In [None]:
np.linspace(2, 4, 10)  # 10 evenly spaced values from 2 to 4 inclusive; more predictable than arange for non-integer steps
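The precision remark can be illustrated with a short comparison. This is a sketch: the `arange` call with a step of `0.2` is an assumed counterpart chosen for illustration, and its length is printed rather than stated because it depends on floating point rounding.

```python
import numpy as np

# linspace takes the number of samples directly and includes both endpoints.
samples = np.linspace(2, 4, 10)
print(len(samples))             # 10
print(samples[0], samples[-1])  # 2.0 4.0

# arange derives its length from (stop - start) / step, so with a
# fractional step the element count is subject to rounding.
stepped = np.arange(2, 4, 0.2)
print(len(stepped))
```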

In [None]:
np.random.random((3, 3))

Important Array Attributes
--------------------

**Dimensions**

The number of axes (dimensions) of the array, available as **ndim**. Older NumPy documentation refers to this number as the array's *rank*.

In [None]:
ds = np.arange(1, 10, 3)
print(ds)
ds.ndim

In [None]:
twoD = np.arange(1, 30, 2).reshape(3, 5)  # 15 elements arranged as 3 rows x 5 columns
print(twoD)
twoD.ndim  # 2, because reshape(3, 5) produces two axes

**Shape**

A tuple giving the size of the array along each dimension. For a matrix with *n* rows and *m* columns, the shape is **(n, m)**.


In [None]:
ds.shape
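To make the **(n, m)** case concrete, here is a small sketch with an assumed 3 x 4 array next to a 1-D array like `ds`. The names `table` and `vector` are illustrative only.

```python
import numpy as np

table = np.zeros((3, 4))  # hypothetical matrix: 3 rows, 4 columns
n, m = table.shape
print(table.shape)        # (3, 4)
print(n, m)               # 3 4

vector = np.arange(1, 10, 3)  # same values as ds: [1, 4, 7]
print(vector.shape)           # (3,) a one-element tuple for a 1-D array
```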

**Size**

The total number of elements in the array. This equals the product of the entries in **shape** (for a matrix, *n* times *m*).


In [None]:
ds.size
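The relationship between size and shape can be checked directly. The sketch below uses an assumed three-axis array named `cube`.

```python
import numpy as np

cube = np.ones((2, 3, 4))   # hypothetical array with three axes
print(cube.size)            # 24
print(np.prod(cube.shape))  # 24, i.e. 2 * 3 * 4
```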

**Data Type**

The data type shared by all elements of the array.


In [None]:
ds.dtype

**Item Size**

The size in bytes of each element (e.g. a `float64` element takes `64/8 = 8` bytes).



In [None]:
ds.itemsize
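The bits-to-bytes arithmetic can be verified with explicit dtypes. In this sketch the arrays `floats` and `shorts` are illustrative; `nbytes` is the total memory used by the elements.

```python
import numpy as np

floats = np.zeros(5, dtype=np.float64)
shorts = np.zeros(5, dtype=np.int16)

print(floats.itemsize)  # 8 bytes (64 bits / 8)
print(shorts.itemsize)  # 2 bytes (16 bits / 8)
print(floats.nbytes)    # 40, which is itemsize * size
```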

Basic Operations
----------------
Arithmetic operators on arrays apply elementwise. A new array is created and filled with the result.

In [None]:
a = np.arange(5)
b = np.array([2, 4, 0, 1, 2])
a

In [None]:
diff = a-b
diff

In [None]:
b**2

In [None]:
2*b
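As a quick check of the elementwise behaviour, the sketch below repeats the arrays `a` and `b` and shows that each operation returns a new array while the operands stay unchanged.

```python
import numpy as np

a = np.arange(5)               # [0, 1, 2, 3, 4]
b = np.array([2, 4, 0, 1, 2])

diff = a - b
squares = b ** 2

print(diff.tolist())     # [-2, -3, 2, 2, 2]
print(squares.tolist())  # [4, 16, 0, 1, 4]
print(b.tolist())        # [2, 4, 0, 1, 2], b itself is not modified
```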

In [None]:
np.sin(a)

In [None]:
np.sum(a)

In [None]:
np.max(a)

In [None]:
np.min(a)

In [None]:
b > 2

In [None]:
a*b  # the * operator multiplies elementwise; it is not matrix multiplication

In [None]:
x = np.array([[1,1], [0,1]])
y = np.array([[2,0], [3,4]])
x*y

In [None]:
x.dot(y) #same as np.dot(x, y)
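To contrast the two kinds of product side by side, here is a sketch using the same `x` and `y`. The `@` operator is shown as an equivalent spelling of the matrix product for 2-D arrays.

```python
import numpy as np

x = np.array([[1, 1], [0, 1]])
y = np.array([[2, 0], [3, 4]])

elementwise = x * y   # multiplies matching entries
matrix = x.dot(y)     # rows of x times columns of y
also_matrix = x @ y   # same matrix product via the @ operator

print(elementwise.tolist())  # [[2, 0], [0, 4]]
print(matrix.tolist())       # [[5, 4], [3, 4]]
```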

In [None]:
x.sum()

In [None]:
x.sum(axis=0)

In [None]:
x.sum(axis=1)
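The `axis` argument selects the dimension that gets collapsed. A small sketch with the same `x` shows the three cases together.

```python
import numpy as np

x = np.array([[1, 1], [0, 1]])

total = x.sum()           # all elements
col_sums = x.sum(axis=0)  # collapses the rows, one sum per column
row_sums = x.sum(axis=1)  # collapses the columns, one sum per row

print(total)              # 3
print(col_sums.tolist())  # [1, 2]
print(row_sums.tolist())  # [2, 1]
```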

In [None]:
z = np.random.random((3, 4))
z

In [None]:
np.mean(z)

In [None]:
np.median(z)

In [None]:
np.std(z)

Reshaping Arrays
----------------

In [None]:
data_set = np.random.random((2,3))
data_set

In [None]:
np.reshape(data_set, (3,2))

In [None]:
np.reshape(data_set, (6,1))

In [None]:
np.reshape(data_set, 6)  # a 1-D shape can be given as an int; (6) is just 6, the tuple form is (6,)

In [None]:
np.ravel(data_set)
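Because `data_set` is random, the effect of reshaping is easier to follow on known values. The sketch below uses an assumed array `grid` holding 0 to 5. It shows that element order is preserved and that the new shape must hold the same number of elements.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

reshaped = np.reshape(grid, (3, 2))
flat = np.ravel(grid)

print(reshaped.tolist())  # [[0, 1], [2, 3], [4, 5]]
print(flat.tolist())      # [0, 1, 2, 3, 4, 5]

# A shape with a different total size is rejected.
try:
    np.reshape(grid, (4, 2))
except ValueError as err:
    print("cannot reshape:", err)
```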

Slicing Arrays
--------------

In [None]:
data_set = np.random.random((5,10))
data_set

In [None]:
data_set[1]

In [None]:
data_set[1][0]

In [None]:
data_set[1,0]

**Slicing range of data**


In [None]:
data_set[2:4]

In [None]:
data_set[2:4,0]

In [None]:
data_set[2:4,0:2]

In [None]:
data_set[:,0]

**Slicing data with steps**


In [None]:
data_set[2:5:2]

In [None]:
data_set[::]

In [None]:
data_set[::2]

In [None]:
data_set[2:4]

In [None]:
data_set[2:4,::2]
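The slices above are easier to read on predictable values. This sketch swaps the random `data_set` for an assumed array `grid` of the same shape, in which the entry at row `r`, column `c` equals `10 * r + c`.

```python
import numpy as np

grid = np.arange(50).reshape(5, 10)  # grid[r, c] == 10 * r + c

print(grid[1, 0])                # 10: row 1, column 0
print(grid[2:4, 0].tolist())     # [20, 30]: rows 2-3, column 0
print(grid[2:4, 0:2].tolist())   # [[20, 21], [30, 31]]
print(grid[2:5:2, 0].tolist())   # [20, 40]: rows 2 and 4
print(grid[2:4, ::2].tolist())   # rows 2-3, every second column
```

The stop index is excluded in every case, so `2:4` selects rows 2 and 3 only.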