### NUMPY - Multidimensional Data Arrays

NumPy is a package that provides high-performance vector, matrix, and higher-dimensional data structures for Python. It brings the computational power of languages like C and Fortran to Python, a language that is much easier to learn and use.

In [1]:
import numpy as np

In [2]:
np.__version__

'1.19.2'

In [3]:
np.info(np.add)  # np.info prints the documentation itself, so wrapping it in print() is unnecessary

add(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

Add arguments element-wise.

Parameters
----------
x1, x2 : array_like
    The arrays to be added.
    If ``x1.shape != x2.shape``, they must be broadcastable to a common
    shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
    A location into which the result is stored. If provided, it must have
    a shape that the inputs broadcast to. If not provided or None,
    a freshly-allocated array is returned. A tuple (possible only as a
    keyword argument) must have length equal to the number of outputs.
where : array_like, optional
    This condition is broadcast over the input. At locations where the
    condition is True, the `out` array will be set to the ufunc result.
    Elsewhere, the `out` array will retain its original value.
    Note that if an uninitialized `out` array is created via the default
    ``out

## What is an Array?

![](image/array.png)

In [4]:
# Python List

a = [1, 2, 3, 4]
b = [5, 6, 7, 8]

In [7]:
print(a+b)
print(a*b)  # raises TypeError: lists do not support element-wise multiplication

[1, 2, 3, 4, 5, 6, 7, 8]


TypeError: can't multiply sequence by non-int of type 'list'

In the `numpy` package, vectors, matrices, and higher-dimensional data sets are all called *arrays*.



In [8]:
a_array = np.array(a)
b_array = np.array(b)

print(a_array + b_array)
print(a_array * b_array)

[ 6  8 10 12]
[ 5 12 21 32]
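Note that `*` on arrays multiplies element by element; it is not a dot product. As a minimal sketch, reusing the values of `a` and `b` from above, the dot product is available through the `@` operator or `np.dot`:

```python
import numpy as np

a_array = np.array([1, 2, 3, 4])
b_array = np.array([5, 6, 7, 8])

elementwise = a_array * b_array          # multiplies matching positions
dot_product = a_array @ b_array          # sum of the element-wise products
same_result = np.dot(a_array, b_array)   # function form of the same operation

print(elementwise)   # [ 5 12 21 32]
print(dot_product)   # 70
```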


## Creating `numpy` arrays

There are several ways to initialize new NumPy arrays:
* from a Python list or tuple
* using functions that are dedicated to generating numpy arrays, such as `arange`, `linspace`, etc.
* reading data from files
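As a minimal sketch of the third option, `np.loadtxt` can parse delimited text into an array. The example uses an in-memory `io.StringIO` object as a stand-in for a real file, so the contents and the variable names are illustrative assumptions; a tuple is also passed to `np.array` to show that it behaves like a list.

```python
import io
import numpy as np

# Stand-in for a file on disk; passing a real file path to np.loadtxt works the same way.
fake_file = io.StringIO("1,2,3\n4,5,6")

from_file = np.loadtxt(fake_file, delimiter=",")   # parsed as floats by default
from_tuple = np.array((1, 2, 3))                   # a tuple works like a list

print(from_file.shape)   # (2, 3)
print(from_file.dtype)   # float64
print(from_tuple)        # [1 2 3]
```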

### Lists

In [9]:
# vector: the argument to the array function is a list
v = np.array([1, 2, 3, 4, 5])

v

array([1, 2, 3, 4, 5])

In [10]:
# matrix: the argument to the array function is a nested list
m = np.array([
              [1, 2, 3], 
              [4, 5, 6]
            ])

m

array([[1, 2, 3],
       [4, 5, 6]])

The `v` and `m` objects are both of the type `ndarray` that the `numpy` module provides.

In [11]:
type(v), type(m)

(numpy.ndarray, numpy.ndarray)

In [12]:
a = [1,2,3]

In [13]:
type(a)

list

The only difference between the `v` and `m` arrays is their shape. We can get the shape of an array from the `ndarray.shape` attribute.

In [14]:
v.shape

(5,)

In [15]:
m.shape

(2, 3)

The number of elements in the array is available through the `ndarray.size` attribute.

In [16]:
m.size

6
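A short sketch of how these attributes relate, using the same `v` and `m` values as above: `size` is the product of the entries in `shape`, and `ndim` gives the number of dimensions.

```python
import numpy as np

v = np.array([1, 2, 3, 4, 5])
m = np.array([[1, 2, 3], [4, 5, 6]])

print(v.ndim, m.ndim)                       # 1 2
print(m.size == m.shape[0] * m.shape[1])    # True
print(len(m))                               # 2, the length of the first axis only
```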

`numpy.ndarray` looks very similar to a Python `list`. So why not simply use a list instead?
`numpy.ndarray` is preferred for several reasons:
1. Lists are very general and can contain any kind of object, but they do not support mathematical operations such as element-wise arithmetic, matrix multiplication, or dot products.
2. NumPy arrays are statically typed and homogeneous. The type of the elements is determined when the array is created.
3. NumPy arrays are memory efficient.
4. Mathematical functions on NumPy arrays are fast, because static typing allows them to be implemented in a compiled language.
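The typing and memory points can be illustrated with a small sketch. The variable names are illustrative only; the behaviour shown is standard NumPy casting.

```python
import numpy as np

ints = np.array([1, 2, 3])
ints[0] = 7.9                    # the array holds integers, so the fraction is dropped
mixed = np.array([1, 2.5, 3])    # a single float makes every element a float

print(ints)          # [7 2 3]
print(mixed.dtype)   # float64

# Memory use is fixed by the element type: 1 byte vs 8 bytes per element.
small = np.zeros(1000, dtype=np.int8)
large = np.zeros(1000, dtype=np.float64)
print(small.nbytes, large.nbytes)   # 1000 8000
```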

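We can inspect the element type of an array with the `dtype` attribute. The default integer type depends on the platform and NumPy version, so the output may be `int64` rather than `int32`.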

In [17]:
m.dtype

dtype('int32')

If we want, we can explicitly define the type of the array data when we create it, using the `dtype` keyword argument: 

In [18]:
m = np.array([[1, 2, 3], [4, 5, 6]], dtype=float)

m

array([[1., 2., 3.],
       [4., 5., 6.]])

Common data types that can be used with `dtype` are: `int`, `float`, `complex`, `bool`, `object`, etc.
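A brief sketch of a few of these types in use. The `astype` method, which returns a converted copy of an existing array, is included as a related way to change the type after creation.

```python
import numpy as np

as_complex = np.array([1, 2, 3], dtype=complex)
as_bool = np.array([0, 1, 2], dtype=bool)          # zero is False, anything else is True
as_int = np.array([1.7, 2.2, 3.9]).astype(int)     # astype truncates toward zero

print(as_complex)   # [1.+0.j 2.+0.j 3.+0.j]
print(as_bool)      # [False  True  True]
print(as_int)       # [1 2 3]
```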

### Matrix of zeros

In [19]:
# One dimension

zeros_matrix = np.zeros(5)

zeros_matrix

array([0., 0., 0., 0., 0.])

In [23]:
# three dimensions

zeros_matrix2 = np.zeros((5,2,4))
zeros_matrix2

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.]]])

In [None]:
# for more than one dimension, the shape must be passed as a tuple

zeros_matrix2 = np.zeros(shape=(5,2)) # 5 rows, 2 columns
zeros_matrix2 

### Matrix of ones

In [24]:
# one dimension

matrix_ones = np.ones(5)
matrix_ones 

array([1., 1., 1., 1., 1.])

In [None]:
# three dimensions

matrix_ones2 = np.ones((3, 4, 2))  # 3 blocks, each with 4 rows and 2 columns
matrix_ones2
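How the three numbers in the shape tuple map onto the array can be checked by indexing one axis at a time. A minimal sketch:

```python
import numpy as np

matrix_ones2 = np.ones((3, 4, 2))

print(matrix_ones2.ndim)          # 3
print(matrix_ones2.shape)         # (3, 4, 2)
print(matrix_ones2[0].shape)      # (4, 2): one block along the first axis
print(matrix_ones2[0, 0].shape)   # (2,): one row inside that block
```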

## Exercise 1

1. Create a matrix from a list which has 4 rows and 3 columns

2. Create the following matrix
![](image/lat11.png)

3. Create a 2D matrix with size of 10

4. Create a 3D matrix of ones which has 2 rows, 3 columns, and 3 depth

5. Create the following arrays, starting from arrays of zeros and using `for` loops
![](image/exercise1.png)

In [34]:
a = np.zeros((5,3))
a

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [30]:
b = a + 2
b

array([[2., 2., 2.],
       [2., 2., 2.],
       [2., 2., 2.],
       [2., 2., 2.],
       [2., 2., 2.]])

In [35]:
# each row yielded by the loop is a view into `a`, so += modifies `a` in place
for row in a:
    row += 2
a

array([[2., 2., 2.],
       [2., 2., 2.],
       [2., 2., 2.],
       [2., 2., 2.],
       [2., 2., 2.]])
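The loop above changes the array because iterating over a 2-D array yields rows that are views into it. The same pattern does not work on a 1-D array, where each item is a standalone scalar. A small sketch of the difference, with illustrative variable names:

```python
import numpy as np

grid = np.zeros((5, 3))
for row in grid:      # each row is a view into grid
    row += 2          # the in-place add writes through to grid

flat = np.zeros(3)
for value in flat:    # each value is a scalar copy, not a view
    value += 2        # rebinds the loop variable only; flat is unchanged

flat_fixed = flat + 2   # the loop-free form works for any number of dimensions

print(grid[0])      # [2. 2. 2.]
print(flat)         # [0. 0. 0.]
print(flat_fixed)   # [2. 2. 2.]
```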

### Using array-generating functions

For larger arrays it is impractical to initialize the data manually using explicit Python lists. Instead, we can use one of the many functions in `numpy` that generate arrays of different forms. Some of the more common are:

#### arange

In [36]:
# create a range

x = np.arange(10)

x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [37]:
for i in range(10):
    print(i)

0
1
2
3
4
5
6
7
8
9


In [38]:
# create a range

x = np.arange(10, 20) # arguments: start, stop

x

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

In [42]:
# create a range

x = np.arange(10, 20, 2) # arguments: start, stop, step

x

array([10, 12, 14, 16, 18])

In [40]:
x = np.arange(-1, 1, 0.1)

x

array([-1.00000000e+00, -9.00000000e-01, -8.00000000e-01, -7.00000000e-01,
       -6.00000000e-01, -5.00000000e-01, -4.00000000e-01, -3.00000000e-01,
       -2.00000000e-01, -1.00000000e-01, -2.22044605e-16,  1.00000000e-01,
        2.00000000e-01,  3.00000000e-01,  4.00000000e-01,  5.00000000e-01,
        6.00000000e-01,  7.00000000e-01,  8.00000000e-01,  9.00000000e-01])

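The value `9.00000000e-01` is an ordinary floating-point number written in scientific notation: it equals 9 * 10**-1, or 0.9.
The entry `-2.22044605e-16` appears where 0 would be expected. It is a tiny rounding error that arises because 0.1 cannot be represented exactly in binary floating point.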
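Because of this rounding, a non-integer step in `arange` can give slightly inexact values. One common alternative, sketched here under the assumption that the same 20 points are wanted, is `linspace` with `endpoint=False`:

```python
import numpy as np

x_arange = np.arange(-1, 1, 0.1)
x_linspace = np.linspace(-1, 1, 20, endpoint=False)   # 20 samples, stop excluded

print(len(x_arange), len(x_linspace))
print(np.allclose(x_arange, x_linspace))   # the two agree up to rounding error
print(np.isclose(x_arange[10], 0.0))       # the near-zero entry is zero within tolerance
```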

#### linspace

In [50]:
np.info(np.linspace)

 linspace(*args, **kwargs)

Return evenly spaced numbers over a specified interval.

Returns `num` evenly spaced samples, calculated over the
interval [`start`, `stop`].

The endpoint of the interval can optionally be excluded.

.. versionchanged:: 1.16.0
    Non-scalar `start` and `stop` are now supported.

Parameters
----------
start : array_like
    The starting value of the sequence.
stop : array_like
    The end value of the sequence, unless `endpoint` is set to False.
    In that case, the sequence consists of all but the last of ``num + 1``
    evenly spaced samples, so that `stop` is excluded.  Note that the step
    size changes when `endpoint` is False.
num : int, optional
    Number of samples to generate. Default is 50. Must be non-negative.
endpoint : bool, optional
    If True, `stop` is the last sample. Otherwise, it is not included.
    Default is True.
retstep : bool, optional
    If True, return (`samples`, `step`), where `step` is the spacing
    between samples.
dty

In [49]:
# using linspace, both end points ARE included
np.linspace(0, 10)  # unlike arange, which takes a step size, linspace takes the number of samples (default 50)

array([ 0.        ,  0.20408163,  0.40816327,  0.6122449 ,  0.81632653,
        1.02040816,  1.2244898 ,  1.42857143,  1.63265306,  1.83673469,
        2.04081633,  2.24489796,  2.44897959,  2.65306122,  2.85714286,
        3.06122449,  3.26530612,  3.46938776,  3.67346939,  3.87755102,
        4.08163265,  4.28571429,  4.48979592,  4.69387755,  4.89795918,
        5.10204082,  5.30612245,  5.51020408,  5.71428571,  5.91836735,
        6.12244898,  6.32653061,  6.53061224,  6.73469388,  6.93877551,
        7.14285714,  7.34693878,  7.55102041,  7.75510204,  7.95918367,
        8.16326531,  8.36734694,  8.57142857,  8.7755102 ,  8.97959184,
        9.18367347,  9.3877551 ,  9.59183673,  9.79591837, 10.        ])

In [51]:
np.linspace(1,10,4) 

array([ 1.,  4.,  7., 10.])
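The `retstep` and `endpoint` parameters described in the help text above can be tried out directly. A minimal sketch:

```python
import numpy as np

# retstep=True also returns the spacing between samples
samples, step = np.linspace(1, 10, 4, retstep=True)
print(samples)   # [ 1.  4.  7. 10.]
print(step)      # 3.0

# endpoint=False leaves out the stop value, which changes the step size
open_end = np.linspace(0, 10, 5, endpoint=False)
print(open_end)  # [0. 2. 4. 6. 8.]
```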

#### random data

In [52]:
from numpy import random

In [55]:
random.rand()

0.17577035621663983

In [60]:
# uniform random numbers in [0, 1): 0 is included, 1 is not
random.rand(5,4,2) 

array([[[0.23705977, 0.9552622 ],
        [0.62762698, 0.12954894],
        [0.26540048, 0.3072095 ],
        [0.57441031, 0.71068083]],

       [[0.23002594, 0.56352986],
        [0.27167848, 0.49787894],
        [0.02747366, 0.71146809],
        [0.11219496, 0.81186841]],

       [[0.49558849, 0.8818716 ],
        [0.9827702 , 0.1066238 ],
        [0.03924168, 0.87957167],
        [0.77835256, 0.68305711]],

       [[0.28392935, 0.26986687],
        [0.77504726, 0.10721146],
        [0.92087933, 0.51580894],
        [0.50316201, 0.2025724 ]],

       [[0.31689002, 0.76093634],
        [0.56551805, 0.80277119],
        [0.13342578, 0.80740062],
        [0.46406013, 0.40426382]]])

In [69]:
# standard normal distributed random numbers
x = random.randn(3,2)
x

array([[ 2.31613452, -1.30839248],
       [-0.95376582, -0.0874863 ],
       [-0.27890501, -1.35727722]])

In [70]:
x.dtype

dtype('float64')

In [None]:
x = np.ones(2, dtype = np.int64)
x

In [86]:
random.randint(10)  # random integer from 0 to 9 (the upper bound is exclusive)

0

In [87]:
random.randint(2, 10, size=4)

array([4, 7, 7, 2])

In [91]:
random.randint(5, 10, size=(4,2,2))

array([[[6, 5],
        [7, 6]],

       [[5, 6],
        [6, 8]],

       [[6, 5],
        [6, 6]],

       [[7, 9],
        [6, 7]]])
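The outputs above will differ on every run. If repeatable results are needed, for example when comparing answers to an exercise, one option is to seed the generator first. A minimal sketch, where the seed value 0 is an arbitrary choice:

```python
from numpy import random

random.seed(0)            # fix the generator state
first = random.rand(3)

random.seed(0)            # reset to the same state
second = random.rand(3)

print((first == second).all())   # True: same seed, same numbers

ints = random.randint(5, 10, size=100)
print(ints.min() >= 5, ints.max() <= 9)   # True True: 10 is never drawn
```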

## Exercise 2

1. Generate a 1-D array containing 5 random integers from 0 up to, but not including, 100:

In [96]:
a1 = random.randint(100, size=5)
a1

array([42, 86, 83, 92, 81])

2. Generate a 2-D array with 3 rows, where each row contains 5 random integers from 0 up to, but not including, 100:

In [101]:
a2 = random.randint(100, size=(3,5))
a2

array([[98, 87, 85, 29, 58],
       [40,  9, 72, 28, 62],
       [ 3, 94, 41, 27, 81]])
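Since `randint` excludes its upper bound, the two solutions above draw values from 0 to 99. If 100 itself should be a possible value, the upper bound can be raised by one. A short sketch, with an arbitrary sample size of 1000:

```python
from numpy import random

exclusive = random.randint(100, size=1000)      # possible values: 0 to 99
inclusive = random.randint(0, 101, size=1000)   # possible values: 0 to 100

print(exclusive.min() >= 0, exclusive.max() <= 99)
print(inclusive.min() >= 0, inclusive.max() <= 100)
```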

3. Generate a 1-D array of 30 evenly spaced elements between 1.5 and 5.5, inclusive.

In [105]:
a3 = np.linspace(1.5, 5.5, 30)
a3

array([1.5       , 1.63793103, 1.77586207, 1.9137931 , 2.05172414,
       2.18965517, 2.32758621, 2.46551724, 2.60344828, 2.74137931,
       2.87931034, 3.01724138, 3.15517241, 3.29310345, 3.43103448,
       3.56896552, 3.70689655, 3.84482759, 3.98275862, 4.12068966,
       4.25862069, 4.39655172, 4.53448276, 4.67241379, 4.81034483,
       4.94827586, 5.0862069 , 5.22413793, 5.36206897, 5.5       ])