# NUMPY - Multidimensional Data Arrays

NumPy is a package that provides high-performance vector, matrix, and higher-dimensional data structures for Python. It brings the computational power of languages like C and Fortran to Python, a language that is much easier to learn and use.

In [1]:
import numpy as np

In [None]:
np.__version__

In [None]:
np.info(np.add)  # np.info prints the documentation itself and returns None

## What is an Array?

![](image/array.png)

In [None]:
# Python List

a = [1, 2, 3, 4]
b = [5, 6, 7, 8]

In [None]:
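print(a+b)  # concatenates the two lists instead of adding element-wise
print(a*b)  # raises TypeError, since a list cannot be multiplied by another list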

In the `numpy` package the terminology used for vectors, matrices and higher-dimensional data sets is *array*. 



In [None]:
a_array = np.array(a)
b_array = np.array(b)

print(a_array + b_array)
print(a_array * b_array)

## Creating `numpy` arrays

There are several ways to initialize new numpy arrays:
* from a Python list or tuple
* using functions that are dedicated to generating numpy arrays, such as `arange`, `linspace`, etc.
* reading data from files

### Lists

In [None]:
# vector: the argument to the array function is a list
v = np.array([1, 2, 3, 4, 5])

v

In [None]:
# matrix: the argument to the array function is a nested list
m = np.array([[1, 2, 3], [4, 5, 6]])

m

The `v` and `m` objects are both of the type `ndarray` that the `numpy` module provides.

In [None]:
type(v), type(m)

In [None]:
a = [1,2,3]

In [None]:
type(a)

The difference between the `v` and `m` arrays is only their shapes. We can get information about the shape of an array by using the `ndarray.shape` property.

In [None]:
v.shape

In [None]:
m.shape

The number of elements in the array is available through the `ndarray.size` property

In [None]:
m.size

`numpy.ndarray` looks very similar to the `list`. So, why not use the list instead?
`numpy.ndarray` is used for several reasons:
1. Lists are very general. They can contain any kind of object, and they do not support mathematical operations such as element-wise arithmetic or matrix and dot products.
2. Numpy arrays are statically typed and homogeneous. The type of the elements is determined when the array is created.
3. Numpy arrays are memory efficient.
4. Mathematical functions on numpy arrays are implemented efficiently and run fast.
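
The points in the list above can be illustrated with a small sketch. The variable names (`py_list`, `np_arr`, `mixed`) are made up for this illustration and are not used elsewhere in the notebook.

```python
import numpy as np

py_list = [1, 2, 3, 4]
np_arr = np.array(py_list)

# A list needs an explicit loop (here a comprehension) for element-wise math
doubled_list = [value * 2 for value in py_list]

# An array applies the operation to every element at once
doubled_arr = np_arr * 2

# Arrays are homogeneous: mixing integers with a float upcasts every element to float
mixed = np.array([1, 2, 3.5])

print(doubled_list)     # [2, 4, 6, 8]
print(doubled_arr)      # [2 4 6 8]
print(mixed.dtype)      # float64
print(np_arr.itemsize)  # bytes per element (depends on the platform's default integer)
print(np_arr.nbytes)    # total bytes used by the array data
```

Because every element has the same fixed-size type, the array data occupies exactly `size * itemsize` bytes, which is where the memory efficiency comes from.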

We can see the data type of an array's elements using the `dtype` property:

In [None]:
m.dtype

If we want, we can explicitly define the type of the array data when we create it, using the `dtype` keyword argument: 

In [None]:
m = np.array([[1, 2, 3], [4, 5, 6]], dtype=float)

m

Common data types that can be used with `dtype` are: `int`, `float`, `complex`, `bool`, `object`, etc.
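
As a rough sketch of how these data types behave, the example below creates arrays with a few of them and converts between types with `astype`. The variable names are illustrative only.

```python
import numpy as np

ints = np.array([1, 2, 3])

# astype returns a new array with the requested type
as_float = ints.astype(float)

# With dtype=bool, zero becomes False and any nonzero value becomes True
as_bool = np.array([0, 1, 2], dtype=bool)

# Complex arrays store a real and an imaginary part for each element
as_complex = np.array([1, 2], dtype=complex)

# Converting floats to integers truncates toward zero (it does not round)
truncated = np.array([1.7, -1.7]).astype(int)

print(as_float)    # [1. 2. 3.]
print(as_bool)     # [False  True  True]
print(as_complex)  # [1.+0.j 2.+0.j]
print(truncated)   # [ 1 -1]
```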

### Create Matrix Zeros

In [None]:
# One dimension

zeros_matrix = np.zeros(5)

zeros_matrix

In [None]:
# two dimensions

zeros_matrix2 = np.zeros(5,2)  # raises TypeError: the second positional argument is read as the dtype, not a dimension

In [None]:
# the shape should be given as a tuple

zeros_matrix2 = np.zeros((5,2)) # 5 rows, 2 columns
zeros_matrix2 

### Matrix ones

In [None]:
#one dimension

matrix_ones = np.ones(5)
matrix_ones 

In [None]:
# three dimensions

matrix_ones2 = np.ones((3, 4, 2))  # 3 rows, 4 columns, 2 depth
matrix_ones2

## > Exercise 1

1. Create a matrix from a list which has 4 rows and 3 columns

In [None]:
my_matrix = np.array([[1,2,3],
                     [2,3,4],
                     [1,2,1],
                     [1,2,2]])
my_matrix

2. Create the following matrix
![](image/lat11.png)

In [None]:
my_matrix2 = np.array([[2,7,12,0],
                      [3,9,3,4],
                      [4,0,1,3]])
my_matrix2

3. Create a 2D matrix with a size of 10 (that is, 10 elements in total)

In [None]:
cara1 = np.array([[1,2,3,4,5],
                 [6,7,8,9,10]])
cara1

In [None]:
cara2 = np.zeros((2,5))
cara2

In [None]:
cara3 = np.array([[1,2],
                 [2,2],
                 [2,1],
                 [3,3],
                 [4,5]])
cara3

In [None]:
cara4 = np.array([[1],
                 [2],
                 [2],
                 [3],
                 [4],
                 [9],
                 [10],
                 [10],
                 [7],
                 [1]])
cara4

In [None]:
print("Size cara1: ", cara1.size)
print("Size cara2: ", cara2.size)
print("Size cara3: ", cara3.size)
print("Size cara4: ", cara4.size)

In [None]:
print("Shape cara1: ", cara1.shape)
print("Shape cara2: ", cara2.shape)
print("Shape cara3: ", cara3.shape)
print("Shape cara4: ", cara4.shape)

4. Create a 3D matrix of ones which has 2 rows, 3 columns, and 3 depth

In [7]:
m4 = np.ones((2, 3, 3))

m4

array([[[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]],

       [[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]]])

In [8]:
m4.shape

(2, 3, 3)

5. Make the following array, starting from an array of zeros and using a for loop
![](image/exercise1.png)

In [9]:
my_array = np.zeros((5,3))
my_array

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [10]:
for i in my_array:
    print(i+2)

[2. 2. 2.]
[2. 2. 2.]
[2. 2. 2.]
[2. 2. 2.]
[2. 2. 2.]


### Using array-generating functions

For larger arrays it is impractical to initialize the data manually using explicit Python lists. Instead we can use one of the many functions in `numpy` that generate arrays of different forms. Some of the more common are:

#### arange

In [11]:
# create a range

x = np.arange(10)

x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [12]:
for i in range(10):
    print(i)

0
1
2
3
4
5
6
7
8
9


In [13]:
# create a range

x = np.arange(10, 20) # arguments: start, stop

x

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

In [17]:
# create a range

x = np.arange(10, 20, 2) # arguments: start, stop, step

x

array([10, 12, 14, 16, 18])

In [18]:
x = np.arange(-1, 1, 0.1)

x

array([-1.00000000e+00, -9.00000000e-01, -8.00000000e-01, -7.00000000e-01,
       -6.00000000e-01, -5.00000000e-01, -4.00000000e-01, -3.00000000e-01,
       -2.00000000e-01, -1.00000000e-01, -2.22044605e-16,  1.00000000e-01,
        2.00000000e-01,  3.00000000e-01,  4.00000000e-01,  5.00000000e-01,
        6.00000000e-01,  7.00000000e-01,  8.00000000e-01,  9.00000000e-01])

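The number 9.00000000e-01 is a floating point number written in scientific notation. It is equivalent to 9 * 10**-1, or 0.9.
The value -2.22044605e-16 appears where 0 would be expected because 0.1 cannot be represented exactly in binary floating point, so the computed values carry a small rounding error.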
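
One possible way to avoid the tiny nonzero value shown above is to build the range from integers and scale it afterwards. This is only a sketch; `x_float` and `x_scaled` are names chosen for this illustration.

```python
import numpy as np

# Floating point step, as in the cell above
x_float = np.arange(-1, 1, 0.1)

# Integer range scaled afterwards: each value is computed by a single division
x_scaled = np.arange(-10, 10) / 10

print(x_float[10])   # a tiny nonzero number instead of 0
print(x_scaled[10])  # 0.0

# The two arrays still agree within floating point tolerance
print(np.allclose(x_float, x_scaled))
```

When comparing floating point arrays, `np.allclose` or `np.isclose` is usually a safer choice than `==`.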

#### linspace

In [34]:
# using linspace, both end points ARE included
np.linspace(0, 10, 10)  # unlike arange, which takes a step size, linspace takes the number of samples

array([ 0.        ,  1.11111111,  2.22222222,  3.33333333,  4.44444444,
        5.55555556,  6.66666667,  7.77777778,  8.88888889, 10.        ])
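
The spacing produced by `linspace` depends on whether the end point is counted as a sample. The sketch below uses the `retstep` and `endpoint` keyword arguments of `np.linspace` to make this visible; the variable names are illustrative.

```python
import numpy as np

# 11 samples including both end points gives a step of exactly 1
with_end, step = np.linspace(0, 10, 11, retstep=True)

# 10 samples with the end point excluded also gives a step of 1
without_end = np.linspace(0, 10, 10, endpoint=False)

print(step)         # 1.0
print(with_end)     # 0, 1, ..., 10
print(without_end)  # 0, 1, ..., 9
```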

#### random data

In [22]:
from numpy import random

In [24]:
# uniform random numbers in [0, 1)
random.rand(5,5)

array([[0.68886226, 0.80376421, 0.68511052, 0.05933548, 0.00609701],
       [0.71453569, 0.98785297, 0.905757  , 0.56097693, 0.26333855],
       [0.13570713, 0.98642657, 0.48538833, 0.20870015, 0.59357451],
       [0.51402483, 0.30974573, 0.23055402, 0.52798755, 0.17149748],
       [0.24462066, 0.14881871, 0.72438958, 0.42477621, 0.66295972]])

In [25]:
# standard normal distributed random numbers
x = random.randn(5,5)
x

array([[ 1.22268108, -0.11597117,  1.04261376, -1.85356452,  0.27452336],
       [-0.42800941, -0.01834765,  1.15716121,  0.36945645,  1.70644982],
       [ 0.344245  ,  0.4125159 , -0.22873644, -2.04407581, -0.46607386],
       [-1.59228872,  0.26495182,  0.55572725, -0.00871542, -0.04690909],
       [ 0.2984943 , -1.8564972 ,  0.7993992 , -0.90392802, -0.32273761]])

In [26]:
x.dtype

dtype('float64')

In [27]:
x = np.ones(2, dtype = np.int64)
x

array([1, 1], dtype=int64)

In [28]:
random.randint(10)

4

In [32]:
random.randint(2, 10, size=4)

array([5, 3, 9, 7])

In [36]:
random.randint(2, 10, size=(4,2,2))

array([[[3, 4],
        [4, 5]],

       [[9, 7],
        [7, 4]],

       [[8, 9],
        [3, 2]],

       [[2, 6],
        [2, 5]]])

## > Exercise 2

1. Generate a 1-D array containing 5 random integers from 0 to 100:

In [37]:
random.randint(0, 100, size=5)  # the upper bound is exclusive, so values range from 0 to 99

array([83, 75, 78,  6, 31])

2. Generate a 2-D array with 3 rows, each row contains 5 random integers from 0 to 100

In [None]:
random.randint(0, 100, size=(3, 5))  # 3 rows, 5 columns; the upper bound is exclusive

3. Generate a 1-D array of 30 evenly spaced elements between 1.5 and 5.5, inclusive.

## Adding, removing, and sorting elements

In [38]:
# Append

my_arr = np.arange(10)
my_arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [40]:
np.append(my_arr, (10, 11, 12))

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

When axis is specified, values must have the correct shape.

In [42]:
my_arr1 = np.array([[1,2,3],
                    [2,4,6]])

In [43]:
np.append(my_arr1, [[2,1,1]], axis=0)

array([[1, 2, 3],
       [2, 4, 6],
       [2, 1, 1]])
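
To see what "the correct shape" means in practice, the following sketch passes a 1-D list of values to `np.append` with `axis=0` and catches the resulting error. The names `error_message` and `flattened` are introduced here for illustration.

```python
import numpy as np

my_arr1 = np.array([[1, 2, 3],
                    [2, 4, 6]])

try:
    # The values are 1-D but the array is 2-D, so appending along axis 0 fails
    np.append(my_arr1, [2, 1, 1], axis=0)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Without an axis, both inputs are flattened first, so any shape is accepted
flattened = np.append(my_arr1, [2, 1, 1])
print(flattened)  # [1 2 3 2 4 6 2 1 1]
```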

In [45]:
a = np.array([[1, 2, 8], [5, 8, 10]])
b = np.array([[8, 2, 1], [3, 9, 1]])
c = np.concatenate((a, b), axis = 0)
print(c)

[[ 1  2  8]
 [ 5  8 10]
 [ 8  2  1]
 [ 3  9  1]]


In [47]:
x = np.array([[1,2,3], [1,1,1]])
y = np.array([[3,3,1], [1,2,3]])
x = np.concatenate((x, y), axis = 1)
print(x)

[[1 2 3 3 3 1]
 [1 1 1 1 2 3]]


In [48]:
# sorting np.sort()
arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])

In [49]:
np.sort(arr)

array([1, 2, 3, 4, 5, 6, 7, 8])

In [50]:
#Concatenate arrays

a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

In [51]:
np.concatenate((a, b))

array([1, 2, 3, 4, 5, 6, 7, 8])

In [54]:
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6]])

np.concatenate((x,y), axis=0) # if axis=None, the arrays are flattened before being joined

array([[1, 2],
       [3, 4],
       [5, 6]])

### Deleting elements of an array

In [55]:
arr = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
arr

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [56]:
np.delete(arr, 1, axis=0)

array([[ 1,  2,  3,  4],
       [ 9, 10, 11, 12]])

## Shape and Size of an array

`ndarray.ndim` : the number of axes \
`ndarray.size` : the total number of elements \
`ndarray.shape` : the number of elements stored along each dimension of the array. If, for example, you have a 2-D array with 2 rows and 3 columns, the shape of your array is (2, 3).

In [57]:
my_array = np.array([[[0, 1, 2, 3],
                      [4, 5, 6, 7]],
                          
                      [[0, 1, 2, 3],
                      [4, 5, 6, 7]],
                         
                      [[0 ,1 ,2, 3],
                      [4, 5, 6, 7]]])

In [58]:
my_array.ndim

3

In [59]:
my_array.size

24

In [60]:
my_array.shape

(3, 2, 4)

### Reshaping an array

When you use the reshape method, the array you want to produce needs to have the same number of elements as the original array. If you start with an array with 12 elements, you’ll need to make sure that your new array also has a total of 12 elements.

In [61]:
a = np.arange(6)
print(a)

[0 1 2 3 4 5]


In [62]:
a.reshape(2, 3)

array([[0, 1, 2],
       [3, 4, 5]])
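
The element-count rule can be checked with a short sketch. It also shows that one dimension may be given as `-1`, in which case NumPy infers it from the total size. The names `inferred` and `reshape_error` are illustrative.

```python
import numpy as np

a = np.arange(6)

# -1 asks NumPy to work out that dimension: 6 elements / 3 rows = 2 columns
inferred = a.reshape(3, -1)
print(inferred.shape)  # (3, 2)

try:
    # 4 * 2 = 8 elements are requested, but the array only has 6
    a.reshape(4, 2)
    reshape_error = None
except ValueError as exc:
    reshape_error = str(exc)

print(reshape_error)
```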

## > Exercise 3

1. Create a 3x3 matrix with values ranging from 2 to 10

In [67]:
x =np.arange(2,11).reshape(3,3)
x

array([[ 2,  3,  4],
       [ 5,  6,  7],
       [ 8,  9, 10]])

2. Concatenate the following arrays \
**[[0, 1, 3], [5, 7, 9]], [[0, 2, 4], [6, 8, 10]]**
![](image/lat3.png)

### Converting a 1D array into 2D array (add a new axis to an array)

You can use `np.newaxis` and `np.expand_dims` to increase the dimensions of your existing array.

`np.newaxis` will increase the dimension of an array by one when it is used once. \
1D -> 2D, 2D -> 3D, and so on

In [68]:
a = np.array([1, 2, 3, 4, 5, 6])
a.shape

(6,)

In [69]:
# convert a 1D array to a row vector by inserting an axis along the first dimension
a2 = a[np.newaxis, :]
a2

array([[1, 2, 3, 4, 5, 6]])

In [72]:
a2.shape

(1, 6)

In [70]:
# for a column vector, you can insert an axis along the second dimension
a3 = a[:, np.newaxis]
a3

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6]])

In [71]:
a3.shape

(6, 1)

Using `np.expand_dims`:

In [73]:
a = np.array([1, 2, 3, 4, 5, 6])
a.shape

(6,)

In [76]:
# You can use np.expand_dims to add an axis at index position 1
b = np.expand_dims(a, axis=1)
b.shape

(6, 1)

In [77]:
b

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6]])

In [78]:
# You can add an axis at index position 0 with
c = np.expand_dims(a, axis=0)
c.shape

(1, 6)

## Indexing and slicing

In [79]:
data = np.array([1, 2, 3, 4, 5])

In [80]:
data[1]

2

In [81]:
data[:3]

array([1, 2, 3])

In [82]:
data[1:]

array([2, 3, 4, 5])

In [84]:
data[-2:]

array([4, 5])

![](image/numpy.jpg)

You may want to take a section of your array or specific array elements to use in further analysis or additional operations. To do that, you’ll need to subset, slice, and/or index your arrays.

If you want to select values from your array that fulfill certain conditions, it’s straightforward with NumPy.

In [85]:
a = np.array([[1 , 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])

You can easily print all of the values in the array that are less than 5.

In [86]:
print(a[a < 5])

[1 2 3 4]


You can also select, for example, numbers that are equal to or greater than 5, and use that condition to index an array.

In [87]:
five_up = (a >= 5)
print(a[five_up])

[ 5  6  7  8  9 10 11 12]


In [88]:
divisible_by_2 = a[a%2==0]
print(divisible_by_2)

[ 2  4  6  8 10 12]


Or you can select elements that satisfy two conditions using the & and | operators:

In [89]:
c = a[(a > 2) & (a < 11)]
print(c)

[ 3  4  5  6  7  8  9 10]


In [90]:
# pipe, or, vertical bar: |
five_up = (a > 5) | (a == 5)
print(five_up)

[[False False False False]
 [ True  True  True  True]
 [ True  True  True  True]]
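
A boolean mask like the one printed above can also be used to count matches, find their positions, or replace values. The sketch below uses `np.nonzero` and `np.where` for that; the variable names are chosen for this illustration.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])

mask = a < 5

# True counts as 1, so summing the mask counts the matching elements
count = mask.sum()

# np.nonzero returns one index array per axis for the True positions
rows, cols = np.nonzero(mask)

# np.where(condition, x, y) picks x where the condition holds and y elsewhere
capped = np.where(a > 10, 10, a)

print(count)       # 4
print(rows, cols)  # [0 0 0 0] [0 1 2 3]
print(capped[2])   # [ 9 10 10 10]
```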


In [91]:
a

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [92]:
a[1,1:3]

array([6, 7])

In [93]:
z = np.array([[[0, 1, 3],
               [5, 7, 9]],
              
              [[0, 2, 4],
               [6, 8, 10]]])
z

array([[[ 0,  1,  3],
        [ 5,  7,  9]],

       [[ 0,  2,  4],
        [ 6,  8, 10]]])

In [108]:
z.shape

(2, 2, 3)

In [94]:
z[0] # access row 0

array([[0, 1, 3],
       [5, 7, 9]])

In [95]:
z[0,1] # access row 0, column 1

array([5, 7, 9])

In [96]:
z[1, 1, 1:] #access row 1, column 1, depth 1-2

array([ 8, 10])

In [106]:
a = np.array([1,3,1,4,5,2,1])
a

array([1, 3, 1, 4, 5, 2, 1])

In [107]:
a[1:4]

array([3, 1, 4])

In [103]:
a[3] = 10  # assign a new value to the element at index 3
a

array([ 1,  3,  1, 10,  5,  2,  1])

## > Exercise 4

1. Create a null vector (a 1D array of zeros) of size 10 and update the fifth value to 11.

2. Write a NumPy program to create a 2x3 array and change it into a 3x2 array.

3. Write a NumPy program to create a 2D array with 1 on the border and 0 inside.
![](image/lat1.png)

4. Take a look at the following matrix. Access the element at index [2, 1, 1].

In [None]:
z = np.array([[[0, 1, 3],
               [5, 7, 9],
               [6, 8, 10]],
              
              [[0, 2, 4],
               [6, 8, 10],
               [0, 1, 3]],
             
              [[1, 1, 2],
               [5, 2, 9],
               [1, 3, 3]]])


# Creating an array from existing data

You can easily create a new array from a section of an existing array.

In [109]:
a = np.array([1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [110]:
arr1 = a[3:8]
arr1

array([4, 5, 6, 7, 8])
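
One detail worth knowing when slicing: a basic slice is a view on the original data, not an independent copy. The sketch below, with the illustrative names `view` and `independent`, shows the difference.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# A basic slice shares memory with the original array
view = a[3:8]
view[0] = 99
print(a[3])  # 99, the original array changed too

# .copy() creates independent data
independent = a[3:8].copy()
independent[1] = -1
print(a[4])  # 5, the original array is untouched

print(np.shares_memory(a, view))         # True
print(np.shares_memory(a, independent))  # False
```

Calling `.copy()` is a reasonable default whenever the new array will be modified and the original must stay unchanged.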

You can also stack two existing arrays, both vertically and horizontally. Let’s say you have two arrays, a1 and a2:

In [111]:
a1 = np.array([[1, 1],
               [2, 2]])

a2 = np.array([[3, 3],
               [4, 4]])

In [112]:
np.vstack((a1, a2))

array([[1, 1],
       [2, 2],
       [3, 3],
       [4, 4]])

Using `hstack`:

In [113]:
np.hstack((a1, a2))

array([[1, 1, 3, 3],
       [2, 2, 4, 4]])

You can split an array into several smaller arrays using `hsplit`. You can specify either the number of equally shaped arrays to return or the columns after which the division should occur.

In [114]:
x = np.arange(1, 25).reshape(2, 12)
x

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]])

If you wanted to split this array into three equally shaped arrays, you would run:

In [115]:
y = np.hsplit(x, 3)
y

[array([[ 1,  2,  3,  4],
        [13, 14, 15, 16]]), array([[ 5,  6,  7,  8],
        [17, 18, 19, 20]]), array([[ 9, 10, 11, 12],
        [21, 22, 23, 24]])]

If you want to split your array after the third and fifth columns, you’d run:

In [116]:
x

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]])

In [118]:
z = np.hsplit(x, (3, 5))

z

[array([[ 1,  2,  3],
        [13, 14, 15]]), array([[ 4,  5],
        [16, 17]]), array([[ 6,  7,  8,  9, 10, 11, 12],
        [18, 19, 20, 21, 22, 23, 24]])]

In [119]:
m = np.hsplit(x, (1,3))

m

[array([[ 1],
        [13]]), array([[ 2,  3],
        [14, 15]]), array([[ 4,  5,  6,  7,  8,  9, 10, 11, 12],
        [16, 17, 18, 19, 20, 21, 22, 23, 24]])]

# Basic array operations

In [120]:
data = np.array([1, 2])
ones = np.ones(2, dtype=int)



In [122]:
data

array([1, 2])

In [123]:
ones

array([1, 1])

In [124]:
data + ones

array([2, 3])

![](image/np_data_plus_ones.png)

In [125]:
data - ones

array([0, 1])

In [126]:
data * data

array([1, 4])

In [127]:
data / data

array([1., 1.])

In [128]:
a = np.array([1, 2, 3, 4])

a.sum()

10

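To sum over rows or columns, pass the `axis` argument: `axis=0` sums down each column and `axis=1` sums across each row.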

In [129]:
b = np.array([[1, 1],
              [2, 2]])
b.sum(axis=0)

array([3, 3])

In [130]:
b.sum(axis=1)

array([2, 4])

# Broadcasting

There are times when you might want to carry out an operation between an array and a single number (also called an operation between a vector and a scalar) or between arrays of two different sizes. For example, your array (we’ll call it “data”) might contain information about distance in miles but you want to convert the information to kilometers. You can perform this operation with:

In [None]:
data = np.array([1.0, 2.0])
data * 1.6

![](image/np_multiply_broadcasting.png)

NumPy understands that the multiplication should happen with each cell. That concept is called **broadcasting**. Broadcasting is a mechanism that allows NumPy to perform operations on arrays of different shapes, provided the shapes are compatible.
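
Broadcasting also applies between arrays, not only between an array and a scalar. The following sketch adds a row and a column to a 2x3 matrix and shows one combination of shapes that cannot be broadcast. All names here are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])     # shape (3,)
col = np.array([[100],
                [200]])          # shape (2, 1)

# The row is stretched over both rows of the matrix
plus_row = matrix + row
print(plus_row)  # [[11 22 33] [14 25 36]]

# The column is stretched over all three columns of the matrix
plus_col = matrix + col
print(plus_col)  # [[101 102 103] [204 205 206]]

try:
    # Shapes (2, 3) and (2,) do not line up on the last axis
    matrix + np.array([1, 2])
    broadcast_error = None
except ValueError as exc:
    broadcast_error = str(exc)

print(broadcast_error)
```

Shapes are compared from the last axis backwards, and two dimensions are compatible when they are equal or when one of them is 1.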

# Working with Mathematical Formulas

The ease of implementing mathematical formulas that work on arrays is one of the things that make NumPy so widely used in the scientific Python community.

![](image/np_MSE_formula.png)

In [None]:
# assumes y_pred and labels are arrays of the same shape and n is the number of elements
mse = (1/n) * np.sum(np.square(y_pred - labels))

![](image/np_mse_viz1.png)

![](image/np_mse_viz2.png)
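
A minimal runnable version of the mean squared error formula is sketched below. The sample values in `y_pred` and `labels` are made up for this illustration.

```python
import numpy as np

# Hypothetical predictions and true labels
y_pred = np.array([1.0, 1.0, 1.0])
labels = np.array([1.0, 2.0, 3.0])
n = labels.size

# Differences are 0, -1, -2; their squares are 0, 1, 4; the sum is 5
mse = (1 / n) * np.sum(np.square(y_pred - labels))
print(mse)  # 5/3, about 1.667

# The same quantity written with np.mean
mse_mean = np.mean((y_pred - labels) ** 2)
print(np.isclose(mse, mse_mean))  # True
```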

# How to save and load NumPy objects?

The ndarray objects can be saved to and loaded from the disk files with `loadtxt` and `savetxt` functions that handle normal text files, `load` and `save` functions that handle NumPy binary files with a .npy file extension, and a `savez` function that handles NumPy files with a .npz file extension.

If you want to store a single ndarray object, store it as a .npy file using np.save. If you want to store more than one ndarray object in a single file, save it as a .npz file using `np.savez`. You can also save several arrays into a single file in compressed npz format with `savez_compressed`.
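
As a sketch of the multi-array case, the example below stores two arrays in one .npz file with `np.savez` and reads them back by name. The file name `arrays.npz` and the keys `first` and `second` are arbitrary choices for this illustration, and a temporary directory is used so that nothing is left on disk.

```python
import os
import tempfile

import numpy as np

first = np.arange(3)
second = np.linspace(0, 1, 5)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "arrays.npz")

    # Keyword arguments become the names of the arrays inside the file
    np.savez(path, first=first, second=second)

    # np.load returns a dictionary-like object for .npz files
    with np.load(path) as archive:
        names = sorted(archive.files)
        loaded_first = archive["first"]
        loaded_second = archive["second"]

print(names)          # ['first', 'second']
print(loaded_first)   # [0 1 2]
print(loaded_second)  # [0.   0.25 0.5  0.75 1.  ]
```

Replacing `np.savez` with `np.savez_compressed` takes the same arguments and produces a compressed file.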

In [132]:
a = np.array([1, 2, 3, 4, 5, 6])

In [133]:
np.save('myfile', a)

In [134]:
# load file

b = np.load('myfile.npy')

In [135]:
b

array([1, 2, 3, 4, 5, 6])

You can save a NumPy array as a plain text file like a .csv or .txt file with `np.savetxt`.

In [136]:
csv_arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

In [137]:
np.savetxt('myfile2.csv', csv_arr)

In [138]:
np.loadtxt('myfile2.csv')

array([1., 2., 3., 4., 5., 6., 7., 8.])
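
Note that the values come back as floats, because `np.savetxt` writes floating point text by default. The sketch below shows one way to write a 2-D integer table as comma-separated text using the `delimiter`, `fmt`, and `header` arguments. The file name `table.csv` and the column names are made up for this illustration.

```python
import os
import tempfile

import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "table.csv")

    # fmt="%d" writes integers; the header line is prefixed with "# " by default
    np.savetxt(path, table, delimiter=",", fmt="%d", header="a,b,c")

    with open(path) as f:
        text = f.read()

    # loadtxt skips lines starting with "#" and needs the same delimiter
    restored = np.loadtxt(path, delimiter=",", dtype=int)

print(text)
print(restored)
```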

# ASSIGNMENT

1. Create a 4x4 matrix with values ranging from 0 to 3.\
(The following image is only an example. It does not show the actual array asked for in the question.)
![](image/assignment2.png)

2. Create an array with a shape of 4x4 and split it into two arrays along the second axis.\
![](image/assignment.png)