# NumPy

## Introduction

- NumPy is a general-purpose array-processing package.
- Its predecessor, __Numeric__, was first released in 1995. Travis Oliphant created NumPy in 2005 by merging Numeric with Numarray, and NumPy 1.0 followed in 2006. It is written in Python and C.
- NumPy is cross-platform and BSD-licensed. It is often used with packages such as Matplotlib (a plotting library) and SciPy (Scientific Python), and is sometimes seen as an alternative to MATLAB. The name ‘NumPy’ is a portmanteau of ‘NUMerical’ and ‘PYthon’.
- It is a library that lets end users create **high-performance multidimensional array objects** and manipulate them.
- NumPy provides a wide range of high-level functions for mathematical and logical operations, Fourier transforms, array shape manipulation, linear algebra, random number generation, etc.

### Examples of Multi-dimensional Arrays

* Values of an experiment/simulation at discrete time steps.
* Signal recorded by a measurement device, e.g. sound wave.
* Pixels of an image, grey-level or colour.
* 3D data measured at different X-Y-Z positions, e.g. MRI scan.
* ...

The most important object defined in NumPy is an N-dimensional array type called `ndarray`. It describes a collection of items of the same type. Items in the collection can be accessed using a zero-based index.

Every item in an `ndarray` occupies a block of memory of the same size. The layout of each element is described by a data-type object (called `dtype`).

Any single item extracted from an `ndarray` (by indexing) is represented by a Python object of one of the array scalar types.
<p><b>The <code>numpy.array</code> constructor takes the following parameters:</b></p>

<table>
    <tr>
        <th>Sr.No.</th>
        <th>Parameter & Description</th>
    </tr>
    <tr>
        <td>1. object</td>
        <td>An array, any object exposing the array interface, an object whose <code>__array__</code> method returns an array, or any (nested) sequence.</td>
    </tr>
    <tr>
        <td>2. dtype</td>
        <td>Desired data type of array, optional</td>
    </tr>
    <tr>
        <td>3. copy</td>
        <td>Optional. If True (the default), the object is copied</td>
    </tr>
    <tr>
        <td>4. order</td>
        <td>C (row major), F (column major), A (any) or K (keep the input layout, the default)</td>
    </tr>
    <tr>
        <td>5. subok</td>
        <td>By default, the returned array is forced to be a base-class array. If True, sub-classes are passed through</td>
    </tr>
    <tr>
        <td>6. ndmin</td>
        <td>Specifies minimum dimensions of resultant array</td>
    </tr>
</table>
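The parameters in the table can be tried out directly. The following is a minimal sketch with made-up input values; the variable names are illustrative only.

```python
import numpy as np

data = [[1, 2, 3], [4, 5, 6]]          # a nested sequence is a valid `object`

a = np.array(data, dtype=np.float64)   # dtype: force 64-bit floats
b = np.array(a, copy=True)             # copy: b owns its own data
b[0, 0] = 99.0                         # does not touch a
f = np.array(data, order='F')          # order: column-major memory layout
v = np.array([1, 2, 3], ndmin=2)       # ndmin: pad the shape to at least 2-D

print(a.dtype, a.itemsize)             # float64 8
print(a[0, 0], b[0, 0])                # 1.0 99.0
print(f.flags['F_CONTIGUOUS'])         # True
print(v.shape)                         # (1, 3)
```

Note that `ndmin` prepends axes of length 1, which is why the 1-D input ends up with shape `(1, 3)`.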

In [None]:
help('modules')

## Numpy Installation

Once you have installed `pip`, the default Python package manager, installing NumPy is straightforward.

`pip install numpy`

## Numpy Import

Conventionally, numpy is imported as `np` as shown below

In [None]:
import numpy as np
print(np.__version__)    # Check the numpy version

## Basic Usage

In [None]:
# Create a 3 x 3 array, with all elements initialized to 1. Default data type for array is float.
b = np.ones((3, 3))
print(b)
print(type(b))    # print datatype of b
print(b.dtype)    # print datatype of elements in b

In [None]:
# Create a 3 x 3 array of ones with a string dtype: each element is the string '1'.
b = np.ones((3, 3), dtype=str)
print(b)
print(type(b))    # print datatype of b
print(b.dtype)    # print datatype of elements in b

In [None]:
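# Create an integer array, then raise each element to a different power
c = np.ones((5, 5), dtype=int)
print(c)
print(c.size)      # total number of elements: 25
c = c * 2          # every element becomes 2
k = 1
for i in range(5):
    for j in range(5):
        k = k + 1
        c[i, j] = pow(c[i, j], k)    # 2**2, 2**3, ..., 2**26
print(c)
print(type(c))
print(c.dtype)     # still an integer dtype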

In [None]:
d = np.array([11 + 12j, 13 + 14j, 15 + 16j])    # numpy array with complex data type
print(d)
d.dtype

In [None]:
e = np.array([False, True, True, False, False, True])    # numpy array with boolean data type
e.dtype

In [None]:
f = np.array(['Welcome', 'To', 'Numpy'])
f.dtype    # Unicode string of 7 characters

In [None]:
len(f)    # get the length of the array

In [None]:
# Shortcut to create an array of numbers
ar = np.arange(10)
print(ar)

In [None]:
# Indexing on 1-D array - This is similar to Python lists
ar[1], ar[3], ar[-1]

In [None]:
# Arrays are assigned by reference

a = np.array([10, 11, 12, 13]) # create array
print(a)       # print a's value
print(type(a)) # type of variable a 
print(id(a))
#del a
b = a  # reference assignment
#del a
print(id(b))
b[0] = 20
print(a[0])

## How to Get Help?

We encourage you to refer to the built-in help as often as possible. It explains each function's usage and internals in depth, which helps you build intuition for what happens behind the scenes. You will also often need to tune function arguments, especially with large datasets, to achieve good performance.

In [None]:
help(np.array) # About numpy array

In [None]:
help('array')  # BTW this is the standard-library array module and not the numpy array

In [None]:
np.lookfor('array create') # If the exact name is not known, search by topic (np.lookfor was removed in NumPy 2.0; search the online documentation there)

In [None]:
help(np.zeros) # About zeros

## NumPy Multi-dimensional Arrays (ndarray)
This is one of the most important features of NumPy. An `ndarray` is an n-dimensional array: a grid of values that all share the same data type. Each element is addressed by a tuple of nonnegative integers, one per dimension.

In [None]:
# a=np.ones((4,4), dtype=int)
# print(a.shape)

x=np.array([[[1,2],[3,4]],
            [[2,3],[5,5]],
            [[6,6],[7,7]]])
print(x.shape)
print(x)
print(x.ndim)

In [None]:
# Create a 2-D matrix, with 2 rows and 3 columns
b = np.array([[1, 2, 3], [4, 5, 6]])
print("A 2-D matrix")
print(b)
print("Dimensions:", b.ndim)  # Number of dimensions, 2 (i.e. rows and columns)
print("Shape:", b.shape) # Number of elements in each dimension, i.e. number of rows and number of columns
print("Rows:", len(b))  # Number of rows
print("Cols:", len(b[0])) # Number of columns
print("==========")
# Create a 3-D array, with dimensions 2 x 2 x 2. Consider this as a stack of 2 matrices of dimensions 2 x 2
c = np.array([
    [[1, 2], [3, 4]],    # Matrix of dimension 2 x 2
    [[1, 1], [2, 2]]     # Matrix of dimension 2 x 2
    ])
print("A 3-D matrix")
print(c)
print("Dimensions:", c.ndim)
print("Shape:", c.shape)


## Indexing and slicing

An index is a tuple of integers, one per dimension.
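As a small sketch of what a tuple index means in practice, the three spellings below select the same element. The array `grid` is a made-up example.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)   # rows 0..2, columns 0..3

print(grid[1, 2])        # 6: row 1, column 2
print(grid[(1, 2)])      # 6: the same index written as an explicit tuple
print(grid[1][2])        # 6: works too, but builds the temporary row grid[1] first
print(grid[1])           # [4 5 6 7]: a partial index returns a sub-array
```

The `grid[1, 2]` form is usually preferred because it indexes in a single step.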

In [None]:
b = np.diag(np.arange(5))
print(b)

In [None]:
a = np.diag(np.arange(3))   # create a 3 by 3 array with diagonal elements set to 0, 1, 2
print(a)
a[2, 2]    # Indexing starts at 0, so this selects the element in the third row and third column, using the tuple (2, 2)

In [None]:
a[2, 1] = 10 # Update array element
a

In [None]:
print(a[::-1, 0:2])  # We can use slicing concepts for all the dimensions!

In [None]:
ar = np.arange(10)
print(ar)
print(ar[:4])    # Slice, starting from zero and ending at three
print(ar[::-1])

In [None]:
ar[::2]    # Slice every second element, starting from 0

## Transpose

In [None]:
import numpy as np
help(np.transpose)

In [None]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
print("Array shape:", arr.shape)
print(arr)
arr_transp = np.transpose(arr)
print("Transpose array shape:", arr_transp.shape)
print(arr_transp)
# + np.transpose([np.arange(0, 51, 10)])

## Updating Array Values

In [None]:
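arr[1][1] = 2
print(arr)    # the value 5 is replaced by 2

# np.transpose returns a view, so arr_transp[1, 1] already reads 2 at this point
arr_transp[1, 1] = 2  # writing through the view also updates arr
print(arr_transp)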

## Total Number of Elements in the Array

In [None]:
import numpy as np 
help(np.size)

In [None]:
arr.size 

## Array of Random Numbers

In [None]:
import numpy as np
help(np.random.random)

In [None]:
nr = np.random.random((3, 3)) # 2 dimensional array of 3 rows and 3 columns with random numbers 
# Elements in array are randomly generated using the random function.
print(nr)      

In [None]:
# TRY THIS OUT!
# a = np.random.random((1000, 1000))  # It will create an array of size 1k by 1k with random values inside it 
# print(a)

## Array Functions

### `np.ones()`

Returns a new array of given shape and data type, where all the elements are set to 1.

`ones(shape, dtype=None, order='C')`

The `shape` is an int or a tuple of ints that defines the size of the array. An int gives a one-dimensional array; a tuple of ints gives an array of that shape.<br>
The `dtype` is an optional parameter that defaults to float. It specifies the data type of the array, for example, int. <br>
The `order` defines whether to store a multi-dimensional array in row-major (C-style) or column-major (Fortran-style) order in memory.
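The three parameters described above can be sketched as follows; the shapes are arbitrary examples.

```python
import numpy as np

v = np.ones(4)                        # int shape -> 1-D array of 4 floats
m = np.ones((2, 3), dtype=int)        # tuple shape -> 2 x 3 array of integers
f = np.ones((2, 3), order='F')        # same values, column-major memory layout

print(v.shape, v.dtype)               # (4,) float64
print(m.shape)                        # (2, 3)
print(m.flags['C_CONTIGUOUS'], f.flags['F_CONTIGUOUS'])   # True True
```

The `order` argument changes only how the values are laid out in memory, not what indexing returns.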

In [None]:
help(np.ones)

In [None]:
c = np.ones((3, 2))
print(c)

### `np.zeros()`

It is identical to `np.ones()`, except that all the elements are initialized to zero.

In [None]:
import numpy as np
help(np.zeros)

In [None]:
d = np.zeros((2, 3))
print(d)

### `np.eye()`

Returns an array where all the elements are equal to zero, except diagonal elements that are initialized to 1.

In [None]:
import numpy as np
help(np.eye)

In [None]:
e = np.eye((3))  # This creates the popularly known Identity Matrix!
print(e)

In [None]:
# Different dimensions of arrays created with np.eye
print(np.eye(5))
print()
print(np.eye(2,3))
print()
print(np.eye(3,3))
print()
print(np.eye(4, k=1))
print()
print(np.eye(5, k=2))
print()
print(np.eye(2, 3))
print()
print(np.eye(5, dtype=int))

### `full()`
Returns a new array of the given shape and type, filled with `fill_value`.
    
Syntax:

`full(shape, fill_value, dtype = None, order = 'C')`
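`np.full` takes the shape explicitly, whereas its sibling `np.full_like` copies the shape and data type from an existing array. A minimal sketch of the difference, where `template` is a made-up array used only for illustration:

```python
import numpy as np

template = np.zeros((2, 3), dtype=np.int32)

a = np.full((2, 2), 7.5)            # shape given explicitly; dtype inferred from 7.5
b = np.full_like(template, 7)       # shape and dtype copied from `template`

print(a.shape, a.dtype)             # (2, 2) float64
print(b.shape, b.dtype)             # (2, 3) int32
```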

In [None]:
help(np.full)

In [None]:
e = np.full((5, 4), 7, dtype=float, order = 'C') # Matrix of constant numbers
print(e)

### `np.linspace()`

Returns evenly spaced numbers over the interval from `start` to `stop`. This function is similar to `arange`, but it takes the number of samples instead of a step size.

Syntax:

`numpy.linspace(start, stop, num = 50, endpoint = True, retstep = False, dtype = None)`
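To make the contrast with `arange` concrete, here is a short sketch that covers the same interval both ways. The interval 0 to 1 is chosen only because the values are exact in floating point.

```python
import numpy as np

by_count = np.linspace(0, 1, num=5)                     # 5 samples, stop included
by_step = np.arange(0, 1, 0.25)                         # step 0.25, stop excluded
half_open = np.linspace(0, 1, num=4, endpoint=False)    # 4 samples, stop excluded
samples, step = np.linspace(0, 1, num=5, retstep=True)  # also return the spacing

print(by_count)   # [0.   0.25 0.5  0.75 1.  ]
print(by_step)    # [0.   0.25 0.5  0.75]
print(half_open)  # [0.   0.25 0.5  0.75]
print(step)       # 0.25
```

With `retstep=True` the function returns a tuple `(samples, step)`, which is why the result is unpacked into two names.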

In [None]:
import numpy as np 
help(np.linspace)

In [None]:
f = np.linspace(1, 10, 4) # Generate 4 equally spaced values between 1 and 10
print(f)

In [None]:
# Sample other variations of linspace
print(np.linspace(1, 50, 30, retstep=True)) # retstep=True also returns the spacing between samples
print()
print(np.linspace(1, 100, num=5, retstep=True)) # num sets how many samples to generate (default 50)
print()
a = np.linspace(1, 100, num=5, endpoint=True, retstep=True)
print(a)
print(a[0][0])
np.linspace(1, 100, num=5, retstep=True)
np.linspace(1, 100, num=5, dtype=int)

In [None]:
1 + 1.6896551724137931

### `np.empty()`

Return a new array of given shape and type, without initializing entries.

In [None]:
import numpy as np
help(np.empty)

In [None]:
# Unlike zeros, empty does not initialize the values, so the contents are arbitrary
g1 = np.empty([2, 2], dtype=int) # Allocates a 2 x 2 array without initializing it
print(g1)

### Minimum Dimensions

Specifies minimum dimensions of the resultant array

In [None]:
h = np.array([1, 2, 3], ndmin=3) # ndmin provides minimum dimension
print(h)
print(h.ndim)
h = np.array([1, 2, 3]) # Without ndmin the array keeps its natural dimension (1-D)
print(h)
print(h.ndim)

### Complex Datatype 

In [None]:
i = np.array([1, 3, 4], dtype=complex) # It creates the array with the complex values.
print(i)
i = np.array([1, 3, 4]) # If datatype is not provided then numpy decides the datatype for us.
print(i)

### reshape()

In [None]:
import numpy as np
help(np.reshape)

In [None]:
j = np.array([[1, 2, 3], [4, 5, 6]])
print(j)
print()
print(j.reshape(3,1,2)) # Reshape into a 3-D array: 3 blocks, each with 1 row and 2 columns

### Creating a Row or Column Vector

In [None]:
vec_row = np.array([1, 2, 3])
vec_col = np.array([[4], [5], [6]])
print("Row vector: ", vec_row)
print("Column vector ", vec_col)

### Create a Matrix with `np.mat()`

In [None]:
help(np.mat)

In [None]:
matrix = np.array([[1,2],[3,4],[5,6,]])
print(matrix)
print(type(matrix))

matrix = np.mat([[1,2],[3,4],[5,6]])    # np.mat was removed in NumPy 2.0; np.asmatrix is the equivalent there
print(matrix)
print(type(matrix))

### Creating a Sparse Matrix

For many problems with large datasets we work with sparse matrices, in which most of the elements are zero. Storing only the nonzero elements and their positions saves memory.

In [None]:
from scipy import sparse
matrix = np.mat([[1,1],[2,2],[3,3]])
print(matrix)

sparse_matrix = sparse.csr_matrix(matrix) # It is a compressed Sparse matrix 
print(sparse_matrix) # The output will have the position of the element and the element too

## Copies and Views

* Slicing creates a *view*, not a copy
* Modifying a view also modifies the original
* You can force a copy with `.copy()`

In [None]:
a = np.arange(10)
b = a[::2]    # This is a view
a, b

In [None]:
b[0] = 14    # Modify the view and the original also gets affected
a, b

In [None]:
a = np.arange(10)
b = a[::2].copy()  # This creates a new copy of the array
b[0] = 14          # This update doesn't impact the original
a, b
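Whether two arrays share the same underlying buffer can be checked rather than guessed. The sketch below uses `np.shares_memory` and the `.base` attribute; the variable names are illustrative.

```python
import numpy as np

a = np.arange(10)
view = a[::2]             # slicing -> view on a's buffer
copy = a[::2].copy()      # explicit copy -> separate buffer
picked = a[[0, 2, 4]]     # indexing with a list of integers -> always a copy

print(np.shares_memory(a, view))    # True
print(np.shares_memory(a, copy))    # False
print(np.shares_memory(a, picked))  # False
print(view.base is a)               # True: the view records where its data lives
```

Indexing with a list or a boolean mask returns a copy, so modifying `picked` leaves `a` unchanged.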

### Conditional and Logical Selection

In [None]:
# Conditional and Logical selection 
ar = [[34, 23, 56, 78], [76, 98, 6, 3], [23, 54, 77, 45]]
ar = np.array(ar)
ar > 60 # Compares each element in the array with the number 60 and returns true / false

In [None]:
# The good part is that we can use the above result and get the elements that are > 60
print(ar[(ar > 60)])    # Get all the values that are greater than 60

print(ar[(ar > 30) & (ar < 60)])    # We can have logical operations to generate index of elements to retrieve from the array

In [None]:
print(ar[ar>20])

In [None]:
# Broadcasting: to change multiple elements at once
arr3 = np.array([[23, 54, 65, 77]])
print(arr3)
arr3[:1,:3] = 40
print(arr3)

## Operations on Matrix Elements

In [None]:
matrix = np.mat([[1, 2, 3], [3, 4, 2], [4, 4, 6]])
add_100 = lambda i: i + 100
vectorized_mat = np.vectorize(add_100) # wrap the function so that it is applied to every element
#print(np.vectorize(add_100))
vectorized_mat(matrix) # Apply the vectorized function to the matrix
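`np.vectorize` is a convenience wrapper that still calls the Python function element by element. For simple arithmetic, the same result can be obtained with a plain array expression. A minimal sketch, using an ordinary `ndarray` named `values` instead of `np.mat`:

```python
import numpy as np

values = np.array([[1, 2, 3], [3, 4, 2], [4, 4, 6]])

add_100 = np.vectorize(lambda i: i + 100)   # calls the lambda for each element
looped = add_100(values)
direct = values + 100                       # one array-level operation

print(np.array_equal(looped, direct))       # True
```

The direct form is usually faster on large arrays, so `np.vectorize` is best kept for functions that cannot be written as array expressions.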


In [None]:
times_ten = lambda a: a * 10    # a lambda is a small anonymous function
print(times_ten(4))

### Finding Maximum and Minimum Values


In [None]:
matrix = np.mat([[1, 2, 3], [3, 5, 4], [9, 8, 7]])
print("Max:", np.max(matrix))    # Returns the maximum element of the matrix
print("Min:", np.min(matrix))    # Returns the minimum element of the matrix

### Applying Operations along a Specific Axis
Using the `axis` parameter we can apply an operation along a specific axis:<br>
`axis=0` works down the rows and gives one result per column <br>
`axis=1` works across the columns and gives one result per row
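A non-square array makes the two directions easy to tell apart, because the result lengths differ. The following sketch uses a made-up 2 x 3 array:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])                  # 2 rows, 3 columns

print(m.sum(axis=0))                       # [5 7 9]: one sum per column
print(m.sum(axis=1))                       # [ 6 15]: one sum per row
print(m.max(axis=0))                       # [4 5 6]: one maximum per column
print(m.sum(axis=1, keepdims=True).shape)  # (2, 1): the reduced axis is kept with length 1
```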

In [None]:
matrix = np.mat([[1, 2, 3], [0, 1, 2], [3, 0, 2]])
print(matrix)
print("Max (axis=0):", np.max(matrix, axis=0))  # Returns the maximum element of each column
print("Max (axis=1):", np.max(matrix, axis=1))  # Returns the maximum element of each row

### Calculating Average Variance and Standard Deviation 

In [None]:
import numpy as np
matrix=np.array([[1,2,3],[6,4,5],[8,9,7]])
print("The mean of the matrix is :" ,np.mean(matrix))# It will give the mean of the matrix
print()
print("The variance of the matrix is :" ,np.var(matrix))# It will give the variance of the matrix
print()
print("The standard deviation of the matrix is :" ,np.std(matrix))# It will give the standard Deviation of the matrix

### Reshaping of the arrays or matrix

In [None]:
import numpy as np
matrix=np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
print(matrix.reshape(6,2))# Change the shape of the matrix
print()
print(matrix.reshape(2,6))# Change the shape of the matrix
print()
print(matrix.reshape(1, -1))# Change it to 2 D row array
print()
print(matrix.reshape(12))# provides a row array of 1 D

### Transposing a vector or matrix

In [None]:
import numpy as np 
matr=np.array([[1,2,3],[3,4,5],[6,8,9]])
print("The original Matrix is :\n", matr)
print()
print("The Transposed Matrix is :\n",matr.T)
print()

In [None]:
import numpy as np
def numpysum(n):
    a = np.arange(n) ** 2
    b = np.arange(n) ** 3
    print(a)
    print(b)
    c = a + b
    return c
print(numpysum(10))

In [None]:
import numpy as np
myarray=np.mat([[1,2,3,4],[4,3,2,1]],dtype=np.int64)
print(myarray)

#### To create a np arrays using lists

In [None]:
import numpy as np
alist=["a","Ram",8] # Define a list with mixed types
arr=np.array(alist) # convert the list to an array; the mixed values are all stored as strings
print(arr)# print the array
print(arr.dtype)# print the data type (a Unicode string type here); see the NumPy documentation for details
print(arr.ndim) # number of dimensions of the array
print(arr.shape)# for a 1-D array the shape is a one-element tuple holding the number of values

#### To create a Matrix

In [None]:
arr_l=[[[1,2,3],[4,5,6],[7,8,'kk']],[[1,1,1],[3,3,3],[2,1,5]]]# Define a nested list of values
matr=np.array(arr_l) # convert it into a 3-D array; the string 'kk' makes every element a string
#print(arr_l)
print(matr)# print matrix
#matr
print(matr.dtype)# print the data type of matrix
#matr.dtype
print(matr.ndim) # print the dimension of matrix
#matr.ndim
print(matr.shape) # print the shape of matrix
#matr.shape

In [None]:
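a_list=[["x:Y"],{"name:Ram"}]   # a list and a set: the elements have no common shape
arr=np.array(a_list, dtype=object)   # dtype=object stores them as generic Python objects; recent NumPy versions raise an error without it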
print(arr)
print(arr.dtype)
print(arr.shape)

In [None]:
import numpy as np 
arr6=np.random.randint(2,8,(2,4,7))
print(arr6)

In [None]:
arr88 = np.random.randint(1,10,(3,3))
print(arr88)

In [None]:
arr89 = np.random.randn(2,6)
print(arr89)
#print(arr89.reshape(2,2))

**numpy.random.randn()** in Python

**About:** `numpy.random.randn(d0, d1, …, dn)` creates an array of the specified shape and fills it with random values drawn from the standard normal distribution.

If positive integer arguments are provided, `randn` generates an array of shape (d0, d1, …, dn), filled with random floats sampled from a univariate “normal” (Gaussian) distribution of mean 0 and variance 1. If no argument is provided, a single float randomly sampled from the distribution is returned.
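Samples with a different mean and standard deviation can be obtained by scaling and shifting standard normal draws. The sketch below shows this with `randn` and, as an alternative, with the newer `Generator` interface from `np.random.default_rng`. The values of `mu` and `sigma` and the seed are arbitrary assumptions made for illustration.

```python
import numpy as np

mu, sigma = 3.0, 2.0                              # hypothetical target mean and standard deviation
legacy = sigma * np.random.randn(2, 4) + mu       # scale and shift standard normal draws

rng = np.random.default_rng(seed=0)               # Generator API; the seed makes runs repeatable
modern = rng.normal(loc=mu, scale=sigma, size=(2, 4))
big = rng.standard_normal(100_000)                # large sample to check the moments

print(legacy.shape, modern.shape)                 # (2, 4) (2, 4)
print(big.mean(), big.std())                      # values close to 0 and 1
```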


In [None]:
import numpy as np 
arr7=np.random.randn(6)
print(arr7)

In [None]:
import numpy as np 
arr8 =np.random.randn(3,2)
print(arr8)

In [None]:
import numpy as np 
arr9=np.random.randn(2,4,4)
print(arr9)
print()
arr10=np.random.randn(2,4,4)*2+3
print(arr10)

In [None]:
a=np.array([1,2,3],float)
b=a
c=a.copy()
a[0]=0
print(a)
print(b)
print(c)

In [None]:
import numpy as np 
b=np.array([1,2,3],float)
print(b)
b.fill(0)
print(b)

In [None]:
a=np.array(range(6),float).reshape((2,3))
print(a)
a.transpose()

### One-dimensional versions of multi-dimensional arrays can be generated with flatten:

In [None]:
b=np.array([[1,2,3],[4,5,6]],float)
print(b)
print(b.flatten())

### Two or more arrays can be concatenated together using the concatenate function with a tuple of the arrays to be joined:

In [None]:
a=np.array([1,2],float)
b=np.array([3,4,5,6],float)
c=np.array([7,8,9],float)
np.concatenate((a,b,c))

In [None]:
a=np.array([[1,2],[3,4]],float)
b=np.array([[5,6],[8,0]],float)
print(np.concatenate((a,b)))
print()
print(np.concatenate((a,b),axis=0))
print()
print(np.concatenate((a,b),axis=1))

### Finally, the dimensionality of an array can be increased using the newaxis constant in bracket notation:

In [None]:
a=np.array([1,2,3],float)
print(a)
print()
print()
print(a[:,np.newaxis])
print()
print()
print(a[:,np.newaxis].shape)
print()
print()
print(a[np.newaxis,:])
print()
print()
print(a[np.newaxis,:].shape)

In [None]:
import numpy as np 
a=np.array((1,2,3))
a

In [None]:
import numpy as np
x=np.linspace(0,75,num=6,retstep=True)
print(x)

In [None]:
6*60

## TASK

#### Which of the following produce the same result?
1. np.linspace(1,10,10,dtype='int32')
2. np.arange(1,11)
3. np.random.randint(1,11)
4. np.array(range(1,11))

In [None]:
print(np.linspace(1,10,10,dtype='int32'))
print()
print(np.arange(1,11))
print()
print(np.random.randint(1,11,10))
print()
print(np.array(range(1,11)))

### Conversion and other function

In [None]:
#Converting a 1-D array to a 2-D array using reshape() 
#Returns an array containing the same data with a new shape.
arr_co1=np.linspace(20,30,6,dtype='int32')
print(arr_co1.reshape(6))    # shape (6,): still 1-D
print(arr_co1.reshape(2,3))  # shape (2, 3): 2 rows and 3 columns
print()
print()
x=np.random.randint(5,10,(2,3))
print(x)
print()
y=x.reshape(3,2)
print(y)

In [None]:
arr_co2=np.random.randint(-20,20,20).reshape(2,10)
#arr_co2=np.arange(1,21).reshape(2,10)
print(arr_co2)
print(arr_co2.max())#axis=None is default gives max value
print(arr_co2.min())# gives min value
print(arr_co2.argmax())# index (in the flattened array) of the maximum value
print(arr_co2.argmin())# index (in the flattened array) of the minimum value

In [None]:
a=np.random.randint(10,100,(2,3))# Creates a 2 by 3 array of random integers from 10 to 99
print(a)
print()
print()
print(a.max(axis=1))# maximum of each row
print()
print()
print(a.min(axis=0))# minimum of each column
print()
print()
print(a.argmax(axis=1)) # column index of the maximum in each row
print()
print()
print(a.argmax(axis=0))# row index of the maximum in each column

### Practice Work

### Write a NumPy program to test whether none of the elements of a given array is zero.

In [9]:
import numpy as np
x=np.array([[1,2,3],[4,0,6],[7,8,9]])
print(x.all())
            

False


### Write a NumPy program to test if any of the elements of a given array is non-zero

In [10]:
import numpy as np
x=np.array([[1,2,3],[4,0,6],[7,8,9]])
print(x.any())

True


### Write a NumPy program to generate five random numbers from the normal distribution.

In [11]:
print(np.random.randn(5))

[ 1.89522413  1.37165801 -1.25301524  0.25114411  0.05156743]


### Write a NumPy program to generate six random integers between 10 and 30.

In [13]:
print(np.random.randint(10,30,6))

[18 24 29 28 27 23]


### Write a NumPy program to get the numpy version and show numpy build configuration.

In [14]:
print(np.__version__)

1.18.1


### Write a NumPy program to  get help on the add function.

In [17]:
np.info(np.add)

add(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

Add arguments element-wise.

Parameters
----------
x1, x2 : array_like
    The arrays to be added. If ``x1.shape != x2.shape``, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
    A location into which the result is stored. If provided, it must have
    a shape that the inputs broadcast to. If not provided or None,
    a freshly-allocated array is returned. A tuple (possible only as a
    keyword argument) must have length equal to the number of outputs.
where : array_like, optional
    This condition is broadcast over the input. At locations where the
    condition is True, the `out` array will be set to the ufunc result.
    Elsewhere, the `out` array will retain its original value.
    Note that if an uninitialized `out` array is created via the default
    ``out=None``,

### Create a NumPy Array

To create a NumPy array, pass a list of element values, written inside square brackets, as an argument to the `np.array()` function.

In [20]:
# A 3-D array is a stack of 2-D arrays.
# It can also be seen as a list of lists in which every element is again a list of elements.

import numpy as np 
ar1=np.array([1,2,3,4,5])
ar2=np.array([[1,2,3],[4,5,6]])
ar3=np.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]])
print(ar1)
print("-"*10)
print(ar2)
print("-"*10)
print(ar3)
print(ar3.shape)

[1 2 3 4 5]
----------
[[1 2 3]
 [4 5 6]]
----------
[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]
(2, 2, 3)


**shape:** A tuple that specifies the number of elements along each dimension of the array.<br>
**size:** The total number of elements in the array.<br>
**ndim:** The number of dimensions of the array.<br>
**nbytes:** The number of bytes used to store the data.<br>
**dtype:** The data type of the elements stored in the array.<br>

### Data Types Supported by NumPy

The `dtype` attribute holds the data type of the elements stored in a NumPy array. You can also set the data type explicitly by passing the `dtype` argument to the `array` function.
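As a quick sketch, the attributes of an array can be inspected as follows. The choice of `int16` is arbitrary and only keeps the byte counts easy to follow.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int16)   # dtype set explicitly

print(a.shape, a.size, a.ndim)   # (2, 3) 6 2
print(a.dtype, a.itemsize)       # int16 2
print(a.nbytes)                  # 12 = 6 elements x 2 bytes

b = a.astype(np.float64)         # astype returns a converted copy
print(b.dtype, b.nbytes)         # float64 48
```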

dtype       | Variants                             | Description
------------|--------------------------------------|-------------------------
int         | int8, int16, int32, int64            | Signed integers
uint        | uint8, uint16, uint32, uint64        | Unsigned (nonnegative) integers
bool        | bool_                                | Boolean (True or False)
float       | float16, float32, float64, float128  | Floating-point numbers (float128 is available only on some platforms)
complex     | complex64, complex128, complex256    | Complex-valued floating-point numbers (complex256 is available only on some platforms)

# Numeric data manipulation, tabular dataframe and data visualization

### Core features of numpy
We will start by looking at the API for creating arrays, accessing their elements, and deleting elements.

### Arrays creation
Arrays can be created in different ways. Here are some examples:

In [3]:
# create an array from an existing list, tuple or range
l = [1, 2, 3]
t = (4, 5, 6)
g = range(7, 10)

a = np.array(l); print(a)
a = np.array(t); print(a)
a = np.array(g); print(a)

[1 2 3]
[4 5 6]
[7 8 9]


In [2]:
import numpy as np

In [8]:
# generate arrays with NumPy's creation functions
a = np.arange(0, 1, 0.1) # from 0 to 1 (excluded) in steps of 0.1
print(a)
b = np.linspace(0, 5, 10) # 10 evenly spaced elements from 0 to 5 (included)
print(b.reshape(5,2))
c = np.empty(10) # 10 uninitialized elements: the values are whatever was in memory
print(c)
d = np.zeros(10) # 10 elements set to 0
print(d)
e = np.full_like(d, 3) # same shape and dtype as d, with every element set to 3
print(e)
f = np.random.random(10)  # 10 random floats in the interval [0, 1)
print(f)

[0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9]
[[0.         0.55555556]
 [1.11111111 1.66666667]
 [2.22222222 2.77777778]
 [3.33333333 3.88888889]
 [4.44444444 5.        ]]
[0.         0.55555556 1.11111111 1.66666667 2.22222222 2.77777778
 3.33333333 3.88888889 4.44444444 5.        ]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[3. 3. 3. 3. 3. 3. 3. 3. 3. 3.]
[0.13257961 0.99994971 0.58897059 0.25479458 0.4887622  0.22977009
 0.34845222 0.62290061 0.61217296 0.79995517]


## Accessing elements of arrays

Accessing the elements of an array uses the same syntax as Python lists, extended with more powerful indexing.

In [14]:
a = np.array(range(5, 15))  # 10 elements from 5 to 15 excluded
a

array([ 5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

In [15]:
first = a[0]  # access the first element
last = a[-1]  # access the last element
slice1 = a[:4] # access elements from index 0 to 4 excluded (0, 1, 2, 3)
slice2 = a[1:3] # access elements from index 1 included to 3 excluded (1, 2)
slice3 = a[-3:-1]  # access elements from the third-last included to the last excluded (-3, -2)
slice4 = a[4:]  # access elements from index 4 included until the end
slice5 = a[1:8:2]  # access from index 1 included to 8 excluded in steps of 2 (1, 3, 5, 7)
slice6 = a[::3] # access every third element (0, 3, 6, 9)
# bonus: reverse an array
slice7 = a[::-1]

In [16]:
a

array([ 5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

In [19]:
mask = a > 8
print(mask)
a[a>8]

[False False False False  True  True  True  True  True  True]


array([ 9, 10, 11, 12, 13, 14])

In [20]:
# more indexing methods
submask = [1, 3, 7, 9] 
slice8 = a[submask]  # access the elements in the positions described by submask

# boolean indexing.
mask = a > 8  # mask is a variable of booleans
print(mask)
slice9 = a[mask]  # a boolean mask can be used to select only certain elements
slice10 = a[a > 8]  # alternative syntax

print(slice9, slice10)

[False False False False  True  True  True  True  True  True]
[ 9 10 11 12 13 14] [ 9 10 11 12 13 14]
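Boolean masks can also be combined, used for assignment, or passed to `np.where`. The following is a small sketch on the same range of values; the names `in_range`, `clipped` and `labels` are illustrative.

```python
import numpy as np

a = np.arange(5, 15)                    # the integers 5 to 14

in_range = (a > 8) & (a < 12)           # combine conditions with & (and), | (or), ~ (not)
print(a[in_range])                      # [ 9 10 11]

clipped = a.copy()
clipped[clipped > 12] = 12              # assign through a mask
print(clipped)                          # [ 5  6  7  8  9 10 11 12 12 12]

labels = np.where(a % 2 == 0, 'even', 'odd')   # element-wise choice between two values
print(labels[:3])                       # ['odd' 'even' 'odd']
```

The parentheses around each condition are required because `&` binds more tightly than the comparison operators.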


## Some interesting numpy functions

### Math functions
NumPy has a fairly large set of functions for mathematical operations.

In [21]:
# standard math functions, like mean, standard deviations, max, min, etc. 
# are all available
a = np.array([10, 8, 9])
print('Mean: ', np.mean(a))
print('Standard deviation: ', np.std(a))
print('Max: ', np.max(a))
print('Min: ', np.min(a))
print('Ix Max: ', np.argmax(a))
print('Ix Min: ', np.argmin(a))

Mean:  9.0
Standard deviation:  0.816496580927726
Max:  10
Min:  8
Ix Max:  0
Ix Min:  1


### Inserting and deleting elements

NumPy arrays are contiguous blocks of memory with a different layout from a Python list. For this reason it is not possible to append an element in place as with a list, though similar functions are provided; each of them returns a new array.

In [3]:
import numpy as np
a = np.array([1, 2, 3])
# print(b)
b = np.append(a, 3); print(b) # insert the number 3 at the end
c = np.insert(a, 2, 12); print(c)  # insert the value 12 at index 2
d = np.delete(a, 1); print(d) # delete the value at index 1

[1 2 3 3]
[ 1  2 12  3]
[1 3]
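Because `np.append` allocates a new array on every call, growing an array inside a loop copies the data repeatedly. Two common alternatives are sketched below, assuming a toy loop that computes squares:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.append(a, 4)            # allocates a new array; a is left unchanged
print(a, b)                    # [1 2 3] [1 2 3 4]

# When the final size is unknown, collect in a list and convert once ...
collected = []
for value in range(5):
    collected.append(value * value)
squares = np.array(collected)

# ... and when it is known, preallocate and fill in place.
filled = np.empty(5, dtype=int)
for i in range(5):
    filled[i] = i * i

print(squares, filled)         # [ 0  1  4  9 16] [ 0  1  4  9 16]
```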


## Vectorization and broadcasting

The following two concepts are probably the most powerful ones in NumPy. Sometimes we use them without noticing. Let's consider the following examples.

In [4]:
a = np.array([1, 2, 3])

# I want the square of each element: there is the dedicated function
print(np.square(a))
# I want each element to the power of 6 for example:
print(np.power(a, 6)); print(a ** 6)
# I want to divide each element by 3
print(a / 3)
# I want to subtract 2 from each element
print(a - 2)

[1 4 9]
[  1  64 729]
[  1  64 729]
[0.33333333 0.66666667 1.        ]
[-1  0  1]


Operations between an array and a scalar are possible because NumPy compares the shapes of both operands and applies the operator to each element of the array, using the scalar as the second operand. This mechanism is called broadcasting.

In [6]:
x = np.array([1, 1, 3])
y = np.array([2, 3, 3])

# I want to multiply each element of x by the corresponding element of y
print(x * y)

[2 3 9]


So if two arrays have the same shape (1-D in this case), the operation is applied element-wise. If there is a 1-D array and a scalar, the scalar is applied to each element of the array.
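The same rule extends to arrays with different numbers of dimensions: axes of length 1, or missing leading axes, are stretched to match the other operand. A minimal sketch with made-up values:

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])          # shape (2, 3)
row = np.array([10, 20, 30])           # shape (3,)
col = np.array([[100], [200]])         # shape (2, 1)

print(table + row)     # row is added to every row of table
# [[11 22 33]
#  [14 25 36]]
print(table + col)     # col is added to every column of table
# [[101 102 103]
#  [204 205 206]]
print((row + col).shape)   # (2, 3): both operands are stretched
```

Shapes that cannot be matched this way, such as `(2, 3)` and `(2,)`, raise a `ValueError`.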

## Multi dimensional array

Sometimes we want to work with n-dimensional arrays. Let's see some examples

In [None]:
# Create a 2x2 array of random integers between 0 and 3
m = np.random.randint(0, 4, (2, 2))
m

In [15]:
m = np.random.randint(0,4,4)
m.reshape(2,2)

array([[2, 3],
       [0, 0]])

In [16]:
# create a 4x6 array of elements from 1 to 24
m = np.arange(1, 25)
print(m)
m = m.reshape((4, 6))
m

[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24]


array([[ 1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12],
       [13, 14, 15, 16, 17, 18],
       [19, 20, 21, 22, 23, 24]])

In [17]:
# example of error: 25 elements cannot fill a 4 x 6 matrix,
# which needs exactly 24 elements
m = np.arange(0, 25)
m = m.reshape((4, 6))

ValueError: cannot reshape array of size 25 into shape (4,6)

In [26]:
# sometimes we need to reshape an array knowing the number of rows
# but not the number of columns (or vice versa). In this case
# we can use -1 in the reshape function
m = np.arange(0, 60)
reshape1 = m.reshape((-1, 2)) # it will create a 30x2
print(reshape1.shape)
reshape2 = m.reshape((2, -1)) # it will create a 2x30
reshape3 = m.reshape((2, 6, -1)) # it will create a 2x6x5
print(reshape3)

(30, 2)
[[[ 0  1  2  3  4]
  [ 5  6  7  8  9]
  [10 11 12 13 14]
  [15 16 17 18 19]
  [20 21 22 23 24]
  [25 26 27 28 29]]

 [[30 31 32 33 34]
  [35 36 37 38 39]
  [40 41 42 43 44]
  [45 46 47 48 49]
  [50 51 52 53 54]
  [55 56 57 58 59]]]


## The axis arguments

Some NumPy functions accept an `axis` argument. It is used with multidimensional arrays to apply the operation along a particular axis. Let's see a simple example:

**School grades**
There are three students attending the physics class. Their marks over time are saved in a simple .txt file.

In [27]:
%%writefile school_grades.txt
tizio, caio, sempronio
9, 9, 10
8, 6, 9
10, 9, 9
7, 7, 8

Writing school_grades.txt


In [31]:
# read the data into a single array
grades = np.genfromtxt('school_grades.txt', skip_header=1, delimiter=',')
# grades = np.genfromtxt('school_grades.txt', skip_header=0, delimiter=',', dtype=str)
grades

array([[ 9.,  9., 10.],
       [ 8.,  6.,  9.],
       [10.,  9.,  9.],
       [ 7.,  7.,  8.]])

In [32]:
# the shape of the array is 
grades.shape

(4, 3)

In [33]:
# I want the global average mark for each student
mean_marks_per_student = np.mean(grades, axis=0)
mean_marks_per_student

array([8.5 , 7.75, 9.  ])

In [34]:
# I want the average mark between the students for each test
mean_marks_per_test = np.mean(grades, axis=1)
mean_marks_per_test

array([9.33333333, 7.66666667, 9.33333333, 7.33333333])

So the array has *4* rows and *3* columns. The shape can be accessed through the attribute `grades.shape`, which returns a tuple `(nrows, ncolumns)`. The axis argument can be used to apply a function over an axis and it works this way:

* If the axis is 0 it will collapse the axis at the 0-th position in the shape of the array. This means that `(nrows, ncolumns)` -> `(ncolumns)`
* If the axis is 1 it will collapse the axis at the 1-st position in the shape of the array. This means that `(nrows, ncolumns)` -> `(nrows)`
* In general, if we have an array of shape `(s1, s2, s3, ...)`, the axis is an integer (or a tuple of integers) from 0 to `ndim-1`. The positions given in the `axis` argument are the dimensions that will collapse.

```
a = np.array([...])
a.shape = (s1, s2, s3, s4)
           0,  1,  2,  3
           
np.mean(a, axis=0) -> (s2, s3, s4)
np.mean(a, axis=(0, 1)) -> (s3, s4)
np.mean(a, axis=3) -> (s1, s2, s3)
```

In [35]:
a = np.random.random(420)
print(a.shape)
a = a.reshape((2, 3, 2, 5, 7))
#              0, 1, 2, 3, 4
print(a.shape)

print(a.mean(axis=1).shape) # 0, 2, 3, 4
print(a.mean(axis=0).shape) # 1, 2, 3, 4
print(a.mean(axis=(2, 3)).shape) # 0, 1, 4
print(a.mean(axis=-1).shape) # 0, 1, 2, 3

(420,)
(2, 3, 2, 5, 7)
(2, 2, 5, 7)
(3, 2, 5, 7)
(2, 3, 7)
(2, 3, 2, 5)
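
The same rule applies to other reductions such as `max` and `sum`. The sketch below rebuilds the grades array by hand (so it does not need the .txt file) and also shows the `keepdims=True` option, which keeps the collapsed axis with length 1 so that the result can be combined with the original array. The variable names are illustrative.

```python
import numpy as np

# Same values as school_grades.txt: rows are tests, columns are students.
grades = np.array([[9, 9, 10],
                   [8, 6, 9],
                   [10, 9, 9],
                   [7, 7, 8]], dtype=float)

best_per_student = grades.max(axis=0)  # rows collapse    -> shape (3,)
total_per_test = grades.sum(axis=1)    # columns collapse -> shape (4,)

# keepdims=True leaves the collapsed axis in place with length 1,
# so the means broadcast against the (4, 3) array.
col_means = grades.mean(axis=0, keepdims=True)  # shape (1, 3)
centered = grades - col_means                   # each column now has mean 0

print(best_per_student)
print(total_per_test)
print(col_means.shape)
```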


<h1 style="color:green" align='center'>Numpy tutorial: iterate numpy array using nditer</h1>

In [None]:
import numpy as np

In [36]:
a = np.arange(12).reshape(3,4)
a

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

<h3 style="color:purple">Using normal for loop iteration</h3>

In [37]:
for row in a:
    for cell in row:
        print(cell)

0
1
2
3
4
5
6
7
8
9
10
11


<h3 style="color:purple">For loop with flatten</h3>

In [43]:
for cell in a.flatten():
    print(cell, end=" ")

0 1 2 3 4 5 6 7 8 9 10 11 
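
The two loops above do not use `np.nditer` itself. A minimal sketch of it, assuming the same 3x4 array: by default it visits elements in the array's memory order, and `order='F'` switches to column-by-column traversal.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Default: follow the memory layout of the array (row by row here).
row_major = [int(x) for x in np.nditer(a)]

# order='F' visits the elements column by column instead.
col_major = [int(x) for x in np.nditer(a, order='F')]

print(row_major)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print(col_major)  # [0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11]
```

Unlike `a.flatten()`, `np.nditer` does not first build a flattened copy of the array.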

### tutorial

In [45]:
import numpy as np

a=np.array([[1,2,3],[4,5,6]])
a

array([[1, 2, 3],
       [4, 5, 6]])

In [46]:
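import time
import numpy as np

SIZE = 1000000
# build real Python lists (a bare range() is lazy and is not a list)
l1 = list(range(SIZE))
l2 = list(range(SIZE))
a1 = np.arange(SIZE)
a2 = np.arange(SIZE)

# python list: element-wise addition in a Python-level loop
start = time.time()
result = [(x+y) for x,y in zip(l1,l2)]
print("python list took: ",(time.time()-start)*1000)
# numpy array: a single vectorized addition (times are in milliseconds)
start= time.time()
result = a1 + a2
print("numpy took: ", (time.time()-start)*1000)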

python list took:  120.67580223083496
numpy took:  17.95196533203125
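
A single `time.time()` measurement is noisy. As a hedged alternative, the standard-library `timeit` module can repeat each operation several times; the size and repetition count below are arbitrary choices, and the actual timings depend on the machine.

```python
import timeit
import numpy as np

SIZE = 100_000
l1 = list(range(SIZE))
l2 = list(range(SIZE))
a1 = np.arange(SIZE)
a2 = np.arange(SIZE)

# Total time in seconds for 5 repetitions of each addition.
list_time = timeit.timeit(lambda: [x + y for x, y in zip(l1, l2)], number=5)
numpy_time = timeit.timeit(lambda: a1 + a2, number=5)

print(f"python list: {list_time:.4f} s, numpy: {numpy_time:.4f} s")
```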


# Linear algebra with Numpy

It is possible to do symbolic linear algebra with [Sympy](http://www.sympy.org/en/index.html) but for numeric computations [Numpy](http://www.numpy.org/) is a high-performance library that should be used.

Here is how it is described: 

> NumPy is the fundamental package for scientific computing with Python. It contains among other things: [...]
 useful linear algebra, Fourier transform, and random number capabilities.

In this section we will see how to:

- Manipulate matrices;
- Solve Matrix equations;
- Calculate Matrix inverse and determinants.

## Manipulating matrices

It is straightforward to create a matrix using Numpy. Let us consider the following as examples:

$$
A = \begin{pmatrix}
5 & 6 & 2\\
4 & 7 & 19\\
0 & 3 & 12
\end{pmatrix}
$$

$$
B = \begin{pmatrix}
14 & -2 & 12\\
4 & 4 & 5\\
5 & 5 & 1
\end{pmatrix}
$$


First, similarly to Sympy, we need to import Numpy:

In [None]:
import numpy as np

Now we can define A:

In [47]:
A = np.matrix([[5, 6, 2],
               [4, 7, 19],
               [0, 3, 12]])
A

matrix([[ 5,  6,  2],
        [ 4,  7, 19],
        [ 0,  3, 12]])

In [48]:
B = np.matrix([[14, -2, 12],
               [4, 4, 5],
               [5, 5, 1]])
B

matrix([[14, -2, 12],
        [ 4,  4,  5],
        [ 5,  5,  1]])

In [58]:
c = np.ones((5, 5), dtype=int)  # a plain 5x5 ndarray of ones
print(c)
d = np.matrix(c)                # the same data wrapped as a matrix
print(d)

[[1 1 1 1 1]
 [1 1 1 1 1]
 [1 1 1 1 1]
 [1 1 1 1 1]
 [1 1 1 1 1]]
[[1 1 1 1 1]
 [1 1 1 1 1]
 [1 1 1 1 1]
 [1 1 1 1 1]
 [1 1 1 1 1]]


We can obtain the following straightforwardly:

- **5A** (or any other scalar multiple of **A**);
- **A ^ 3** (or any other exponent of **A**);
- **A + B**;
- **A - B**;
- **AB**

In [61]:
print(5 * A)
print(5*d)

[[25 30 10]
 [20 35 95]
 [ 0 15 60]]
[[5 5 5 5 5]
 [5 5 5 5 5]
 [5 5 5 5 5]
 [5 5 5 5 5]
 [5 5 5 5 5]]


In [62]:
A ** 3

matrix([[ 557, 1284, 3356],
        [ 760, 2305, 6994],
        [ 288, 1074, 3519]])

In [None]:
A + B

In [None]:
A - B

In [63]:
A * B

matrix([[104,  24,  92],
        [179, 115, 102],
        [ 72,  72,  27]])
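
The NumPy documentation no longer recommends the `np.matrix` class, even for linear algebra. A sketch of the same operations with regular arrays follows; the variable names are illustrative. With `np.array`, `*` is element-wise, so the matrix product is written `@` and the matrix power uses `np.linalg.matrix_power`.

```python
import numpy as np

A = np.array([[5, 6, 2],
              [4, 7, 19],
              [0, 3, 12]])
B = np.array([[14, -2, 12],
              [4, 4, 5],
              [5, 5, 1]])

product = A @ B                      # matrix product, same values as A * B with np.matrix
cube = np.linalg.matrix_power(A, 3)  # A @ A @ A, same values as A ** 3 with np.matrix
elementwise = A * B                  # element-wise product, NOT the matrix product

print(product)
print(cube)
```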

---

**EXERCISE** Compute $A ^ 2 - 2 A + 3$ with:

$$A = 
\begin{pmatrix}
1 & -1\\
2 & 1
\end{pmatrix}
$$

---

## Solving Matrix equations

We can use Numpy to (efficiently) solve large systems of equations of the form:

$$Ax=b$$

Let us illustrate that with:

$$
A = \begin{pmatrix}
5 & 6 & 2\\
4 & 7 & 19\\
0 & 3 & 12
\end{pmatrix}
$$

$$
b = \begin{pmatrix}
-1\\
2\\
1 
\end{pmatrix}
$$

In [64]:
A = np.matrix([[5, 6, 2],
               [4, 7, 19],
               [0, 3, 12]])
b = np.matrix([[-1], [2], [1]])

We use the `linalg.solve` command:

In [65]:
x = np.linalg.solve(A, b)
x

matrix([[ 0.45736434],
        [-0.62790698],
        [ 0.24031008]])

We can verify our result:

In [None]:
A * x
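
Because of floating-point rounding, the product of `A` and `x` may differ from `b` in the last digits. A minimal sketch of a tolerance-based check, written with plain arrays and `np.allclose` (an alternative to inspecting the product by eye):

```python
import numpy as np

A = np.array([[5, 6, 2],
              [4, 7, 19],
              [0, 3, 12]])
b = np.array([-1, 2, 1])

x = np.linalg.solve(A, b)

# The residual A x - b should be zero up to rounding error.
residual = A @ x - b
print(residual)
print(np.allclose(A @ x, b))
```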

---

**EXERCISE** Compute the solutions to the matrix equation $Bx=b$ (using the $B$ defined earlier).

---

## Matrix inversion and determinants

Computing the inverse of a matrix is straightforward:

In [66]:
Ainv = np.linalg.inv(A)
Ainv

matrix([[-0.20930233,  0.51162791, -0.7751938 ],
        [ 0.37209302, -0.46511628,  0.6744186 ],
        [-0.09302326,  0.11627907, -0.08527132]])

We can verify that $AA^{-1}=\mathbb{1}$:

In [88]:
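print(A * Ainv)
print()
# Note: iterating over a row of an np.matrix yields the row itself (a 1x3 matrix),
# so in the loops below `element` is a whole row, not a single number.
for row in A*Ainv:
    for element in row:
        print(abs(element))
print()
# ceil and floor exaggerate the tiny rounding errors
for row in A*Ainv:
    for element in row:
        print(np.ceil(element))
print()
for row in A*Ainv:
    for element in row:
        print(np.floor(element))
print()
# rounding to 0 decimals recovers the identity matrix
for row in A*Ainv:
    for element in row:
        print(np.round(element,0))
print()
# the type confirms that each `element` is still a matrix
for row in A*Ainv:
    for element in row:
        print(type(element))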

[[ 1.00000000e+00  2.77555756e-17  3.05311332e-16]
 [-2.08166817e-16  1.00000000e+00 -2.08166817e-16]
 [ 5.55111512e-17 -5.55111512e-17  1.00000000e+00]]

[[1.00000000e+00 2.77555756e-17 3.05311332e-16]]
[[2.08166817e-16 1.00000000e+00 2.08166817e-16]]
[[5.55111512e-17 5.55111512e-17 1.00000000e+00]]

[[2. 1. 1.]]
[[-0.  1. -0.]]
[[ 1. -0.  1.]]

[[1. 0. 0.]]
[[-1.  1. -1.]]
[[ 0. -1.  1.]]

[[1. 0. 0.]]
[[-0.  1. -0.]]
[[ 0. -0.  1.]]

<class 'numpy.matrix'>
<class 'numpy.matrix'>
<class 'numpy.matrix'>


The product might not look like the identity matrix, but on closer inspection the diagonal entries are all `1` (up to rounding) and the off-diagonal entries are **very** small numbers (around $10^{-16}$), which are zero to within floating-point precision.
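
Rather than rounding entry by entry, one possible check is to compare the whole product with `np.eye(3)` using `np.allclose`, which tolerates such rounding errors. A minimal sketch with plain arrays (the name `is_identity` is illustrative):

```python
import numpy as np

A = np.array([[5, 6, 2],
              [4, 7, 19],
              [0, 3, 12]])
Ainv = np.linalg.inv(A)

# True when every entry matches the identity matrix within a small tolerance.
is_identity = np.allclose(A @ Ainv, np.eye(3))
print(is_identity)
```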

To calculate the determinant:

In [82]:
np.linalg.det(A)

-128.99999999999997
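
The exact determinant here is $-129$; the trailing digits come from floating-point arithmetic. The sketch below checks this and also shows what happens when the determinant is zero. The singular matrix `S` is a made-up example, not part of the original material.

```python
import numpy as np

A = np.array([[5, 6, 2],
              [4, 7, 19],
              [0, 3, 12]])
det_A = np.linalg.det(A)
print(np.isclose(det_A, -129))

# Hypothetical singular matrix: the second row is twice the first.
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])
print(np.linalg.det(S))

try:
    np.linalg.inv(S)
    inverted = True
except np.linalg.LinAlgError:
    # A matrix with zero determinant has no inverse.
    inverted = False
print(inverted)
```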

---

**EXERCISE** Compute the inverse and determinant of $B$ (defined previously).

---

## Summary

In this section we have seen how to use Numpy to:

- Manipulate matrices;
- Solve linear systems;
- Compute Matrix inverses and determinants.

This again just touches on the capabilities of Numpy.

Let us take a look at [Pandas](03 - Data analysis with Pandas.ipynb) for data analysis.

## Source(s): 
* https://numpy.org/devdocs/user/quickstart.html
* https://www.tutorialspoint.com/numpy/index.htm