
# Introduction to NumPy

* NumPy is an abbreviation for **Numerical Python**
* It is a Python library consisting of multidimensional array objects and a collection of routines for processing those arrays
* NumPy lets you perform mathematical and logical operations on whole arrays
* Created by **Travis Oliphant** in 2005 by incorporating features of its predecessors Numarray and Numeric




# Environment Setup



*   The standard Python distribution doesn't come bundled with the NumPy module.
*   A lightweight option is to install NumPy with pip, the popular Python package installer.



In [0]:
# In a notebook cell; from a terminal, run "pip install numpy" instead
%pip install numpy

# NumPy vs. Python




*   NumPy needs less code than pure Python for the same task



In [2]:
# Adding vectors using pure Python

def pythonsum(n):
  a = list(range(n))
  b = list(range(n))
  c = []
  
  for i in range(len(a)):
    a[i] = i ** 2
    b[i] = i ** 3
    c.append(a[i] + b[i])
  
  return c

pythonsum(10)

[0, 2, 12, 36, 80, 150, 252, 392, 576, 810]

In [3]:
# Adding vectors in NumPy

import numpy as np

def numpysum(n):
  a = np.arange(n) ** 2
  b = np.arange(n) ** 3
  c = a + b
  return c

numpysum(10)

array([  0,   2,  12,  36,  80, 150, 252, 392, 576, 810])

- NumPy is usually much faster than pure Python loops for array arithmetic

In [0]:
from datetime import datetime

size = int(input("Enter any large number as size of the array: "))

start_time = datetime.now()
c = pythonsum(size)
time_taken = datetime.now() - start_time

print("Time taken using Python: ", time_taken)


Enter any large number as size of the array: 10000000
Time taken using Python:  0:00:09.631001


In [0]:
size = int(input("Enter any large number as size of the array: "))

start_time = datetime.now()
c = numpysum(size)
time_taken = datetime.now() - start_time

print("Time taken using NumPy: ", time_taken)

Enter any large number as size of the array: 10000000
Time taken using NumPy:  0:00:00.399164
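
A single wall-clock measurement like the ones above can be noisy. As a rough sketch, the standard-library `timeit` module can repeat each measurement and keep the best run. The array size and repeat count are arbitrary choices for illustration, and the two functions are redefined under the hypothetical names `python_sum` and `numpy_sum` so that the block runs on its own.

```python
import timeit
import numpy as np

def python_sum(n):
    # pure-Python version: build the result one element at a time
    return [i ** 2 + i ** 3 for i in range(n)]

def numpy_sum(n):
    # NumPy version: whole-array arithmetic
    a = np.arange(n)
    return a ** 2 + a ** 3

n = 100_000  # arbitrary size, chosen only for illustration
t_python = min(timeit.repeat(lambda: python_sum(n), number=1, repeat=3))
t_numpy = min(timeit.repeat(lambda: numpy_sum(n), number=1, repeat=3))

print("Best of 3, pure Python:", t_python, "s")
print("Best of 3, NumPy:      ", t_numpy, "s")
```

The actual timings depend on the machine and the NumPy version, so no expected output is shown here.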


In [0]:
a = np.arange(5)
a.dtype
a.shape


(5,)

In [0]:
for i in range(5): print(i)

In [0]:
print(np.arange(5))

# Introduction to vectors and matrices

* A matrix is a group of numbers or elements which are arranged as a rectangular array

* The numbers of rows and columns of a matrix are usually denoted by letters. For an *n x m* matrix, *n* represents the number of rows and *m* represents the number of columns

* If *n = m* it is a square matrix

$$\begin{bmatrix} 1 & 2 & -1 \\ 3 & 0 & 1 \\ 0 & 2 & 4 \end{bmatrix}$$

* A vector is a matrix with a single row or a single column. It can also be defined as a *1-by-m* or *n-by-1* matrix.

* A zero matrix has all elements equal to 0

* An identity matrix has 1 for every diagonal element and 0 everywhere else

* Multiplying an invertible matrix by its inverse gives the identity matrix
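
The definitions above can be checked with a minimal sketch. The 2 x 2 matrix `M` is an arbitrary invertible example chosen for illustration; `np.zeros`, `np.eye` and `np.linalg.inv` are standard NumPy functions.

```python
import numpy as np

Z = np.zeros((2, 2))          # zero matrix: every element is 0
I = np.eye(2)                 # identity matrix: 1 on the diagonal, 0 elsewhere
M = np.array([[2.0, 1.0],
              [1.0, 1.0]])    # an invertible square matrix (its determinant is 1)

M_inv = np.linalg.inv(M)      # inverse of M
product = M @ M_inv           # matrix product of M and its inverse

print(product)
print(np.allclose(product, I))   # True, up to floating-point rounding
```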






# NumPy Array Object



* The most important object defined in NumPy is an N-dimensional array type called **ndarray**
* It describes a collection of items of the same type
* Items in the collection can be accessed using a zero-based index
* Every item in an ndarray occupies a block of memory of the same size
* The type of the elements is described by a data-type object (called **dtype**) associated with the array
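
As a small illustration of these properties (the values are arbitrary), mixing an integer and a float in the input still produces an array with one common element type:

```python
import numpy as np

mixed = np.array([1, 2.5, 3])   # ints and a float are stored as one common type
print(mixed.dtype)               # float64
print(mixed.itemsize)            # 8 bytes for every element
print(mixed[0])                  # zero-based indexing; prints 1.0
```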



# NumPy Data Types

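| S. No. | Data Type  | Description                                             |
|--------|------------|---------------------------------------------------------|
| 1      | bool_      | Boolean (True or False), stored as one byte             |
| 2      | int_       | Default integer type (platform dependent)               |
| 3      | int8       | 8-bit signed integer                                    |
| 4      | int16      | 16-bit signed integer                                   |
| 5      | int32      | 32-bit signed integer                                   |
| 6      | int64      | 64-bit signed integer                                   |
| 7      | float_     | Alias of float64 (removed in NumPy 2.0)                 |
| 8      | float16    | Half-precision float                                    |
| 9      | float32    | Single-precision float                                  |
| 10     | float64    | Double-precision float                                  |
| 11     | complex_   | Alias of complex128 (removed in NumPy 2.0)              |
| 12     | complex64  | Complex number made of two 32-bit floats                |
| 13     | complex128 | Complex number made of two 64-bit floats                |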
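
To see how much memory one element of each type takes, a short sketch can query `np.dtype(...).itemsize`. The dictionary name `sizes` is just an illustrative choice, and only the fixed-width names from the table are used.

```python
import numpy as np

names = ["bool", "int8", "int16", "int32", "int64",
         "float16", "float32", "float64", "complex64", "complex128"]

# map each type name to the number of bytes one element occupies
sizes = {name: np.dtype(name).itemsize for name in names}

for name, nbytes in sizes.items():
    print(f"{name:>10}: {nbytes} byte(s) per element")
```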

# Basics of NumPy array objects

- NumPy’s main object is the homogeneous multidimensional array.
- It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers.
- In NumPy, dimensions are called axes.

## Array Creation

You can create an array from a regular Python list or tuple using the array function. The type of the resulting array is deduced from the type of the elements in the sequences.

In [73]:
a = np.array([2,3,4])
a

array([2, 3, 4])

A frequent error consists in calling array with multiple numeric arguments, rather than providing a single list of numbers as an argument.

a = np.array(1,2,3,4)    # WRONG


a = np.array([1,2,3,4])  # RIGHT
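
A minimal sketch of what happens in each case. The exact exception type and message for the wrong call vary between NumPy versions, so the example catches both `TypeError` and `ValueError`; the flag `raised` is a hypothetical name used only here.

```python
import numpy as np

raised = False
try:
    np.array(1, 2, 3, 4)          # wrong: four separate arguments
except (TypeError, ValueError) as err:
    # the exact exception type and message depend on the NumPy version
    raised = True
    print("Rejected:", err)

a = np.array([1, 2, 3, 4])        # right: a single list argument
print(a)
```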

array transforms sequences of sequences into two-dimensional arrays, sequences of sequences of sequences into three-dimensional arrays, and so on.

In [7]:
import numpy as np
x = np.array([[1,2,3],[4,5,6]])
x

array([[1, 2, 3],
       [4, 5, 6]])

In [8]:
# Type of array

print("The type of the array is : ", type(x))

The type of the array is :  <class 'numpy.ndarray'>


In [0]:
# shape returns a tuple with the size of the array along each axis (here: rows, columns)

print("The shape of the array is : ", x.shape)

The shape of the array is :  (2, 3)


In [0]:
# Size returns the total number of elements in the array

print("The total size is :", x.size)

The total size is : 6


In [0]:
# ndim returns the number of dimensions (axes) of the ndarray

print("The dimension of the array is :", x.ndim)

The dimension of the array is : 2


In [0]:
# dtype returns the data type of the array elements

print("The data types of the array elements are :", x.dtype )

The data types of the array elements are : int64


In [0]:
# nbytes returns the memory consumption of the array

print("The array consumes :", x.nbytes , " bytes")

The array consumes : 48  bytes


In [0]:
# You can specify the data type while creating your array using the dtype argument
# (np.float was removed from NumPy; np.float64 is the equivalent explicit type)

x = np.array([[1,2,3],[4,5,6]], dtype=np.float64)
print(x)
print("Memory used: ", x.nbytes)

[[1. 2. 3.]
 [4. 5. 6.]]
Memory used:  48


In [0]:
x = np.array([[1,2,3],[4,5,6]], dtype=np.complex128)  # np.complex was removed from NumPy
print(x)
print("Memory used: ", x.nbytes)

[[1.+0.j 2.+0.j 3.+0.j]
 [4.+0.j 5.+0.j 6.+0.j]]
Memory used:  96


In [0]:
x = np.array([[1,2,3],[4,5,6]], dtype=np.uint32)
print(x)
print("Memory used: ", x.nbytes)

[[1 2 3]
 [4 5 6]]
Memory used:  24


**You cannot convert the elements of an existing array to another dtype in place**.
However, you can create a copy of the array with a new dtype, either by passing it to `np.array` with a `dtype` argument or by calling the `astype` method.

In [0]:
x_copy = np.array(x, dtype = np.float64)  # np.float was removed from NumPy
x_copy

array([[1., 2., 3.],
       [4., 5., 6.]])

In [0]:
x_copy_int = x_copy.astype(np.int64)  # np.int was removed from NumPy
x_copy_int

array([[1, 2, 3],
       [4, 5, 6]])
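
One point worth checking when converting floats to integers: as far as this sketch shows, `astype` drops the fractional part rather than rounding, and the original array keeps its dtype. The values are arbitrary.

```python
import numpy as np

x = np.array([1.7, -2.3, 3.9])
x_int = x.astype(np.int64)    # new array; fractional parts are discarded

print(x_int.tolist())         # [1, -2, 3]
print(x.dtype, x_int.dtype)   # float64 int64: x itself is unchanged
```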

## Effect of dtype

Imagine a case where you are trying to identify and calculate the risks of an individual patient who has cancer.

If you have 100,000 records (rows), where each row represents a single patient, and each patient has 100 features (results of some of the tests), you have an array of shape (100000, 100). Storing it as float32 instead of the default float64 halves its memory use:

In [0]:
Data_Cancer= np.random.rand(100000,100)
print("Memory consumption with dtype as ", Data_Cancer.dtype, " : ", Data_Cancer.nbytes)

Data_Cancer_New = np.array(Data_Cancer, dtype = np.float32)
print("Memory consumption with dtype as ", Data_Cancer_New.dtype, " : ",  Data_Cancer_New.nbytes)

Memory consumption with dtype as  float64  :  80000000
Memory consumption with dtype as  float32  :  40000000
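
The byte counts follow from the element count times the bytes per element. The sketch below uses a smaller array of arbitrary size and also hints at the price of the saving: float32 stores fewer significant digits than float64.

```python
import numpy as np

data64 = np.zeros((1000, 100), dtype=np.float64)
data32 = data64.astype(np.float32)

# nbytes is the element count times the bytes per element
print(data64.nbytes == data64.size * data64.itemsize)   # True
print(data64.nbytes // data32.nbytes)                   # 2

# the saving costs precision: 0.1 is stored differently in the two types
same = bool(np.float32(0.1) == np.float64(0.1))
print(same)                                             # False
```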


To create sequences of numbers, NumPy provides a function analogous to range that returns arrays instead of lists.

In [74]:
np.arange( 10, 30, 5 )

array([10, 15, 20, 25])

In [75]:
np.arange( 0, 2, 0.3 )    # it accepts float arguments

array([0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8])
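
With floating-point steps, the number of elements `arange` returns can be hard to predict because of rounding. A commonly used alternative is `np.linspace`, which takes the number of points instead of the step. The range and point count below are arbitrary.

```python
import numpy as np

points = np.linspace(0, 2, 9)   # 9 evenly spaced values from 0 to 2, endpoint included
print(points)
print(len(points))              # 9
```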

# NumPy array operations


## Creating NumPy array from a list:

In [12]:
my_list = [2, 14, 6, 8]

my_array = np.asarray(my_list)
type(my_array)

numpy.ndarray

## Arithmetic operations with scalar value:

In [0]:
print(my_array + 2)

print(my_array - 1)

print(my_array * 2)

print(my_array / 2)


[ 4 16  8 10]
[ 1 13  5  7]
[ 4 28 12 16]
[1. 7. 3. 4.]


## Arithmetic operations between arrays:


In [9]:
second_array = np.zeros(4) + 3
second_array

array([3., 3., 3., 3.])

In [13]:
my_array - second_array

array([-1., 11.,  3.,  5.])

In [10]:
third_array = np.ones(4) + 3
third_array

array([4., 4., 4., 4.])

In [14]:
my_array - third_array

array([-2., 10.,  2.,  4.])
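
A short sketch of the same idea, assuming `np.full` as a more direct way to build a constant array than `np.zeros(4) + 3`. It also shows that arrays of incompatible lengths cannot be combined elementwise.

```python
import numpy as np

my_array = np.asarray([2, 14, 6, 8])
threes = np.full(4, 3.0)            # same values as np.zeros(4) + 3

difference = my_array - threes      # elementwise difference
print(difference)
print(my_array * threes)            # elementwise product

try:
    my_array + np.ones(3)           # shapes (4,) and (3,) do not match
except ValueError as err:
    print("Shape mismatch:", err)
```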

# Creating Arrays

In [0]:
import numpy as np

# a = np.arange(15)
# a

a = np.arange(15).reshape(3,5)
a


array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [0]:
a.shape

(3, 5)

In [0]:
a.ndim

2

In [0]:
a.dtype.name

'int64'

In [0]:
a.itemsize

8

In [0]:
a.size

15

In [0]:
type(a)

numpy.ndarray

In [0]:
b = np.array([6,7,8])
b

array([6, 7, 8])

In [0]:
type(b)

numpy.ndarray

In [0]:
print(np.arange(10000).reshape(100,100))

[[   0    1    2 ...   97   98   99]
 [ 100  101  102 ...  197  198  199]
 [ 200  201  202 ...  297  298  299]
 ...
 [9700 9701 9702 ... 9797 9798 9799]
 [9800 9801 9802 ... 9897 9898 9899]
 [9900 9901 9902 ... 9997 9998 9999]]


In [0]:
import sys

# Print arrays in full; current NumPy versions reject threshold=np.nan
np.set_printoptions(threshold=sys.maxsize)

# Basic Operations

Arithmetic operations on arrays apply elementwise

In [0]:
a = np.array([10,20,30,40])
b = np.arange(4)
b

array([0, 1, 2, 3])

In [0]:
c = a - b
c

array([10, 19, 28, 37])

In [0]:
b ** 2


array([0, 1, 4, 9])

In [0]:
10 * np.sin(a)

array([-5.44021111,  9.12945251, -9.88031624,  7.4511316 ])

In [0]:
a < 35

array([ True,  True,  True, False])

For products, you can choose between the elementwise product (`*`) and the matrix product (`dot`).

In [3]:
import numpy as np

A = np.array([[1,1],[0,1]])
B = np.array([[2,0],[3,4]])
A * B # elementwise product

array([[2, 0],
       [0, 4]])

In [4]:
A.dot(B) # matrix product

array([[5, 4],
       [3, 4]])

In [5]:
np.dot(A, B) # the same matrix product, written as a function call

array([[5, 4],
       [3, 4]])
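
Python 3.5 and later also provide the `@` operator, which NumPy arrays support for the matrix product. A minimal sketch with the same two matrices:

```python
import numpy as np

A = np.array([[1, 1], [0, 1]])
B = np.array([[2, 0], [3, 4]])

C = A @ B                                # matrix product, same result as A.dot(B)
print(C)
print(np.array_equal(C, np.dot(A, B)))   # True
```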

# Indexing, Slicing and Iterating

One-dimensional arrays can be indexed, sliced and iterated over (like lists and other Python sequences)

In [15]:
a = np.arange(10) ** 2
a

array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

In [17]:
a[2] #indexing

4

In [18]:
a[2:5] # slicing: elements at indices 2 to 4 (the 3rd to 5th elements)

array([ 4,  9, 16])

In [19]:
a[:6:2] = -1 # equivalent to a[0:6:2] = -1; from the start up to (but not including) index 6, set every 2nd element to -1

a

array([-1,  1, -1,  9, -1, 25, 36, 49, 64, 81])

In [20]:
a[ : :-1] # returns the array in reverse order

array([81, 64, 49, 36, 25, -1,  9, -1,  1, -1])

In [16]:
for i in np.arange(10) ** 2:   # fresh squares, because a now contains -1 entries
  print(i**(1/2)) # iteration example

0.0
1.0
2.0
3.0
4.0
5.0
6.0
7.0
8.0
9.0


Multidimensional arrays can have one index per axis. These indices are given as a tuple separated by commas.

In [23]:
def f(x,y):
  return 10*x+y
  
b = np.fromfunction(f,(5,4),dtype=int)
b


array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [20, 21, 22, 23],
       [30, 31, 32, 33],
       [40, 41, 42, 43]])

In [24]:
b[2,3] # indexing with row & column indices

23

In [25]:
b[0:5, 1] # each row in second column of b

array([ 1, 11, 21, 31, 41])

In [26]:
b[:,1] # same as above

array([ 1, 11, 21, 31, 41])

In [27]:
b[1:3, :] #each column in the second and third row of b

array([[10, 11, 12, 13],
       [20, 21, 22, 23]])

In [28]:
#missing indices are considered as complete slices :

b[-1] # prints the last row. Equivalent to b[-1,:]

array([40, 41, 42, 43])

In [29]:
# NumPy also allows you to write this using dots as b[i,...]

b[1,...] # b is two-dimensional, so this is the same as b[1,:] or b[1]

array([10, 11, 12, 13])

In [30]:
b[...,2] # b is two-dimensional, so this is the same as b[:,2]

array([ 2, 12, 22, 32, 42])
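
The dots become more useful with more than two axes, where they stand for as many full slices as are needed. A minimal sketch on an arbitrary three-dimensional array (the name `c` and its shape are illustrative only):

```python
import numpy as np

c = np.arange(24).reshape(2, 3, 4)   # 3-D array: 2 blocks, 3 rows, 4 columns

first_block = c[0, ...]              # same as c[0, :, :]
last_column = c[..., -1]             # same as c[:, :, -1]

print(first_block.shape)             # (3, 4)
print(last_column.shape)             # (2, 3)
```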

In [33]:
# Iterating over multidimensional arrays is done with respect to the first axis

for row in b:
  print(row)

[0 1 2 3]
[10 11 12 13]
[20 21 22 23]
[30 31 32 33]
[40 41 42 43]


In [34]:
# To perform an operation on each element in the array, use the flat attribute

for element in b.flat:
  print(element)

0
1
2
3
10
11
12
13
20
21
22
23
30
31
32
33
40
41
42
43


# Shape Manipulation

## Changing shape of the array

An array has a shape given by the number of elements along each axis

In [35]:
b.shape

(5, 4)

- The shape can be changed with various commands
- The following commands return a modified array but do not change the original array

In [37]:
b.ravel() # returns the array, flattened

array([ 0,  1,  2,  3, 10, 11, 12, 13, 20, 21, 22, 23, 30, 31, 32, 33, 40,
       41, 42, 43])

In [38]:
b.reshape(4,5) # returns the array with modified shape 

array([[ 0,  1,  2,  3, 10],
       [11, 12, 13, 20, 21],
       [22, 23, 30, 31, 32],
       [33, 40, 41, 42, 43]])

In [39]:
b.T # returns the transposed array

array([[ 0, 10, 20, 30, 40],
       [ 1, 11, 21, 31, 41],
       [ 2, 12, 22, 32, 42],
       [ 3, 13, 23, 33, 43]])

- The `reshape` method returns a new array with a modified shape and leaves the original unchanged
- The `resize` method modifies the array itself

In [40]:
b

array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [20, 21, 22, 23],
       [30, 31, 32, 33],
       [40, 41, 42, 43]])

In [0]:
b.resize((4,5))

In [42]:
b

array([[ 0,  1,  2,  3, 10],
       [11, 12, 13, 20, 21],
       [22, 23, 30, 31, 32],
       [33, 40, 41, 42, 43]])

In [45]:
b.reshape(5,-1) # a dimension given as -1 is calculated automatically from the array size


array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [20, 21, 22, 23],
       [30, 31, 32, 33],
       [40, 41, 42, 43]])
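
The difference between the two methods, and the `-1` shortcut, can be sketched on a small array. The sizes are arbitrary.

```python
import numpy as np

b = np.arange(20)

result = b.resize((5, 4))   # resize changes b itself and returns None
print(result)               # None
print(b.shape)              # (5, 4)

r = b.reshape(4, -1)        # -1 is inferred as 5, since 20 / 4 = 5
print(r.shape)              # (4, 5)
print(b.shape)              # (5, 4): reshape left b unchanged
```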

## Stacking together different arrays

Several arrays can be stacked together along different axes:

In [46]:
a = np.floor(10*np.random.random((2,2)))
a

array([[4., 9.],
       [3., 1.]])

In [47]:
b = np.floor(10*np.random.random((2,2)))
b

array([[8., 6.],
       [7., 7.]])

In [48]:
np.vstack((a,b)) # vertically stack

array([[4., 9.],
       [3., 1.],
       [8., 6.],
       [7., 7.]])

In [49]:
np.hstack((a,b)) # horizontal stack

array([[4., 9., 8., 6.],
       [3., 1., 7., 7.]])
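
For two-dimensional arrays, both stacking functions correspond to `np.concatenate` with an explicit `axis` argument. A sketch using fixed values instead of random ones, so the result is reproducible:

```python
import numpy as np

a = np.array([[4., 9.], [3., 1.]])
b = np.array([[8., 6.], [7., 7.]])

rows = np.concatenate((a, b), axis=0)   # for 2-D input, same result as np.vstack((a, b))
cols = np.concatenate((a, b), axis=1)   # for 2-D input, same result as np.hstack((a, b))

print(rows.shape)   # (4, 2)
print(cols.shape)   # (2, 4)
```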

## Splitting arrays

Using `hsplit`, you can split an array along its horizontal axis, either by specifying the number of equally shaped arrays to return, or by specifying the columns after which the division should occur

In [50]:
a = np.floor(10*np.random.random((2,12)))
a

array([[7., 4., 9., 5., 7., 2., 8., 1., 4., 7., 9., 6.],
       [2., 7., 9., 9., 2., 8., 1., 9., 9., 6., 2., 4.]])

In [51]:
np.hsplit(a,3) # splits a into 3 equally shaped arrays

[array([[7., 4., 9., 5.],
        [2., 7., 9., 9.]]), array([[7., 2., 8., 1.],
        [2., 8., 1., 9.]]), array([[4., 7., 9., 6.],
        [9., 6., 2., 4.]])]

In [52]:
np.hsplit(a, (3,4)) #split a after the third and fourth column

[array([[7., 4., 9.],
        [2., 7., 9.]]), array([[5.],
        [9.]]), array([[7., 2., 8., 1., 4., 7., 9., 6.],
        [2., 8., 1., 9., 9., 6., 2., 4.]])]

`vsplit` splits along the vertical axis, and `array_split` allows one to specify along which axis to split
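
A brief sketch of both functions on small, arbitrary arrays. Unlike `hsplit` and `vsplit`, `array_split` also accepts a count that does not divide the axis evenly.

```python
import numpy as np

m = np.arange(12).reshape(4, 3)

top, bottom = np.vsplit(m, 2)              # two (2, 3) arrays, split along the rows
parts = np.array_split(np.arange(10), 3)   # uneven split is allowed
by_cols = np.array_split(m, 3, axis=1)     # axis=1 splits along the columns

print(top.shape, bottom.shape)             # (2, 3) (2, 3)
print([len(p) for p in parts])             # [4, 3, 3]
print([p.shape for p in by_cols])          # [(4, 1), (4, 1), (4, 1)]
```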

## Copies and Views

Simple assignments make no copy of array objects or of their data.

In [54]:
a = np.arange(12)
b = a # no new object is created
b is a # a and b are two names for the same ndarray object

True

In [55]:
b.shape = 3,4 #changes the shape of a
a.shape

(3, 4)

Python passes mutable objects as references, so function calls make no copy.

In [57]:
def f(x):
  print(id(x)) # id is a unique identifier of an object 
  
id(a)

139810974013920

In [58]:
f(a)

139810974013920
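
A practical consequence is that a function which modifies its argument changes the caller's array. In this sketch, `zero_first` is a hypothetical helper written only for illustration.

```python
import numpy as np

def zero_first(x):
    # hypothetical helper: modifies the array it receives, in place
    x[0] = 0

a = np.arange(1, 5)   # [1, 2, 3, 4]
zero_first(a)
print(a.tolist())     # [0, 2, 3, 4]: the caller's array changed
```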


## View or Shallow Copy

Different array objects can share the same data. The `view` method creates a new array object that looks at the same data.

In [59]:
c = a.view()
c is a

False

In [60]:
c.base is a    # c is a view of the data owned by a

True

In [61]:
c.shape = 2,6  # a's shape doesn't change
a.shape

(3, 4)

In [62]:
a


array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [63]:
c[0,4]=1234  # a's data changes
a

array([[   0,    1,    2,    3],
       [1234,    5,    6,    7],
       [   8,    9,   10,   11]])

Slicing an array returns a view of it:

In [68]:
s = a[ : , 1:3]     # spaces added for clarity; could also be written "s = a[:,1:3]"
s

array([[ 1,  2],
       [ 5,  6],
       [ 9, 10]])

In [0]:
s[:] = 10           # s[:] is a view of s. Note the difference between s=10 and s[:]=10
s

array([[10, 10],
       [10, 10],
       [10, 10]])

In [66]:
a

array([[   0,   10,   10,    3],
       [1234,   10,   10,    7],
       [   8,   10,   10,   11]])

## Deep copy

In [69]:
d = a.copy()       # a new array object with new data is created
d is a

False

In [70]:
d.base is a        # d doesn't share anything with a

False

In [72]:
d[0,0] = 9999
a

array([[   0,   10,   10,    3],
       [1234,   10,   10,    7],
       [   8,   10,   10,   11]])
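
One way to check whether two arrays share data, rather than comparing `base`, is `np.shares_memory`. A minimal sketch with an arbitrary array:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
v = a[:, 1:3]        # slice: a view on a's data
d = a.copy()         # deep copy: its own data

view_shares = np.shares_memory(a, v)
copy_shares = np.shares_memory(a, d)

print(view_shares)   # True
print(copy_shares)   # False
```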
