### NumPy

- NumPy stands for Numerical Python
- A Python library that provides support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays

#### Why NumPy?

- NumPy aims to provide an array object that is much faster than traditional Python lists for numerical work; figures of up to 50x are often quoted, though the actual speedup depends on the operation and the data size
- The array object in NumPy is called ndarray; NumPy provides many supporting functions that make working with ndarray very easy
- NumPy arrays are stored in one contiguous block of memory, unlike lists, so their elements can be accessed and manipulated very efficiently; this is the main reason arrays are faster than lists
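
As a rough illustration of the speed claim, the sketch below times a sum of squares computed with a plain Python loop against the equivalent vectorized NumPy expression. The array size, the repeat count and the helper function names are arbitrary choices made for this example, and the measured ratio will differ between machines and NumPy versions.

```python
import timeit
import numpy as np

n = 100_000                       # arbitrary size, chosen only for illustration
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

def list_sum_squares():
    # Pure-Python loop over the list elements
    return sum(x * x for x in py_list)

def numpy_sum_squares():
    # Vectorized: the loop runs in compiled code over contiguous memory
    return int((np_arr * np_arr).sum())

t_list = timeit.timeit(list_sum_squares, number=10)
t_numpy = timeit.timeit(numpy_sum_squares, number=10)
print(f"list: {t_list:.4f}s  ndarray: {t_numpy:.4f}s")
```

Both functions return the same value; only the time taken differs.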


In [None]:
## Installing numpy
# pip install numpy

#### Creating ndarrays

In [1]:
## Creating an array of integers using the array() function

import numpy as np
arr = np.array([1,2,3,4,5])
print(arr)


[1 2 3 4 5]


In [2]:
print(type(arr))

<class 'numpy.ndarray'>


To create an ndarray, we can pass a list, tuple or any array-like object to the array() function, and it will be converted into an ndarray

In [3]:
## Creating array of zeros and ones
arr1 = np.zeros((2,3))
print("Array of zeros: \n", arr1)

arr2 = np.ones((1,5))
print("Array of ones: \n", arr2)


Array of zeros: 
 [[0. 0. 0.]
 [0. 0. 0.]]
Array of ones: 
 [[1. 1. 1. 1. 1.]]


In [4]:
## Creating array using arange()
## arange() is an array creation routine based on numerical ranges
## It creates an instance of ndarray with evenly spaced values and returns the reference to it

arr = np.arange(1, 15, 3, dtype = np.int32)
print(arr)

[ 1  4  7 10 13]


numpy.arange([start, ]stop[, step], dtype=None) -> numpy.ndarray

The first three parameters determine the range of the values, while the fourth specifies the type of the elements:

- start is the number (integer or decimal) that defines the first value in the array and defaults to 0
- stop is the number that defines the end of the array and isn’t included in the array
- step is the number that defines the spacing (difference) between each two consecutive values in the array and defaults to 1
- dtype is the type of the elements of the output array and defaults to None, in which case it is inferred from the other arguments
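
The defaults described above can be checked with a short sketch. The variable names are only illustrative.

```python
import numpy as np

only_stop = np.arange(5)             # start defaults to 0, step defaults to 1
with_step = np.arange(2, 10, 2)      # stop (10) is not included
float_step = np.arange(0, 1, 0.25)   # a decimal step gives a float array

print(only_stop)    # [0 1 2 3 4]
print(with_step)    # [2 4 6 8]
print(float_step)   # [0.   0.25 0.5  0.75]
```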

#### Dimensions in Arrays
A dimension in arrays is one level of array depth (nested arrays)

`Nested array: arrays that have arrays as their elements`

In [5]:
## A 0-D array, or scalar, holds a single value
## Each value in an array can be thought of as a 0-D array
a=np.array(23)
print(a)



23


In [6]:
## An array that has 0-D arrays as its elements is called uni-dimensional or 1-D array
## The most common and basic arrays

b=np.array([13,26,39])
print(b)

[13 26 39]


In [7]:
## An array that has 1-D arrays as its elements is called a 2-D array
## Often used to represent a matrix or a 2nd-order tensor
c=np.array([[1,2,3],[4,5,6]])
print(c)


[[1 2 3]
 [4 5 6]]


In [9]:
## An array that has 2-D arrays (matrices) as its elements is called 3-D array
## These are often used to represent a 3rd order tensor
d=np.array([[[1,2,3],[4,5,6]],[[1,2,3],[4,5,6]]])
print(d)


[[[1 2 3]
  [4 5 6]]

 [[1 2 3]
  [4 5 6]]]


In [12]:
## Checking dimensions
## ndim attribute returns an integer that indicates how many dimensions the array has
print(c.ndim)
print(a.ndim)

2
0


In [14]:
## Creating higher dimensional arrays by defining the number of dimensions by using the ndmin argument
# Specifies minimum dimensions of resultant array.
e= np.array([1,2,3,4],ndmin=5)
print(e)
print("number of dimensions:" , e.ndim)

[[[[[1 2 3 4]]]]]
number of dimensions: 5


In this array the innermost dimension (the 5th) has 4 elements; the 4th dimension has 1 element, which is the vector; the 3rd has 1 element, which is the matrix containing the vector; the 2nd has 1 element, which is a 3-D array; and the 1st has 1 element, which is a 4-D array.

#### Array Indexing

- Array indexing means accessing an array element by referring to its index number
- The indexes in NumPy arrays start with 0, meaning that the first element has index 0, and the second has index 1, etc

In [15]:
arr = np.array([23,67,9,84])
print(arr)


[23 67  9 84]


In [19]:
## Accessing the second element of the array
arr[1]

67

In [20]:
## Accessing the 4th element
arr[3]

84

In [24]:
## Getting third and fourth elements from the above array and adding them
arr[2]+arr[3]

93

In [25]:
## To access elements of a 2-D array, use comma-separated integers:
## the first selects the row and the second selects the element within that row
arr=np.array([[1,2,3,4,5],[6,7,8,9,10]])
print('2nd element of 1st row:', arr[0,1])

2nd element of 1st row: 2


In [26]:
print('5th element of 2nd row:',arr[1,4])


5th element of 2nd row: 10


In [28]:
## To access elements from 3-D arrays we can use comma separated integers 
## representing the dimensions and the index of the element
arr=np.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]])
print(arr)
## Accessing the third element of the second array of the first array
print(arr[0,1,2])

[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]
6


##### Explanation of above example

`arr[0, 1, 2]` prints the value `6`.

And this is why:

The first number represents the first dimension, which contains two arrays: `[[1, 2, 3], [4, 5, 6]]` and `[[7, 8, 9], [10, 11, 12]]`

Since we selected 0, we are left with the first array: `[[1, 2, 3], [4, 5, 6]]`

The second number represents the second dimension, which also contains two arrays: `[1, 2, 3]` and `[4, 5, 6]`

Since we selected 1, we are left with the second array: `[4, 5, 6]`

The third number represents the third dimension, which contains three values: 4, 5, 6

Since we selected 2, we end up with the third value: 6

In [30]:
## Negative indexing is used to access an array from the end
arr =np.array([[1,2,3,4,5],[6,7,8,9,10]])
print('last element of 2nd row:', arr[1,-1])

last element of 2nd row: 10


#### Array Slicing

- Slicing in Python means taking elements from one given index to another given index
- We pass a slice instead of an index, like this: `[start:end]`
- We can also define the step: `[start:end:step]`
- If we don't pass start, it is taken as 0
- If we don't pass end, it is taken as the length of the array in that dimension
- If we don't pass step, it is taken as 1

In [31]:
arr = np.array([1,2,3,4,5,6,7])
print(arr[1:5])

[2 3 4 5]


The result includes the start index, but excludes the end index

In [32]:
## Slicing elements from index 4 to the end of the array
print(arr[4:])

[5 6 7]


In [33]:
## Slicing elements from the beginning to index 4 (not included)
print(arr[:4])

[1 2 3 4]


In [34]:
## Negative Slicing - use the minus operator to refer to an index from the end

## Slicing from index 3 from the end to index 1 from the end (not included)
print(arr[-3:-1])

[5 6]


In [35]:
## Using the step value to determine the step of the slicing

## Returning every other element from index 1 to index 5 (not included)
print(arr[1:5:2])

[2 4]


In [36]:
## Returning every other element from the entire array
print(arr[::2])

[1 3 5 7]


In [40]:
## Slicing 2-D arrays
## From the second row, slicing elements from index 1 to index 4 (not included)
arr = np.array([[1,2,3,4,5],[6,7,8,9,10]])
print(arr[1, 1:4])

[7 8 9]


In [47]:
## Returning the element at index 2 from both rows
print(arr[0:2,2])

[3 8]


In [48]:
## Slicing index 1 to index 4 (not included) from both rows
print(arr[0:2,1:4])

[[2 3 4]
 [7 8 9]]


#### NumPy Data Types

NumPy identifies data types by single-character codes:

- i - integer
- b - boolean
- u - unsigned integer
- f - float
- c - complex float
- m - timedelta
- M - datetime
- O - object
- S - byte string
- U - unicode string
- V - fixed chunk of memory for other type (void)

##### Checking the Data Type of an Array
The NumPy array object has a property called dtype that returns the data type of the array

In [49]:
arr = np.array([1, 2, 3, 4])
print(arr.dtype)

int32


In [50]:
arr1 = np.array(['apple', 'orange', 'cherry']) # unicode string data type, <U6 (at most 6 characters)
print(arr1.dtype)

<U6


In [None]:
## Creating Arrays With a Defined Data Type
## The array() function can take an optional argument "dtype"
## that allows us to define the expected data type of the array elements

## Creating an array with data type string
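
The cell above has no code, so the following is a minimal sketch of what it describes, assuming the intent was to pass the type code 'S' (byte string) as dtype.

```python
import numpy as np

# dtype='S' asks NumPy to store each element as a byte string
arr = np.array([1, 2, 3, 4], dtype='S')
print(arr)         # [b'1' b'2' b'3' b'4']
print(arr.dtype)   # |S1
```

The `b` prefix marks byte strings; `|S1` means each element is one byte long.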


For i, u, f, S and U we can also specify the size, for example 'i4' for a 4-byte integer

In [56]:
## Creating an array with a 4-byte integer data type
arr=np.array([1,2,3,4], dtype='i4')
print(arr)
print(arr.dtype)

[1 2 3 4]
int32


In [57]:
## Converting the Data Type of an Existing Array
## The astype() method creates a copy of the array with the data type given as a parameter
## The original array is left unchanged

arr = np.array([1.1, 2.1, 3.1])

print(arr)
print(arr.dtype)

[1.1 2.1 3.1]
float64


In [58]:
newarr=arr.astype('i')
print(newarr)
print(newarr.dtype)

[1 2 3]
int32


In [None]:
## Alternative way
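
The cell above is empty. One plausible alternative, assumed here, is to pass a Python type such as `int` to astype() instead of the type code 'i'.

```python
import numpy as np

arr = np.array([1.1, 2.1, 3.1])

# Passing the built-in int instead of the string code 'i'
newarr = arr.astype(int)
print(newarr)         # [1 2 3]
print(newarr.dtype)   # the platform's default integer type
```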



#### NumPy Array Copy vs View

- The main difference between a copy and a view of an array is that the copy is a new array, and the view is just a view of the original array

- The copy owns the data and any changes made to the copy will not affect original array, and any changes made to the original array will not affect the copy

- The view does not own the data and any changes made to the view will affect the original array, and any changes made to the original array will affect the view

In [59]:
## Making a copy, changing the original array, and displaying both arrays

arr = np.array([1, 2, 3, 4, 5])
x=arr.copy()
arr[0]=42
print(arr)
print(x)

[42  2  3  4  5]
[1 2 3 4 5]


In [60]:
## Making a view, changing the original array, and displaying both arrays

arr = np.array([1, 2, 3, 4, 5])
x=arr.view()
arr[0]=42
print(arr)
print(x)

[42  2  3  4  5]
[42  2  3  4  5]


In [61]:
## Making a view, changing the view, and displaying both arrays

arr = np.array([1, 2, 3, 4, 5])
x=arr.view()
x[0]=31

print(arr)
print(x)

[31  2  3  4  5]
[31  2  3  4  5]
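

To check whether an array owns its data, we can look at its `base` attribute. The sketch below uses illustrative variable names: `base` is None for an array that owns its data and refers to the original array for a view.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
c = arr.copy()
v = arr.view()

print(c.base)          # None: the copy owns its data
print(v.base is arr)   # True: the view uses arr's data
```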


#### Shape of an Array

- The shape of an array is the number of elements in each dimension
- NumPy arrays have an attribute called shape that returns a tuple in which each entry is the number of elements along the corresponding dimension

In [62]:
## Printing the shape of a 2-D array

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print (arr.shape)

(2, 4)


The example above returns (2, 4), which means that the array has 2 dimensions: the first dimension has 2 elements (the rows) and the second has 4 elements (the values in each row).

In [63]:
arr = np.array([1, 2, 3, 4], ndmin=5)
print(arr)
print('shape of array :', arr.shape)

[[[[[1 2 3 4]]]]]
shape of array : (1, 1, 1, 1, 4)


The integer at each index gives the number of elements in the corresponding dimension.

In the above case the value at index 4 is 4, so the 5th dimension (index 4 + 1) has 4 elements.

#### Reshaping Arrays

- Reshaping means changing the shape of an array
- By reshaping we can add or remove dimensions or change the number of elements in each dimension

In [64]:
## Converting a 1-D array with 12 elements into a 2-D array
## such that the outermost dimension will have 4 arrays, each with 3 elements

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr= arr.reshape(4,3)
print(newarr)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


In [65]:
## Converting a 1-D array with 12 elements into a 3-D array
## such that the outermost dimension has 2 arrays, each containing 3 arrays with 2 elements

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr1=arr.reshape(2,3,2)
print(newarr1)

[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]]


Note: We can reshape an array into any shape as long as the total number of elements is the same in both shapes

For example, we can reshape an 8-element 1-D array into a 2-D array with 2 rows of 4 elements, but not into a 2-D array with 3 rows of 3 elements, as that would require 3x3 = 9 elements
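
The rule can be checked with a short sketch. It also uses -1 for one dimension, which asks NumPy to infer that size from the total number of elements. The variable names are illustrative.

```python
import numpy as np

arr = np.arange(8)              # 8 elements: 0..7

ok = arr.reshape(2, 4)          # 2 * 4 == 8, so this works
inferred = arr.reshape(2, -1)   # -1 is inferred as 4

try:
    arr.reshape(3, 3)           # 3 * 3 == 9 != 8
    failed = False
except ValueError as exc:
    failed = True
    print("reshape failed:", exc)
```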

In [68]:
arr=np.array([2,3,4,5,6,7,8,9])
newarr2=arr.reshape(1,2,4)
print(newarr2)

[[[2 3 4 5]
  [6 7 8 9]]]


In [69]:
## Flattening array - converting a multidimensional array into a 1D array

arr = np.array([[1, 2, 3], [4, 5, 6]])
newarr=arr.reshape(-1)
print(newarr)

[1 2 3 4 5 6]


In [70]:
arr.flatten(order='C')  # 'C' flattens in row-major (C-style) order
                        # 'F' flattens in column-major (Fortran-style) order

array([1, 2, 3, 4, 5, 6])

In [71]:
arr.flatten(order='F')

array([1, 4, 2, 5, 3, 6])

#### Sorting Arrays

The sort() function arranges the elements of a NumPy ndarray in an ordered sequence

In [72]:
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))


[0 1 2 3]


The np.sort() function returns a sorted copy of the array, leaving the original array unchanged

In [73]:
## Sorting the array alphabetically

arr = np.array(['banana', 'cherry', 'apple'])
print(np.sort(arr))

['apple' 'banana' 'cherry']


In [74]:
## Using np.sort() on a 2-D array sorts each row separately (along the last axis)

arr = np.array([[3, 2, 4], [5, 0, 1]])
print(np.sort(arr))

[[2 3 4]
 [0 1 5]]


Note: np.sort() has no option for descending order; a common workaround is to reverse the sorted result, for example np.sort(arr)[::-1]
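
A minimal sketch of sorting in descending order by reversing the ascending result with a slice:

```python
import numpy as np

arr = np.array([3, 2, 0, 1])

# Sort ascending, then reverse with a step of -1
descending = np.sort(arr)[::-1]
print(descending)   # [3 2 1 0]
print(arr)          # [3 2 0 1]  (original unchanged)
```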

#### Searching Arrays

You can search an array for a certain value and get the indexes of the matches using the where() function

In [75]:
## Finding the indexes where the value is 4
arr = np.array([1, 2, 3, 4, 5, 4, 4])
x=np.where(arr ==4)
print(x)

(array([3, 5, 6], dtype=int64),)


In [76]:
## Finding the indexes where the values are even

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
x=np.where(arr%2==0)
print(x)

(array([1, 3, 5, 7], dtype=int64),)
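

The result of where() is a tuple holding one index array per dimension, which is why the outputs above are wrapped in parentheses. The sketch below, with made-up values, takes the first entry of the tuple and uses it to pull out the matching elements.

```python
import numpy as np

arr = np.array([10, 25, 30, 45, 50])

idx = np.where(arr % 2 == 0)[0]   # first (and only) entry of the tuple
print(idx)        # [0 2 4]
print(arr[idx])   # [10 30 50]
```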


#### Joining NumPy Arrays 

- Joining means putting the contents of two or more arrays into a single array
- In SQL we join tables based on a key, whereas in NumPy we join arrays by axes
- We pass a sequence of arrays that we want to join to the concatenate() function, along with the axis; if axis is not explicitly passed, it is taken as 0.


In [78]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr=np.concatenate((arr1,arr2))  #by default axis 0
print(arr)


[1 2 3 4 5 6]


In [87]:
## Joining two 2-D arrays along axis 1 (side by side, so each row gets longer)

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
print(arr1)
print(arr2)
arr=np.concatenate((arr1,arr2),axis=1)  # axis=1 joins column-wise
print(arr)

[[1 2]
 [3 4]]
[[5 6]
 [7 8]]
[[1 2 5 6]
 [3 4 7 8]]


#### Joining Arrays Using Stack Functions

- Stacking is similar to concatenation; the difference is that stacking joins the arrays along a new axis
- For example, stacking two 1-D arrays with axis=0 places them one over the other as the rows of a 2-D array, while axis=1 places them side by side as its columns
- We pass a sequence of arrays that we want to join to the stack() function along with the axis; if axis is not explicitly passed it is taken as 0

In [83]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
print(arr1)
print(arr2)
arr=np.stack((arr1,arr2),axis=1)
print(arr)

[1 2 3]
[4 5 6]
[[1 4]
 [2 5]
 [3 6]]


np.hstack combines NumPy arrays horizontally and np.vstack combines arrays vertically

In [85]:
## Stacking horizontally - hstack()

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
print(arr1)
print(arr2)
arr=np.hstack((arr1,arr2))
print(arr)

[1 2 3]
[4 5 6]
[1 2 3 4 5 6]


In [86]:
## Stacking vertically - vstack()

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

print(arr1)
print(arr2)
arr=np.vstack((arr1,arr2))
print(arr)

[1 2 3]
[4 5 6]
[[1 2 3]
 [4 5 6]]


#### Splitting NumPy Arrays

- Splitting is the reverse operation of joining: it breaks one array into multiple arrays
- We use array_split() for splitting arrays, passing it the array we want to split and the number of splits

In [88]:
## Splitting the array in 3 parts

arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr,3)    #np.array_split(obj,no of parts)
print(newarr)

[array([1, 2]), array([3, 4]), array([5, 6])]


The return value is a list containing three arrays

In [90]:
## If the elements cannot be divided evenly, the later sub-arrays get one element fewer
## Splitting the array in 4 parts

arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr,4)    #np.array_split(obj,no of parts)
print(newarr)

[array([1, 2]), array([3, 4]), array([5]), array([6])]


A split() function is also available, but it does not adjust when the elements cannot be divided evenly: in the example above array_split() worked, whereas split() would raise an error.
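
The difference can be sketched as follows: split() works when the array divides evenly and raises a ValueError otherwise. The variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

equal_parts = np.split(arr, 3)   # 6 elements divide evenly into 3 parts
print(equal_parts)

try:
    np.split(arr, 4)             # 6 elements cannot be divided evenly into 4
    raised = False
except ValueError as exc:
    raised = True
    print("split failed:", exc)
```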

In [None]:
## The return value of the array_split() function is a list containing each of the splits as an array
## If you split an array into 3 arrays, you can access them from the result by index, like any list element

arr = np.array([1, 2, 3, 4, 5, 6])
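
The cell above stops after creating the array. A minimal completion, assuming the intent was to split it into three parts and index the result, might look like this:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 3)

# Each split is reached by its position in the returned list
print(newarr[0])   # [1 2]
print(newarr[1])   # [3 4]
print(newarr[2])   # [5 6]
```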


In [None]:
## Splitting a 2-D array into three 2-D arrays

arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
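
This cell is also left unfinished. The sketch below assumes the intended call was array_split() with 3 parts; by default the split is made along axis 0, so each piece keeps two of the six rows.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
newarr = np.array_split(arr, 3)   # splits along axis 0 (rows) by default

for part in newarr:
    print(part)                   # each part is a 2-D array with 2 rows
```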


#### NumPy Arithmetic Operations

Input arrays for arithmetic operations such as add(), subtract(), multiply(), and divide() must either have the same shape or conform to the array broadcasting rules
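
As a small illustration of broadcasting, the sketch below combines a 2-D array with a 1-D array and with a scalar. The shapes differ, but the smaller operand is stretched to match the larger one. The values are made up for the example.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])             # shape (3,)

added = np.add(arr, row)         # row is applied to every row of arr
doubled = np.multiply(arr, 2)    # the scalar is applied to every element

print(added)     # rows: [11 22 33] and [14 25 36]
print(doubled)   # rows: [2 4 6] and [8 10 12]
```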

In [92]:
arr1 = np.array([[1,2,3],[4,5,6]])
arr2 = np.array([[7,8,9],[10,11,12]])

In [93]:
## Adding the two arrays
print(np.add(arr1,arr2))


[[ 8 10 12]
 [14 16 18]]


In [94]:
## Subtracting one array from the other
print(np.subtract(arr1,arr2))


[[-6 -6 -6]
 [-6 -6 -6]]


In [95]:
## Multiplying the two arrays
print(np.multiply(arr1,arr2))


[[ 7 16 27]
 [40 55 72]]


In [96]:
## Dividing one array by the other
print(np.divide(arr1,arr2))


[[0.14285714 0.25       0.33333333]
 [0.4        0.45454545 0.5       ]]


##### numpy.power()

This function treats the elements of the first input array as bases and returns each one raised to the power of the corresponding element in the second input array.

In [104]:
a = np.array([10,100,1000])
print(np.power(a,2))
print(np.power(a,3))
print(np.power(a,4))

[    100   10000 1000000]
[      1000    1000000 1000000000]
[     10000  100000000 -727379968]
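

The negative value in the last line is an integer overflow: 1000 to the power 4 is 10^12, which does not fit in a 32-bit integer, so the result wraps around. The sketch below sets the dtype explicitly to show both behaviours; it assumes the output above came from a platform where the default integer type is int32.

```python
import numpy as np

a32 = np.array([10, 100, 1000], dtype=np.int32)
a64 = np.array([10, 100, 1000], dtype=np.int64)

p32 = np.power(a32, 4)   # last element wraps around in 32 bits
p64 = np.power(a64, 4)   # 64 bits are enough to hold 10**12

print(p32)
print(p64)
```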


In [98]:
b = np.array([1,2,3])
print(np.power(a,b))


[        10      10000 1000000000]


##### numpy.mod()

Returns the remainder of the division of the corresponding elements in the input arrays; the function numpy.remainder() produces the same result

In [99]:
a = np.array([10,20,30]) 
b = np.array([3,5,7])

print("Applying mod() function: ", np.mod(a,b))
print("Applying remainder() function: ", np.remainder(a,b))



Applying mod() function:  [1 0 2]
Applying remainder() function:  [1 0 2]


#### NumPy Matrix Operations

- A matrix is a specialized 2-D array that retains its 2-D nature through operations
- Some popular matrix operations include addition, multiplication, transpose, determinant, rank and so on
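
Determinant and rank are listed above but not shown in the cells that follow. A minimal sketch using the numpy.linalg functions `det` and `matrix_rank`, with the same matrix A as the addition example:

```python
import numpy as np

A = np.array([[2, 4], [5, -6]])

det = np.linalg.det(A)            # 2*(-6) - 4*5 = -32, up to float rounding
rank = np.linalg.matrix_rank(A)   # 2, since the determinant is non-zero

print(round(det, 6))
print(rank)
```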

In [106]:
## Addition of two matrices

A = np.array([[2, 4], [5, -6]])
B = np.array([[9, -3], [3, 6]])
print(A)
print(B)
P=A+B  #np.add(A,B)
print(P)

[[ 2  4]
 [ 5 -6]]
[[ 9 -3]
 [ 3  6]]
[[11  1]
 [ 8  0]]


In [107]:
## Matrix multiplication
c=np.dot(A,B)
print(c)

[[ 30  18]
 [ 27 -51]]
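

For 2-D arrays, the `@` operator gives the same result as np.dot(), while `*` multiplies element by element. The sketch below contrasts the two, using the same A and B as above.

```python
import numpy as np

A = np.array([[2, 4], [5, -6]])
B = np.array([[9, -3], [3, 6]])

matrix_product = A @ B   # same as np.dot(A, B) for 2-D arrays
elementwise = A * B      # multiplies matching positions only

print(matrix_product)    # rows: [30 18] and [27 -51]
print(elementwise)       # rows: [18 -12] and [15 -36]
```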


In [109]:
## Transpose of a matrix

arr = np.array([[1, 1], [2, 1], [3, -3]])
print(arr.T)


[[ 1  2  3]
 [ 1  1 -3]]


In [None]:
## Alternative method of transposition
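
The cell above is empty. One alternative, assumed here to be what was intended, is the np.transpose() function, which gives the same result as the `.T` attribute.

```python
import numpy as np

arr = np.array([[1, 1], [2, 1], [3, -3]])

transposed = np.transpose(arr)   # equivalent to arr.T
print(transposed)                # rows: [1 2 3] and [1 1 -3]
```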

In [110]:
## Square root of each matrix element

arr = np.array([[2,4,9],[121,100,8]])
print(np.sqrt(arr))

[[ 1.41421356  2.          3.        ]
 [11.         10.          2.82842712]]
