# NumPy

Whenever we need to analyze a dataset in Python, NumPy is usually one of the first libraries we reach for.

NumPy is a fundamental library for scientific computing. It provides support for arrays and matrices, along with a collection of mathematical operations on these data structures. In this lesson, we cover the basics of NumPy and its arrays, which are implemented in C and exposed to Python so that we can use them like any other Python object. We also learn about vectorized operations.

In [None]:
import numpy as np

## Creating arrays
## Create a 1-Dimension Array

## arr1 = np.array(iterable)
arr1 = np.array([1,2,3,4,5,6])
print(arr1,type(arr1))

[1 2 3 4 5 6] <class 'numpy.ndarray'>


### Shape of the array

In [None]:
print(arr1.shape)

(6,)


### Reshape array

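The reshape method returns an array with the same data arranged in a new shape. The new shape must hold exactly the same number of elements as the original. Here it turns a 1-D array of 6 elements into 2-D arrays of shape (1, 6) and (6, 1).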

In [None]:
arr1.reshape(1,6)

array([[1, 2, 3, 4, 5, 6]])

In [None]:
arr1.reshape(6,1)

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6]])
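
The reshape rule described above can be sketched as follows. This is a minimal illustration, not part of the original notebook; the names `two_rows`, `three_rows`, and `reshape_failed` are made up for the example. It shows that `-1` lets NumPy infer one dimension, that the original array keeps its shape, and that a shape with the wrong element count is rejected.

```python
import numpy as np

arr1 = np.array([1, 2, 3, 4, 5, 6])

# -1 asks NumPy to infer that dimension from the total element count (6).
two_rows = arr1.reshape(2, -1)    # inferred shape: (2, 3)
three_rows = arr1.reshape(-1, 2)  # inferred shape: (3, 2)
print(two_rows.shape, three_rows.shape)

# reshape does not change arr1 itself; its shape is still (6,).
print(arr1.shape)

# A target shape whose product is not 6 raises ValueError.
try:
    arr1.reshape(4, 2)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print("reshape(4, 2) failed:", reshape_failed)
```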

### Flattening an array into 1-D

In [None]:
List_2D = [[1,2,3],[4,5,6]]
print(List_2D,type(List_2D))
arr = np.array(List_2D)
print(arr,type(arr))
print(arr.reshape(-1))

[[1, 2, 3], [4, 5, 6]] <class 'list'>
[[1 2 3]
 [4 5 6]] <class 'numpy.ndarray'>
[1 2 3 4 5 6]


### Check dimensions

In [None]:
arr.ndim

2

### Convert into 1-D

In [None]:
arr = arr.reshape(-1)
print(arr,type(arr),arr.ndim)

[1 2 3 4 5 6] <class 'numpy.ndarray'> 1


### arange function

In [37]:
np.arange(0,10,2)

array([0, 2, 4, 6, 8])

In [38]:
np.arange(0,10,2).reshape(5,1)

array([[0],
       [2],
       [4],
       [6],
       [8]])
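
A short sketch of how the `start`, `stop`, and `step` arguments of `np.arange` behave, assuming integer arguments as in the cells above. The variable names are illustrative only. The key point is that `stop` itself is never included.

```python
import numpy as np

# stop (10) is excluded, so the last value is 8.
evens = np.arange(0, 10, 2)

# To include 10, the stop value has to be moved past it.
evens_inclusive = np.arange(0, 11, 2)

# With a single argument, start defaults to 0 and step to 1.
first_five = np.arange(5)

print(evens, evens_inclusive, first_five)
```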

In [39]:
np.ones((3,4))

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [40]:
## identity matrix
np.identity(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [42]:
## array
arr = np.array([[1,2,3],[4,5,6]])

print("Array:\n",arr)
print("Shape:",arr.shape) #Output : (2,3)
print("Number of dimensions:",arr.ndim) #Output : 2
print("Size (number of elements):",arr.size)  #Output : 6
print("Data type:",arr.dtype) #Output : int64 (based on platform)
print("Item size (in bytes):",arr.itemsize) #Output 8 (based on platform)

Array:
 [[1 2 3]
 [4 5 6]]
Shape: (2, 3)
Number of dimensions: 2
Size (number of elements): 6
Data type: int64
Item size (in bytes): 8


### Vectorized operations

In [44]:
## Addition operation
arr1 = np.array([1,2,3,4,5,6])
arr2 = np.array([10,20,30,40,50,60])

## Element-wise addition
print("Addition operation:",arr1+arr2)

Addition operation: [11 22 33 44 55 66]


In [45]:
## Element wise subtraction
print("Subtraction operation",arr1-arr2)

Subtraction operation [ -9 -18 -27 -36 -45 -54]


In [46]:
## Element wise multiplication
print("Multiplication operation",arr1*arr2)

Multiplication operation [ 10  40  90 160 250 360]


In [47]:
## Element wise division
print("Division operation",arr1/arr2)

Division operation [0.1 0.1 0.1 0.1 0.1 0.1]


In [48]:
## Element-wise integer (floor) division
print("Int Division Operation",arr1//arr2)

Int Division Operation [0 0 0 0 0 0]


In [50]:
## Element wise modulus operation
print("Modulus operation",arr2 % arr1)

Modulus operation [0 0 0 0 0 0]
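
To make "element-wise" concrete, the sketch below compares an explicit Python loop with the vectorized expression. It is an illustration only; `loop_sum`, `vector_sum`, and `scaled` are names chosen for this example. Both approaches give the same values, but the vectorized form hands the loop to NumPy's compiled code.

```python
import numpy as np

arr1 = np.array([1, 2, 3, 4, 5, 6])
arr2 = np.array([10, 20, 30, 40, 50, 60])

# Explicit loop over pairs of plain Python integers.
loop_sum = [a + b for a, b in zip(arr1.tolist(), arr2.tolist())]

# Vectorized form: one expression, applied to every pair of elements.
vector_sum = arr1 + arr2

print(loop_sum)
print(vector_sum.tolist())

# A scalar operand is applied to every element of the array.
scaled = arr1 * 10
print(scaled)
```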


### Universal functions

In [51]:
## Square Root
arr = np.array([1,2,3,4,5])
print(np.sqrt(arr))

[1.         1.41421356 1.73205081 2.         2.23606798]


In [52]:
##  Exponential
print(np.exp(arr))

[  2.71828183   7.3890561   20.08553692  54.59815003 148.4131591 ]


In [54]:
## sine
print(np.sin(arr))

[ 0.84147098  0.90929743  0.14112001 -0.7568025  -0.95892427]


In [55]:
## natural log
print(np.log(arr))

[0.         0.69314718 1.09861229 1.38629436 1.60943791]
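
Universal functions apply one operation to every element and return a new array. The sketch below is a small sanity check under that assumption, not part of the original notebook. It confirms that `np.sqrt` returns floats for integer input and that `np.log` undoes `np.exp` up to floating-point rounding. The names `roots` and `round_trip` are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Integer input, floating-point output.
roots = np.sqrt(arr)
print(roots.dtype)

# log is the inverse of exp, so the round trip recovers arr
# (compared with a tolerance because of floating-point rounding).
round_trip = np.log(np.exp(arr))
print(np.allclose(round_trip, arr))

# Squaring the square roots also recovers the original values.
print(np.allclose(roots ** 2, arr))
```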


### Array Slicing and Indexing

In [58]:
arr = [[1,2,3],[4,5,6],[7,8,9]]
print(arr,type(arr))

[[1, 2, 3], [4, 5, 6], [7, 8, 9]] <class 'list'>


In [59]:
arr = np.array(arr)

Converting the nested list into an array, as in the cell above, sets the foundation for machine learning models: we provide them with arrays rather than 2-D lists because every data point must contain the same number of features (columns).

In [62]:
## Accessing elements
print(arr[0])
print(arr[0][1])
print(arr[2][1])
print(arr[1][2])
print(arr[0][2])
print(arr[2][2])

[1 2 3]
2
8
6
3
9


In [63]:
## Slicing
print(arr[1:,2:])

[[6]
 [9]]


In [65]:
print(arr[1:][:])

[[4 5 6]
 [7 8 9]]
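
A brief sketch comparing the two indexing styles, using a fresh copy of the same 3x3 array. The variable names are illustrative. Chained indexing such as `arr[2][1]` and tuple indexing such as `arr[2, 1]` reach the same element, but the chained slice `arr[1:][:]` slices the rows twice and never selects columns. A column selection needs the comma form.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Chained and tuple indexing return the same single element.
same_element = bool(arr[2][1] == arr[2, 1])

# arr[1:][:] is just arr[1:] sliced again along the rows.
chained_rows = arr[1:][:]
rows_match = np.array_equal(chained_rows, arr[1:, :])

# The comma form slices rows and columns independently.
second_column = arr[:, 1]
bottom_right = arr[1:, 1:]

print(same_element, rows_match)
print(second_column)
print(bottom_right)
```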


In [66]:
arr[0,0] = 100
print(arr,type(arr))

[[100   2   3]
 [  4   5   6]
 [  7   8   9]] <class 'numpy.ndarray'>


In [67]:
arr[1:]=100
print(arr)

[[100   2   3]
 [100 100 100]
 [100 100 100]]


### Statistical Concepts -- Normalization

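Standardized data (often loosely called normalized data) has a mean of 0 and a standard deviation of 1. We obtain it by subtracting the mean from every value and dividing the result by the standard deviation.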

In [68]:
## Create an array
data = np.array([1,2,3,4,5,6])

## Find the mean and standard deviation of the data
mean = np.mean(data)
std_deviation = np.std(data)

## normalize the data
normalized_data = (data - mean) / std_deviation
print(normalized_data,type(normalized_data))

[-1.46385011 -0.87831007 -0.29277002  0.29277002  0.87831007  1.46385011] <class 'numpy.ndarray'>
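
As a quick check of the definition, the sketch below recomputes the standardized values and confirms that their mean is approximately 0 and their standard deviation approximately 1. The comparison uses `np.isclose` because floating-point arithmetic rarely gives exactly 0 or 1. Note that `np.std` uses the population formula (divides by n) by default.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6])
normalized_data = (data - np.mean(data)) / np.std(data)

# Both checks use a tolerance rather than exact equality.
mean_is_zero = bool(np.isclose(normalized_data.mean(), 0.0))
std_is_one = bool(np.isclose(normalized_data.std(), 1.0))

print(mean_is_zero, std_is_one)
```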


### Important Statistical Information

In [69]:
## Median of our data
median = np.median(data)
print(f"Median of our data is {median}")

## Variance
variance = np.var(data)
print(f"Variance of our data is {variance}")

Median of our data is 3.5
Variance of our data is 2.9166666666666665
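
The variance printed above is the population variance, which is the default for `np.var`. The sketch below shows how it relates to the standard deviation and how the `ddof` parameter switches to the sample variance. It is a side illustration, and the names `population_var` and `sample_var` are chosen for this example.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6])

# Default (ddof=0): sum of squared deviations divided by n.
population_var = np.var(data)

# ddof=1: divided by n - 1 instead, the usual sample variance.
sample_var = np.var(data, ddof=1)

# Variance is the square of the standard deviation.
matches_std = bool(np.isclose(population_var, np.std(data) ** 2))

print(population_var, sample_var, matches_std)
```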


In [70]:
## Logical Operation
data > 5

array([False, False, False, False, False,  True])

In [71]:
data[data>5]

array([6])
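
The comparison `data > 5` produces a boolean mask, and indexing with that mask keeps only the elements where it is True. The sketch below extends the same idea a little, as an illustration only: conditions can be combined with `&` (each side in parentheses), and summing a mask counts the matches. The names `in_range`, `evens`, and `count_above_five` are hypothetical.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6])

# Combine two conditions element-wise; the parentheses are required.
in_range = data[(data > 2) & (data < 6)]

# Any element-wise expression that yields booleans can act as a mask.
evens = data[data % 2 == 0]

# True counts as 1, so summing a mask counts the matching elements.
count_above_five = int(np.sum(data > 5))

print(in_range, evens, count_above_five)
```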