<a href="https://colab.research.google.com/github/palash04/Artificial-Intelligence/blob/master/NumpyBasics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# NumPy
NumPy (Numerical Python) is a powerful and widely used library for storing numerical data and computing with it. It provides data structures, algorithms, and other utilities for this purpose. For example, the library contains basic linear algebra functions, Fourier transforms, and random number generation. It can also be used to load data into Python and export it.

## Why NumPy?
Data comes in all shapes and sizes. We can have image data, audio data, text data, numerical data, etc. We have all these heterogeneous sources of data, but computers understand only 0’s and 1’s. At its core, data can be thought of as arrays of numbers. In fact, the prerequisite for performing any data analysis is to convert the data into numerical form. This means it is important to be able to store and manipulate arrays efficiently, and this is where Python’s NumPy package comes into the picture.

NumPy arrays are like Python’s lists. Their advantage is that they provide more efficient storage and data operations as the arrays grow larger in size. This is the reason NumPy arrays are at the core of nearly all data science tools in Python. This, in turn, implies that it is essential to know NumPy well!
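
To make the comparison with lists concrete, here is a minimal sketch. The names `values`, `doubled_list`, `arr`, and `doubled_arr` are made up for illustration. It shows that an arithmetic operator acts on every element of an array, while the same operator on a list does something different (repetition).

```python
import numpy as np

# Plain Python list: each element is handled in an explicit loop
values = [1, 2, 3, 4, 5]
doubled_list = [v * 2 for v in values]

# NumPy array: the multiplication is applied to every element at once
arr = np.array(values)
doubled_arr = arr * 2

print(doubled_list)   # [2, 4, 6, 8, 10]
print(doubled_arr)    # [ 2  4  6  8 10]

# The same operator on a list repeats the list instead of doubling the values
print(values * 2)     # [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

# An array stores one fixed-size item per element
print(arr.nbytes == arr.size * arr.itemsize)  # True
```

This only illustrates the difference in behaviour; how much faster or smaller an array is than a list depends on the data and the machine.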




# Creating Arrays

In [1]:
import numpy as np

# There are two common ways to create NumPy arrays
# a. Arrays from lists

# Create a 1D array from the list 
arr1 = np.array([1,2,3,4])
print(arr1)
print(type(arr1))
print()

# Create a 1D float array
arr2 = np.array([1,2,3,4], dtype='float32')
print(arr2)
print(type(arr2))
print()

# Create a 2D array from the list of lists
lists = [[1,2,3],[4,5,6],[6,7,8]]
arr2d = np.array(lists)
print(arr2d)
print()

arr1 = np.array([1,2,3,4])
print(arr1)

# Vector element-wise operations
print(arr1 * 2)
print(arr1 + 2)
print(arr2 * arr1)

# b. Arrays from scratch

# Create an integer array of size 100 filled with zeros
np.zeros(shape=100, dtype='int')

# Create a 3x3 floating point array filled with ones
np.ones(shape=(3,3), dtype='float32')

# Create an array filled with a linear sequence
# Starting at 0, stepping by 3, and stopping before 20 (the end value is excluded)
# (this is similar to the built-in range() function)
np.arange(0,20,3)

# Create an array of 100 values evenly spaced between 0 and 1
np.linspace(0,1,100)

# Create a 3x3 array of uniformly distributed random values between 0 and 1
np.random.random(size=(3,3))

# Create a 3x3 array of random integers in the interval [0,10)
np.random.randint(0,10,size=(3,3))

# Create a 3x3 array of normally distributed random values with mean 0 and standard deviation 1
np.random.normal(0,1,(3,3))

# 1D array of 6 random integers in the interval [0,10)
np.random.randint(10, size=6)

# 2D (3x3) array of random integers in the interval [0,10)
np.random.randint(10, size=(3,3))

# 3D (3x4x5) array of random integers in the interval [0,10)
np.random.randint(10, size=(3,4,5))


[1 2 3 4]
<class 'numpy.ndarray'>

[1. 2. 3. 4.]
<class 'numpy.ndarray'>

[[1 2 3]
 [4 5 6]
 [6 7 8]]

[1 2 3 4]
[2 4 6 8]
[3 4 5 6]
[ 1.  4.  9. 16.]


array([[[1, 1, 7, 7, 2],
        [6, 6, 1, 4, 2],
        [8, 6, 1, 3, 0],
        [5, 8, 3, 5, 0]],

       [[0, 3, 6, 0, 2],
        [8, 7, 3, 9, 8],
        [1, 5, 9, 1, 6],
        [3, 3, 8, 0, 5]],

       [[4, 6, 1, 4, 8],
        [5, 3, 3, 2, 7],
        [3, 6, 3, 5, 5],
        [4, 8, 8, 9, 5]]])
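
A few details of the creation functions above are easy to miss, so here is a small sketch of them. The variable names and the seed value 42 are arbitrary choices for illustration. It shows that `np.arange` excludes its stop value while `np.linspace` includes it, that `np.array` picks a dtype wide enough for all items, and that seeding the legacy random generator makes the "random" arrays repeatable.

```python
import numpy as np

# arange excludes the stop value, like range()
a = np.arange(0, 20, 3)
print(a)                    # [ 0  3  6  9 12 15 18]

# linspace includes both endpoints by default
l = np.linspace(0, 1, 5)
print(l)                    # [0.   0.25 0.5  0.75 1.  ]

# Mixing ints and floats in the input list gives a float array
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)          # float64

# Re-seeding with the same value reproduces the same random integers
np.random.seed(42)
first = np.random.randint(0, 10, size=(3, 3))
np.random.seed(42)
second = np.random.randint(0, 10, size=(3, 3))
print(np.array_equal(first, second))  # True
```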

# Array Attributes

Each array has the following attributes:

- ndim: the number of dimensions
- shape: the size of each dimension
- size: the total number of elements in the array
- dtype: the data type of the array
- itemsize: the size (in bytes) of each array element
- nbytes: the total size (in bytes) of the array

In [2]:
import numpy as np

# Create a 3x3 array of random integers in the interval [0,10)
x = np.random.randint(0,10, size=(3,3))

print(f'ndim: {x.ndim}')
print(f'shape: {x.shape}')
print(f'x size: {x.size}')
print(f'dtype: {x.dtype}')
print(f'itemsize: {x.itemsize} bytes')
print(f'nbytes: {x.nbytes} bytes')


ndim: 2
shape: (3, 3)
x size: 9
dtype: int64
itemsize: 8 bytes
nbytes: 72 bytes
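
The attributes are related to each other, which the following sketch checks on a small array. The dtype is set explicitly here because the default integer type can differ between platforms; the names `x` and `small` are only illustrative.

```python
import numpy as np

# A 3x4 array of 64-bit integers (dtype chosen explicitly for this example)
x = np.zeros(shape=(3, 4), dtype='int64')

# size is the product of the entries in shape
print(x.size == int(np.prod(x.shape)))     # True

# nbytes is the number of elements times the bytes per element
print(x.nbytes == x.size * x.itemsize)     # True

# A smaller dtype shrinks itemsize and therefore nbytes
small = x.astype('int8')
print(small.itemsize, small.nbytes)        # 1 12
```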


# Array indexing and slicing
Indexing in NumPy is similar to Python’s standard list indexing. In a 1D array, we can access the ith value by specifying the index of the element we need in square brackets. One important thing to remember here is that indexing in Python starts at zero.

In [3]:
##################
# Array Indexing #
##################

# Input array
x = np.array([1,2,3,4,5,6])

# Access the first value of x
x[0]

# Access the third value of x
x[2]

# Use negative indices to index from the end of the array

# Get the last value of x
print (x[-1])

# Get the second last value of x
print (x[-2])

# If we have a multidimensional array, and want to access items based on both column and row, 
# we can pass the row and column indices at the same time using a comma-separated tuple as shown in the examples below.

x2 = np.array([[3,2,5,5],[0,1,5,8],[3,0,5,0]])
print (x2)

# Get value in 3rd row and 4th column
print (x2[2,3])

# Get the value in the 3rd row and the last column of x2
print (x2[2,-1])

# Replace the value in the 1st row and 1st column of x2 with 1
x2[0,0] = 1
print (x2) 

#################
# Array slicing #
#################

# Slicing is a way to access subarrays, i.e., a range of elements from an array instead of individual items.
# In other words, slicing lets you get and set smaller subsets of items within a larger array.
# As with indexing, we use square brackets, but now with the slice notation ":" inside them.
# For an array x the general form is x[start:stop:step]: start is included, stop is excluded,
# and any of the three can be omitted.

# Input array

x1 = np.arange(10)
print(x1)

# print the first 5 elements of the array
print (x1[:5])

# print elements from index 4 to the end
print (x1[4:])

# print elements at indices 4 through 6 (the stop index 7 is excluded)
print (x1[4:7])

# print elements at even indices
print (x1[ : : 2])

# print elements at odd indices (every other element starting at index 1)
print (x1[1::2])

# reverse the array
print (x1[::-1])

# print every other element in reverse order, starting from index 5
print (x1[5::-2])

# Multi-dimensional slices

x2 = np.array([[0,1,2],[3,4,5],[6,7,8]])

# Extract the first two rows and first two columns
print (x2[ :2 , :2])

# all rows and every other column
print (x2[ : , ::2 ])


6
5
[[3 2 5 5]
 [0 1 5 8]
 [3 0 5 0]]
0
0
[[1 2 5 5]
 [0 1 5 8]
 [3 0 5 0]]
[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4]
[4 5 6 7 8 9]
[4 5 6]
[0 2 4 6 8]
[1 3 5 7 9]
[9 8 7 6 5 4 3 2 1 0]
[5 3 1]
[[0 1]
 [3 4]]
[[0 2]
 [3 5]
 [6 8]]
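
The slicing comments mention that slices can be used to set values as well as get them. Here is a minimal sketch of that, with illustrative names (`sub`, `independent`). It also shows that a basic slice is a view on the original array, so writing through it changes the original unless you call `.copy()` first.

```python
import numpy as np

# Assign to a range of elements through a slice
x = np.arange(10)
x[2:5] = 0
print(x)               # [0 1 0 0 0 5 6 7 8 9]

grid = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])

# A basic slice is a view: changing it changes grid as well
sub = grid[:2, :2]
sub[0, 0] = 99
print(grid[0, 0])      # 99

# An explicit copy is independent of the original
independent = grid[:2, :2].copy()
independent[0, 0] = -1
print(grid[0, 0])      # 99
```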


# Reshaping of array

Reshaping changes how the items of an array are arranged, so the shape (and possibly the number of dimensions) changes while the total number of elements stays the same. For example, you can use it to convert a 1D array into a 2D array.

Reshaping is a very useful operation and it can easily be done using the reshape() method. Since a picture speaks a thousand words, let’s see the effects of reshaping visually:

![Screenshot 2020-10-21 at 16 07 16](https://user-images.githubusercontent.com/26361028/96708880-7e123f00-13b7-11eb-832c-c0e6f0b8c943.png)



In [4]:
import numpy as np

# create a 3x3 grid with numbers 1 to 9
x = np.arange(1,10).reshape((3,3))
print(x)
print()

x = np.array([1,2,3])
print(x)
print(x.shape)
print()

# row vector via reshape
x_rv = x.reshape((1,3))
print(x_rv)
print(x_rv.shape)
print()

# column vector via reshape
x_cv = x.reshape((3,1))
print(x_cv)
print(x_cv.shape)
print()

[[1 2 3]
 [4 5 6]
 [7 8 9]]

[1 2 3]
(3,)

[[1 2 3]]
(1, 3)

[[1]
 [2]
 [3]]
(3, 1)
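
Because reshaping only rearranges the existing items, the new shape has to account for exactly the same number of elements. The sketch below uses illustrative names (`grid`, `auto`) to show this, along with the `-1` placeholder that lets NumPy work out one dimension for you.

```python
import numpy as np

x = np.arange(12)

# 12 elements fit a 3x4 grid
grid = x.reshape((3, 4))
print(grid.shape, grid.size)   # (3, 4) 12

# -1 asks NumPy to infer that dimension from the total size
auto = x.reshape((2, -1))
print(auto.shape)              # (2, 6)

# A shape that needs 15 elements cannot be built from 12
try:
    x.reshape((5, 3))
except ValueError as err:
    print(f'reshape failed: {err}')
```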



# Concatenation of Arrays

In [5]:
# Concatenate two or more arrays at once
x = np.array([1,2,3])
y = np.array([4,5,6])
z = np.array([11,11,11])

np.concatenate([x,y,z])

# Concatenate 2d arrays
x = np.array([[1,2,3],[4,5,6],[11,11,11]])
np.concatenate([x,x])

x = np.array([3,4,5])
grid = np.array([[1,2,3],[9,10,11]])

np.vstack([x,grid]) # vertically stack the arrays

z = np.array([[19],[19]])
np.hstack([grid,z])  # horizontally stack the arrays


array([[ 1,  2,  3, 19],
       [ 9, 10, 11, 19]])
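
Only the result of the last expression in the cell is displayed above, so here is a short sketch that prints the shapes for each case. It assumes the same `grid` as in the cell and shows that `np.concatenate` joins along the first axis by default, that `axis=1` joins columns instead, and that for 2D arrays these match `np.vstack` and `np.hstack`.

```python
import numpy as np

grid = np.array([[1, 2, 3], [9, 10, 11]])

# axis=0 (the default) adds rows, axis=1 adds columns
rows = np.concatenate([grid, grid], axis=0)
cols = np.concatenate([grid, grid], axis=1)
print(rows.shape)   # (4, 3)
print(cols.shape)   # (2, 6)

# For 2D arrays these match vstack and hstack
print(np.array_equal(rows, np.vstack([grid, grid])))   # True
print(np.array_equal(cols, np.hstack([grid, grid])))   # True
```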

# Splitting

In [6]:
x = np.arange(10)

# Split at indices 3 and 6, giving x[:3], x[3:6] and x[6:]
x1,x2,x3 = np.split(x, [3,6])
print (x1, x2, x3)

[0 1 2] [3 4 5] [6 7 8 9]
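
The second argument of `np.split` can be either a list of split points or a single integer. A small sketch of both forms follows, together with `np.vsplit` and `np.hsplit` for 2D arrays; the names `parts`, `halves`, `upper`, `lower`, `left`, and `right` are illustrative only.

```python
import numpy as np

x = np.arange(10)

# N split points give N + 1 subarrays
parts = np.split(x, [3, 6])
print([p.tolist() for p in parts])    # [[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]]

# An integer asks for that many equal-sized pieces
halves = np.split(x, 2)
print([h.tolist() for h in halves])   # [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]

# vsplit cuts between rows, hsplit cuts between columns
grid = np.arange(16).reshape((4, 4))
upper, lower = np.vsplit(grid, [2])
left, right = np.hsplit(grid, [2])
print(upper.shape, lower.shape)       # (2, 4) (2, 4)
print(left.shape, right.shape)        # (4, 2) (4, 2)
```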


# Computations on NumPy Arrays


In [7]:
import numpy as np

x = np.arange(10)

# Native arithmetic operators
print(f'x = {x}')
print(f'x + 5 = {x + 5}')
print(f'x - 5 = {x - 5}')
print(f'x * 5 = {x * 5}')
print(f'x / 5 = {x / 5}')
print(f'x ** 2 = {x ** 2}')
print(f'x % 2 = {x % 2}')

# Trigonometric functions
theta = np.linspace(0, np.pi, num=4)
print(f'theta = {theta}')
print(f'sin(theta) = {np.sin(theta)}')
print(f'cos(theta) = {np.cos(theta)}')
print(f'tan(theta) = {np.tan(theta)}')

# Logarithms and exponentials
x = [1,2,3]
print(f'x = {x}')
print(f'e^x = {np.exp(x)}')
print(f'2^x = {np.exp2(x)}')
print(f'3^x = {np.power(3,x)}')

print(f'ln(x) = {np.log(x)}')
print(f'log2(x) = {np.log2(x)}')
print(f'log10(x) = {np.log10(x)}')


x = [0 1 2 3 4 5 6 7 8 9]
x + 5 = [ 5  6  7  8  9 10 11 12 13 14]
x - 5 = [-5 -4 -3 -2 -1  0  1  2  3  4]
x * 5 = [ 0  5 10 15 20 25 30 35 40 45]
x / 5 = [0.  0.2 0.4 0.6 0.8 1.  1.2 1.4 1.6 1.8]
x ** 2 = [ 0  1  4  9 16 25 36 49 64 81]
x % 2 = [0 1 0 1 0 1 0 1 0 1]
theta = [0.         1.04719755 2.0943951  3.14159265]
sin(theta) = [0.00000000e+00 8.66025404e-01 8.66025404e-01 1.22464680e-16]
cos(theta) = [ 1.   0.5 -0.5 -1. ]
tan(theta) = [ 0.00000000e+00  1.73205081e+00 -1.73205081e+00 -1.22464680e-16]
x = [1, 2, 3]
e^x = [ 2.71828183  7.3890561  20.08553692]
2^x = [2. 4. 8.]
3^x = [ 3  9 27]
ln(x) = [0.         0.69314718 1.09861229]
log2(x) = [0.        1.        1.5849625]
log10(x) = [0.         0.30103    0.47712125]
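
Notice in the output above that sin(pi) comes out as roughly 1.22e-16 rather than exactly 0, which is a floating-point rounding effect. The sketch below shows one way to compare such results, using `np.isclose` and `np.allclose` with their default tolerances instead of `==`.

```python
import numpy as np

theta = np.linspace(0, np.pi, num=4)
s = np.sin(theta)

# The last value is tiny but not exactly zero
print(s[-1] == 0)             # False

# isclose compares within a small tolerance
print(np.isclose(s[-1], 0))   # True

# allclose checks a whole array: sin^2 + cos^2 should be 1 everywhere
print(np.allclose(np.sin(theta) ** 2 + np.cos(theta) ** 2, 1))   # True
```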


In [8]:
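# reduce: repeatedly apply the operation until a single value remains (here, 1+2+3+4+5)
x = np.arange(1,6)
sum_all = np.add.reduce(x)
print(sum_all)

# accumulate: like reduce, but keep every intermediate result (a running sum)
x = np.arange(1,6)
sum_acc = np.add.accumulate(x)
print(sum_acc)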

15
[ 1  3  6 10 15]
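
The `reduce` and `accumulate` methods are not specific to `np.add`. As a quick sketch, the code below checks that the two results above match `np.sum` and `np.cumsum`, and then applies the same two methods to `np.multiply` to get a product and a running product.

```python
import numpy as np

x = np.arange(1, 6)

# add.reduce gives the same result as np.sum
print(np.add.reduce(x) == np.sum(x))                         # True

# add.accumulate gives the same result as np.cumsum
print(np.array_equal(np.add.accumulate(x), np.cumsum(x)))    # True

# The same methods on multiply: product and running product
print(np.multiply.reduce(x))        # 120
print(np.multiply.accumulate(x))    # [  1   2   6  24 120]
```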


# Aggregations

In [9]:
import numpy as np

x = np.random.random(100)

print(f'Sum of values is: {np.sum(x)}')
print(f'Mean value is: {np.mean(x)}')
print(f'Max value is: {np.max(x)}')
print(f'Min value is: {np.min(x)}')
print()

# Aggregate operations on multi-dimensional array
grid = np.random.random((3,4))
print (grid)

print(f'Overall sum: {np.sum(grid)}')
print(f'Overall min: {np.min(grid)}')

# Column-wise (axis=0) and row-wise (axis=1) minimum
print(f'Column wise minimum: {np.amin(grid, axis=0)}')
print(f'Row wise minimum: {np.amin(grid, axis=1)}')

Sum of values is: 45.807788964835886
Mean value is: 0.4580778896483589
Max value is: 0.9792634816800948
Min value is: 0.02750765307339853

[[0.35923326 0.58122717 0.4171537  0.26368982]
 [0.86349691 0.04502433 0.30465066 0.35473602]
 [0.41406378 0.41481561 0.4536483  0.81313174]]
Overall sum: 5.284871275935817
Overall min: 0.04502432970983661
Column wise minimum: [0.35923326 0.04502433 0.30465066 0.26368982]
Row wise minimum: [0.26368982 0.04502433 0.41406378]
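
Since the random values above change on every run, the `axis` argument is easier to follow on a fixed grid. In this small sketch (the grid values are made up for illustration), `axis=0` collapses the rows and leaves one result per column, while `axis=1` collapses the columns and leaves one result per row.

```python
import numpy as np

grid = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

# axis=0: one sum per column
print(np.sum(grid, axis=0))   # [15 18 21 24]

# axis=1: one sum per row
print(np.sum(grid, axis=1))   # [10 26 42]

# The same aggregations are also available as array methods
print(grid.min(axis=0))       # [1 2 3 4]
print(grid.max(axis=1))       # [ 4  8 12]
```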


# Comparisons

In [10]:
import numpy as np

x = np.arange(1,10)

print(x < 2)
print(x >= 4)

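x = np.array([1,2,3,4,5])
# Element-wise equality of two expressions; the result is not printed, so it does not appear in the output
(2 * x) == (x ** 2)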

x = np.arange(10)
print(x)

# How many values are less than 6
print(np.count_nonzero(x < 6))

# Are there any values greater than 8
print(np.any(x > 8))

# Are all values less than 10
print(np.all(x < 10))


[ True False False False False False False False False]
[False False False  True  True  True  True  True  True]
[0 1 2 3 4 5 6 7 8 9]
6
True
True
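
Comparisons can also be combined. Here is a minimal sketch using the element-wise operators `&` (and), `|` (or), and `~` (not); the names `in_range`, `either`, and `x5` are illustrative. The parentheses around each comparison are needed because these operators bind more tightly than `<` and `>`.

```python
import numpy as np

x = np.arange(10)

# Values strictly between 2 and 7: 3, 4, 5, 6
in_range = (x > 2) & (x < 7)
print(in_range.sum())                 # 4

# Values below 2 or above 7: 0, 1, 8, 9
either = (x < 2) | (x > 7)
print(np.count_nonzero(either))       # 4

# ~ inverts a Boolean array
print(np.count_nonzero(~in_range))    # 6

# Element-wise equality of two expressions
x5 = np.array([1, 2, 3, 4, 5])
print((2 * x5) == (x5 ** 2))          # [False  True False False False]
```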


# Boolean Masks


In [11]:
x = np.random.randint(0,10,(3,3))
print(x)

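# Boolean array: True where the element is less than 6
print(x < 6)

# Boolean mask: select only the elements for which the condition is True (returns a 1D array)
print(x[x < 6])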


[[8 8 0]
 [1 2 7]
 [1 3 1]]
[[False False  True]
 [ True  True False]
 [ True  True  True]]
[0 1 2 1 3 1]
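
A mask can be used for assignment as well as selection. The sketch below reuses the values from the output above as a fixed array; the replacement values 6 and -1 are arbitrary choices for illustration. It overwrites the unselected elements through the mask, and then uses `np.where` to build a new array from the mask while keeping the original shape.

```python
import numpy as np

x = np.array([[8, 8, 0], [1, 2, 7], [1, 3, 1]])
mask = x < 6

# Selection flattens the result to 1D
print(x[mask].tolist())       # [0, 1, 2, 1, 3, 1]

# Assignment through the inverted mask: cap everything that is not below 6
capped = x.copy()
capped[~mask] = 6
print(capped.tolist())        # [[6, 6, 0], [1, 2, 6], [1, 3, 1]]

# np.where keeps the shape: take x where the mask is True, -1 elsewhere
marked = np.where(mask, x, -1)
print(marked.tolist())        # [[-1, -1, 0], [1, 2, -1], [1, 3, 1]]
```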
