### WHAT IS NUMPY?

*NumPy is an open-source Python library for mathematical, scientific, engineering, and data science programming. It provides fast mathematical and statistical operations and is well suited to multi-dimensional arrays and matrix multiplication.*

*NumPy is memory efficient: it stores data in compact, typed arrays, so it can handle large amounts of data more easily than built-in Python containers such as lists.*

*The most important object defined in NumPy is an N-dimensional array type called ndarray. It describes a collection of items of the same type. Items in the collection can be accessed using a zero-based index.*

*Every item in an ndarray takes up the same size of block in memory. Each element is described by a data-type object (called dtype).*
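
*A minimal sketch of the fixed element size described above, assuming a small `int32` array (the name `arr` is illustrative). The `itemsize` and `nbytes` attributes report the bytes per element and the total size of the data buffer.*

```python
import numpy as np

# An explicit dtype fixes the size of every element.
arr = np.array([1, 2, 3], dtype=np.int32)

print(arr.dtype)     # int32
print(arr.itemsize)  # 4 bytes per element
print(arr.nbytes)    # 12 bytes in total (3 elements x 4 bytes)
```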

*Any single item extracted from an ndarray object (by indexing) is represented by a Python object of one of the array scalar types.*
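
*As a small illustration (the names `item` and `part` are assumptions made for this sketch), a single-element index returns an array scalar, while a range slice returns another ndarray.*

```python
import numpy as np

arr = np.array([1.5, 2.5, 3.5])

item = arr[0]   # single-element index
part = arr[:2]  # range slice

print(type(item))                    # <class 'numpy.float64'>
print(isinstance(item, np.generic))  # True: np.generic is the base of all array scalars
print(type(part))                    # <class 'numpy.ndarray'>
```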

In [0]:
import numpy as np

In [0]:
np.__version__

In [0]:
# Create a 1d array from a list
list1 = [0,1,2,3,4]
arr1d = np.array(list1)

# Print the array and its type
print(type(arr1d))
arr1d

*The key difference between an array and a list is that arrays are designed to handle vectorized operations, while a Python list is not.*

*That means that if you apply a function, it is performed on every item in the array rather than on the whole array object.*
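
*The sketch below contrasts the two behaviours on the same data; the variable names are illustrative. The same `*` operator repeats a list but multiplies each element of an array.*

```python
import numpy as np

values = [0, 1, 2, 3, 4]
arr = np.array(values)

# On a list, * repeats the whole list.
repeated = values * 2
print(repeated)  # [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]

# On an array, * is applied to every element.
doubled = arr * 2
print(doubled)   # [0 2 4 6 8]

# Adding a scalar is element-wise too; values + 2 would raise TypeError.
shifted = arr + 2
print(shifted)   # [2 3 4 5 6]
```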

In [0]:
# Create a 2d array from a list of lists
list2 = [[0,1,2], [3,4,5], [6,7,8]]
arr2d = np.array(list2)
arr2d

*Another characteristic is that once a NumPy array is created, you cannot increase its size. To do so, you have to create a new array. Extending in place, by contrast, is natural for a list.*
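
*One way to see this, sketched with illustrative names: `np.append` does not grow the array in place but returns a new, larger array, whereas `list.append` mutates the list itself.*

```python
import numpy as np

arr = np.array([1, 2, 3])

# np.append builds and returns a new array; arr is left untouched.
bigger = np.append(arr, 4)
print(arr.shape, bigger.shape)         # (3,) (4,)
print(np.shares_memory(arr, bigger))   # False

# A list grows in place.
items = [1, 2, 3]
items.append(4)
print(items)                           # [1, 2, 3, 4]
```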

In [0]:
# Create a float 2d array
arr2d_f = np.array(list2, dtype='float')
arr2d_f

In [0]:
# Convert to int then to str datatype
arr2d_f.astype('int').astype('str')

In [0]:
# Create an object array to hold numbers as well as strings
arr1d_obj = np.array([1, 'a'], dtype='object')
arr1d_obj

*To summarise, the main differences with python lists are:*

1. *Arrays support vectorised operations, while lists don’t.*
2. *Once an array is created, you cannot change its size. You will have to create a new array or overwrite the existing one.*
3. *Every array has one and only one dtype. All items in it should be of that dtype.*
4. *An equivalent numpy array occupies much less space than a python list of lists.*
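
*A rough way to check the memory claim in point 4, assuming CPython and 1000 integers. The list size is estimated as the list object plus its integer objects; exact byte counts vary by platform and Python version, so only the array size is annotated.*

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
arr = np.arange(n, dtype=np.int64)

# A list stores pointers to separate int objects, so count both.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)

# An array stores the raw values in one contiguous buffer.
array_bytes = arr.nbytes

print("list  :", list_bytes, "bytes (approximate)")
print("array :", array_bytes, "bytes")  # 8000 bytes (1000 x 8)
```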


In [0]:
# Create a 2d array with 3 rows and 4 columns
list2 = [[1, 2, 3, 4],[3, 4, 5, 6], [5, 6, 7, 8]]
arr2 = np.array(list2, dtype='float')
arr2

# shape
print('Shape: ', arr2.shape)

# dtype
print('Datatype: ', arr2.dtype)

# size
print('Size: ', arr2.size)

# ndim
print('Num Dimensions: ', arr2.ndim)

In [0]:
# Extract the first 2 rows and columns
arr2[:2, :2]

In [0]:
list2[:2, :2]  # raises TypeError: a list cannot be indexed with a tuple of slices

In [0]:
# Get the boolean output by applying the condition to each element.
b = arr2 > 4
b

In [0]:
# Reverse only the row positions
arr2[::-1, ]

In [0]:
# Reverse the row and column positions
arr2[::-1, ::-1]

In [0]:
# Insert a nan and an inf
arr2[1,1] = np.nan  # not a number
arr2[1,2] = np.inf  # infinite
arr2

In [0]:
# Replace nan and inf with -1. Don't use arr2 == np.nan
missing_bool = np.isnan(arr2) | np.isinf(arr2)
arr2[missing_bool] = -1  
arr2
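
*The reason `arr2 == np.nan` is discouraged is that nan never compares equal to anything, including itself. A short sketch with an illustrative array; `np.isfinite` is used here as a compact alternative that flags nan and both infinities.*

```python
import numpy as np

arr = np.array([1.0, np.nan, np.inf])

# Equality against nan is False everywhere, even where the value is nan.
print(arr == np.nan)     # [False False False]
print(np.isnan(arr))     # [False  True False]

# isfinite is False for nan, inf and -inf.
finite = np.isfinite(arr)
print(finite)            # [ True False False]

# Build a cleaned copy instead of modifying arr in place.
cleaned = np.where(finite, arr, -1)
print(cleaned)           # [ 1. -1. -1.]
```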

In [0]:
# mean, max and min
print("Mean value is: ", arr2.mean())
print("Max value is: ", arr2.max())
print("Min value is: ", arr2.min())

In [0]:
# Row wise and column wise min
print("Column wise minimum: ", np.amin(arr2, axis=0))
print("Row wise minimum: ", np.amin(arr2, axis=1))

In [0]:
# Assign a portion of arr2 to arr2a. Slicing returns a view, not a new array.
arr2a = arr2[:2,:2]  
arr2a[:1, :1] = 100  # 100 will reflect in arr2
arr2

In [0]:
# Copy portion of arr2 to arr2b
arr2b = arr2[:2, :2].copy()
arr2b[:1, :1] = 101  # 101 will not reflect in arr2
arr2
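
*To check whether two arrays share the same data, `np.shares_memory` can be used. The sketch below repeats the view-versus-copy experiment on a small illustrative array.*

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

view = arr[:2, :2]           # basic slicing returns a view
copied = arr[:2, :2].copy()  # .copy() allocates new memory

print(np.shares_memory(arr, view))    # True
print(np.shares_memory(arr, copied))  # False

view[0, 0] = 100    # writes through to arr
copied[0, 0] = -1   # does not touch arr
print(arr[0, 0])    # 100
```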

In [0]:
# RESHAPE

# numpy.reshape(array, shape, order='C'):
# Gives a new shape to an array without changing its data.

array = np.arange(8)
print("Original array :", array)
 
# shape array with 2 rows and 4 columns
array = np.arange(8).reshape(2, 4)
print("\narray reshaped with 2 rows and 4 columns :", array)
 
# shape array with 4 rows and 2 columns
array = np.arange(8).reshape(4, 2)
print("\narray reshaped with 4 rows and 2 columns :", array)
 
# Constructs 3D array
array = np.arange(8).reshape(2, 2, 2)
print("\nOriginal array reshaped to 3D :", array)

In [0]:
# Reshape a 3x4 array to 4x3 array
arr2.reshape(4, 3)
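
*Two details of `reshape` that the cells above do not show, sketched on an illustrative 12-element array: one dimension may be given as `-1` so that NumPy infers it, and a shape whose element count does not match raises `ValueError`.*

```python
import numpy as np

arr = np.arange(12)

# -1 asks NumPy to infer that dimension from the total size.
print(arr.reshape(3, -1).shape)  # (3, 4)
print(arr.reshape(-1, 6).shape)  # (2, 6)

# 12 elements cannot be split into 5 equal rows.
failed = False
try:
    arr.reshape(5, -1)
except ValueError as exc:
    failed = True
    print("reshape failed:", exc)
```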

In [0]:
# Flatten it to a 1d array
arr2.flatten()

In [0]:
# FLATTEN

# numpy.ndarray.flatten(order='C'):
# Return a copy of the array collapsed into one dimension.

array = np.array([[1, 2], [3, 4]])

# flatten in row-major (C) order, the default
print(array.flatten())

# flatten in column-major (Fortran) order
print(array.flatten('F'))

In [0]:
# Changing the flattened array does not change parent
b1 = arr2.flatten()  
b1[0] = 100  # changing b1 does not affect arr2
arr2

In [0]:
# ravel returns a view whenever possible, so changing it changes the parent too.
b2 = arr2.ravel()
b2[0] = 101  # changing b2 changes arr2 also
arr2
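
*`ravel` returns a view only when the memory layout allows it. As a hedged illustration with made-up names, raveling a transposed array in the default C order needs a copy, while `flatten` copies in every case.*

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

flat_view = arr.ravel()      # contiguous input: a view
flat_copy = arr.T.ravel()    # transposed input: a copy is required
flattened = arr.flatten()    # flatten always copies

print(np.shares_memory(arr, flat_view))  # True
print(np.shares_memory(arr, flat_copy))  # False
print(np.shares_memory(arr, flattened))  # False
```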

In [0]:
# ARANGE

# arange([start,] stop[, step,][, dtype]):
# Returns evenly spaced values within the half-open interval [start, stop).

print("A: ", np.arange(4))
print("B: ", np.arange(4, 10))
print("C: ", np.arange(4, 20, 3))

In [0]:
# Generating Sequence

# Lower limit is 0 by default
print(np.arange(5))  

# 0 to 9
print(np.arange(0, 10))  

# 0 to 9 with step of 2
print(np.arange(0, 10, 2))  

# 10 to 1, decreasing order
print(np.arange(10, 0, -1))

In [0]:
# LINSPACE

# numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None):
# Returns evenly spaced numbers over an interval. Similar to arange, but it takes the number of samples instead of a step.

print("A: ", np.linspace(2.0, 3.0, num=5))
print("B: ", np.linspace(0, 2, 10))

# retstep set to True: the step size is returned as well
print("C: ", np.linspace(2.0, 3.0, num=5 , retstep=True))

In [0]:
# 10 evenly spaced values from 1 to 50, cast to int
np.linspace(start=1, stop=50, num=10, dtype=int)
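
*A short sketch relating `linspace` to `arange`, using illustrative values. With `retstep=True` the result is a tuple of (samples, step), and with `endpoint=False` the samples match an `arange` call that uses the same step.*

```python
import numpy as np

# retstep=True returns the samples together with the spacing between them.
pts, step = np.linspace(0.0, 1.0, num=5, retstep=True)
print(pts)   # [0.   0.25 0.5  0.75 1.  ]
print(step)  # 0.25

# Excluding the endpoint gives the same half-open range as arange.
half_open = np.linspace(0.0, 1.0, num=4, endpoint=False)
stepped = np.arange(0.0, 1.0, 0.25)
print(np.allclose(half_open, stepped))  # True
```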

In [0]:
# 2x2 array of zeros
np.zeros([2,2])

In [0]:
# 2x2 array of ones
np.ones([2,2])

In [0]:
a = [1,2,3] 

# Repeat whole of 'a' two times
print('Tile:   ', np.tile(a, 2))

# Repeat each element of 'a' two times
print('Repeat: ', np.repeat(a, 2))

In [0]:
# Iterating through array

import numpy as np
a = np.arange(0,60,5)
a = a.reshape(3,4)

print('Original array is:')
print(a)
print('\n')

print('Elements visited by nditer:')
for x in np.nditer(a):
    print(x, end=' ')

In [0]:
# Python program for iterating over array using particular order

a = np.arange(12) 
 
# shape array with 3 rows and 4 columns 
a = a.reshape(3,4) 
 
print('Original array is:') 
print(a)
print()  
 
print('Elements in C-style order:')
 
# iterating an array in a given order  
for x in np.nditer(a, order = 'C'): 
    print(x)

print('Elements in F-style order:')

# iterating an array in a given order   
for x in np.nditer(a, order = 'F'): 
    print(x)
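
*To make the two visiting orders easier to compare, the sketch below collects the elements into plain lists on a small illustrative array. The result of the F-order walk matches `flatten('F')`.*

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# C order walks row by row; F order walks column by column.
c_order = [int(x) for x in np.nditer(a, order='C')]
f_order = [int(x) for x in np.nditer(a, order='F')]

print(c_order)  # [0, 1, 2, 3, 4, 5]
print(f_order)  # [0, 3, 1, 4, 2, 5]
```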

In [0]:
# Random numbers between [0,1) of shape 2,2
print(np.random.rand(2,2))

# Normal distribution with mean=0 and variance=1 of shape 2,2
print(np.random.randn(2,2))

# Random integers between [0, 10) of shape 2,2
print(np.random.randint(0, 10, size=[2,2]))

# One random number between [0,1)
print(np.random.random())

# Random numbers between [0,1) of shape 2,2
print(np.random.random(size=[2,2]))

# Pick 10 items from a given list, with equal probability
print(np.random.choice(['a', 'e', 'i', 'o', 'u'], size=10))  

# Pick 10 items from a given list with a predefined probability 'p'
print(np.random.choice(['a', 'e', 'i', 'o', 'u'], size=10, p=[0.3, .1, 0.1, 0.4, 0.1]))  # picks more o's

In [0]:
# Set the random seed
np.random.seed(100)

# Create random numbers between [0,1) of shape 2,2
print(np.random.rand(2,2))
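
*The cells above use the legacy global-state functions. As an alternative sketch, NumPy (1.17 and later) also offers `np.random.default_rng`, which returns a separate `Generator` object; two generators created with the same seed produce the same sequence. The names `rng_a` and `rng_b` are illustrative.*

```python
import numpy as np

# Two independent generators seeded identically.
rng_a = np.random.default_rng(100)
rng_b = np.random.default_rng(100)

sample_a = rng_a.random((2, 2))  # floats in [0, 1)
sample_b = rng_b.random((2, 2))
print(np.array_equal(sample_a, sample_b))  # True

# Integers in [0, 10) of shape 2,2
ints = rng_a.integers(0, 10, size=(2, 2))
print(ints)
```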

In [0]:
# Create random integers of size 10 between [0,10)
np.random.seed(200)
arr_rand = np.random.randint(0, 10, size=10)
print(arr_rand)

# Get the unique items and their counts
uniqs, counts = np.unique(arr_rand, return_counts=True)
print("Unique items : ", uniqs)
print("Counts       : ", counts)

In [0]:
# WHERE METHOD

# Create an array
import numpy as np
arr_rand = np.array([8, 8, 3, 7, 7, 0, 4, 2, 5, 2])
print("Array: ", arr_rand)

# Positions where value > 5
index_gt5 = np.where(arr_rand > 5)
print("Positions where value > 5: ", index_gt5)

In [0]:
a = np.arange(9).reshape((3, 3))
print(a)

print(np.where(a < 4, -1, 100))

In [0]:
# Take items at given index
arr_rand.take(index_gt5)
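
*With a single argument, `np.where` returns a tuple holding one index array per dimension. The sketch below, using an illustrative array, shows that indexing with that tuple (or with the boolean mask directly) gives a 1d result, whereas `take` treats the tuple as a 2d index and returns a 2d result.*

```python
import numpy as np

arr = np.array([8, 8, 3, 7, 7, 0, 4, 2, 5, 2])

idx = np.where(arr > 5)
print(idx)                  # (array([0, 1, 3, 4]),)

# Indexing with the tuple or with the mask gives the same 1d result.
print(arr[idx])             # [8 8 7 7]
print(arr[arr > 5])         # [8 8 7 7]

# take() interprets the tuple as a 2d array of indices.
print(arr.take(idx).shape)  # (1, 4)
```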

In [0]:
# If value > 5, then yield 'gt5' else 'le5'
np.where(arr_rand > 5, 'gt5', 'le5')

In [0]:
# Location of the max
print('Position of max value: ', np.argmax(arr_rand))  

# Location of the min
print('Position of min value: ', np.argmin(arr_rand)) 

In [0]:
# Find the locations (row, column) of the nan values
x = np.array([[1,2,3,4],
              [2,3,np.nan,5],
              [np.nan,5,2,3]])

np.argwhere(np.isnan(x))

In [0]:
# Count the non-zero entries (nan counts as non-zero)
y = np.array([[1,2,3,4],
              [2,3,0,5],
              [np.nan,5,2,3]])

np.count_nonzero(y)

In [0]:
# Count the entries of x that are not nan
np.count_nonzero(~np.isnan(x))

In [0]:
# CONCATENATE

# There are 3 different ways of concatenating two or more numpy arrays.

# Method 1: np.concatenate by changing the axis parameter to 0 and 1
# Method 2: np.vstack and np.hstack
# Method 3: np.r_ and np.c_

a = np.zeros([4, 4])
b = np.ones([4, 4])

# Vertical Stack Equivalents (Row wise)
np.concatenate([a, b], axis=0)  
np.vstack([a,b])  
np.r_[a,b]  

In [0]:
# Horizontal Stack Equivalents (Column wise)
np.concatenate([a, b], axis=1) 
np.hstack([a,b])  
np.c_[a,b]
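
*A quick check, on smaller illustrative arrays, that the three spellings in each group give the same result and how the shapes change.*

```python
import numpy as np

a = np.zeros((2, 3))
b = np.ones((2, 3))

# Stacking along axis 0 adds rows.
vertical = np.concatenate([a, b], axis=0)
print(vertical.shape)    # (4, 3)

# Stacking along axis 1 adds columns.
horizontal = np.concatenate([a, b], axis=1)
print(horizontal.shape)  # (2, 6)

same_vertical = (np.array_equal(vertical, np.vstack([a, b]))
                 and np.array_equal(vertical, np.r_[a, b]))
same_horizontal = (np.array_equal(horizontal, np.hstack([a, b]))
                   and np.array_equal(horizontal, np.c_[a, b]))
print(same_vertical, same_horizontal)  # True True
```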

In [0]:
# SORTING

arr = np.random.randint(1,6, size=[8, 4])
arr

In [0]:
# Sort each column of arr independently
np.sort(arr, axis=0)

In [0]:
# Argsort the first column
sorted_index_1stcol = arr[:, 0].argsort()

# Sort 'arr' by first column without disturbing the integrity of rows
arr[sorted_index_1stcol]

In [0]:
# Descending sort
arr[sorted_index_1stcol[::-1]]

In [0]:
# Sort by column 0, then by column 1
lexsorted_index = np.lexsort((arr[:, 1], arr[:, 0])) 
arr[lexsorted_index]
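
*`np.lexsort` uses the last key in the tuple as the primary sort key, which is why column 0 is passed last above. A small deterministic sketch with an illustrative array:*

```python
import numpy as np

arr = np.array([[2, 1],
                [1, 3],
                [2, 0],
                [1, 2]])

# Keys are listed from least to most significant: column 1, then column 0.
order = np.lexsort((arr[:, 1], arr[:, 0]))
print(order)       # [3 1 2 0]
print(arr[order])  # rows sorted by column 0, ties broken by column 1
```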

In [0]:
# Define a scalar function
def foo(x):
    if x % 2 == 1:
        return x**2
    else:
        return x/2

# On a scalar
print('x = 10 returns ', foo(10))
print('x = 11 returns ', foo(11))

In [0]:
# Vectorize foo(). Make it work on vectors.
foo_v = np.vectorize(foo, otypes=[float])

print('x = [10, 11, 12] returns ', foo_v([10, 11, 12]))
print('x = [[10, 11, 12], [1, 2, 3]] returns ', foo_v([[10, 11, 12], [1, 2, 3]]))
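
*`np.vectorize` is a convenience wrapper that still calls the Python function once per element. For a simple branch like `foo`, one possible alternative is `np.where` on whole arrays. This is a sketch under that assumption; note that `np.where` evaluates both branches for every element.*

```python
import numpy as np

def foo(x):
    # Same scalar rule as above: square odd values, halve even values.
    if x % 2 == 1:
        return x**2
    else:
        return x/2

x = np.array([10, 11, 12])

looped = np.vectorize(foo, otypes=[float])(x)
whole_array = np.where(x % 2 == 1, x**2, x / 2)

print(looped)       # [  5. 121.   6.]
print(whole_array)  # [  5. 121.   6.]
```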

In [0]:
# Create a 4x10 random array
np.random.seed(100)
arr_x = np.random.randint(1,10,size=[4,10])
print(arr_x)

# Define func1d
def max_minus_min(x):
    return np.max(x) - np.min(x)

# Apply along the rows
print('Row wise: ', np.apply_along_axis(max_minus_min, 1, arr=arr_x))

# Apply along the columns
print('Column wise: ', np.apply_along_axis(max_minus_min, 0, arr=arr_x))
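
*For a reduction as simple as max minus min, the `axis` argument of the built-in reductions gives the same answer without a Python-level function call per row or column. A sketch on a small illustrative array:*

```python
import numpy as np

arr = np.array([[1, 5, 3],
                [7, 2, 9]])

def max_minus_min(x):
    return np.max(x) - np.min(x)

# Row wise: apply_along_axis versus reductions with axis=1
rows_applied = np.apply_along_axis(max_minus_min, 1, arr=arr)
rows_direct = arr.max(axis=1) - arr.min(axis=1)
print(rows_applied, rows_direct)  # [4 7] [4 7]

# Column wise: reductions with axis=0
cols_direct = arr.max(axis=0) - arr.min(axis=0)
print(cols_direct)                # [6 3 6]
```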

In [0]:
# Create a 1D array
x = np.arange(5)
print('Original array: ', x)

# Introduce a new column axis
x_col = x[:, np.newaxis]
print('x_col shape: ', x_col.shape)
print(x_col)

# Introduce a new row axis
x_row = x[np.newaxis, :]
print('x_row shape: ', x_row.shape)
print(x_row)
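
*A common reason to add an axis is broadcasting. In the sketch below (illustrative names), adding a column vector to a row vector produces a full 2d table of pairwise sums.*

```python
import numpy as np

x = np.arange(3)

x_col = x[:, np.newaxis]  # shape (3, 1)
x_row = x[np.newaxis, :]  # shape (1, 3)

# (3, 1) + (1, 3) broadcasts to (3, 3)
table = x_col + x_row
print(table.shape)  # (3, 3)
print(table)
# [[0 1 2]
#  [1 2 3]
#  [2 3 4]]
```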

In [0]:
# Create the array and bins
x = np.arange(10)
bins = np.array([0, 3, 6, 9])
print(x)

# Get bin allotments
np.digitize(x, bins)
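
*How `np.digitize` assigns bins, sketched with the same inputs: by default each value gets the index `i` such that `bins[i-1] <= x < bins[i]`, and `right=True` moves the closed edge to the right side.*

```python
import numpy as np

x = np.arange(10)
bins = np.array([0, 3, 6, 9])

# Default: left edge included, right edge excluded.
left_closed = np.digitize(x, bins)
print(left_closed)   # [1 1 1 2 2 2 3 3 3 4]

# right=True: left edge excluded, right edge included.
right_closed = np.digitize(x, bins, right=True)
print(right_closed)  # [0 1 1 1 2 2 2 3 3 3]
```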

In [0]:
# Matrix multiplication: rows of mat1 times columns of mat2

# input two matrices 
mat1 = ([1, 6, 5],[3 ,4, 8],[2, 12, 3]) 
mat2 = ([3, 4, 6],[5, 6, 7],[6,56, 7]) 
  
# For 2-D inputs, np.dot returns the matrix product
res = np.dot(mat1,mat2) 
  
# print resulted matrix 
print(res) 
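
*For 2d arrays the `@` operator gives the same matrix product as `np.dot`, while `*` multiplies element by element. A small sketch with illustrative 2x2 matrices:*

```python
import numpy as np

m1 = np.array([[1, 2],
               [3, 4]])
m2 = np.array([[5, 6],
               [7, 8]])

matrix_product = m1 @ m2   # rows of m1 times columns of m2
elementwise = m1 * m2      # position-by-position product

print(matrix_product)
# [[19 22]
#  [43 50]]
print(elementwise)
# [[ 5 12]
#  [21 32]]
```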

In [0]:
# Percentile

a = np.array([[30,40,70],[80,20,10],[50,90,60]]) 

print('Our array is:')
print(a) 
print('\n')  
      
print('Applying percentile() function:')
print(np.percentile(a,50) )
print('\n')  
      
print('Applying percentile() function along axis 1:' )
print(np.percentile(a,50, axis = 1) )
print('\n') 

print('Applying percentile() function along axis 0:' )
print(np.percentile(a,50, axis = 0))
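
*The 50th percentile is the median, so `np.median` can serve as a cross-check on the results above. A sketch using the same array:*

```python
import numpy as np

a = np.array([[30, 40, 70], [80, 20, 10], [50, 90, 60]])

overall = np.percentile(a, 50)
per_row = np.percentile(a, 50, axis=1)
per_col = np.percentile(a, 50, axis=0)

print(overall)  # 50.0
print(per_row)  # [40. 20. 60.]
print(per_col)  # [50. 40. 60.]

print(overall == np.median(a))                       # True
print(np.array_equal(per_row, np.median(a, axis=1))) # True
```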