# 1. Creating Arrays

NumPy arrays resemble Python lists, but they add features such as a single element type, a shape, and vectorized operations. You can convert a Python list to a NumPy array using the np.array function, which takes a Python list as its required argument. The function also has several keyword arguments, but the main one to know is dtype. The dtype keyword argument takes a NumPy type and casts the array to that type.

In [1]:
import numpy as np
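
As a minimal sketch of the dtype keyword argument described above, the cell below builds the same list twice, once with the inferred type and once with an explicit cast. The variable names inferred and as_float are illustrative only and do not appear elsewhere in this notebook.

```python
import numpy as np

# Without dtype, NumPy infers an integer type from the list.
# The exact integer width (int32 or int64) depends on the platform.
inferred = np.array([1, 2, 3])

# With dtype, the elements are cast while the array is built.
as_float = np.array([1, 2, 3], dtype=np.float32)

print(inferred.dtype)
print(as_float.dtype)  # float32
```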

### One-dimensional Array

In [6]:
d1=np.array([1,2,3,4,5,5,6])

In [11]:
d1

array([1, 2, 3, 4, 5, 5, 6])

In [16]:
#Checking the dimension of the array
print(d1.ndim)

#Checking the shape of the array
print(d1.shape)

#checking the type of the Array
print(d1.dtype)

1
(7,)
int32


### Two-dimensional Array

In [19]:
d2=np.array([[1,2,3,4],[2,4,6,8],[3,6,9,12]])

In [20]:
d2

array([[ 1,  2,  3,  4],
       [ 2,  4,  6,  8],
       [ 3,  6,  9, 12]])

In [17]:
#Checking the dimension of the array
print(d2.ndim)

#Checking the shape of the array
print(d2.shape)

#checking the type of the Array
print(d2.dtype)

2
(3, 4)
int32


### Three-dimensional Array

In [87]:
np.array([ [[0, 1],[2, 3]] , [[4, 5],[6, 7]] ])

array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])

### String type array

In [12]:
arr = np.array([["hello", "hello2"],["her","hi"]])

In [24]:
arr

array([['hello', 'hello2'],
       ['her', 'hi']], dtype='<U6')

### Multiple type array

When the elements of a NumPy array are mixed types, then the array's type will be upcast to the highest level type. This means that if an array input has mixed int and float elements, all the integers will be cast to their floating-point equivalents. If an array is mixed with int, float, and string elements, everything is cast to strings.

The code below is an example of np.array upcasting. Because the input contains a string, every element, including the integer, the float, and the boolean, is cast to a string.

In [18]:
np.array([1,'Hello',1.0,True])

array(['1', 'Hello', '1.0', 'True'], dtype='<U11')
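
The int-to-float case mentioned above can be sketched the same way. The names mixed and flags are illustrative assumptions; the exact integer width in the second example depends on the platform.

```python
import numpy as np

# One float in the input is enough to upcast every integer to float.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)  # float64

# Booleans mixed with integers are upcast to an integer type.
flags = np.array([True, 2])
print(flags.dtype)
```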

# 2. Copying Array

Similar to Python lists, when we make a reference to a NumPy array it doesn't create a different array. Therefore, if we change a value using the reference variable, it changes the original array as well. We get around this by using an array's inherent copy function. The function has no required arguments, and it returns the copied array.

In the code example below, assign_array is a reference to a, while copied is a copy of b. Therefore, changing assign_array leads to the same change in a, while changing copied does not change the value of b.

In [39]:
# Assigning an array to another variable does not copy it
a = np.array([1,2,3,4,5])

assign_array = a
print(assign_array, a)

# After Changing the value
assign_array[3]=88
print(assign_array, a)

[1 2 3 4 5] [1 2 3 4 5]
[ 1  2  3 88  5] [ 1  2  3 88  5]


In [42]:
# Copying the Array 
b = np.array([2,3,5,8,9])
copied = b.copy()

#Before Changing the value
print(copied, b)

#Changing a value in the copied array
copied[3] = 1000

#After Changing the element
print(copied, b)


[2 3 5 8 9] [2 3 5 8 9]
[   2    3    5 1000    9] [2 3 5 8 9]
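
One way to check whether two names refer to the same data, rather than comparing printed values, is sketched below. It uses Python's is operator and np.shares_memory; the names alias and duplicate are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
alias = a              # second name for the same array object
duplicate = a.copy()   # independent array with its own data

print(alias is a)                       # True
print(np.shares_memory(a, alias))       # True
print(np.shares_memory(a, duplicate))   # False
```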


# 3. Casting

We cast NumPy arrays through their inherent astype function. The function's required argument is the new type for the array. It returns the array cast to the new type.

The code below shows an example of casting using the astype function. The dtype property returns the type of an array.

In [49]:
arr = np.array([0, 1, 2])
# Checking the type of the current array
print(arr.dtype)

# Changing the data type of the array
arr = arr.astype(np.float32)

# After changing the data type 
print(arr.dtype)

int32
float32
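
Casting in the other direction, from float to integer, discards the fractional part. The sketch below also shows that astype returns a new array and leaves the original untouched; the names floats and ints are illustrative.

```python
import numpy as np

floats = np.array([1.7, -1.7, 2.0])

# Float-to-int casting truncates toward zero; it does not round.
ints = floats.astype(np.int32)

print(ints.tolist())   # [1, -1, 2]
print(floats.dtype)    # float64, the original array is unchanged
```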


# 4. Ranged data in Array
While np.array can be used to create any array, it is equivalent to hardcoding an array. This won't work when the array has hundreds of values. Instead, NumPy provides an option to create ranged data arrays using np.arange. The function acts very similar to the range function in Python, and will always return a 1-D array.

The code below contains example usages of np.arange.

In [56]:
np.arange(1,10)

# Starting Argument = 1
# Ending Argument = 10 (exclusive)

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [57]:
np.arange(-3,18)

# Starting Argument =-3
# Ending Argument =18

array([-3, -2, -1,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13,
       14, 15, 16, 17])

In [58]:
np.arange(0,100, 10)
# Starting Argument = 0
# Ending Argument = 100
# Step Argument = 10

array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])
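
A few further variations, sketched here under the assumption that np.arange mirrors range for a single argument and for negative steps, and additionally accepts non-integer steps. The variable names are illustrative.

```python
import numpy as np

# A single argument is treated as the (exclusive) end; the start defaults to 0.
first_five = np.arange(5)
print(first_five.tolist())   # [0, 1, 2, 3, 4]

# A negative step counts downwards; the end is still exclusive.
countdown = np.arange(10, 0, -2)
print(countdown.tolist())    # [10, 8, 6, 4, 2]

# Unlike range, np.arange also accepts a floating-point step.
quarters = np.arange(0, 1, 0.25)
print(quarters.tolist())     # [0.0, 0.25, 0.5, 0.75]
```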

### np.linspace

To specify the number of elements in the returned array, rather than the step size, we can use the np.linspace function.

This function takes in a required first two arguments, for the start and end of the range, respectively. The end of the range is inclusive for np.linspace, unless the keyword argument endpoint is set to False. To specify the number of elements, we set the num keyword argument (its default value is 50).

The code below shows example usages of np.linspace. It also takes in the dtype keyword argument for manual casting.

In [60]:
np.linspace(5,11, num =4)

# Starting Argument = 5
# Ending Argument = 11
# Total No of elements in the array = 4

array([ 5.,  7.,  9., 11.])

In [63]:
# endpoint=False excludes the end argument: the num values are
# evenly spaced starting at the start argument and stop before the end

np.linspace(5, 11, num=4, endpoint=False)

array([5. , 6.5, 8. , 9.5])
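
To see the spacing that np.linspace chose, the retstep keyword argument can be set to True, in which case the function returns the array together with the step. The sketch below repeats the two calls above; the variable names are illustrative.

```python
import numpy as np

# endpoint=True (default): step = (11 - 5) / (4 - 1)
values, step = np.linspace(5, 11, num=4, retstep=True)
print(step)   # 2.0

# endpoint=False: step = (11 - 5) / 4
values_open, step_open = np.linspace(5, 11, num=4, endpoint=False, retstep=True)
print(step_open)   # 1.5
```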

In [70]:
# Specifying the data type

np.linspace(5, 11, num=30, dtype=np.int32)

array([ 5,  5,  5,  5,  5,  6,  6,  6,  6,  6,  7,  7,  7,  7,  7,  8,  8,
        8,  8,  8,  9,  9,  9,  9,  9, 10, 10, 10, 10, 11])

In [68]:
# Default the num size = 50 

np.linspace(1,2)

array([1.        , 1.02040816, 1.04081633, 1.06122449, 1.08163265,
       1.10204082, 1.12244898, 1.14285714, 1.16326531, 1.18367347,
       1.20408163, 1.2244898 , 1.24489796, 1.26530612, 1.28571429,
       1.30612245, 1.32653061, 1.34693878, 1.36734694, 1.3877551 ,
       1.40816327, 1.42857143, 1.44897959, 1.46938776, 1.48979592,
       1.51020408, 1.53061224, 1.55102041, 1.57142857, 1.59183673,
       1.6122449 , 1.63265306, 1.65306122, 1.67346939, 1.69387755,
       1.71428571, 1.73469388, 1.75510204, 1.7755102 , 1.79591837,
       1.81632653, 1.83673469, 1.85714286, 1.87755102, 1.89795918,
       1.91836735, 1.93877551, 1.95918367, 1.97959184, 2.        ])

# 5. Reshaping data
The function we use to reshape data in NumPy is np.reshape. It takes in an array and a new shape as required arguments. The new shape must exactly contain all the elements from the input array. For example, we could reshape an array with 12 elements to (4, 3), but we can't reshape it to (4, 4).

We are allowed to use the special value of -1 in at most one dimension of the new shape. The dimension with -1 will take on the value necessary to allow the new shape to contain all the elements of the array.

The code below shows example usages of np.reshape.

In [77]:
arr = np.arange(8)
print(arr)

reshaped=np.reshape(arr, (2, 4))
print(f"\n{reshaped}")

[0 1 2 3 4 5 6 7]

[[0 1 2 3]
 [4 5 6 7]]


In [89]:
arr = np.arange(8)

#reshaping into 3D
np.reshape(arr, (2, 2, 2))

array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])

In [94]:
arr = np.arange(8)

# Reshaping into 1 Column
arr.reshape(-1,1)

array([[0],
       [1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7]])
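
The 12-element case described in the text can be sketched as follows: (4, -1) lets NumPy infer the missing dimension, while (4, 4) fails because it would need 16 elements. The names inferred and reshape_failed are illustrative.

```python
import numpy as np

arr = np.arange(12)

# -1 is replaced by whatever size makes the shape hold all 12 elements.
inferred = np.reshape(arr, (4, -1))
print(inferred.shape)   # (4, 3)

# A shape that does not hold exactly 12 elements raises a ValueError.
reshape_failed = False
try:
    np.reshape(arr, (4, 4))
except ValueError as err:
    reshape_failed = True
    print(f"ValueError: {err}")
```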

#### Flatten Method

While the np.reshape function can perform any reshaping utilities we need, NumPy provides an inherent function for flattening an array. Flattening an array reshapes it into a 1D array. Since we need to flatten data quite often, it is a useful function.

The code below flattens an array using the inherent flatten function.

In [111]:
arr = np.arange(8)
arr=np.reshape(arr, (2,2,2))
print(arr)

#Applying the flatten method
arr.flatten()

[[[0 1]
  [2 3]]

 [[4 5]
  [6 7]]]


array([0, 1, 2, 3, 4, 5, 6, 7])
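
One detail worth knowing is that flatten returns a copy, so editing the flattened result does not touch the original array. A minimal sketch, with illustrative names matrix and flat:

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)
flat = matrix.flatten()

# Changing the flattened copy leaves the original matrix as it was.
flat[0] = 99

print(flat[0])        # 99
print(matrix[0, 0])   # 0
```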

# 6. Transposing
Similar to how it is common to reshape data, it is also common to transpose data. Perhaps we have data that's supposed to be in a particular format, but some new data we get is rearranged. We can just transpose the data, using the np.transpose function, to convert it to the proper format.

The code below shows an example usage of the np.transpose function. The matrix rows become columns after the transpose.

In [131]:
arr = np.arange(15)

reshaped = np.reshape(arr, (3,5))
print(f"Reshaped data\n{reshaped}\nShape = {reshaped.shape}\n")

#After transposing, rows become columns and columns become rows
transposed = reshaped.transpose()
print(f"Transposed data\n{transposed}\nShape = {transposed.shape}")

Reshaped data
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]
Shape = (3, 5)

Transposed data
[[ 0  5 10]
 [ 1  6 11]
 [ 2  7 12]
 [ 3  8 13]
 [ 4  9 14]]
Shape = (5, 3)
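
The same result is available through the function form np.transpose and the shorthand attribute .T. The sketch below checks that the three spellings agree for a 2-D array; the variable names are illustrative.

```python
import numpy as np

matrix = np.reshape(np.arange(15), (3, 5))

via_method = matrix.transpose()
via_function = np.transpose(matrix)
via_attribute = matrix.T

print(via_attribute.shape)                        # (5, 3)
print(np.array_equal(via_method, via_function))   # True
print(np.array_equal(via_method, via_attribute))  # True
```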


# 7. Zeros and ones
Sometimes, we need to create arrays filled solely with 0 or 1. For example, since binary data is labeled with 0 and 1, we may need to create dummy datasets of strictly one label. For creating these arrays, NumPy provides the functions np.zeros and np.ones. They both take in the same arguments, which includes just one required argument, the array shape. The functions also allow for manual casting using the dtype keyword argument.

The code below shows example usages of np.zeros and np.ones.

#### Ones

In [141]:
# (9, 9) is the shape of the matrix of ones
print(np.ones((9,9)))

[[1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1.]]


In [142]:
np.ones((6,2))

array([[1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.]])

#### Zeros

In [84]:
np.zeros((9,9))

array([[0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0.]])
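
The outputs above are floating point because that is the default type. The dtype keyword argument mentioned earlier changes this, as the following sketch shows; the names int_zeros and int_ones are illustrative.

```python
import numpy as np

# By default np.zeros and np.ones produce float64 arrays.
print(np.zeros((2, 3)).dtype)   # float64

# dtype requests integer arrays instead, e.g. for 0/1 labels.
int_zeros = np.zeros((2, 3), dtype=np.int32)
int_ones = np.ones((2, 3), dtype=np.int32)

print(int_zeros.tolist())   # [[0, 0, 0], [0, 0, 0]]
print(int_ones.tolist())    # [[1, 1, 1], [1, 1, 1]]
```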

# 8. Matrix multiplication
Since NumPy arrays are basically vectors and matrices, it makes sense that there are functions for dot products and matrix multiplication. Specifically, the main function to use is np.matmul, which takes two vector/matrix arrays as input and produces a dot product or matrix multiplication.

The code below shows various examples of matrix multiplication. When both inputs are 1-D, the output is the dot product.

Note that the dimensions of the two input matrices must be valid for a matrix multiplication. Specifically, the second dimension of the first matrix must equal the first dimension of the second matrix, otherwise np.matmul will result in a ValueError.

In [156]:
# Dot product
arr1 = np.array([1, 2, 3])
arr2 = np.array([-3, 0, 10])

print(np.matmul(arr1, arr2))

27


In [153]:
# The multiplication is valid when the number of columns of the first matrix
# equals the number of rows of the second matrix (6 and 6 here)
arr= np.arange(12)

matrix_1 = np.reshape(arr,(2,6))
matrix_2 = np.reshape(arr,(6,2))

print(matrix_1)
print(matrix_2)

np.matmul(matrix_1,matrix_2)

[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]]
[[ 0  1]
 [ 2  3]
 [ 4  5]
 [ 6  7]
 [ 8  9]
 [10 11]]


array([[110, 125],
       [290, 341]])

In [158]:
# arr3 has shape (3, 2): its 2 columns do not match its 3 rows,
# so multiplying arr3 by itself raises a ValueError
arr3 = np.array([[1, 2],
                 [3, 4],
                 [5, 6]])

np.matmul(arr3,arr3)

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 2)
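
One way to make the shapes compatible, sketched here as an illustration rather than a general fix, is to transpose one of the operands so that the inner dimensions match. The names gram_small and gram_large are illustrative.

```python
import numpy as np

arr3 = np.array([[1, 2],
                 [3, 4],
                 [5, 6]])

# (2, 3) times (3, 2): inner dimensions are both 3, result is (2, 2).
gram_small = np.matmul(arr3.T, arr3)
print(gram_small.tolist())   # [[35, 44], [44, 56]]

# (3, 2) times (2, 3): inner dimensions are both 2, result is (3, 3).
gram_large = np.matmul(arr3, arr3.T)
print(gram_large.shape)      # (3, 3)
```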

# 9. Random integers
Similar to the Python random module, NumPy has its own submodule for pseudo-random number generation called np.random. It provides all the necessary randomized operations and extends it to multi-dimensional arrays. To generate pseudo-random integers, we use the np.random.randint function.

The code below shows example usages of np.random.randint.

In [163]:
import numpy as np
print(np.random.randint(5))
print(np.random.randint(5))
print(np.random.randint(5, high=6))

random_arr = np.random.randint(4, high=14,size=(2, 2))
print(random_arr)

0
4
5
[[10  4]
 [12  5]]


In [113]:
np.random.randint(5, size=(2,2), high = 14)

array([[ 5,  9],
       [ 7, 11]])

#### Explanation
The np.random.randint function takes in a single required argument, which actually depends on the high keyword argument. If high=None (which is the default value), then the required argument represents the upper (exclusive) end of the range, with the lower end being 0. Specifically, if the required argument is n, then the random integer is chosen uniformly from the range [0, n).

If high is not None, then the required argument will represent the lower (inclusive) end of the range, while high represents the upper (exclusive) end.

The size keyword argument specifies the size of the output array, where each integer in the array is randomly drawn from the specified range. As a default, np.random.randint returns a single integer.
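
The inclusive lower bound and exclusive upper bound can be checked empirically by drawing many values. This is only a sanity-check sketch; the sample size of 1000 and the variable names are arbitrary choices.

```python
import numpy as np

# Draw 1000 integers from the range [4, 14).
samples = np.random.randint(4, high=14, size=1000)

print(samples.shape)        # (1000,)
print(samples.min() >= 4)   # True: the lower end is inclusive
print(samples.max() < 14)   # True: the upper end is exclusive

# With a single argument n, the value is drawn from [0, n).
single = np.random.randint(5)
print(0 <= single < 5)      # True
```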

# 10. Utility functions
Some fundamental utility functions from the np.random module are np.random.seed and np.random.shuffle. We use the np.random.seed function to set the random seed, which allows us to control the outputs of the pseudo-random functions. The function takes in a single integer as an argument, representing the random seed.

The code below uses np.random.seed with the same random seed. Note how the outputs of the random functions in each subsequent run are identical when we set the same random seed.

In [186]:
np.random.seed(1)
print(np.random.randint(10))
random_arr = np.random.randint(3, high=100,
                               size=(2, 2))
print(random_arr)

# New seed
np.random.seed(2)
print(np.random.randint(10))
random_arr = np.random.randint(3, high=100,
                               size=(2, 2))
print(random_arr)

# Original seed
np.random.seed(1)
print(np.random.randint(10))
random_arr = np.random.randint(3, high=100,
                               size=(2, 2))
print(random_arr)

5
[[15 75]
 [12 78]]
8
[[18 75]
 [25 46]]
5
[[15 75]
 [12 78]]


#### .shuffle method 

The np.random.shuffle function allows us to randomly shuffle an array. Note that the shuffling happens in place (i.e. no return value), and shuffling multi-dimensional arrays only shuffles the first dimension.

The code below shows example usages of np.random.shuffle on a 1-D array. For a matrix, only the rows would be shuffled (i.e. shuffling along the first dimension only).

In [194]:
vec = np.array([1, 2, 3, 4, 5])
print(f"Original = {vec}\n")

np.random.shuffle(vec)
print(f"Shuffle = {vec}\n")

np.random.shuffle(vec)
print(f"Again shuffle = {vec}")


Original = [1 2 3 4 5]

Shuffle = [5 3 1 4 2]

Again shuffle = [4 1 3 5 2]
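
For the multi-dimensional case, the following sketch shuffles a small matrix and checks that the rows were reordered as whole units, with the contents of each row left intact. The names matrix, rows_before, rows_after, and result are illustrative.

```python
import numpy as np

matrix = np.arange(12).reshape(4, 3)
rows_before = {tuple(row) for row in matrix.tolist()}

# Shuffling happens in place, so the function returns None.
result = np.random.shuffle(matrix)
rows_after = {tuple(row) for row in matrix.tolist()}

print(matrix)
print(result)                      # None
print(rows_before == rows_after)   # True: same rows, possibly in a new order
```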


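# 11. Array accessing
Accessing elements of a 1-D NumPy array works the same way as accessing a Python list. For multi-dimensional arrays, a single index returns a subarray, as with Python lists of lists; NumPy additionally accepts comma-separated indices such as arr[1, 1] to reach a single element.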

The code below shows example accesses of NumPy arrays.

In [195]:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr[0])
print(arr[4])

arr = np.array([[6, 3],
                [0, 2]])
# Single element at row 1, column 1
print(arr[1,1])

1
5
2
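
The difference between indexing a row and indexing a single element can be sketched as follows. The names row, element, and chained are illustrative.

```python
import numpy as np

arr = np.array([[6, 3],
                [0, 2]])

row = arr[1]          # one index selects a whole row (a subarray)
element = arr[1, 1]   # two indices select a single element
chained = arr[1][1]   # list-of-lists style indexing gives the same element

print(row.tolist())   # [0, 2]
print(element)        # 2
print(chained)        # 2
```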


# 12. Slicing
NumPy arrays also support slicing. Similar to Python, we use the colon operator (i.e. arr[:]) for slicing. We can also use negative indexing to slice in the backwards direction.

The code below shows example slices of a 1-D NumPy array.

In [29]:
arr = np.array([1, 2, 3, 4, 5])
print(repr(arr[:]))
print(repr(arr[1:]))
print(repr(arr[2:4]))
print(repr(arr[:-1]))
print(repr(arr[-2:]))

array([1, 2, 3, 4, 5])
array([2, 3, 4, 5])
array([3, 4])
array([1, 2, 3, 4])
array([4, 5])


For multi-dimensional arrays, we can use a comma to separate slices across each dimension.

The code below shows example slices of a 2-D NumPy array.

In [196]:
arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
print(arr[:])
print(arr[1:])
print(arr[:, -1])
print(arr[:, 1:])
print(arr[0:1, 1:])
print(arr[0, 1:])

[[1 2 3]
 [4 5 6]
 [7 8 9]]
[[4 5 6]
 [7 8 9]]
[3 6 9]
[[2 3]
 [5 6]
 [8 9]]
[[2 3]]
[2 3]
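
The last two outputs contain the same values but differ in shape: a slice such as 0:1 keeps the dimension, while a plain index such as 0 drops it. A short sketch, with illustrative names kept and dropped:

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

kept = arr[0:1, 1:]    # slicing the rows keeps a 2-D result
dropped = arr[0, 1:]   # indexing the row gives a 1-D result

print(kept.shape)      # (1, 2)
print(dropped.shape)   # (2,)
```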


# 13. Analysis
It is often useful to analyze data for its main characteristics and interesting trends. There are a few techniques in NumPy that allow us to quickly inspect data arrays.
For example, we can obtain minimum and maximum values of a NumPy array using its inherent min and max functions. This gives us an initial sense of the data's range, and can alert us to extreme outliers in the data.

The code below shows example usages of the min and max functions.

In [197]:
import numpy as np
arr = np.array([[0, 72, 3],
                [1, 3, -60],
                [-3, -2, 4]])
print(arr.min())
print(arr.max())

print(repr(arr.min(axis=0)))
print(repr(arr.max(axis=-1)))

-60
72
array([ -3,  -2, -60])
array([72,  3,  4])


In our example, we use axis=0 to find an array of the minimum values in each column of arr and axis=-1 (the last axis, which is equivalent to axis=1 for a 2-D array) to find an array of the maximum values in each row of arr.
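
With a non-square array the effect of the axis argument is easier to see, because the per-column and per-row results have different lengths. The array data below is a made-up example, not data from this notebook.

```python
import numpy as np

data = np.array([[1, 5, 3],
                 [4, 2, 6]])   # shape (2, 3)

col_min = data.min(axis=0)   # one value per column, length 3
row_max = data.max(axis=1)   # one value per row, length 2

print(col_min.tolist())   # [1, 2, 3]
print(row_max.tolist())   # [5, 6]
```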





# 14. Statistical metrics
NumPy also provides basic statistical functions such as np.mean, np.var, and np.median, to calculate the mean, variance, and median of the data, respectively.

The code below shows how to obtain basic statistics with NumPy. Note that np.median applied without axis takes the median of the flattened array.

In [32]:
arr = np.array([[0, 72, 3],
                [1, 3, -60],
                [-3, -2, 4]])
print(np.mean(arr))
print(np.var(arr))
print(np.median(arr))
print(repr(np.median(arr, axis=-1)))

2.0
977.3333333333334
1.0
array([ 3.,  1., -2.])


Each of these functions takes in the data array as a required argument and axis as a keyword argument.
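
As a small sketch of the axis keyword argument with these functions, the cell below computes per-column means and medians and per-row variances of a made-up 2-by-3 array. The array values and variable names are illustrative.

```python
import numpy as np

values = np.array([[1, 2, 3],
                   [4, 5, 6]])

col_means = np.mean(values, axis=0)       # mean of each column
row_vars = np.var(values, axis=1)         # variance of each row
col_medians = np.median(values, axis=0)   # median of each column

print(col_means.tolist())     # [2.5, 3.5, 4.5]
print(row_vars)               # both rows have variance 2/3
print(col_medians.tolist())   # [2.5, 3.5, 4.5]
```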