# Operations on NumPy Arrays

The learning objectives of this section are:

* Manipulate arrays
    * Reshape arrays
    * Stack arrays
* Perform operations on arrays
    * Perform basic mathematical operations
    * Apply built-in functions 
    * Apply your own functions 
    * Apply basic linear algebra operations 


### Manipulating Arrays

Let's look at some ways to manipulate arrays, such as changing their shape, combining them, and splitting them.

#### Reshaping Arrays

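Reshaping is done using the ```reshape()``` method. It returns an array with the same data arranged in a new shape, so the total number of elements must stay the same.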


In [1]:
import numpy as np

# Reshape a 1-D array to a 3 x 4 array
some_array = np.arange(0, 12).reshape(3, 4)
print(some_array)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


In [2]:
# Can reshape it further 
some_array.reshape(2, 6)

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11]])

In [3]:
# If you specify -1 as a dimension, the dimensions are automatically calculated
# -1 means "whatever dimension is needed" 
some_array.reshape(4, -1)

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
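As a small sketch of how `-1` is resolved (the variable names `values` and `inferred` are illustrative and not part of the original notebook): NumPy divides the total number of elements by the dimensions you did specify, and raises a `ValueError` when the sizes cannot match.

```python
import numpy as np

values = np.arange(12)            # 12 elements in total

# -1 is inferred as 12 / 4 = 3
inferred = values.reshape(4, -1)
print(inferred.shape)             # (4, 3)

# 12 elements cannot be arranged into 5 rows, so NumPy raises ValueError
try:
    values.reshape(5, -1)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("Cannot reshape:", err)
```

Only one dimension may be given as `-1`, since NumPy can infer at most one unknown size.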

```array.T``` returns the transpose of an array.

In [4]:
# Transposing an array
some_array.T

array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])

Problem 1

Given a 3x5 NumPy array arr, write a code snippet to extract a subarray that contains only the elements in the second and third rows and the last three columns. Print the resulting subarray.

In [8]:
import numpy as np

# Given array
arr = np.array([
    [10, 20, 30, 40, 50],
    [60, 70, 80, 90, 100],
    [110, 120, 130, 140, 150]
])

# Slicing to extract subarray with second and third rows and last three columns
subarray = arr[1:3, 2:]

# Print the resulting subarray
print("Subarray:\n", subarray)


Subarray:
 [[ 80  90 100]
 [130 140 150]]


Problem 2

Create a 1D NumPy array of 20 random integers between 1 and 100. Split this array into 4 equal-sized subarrays and print each subarray.

In [14]:
import numpy as np

# Create a 1D NumPy array of 20 random integers between 1 and 100
array = np.random.randint(1, 101, 20)

# Split the array into 4 equal-sized subarrays
split_arrays = np.array_split(array, 4)

# Print each subarray
print("Original array:", array)
print("Split arrays:")
for i, subarray in enumerate(split_arrays):
    print(f"Subarray {i+1}: {subarray}")


Original array: [ 27  86  25  20  54  69  96  57  49  91  83   5  91  67  19  10  44  88
  27 100]
Split arrays:
Subarray 1: [27 86 25 20 54]
Subarray 2: [69 96 57 49 91]
Subarray 3: [83  5 91 67 19]
Subarray 4: [ 10  44  88  27 100]
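The solution above uses `np.array_split()`. A minimal sketch of how it differs from `np.split()`, using an illustrative array `data` that is not part of the original exercise: `np.array_split()` accepts a length that does not divide evenly, while `np.split()` raises a `ValueError` in that case.

```python
import numpy as np

data = np.arange(10)

# array_split tolerates uneven division: the pieces have sizes 4, 3, 3
parts = np.array_split(data, 3)
print([len(p) for p in parts])    # [4, 3, 3]

# np.split insists on equal-sized pieces
try:
    np.split(data, 3)
    split_failed = False
except ValueError:
    split_failed = True
print("np.split failed on uneven division:", split_failed)

# 10 elements do divide evenly into 5 pieces of 2
equal_parts = np.split(data, 5)
print(equal_parts[0])             # [0 1]
```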


### Stacking and Splitting Arrays

#### Stacking: ```np.hstack()``` and ```np.vstack()```

Stacking is done using the ```np.hstack()``` and ```np.vstack()``` functions. For horizontal stacking, the arrays must have the same number of rows, while for vertical stacking, they must have the same number of columns.

In [5]:
# Creating two arrays
array_1 = np.arange(12).reshape(3, 4)
array_2 = np.arange(20).reshape(5, 4)

print(array_1)
print("\n")
print(array_2)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]]


In [6]:
# vstack
# Note that np.vstack(a, b) throws an error - you need to pass the arrays as a single tuple or list
np.vstack((array_1, array_2))

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])

Similarly, two arrays having the same number of rows can be horizontally stacked using ```np.hstack((a, b))```.
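A minimal sketch of horizontal stacking, using two illustrative arrays (`left` and `right` are names chosen here, not taken from the notebook) that share the same number of rows:

```python
import numpy as np

left = np.arange(6).reshape(3, 2)        # 3 rows, 2 columns
right = np.arange(6, 15).reshape(3, 3)   # 3 rows, 3 columns

# Columns of `right` are appended to the columns of `left`
combined = np.hstack((left, right))
print(combined)
print(combined.shape)                    # (3, 5)

# Arrays with a different number of rows cannot be stacked horizontally
try:
    np.hstack((left, np.arange(8).reshape(4, 2)))
    hstack_failed = False
except ValueError:
    hstack_failed = True
print("Mismatched rows rejected:", hstack_failed)
```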

Problem 3

Horizontally stack two arrays using hstack, and finally, vertically stack the resultant array with the third array.

Example:
Input 1:
[[1, 2],
 [5, 6]]

[[3, 4],
 [7, 8]]

[[9, 10, 11, 12]]
Output 1:
[[1 2 3 4]
 [5 6 7 8]
 [9 10 11 12]]

In [None]:
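# input_list is assumed to be supplied by the exercise environment;
# it holds the three nested lists shown in the example above
list_1 = input_list[0]
list_2 = input_list[1]
list_3 = input_list[2]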

# Import NumPy
import numpy as np
array1 = np.array(list_1)
array2 = np.array(list_2)
array3 = np.array(list_3)

# Step 1: Horizontally stack the first two arrays
horizontally_stacked = np.hstack((array1, array2))

# Step 2: Vertically stack the resultant array with the third array
final_result = np.vstack((horizontally_stacked, array3))

# Print the final stacked array
print(final_result)

### Perform Operations on Arrays

Performing mathematical operations on arrays is extremely simple. Let's see some common operations.


#### Basic Mathematical Operations

NumPy provides most of the basic math functions, such as exp, sin, cos, log and sqrt. Each function is applied element-wise, i.e. to every element of the array.


In [7]:
# Basic mathematical operations
a = np.arange(1, 20)

# sin, cos, exp, log
print(np.sin(a))
print(np.cos(a))
print(np.exp(a))
print(np.log(a))

[ 0.84147098  0.90929743  0.14112001 -0.7568025  -0.95892427 -0.2794155
  0.6569866   0.98935825  0.41211849 -0.54402111 -0.99999021 -0.53657292
  0.42016704  0.99060736  0.65028784 -0.28790332 -0.96139749 -0.75098725
  0.14987721]
[ 0.54030231 -0.41614684 -0.9899925  -0.65364362  0.28366219  0.96017029
  0.75390225 -0.14550003 -0.91113026 -0.83907153  0.0044257   0.84385396
  0.90744678  0.13673722 -0.75968791 -0.95765948 -0.27516334  0.66031671
  0.98870462]
[2.71828183e+00 7.38905610e+00 2.00855369e+01 5.45981500e+01
 1.48413159e+02 4.03428793e+02 1.09663316e+03 2.98095799e+03
 8.10308393e+03 2.20264658e+04 5.98741417e+04 1.62754791e+05
 4.42413392e+05 1.20260428e+06 3.26901737e+06 8.88611052e+06
 2.41549528e+07 6.56599691e+07 1.78482301e+08]
[0.         0.69314718 1.09861229 1.38629436 1.60943791 1.79175947
 1.94591015 2.07944154 2.19722458 2.30258509 2.39789527 2.48490665
 2.56494936 2.63905733 2.7080502  2.77258872 2.83321334 2.89037176
 2.94443898]
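The arithmetic operators work element-wise in the same way as the functions above. A short sketch with two small illustrative arrays (`x` and `y` are names chosen for this example):

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([10, 20, 30, 40])

# Each operator is applied to matching pairs of elements
print(x + y)     # [11 22 33 44]
print(x * y)     # [ 10  40  90 160]
print(y / x)     # [10. 10. 10. 10.]

# A scalar is applied to every element
print(x ** 2)    # [ 1  4  9 16]
print(np.sqrt(x ** 2))   # [1. 2. 3. 4.]
```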


#### Apply User Defined Functions

You can also apply your own functions to arrays, for example the function ```x/(x+1)``` to each element of an array.

One way to do that is to loop through the array, which is the non-NumPy way. It is usually better to write **vectorised code**.

A simple way to do that is to wrap the function you want with ```np.vectorize()``` and then apply it to the array. Note that ```np.vectorize()``` is provided mainly for convenience: internally it still loops over the elements, so it does not give the speed of NumPy's built-in array operations.

Let's look at both ways of doing it.

In [8]:
print(a)

[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]


In [9]:
# The non-numpy way, not recommended
a_list = [x/(x+1) for x in a]
print(a_list)

[0.5, 0.6666666666666666, 0.75, 0.8, 0.8333333333333334, 0.8571428571428571, 0.875, 0.8888888888888888, 0.9, 0.9090909090909091, 0.9166666666666666, 0.9230769230769231, 0.9285714285714286, 0.9333333333333333, 0.9375, 0.9411764705882353, 0.9444444444444444, 0.9473684210526315, 0.95]


In [10]:
# The numpy way: vectorize the function, then apply it
f = np.vectorize(lambda x: x/(x+1))
f(a)

array([0.5       , 0.66666667, 0.75      , 0.8       , 0.83333333,
       0.85714286, 0.875     , 0.88888889, 0.9       , 0.90909091,
       0.91666667, 0.92307692, 0.92857143, 0.93333333, 0.9375    ,
       0.94117647, 0.94444444, 0.94736842, 0.95      ])

In [11]:
# Apply the same function to another 1-d array (10 evenly spaced values): applied to each element
b = np.linspace(1, 100, 10)
f(b)

array([0.5       , 0.92307692, 0.95833333, 0.97142857, 0.97826087,
       0.98245614, 0.98529412, 0.98734177, 0.98888889, 0.99009901])

This also has the advantage that you can vectorize the function once, and then apply it as many times as needed. 
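When a function consists only of arithmetic that NumPy already supports, it can also be written directly as an array expression. The sketch below recreates the array `a` and the vectorised function `f` from the cells above and checks that the direct expression gives the same values; the name `direct` is introduced here for illustration.

```python
import numpy as np

a = np.arange(1, 20)

# The np.vectorize version from the cells above
f = np.vectorize(lambda x: x / (x + 1))

# The same computation written as a plain array expression
direct = a / (a + 1)

print(np.allclose(f(a), direct))   # True
print(direct[:3])
```

The direct expression runs in NumPy's compiled loops, so it is generally the faster choice when it is available; `np.vectorize()` remains useful for functions that cannot be expressed with array operations.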

Problem 4

Given an array, 'array_3', divide each element by 5.
Hint: Create a vectorized function, then apply it to array_3.

In [None]:
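# input_list is assumed to be supplied by the exercise environment
list_1 = input_list[0:2]
list_2 = input_list[2:4]
import numpy as np
array_1 = np.array(list_1)
array_2 = np.array(list_2)
array_3 = np.hstack((array_1, array_2))

# Vectorise a function that divides its argument by 5
function = np.vectorize(lambda x: x / 5)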

print(function(array_3))

#### Apply Basic Linear Algebra Operations

NumPy provides the ```np.linalg``` package to apply common linear algebra operations, such as:
* ```np.linalg.inv```: Inverse of a matrix
* ```np.linalg.det```: Determinant of a matrix
* ```np.linalg.eig```: Eigenvalues and eigenvectors of a matrix
    
Also, you can multiply matrices using ```np.dot(a, b)```.


In [12]:
# np.linalg documentation
help(np.linalg)

Help on package numpy.linalg in numpy:

NAME
    numpy.linalg

DESCRIPTION
    Core Linear Algebra Tools
    -------------------------
    Linear algebra basics:
    
    - norm            Vector or matrix norm
    - inv             Inverse of a square matrix
    - solve           Solve a linear system of equations
    - det             Determinant of a square matrix
    - lstsq           Solve linear least-squares problem
    - pinv            Pseudo-inverse (Moore-Penrose) calculated using a singular
                      value decomposition
    - matrix_power    Integer power of a square matrix
    
    Eigenvalues and decompositions:
    
    - eig             Eigenvalues and vectors of a square matrix
    - eigh            Eigenvalues and eigenvectors of a Hermitian matrix
    - eigvals         Eigenvalues of a square matrix
    - eigvalsh        Eigenvalues of a Hermitian matrix
    - qr              QR decomposition of a matrix
    - svd             Singular value decomposition 

In [13]:
# Creating arrays
a = np.arange(1, 10).reshape(3, 3)
b = np.arange(1, 13).reshape(3, 4)
print(a)
print(b)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [14]:
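# Inverse
# Note: this particular matrix is singular (its determinant is 0 up to rounding error),
# so the huge values returned below are a numerical artefact, not a true inverse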
np.linalg.inv(a)

array([[ 3.15251974e+15, -6.30503948e+15,  3.15251974e+15],
       [-6.30503948e+15,  1.26100790e+16, -6.30503948e+15],
       [ 3.15251974e+15, -6.30503948e+15,  3.15251974e+15]])

In [15]:
# Determinant (the tiny value below is 0 up to floating-point rounding)
np.linalg.det(a)

-9.51619735392994e-16
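The determinant above is zero up to rounding, which means the 3 x 3 matrix `a` has no true inverse. As a hedged sketch of what a well-behaved case looks like, the example below uses a small invertible matrix `m` (an illustrative matrix, not from the notebook) and verifies the inverse by multiplying it back.

```python
import numpy as np

# An invertible 2 x 2 matrix chosen for illustration
m = np.array([[2.0, 1.0],
              [1.0, 3.0]])

det_m = np.linalg.det(m)           # 2*3 - 1*1 = 5
m_inv = np.linalg.inv(m)

# A valid inverse satisfies m @ m_inv == identity (up to rounding)
print(np.isclose(det_m, 5.0))                  # True
print(np.allclose(m @ m_inv, np.eye(2)))       # True

# The 3 x 3 matrix from the notebook has rank 2, i.e. it is singular
singular = np.arange(1, 10).reshape(3, 3)
print(np.linalg.matrix_rank(singular))         # 2
```

Checking the rank or the determinant before calling `np.linalg.inv()` is one way to avoid interpreting numerical noise as a real inverse.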

In [16]:
# Eigenvalues and eigenvectors
np.linalg.eig(a)

(array([ 1.61168440e+01, -1.11684397e+00, -9.75918483e-16]),
 array([[-0.23197069, -0.78583024,  0.40824829],
        [-0.52532209, -0.08675134, -0.81649658],
        [-0.8186735 ,  0.61232756,  0.40824829]]))

In [17]:
# Multiply matrices
np.dot(a, b)

array([[ 38,  44,  50,  56],
       [ 83,  98, 113, 128],
       [128, 152, 176, 200]])
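For 2-D arrays, the `@` operator gives the same result as `np.dot()`. The sketch below recreates `a` and `b` from the cells above; the name `product` is introduced here for illustration.

```python
import numpy as np

a = np.arange(1, 10).reshape(3, 3)
b = np.arange(1, 13).reshape(3, 4)

# (3, 3) @ (3, 4) -> (3, 4): inner dimensions must match
product = a @ b
print(product.shape)                          # (3, 4)
print(np.array_equal(product, np.dot(a, b)))  # True

# The reverse order fails because (3, 4) @ (3, 3) has mismatched inner dimensions
try:
    b @ a
    reverse_failed = False
except ValueError:
    reverse_failed = True
print("b @ a rejected:", reverse_failed)
```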

## Sorting and Searching in NumPy


In [17]:
import numpy as np

# 1D array example
array_1d = np.array([5, 3, 8, 1, 2])

# Sort the 1D array
sorted_array_1d = np.sort(array_1d)

print("Original 1D array:", array_1d)
print("Sorted 1D array:", sorted_array_1d)


Original 1D array: [5 3 8 1 2]
Sorted 1D array: [1 2 3 5 8]


In [19]:
import numpy as np

# 2D array example
array_2d = np.array([[8, 4, 2], [3, 6, 1], [9, 7, 5]])

# Sort the 2D array along the last axis (default behavior, sorts each row)
sorted_array_2d_row = np.sort(array_2d)

# Sort the 2D array along axis 0 (sorts each column)
sorted_array_2d_col = np.sort(array_2d, axis=0)

print("Original 2D array:\n", array_2d)
print("\nSorted 2D array (sort rows):\n", sorted_array_2d_row)
print("\nSorted 2D array (sort columns):\n", sorted_array_2d_col)


Original 2D array:
 [[8 4 2]
 [3 6 1]
 [9 7 5]]

Sorted 2D array (sort rows):
 [[2 4 8]
 [1 3 6]
 [5 7 9]]

Sorted 2D array (sort columns):
 [[3 4 1]
 [8 6 2]
 [9 7 5]]
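When you need the positions of the sorted elements rather than the sorted values, `np.argsort()` returns the indices that would sort the array. A short sketch reusing the 1-D example from above (`order` and `descending` are illustrative names):

```python
import numpy as np

array_1d = np.array([5, 3, 8, 1, 2])

# Indices that would sort the array in ascending order
order = np.argsort(array_1d)
print(order)               # [3 4 1 0 2]

# Indexing with those positions reproduces np.sort
print(array_1d[order])     # [1 2 3 5 8]

# One simple way to get descending order: reverse the sorted result
descending = np.sort(array_1d)[::-1]
print(descending)          # [8 5 3 2 1]
```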


In [21]:
import numpy as np

# Create a NumPy array
array = np.array([10, 20, 30, 40, 50, 60])

# Use np.where() to find indices of elements greater than 30
indices = np.where(array > 30)

print("Original array:", array)
print("Indices of elements greater than 30:", indices)
print("Elements greater than 30:", array[indices])


Original array: [10 20 30 40 50 60]
Indices of elements greater than 30: (array([3, 4, 5], dtype=int64),)
Elements greater than 30: [40 50 60]
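Two related forms may be useful here, sketched with the same array as above (`mask` and `capped` are names chosen for this example): indexing with the boolean condition directly, and the three-argument form of `np.where()`, which picks between two values element by element.

```python
import numpy as np

array = np.array([10, 20, 30, 40, 50, 60])

# Boolean mask: True where the condition holds
mask = array > 30
print(array[mask])         # [40 50 60]

# np.where(condition, x, y): take x where the condition is True, otherwise y
capped = np.where(array > 30, 30, array)
print(capped)              # [10 20 30 30 30 30]
```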


In [23]:
import numpy as np

# Create a sorted NumPy array
sorted_array = np.array([10, 20, 30, 40, 50, 60])

# Use np.searchsorted() to find the insertion index for the value 35
insert_index = np.searchsorted(sorted_array, 35)

print("Sorted array:", sorted_array)
print("Index where 35 should be inserted to maintain order:", insert_index)


Sorted array: [10 20 30 40 50 60]
Index where 35 should be inserted to maintain order: 3
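When the value being searched for already exists in the array, the `side` parameter of `np.searchsorted()` decides whether the insertion point falls before or after the existing entries. A minimal sketch with the same sorted array (the names `left_index`, `right_index` and `many` are illustrative):

```python
import numpy as np

sorted_array = np.array([10, 20, 30, 40, 50, 60])

# Default side="left": insert before the existing 30
left_index = np.searchsorted(sorted_array, 30)
# side="right": insert after the existing 30
right_index = np.searchsorted(sorted_array, 30, side="right")
print(left_index, right_index)     # 2 3

# Several values can be looked up in one call
many = np.searchsorted(sorted_array, [5, 35, 65])
print(many)                        # [0 3 6]
```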


### Compare Computation Times in NumPy and Standard Python Lists

We mentioned that the key advantages of numpy are convenience and speed of computation. 

You'll often work with extremely large datasets, so it is important to understand how much computation time (and memory) you can save by using NumPy instead of standard Python lists.

Let's compare the computation times of arrays and lists for a simple task of calculating the element-wise product of numbers. 

In [4]:
## Comparing time taken for computation
import numpy as np

list_1 = [i for i in range(1000000)]
list_2 = [j**2 for j in range(1000000)]

# list multiplication
import time

# store start time, time after computation, and take the difference
t0 = time.time()
product_list = list(map(lambda x, y: x*y, list_1, list_2))
t1 = time.time()
list_time = t1 - t0 
print(t1-t0)


# numpy array 
array_1 = np.array(list_1)
array_2 = np.array(list_2)

t0 = time.time()
array_3 = array_1*array_2
t1 = time.time()
numpy_time = t1 - t0

print(t1-t0)

print("The ratio of time taken is {}".format(list_time/numpy_time))

0.17412137985229492
0.006808757781982422
The ratio of time taken is 25.573149380208697
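A single `time.time()` measurement can vary noticeably from run to run. As one possible alternative, the sketch below uses the standard-library `timeit` module to repeat each measurement and keep the best time. The array size `n` and the helper names `multiply_lists` and `multiply_arrays` are choices made for this example, and the timings depend on the machine, so no expected ratio is shown.

```python
import timeit
import numpy as np

n = 100_000                                   # smaller than above to keep the run short
list_1 = list(range(n))
list_2 = [j ** 2 for j in range(n)]
array_1 = np.array(list_1, dtype=np.int64)
array_2 = np.array(list_2, dtype=np.int64)

def multiply_lists():
    # Element-wise product using plain Python lists
    return [x * y for x, y in zip(list_1, list_2)]

def multiply_arrays():
    # Element-wise product using NumPy arrays
    return array_1 * array_2

# Run each function 5 times per measurement, repeat 3 times, keep the best
list_time = min(timeit.repeat(multiply_lists, number=5, repeat=3))
numpy_time = min(timeit.repeat(multiply_arrays, number=5, repeat=3))

print(f"list:  {list_time:.4f} s")
print(f"numpy: {numpy_time:.4f} s")
print(f"ratio: {list_time / numpy_time:.1f}")
```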
