## Introduction



The NumPy library, short for "Numerical Python," is a Python package designed for numerical computations. It provides efficient tools to work with arrays, which are collections of elements (usually numbers) organized in one or more dimensions. NumPy arrays are much faster and more memory-efficient than native Python lists, making them well suited to scientific and engineering applications. Additionally, NumPy offers a wide range of mathematical functions, including linear algebra, statistics, and interpolation, which help solve complex problems with concise and optimized code.

By the end of this module you will be able to:

1.  Know about basic NumPy data creation and manipulation techniques
2.  Understand the Concept of Linear Interpolation
    -   Clearly explain what linear interpolation is and when it is used.
    -   Write and interpret the mathematical equation for linear interpolation.
3.  Recognize the Role of NumPy in Interpolation
    -   Identify NumPy as a powerful library for numerical computations, specifically as a tool for performing interpolation.
4.  Use `np.interp` to Perform Linear Interpolation in Python
    -   Understand the syntax and arguments of the `np.interp` function and apply it effectively to estimate intermediate values.
    -   Differentiate between known data points and new data points where interpolation is applied.
5.  Apply Interpolation to Practical Problems
    -   Use linear interpolation to fill in missing data or estimate values in real-world datasets.
    -   Recognize interpolation as part of the data processing pipeline, including its limitations (e.g., assuming a linear relationship in cases where the underlying function may not be linear).



### Vectors and such



A vector can be thought of as a list of numbers, but all its elements must be of the same type, such as integers, floats, or booleans. In mathematics, vectors are used to represent points in space, whether in two dimensions, three dimensions, or beyond. Linear algebra uses vector notation to express geometric relationships and systems of linear equations compactly, and it offers numerous operations on vector and matrix data, such as the dot product, cross product, matrix inversion, and eigenvector computation. Vector arithmetic forms the foundation of most computational techniques and plays a crucial role in nearly all artificial intelligence work. The meteoric rise of Nvidia owes much to the fact that GPUs are particularly well suited to vector-based computations. Let's take a look at a simple 2-D vector addition:

Let $\mathbf{a} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix} 4 \\ 1 \end{bmatrix}$. 

The addition of these two vectors is given by:

$$
\mathbf{c} = \mathbf{a} + \mathbf{b} = \begin{bmatrix} 2 \\ 3 \end{bmatrix} + \begin{bmatrix} 4 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 + 4 \\ 3 + 1 \end{bmatrix} = \begin{bmatrix} 6 \\ 4 \end{bmatrix}
$$

You will notice that the vectors are added element-wise. If you try this with Python lists, it will not work: the `+` operator concatenates the two lists instead of adding their elements.



In [1]:
a = [2, 3]
b = [4, 1]
c = a + b  # for lists, + concatenates instead of adding element-wise
print(c)
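
For comparison, here is a minimal sketch of the same addition with NumPy arrays, using the vectors from the equation above. The variable names simply mirror the list example.

```python
import numpy as np

a = np.array([2, 3])
b = np.array([4, 1])
c = a + b  # element-wise addition, matching the equation above
print(c.tolist())  # [6, 4]
```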

When we use NumPy data types, Python becomes aware of the rules of linear algebra. The ability of NumPy arrays to handle element-wise operations is crucial for linear algebra, and it often eliminates the need to write loops. So let's take a closer look at how to create and manipulate NumPy data.

#### Creating arrays and matrices



In [1]:
import numpy as np

v = np.array([1, 2])  # creating an array from a list
z = np.zeros(3)  # creating an array of zeros
o = np.ones(4)  # creating an array of ones
print(f"v = {v}")
print(f"z = {z}")
print(f"o = {o}")

In [1]:
import numpy as np

# creating a matrix from a list of lists
M = np.array([[1, 2], [3, 4], [5, 6]])
print(M)

#### Creating an array with a specified datatype



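NumPy chooses the data type of a new array for you: `np.zeros` and `np.ones` default to floating-point numbers, while `np.array` infers the type from the values you pass in. You can override this choice with the `dtype` keyword, for example to create arrays of integers or boolean values. You encountered the `dtype` keyword before with pandas.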



In [1]:
import numpy as np

i = np.ones(3, dtype=np.int64)  # dtype = data type
b = np.ones(3, dtype=bool)  # common choices: int, float, bool, and object
print(f"i = {i}")
print(f"b = {b}")
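
To see which data type NumPy picked, you can inspect the `dtype` attribute of an array. The short sketch below uses illustrative variable names; the exact integer type printed for `from_ints` depends on your platform.

```python
import numpy as np

from_ints = np.array([1, 2])      # integer input gives an integer dtype
from_floats = np.array([1.0, 2])  # a single float makes the whole array float64
zeros = np.zeros(3)               # np.zeros and np.ones default to float64

print(from_ints.dtype)    # platform-dependent integer type, commonly int64
print(from_floats.dtype)  # float64
print(zeros.dtype)        # float64
```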

#### Manipulating array structure



Switching rows and columns (Transposing) 



In [1]:
import numpy as np

# creating a matrix from a list of lists
M = np.array([[1, 2], [3, 4], [5, 6]])
print(M)
print()
print(M.T)

We can also change the geometry of an existing array using the `reshape` method 



In [1]:
import numpy as np

# creating a matrix from a list of lists
M = np.array([[1, 2], [3, 4], [5, 6]])
M_column = M.reshape(6,1)
M_row =  M.reshape(1,6)
M_flat = M.reshape(6)
print(f"M_flat = {M_flat}")
print(f"M_row = {M_row}")
print(f"M_column = {M_column}")

You can query the shape of an array through its `shape` attribute:



In [1]:
print(M_column.shape)
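
One detail worth knowing is that `reshape` only rearranges existing elements, so the new shape must hold exactly as many elements as the original. The sketch below illustrates this with the same 3 by 2 matrix; the name `wide` is just an example.

```python
import numpy as np

M = np.array([[1, 2], [3, 4], [5, 6]])
print(M.shape)  # (3, 2)
print(M.size)   # 6 elements in total

wide = M.reshape(2, 3)  # works: 2 * 3 == 6
print(wide.shape)       # (2, 3)

try:
    M.reshape(4, 2)     # fails: 4 * 2 != 6
except ValueError as err:
    print(f"reshape failed: {err}")
```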

#### Accessing array elements



Array elements are accessed with the regular indexing and slicing syntax, using zero-based `[row, column]` indices:



In [1]:
import numpy as np

# creating a matrix from a list of lists
M = np.array([[1, 2], [3, 4], [5, 6]])
print(M)
print(f"M[0,1] = {M[0,1]}") # [row, column], zero-based: first row, second column
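
Slices work per axis, so whole rows, columns, or sub-blocks can be selected in one step. A small sketch with the same matrix (the variable names are illustrative):

```python
import numpy as np

M = np.array([[1, 2], [3, 4], [5, 6]])

first_row = M[0, :]        # all columns of the first row
second_column = M[:, 1]    # all rows of the second column
last_two_rows = M[1:, :]   # rows from index 1 to the end

print(first_row.tolist())      # [1, 2]
print(second_column.tolist())  # [2, 4, 6]
print(last_two_rows.tolist())  # [[3, 4], [5, 6]]
```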

#### NumPy and type hints



NumPy supports type hinting through the `numpy.typing` module. It is also possible to reflect the shape of the array in the type hint, but this becomes quite complex and is beyond the scope of this course.



In [1]:
import numpy as np
import numpy.typing as npt

NDArrayInt = npt.NDArray[np.int64] # declare new type hint
M: NDArrayInt = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.int64)
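
Type hints are most useful in function signatures. The following is a minimal sketch; the alias `FloatArray` and the function `scale` are hypothetical names chosen for illustration and are not part of NumPy.

```python
import numpy as np
import numpy.typing as npt

FloatArray = npt.NDArray[np.float64]  # hypothetical alias for float arrays


def scale(v: FloatArray, factor: float) -> FloatArray:
    """Return a copy of v with every element multiplied by factor."""
    return v * factor


u = np.array([1.0, 5.0])
print(scale(u, 2.0).tolist())  # [2.0, 10.0]
```

Note that Python does not enforce these hints at runtime; they are read by editors and static type checkers.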

#### Vector operations



Since NumPy knows about the mathematical meaning of a vector, vector math becomes straightforward:



In [1]:
import numpy as np

u = np.array([1,5])
v = np.array([-2, 0.5])

print(f"u * 2 = {u * 2}")
print(f"u + v = {u + v}")
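
The dot product mentioned earlier is available as well. The sketch below reuses `u` and `v` and contrasts element-wise multiplication with `np.dot`; it also computes the length of `u` with `np.linalg.norm`.

```python
import numpy as np

u = np.array([1, 5])
v = np.array([-2, 0.5])

elementwise = u * v           # multiplies matching elements: [-2.0, 2.5]
dot = np.dot(u, v)            # sums those products: 1*(-2) + 5*0.5 = 0.5
length_u = np.linalg.norm(u)  # Euclidean length: sqrt(1**2 + 5**2)

print(f"u * v = {elementwise}")
print(f"u . v = {dot}")
print(f"|u| = {length_u:.3f}")
```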

#### Plotting vectors as arrows



2-D vector math is easily visualized with matplotlib's `ax.arrow` method. In the code below I use the `*` and `**` operators to keep the code compact and readable. You may recall that Python automatically unpacks list-type objects when we write `a, b = [1, 3]`. There are, however, cases where this does not happen automatically, e.g., if you pass a list to a function. We can use the `*` operator to manually unpack list-type data, and the `**` operator to unpack dictionaries into keyword arguments. This way we do not have to repeat lengthy default options each time we plot an arrow.



In [1]:
import numpy as np
import matplotlib.pyplot as plt

# ---- define vectors ---------------- #
origin = (0, 0)  # let's assume that all vectors start at 0,0
v = np.array([1,2])
u = np.array([1,-3])

# ---- vector math ------------------- #

# ---- set default arrow options ----- #
arrow_options = {"length_includes_head":True,
                 "width":0.01,
                 "head_width":0.1,
                 }

# ---- plot vectors ------------------ #
fig, ax = plt.subplots()
ax.arrow(*origin, *v,color="C0",label="v",**arrow_options)
ax.arrow(*origin, *u,color="C1",label="u",**arrow_options)
ax.legend()
plt.show()
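
The two unpacking operators can also be tried in isolation, without any plotting. In this sketch, `describe` is a hypothetical helper function written only to show what `*` and `**` pass along.

```python
def describe(x, y, color="black", width=1.0):
    """Hypothetical helper: report the arguments it received."""
    return f"point=({x}, {y}), color={color}, width={width}"


point = [1, 3]
options = {"color": "C0", "width": 0.01}

# *point fills the positional arguments x and y,
# **options fills the keyword arguments color and width
print(describe(*point, **options))

# the call above is equivalent to writing everything out by hand
print(describe(1, 3, color="C0", width=0.01))
```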

#### Rotating a vector with a rotation matrix



Vector addition allows us to translate vectors in space. If we want to rotate a vector, we can use a rotation matrix like this:

\begin{equation}
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
\end{equation}

Let's do this with Python:



In [1]:
import numpy as np
from math import sin, cos, radians
import matplotlib.pyplot as plt

origin = (0, 0)  # all vectors start at 0,0
u = np.array([1, 0])  # set vector
phi = radians(30) # sin/cos expect radians = deg * pi/180
M = [[cos(phi), -sin(phi)],
     [sin(phi), cos(phi)],
     ]

v = np.matmul(M, u) # Matrix first, vector last!

# ---- set default arrow options ----- #
arrow_options = {"length_includes_head":True,
                 "width":0.01,
                 "head_width":0.1,
                 }

# ---- plot vectors ------------------ #
fig, ax = plt.subplots()
ax.arrow(*origin, *u, color="C0", label="u", **arrow_options)  # original vector
ax.arrow(*origin, *v, color="C1", label="v", **arrow_options)  # rotated vector
ax.legend()
plt.show()
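
A quick numerical check can complement the plot. The sketch below rebuilds the same 30 degree rotation with NumPy functions (the name `R` is illustrative) and confirms two properties one would expect: the rotated vector is $(\cos\theta, \sin\theta)$, and its length is unchanged.

```python
import numpy as np

phi = np.radians(30)
R = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])

u = np.array([1, 0])
v = R @ u  # the @ operator is shorthand for np.matmul

print(np.round(v, 3))                        # rotated components
print(np.linalg.norm(u), np.linalg.norm(v))  # lengths before and after
```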

We can extend this notation for 3 dimensions and add the translation amount into the matrix as well,

\begin{equation}
\begin{bmatrix} 
x' \\ y' \\ z' \\ 1 
\end{bmatrix} = \begin{bmatrix} 
\cos(\theta) & -\sin(\theta) & 0 & t_x \\ 
\sin(\theta) & \cos(\theta) & 0 & t_y \\ 
0 & 0 & 1 & t_z \\ 
0 & 0 & 0 & 1 
\end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\end{equation}

and what you end up with is the foundation of much of computer graphics (mapping projections etc.). And before you ask: the additional 1 is needed because the translation occupies a fourth column in the matrix. To multiply a vector with this matrix, the vector needs a fourth element, and setting that element to 1 means the translation terms $t_x$, $t_y$, and $t_z$ are added unchanged. The extra bottom row $(0, 0, 0, 1)$ makes the matrix 4 by 4 and carries that 1 through to the result. This notation is called homogeneous coordinates.
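
As a rough illustration of the 4 by 4 matrix above, the sketch below rotates a point by 90 degrees around the z-axis and translates it in the same multiplication. The angle, the translation values, and the names `T`, `p`, and `p_new` are arbitrary choices for this example.

```python
import numpy as np

theta = np.radians(90)     # example rotation angle
tx, ty, tz = 1.0, 2.0, 3.0  # example translation

T = np.array([
    [np.cos(theta), -np.sin(theta), 0, tx],
    [np.sin(theta),  np.cos(theta), 0, ty],
    [0,              0,             1, tz],
    [0,              0,             0, 1],
])

p = np.array([1, 0, 0, 1])  # the point (1, 0, 0) with a 1 appended
p_new = T @ p               # rotate, then translate, in one step

print(np.round(p_new, 3))   # rotated to (0, 1, 0), then shifted to (1, 3, 3)
```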



### Linear Interpolation



NumPy offers a wide array of powerful functions for various mathematical computations. One notable example is its interpolation function. Linear interpolation is a technique used to estimate values within the range of a set of known data points. This method is particularly useful when dealing with discrete data and needing to derive a smooth or intermediate value between two observed points. Essentially, linear interpolation assumes that the change between these two points is linear, allowing it to be approximated by a straight line.

Mathematically, if we have two known points $(x_1, y_1)$ and $(x_2, y_2)$, and we want to estimate the value of $y$ for an intermediate value $x$, we use the following formula for linear interpolation:

$$
y = y_1 + \frac{(x - x_1)}{(x_2 - x_1)} \cdot (y_2 - y_1)
$$

Here:

-   $x_1$ and $x_2$ are the known $x$-values (data points),
-   $y_1$ and $y_2$ are the corresponding $y$-values (data points),
-   $x$ is the value where we want to estimate $y$.

This formula essentially calculates how far $x$ is between $x_1$ and $x_2$ (the weight or fraction) and applies the same proportion to the $y$-values. Linear interpolation is widely used in cases where a quick and straightforward method is needed to fill gaps or generate smoothed transitions in data.
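
The formula can be written out directly in a few lines. In the sketch below, `lerp` is a hypothetical helper (not a NumPy function) that follows the equation term by term, and its result is compared with `np.interp` for the two points $(2, 4)$ and $(3, 6)$.

```python
import numpy as np


def lerp(x, x1, y1, x2, y2):
    """Hypothetical helper implementing the interpolation formula above."""
    fraction = (x - x1) / (x2 - x1)   # how far x lies between x1 and x2
    return y1 + fraction * (y2 - y1)  # apply the same fraction to y


y_manual = lerp(2.5, 2, 4, 3, 6)          # 4 + 0.5 * (6 - 4) = 5.0
y_numpy = np.interp(2.5, [2, 3], [4, 6])  # same estimate from NumPy

print(y_manual, y_numpy)
```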



#### A practical example



Below, we demonstrate how to use NumPy's `np.interp` function for linear interpolation. We'll create a simple dataset of known $x$-values and their corresponding $y$-values, and then estimate the $y$-values for a set of new $x$-values not present in the original data. We'll also visualize the results using Matplotlib to show how interpolation works.



In [1]:
import numpy as np
import matplotlib.pyplot as plt

# Known data points
x_known = np.array([1, 2, 3, 4, 5])  # x-values
y_known = np.array([2, 4, 6, 8, 10])  # y-values

# New x-values (where we want to estimate y-values)
x_new = np.array([1.5, 2.5, 3.5, 4.5])

# Perform linear interpolation using np.interp
y_new = np.interp(x_new, x_known, y_known)

# Plot the results
plt.figure(figsize=(8, 5))

# Plot the known data points
plt.scatter(x_known, y_known, color='blue', label='Known Points', zorder=3)
plt.plot(x_known, y_known, color='blue', linestyle='--', alpha=0.6, label='Known Line')

# Plot the interpolated points
plt.scatter(x_new, y_new, color='red', label='Interpolated Points', zorder=3)
# Add titles and labels
plt.xlabel("x")
plt.ylabel("y")
plt.title("Linear interpolation with np.interp")
plt.legend()
plt.show()

#### Explanation of the Code



1.  Input Data: We provide `x_known` and `y_known` arrays that define the known data points.
2.  New Points: The `x_new` array contains the $x$-values where we want to estimate $y$-values.
3.  Linear Interpolation: The `np.interp` function takes three arguments:
    -   The new $x$-values (`x_new`) where $y$-values need to be interpolated.
    -   The original $x$-values (`x_known`).
    -   The original $y$-values (`y_known`).
4.  Output: It returns the interpolated $y$-values (`y_new`) corresponding to the `x_new` values.
5.  Visualization: Known data points are visualized with blue dots, and interpolated points are shown in red. The dashed blue line offers visual guidance for the known trend.
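
One behaviour to keep in mind is that `np.interp` does not extrapolate. For $x$-values outside the known range it returns the first or last known $y$-value by default, which can hide the fact that a query was out of range. The optional `left` and `right` arguments let you choose a different fill value. The sketch below reuses the known data from above; `x_outside`, `clamped`, and `flagged` are illustrative names.

```python
import numpy as np

x_known = np.array([1, 2, 3, 4, 5])
y_known = np.array([2, 4, 6, 8, 10])
x_outside = np.array([0.0, 6.0])  # both values lie outside 1..5

# default: out-of-range queries get the nearest end value
clamped = np.interp(x_outside, x_known, y_known)

# alternative: mark out-of-range queries as NaN so they are easy to spot
flagged = np.interp(x_outside, x_known, y_known, left=np.nan, right=np.nan)

print(clamped.tolist())  # [2.0, 10.0]
print(flagged)
```

Also note that `np.interp` expects the known $x$-values to be in increasing order.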



#### Sample Outcome



The figure will show:

-   The known points as blue circles.
-   The interpolated points as red dots at fractional positions (e.g., $x = 1.5, y = 3$).
-   A dashed blue line connecting the data, illustrating the linear relationships.
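
The same function can be used to fill gaps in a dataset. The sketch below assumes a small made-up series in which missing readings are stored as NaN; the arrays `t` and `temp` are hypothetical data invented for this example. The idea is to interpolate only at the positions that are missing, using the valid points as the known data.

```python
import numpy as np

# hypothetical measurements: time stamps and readings with two gaps
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
temp = np.array([10.0, np.nan, 14.0, np.nan, 20.0])

missing = np.isnan(temp)  # boolean mask of the gaps
filled = temp.copy()      # leave the original data untouched
filled[missing] = np.interp(t[missing], t[~missing], temp[~missing])

print(filled.tolist())  # [10.0, 12.0, 14.0, 17.0, 20.0]
```

Keep in mind that the filled values are only as good as the assumption that the data change linearly between neighbouring points.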

