# 0. Motivation

Engineering and science frequently require quantifying accumulated quantities such as areas, volumes, and energy. Numerical integration provides a practical way to compute these accumulations, especially when an analytical solution is difficult to obtain, or when the underlying function is not known explicitly and only a set of discrete data points is available.

By the end of this section, you should be able to:
- Recognize scenarios where numerical integration is essential
- Understand the practical significance of numerical integration
- Apply different numerical integration techniques
- Estimate the accuracy of these numerical approximations
- Implement these methods in Python for real-world applications

# 1. Integration

Integration, in mathematical terms, involves finding the accumulated quantity or area under a curve. The definite integral of a function $f(x)$ over an interval $[a, b]$ is denoted as:

$$
\int_{a}^{b} f(x) \, dx
$$

This represents the signed area under the curve of $f(x)$ between $a$ and $b$. While analytical methods exist, numerical integration becomes indispensable when analytical solutions are impractical or unavailable. We can split the interval $[a, b]$ into $n$ subintervals of equal width $h = (b - a)/n$ and approximate the area over each subinterval by a rectangle. As the number of subintervals $n$ increases (or equivalently, the spacing $h$ decreases), the sum of the areas of the rectangles converges to the integral of the function over $[a, b]$, where $x_i = a + ih$:

$$\int_{a}^{b} f(x) \, dx = \lim_{h\rightarrow 0}\sum_{i=0}^{n-1}{f(x_i)}{h}$$

<br>

<figure>
  <img src="https://upload.wikimedia.org/wikipedia/commons/6/61/Riemann_sum_%28rightbox%29.gif" style="width:35%">
    <figcaption style="text-align:center"><strong>Integral of a function:</strong> <a href="https://upload.wikimedia.org/wikipedia/commons/6/61/Riemann_sum_%28rightbox%29.gif">https://en.m.wikipedia.org/</a></figcaption>   
</figure>
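
As a small worked example of this limit (the choice $f(x) = x$ on $[0, 1]$ is ours, purely for illustration), take $h = 1/n$ and $x_i = ih$:

$$
\sum_{i=0}^{n-1} f(x_i)\,h = h^2\sum_{i=0}^{n-1} i = h^2\,\dfrac{n(n-1)}{2} = \dfrac{1}{2} - \dfrac{h}{2} \;\longrightarrow\; \dfrac{1}{2} = \int_{0}^{1} x \, dx \quad \text{as } h \rightarrow 0
$$

The rectangle sum misses the exact value by $h/2$, so the gap shrinks in proportion to $h$.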

# 2. Numerical Integration 

As with numerical differentiation, numerical integration often has to work with discrete data points, because continuous measurements are infeasible in many practical applications. All numerical integration methods share a common procedural framework:

1. Partition the integration interval $[a, b]$ into $n$ subintervals
2. Employ a specific geometric shape to estimate the area of each subinterval (the shape varies between different methods)
3. Take the sum of the areas of all subintervals

There are five fundamental numerical integration methods that we will explore:
1. Left Riemann
2. Right Riemann
3. Midpoint Rule
4. Trapezoidal Rule
5. Simpson's Rule

The following figure illustrates the five different numerical integration methods. 

<br>

<figure>
  <img src="https://docs.google.com/drawings/d/e/2PACX-1vQp6LxOy2XFAIarNGtHwyhlxhTBXE6Xsq6EGb2XZSPAbXmcWMLY3IU1xpRqIsDBRn5dC3wwAp9ip0z9/pub?w=1439&h=186" style="width:100%">
    <figcaption style="text-align:center"><strong>Numerical integration methods</strong></figcaption>   
</figure>

<br>


## 2.1. Left Riemann

The left Riemann method estimates the integral by partitioning $[a, b]$ into rectangles, each with height equal to the value of the function evaluated at the **left** endpoint of each subinterval. For an interval $[a, b]$, the approximation is given by:

$$
\int_{a}^{b} f(x) \, dx \approx \sum_{i=0}^{n-1}f(x_i) \cdot (x_{i+1} - x_i)
$$

If the spacing between the data points is a constant value $h$, the equation can be written as:

$$
\int_{a}^{b} f(x) \, dx \approx \sum_{i=0}^{n-1}f(x_i) \cdot h
$$
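
When only measurements are available, the general (non-constant spacing) formula can be applied directly to the data arrays. The following is a minimal sketch; the arrays `x` and `y` are made-up values (samples of $y = 2x + 1$), not data from this section.

```python
import numpy as np

# Hypothetical measurements at unevenly spaced points (samples of y = 2x + 1)
x = np.array([0.0, 0.5, 1.5, 2.0, 3.0])
y = np.array([1.0, 2.0, 4.0, 5.0, 7.0])

widths = np.diff(x)                      # x_{i+1} - x_i for each subinterval
I_left_data = np.sum(y[:-1] * widths)    # heights taken at the left endpoints

print(f"Left Riemann estimate = {I_left_data}")   # exact integral is 12
```

Because the sampled function is increasing, every rectangle sits below the curve and the estimate (9.5) falls short of the exact value.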

## 2.2. Right Riemann

Similarly, the right Riemann method estimates the integral by partitioning $[a, b]$ into rectangles, but each with height equal to the value of the function evaluated at the **right** endpoint of each subinterval. For an interval $[a, b]$, the approximation is given by:

$$
\int_{a}^{b} f(x) \, dx \approx \sum_{i=0}^{n-1}f(x_{i+1}) \cdot (x_{i+1} - x_i)
$$

If the spacing between the data points is a constant value $h$, the equation can be written as:

$$
\int_{a}^{b} f(x) \, dx \approx \sum_{i=0}^{n-1}f(x_{i+1}) \cdot h
$$
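
To see how the choice of endpoint changes the estimate, the sketch below applies both the left and right endpoint heights to one set of made-up, unevenly spaced data (samples of $y = 2x + 1$, chosen only for illustration).

```python
import numpy as np

# Hypothetical measurements at unevenly spaced points (samples of y = 2x + 1)
x = np.array([0.0, 0.5, 1.5, 2.0, 3.0])
y = np.array([1.0, 2.0, 4.0, 5.0, 7.0])

widths = np.diff(x)                       # x_{i+1} - x_i for each subinterval
I_right_data = np.sum(y[1:] * widths)     # heights taken at the right endpoints
I_left_data = np.sum(y[:-1] * widths)     # heights taken at the left endpoints

print(f"Right Riemann estimate = {I_right_data}")
print(f"Left Riemann estimate  = {I_left_data}")   # exact integral is 12
```

For this increasing function the right Riemann estimate overshoots the exact value while the left Riemann estimate undershoots it.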

## 2.3. Midpoint Rule

The midpoint rule, like the left and right Riemann methods, estimates the integral by partitioning $[a, b]$ into rectangles, but each has a height equal to the value of the function evaluated at the **midpoint** of each subinterval. Because this method requires evaluating the function at the midpoints, it cannot be applied directly when only discrete data at the endpoints are available. For an interval $[a, b]$, the approximation is given by:

$$
\int_{a}^{b} f(x) \, dx \approx \sum_{i=0}^{n-1}f\left(\dfrac{x_{i+1}+x_i}{2}\right) \cdot (x_{i+1} - x_i)
$$

If the spacing between the data points is a constant value $h$, the equation can be written as:

$$
\int_{a}^{b} f(x) \, dx \approx \sum_{i=0}^{n-1}f\left(\dfrac{x_{i+1}+x_i}{2}\right) \cdot h
$$
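
A small worked example may help; the function $f(x) = x^2$ on $[0, 1]$ with $n = 2$ is chosen here purely for illustration. The spacing is $h = 0.5$ and the midpoints are $0.25$ and $0.75$:

$$
\int_{0}^{1} x^2 \, dx \approx \big(0.25^2 + 0.75^2\big) \cdot 0.5 = (0.0625 + 0.5625) \cdot 0.5 = 0.3125
$$

The exact value is $1/3 \approx 0.3333$, so even two rectangles land within about $0.021$ of it.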

## 2.4. Trapezoidal Rule


The trapezoidal rule estimates the integral by partitioning $[a, b]$ into trapezoids. For an interval $[a, b]$, the approximation is given by:

$$
\int_{a}^{b} f(x) \, dx \approx \sum_{i=0}^{n-1}\dfrac{f(x_{i+1})+f(x_i)}{2} \cdot (x_{i+1} - x_i)
$$


If the spacing between the data points is a constant value $h$, the equation can be written as:

$$
\int_{a}^{b} f(x) \, dx \approx \dfrac{h}{2}\sum_{i=0}^{n-1}\big(f(x_{i+1})+f(x_i)\big)
$$

Expanding the sum above shows that every interior point $x_1, \dots, x_{n-1}$ appears twice, while $x_0$ and $x_n$ appear only once. Grouping the repeated terms lets each point be evaluated a single time:

$$
\int_{a}^{b} f(x) \, dx \approx \dfrac{h}{2}\left(f(x_0)+2\sum_{i=1}^{n-1}f(x_{i})+f(x_n)\right)
$$
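
The rearranged formula can be sketched in a few lines. This is an illustrative helper, not a reference implementation; the name `trapezoid_sketch` and the test integrand are our own choices.

```python
import numpy as np

def trapezoid_sketch(f, a, b, n):
    """Composite trapezoidal rule with n equally spaced subintervals."""
    x = np.linspace(a, b, n + 1)     # n + 1 points x_0, ..., x_n
    y = f(x)                         # each point is evaluated exactly once
    h = (b - a) / n
    # Endpoints are counted once, interior points twice
    return (h / 2) * (y[0] + 2 * np.sum(y[1:-1]) + y[-1])

I_trap = trapezoid_sketch(np.cos, 1, 3, 10)
I_exact = np.sin(3) - np.sin(1)
print(f"Integral Trapezoidal Rule ~ {I_trap}")
print(f"Error Trapezoidal Rule    = {np.abs(I_exact - I_trap)}")
```

Since a trapezoid matches a straight line exactly, this sketch reproduces the integral of any linear function up to round-off, which makes a convenient sanity check.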

## 2.5. Simpson's Rule

Simpson's rule uses quadratic polynomials to estimate the integral. The data are subdivided into groups of three consecutive points, $x_{i-1}$, $x_{i}$, and $x_{i+1}$, which span two subintervals. A quadratic polynomial $\beta_0+\beta_1x+\beta_2x^2$ is fit through these three points by interpolation and integrated exactly to give the area over the two subintervals; the process is then repeated for the next group of three points. Because each group covers two subintervals, the number of subintervals $n$ must be even.

With some algebra and manipulation, the approximated integral of $f(x)$ over an interval $[a, b]$ using Simpson's rule with constant spacing $h$ can be written as:

$$
\int_{a}^{b} f(x) \, dx \approx \dfrac{h}{3}\left(f(x_0)+4\sum_{\substack{i=1 \\ i \text{ odd}}}^{n-1}f(x_{i})+2\sum_{\substack{i=2 \\ i \text{ even}}}^{n-2}f(x_{i})+f(x_n)\right)
$$
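
A minimal sketch of this formula is shown below, assuming equally spaced points and an even number of subintervals. The helper name `simpson_sketch` and the cubic test function are illustrative choices, not part of the original text.

```python
import numpy as np

def simpson_sketch(f, a, b, n):
    """Composite Simpson's rule with n equally spaced subintervals (n even)."""
    if n % 2 != 0:
        raise ValueError("n must be even for Simpson's rule")
    x = np.linspace(a, b, n + 1)      # n + 1 points x_0, ..., x_n
    y = f(x)
    h = (b - a) / n
    odd_sum = np.sum(y[1:-1:2])       # f(x_1), f(x_3), ..., f(x_{n-1})
    even_sum = np.sum(y[2:-1:2])      # f(x_2), f(x_4), ..., f(x_{n-2})
    return (h / 3) * (y[0] + 4 * odd_sum + 2 * even_sum + y[-1])

# Simpson's rule integrates cubics exactly: the integral of x^3 over [0, 2] is 4
I_simp = simpson_sketch(lambda x: x**3, 0.0, 2.0, 2)
print(f"Integral Simpson's Rule ~ {I_simp}")
```

The slices `y[1:-1:2]` and `y[2:-1:2]` pick out the odd-indexed and even-indexed interior points, which mirror the two sums in the formula.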

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Write a function <code>left_int(f, a, b, n)</code> which takes as input a function object <code>f</code> and three scalar values <code>a, b, n</code>. The function should return an estimate of the integral of <code>f</code> over <code>[a, b]</code> using the left Riemann method and <code>n</code> equally spaced subintervals.</div>

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Try your function <code>left_int(f, a, b, n)</code> for $f(x)=\cos(x)$ over $x = [1, 3]$ and different values of <code>n</code>. Then, compute the analytical value of the integral and the error between the numerical approximation and the analytical value.</div>

In [None]:
import numpy as np

# Estimate integral and display results
I_left = ...
print(f"Integral Left Riemann  ~ {I_left}")

# Analytical solution for integral
I_exact = ...
print(f"Integral Exact         = {I_exact}")

# Calculate Error
print(f"Error Left Riemann     = {np.abs(I_exact - I_left)}")

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Write a function <code>mid_int(f, a, b, n)</code> which takes as input a function object <code>f</code> and three scalar values <code>a, b, n</code>. The function should return an estimate of the integral of <code>f</code> over <code>[a, b]</code> using the midpoint rule and <code>n</code> equally spaced subintervals.</div>

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Try your function <code>mid_int(f, a, b, n)</code> for $f(x)=\cos(x)$ over $x = [1, 3]$ and different values of <code>n</code>. Then, compute the analytical value of the integral and the error between the numerical approximation and the analytical value.</div>

In [None]:
# Estimate integral and display results
I_mid = mid_int(np.cos, 1, 3, 10)
print(f"Integral Midpoint Rule ~ {I_mid}")

# Analytical solution for integral
I_exact = np.sin(3) - np.sin(1)
print(f"Integral Exact         = {I_exact}")

# Calculate Error
print(f"Error Midpoint Rule    = {np.abs(I_exact - I_mid)}")

# 3. Integration in Python

Several Python modules provide numerical integration functions. The `scipy.integrate` sub-package offers several integration techniques, including `scipy.integrate.quad()` for general-purpose integration of a function of one variable. To use it, import the sub-package with `import scipy.integrate` and then call `scipy.integrate.quad()`. The function accepts many arguments, which are described in the documentation [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.quad.html#scipy.integrate.quad). The most important are the function to integrate, `func`, and the lower and upper limits of integration, `a` and `b`, respectively: `scipy.integrate.quad(func, a, b)`.

If the function being integrated, `func`, requires extra arguments besides the $x$ values, you can pass them using the optional argument `args`: `scipy.integrate.quad(func, a, b, args=(...))`.

Note that `scipy.integrate.quad()` returns two values, in the following order:
1. The integral of `func` from `a` to `b`
2. An estimate of the absolute error in the result
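
The call pattern, including `args` and the two return values, can be sketched as follows. The integrand `decay` and its parameters `k` and `amp` are hypothetical, chosen only because the integral has a simple closed form to compare against.

```python
import numpy as np
from scipy import integrate

def decay(x, k, amp):
    """Hypothetical integrand amp * exp(-k * x) with two extra parameters."""
    return amp * np.exp(-k * x)

k, amp = 1.5, 2.0

# quad passes x as the first argument and the args tuple as the remaining ones
value, abs_err = integrate.quad(decay, 0, 2, args=(k, amp))

# Closed-form integral of amp * exp(-k * x) over [0, 2], for comparison
exact = (amp / k) * (1 - np.exp(-2 * k))

print(f"quad result        = {value}")
print(f"estimated abs. err = {abs_err}")
print(f"closed form        = {exact}")
```

Note that `args` must be a tuple whose entries follow the order of the extra parameters in the function signature.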

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Use <code>quad()</code> to integrate $f(x)=\cos(x)$ over $x = [1, 3]$.</div>

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Use <code>quad()</code> to integrate $f(x)=x^n - c$ over $x = [1, 3]$, where <code>n</code> and <code>c</code> are scalar values provided by the user.</div>

# 4. Error in Numerical Integration

Similar to numerical differentiation, numerical integration methods are affected by errors. Understanding the accuracy of these methods and the magnitude of the error is crucial for their reliable application.

Calculating the error exactly, as in the examples above, requires knowing the exact value of the integral, which is generally not available in practice. Instead, we estimate the order of magnitude of the error. Big-O notation is commonly used to describe the error's order as a function of the step size $h$.
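
One way to read this notation, assuming $f$ is smooth enough and $h$ is small enough that the leading error term dominates: if a method has an error of order $h^p$, the error behaves roughly like $E(h) \approx C h^p$ for some constant $C$ that depends on $f$ and the interval. Halving the spacing then gives

$$
\dfrac{E(h/2)}{E(h)} \approx \dfrac{C\,(h/2)^p}{C\,h^p} = \dfrac{1}{2^p}
$$

For $p = 1$ the error is roughly halved, for $p = 2$ it drops to about a quarter, and for $p = 4$ to about one sixteenth.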

## 4.1. Left Riemann

The left Riemann method is said to have an error that is the same order as the spacing $h$:

$$
\int_{a}^{b} f(x) \, dx = \sum_{i=0}^{n-1}f(x_i) \cdot h + O(h)
$$

## 4.2. Right Riemann

The right Riemann method is said to have an error that is the same order as the spacing $h$:

$$
\int_{a}^{b} f(x) \, dx = \sum_{i=0}^{n-1}f(x_{i+1}) \cdot h + O(h)
$$

## 4.3. Midpoint Rule

The midpoint rule is said to have an error that is the same order as $h^2$:

$$
\int_{a}^{b} f(x) \, dx = \sum_{i=0}^{n-1}f\left(\dfrac{x_{i+1}+x_i}{2}\right) \cdot h + O(h^2)
$$

## 4.4. Trapezoidal Rule

The trapezoidal rule is said to have an error that is the same order as $h^2$:

$$
\int_{a}^{b} f(x) \, dx = \dfrac{h}{2}\left(f(x_0)+2\sum_{i=1}^{n-1}f(x_{i})+f(x_n)\right) + O(h^2)
$$

## 4.5. Simpson's Rule

Simpson's rule has an error that is the same order as $h^4$:

$$
\int_{a}^{b} f(x) \, dx = \dfrac{h}{3}\left(f(x_0)+4\sum_{\substack{i=1 \\ i \text{ odd}}}^{n-1}f(x_{i})+2\sum_{\substack{i=2 \\ i \text{ even}}}^{n-2}f(x_{i})+f(x_n)\right) + O(h^4)
$$

## 4.6. Error and Spacing Trade-off 

As with numerical differentiation, choosing a smaller step size $h$ improves accuracy in numerical integration. However, this comes at the cost of increased computational demands, including longer processing times and higher memory requirements, as well as potentially more significant round-off errors. Therefore, a balance between acceptable error and step size is crucial.
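
The order of a method can also be checked empirically by comparing the error at spacing $h$ with the error at $h/2$. The sketch below does this for a left-endpoint rule and a trapezoidal rule; the helper names and the test integrand $e^x$ on $[0, 1]$ are our own illustrative choices.

```python
import numpy as np

def left_rule(f, a, b, n):
    """Left-endpoint rectangles with n equal subintervals (illustrative helper)."""
    x = np.linspace(a, b, n + 1)
    return (b - a) / n * np.sum(f(x[:-1]))

def trap_rule(f, a, b, n):
    """Trapezoidal rule with n equal subintervals (illustrative helper)."""
    x = np.linspace(a, b, n + 1)
    y = f(x)
    return (b - a) / n * (np.sum(y) - 0.5 * (y[0] + y[-1]))

a, b = 0.0, 1.0
I_exact = np.e - 1.0                 # integral of exp(x) over [0, 1]

def observed_order(rule, n):
    """Estimate p in error ~ C h^p from the errors at h and h/2."""
    err_h = abs(rule(np.exp, a, b, n) - I_exact)
    err_half = abs(rule(np.exp, a, b, 2 * n) - I_exact)
    return np.log2(err_h / err_half)

p_left = observed_order(left_rule, 50)
p_trap = observed_order(trap_rule, 50)
print(f"Observed order, left endpoints: {p_left:.3f}")
print(f"Observed order, trapezoidal:    {p_trap:.3f}")
```

Under these assumptions the observed orders should come out close to 1 and 2, respectively. A higher-order method therefore reaches a given accuracy with far fewer subintervals, which is often a cheaper route than shrinking $h$.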

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Calculate the error for the left Riemann and midpoint rule relative to the analytical solution for $f(x)=\cos(x)$ over $x = [1, 3]$. Calculate the error for different values of <code>n</code>.</div>

In [None]:
# Define inputs
a, b = 1, 3
n = 10**2
h = (b - a)/n
print(f'h = {h}')

# Analytical solution for integral
I_exact = np.sin(b) - np.sin(a)

# Calculate error and display results
I_left = left_int(np.cos, a, b, n)
print(f"Left Riemann error: O(h)   = {np.abs(I_exact - I_left)}")

# Calculate error and display results
I_mid = mid_int(np.cos, a, b, n)
print(f"Midpoint rule error: O(h^2) = {np.abs(I_exact - I_mid)}")