### 3.8 Projects

In this section we propose several ideas for projects related to numerical Calculus. These projects are meant to be open ended, to encourage creative mathematics, to push your coding skills, and to require you to write and communicate your mathematics. Take the time to read Appendix B before you write your final solution.

### 3.8.1 Galaxy Integration

To analyze the light from stars and galaxies, scientists use a spectral grating (fancy prism) to split it up into the different frequencies (colors). We can then measure the intensity (brightness) of the light (in units of Watts per square meter) at each frequency (measured in Hertz), to get intensity per frequency (Watts per square meter per Hertz, $\mathrm{W} /\left(\mathrm{m}^{2} \mathrm{~Hz}\right)$ ). Light from the dense opaque surface of a star produces a smooth rainbow, which produces a continuous curve when we plot intensity versus frequency. However, stars are also surrounded by thin gas that either emits or absorbs light at only a specific set of frequencies, called spectral lines. Every chemical element produces a specific set of lines (or peaks) at fixed frequencies, so by identifying the lines, we can tell what types of atoms and molecules a star is made of. If the gas is cool, then it will absorb light at these frequencies, and if the gas is hot then it will emit light at these frequencies. For galaxies, on the other hand, we expect mostly emission spectra: light emitted from the galaxy.

For this project we will be analyzing the galaxy "NGC 1275." The black hole at the center of this galaxy is often referred to as the "Galactic Spaghetti Monster" since the magnetic field "sustains a mammoth network of spaghetti-like gas filaments around it." You can download the data file associated with this project with the following Python code.

```
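import numpy as np
import pandas as pd

# Build the URL of the data folder in the book's GitHub repository
URL1 = 'https://raw.githubusercontent.com/NumericalMethodsSullivan'
URL2 = '/NumericalMethodsSullivan.github.io/master/data/'
URL = URL1 + URL2

# Read ngc1275.csv and convert the resulting DataFrame to a NumPy array
ngc1275 = np.array(pd.read_csv(URL + 'ngc1275.csv'))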
```

In the data you will see the spectral data measuring the light intensity from NGC 1275 at several different wavelengths (measured in Angstroms). You will notice in this data set that there are several emission lines at various wavelengths. Of particular interest are the peaks near 3800 Angstroms, 5100 Angstroms, 6400 Angstroms, and the two peaks around 6700 Angstroms. The data set contains 1,727 data points at different wavelengths. Your first job will be to transform the wavelength data to frequency via the formula

$$
\lambda=\frac{c}{f}
$$

where $\lambda$ is the wavelength, $c$ is the speed of light, and $f$ is the frequency (measured in Hertz). Be sure to double check the units. Given the inverse relationship between frequency and wavelength you should see the emission lines flip to the other side of the plot (right-to-left or left-to-right).
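As a minimal sketch of the unit bookkeeping, the conversion can be written as follows. The function name `wavelength_to_frequency` and the sample wavelengths are illustrative only (the sample values are the approximate peak locations quoted above, not values read from the data file), and the sketch assumes the wavelengths are given in Angstroms.

```python
import numpy as np

C = 299_792_458.0   # speed of light in meters per second
ANGSTROM = 1e-10    # one Angstrom expressed in meters

def wavelength_to_frequency(wavelength_angstrom):
    """Convert wavelengths in Angstroms to frequencies in Hertz via f = c / lambda."""
    wavelength_m = np.asarray(wavelength_angstrom, dtype=float) * ANGSTROM
    return C / wavelength_m

# Approximate peak locations quoted in the text (illustrative, not from the file)
wavelengths = np.array([3800.0, 5100.0, 6400.0, 6700.0])
frequencies = wavelength_to_frequency(wavelengths)
print(frequencies)  # longer wavelengths map to lower frequencies
```

Because the map is decreasing, data sorted by increasing wavelength comes out sorted by decreasing frequency, which is worth keeping in mind before integrating.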

The strength of each emission line (in $\mathrm{W} / \mathrm{m}^{2}$ ) is defined as the intensity of the peak, measured relative to the background, integrated across the associated frequencies. Note that you are not interested in the intensity of the continuous spectrum - just the peaks. That is to say that you are only interested in the area above the background curve and the background noise.

Your primary task is to develop a process for analyzing data sets like this so as to determine the strength of each emission line. You must demonstrate your process on this particular data set, but your process must be generalizable to any similar data set. Your process must clearly determine the strength of peaks in data sets like this and you must apply your procedure to determine the strength of each of these four lines with an associated margin of error. Keep in mind that you will first want to develop a method for removing the background noise. In addition, the double peak near 6700 Angstroms needs to be handled with care: the strength of each emission line is only the integral over one peak, not two, so you'll need to determine a way to separate these peaks.
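One possible starting point, shown here only as a sketch on made-up data, is to fit a straight line to peak-free background on either side of a peak, subtract it, and integrate what remains with the trapezoidal rule. The function `line_strength`, its window arguments, and the synthetic spectrum are all assumptions made for illustration. A straight-line background is just one modeling choice, and this sketch does not address noise, the margin of error, or the double peak.

```python
import numpy as np

def line_strength(f, intensity, peak_window, side_windows):
    """Area of one peak above a straight-line background (hypothetical helper).

    f            : frequencies in increasing order
    intensity    : intensity per unit frequency at each f
    peak_window  : (lo, hi) frequency range that contains the peak
    side_windows : list of (lo, hi) ranges holding only background
    """
    # Fit a line to the background samples on both sides of the peak
    in_sides = np.zeros(f.shape, dtype=bool)
    for lo, hi in side_windows:
        in_sides |= (f >= lo) & (f <= hi)
    slope, intercept = np.polyfit(f[in_sides], intensity[in_sides], 1)

    # Subtract the fitted background, then apply the trapezoidal rule
    in_peak = (f >= peak_window[0]) & (f <= peak_window[1])
    x = f[in_peak]
    y = intensity[in_peak] - (slope * x + intercept)
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

# Synthetic spectrum in arbitrary units: sloped background plus one Gaussian peak
f = np.linspace(0.0, 10.0, 2001)
background = 2.0 + 0.1 * f
peak = 3.0 * np.exp(-0.5 * ((f - 5.0) / 0.2) ** 2)
intensity = background + peak

strength = line_strength(f, intensity, (4.0, 6.0), [(2.0, 3.5), (6.5, 8.0)])
exact = 3.0 * 0.2 * np.sqrt(2.0 * np.pi)  # known area under the Gaussian
print(strength, exact)
```

Testing on a synthetic peak with a known area is a cheap way to check a method before trusting it on real data. With real frequencies (on the order of $10^{14}$ Hz) it may help to rescale the frequency axis before fitting.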

Finally, it would be cool, but is not necessary, to report on which chemicals correspond to the emission lines in the data. Remember that the galaxy is far away and hence there is a non-trivial red-shift to consider. This will take some research but if done properly will likely give a lot more merit to your paper.

### 3.8.2 Higher Order Integration

Riemann sums can be used to approximate integrals and they do so by using piecewise constant functions to approximate the function. The trapezoidal rule uses piecewise linear functions to approximate the function and then the area of a trapezoid to approximate the area. We saw earlier that Simpson's rule uses piecewise parabolas to approximate the function. The process which we used to build Simpson's rule can be extended to any higher-order polynomial. Your job in this project is to build integration algorithms that use piecewise cubic functions, quartic functions, etc. For each you need to show all of the mathematics necessary to derive the algorithm, provide several test cases to show that the algorithm works, and produce a numerical experiment that shows the order of accuracy of the algorithm.
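To illustrate what a numerical experiment for the order of accuracy might look like, here is a sketch that uses the trapezoidal rule (not one of the new methods) on an integral with a known value. The test integrand, the interval, and the list of subinterval counts are arbitrary choices. Each time the number of subintervals doubles, the ratio of successive errors gives an estimate of the order.

```python
import numpy as np

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal subintervals on [a, b]."""
    x = np.linspace(a, b, n + 1)
    y = f(x)
    h = (b - a) / n
    return h * (0.5 * y[0] + np.sum(y[1:-1]) + 0.5 * y[-1])

exact = 1.0 - np.cos(1.0)   # integral of sin(x) from 0 to 1
ns = [10, 20, 40, 80, 160]  # each n doubles the previous one
errors = [abs(trapezoid(np.sin, 0.0, 1.0, n) - exact) for n in ns]

# If error ~ C h^p, then halving h divides the error by 2^p
orders = [np.log2(errors[i] / errors[i + 1]) for i in range(len(errors) - 1)]
print(orders)  # values close to 2 for the trapezoidal rule
```

The same experiment applies unchanged to a higher-order rule: swap in the new integration function and compare the observed order with the one predicted by your derivation.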

### 3.8.3 Dam Integration

Go to the USGS water data repository:
https://maps.waterdata.usgs.gov/mapper/index.html.
Here you'll find a map with information about water resources around the country.

- Zoom in to a dam of your choice (make sure that it is a dam).
- Click on the map tag, then click "Access Data."
- From the drop down menu at the top select either "Daily Data" or "Current / Historical Data." If these options don't appear then choose a different dam.
- Change the dates so you have the past year's worth of information.
- Select "Tab-separated" under "Output format" and press Go. Be sure that the data you got has a flow rate $\left(\mathrm{ft}^{3} / \mathrm{sec}\right)$.
- At this point you should have access to the entire data set. Copy it into a csv file and save it to your computer.

For the data that you just downloaded you have three tasks: (1) plot the data in a reasonable way giving appropriate units, (2) find the total amount of water that has been discharged from the dam during the past calendar year, and (3) report any margin of error in your calculation based on the numerical method that you used in part (2).
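For task (2), the main pitfall is units: the flow rate is in cubic feet per second while the measurements are typically spaced days apart. A minimal sketch using the trapezoidal rule on made-up data is shown below. The function name `total_discharge` and the constant flow values are illustrative assumptions, not USGS data, and the sketch assumes one reading per day indexed by day number.

```python
import numpy as np

SECONDS_PER_DAY = 86_400

def total_discharge(days, flow_cfs):
    """Trapezoidal estimate of total volume (ft^3) from flow rates (ft^3/sec).

    days     : measurement times in days (need not be equally spaced)
    flow_cfs : flow rate at each measurement time, in cubic feet per second
    """
    t = np.asarray(days, dtype=float) * SECONDS_PER_DAY  # convert days to seconds
    q = np.asarray(flow_cfs, dtype=float)
    return np.sum(0.5 * (q[1:] + q[:-1]) * np.diff(t))

# Made-up data: a constant 1000 ft^3/sec over a span of 365 days
days = np.arange(0, 366)
flow = np.full(days.shape, 1000.0)
volume = total_discharge(days, flow)
print(volume)  # cubic feet
```

Using `np.diff` on the time values rather than a fixed step means gaps in the record (missing days) are handled without extra code, although large gaps will of course increase the error.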

### 3.8.4 Edge Detection in Images

Edge detection is the process of finding the boundaries or edges of objects in an image. There are many approaches to performing edge detection, but one method that is quite robust is to use the gradient vector in the following way:

- First convert the image to gray scale.
- Then think of the gray scale image as a plot of a multivariable function $G(x, y)$ where the ordered pair $(x, y)$ is the pixel location and the output $G(x, y)$ is the value of the gray scale at that point.
- At each pixel calculate the gradient of the function $G(x, y)$ numerically.
- If the magnitude of the gradient is larger than some threshold then the function $G(x, y)$ is steep at that location and it is possible that there is an edge (a transition from one part of the image to a different part) at that point. Hence, if $\|\nabla G(x, y)\|>\delta$ for some threshold $\delta$ then we can mark the point $(x, y)$ as an edge point.


## Your Tasks:

1. Choose several images on which to do edge detection. You should take your own images, but if you choose not to, be sure to cite the source(s) of your images.
2. Write Python code that performs edge detection as described above on the image. In the end you should produce side-by-side plots of the original picture and the image showing only the edges. To calculate the gradient use a centered difference scheme for the first derivatives

$$
f^{\prime}(x) \approx \frac{f(x+h)-f(x-h)}{2 h}
$$

In an image we can take $h=1$ (why?), and since the gradient is two dimensional we get

$$
\nabla G(x, y) \approx\left\langle\frac{G(x+1, y)-G(x-1, y)}{2}, \frac{G(x, y+1)-G(x, y-1)}{2}\right\rangle
$$

Figure 3.14 depicts what this looks like when we zoom in to a pixel and its immediate neighbors. The pixel labeled $G[i, j]$ is the pixel at which we want to evaluate the gradient, and the surrounding pixels are labeled by their indices relative to [i,j].
![](https://cdn.mathpix.com/cropped/2025_02_27_429587f441ab5f434461g-61.jpg?height=519&width=1115&top_left_y=1177&top_left_x=402)

Figure 3.14: The gradient computation on a single pixel using a centered difference scheme for the first derivative.
3. There are many ways to approximate numerical first derivatives. The simplest approach is what you did in part (2) - using a centered difference scheme. However, pixels are necessarily tightly packed in an image and the immediate neighbors of a point may not have enough contrast to truly detect edges. If you examine Figure 3.14 you'll notice that we only use 4 of the 8 neighbors of the pixel $[i, j]$. Also notice that we didn't reach out any further than a single pixel. Your job now is to build several other approaches to calculating the gradient vector, implement them to perform edge detection, and show the resulting images. For each method you need to give the full mathematical details for how you calculated the gradient as well as give a list of pros and cons for using the new numerical gradient for edge detection based on what you see in your images. As an example, you could use a centered difference scheme that looks two pixels away instead of at the immediate neighboring pixels

$$
f^{\prime}(x) \approx \frac{? ? ? f(x-2)+? ? ? f(x+2)}{? ? ?}
$$

Of course you would need to determine the coefficients in this approximation scheme.
Another idea could use a centered difference scheme that uses pixels that are immediate neighbors AND pixels that are two units away

$$
f^{\prime}(x) \approx \frac{? ? ? f(x-2)+? ? ? f(x-1)+? ? ? f(x+1)+? ? ? f(x+2)}{? ? ?}
$$

In any case, you will need to use Taylor Series to derive coefficients in the formulas for the derivatives as well as the order of the error. There are many ways to approximate the first derivatives so be creative. In your exploration you are not restricted to using just the first derivative. There could be some argument for using the second derivatives and/or the Hessian matrix of the gray scale image function $G(x, y)$ and using some function of the concavity as a means of edge detection. Explore and have fun!
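As a point of reference for the simplest scheme in task 2, the centered difference gradient with $h=1$ and the threshold test $\|\nabla G\|>\delta$ can be sketched with array slicing. The function name `edge_mask`, the choice to leave border pixels unmarked, and the tiny synthetic image are assumptions made for illustration. The sketch treats the first array index as one coordinate direction and the second as the other.

```python
import numpy as np

def edge_mask(G, delta):
    """Mark interior pixels whose centered-difference gradient magnitude exceeds delta."""
    # Convert to float first: integer image types (e.g. uint8) wrap around on subtraction
    G = np.asarray(G, dtype=float)

    # Centered differences with h = 1, computed on interior pixels only
    d_first = (G[2:, 1:-1] - G[:-2, 1:-1]) / 2.0    # difference along the first index
    d_second = (G[1:-1, 2:] - G[1:-1, :-2]) / 2.0   # difference along the second index
    magnitude = np.sqrt(d_first**2 + d_second**2)

    # Border pixels have no neighbor on one side, so they are left unmarked
    edges = np.zeros(G.shape, dtype=bool)
    edges[1:-1, 1:-1] = magnitude > delta
    return edges

# Synthetic grayscale image: dark left half, bright right half
G = np.zeros((6, 8))
G[:, 4:] = 1.0
edges = edge_mask(G, delta=0.25)
print(edges.astype(int))
```

On this test image only the two columns straddling the jump are marked, which matches the observation that the centered difference at a pixel ignores the pixel itself and looks only at its neighbors.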

The following code will allow you to read an image into Python as an np.array().

```
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import image

# Read the image file into a NumPy array
I = np.array(image.imread('ImageName.jpg'))
plt.imshow(I)
plt.axis("off")  # hide the axis ticks and labels
plt.show()
```

You should notice that the image, I, is a three dimensional array. The three layers are the red, green, and blue channels of the image. To flatten the image to gray scale you can apply the rule

$$
\text { grayscale value }=0.3 \mathrm{Red}+0.59 \mathrm{Green}+0.11 \mathrm{Blue} .
$$

The output should be a two-dimensional NumPy array (called `G` here), which you can show with the following Python code.

```
plt.imshow(G, cmap='gray') # "cmap" stands for "color map"
plt.axis("off")
plt.show()
```
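The grayscale rule above can be applied to every pixel at once as a weighted sum over the color channels. The sketch below uses a hypothetical helper `to_grayscale` and a tiny hand-built image in place of a real photo. It keeps only the first three channels, on the assumption that any fourth channel is transparency.

```python
import numpy as np

def to_grayscale(I):
    """Apply 0.3 Red + 0.59 Green + 0.11 Blue to an image array of shape (rows, cols, channels)."""
    I = np.asarray(I, dtype=float)
    weights = np.array([0.3, 0.59, 0.11])
    # Use only the first three channels (red, green, blue)
    return I[..., :3] @ weights

# Tiny test image: one red, one green, one blue, and one white pixel
I = np.zeros((2, 2, 3))
I[0, 0] = [1.0, 0.0, 0.0]
I[0, 1] = [0.0, 1.0, 0.0]
I[1, 0] = [0.0, 0.0, 1.0]
I[1, 1] = [1.0, 1.0, 1.0]

G = to_grayscale(I)
print(G.shape)  # two-dimensional: one gray value per pixel
```

Since the three weights sum to 1, a white pixel keeps its full brightness and the gray values stay in the same range as the original channels.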

Figure 3.15 shows the result of different threshold values applied to the simplest numerical gradient computations. The image was taken by the author.
![](https://cdn.mathpix.com/cropped/2025_02_27_429587f441ab5f434461g-63.jpg?height=1045&width=1146&top_left_y=796&top_left_x=393)

Figure 3.15: Edge detection using different thresholds for the value of the gradient on the grayscale image

