# Final Project: Normal distributions and a small research project

**Course Title: Special Topics in Physics: 2025 Summer Tutorials on Computational Physics (PHYS 6900)**   
**Submit all files by Friday, August 8, 2025, at 6 pm by emailing them to <drischler@ohio.edu> in a zipped folder.**

**Rules:**

* **Collaborating with other students attending this class is allowed and encouraged.** However, plain copying of solutions is not allowed.
* You can use all the resources discussed in the tutorials. You are NOT allowed to use any AI resources, including ChatGPT.
* Answer the questions in this Jupyter notebook and use `Python3` for your computations. Solutions in other programming languages will not be accepted.
* Comment your programs adequately so we know that you understand your code. Your functions should have at least a one-line [docstring](https://peps.python.org/pep-0257/) that explains their purpose.
* This Jupyter notebook is expected to run from the top to the bottom. Restart the kernel and run all cells in order, e.g., using `Kernel > Restart & Run All` to check.
* Label your plots adequately, e.g., so we know what is plotted and what the axes mean. Show legends with meaningful labels if more than one dataset is plotted.
* Explain your solutions clearly through coherent text and, if necessary, expressions in markdown cells.
* You are encouraged to work on the **optional exercises** if time permits, but they are _not_ required.

## Normalized Gaussians

A normal distribution takes the form
$$
g(x) = \frac{1}{\sigma\sqrt{2 \pi}} \exp{\left ( {\frac{-(x-\mu)^2}{2\sigma ^2}} \right)} ,
$$

with expected value $\mu$, variance $\sigma ^2$ and amplitude $\frac{1}{\sigma \sqrt{2 \pi}}$. In this exercise, you will visualize and analyze Gaussian functions using libraries like `numpy`, `matplotlib` and `pandas` to: <br>
* Define a Gaussian (normal distribution) function
* Plot several Gaussian curves with different parameters
* Explore how changes in mean and variance affect the shape
* Compute and compare statistical properties of the curves
* Display the results in a table using `pandas` <br>
> Don't forget to import the necessary libraries.
***

Complete the following tasks:

1) **Define the Gaussian function**: Write a function `gauss(a, mu, sigma, x)` that takes four parameters (the amplitude `a`, the mean `mu`, the standard deviation `sigma`, and the position `x`) and returns the value of a Gaussian function using the equation above. <br><br>
2) **Create your Domain**: Create a numpy array `x` of 100 values linearly spaced from -5 to 5. (Hint: use `np.linspace`). <br><br>
3) **Set up your Gaussian parameters**: Set up arrays for the different means ($\mu$) and variances ($\sigma^2$). Then convert each variance to the standard deviation (Hint: use `np.sqrt`) and compute the corresponding normalization factor `a` so that the area under the curve equals 1.

```python
# use the following arrays
means = [0, 0, 0, -2.5, 2.5]
variances = [0.2, 1.0, 5.0, 0.5, 0.5]   # the variance is sigma^2
```
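
As a minimal sketch of the conversion described in step 3, assuming the amplitude is $a = 1/(\sigma\sqrt{2\pi})$ as in the equation above (the names `sigmas` and `amplitudes` are illustrative choices, not prescribed by the exercise):

```python
import numpy as np

# Variances from the exercise (the variance is sigma^2)
variances = np.array([0.2, 1.0, 5.0, 0.5, 0.5])

# Standard deviation = square root of the variance
sigmas = np.sqrt(variances)

# Normalization factor a = 1 / (sigma * sqrt(2 pi)), one value per curve
amplitudes = 1.0 / (sigmas * np.sqrt(2.0 * np.pi))

for var, a in zip(variances, amplitudes):
    print(f"variance = {var:.1f}  ->  a = {a:.4f}")
```

Working with NumPy arrays instead of plain lists lets the square root and the division act on all parameter sets at once.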

4) **Plot the Gaussian curves and analyze statistical properties**: Create two empty lists to store two statistical properties of each curve. Create a figure object, then use a `for` loop over the parameter sets so that all curves are drawn in the same figure. For each parameter set:
* Compute the Gaussian function using your `gauss` function.
* Plot it using `matplotlib`.
* Label each curve with its mean and variance (Hint: don't forget to use `plt.legend` to activate your labels on the figure, and use LaTeX formatting for better labels).
* Don't forget to include labels for your x- and y-axis (you can simply label them `x` and `g(x)`, respectively).
* For a cleaner plot, add limits to your axes using `plt.xlim` and `plt.ylim`.
* For each curve:
    * Calculate the **expected value and variance of the distribution over x** from the discretized sums below (these are the "analytical" values in the summary table) and `append` them to the two lists you created. <br><br> The expected value $\left\langle {x} \right\rangle $ is calculated using the equation 
    $$
    \mu = \Delta x \cdot \sum{x_i \cdot g(x_i)} 
    $$
    >Hint: $\Delta x$ is the spacing between adjacent points of your `x` array.
    >>You can use `np.sum()` to sum over `x * g`.

    * And the variance $\left\langle {(x - \mu)^2} \right\rangle $ is calculated using the equation
    $$
    \sigma ^2 = \Delta x \cdot \sum{(x_i - \mu)^2 \cdot g(x_i)}
    $$
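
The two sums above can be sketched for a single curve as follows. This is only an illustration under stated assumptions: the values `mu_true = 1.0` and `sigma_true = 0.5` are made up for the example and are not one of the parameter sets of the exercise, and the variable names are arbitrary.

```python
import numpy as np

def gauss(a, mu, sigma, x):
    """Return the Gaussian a * exp(-(x - mu)^2 / (2 sigma^2)) evaluated at x."""
    return a * np.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2))

x = np.linspace(-5, 5, 100)
dx = x[1] - x[0]                      # spacing between adjacent grid points

# Illustrative parameters (NOT taken from the exercise)
mu_true, sigma_true = 1.0, 0.5
a = 1.0 / (sigma_true * np.sqrt(2.0 * np.pi))
g = gauss(a, mu_true, sigma_true, x)

# Discretized expected value: dx * sum(x_i * g(x_i))
mean_num = dx * np.sum(x * g)

# Discretized variance: dx * sum((x_i - mu)^2 * g(x_i)), using the mean found above
var_num = dx * np.sum((x - mean_num) ** 2 * g)

print(f"mean = {mean_num:.6f}, variance = {var_num:.6f}")
```

Inside the `for` loop of step 4, the same two lines would be evaluated once per curve and the results appended to the two lists.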

5) **Create a summary table**: Make a `pandas.DataFrame` with the following columns:
* Expected Mean ($\mu$)
* Analytical Mean ($\left\langle {x} \right\rangle $)
* Variance ($\sigma ^2$)
* Analytical Variance ($\left\langle {(x - \mu)^2} \right\rangle $)
* Normalization factor (a)
> You can create a `dictionary` first and then convert it to a `pandas.DataFrame`.
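
The dictionary-to-table conversion mentioned in the hint might look like the sketch below. The numbers are placeholders chosen only to show the mechanics, not results of the exercise, and the lowercase column names are an assumption chosen to match the snippet in step 6.

```python
import pandas as pd

# Placeholder values for two hypothetical curves (NOT exercise results)
summary = {
    "expected mean": [0.0, 1.0],
    "analytical mean": [0.001, 0.999],
    "variance": [1.0, 0.5],
    "analytical variance": [0.998, 0.501],
}

# Each dictionary key becomes a column; each list entry becomes a row
df = pd.DataFrame(summary)
print(df)
```

In the exercise, the lists filled inside the `for` loop would take the place of the hard-coded placeholder lists.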

6) **Compare numerical and analytical values**: Add two error columns to your dataframe to compare numerical and analytical values of the mean and variance:
```python
df['mean error'] = np.abs(df['expected mean'] - df['analytical mean'])
df['variance error'] = np.abs(df['variance'] - df['analytical variance'])
```
* *How close is the expected mean to the analytical mean in each row?*
* *What about the variance values?*
* *Is the error larger for the wider (larger variance) Gaussians? Why might that be?*
* *What happens to the height of the curve as variance increases? Why?*

***

## Radial Growth of Axons in Rats

Neurons are the basic unit of the nervous system. They communicate by means of electrical signals, which are transmitted along the length of their axons. The [axon](https://en.wikipedia.org/wiki/Axon) is a long slender projection that extends from the neuronal cell body. The rate of propagation of these signals, and in turn, neuronal function, is influenced by the axon’s diameter; the larger the diameter, the faster signals travel. Thus, the growth of axonal diameter is a crucial developmental process to understand. <br><br>
Neurofilaments (NFs) are the most abundant structures that are known for their space-filling role in the axon. They are protein polymers transported along microtubule tracks in the axon. Experimental studies by Hoffman et al. (1984) on the sciatic nerve of rats demonstrated that the number of NFs increases with the axonal cross-sectional area, highlighting NFs' contribution to axonal growth.
<br><br>
In [this recent study](https://doi.org/10.1091/mbc.E22-12-0565), these experimental morphometric data were extracted and used to constrain a computational model developed to quantitatively describe how NFs contribute to the modulation of axonal diameter. Results showed that an increase in their influx from the cell body and a slowing of their transport along the axon are the two mechanisms by which they help regulate radial growth during development.
<br><br>
In this exercise, we will simulate a small research project: you will use the morphometric data collected by Hoffman et al. (1984 and 1985) from growing axons in the sciatic nerve of rats up to 18 weeks of age to:

1) Plot the data along with the equations describing them;
2) Analyze the data to extract crucial information about axonal growth; and
3) Compare your findings with the data and interpretations presented in the referenced study.

We will also practice the other skills you have learned in the 2025 Computational Summer Tutorials, including UNIX and Git. Let's get started.

### Growth Curve

#### Import and plot the time vs. area data. 

The goal here is to produce a graph similar to Figure 1A in [this paper](https://www.molbiolcell.org/doi/epdf/10.1091/mbc.E22-12-0565).

* Download the course's GitHub repository to your hard drive. Which `git` command did you use? Navigate to the exercise folder. 
  * Create the new subdirectory `datap`. How?
  * Navigate into the folder. What is the current path? List the content of this folder with details.
  * Step out of the folder `datap`. Move the two data files to `datap` all at once. How? (Hint: Wildcards)
  * Delete the folder `datap`. How?
  * Retrieve the two data files from the repository. Which `git` command did you use?
  * Create the new subdirectory `data` and move the two data files into that new folder.
  * Add a comment line to the beginning of each of the two data files using the text editor `vi`. The comment should include the date and the origin of the data (i.e., the link to the paper and the scientific citation).
  * Check the status of the repository. Add the new folder structure to the repository and commit. How did you choose your commit message?
  * How would you upload your changes if you had read/write access to the remote repository on GitHub?

* You can use `NumPy` or `Pandas` to import the data from the `time_area_(Hoff.85)` file in the folder `data`. Make sure to set `','` as the delimiter and import the data as `float`. Print the data once imported. You should have a 13x2 matrix: the first column is the time in weeks, and the second column is the axonal cross-sectional area $A$ in units of $\mu m^2$. If the comment line causes an issue, check the documentation of the read-in function you used. <br><br>

* Slice the imported data and assign the first (time) column to one variable and the second (area) column to another variable; that way, you have two one-dimensional arrays. <br><br>

* Plot the data. Make sure to add the appropriate labels for the axes with the area on the y-axis and time on the x-axis. Don't forget to include the units! Use $\LaTeX$ commands.<br><br>

* Have the limits of the **x-axis** be: $0 \rightarrow 21$ and **y-axis** be: $0 \rightarrow 100$. <br><br>

* To get the same scaling as in the original figure, have the ticks of the **x-axis** range from 0 to 23 in steps of 3. You can use `np.arange` to do so. <br><br>

* You can use any marker style or color. Just make sure the data points are plotted using **symbols**, not lines.
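
The import, slicing, and plotting steps above can be sketched as follows. To keep the example self-contained, it reads from an in-memory stand-in for the file; the three rows are made-up placeholders, not the Hoffman et al. measurements, and the assumption that the comment line starts with `#` is ours.

```python
import io
import numpy as np
import matplotlib.pyplot as plt

# Made-up stand-in for a data file: one comment line, then "time,area" rows.
# These numbers are placeholders, NOT the measured data.
fake_file = io.StringIO(
    "# placeholder comment line\n"
    "1.0,20.0\n"
    "3.0,25.0\n"
    "6.0,35.0\n"
)

# Lines starting with '#' are skipped (comments='#' is also the default)
data = np.loadtxt(fake_file, delimiter=",", comments="#", dtype=float)
print(data)

time = data[:, 0]   # first column: time in weeks
area = data[:, 1]   # second column: cross-sectional area

fig, ax = plt.subplots()
ax.plot(time, area, "o", label="data")   # "o" draws markers only, no line
ax.set_xlabel(r"Time $t$ (weeks)")
ax.set_ylabel(r"Area $A$ ($\mu m^2$)")
ax.set_xlim(0, 21)
ax.set_ylim(0, 100)
ax.set_xticks(np.arange(0, 23, 3))       # stop value is exclusive: 0, 3, ..., 21
```

For the real file, a path string would replace `fake_file`; `pandas.read_csv` offers a comparable `comment` argument if you prefer `Pandas`.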

#### Optional: Axonal area through time

A linear regression was performed on the data to find an equation that describes how the area changes through growth. The resultant relationship obtained is (refer to Eq. (1) in the paper)

$$
A(t) = 2.93~t + 16.2 
$$

* Define a function that takes a variable $t$ and returns the result of this equation. <br><br>
* Generate an array with times of $t_i = 0$ as the first element and $t_f = 21$ weeks as the last element with 100 elements in between. You can use either `np.arange` or `np.linspace`. <br><br>
* Use the function and the array with the times you have defined to calculate the axonal area. Plot the resultant line on the same graph as the data with the same labels and axes attributes as before. Label the data points as 'data' and the line as 'regression'. Don't forget to show the legend. You should now have the same graph as the original Figure 1A.<br><br>
* Export the final figure to `figure_1_matplotlib.pdf`.
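
A minimal sketch of evaluating the regression line on a time grid is shown below. The function name `area_fit` is a hypothetical choice, and the grid of exactly 100 points is one possible reading of the instructions.

```python
import numpy as np

def area_fit(t):
    """Return the linear growth relation A(t) = 2.93 t + 16.2 (t in weeks)."""
    return 2.93 * t + 16.2

# 100 equally spaced times from 0 to 21 weeks, both endpoints included
t = np.linspace(0, 21, 100)
A = area_fit(t)

print(A[0], A[-1])   # area at t = 0 and at t = 21 weeks
```

Because `area_fit` uses only arithmetic, it works on a single number and on a whole NumPy array alike, so no explicit loop is needed.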

---------------

### Optional: Neurofilaments Abundance

Now, let's look into the NF data obtained from Hoffman et al. (1984). Let's use Python and `matplotlib`.

#### Import and plot the area vs. NF data

* Follow the same steps as before to import, print, and plot the number of NFs as a function of axonal area, using the data from the `area_nf_(Hoff.84)` file in the folder `data`. You should have the same data plot as Figure 1B in the paper. Don't forget to include the units in your labels. <br><br>

* Set the limits of the $x$ and $y$ axes as well as the ranges of their ticks to appear exactly the same as the original Figure 1B. <br><br>

* Again, since these are data points, use symbols with no lines connecting them. <br><br>

* Check the [NumPy documentation](https://numpy.org/doc/stable/index.html) as to whether it implements a function to perform a simple (polynomial) fit to the data. If so, perform such a model fit. What degree do you suggest for the polynomial? What are the fitted parameters of the model? <br><br>

* Add a legend and export the final figure to `figure_1b.pdf`.
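
As one possible illustration of a polynomial fit, the sketch below uses `np.polyfit` on synthetic points that lie exactly on the line $y = 2x + 1$. The points are placeholders, not the NF data, and the degree of 1 is chosen only for this toy example; which degree suits the real data is left to you.

```python
import numpy as np

# Synthetic points on the exact line y = 2x + 1 (placeholders, NOT the NF data)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Least-squares polynomial fit; coefficients are returned highest power first
slope, intercept = np.polyfit(x, y, deg=1)

# Evaluate the fitted polynomial at the same x values
y_fit = np.polyval([slope, intercept], x)

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

The NumPy documentation also describes the newer `numpy.polynomial` interface, which is another option for the same task.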

--------------

## Wrapping up

Zip the entire folder in which this Jupyter notebook resides, including **all** files, and email the zipped file to <drischler@ohio.edu> by the deadline specified above. The notebook has to run entirely, from the top to the bottom.

**Recommended but optional:** Sign up for a free [GitHub](https://github.com/) account if you plan to take the Computational Physics lecture in the fall semester. You can get GitHub Pro for free [through Ohio University](https://help.ohio.edu/TDClient/30/Portal/KB/ArticleDet?ID=499). Just follow the link and the instructions. 

## References

* Hoffman, P., Griffin, J., Gold, B., & Price, D. (1985). Slowing of neurofilament transport and the radial growth of developing nerve fibers. The Journal of Neuroscience, 5(11), 2920–2929. https://doi.org/10.1523/jneurosci.05-11-02920.1985 <br><br>

* Hoffman, P. N., Griffin, J. W., & Price, D. L. (1984). Control of axonal caliber by neurofilament transport. The Journal of Cell Biology, 99(2), 705–714. https://doi.org/10.1083/jcb.99.2.705 <br><br>

* Nowier, R. M., Friedman, A., Brown, A., & Jung, P. (2023). The role of neurofilament transport in the radial growth of myelinated axons. Molecular Biology of the Cell, 34(6). https://doi.org/10.1091/mbc.e22-12-0565
