# Exercise 1: Coding and visualizing geostatistics

In this week's exercise we will take our first steps toward converting equations into Python code and visualizing some geochronological data.
We will be using Python tools that are already familiar to us, but applying them in a slightly different way than in the earlier exercises.

For each problem, modify the given notebook and then upload your files to GitHub.
Give your answers to the questions in this week's exercise by editing the notebook in the places indicated.

- **Exercise 1 is due by the start of class on 5.11.**
- Don't forget to check out [the hints for this week's exercise](https://introqg.github.io/qg/lessons/L8/exercise-8-hints.html) if you're having trouble.
- Scores on this exercise are out of 20 points.

# Problem 1: Converting math to Python (8.5 points)

One of the goals of this part of the course is to develop your quantitative geoscience skills, including learning how to convert mathematical equations to Python code.
Doing this allows you to explore how various equations work and produce useful data plots or predictions, something increasingly done by geoscience professionals.

For this problem you are asked to create 3 Python functions to calculate common statistics on a list of values: (1) the mean, (2) the standard deviation, and (3) the standard deviation of the mean (or standard error).

- For each function you will be given a formula that you should convert to a Python function.
- In your functions you should not use existing Python or NumPy functions, other than perhaps a function for calculating the square root.
- Each of the functions should be defined in a Python cell in this notebook so that you can use and test your functions.
- In addition, you should use the NumPy functions `np.mean()`, `np.std()`, and `np.sqrt()` to calculate the mean, standard deviation, and standard error to compare to the values found using your functions.
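
As a rough illustration of the NumPy side of that comparison, the sketch below computes the three reference values for a made-up list. The name `demo_values` and its contents are hypothetical and are not part of the exercise data; the sketch assumes NumPy is available and imported as `np`.

```python
import numpy as np

# Hypothetical test values, NOT taken from the exercise data
demo_values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Reference values calculated with NumPy
numpy_mean = np.mean(demo_values)
numpy_stddev = np.std(demo_values)  # default ddof=0 divides by N
numpy_stderr = np.std(demo_values) / np.sqrt(len(demo_values))

print("NumPy mean:", numpy_mean)
print("NumPy standard deviation:", numpy_stddev)
print("NumPy standard error:", numpy_stderr)
```

NumPy has no single function for the standard error, which is why `np.std()` and `np.sqrt()` are combined here.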

## Part 1: Preparing your test data (1 point)

To ensure your functions are working properly, we will be using some data where the expected values of each function can easily be calculated.
The data comprise ages measured for minerals in the five geochronological samples in the table below.

| Sample    | Subsample ID | Age [Ma] | 
| --------- | ------------ | -------- |
| **F09**   | F09-1        | 2.01     |
|           | F09-2        | 1.95     |
|           | F09-3        | 2.38     |
|           | F09-4        | 2.3      |
|           | F09-5        | 2.0      |
| **BH63**  | BH63-1       | 4.77     |
|           | BH63-2       | 5.11     |
|           | BH63-3       | 3.30     |
|           | BH63-4       | 3.34     |
|           | BH63-5       | 4.45     |
| **BH161** | BH161-1      | 8.8      |
|           | BH161-3      | 2.15     |
| **BH412** | BH412-1      | 4.74     |
|           | BH412-2      | 5.14     |
|           | BH412-3      | 5.14     |
|           | BH412-4      | 5.5      |
|           | BH412-5      | 5.1      |
| **BHF04** | BHF04-1      | 2.21     |
|           | BHF04-3      | 5.1      |
|           | BHF04-4      | 2.93     |
|           | BHF04-5      | 4.69     |

*Table 1. Apatite (U-Th)/He thermochronometer ages from [Coutand et al. (2014)](https://doi.org/10.1002/2013JB010891)*.

For this part you should:

- Create 5 Python lists named `f09`, `bh63`, `bh161`, `bh412`, and `bhf04` that contain the age values listed for each subsample above.
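
If the list syntax is unfamiliar, a minimal sketch of the pattern is shown here. The list name `demo_sample` and its ages are hypothetical and do not come from Table 1.

```python
# Hypothetical sample name and ages [Ma], NOT taken from Table 1
demo_sample = [1.25, 1.5, 1.75]

# Lists keep their values in order and are indexed starting from 0
print("Number of ages:", len(demo_sample))
print("First age:", demo_sample[0], "Ma")
print("Last age:", demo_sample[-1], "Ma")
```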

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
# This test should print the first age in the list `f09`
print("The first age in list f09 is", f09[0], "Ma")


## Part 2: Creating and testing your `mean()` function (2 points)

The *mean* or *average* $\bar{x}$ should be calculated using a function called `mean()`.
  
\begin{equation}
  \Large
  \bar{x} = \frac{\Sigma x_{i}}{N}
\end{equation}

*Equation 1. The mean value, where $x_{i}$ is a value to be included in the mean calculation and $N$ is the total number of values to average*.
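
As a quick check of Equation 1 by hand, consider a made-up set of $N = 8$ values (2, 4, 4, 4, 5, 5, 7, 9) that is not part of the exercise data:

\begin{equation}
  \Large
  \bar{x} = \frac{2 + 4 + 4 + 4 + 5 + 5 + 7 + 9}{8} = \frac{40}{8} = 5
\end{equation}

A small hand calculation like this can be a useful test case for your own function.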
    
For this part you should:

- Create a function called `mean()` in the cell below.
- Print out the mean value calculated using your `mean()` function and the one from `np.mean()` (from NumPy) for each of the age lists created in Part 1, using the format below.

  ```
  My mean for f09: 2.1280. NumPy mean for f09: 2.1280.
  ```
  
  **Note**: Like the example above, you should have 4 numbers after the decimal place in your output.
  In case you've forgotten how to do that, you can refer to the [hints for Exercise 1](https://introqg.github.io/qg/lessons/L8/exercise-8-hints.html) for some ideas.
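
One possible way to get 4 numbers after the decimal place is the `.4f` format specifier in an f-string, sketched below. The variable names and values are hypothetical and serve only to illustrate the formatting.

```python
# Hypothetical values, used only to illustrate the format specifier
demo_value = 3.14159265
short_value = 2.5

# .4f rounds to 4 decimal places and pads with trailing zeros if needed
demo_line = f"Value for demo: {demo_value:.4f}."
short_line = f"Value for short: {short_value:.4f}."

print(demo_line)
print(short_line)
```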

In [None]:
def mean(numbers):
    """Returns the average value of a collection of numbers."""
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
# This test print should work
print("Mean for f09:", mean(f09))


## Part 3: Creating and testing your `stddev()` function (2 points)

The *standard deviation* $\sigma_{x}$ should be calculated using a function called `stddev()`.

\begin{equation}
  \Large
  \sigma_{x} = \sqrt{\frac{1}{N} \Sigma \left( x_{i} - \bar{x} \right)^{2}}
\end{equation}

*Equation 2. The standard deviation*.
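
To see how Equation 2 works by hand, consider the made-up values 2, 4, 4, 4, 5, 5, 7, 9 (not part of the exercise data). Their mean is $\bar{x} = 40 / 8 = 5$, so the squared differences from the mean are 9, 1, 1, 1, 0, 0, 4, and 16:

\begin{equation}
  \Large
  \sigma_{x} = \sqrt{\frac{1}{8} \left( 9 + 1 + 1 + 1 + 0 + 0 + 4 + 16 \right)} = \sqrt{\frac{32}{8}} = 2
\end{equation}

Note that Equation 2 divides by $N$ rather than $N - 1$.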

For this part you should:

- Create a function called `stddev()` in the cell below.
- Print out the standard deviation value calculated using your `stddev()` function and the one from `np.std()` (from NumPy) for each of the age lists created in Part 1, using the format below.

  ```
  My standard deviation for f09: 2.1280. NumPy standard deviation for f09: 2.1280.
  ```
  
  **Note**: Like the example above, you should have 4 numbers after the decimal place in your output. The numbers in the example only illustrate the format; your standard deviation values will differ.
  In case you've forgotten how to do that, you can refer to the [hints for Exercise 1](https://introqg.github.io/qg/lessons/L8/exercise-8-hints.html) for some ideas.

In [None]:
# Import the math module to use the square root function sqrt()
import math

def stddev(numbers):
    """Returns the standard deviation of a collection of numbers."""
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
# This test print should work
print("Standard deviation for f09:", stddev(f09))


## Part 4: Creating and testing your `stderr()` function (2 points)

The *standard deviation of the mean* or *standard error* $\sigma_{\bar{x}}$ should be calculated using a function called `stderr()`.

\begin{equation}
  \Large
  \sigma_{\bar{x}} = \frac{\sigma_{x}}{\sqrt{N}}
\end{equation}

*Equation 3. The standard error*.
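
As a hand-worked illustration of Equation 3, assume a made-up set of $N = 8$ values with a standard deviation of $\sigma_{x} = 2$ (not part of the exercise data):

\begin{equation}
  \Large
  \sigma_{\bar{x}} = \frac{2}{\sqrt{8}} \approx 0.7071
\end{equation}

Because of the division by $\sqrt{N}$, the standard error is smaller than the standard deviation whenever there is more than one value.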

For this part you should:

- Create a function called `stderr()` in the cell below.
- Print out the standard error value calculated using your `stderr()` function and the one found using the `np.std()` and `np.sqrt()` functions for each of the age lists created in Part 1, using the format below.

  ```
  My standard error for f09: 2.1280. NumPy standard error for f09: 2.1280.
  ```
  
  **Note**: Like the example above, you should have 4 numbers after the decimal place in your output. The numbers in the example only illustrate the format; your standard error values will differ.
  In case you've forgotten how to do that, you can refer to the [hints for Exercise 1](https://introqg.github.io/qg/lessons/L8/exercise-8-hints.html) for some ideas.

In [None]:
# Import the math module to use the square root function sqrt()
import math

def stderr(numbers):
    """Returns the standard error of a collection of numbers."""
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
# This test print should work
print("Standard error for f09:", stderr(f09))


## Part 5: Questions for Problem 1 (1.5 points)

1. How much do the standard deviation values vary among the samples in Table 1, and what does this tell you about the measured values?
2. Do you observe a large difference between the standard deviation and standard error values? Is it clear why you should always indicate whether reported values are standard deviations or standard errors?
3. Did you observe any discrepancy between the values calculated by your functions and those found using NumPy? If so, what is the cause of the difference(s)?

YOUR ANSWER HERE