# ENGN2350 Data-Driven Design and Analysis of Structures and Materials

_Homeworks for fall semester 2025-2026_

Coding exercises to explore the [`f3dasm`](https://f3dasm.readthedocs.io/en/latest/) package.

**General instructions**:

- Read the questions and answer in the cells under the "PUT YOUR CODE IN THE CELL BELOW" message.
- Work through the notebook and make sure you fill in any place that says `YOUR CODE HERE` or `YOUR ANSWER HERE`. You can remove the `'raise NotImplementedError()'` code.
- After "END OF YOUR CODE", there is a cell that contains simple tests (with `assert` statements) to see if you did the exercises correctly. Not all exercises have tests! If you run the cell containing the tests and no error is given, you have successfully solved the exercise!
- Make sure you have the right version of `f3dasm` (2.1.0).

> You can check your `f3dasm` version by running `pip show f3dasm`
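
The same check can also be made from inside the notebook. The sketch below uses only the Python standard library; the helper name `installed_version` is illustrative and is not part of `f3dasm`.

```python
from importlib.metadata import PackageNotFoundError, version

EXPECTED_VERSION = "2.1.0"  # version requested in the instructions above


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("f3dasm")
if found is None:
    print("f3dasm is not installed in this environment")
elif found != EXPECTED_VERSION:
    print(f"Found f3dasm {found}, but this homework expects {EXPECTED_VERSION}")
else:
    print(f"f3dasm {found} is installed")
```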

- **ONLY WORK ON THE EXERCISE IN A JUPYTER NOTEBOOK ENVIRONMENT**

> The homework assignments are generated and automatically graded by the `nbgrader` extension. If you open and save the notebook in Google Colab, metadata from Colab will be added, and the `nbgrader` metadata will be altered. As a result, `nbgrader` will be unable to automatically grade your homework. Therefore, we kindly ask students to only work on the notebook in Jupyter Notebook.

- **DO NOT ADD OR REMOVE CELLS IN THE NOTEBOOK**

> Most cells containing tests are set to read-only, but VS Code can bypass this restriction. Modifying or removing cells in the notebook may disrupt the `nbgrader` system, preventing automatic grading of your homework.

**Instructions for handing in the homework**

- Upload the Jupyter Notebook (`.ipynb` file) to Canvas

If you have any questions about the homework, send an email to Samik (samik_mukhopadhyay@brown.edu) or Elvis (elvis_alexander_aguero_vera@brown.edu).

**Grading**

- In each homework, you can obtain a maximum of 20 points
- Next to each subquestion, the maximum number of obtainable points is listed

Good luck!

You can put your name in the cell below:

In [None]:
NAME = ""

---

## Homework 7

In this homework you will explore the basic usage of the `f3dasm` package.

At the end of this homework you will know:
- how to do Design of Experiments, including creating the `Domain` object, and `ExperimentData` object from a numpy array and a pandas DataFrame.
- how to define the Data Generation module, including creating your custom evaluation function.
- how to save your `ExperimentData` and later retrieve it from disk.
- how to use the package to do simple model selection.

In [None]:
# Import some packages we might need later
import numpy as np

---
### Exercise 1

Consider the function $f(x) = x \; \sin(x)$ on the domain $x \in [0, 10]$.

1.1 _(5 points)_ Do the Design of Experiments manually by creating an `f3dasm.ExperimentData` dataset with $50$ input points that are equally spaced within those bounds.

You are going to do this step-by-step!

---

- Create a `Domain` object called `my_domain` and add the input parameter $x$ to it. Make sure the lower bound of the variable is $0.0$ and the upper bound is $10.0$.

PUT YOUR CODE IN THE CELL BELOW

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

END OF YOUR CODE

In [None]:
# This cell checks if you did the exercise correctly
assert isinstance(my_domain, Domain), "my_domain is not an instance of the Domain class!"
assert 'x' in my_domain.input_space, "There is no parameter named 'x' in your domain!"
assert my_domain.input_space['x'].lower_bound == 0.0, "The lower bound of the parameter x is not 0.0"
assert my_domain.input_space['x'].upper_bound == 10.0, "The upper bound of the parameter x is not 10.0"

---

- Create a `numpy` vector `x_data` of $50$ points that are uniformly spaced between $0$ and $10$.
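
As a reminder of how equal spacing works in `numpy`, the sketch below uses a smaller grid than the one asked for here (5 points on $[0, 1]$, chosen only for illustration). With $n$ points that include both end points, the spacing is $(b - a)/(n - 1)$, so for $50$ points on $[0, 10]$ it is $10/49 \approx 0.2041$.

```python
import numpy as np

# 5 equally spaced points on [0, 1]; both end points are included by default
grid = np.linspace(0.0, 1.0, 5)
print(grid)  # [0.   0.25 0.5  0.75 1.  ]

# The spacing is (stop - start) / (n - 1), not (stop - start) / n
step = (1.0 - 0.0) / (5 - 1)
print(np.allclose(np.diff(grid), step))  # True
```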

PUT YOUR CODE IN THE CELL BELOW

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

END OF YOUR CODE

In [None]:
# This cell checks if you did the exercise correctly
assert np.isclose(x_data[3], 0.61224), "The value of x_data in position 3 is not correct!"
assert np.isclose(x_data[6], 1.22448), "The value of x_data in position 6 is not correct!"
assert np.isclose(x_data[-1], 10.0), "The value of x_data in the last position is not correct!"

---

- Create a new `ExperimentData` object called `my_experimentdata` with the input data `x_data`
and the domain object created in the first step.

PUT YOUR CODE IN THE CELL BELOW

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

END OF YOUR CODE

---

- Create a function $f(x)$ that computes $x \; \sin(x)$.
- Add the decorator `@datagenerator` from `f3dasm` to the function. Name the output of the function `'y'`.

PUT YOUR CODE IN THE CELL BELOW

In [None]:
from f3dasm import datagenerator

@datagenerator(output_names=['y'])
def f(x: np.ndarray) -> float:
    # YOUR CODE HERE
    raise NotImplementedError()
    return float(y)

END OF YOUR CODE

In [None]:
from f3dasm.datageneration import DataGenerator

# This cell checks if you did the exercise correctly
assert np.isclose(f.f(1), 0.84147), "When prompting the function with input x=1, the output is not correct!"
assert np.isclose(f.f(2), 1.81859), "When prompting the function with input x=2, the output is not correct!"
assert isinstance(f, DataGenerator), "f is not an instance of the DataGenerator class!"

---

- Call the datagenerator of function $f(x)$ with the `f3dasm.ExperimentData` and store the output back in `my_experimentdata`.

PUT YOUR CODE IN THE CELL BELOW

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

END OF YOUR CODE

In [None]:
# This cell checks if you did the exercise correctly
assert all(np.isclose(my_experimentdata.to_numpy()[1].ravel(), 
                      np.array([f.f(x) for x in x_data]))), "The output of the experimentdata is not correct!"

---

1.2 _(1 point)_ Plot the function from the 50 points that you defined, label the x-axis as "x" and the y-axis as "y",
and include a title "Exercise 1" in the plot.

PUT YOUR CODE IN THE CELL BELOW

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

END OF YOUR CODE

---

1.3. _(1 point)_ Make a new folder called "exercise_1" inside the "HW7_exercise1" folder and save the `ExperimentData`
object to this folder.

> Note: For this exercise, make sure you put a relative path as the storing location!
>
> So: `./HW7_exercise1` = good, `/home/martin/Documents/GitHub/3dasm_course/Assignments/your_Assignments/HW7_exercise1` = bad
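
The difference between the two kinds of path can be checked with the standard library. The sketch below is independent of `f3dasm`; the helper name `ensure_relative_folder` is hypothetical and only illustrates the idea of creating a nested folder from a relative path.

```python
from pathlib import Path


def ensure_relative_folder(path) -> Path:
    """Create `path` (and missing parents) if it is relative; reject absolute paths."""
    folder = Path(path)
    if folder.is_absolute():
        raise ValueError(f"{folder} is an absolute path; use a relative one instead")
    folder.mkdir(parents=True, exist_ok=True)
    return folder


target = Path("./HW7_exercise1") / "exercise_1"
print(target.is_absolute())  # False: resolved against the current working directory
```

Calling `ensure_relative_folder(target)` would create the folder under the directory from which the notebook is run, so the same notebook works on any machine.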

PUT YOUR CODE IN THE CELL BELOW

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

END OF YOUR CODE

---

1.4. _(1 point)_ Load the ExperimentData object you saved previously into a variable called "`my_loaded_experimentdata`".
Print this `ExperimentData` object and check that it is the same one you saved.

PUT YOUR CODE IN THE CELL BELOW

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

END OF YOUR CODE

In [None]:
# This cell checks if you did the exercise correctly
assert my_loaded_experimentdata.round(6) == my_experimentdata.round(6), "The experimentdata in memory and the reloaded experimentdata are not the same!" 

---
### Exercise 2

In this exercise, you will use `f3dasm` to train different Gaussian Process Regressor (GPR) models with different kernels. While you've done a similar exercise before, you'll find that `f3dasm` simplifies the workflow, making the process more efficient.

2.1 _(1 point)_ Convert `my_experimentdata` from the previous exercise into two `numpy` arrays, `X` and `Y`, where `X` contains the input data and `Y` contains the output data.
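
The test cell for this step expects `X` to have shape `(50, 1)`, that is, one row per sample and one column per input. Judging from the earlier test cell, which reads the outputs from `my_experimentdata.to_numpy()[1]`, the conversion returns the inputs and outputs as a pair. If an array ever comes out one-dimensional, it can be turned into a column as in the sketch below, which uses stand-in data rather than the homework data.

```python
import numpy as np

# Stand-in for a 1-D array of inputs (not the homework data)
values = np.linspace(0.0, 1.0, 4)
print(values.shape)  # (4,)

# reshape(-1, 1) keeps the number of samples and adds a single feature column
column = values.reshape(-1, 1)
print(column.shape)  # (4, 1)
```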

PUT YOUR CODE IN THE CELL BELOW

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

END OF YOUR CODE

In [None]:
# This cell checks if you did the exercise correctly
assert isinstance(X, np.ndarray), "X is not a numpy array"
assert isinstance(Y, np.ndarray), "Y is not a numpy array"
assert X.shape == (50, 1), "The shape of X is not correct"

---

2.2 _(2 points)_ Add Gaussian noise ($z$) to the output (`Y`), such that:

$$
z \sim \mathcal{N}(0, \sigma_z^2), \quad \sigma_z \sim \mathcal{U}(0.5, 1.5)
$$

To handle the stochasticity, create a `numpy.random.Generator` object with `np.random.default_rng` ([link to documentation](https://numpy.org/doc/2.1/reference/random/generator.html#numpy.random.default_rng)) and use $123$ as the seed.
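
The two-stage sampling can be sketched as follows on stand-in data. The sketch assumes a single $\sigma_z$ is drawn for the whole dataset (the formula does not say whether it is per dataset or per point) and uses seed $0$ purely for illustration. Note that `Generator.normal` takes the standard deviation $\sigma_z$, not the variance $\sigma_z^2$.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed 0 for illustration; the exercise asks for 123

# Stand-in outputs (not the homework data), shaped (n_samples, 1)
y_clean = np.zeros((5, 1))

# Stage 1: draw the noise standard deviation from U(0.5, 1.5)
# (assumption: one value shared by all samples)
sigma_z = rng.uniform(0.5, 1.5)

# Stage 2: draw zero-mean Gaussian noise with that standard deviation
z = rng.normal(loc=0.0, scale=sigma_z, size=y_clean.shape)

y_noisy = y_clean + z
print(sigma_z, y_noisy.shape)
```

Because the generator is seeded, rerunning the cell from the top reproduces the same `sigma_z` and the same noise values.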


PUT YOUR CODE IN THE CELL BELOW

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

END OF YOUR CODE

---

2.3 _(1 point)_ With the `train_test_split` function of `scikit-learn`, split the dataset into a train (`X_train`, `y_train`) and test (`X_test`, `y_test`) set, with ratio 80/20 and use $123$ as the random seed:
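
For reference, a minimal sketch of `train_test_split` on stand-in data (10 samples, seed $0$, both chosen only for illustration) looks like this:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data with 10 samples and one feature (not the homework data)
X_demo = np.arange(10, dtype=float).reshape(-1, 1)
y_demo = 2.0 * X_demo

# test_size=0.2 holds out 20% of the samples; random_state fixes the shuffle
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)
print(X_tr.shape, X_te.shape)  # (8, 1) (2, 1)
```

Inputs and outputs are shuffled together, so each row of `y_tr` still belongs to the matching row of `X_tr`.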

PUT YOUR CODE IN THE CELL BELOW

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

END OF YOUR CODE

In [None]:
# This cell checks if you did the exercise correctly
assert (X_train.shape == y_train.shape == (40, 1)), "The shape of X_train or y_train is not correct!"
assert (X_test.shape == y_test.shape == (10, 1)), "The shape of X_test or y_test is not correct"

---

2.4 _(1 point)_ Create a new domain object named `domain_kernel` and add a parameter called `kernel`.

PUT YOUR CODE IN THE CELL BELOW

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

END OF YOUR CODE

In [None]:
# This cell checks if you did the exercise correctly
assert isinstance(domain_kernel, Domain), "domain_kernel is not a Domain instance!"
assert 'kernel' in domain_kernel.input_space, "There is no parameter named 'kernel' in the domain!"

---

2.5 _(2 points)_ Create a datagenerator function `evaluate_regressor` that takes two arguments, `kernel` and `seed`.
- The function creates a Gaussian process regressor with the kernel and seed given by the input of the function
- The model is trained on the training data and predicts the testing data
- The function returns the $R^2$ and MSE on the test data
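
As a reminder of what the two metrics measure, the sketch below computes them by hand with `numpy` on made-up numbers. The function names `mse` and `r2` are illustrative; `sklearn.metrics.mean_squared_error` and `sklearn.metrics.r2_score` compute the same quantities. Since $R^2 = 1 - SS_{res}/SS_{tot}$, it becomes negative whenever the model predicts worse than simply returning the mean of the test outputs.

```python
import numpy as np


def mse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean squared error: average of the squared residuals."""
    return float(np.mean((y_true - y_pred) ** 2))


def r2(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


y_true = np.array([1.0, 2.0, 3.0, 4.0])
good = np.array([1.1, 1.9, 3.2, 3.8])  # close to the truth
bad = np.array([4.0, 3.0, 2.0, 1.0])   # reversed trend

print(mse(y_true, good), r2(y_true, good))  # small error, R^2 close to 1
print(mse(y_true, bad), r2(y_true, bad))    # large error, negative R^2
```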

PUT YOUR CODE IN THE CELL BELOW

In [None]:
# You might want to import some functions ..
# YOUR CODE HERE
raise NotImplementedError()

@datagenerator(output_names=['r2', 'mse'])
def evaluate_regressor(kernel, seed):
    # YOUR CODE HERE
    raise NotImplementedError()
    return r2, mse

END OF YOUR CODE

---

2.6 _(1 point)_ Create a variable `Data_kernels` that is a list of dictionaries. Each dictionary should have only one key (`'kernel'`), and its value should be an instance of one of the `RBF`, `Matern` and `ExpSineSquared` kernel classes from [`sklearn.gaussian_process.kernels`](https://scikit-learn.org/1.5/api/sklearn.gaussian_process.html#module-sklearn.gaussian_process.kernels), one dictionary per kernel.

PUT YOUR CODE IN THE CELL BELOW

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

END OF YOUR CODE

In [None]:
assert isinstance(Data_kernels, list), "Data_kernels is not a list!"
assert all(isinstance(d, dict) for d in Data_kernels), "Not all elements in Data_kernels are dictionaries!"
assert all(list(d.keys()) == ['kernel'] for d in Data_kernels), "Not all dictionaries in Data_kernels have the correct keys!"

---

2.7 _(2 points)_ Create an `ExperimentData` object called `experimentdata_gpr` and initialize it with the experiments in `Data_kernels` (created in the previous exercise). Evaluate the experiments with the `evaluate_regressor` function. Use $123$ as the seed. Print the output to the terminal.

PUT YOUR CODE IN THE CELL BELOW

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

END OF YOUR CODE

The resulting `ExperimentData` should look something like this:

|  | jobs     | kernel                                         | mse       | r2        |
|------|----------|------------------------------------------------|-----------|-----------|
| 0    | finished | 1**2 * RBF(length_scale=1)                     | 38.557356 | -3.393046 |
| 1    | finished | Matern(length_scale=1, nu=1.5)                 | 9.367152  | -0.067250 |
| 2    | finished | ExpSineSquared(length_scale=1, periodicity=1)  | 2.707409  | 0.691530  |
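
The `kernel` column shows the text that `scikit-learn` prints for each kernel object. The sketch below reproduces those strings with default hyperparameters. A label of the form `1**2 * RBF(...)` is what a product of a constant and an `RBF` kernel prints as; how the kernel in the first row was actually constructed is not stated here, so the scaled kernel below is only an assumption for illustration.

```python
from sklearn.gaussian_process.kernels import RBF, ExpSineSquared, Matern

# Multiplying a kernel by a number wraps the number in a ConstantKernel
scaled_rbf = 1.0 * RBF(length_scale=1.0)

print(scaled_rbf)        # 1**2 * RBF(length_scale=1)
print(Matern())          # Matern(length_scale=1, nu=1.5)
print(ExpSineSquared())  # ExpSineSquared(length_scale=1, periodicity=1)
```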


---

2.8 _(2 points)_ Recreate the same experiment, but now also output the predictions (`y_pred`) for the test data (`X_test`). Store a reference to the predicted numpy array in the `ExperimentData` object using the `to_disk=True` functionality. You can find more information on this [in the `f3dasm` documentation](https://f3dasm.readthedocs.io/en/latest/notebooks/design/domain_creation.html#Storing-parameters-on-disk).

PUT YOUR CODE IN THE CELL BELOW

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

END OF YOUR CODE

---

End of the homework!

---