# `f3dasm`: Framework for Data-Driven Design and Analysis of Structures and Materials 
*April 3rd, 2023* <br>
*Code Release Week \#1*

# Table of contents

**1. Project introduction**
- 1.1 Overview
- 1.2 Conceptual Map and use cases
- 1.3 Computational Framework
- 1.4 Installation and getting started

**2. Demonstration**

## 1.1 Overview

`f3dasm` is an attempt to unite data-driven design and analysis of structures and materials.
More concretely, 

## 1.2 Conceptual map and use cases

**Using `f3dasm` to handle your design of experiments**
- *Have your own functions and modules and coat them in a `f3dasm` sauce to manage and scale-up your experiments!*

**Using `f3dasm` to benchmark or compare models**
- *Go fully `f3dasm`: use existing implementations to benchmark parts of the data-driven machine learning process!*

**Develop on `f3dasm`**
- *Work hard, play hard: work towards making your implementations an official `f3dasm` extension!*



## 1.4 Installation and getting started

### System requirements
`f3dasm` is compatible with:
1. Python 3.7 to 3.10.
2. the three major operating systems (Linux, MacOS, Ubuntu).
3. the default environment of Google Colab (Python 3.8, Linux).
4. the `pip` package manager system.

Installation instructions can be found in the documentation under [Getting Started](https://bessagroup.github.io/F3DASM/gettingstarted.html).

You can check whether the installation was successful by importing `f3dasm`. It will show the installed version and the dependencies:

In [1]:
import f3dasm

2023-03-30 20:30:11,984 - Imported f3dasm (version: 0.2.98)
2023-03-30 20:30:13.027564: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-03-30 20:30:13.208754: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-03-30 20:30:13.970766: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.1/lib64:
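If you prefer a check that does not import the package (and so does not trigger the TensorFlow start-up messages shown above), the installed version can also be read from the package metadata. This is a minimal sketch using only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is hypothetical and not part of `f3dasm`.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string, or None if f3dasm is not installed in this environment
print(installed_version("f3dasm"))
```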

#### Distinction between Python package and repository
- The PyPI package (`pip install f3dasm`) contains the code that is used when installing the package as a **user**. It contains only the `main` branch version.
- The GitHub repository is mainly for **developers** and, besides the package, includes:
  - Studies (more on that later)
  - Test suite
  - Documentation source
  - Tutorial notebooks

### Extensions

- The package contains many implementations for each of the blocks.
- The installation of dependencies of `f3dasm` is modular: you decide what you want to use or not.

We can distinguish three ways of using `f3dasm`:

#### Using `f3dasm` to handle your design of experiments
*Have your own functions and modules and coat them in a `f3dasm` sauce to manage and scale-up your experiments!*

The **core** package: contains the minimal installation to use `f3dasm` without extended features. 
Installed with `pip install f3dasm`

The core package offers the following features:
1. a way to parametrize your experiment with the **design-of-experiments** classes.
2. the option to investigate your experiment by **sampling** and **optimizing** your design.
3. guidance in **parallelizing** your program and ordering your data.
4. ways of deploying your **experiment** on the HPC (TORQUE system).

The core package requires the following dependencies:
- `numpy` and `scipy`: for numerical operations
- `pandas`: for the representation of the design of experiments
- `matplotlib`: for plotting
- `hydra-core`: for deploying your experiment 
- `pathos`: for multiprocessing
- `autograd`: for computing gradients

#### Using `f3dasm` to benchmark or compare models
*Go fully `f3dasm`: use existing implementations to benchmark parts of the data-driven machine learning process!*

You can use the core package on its own, but it is advised to enrich `f3dasm` with its **extensions**.

The extensions contain the following features:
1. provide various **implementations** to accommodate common machine learning workflows.
2. provide **adapter** classes that link common machine learning libraries to `f3dasm` base classes. 


For each of the blocks, extensions can be installed to extend the choice of implementations. Installed with `pip install f3dasm[<name of extension>]`

The following extensions are available:
- **machinelearning**: containing various `tensorflow` related models
- **sampling**: containing sampling strategies from `SALib`
- **optimization**: containing various optimizers from `GPyOpt`, `pygmo` and `tensorflow`


#### Develop on `f3dasm`
*Work hard, play hard: work towards making your implementations an official `f3dasm` extension!*

If you want your implementation to be part of the `f3dasm` package, you can develop an adapter and/or implementation for `f3dasm`

The **development** package: contains the full installation plus requirements for developing on `f3dasm`. 
Installed with `pip install f3dasm[dev]`

Information on how to contribute to `f3dasm` can be found [on the wiki page of the GitHub repository](https://github.com/bessagroup/F3DASM/wiki)!


### Feature requirements
The first release of `f3dasm` should:
1. give the user a way to parametrize a **design-of-experiments** from their own program
2. give the user the option to **sample** and **optimize** from their design with various optimizers
3. give the user guidance in **parallelizing** and ordering their experiment data
4. provide high-level classes to deploy **simulations**
5. give the user ways of deploying their **experiment** on the HPC (TORQUE system)

The first release of `f3dasm` **should not**:
1. force the user to use the built-in machine learning models and optimizers (these are extensions)
2. include state-of-the-art machine learning models as extensions

## Scale things up on the HPC!

Import some other packages and set a seed

In [2]:
import numpy as np
import logging
from pathos.helpers import mp  # For multiprocessing!
import time

SEED = 42

### Your program

Let's say we have a program that we want to execute. Importantly, this could be **anything**, for example:
- Calculating the loss of some compliance curve in topology optimization
- Computing the mean stress and strain from an Abaqus simulation
- Benchmarking various regressors in a multi-fidelity setting

At the top level of your experiment, you will probably have a main function that accepts some arguments and returns the quantity of interest.

Let's create such a function, just for demonstration purposes.

In [3]:
def my_own_program(a: float, b: float, c: float) -> float:    
    functions = [f3dasm.functions.Rastrigin, f3dasm.functions.Levy, f3dasm.functions.Ackley]
    y = []
    for func in functions:
        f = func(dimensionality=3, scale_bounds=np.tile([-1.,1.], (3,1)))
        time.sleep(.1)
        y.append(f(np.array([a,b,c])).ravel()[0])

    # Sum the values
    out = sum(y)
    logging.info(f"Executed program with a={a:.3f}, b={b:.3f}, c={c:.3f}: \t Result {out:.3f}")
    return out

What are we seeing:
- The program takes three floats and returns a float.
- It creates three 3D benchmark functions, evaluates them sequentially, and sums the results.
- We simulate some computational cost (0.1 seconds per evaluation) by calling `time.sleep()`.
- We write the result to a log.

> Note: `my_own_program` uses the integrated benchmark functions from `f3dasm`, but this could very well be one of your codes without any dependency on `f3dasm`.
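
As an illustration of that note, here is a minimal sketch of a stand-in program that depends only on the standard library. It evaluates just the Rastrigin function, written out from its textbook definition rather than taken from `f3dasm`, so its results will not match the logs in this notebook. The name `my_standalone_program` and the logging format are assumptions; the `logging.basicConfig` call makes `INFO` messages visible when nothing else has configured logging.

```python
import logging
import math
import time

# Show INFO messages; has no effect if the root logger was already configured
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(message)s")


def rastrigin(x):
    """Textbook Rastrigin function: 10*n + sum(x_i^2 - 10*cos(2*pi*x_i))."""
    return 10.0 * len(x) + sum(xi**2 - 10.0 * math.cos(2.0 * math.pi * xi) for xi in x)


def my_standalone_program(a: float, b: float, c: float) -> float:
    """Hypothetical program without any dependency on f3dasm."""
    time.sleep(0.1)  # simulate some computational cost
    out = rastrigin([a, b, c])
    logging.info(f"Executed program with a={a:.3f}, b={b:.3f}, c={c:.3f}: \t Result {out:.3f}")
    return out


# The origin is the global minimum of the Rastrigin function, with value 0
my_standalone_program(0.0, 0.0, 0.0)
```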

Executing multiple experiments is easy:

In [4]:
inputs = np.random.uniform(size=(10,3))

start_time = time.time()
outputs = np.array([my_own_program(*input_vals) for input_vals in inputs])
time_not_parallel = time.time() - start_time

print(f"It took {time_not_parallel:.5f} seconds to execute this for loop")

2023-03-30 20:30:15,644 - Executed program with a=0.443, b=0.825, c=0.950: 	 Result 355.223
2023-03-30 20:30:15,948 - Executed program with a=0.168, b=0.787, c=0.096: 	 Result 215.947
2023-03-30 20:30:16,251 - Executed program with a=0.187, b=0.782, c=0.794: 	 Result 160.605
2023-03-30 20:30:16,556 - Executed program with a=0.115, b=0.624, c=0.761: 	 Result 199.399
2023-03-30 20:30:16,860 - Executed program with a=0.182, b=0.948, c=0.595: 	 Result 80.616
2023-03-30 20:30:17,165 - Executed program with a=0.137, b=0.142, c=0.017: 	 Result 124.994
2023-03-30 20:30:17,469 - Executed program with a=0.376, b=0.589, c=0.676: 	 Result 222.736
2023-03-30 20:30:17,772 - Executed program with a=0.038, b=0.604, c=0.749: 	 Result 155.112
2023-03-30 20:30:18,077 - Executed program with a=0.545, b=0.362, c=0.663: 	 Result 115.630
2023-03-30 20:30:18,386 - Executed program with a=0.032, b=0.666, c=0.768: 	 Result 142.940


It took 3.04596 seconds to execute this for loop


We can save the values of `outputs` for later use
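
One simple way to do this, sketched here with the standard library only, is to write the inputs and outputs to a JSON file and read them back later. The file name and the `save_results` and `load_results` helpers are hypothetical; NumPy arrays such as `inputs` and `outputs` would first need `.tolist()`, because `json` cannot serialize arrays directly.

```python
import json
import os
import tempfile


def save_results(path, inputs, outputs):
    """Write inputs and outputs (plain lists) to a JSON file."""
    with open(path, "w") as f:
        json.dump({"inputs": inputs, "outputs": outputs}, f)


def load_results(path):
    """Read the inputs and outputs back from a JSON file."""
    with open(path) as f:
        stored = json.load(f)
    return stored["inputs"], stored["outputs"]


# Made-up values for illustration; with NumPy use inputs.tolist() and outputs.tolist()
example_inputs = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
example_outputs = [1.5, 2.5]

path = os.path.join(tempfile.mkdtemp(), "results.json")
save_results(path, example_inputs, example_outputs)
loaded_inputs, loaded_outputs = load_results(path)
```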

## Local parallelization

If you are familiar with [multiprocessing](https://docs.python.org/3/library/multiprocessing.html), you might already know that we can speed up this function by parallelizing the internal for loop.

We create a multiprocessing pool (`mp.Pool()`) that distributes the function evaluations over the cores of our machine:

In [5]:
def my_own_program_parallel(a: float, b: float, c: float) -> float:
    def evaluate_function(func, a, b, c):
        f = func(dimensionality=3, scale_bounds=np.tile([-1.,1.], (3,1)))
        y = f(np.array([a,b,c])).ravel()[0]
        time.sleep(.1)
        return y

    functions = [f3dasm.functions.Rastrigin, f3dasm.functions.Levy, f3dasm.functions.Ackley]
    with mp.Pool() as pool:
        y = pool.starmap(evaluate_function, [(func, a, b, c) for func in functions])

    # Sum the values
    out = sum(y)

    logging.info(f"Executed program with a={a:.3f}, b={b:.3f}, c={c:.3f}: \t Result: {out:.3f}")
    return out

Executing this function speeds up the process:

In [6]:
inputs = np.random.uniform(size=(10,3))

start_time = time.time()
outputs = np.array([my_own_program_parallel(*input_vals) for input_vals in inputs])
time_parallel = time.time() - start_time

print(f"It took {time_parallel:.5f} seconds to execute this for loop")
print(f"We are {time_not_parallel-time_parallel:.5f} seconds faster by parellelization!")

2023-03-30 20:30:18,771 - Executed program with a=0.247, b=0.314, c=0.646: 	 Result: 136.850
2023-03-30 20:30:19,030 - Executed program with a=0.264, b=0.488, c=0.115: 	 Result: 102.013
2023-03-30 20:30:19,293 - Executed program with a=0.437, b=0.355, c=0.557: 	 Result: 121.198
2023-03-30 20:30:19,541 - Executed program with a=0.773, b=0.031, c=0.727: 	 Result: 198.581
2023-03-30 20:30:19,802 - Executed program with a=0.327, b=0.942, c=0.786: 	 Result: 131.037
2023-03-30 20:30:20,054 - Executed program with a=0.078, b=0.618, c=0.093: 	 Result: 83.309
2023-03-30 20:30:20,312 - Executed program with a=0.992, b=0.958, c=0.593: 	 Result: 131.035
2023-03-30 20:30:20,570 - Executed program with a=0.686, b=0.184, c=0.946: 	 Result: 171.295
2023-03-30 20:30:20,827 - Executed program with a=0.304, b=0.389, c=0.785: 	 Result: 126.305
2023-03-30 20:30:21,077 - Executed program with a=0.558, b=0.479, c=0.004: 	 Result: 88.377


It took 2.53714 seconds to execute this for loop
We are 0.50881 seconds faster by parellelization!
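
The gain is modest: each call still takes about 0.25 seconds instead of the ideal 0.1 seconds, most likely because a new pool is created on every call. An alternative worth considering is to parallelize the outer loop over the inputs instead. The sketch below does this with a thread pool from the standard library, which works here because the cost is simulated with `time.sleep`; CPU-bound programs would need a process pool instead. `slow_program` is a hypothetical stand-in for `my_own_program`.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_program(a: float, b: float, c: float) -> float:
    """Hypothetical stand-in for my_own_program: sleeps, then returns a sum."""
    time.sleep(0.1)  # simulate some computational cost
    return a + b + c


inputs = [(0.1 * i, 0.2 * i, 0.3 * i) for i in range(10)]

# Sequential reference
sequential = [slow_program(*args) for args in inputs]

# Parallel over the inputs; map() returns results in the order of the inputs
with ThreadPoolExecutor(max_workers=5) as executor:
    parallel = list(executor.map(lambda args: slow_program(*args), inputs))

# Both approaches should give identical results in identical order
print(parallel == sequential)
```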


## Scale-up: challenges

Now we would like to really scale things up. What challenges lie along the way? I asked ChatGPT4:


- **Experiment design and analysis**: As the complexity of the experiment increases, it becomes more difficult to design experiments that are robust and reproducible, and to analyze the results in a meaningful way. This can lead to issues with experimental design, parameter tuning, and statistical analysis.

- **Parallelization and distribution**: As experiments become larger, it may be necessary to parallelize or distribute the computations across multiple machines or nodes in order to reduce the overall runtime. This introduces additional challenges such as synchronization between distributed processes.

- **Managing data**: As the volume of data generated by an experiment increases, it becomes more difficult to manage and store that data. This can lead to issues with data corruption, loss, or inconsistency.

This is where `f3dasm` is a helping hand!

#### Experiment design and analysis

We can create a `f3dasm.DesignSpace` to capture the variables of interest:
- A `f3dasm.DesignSpace` consists of an input and output list of `f3dasm.Parameter` objects

In [7]:
param_a = f3dasm.ContinuousParameter(name='a', lower_bound=-1., upper_bound=1.)
param_b = f3dasm.ContinuousParameter(name='b', lower_bound=-1., upper_bound=1.)
param_c = f3dasm.ContinuousParameter(name='c', lower_bound=-1., upper_bound=1.)
param_out = f3dasm.ContinuousParameter(name='y')

design = f3dasm.DesignSpace(input_space=[param_a, param_b, param_c], output_space=[param_out])

We can create an object to store the experiments, `f3dasm.ExperimentData`, but we can also **sample from this design space**.
We do that with the `f3dasm.sampling` submodule:

> Note that this submodule offers an extension (`f3dasm[sampling]`) that includes sampling strategies from `SALib`.

In [8]:
# Create the sampler object
sampler = f3dasm.sampling.RandomUniform(design=design, seed=SEED)

data: f3dasm.ExperimentData = sampler.get_samples(numsamples=10)
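
Conceptually, uniform random sampling draws each input independently between its lower and upper bound, and fixing the seed makes the draw reproducible. The following is a rough standard-library sketch of that idea, not the `f3dasm` implementation; `sample_uniform` and the `bounds` dictionary are hypothetical, and the values it produces will differ from those of `RandomUniform`.

```python
import random


def sample_uniform(bounds, numsamples, seed):
    """Draw `numsamples` points uniformly within per-parameter (lower, upper) bounds."""
    rng = random.Random(seed)  # local generator, so the global random state is untouched
    return [
        {name: rng.uniform(lower, upper) for name, (lower, upper) in bounds.items()}
        for _ in range(numsamples)
    ]


# Same bounds as the three input parameters defined above
bounds = {"a": (-1.0, 1.0), "b": (-1.0, 1.0), "c": (-1.0, 1.0)}
samples = sample_uniform(bounds, numsamples=10, seed=42)
```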

Under the hood, the data object is a pandas DataFrame:

In [9]:
data.data

| | input: a | input: b | input: c | output: y |
|---|---|---|---|---|
| 0 | -0.25092 | 0.901429 | 0.463988 | NaN |
| 1 | 0.197317 | -0.687963 | -0.688011 | NaN |
| 2 | -0.883833 | 0.732352 | 0.20223 | NaN |
| 3 | 0.416145 | -0.958831 | 0.93982 | NaN |
| 4 | 0.664885 | -0.575322 | -0.63635 | NaN |
| 5 | -0.633191 | -0.391516 | 0.049513 | NaN |
| 6 | -0.13611 | -0.417542 | 0.223706 | NaN |
| 7 | -0.721012 | -0.415711 | -0.267276 | NaN |
| 8 | -0.08786 | 0.570352 | -0.600652 | NaN |
| 9 | 0.028469 | 0.184829 | -0.907099 | NaN |


The `y` values are NaN because we haven't evaluated our experiment yet. Before we do that, note that we can retrieve the input columns of a specific row as a dictionary:

In [10]:
data.get_inputdata_by_index(index=3)

{'a': 0.416145155592091, 'b': -0.9588310114083951, 'c': 0.9398197043239886}

Unpacking such a dictionary as keyword arguments lets us run our program for every row and store the result:

In [11]:
for index in range(data.get_number_of_datapoints()):
    value = my_own_program_parallel(**data.get_inputdata_by_index(index))
    data.set_outputdata_by_index(index, value)

2023-03-30 20:30:21,721 - Executed program with a=-0.251, b=0.901, c=0.464: 	 Result: 261.134
2023-03-30 20:30:21,975 - Executed program with a=0.197, b=-0.688, c=-0.688: 	 Result: 19.109
2023-03-30 20:30:22,234 - Executed program with a=-0.884, b=0.732, c=0.202: 	 Result: 321.825
2023-03-30 20:30:22,508 - Executed program with a=0.416, b=-0.959, c=0.940: 	 Result: 170.930
2023-03-30 20:30:22,760 - Executed program with a=0.665, b=-0.575, c=-0.636: 	 Result: 79.458
2023-03-30 20:30:23,023 - Executed program with a=-0.633, b=-0.392, c=0.050: 	 Result: 139.412
2023-03-30 20:30:23,281 - Executed program with a=-0.136, b=-0.418, c=0.224: 	 Result: 115.536
2023-03-30 20:30:23,534 - Executed program with a=-0.721, b=-0.416, c=-0.267: 	 Result: 83.109
2023-03-30 20:30:23,798 - Executed program with a=-0.088, b=0.570, c=-0.601: 	 Result: 215.214
2023-03-30 20:30:24,068 - Executed program with a=0.028, b=0.185, c=-0.907: 	 Result: 109.803
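
The loop above works because `**` unpacks a dictionary into keyword arguments, so the keys must match the parameter names of the program. Here is a minimal sketch of the same pattern with plain dictionaries; the `rows` list and `toy_program` are made up for illustration and stand in for the `ExperimentData` object and your own program.

```python
def toy_program(a: float, b: float, c: float) -> float:
    """Hypothetical stand-in for your own program."""
    return a + 2.0 * b + 3.0 * c


# One dictionary per experiment; the output 'y' is still unknown
rows = [
    {"a": 1.0, "b": 0.0, "c": 0.0, "y": None},
    {"a": 0.0, "b": 1.0, "c": 0.0, "y": None},
    {"a": 0.0, "b": 0.0, "c": 1.0, "y": None},
]

for row in rows:
    # Keep only the input columns, then unpack them as keyword arguments
    inputs = {name: value for name, value in row.items() if name != "y"}
    row["y"] = toy_program(**inputs)  # same as toy_program(a=..., b=..., c=...)

print([row["y"] for row in rows])  # → [1.0, 2.0, 3.0]
```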


Now our data object is filled:

In [12]:
data.data

| | input: a | input: b | input: c | output: y |
|---|---|---|---|---|
| 0 | -0.25092 | 0.901429 | 0.463988 | 261.134214 |
| 1 | 0.197317 | -0.687963 | -0.688011 | 19.109039 |
| 2 | -0.883833 | 0.732352 | 0.20223 | 321.825051 |
| 3 | 0.416145 | -0.958831 | 0.93982 | 170.930424 |
| 4 | 0.664885 | -0.575322 | -0.63635 | 79.458296 |
| 5 | -0.633191 | -0.391516 | 0.049513 | 139.411721 |
| 6 | -0.13611 | -0.417542 | 0.223706 | 115.535908 |
| 7 | -0.721012 | -0.415711 | -0.267276 | 83.1094 |
| 8 | -0.08786 | 0.570352 | -0.600652 | 215.214311 |
| 9 | 0.028469 | 0.184829 | -0.907099 | 109.803282 |


#### Parallelization and distribution

`f3dasm` can handle parallelization on the HPC cluster and experiment distribution.

To set this up, navigate to the folder where you want to create your experiment and run `f3dasm.experiment.quickstart()`:

In [14]:
f3dasm.experiment.quickstart()


# I'll not run this because this is a demo

FileExistsError: [Errno 17] File exists: '/home/martin/Documents/GitHub/F3DASM_practical/code release week/hydra'

This creates the following files and folders:

```
└── my_experiment 
    ├── main.py
    ├── config.py
    ├── config.yaml
    ├── default.yaml
    ├── pbsjob.sh
    ├── README.md
    └── hydra/job_logging
        └── custom_script.py
```

Without going into too much detail, the following things have already been set up automatically:

**Logging**
- `hydra` (and the `custom_script.py`) take care of all (multiprocess) logging
- including writing across nodes when executing array jobs!

**Saving data**
- `hydra` creates a new `outputs/<HPC JOBID>/` directory that saves all output files, logs and settings when executing `main.py`
- When executing array jobs, all array jobs write to the same folder!

**Parameter storage**
- `config.yaml`, `config.py` and `default.yaml` can be used for easy reproducibility and parameter tuning of your experiment!

**Parallelization**
- `pbsjob.sh` can be used to execute your `main.py` file on the HPC, including array-jobs.
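
To give an idea of how array jobs can share work, the sketch below assigns rows of an experiment to jobs in a round-robin fashion. It assumes the scheduler exposes the array index through the `PBS_ARRAYID` environment variable, as TORQUE does; the function `rows_for_job`, the fallback to index 0, and the job count are illustrative assumptions and are not taken from the generated `pbsjob.sh` or `main.py`.

```python
import os


def rows_for_job(n_rows: int, n_jobs: int, job_id: int) -> list:
    """Return the row indices handled by array job `job_id` (round-robin split)."""
    if not 0 <= job_id < n_jobs:
        raise ValueError(f"job_id must be in [0, {n_jobs}), got {job_id}")
    return list(range(job_id, n_rows, n_jobs))


# Assumption: PBS_ARRAYID is set inside array jobs; default to 0 when running locally
job_id = int(os.environ.get("PBS_ARRAYID", "0"))

# Hypothetical setup: 10 rows shared by 3 array jobs
print(rows_for_job(n_rows=10, n_jobs=3, job_id=job_id % 3))
```

With this split, every row is handled by exactly one job, and no communication between jobs is needed to decide who does what.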