## <font style="font-family:roboto;color:#455e6c"> Introduction to Workflows </font>  

<div class="admonition note" name="html-admonition" style="background:#e3f2fd; padding: 10px">
<font style="font-family:roboto;color:#455e6c"> <b> DPG Tutorial: Automated Workflows and Machine Learning for Materials Science Simulations </b> </font> <br>
<font style="font-family:roboto;color:#455e6c"> 16 March 2024 </font>
</div>

We will use [pyiron_workflow](https://github.com/pyiron/pyiron_workflow), a framework for constructing workflows as computational graphs from simple Python functions, to create a simple workflow for data analysis. Converting your script to a workflow lets you use a number of powerful features that pyiron provides, such as data management and job management, while ensuring that the workflow is fully reproducible.

This example covers a very common use case in materials science: using data from a [tensile test](https://en.wikipedia.org/wiki/Tensile_testing) to calculate the [Young's modulus](https://en.wikipedia.org/wiki/Young%27s_modulus).



We start from a data file in csv format. The file contains data from a tensile test of typical S355 (material number: 1.0577) structural steel (designation of steel according to DIN EN 10025-2:2019). The data were generated at the [Bundesanstalt für Materialforschung und -prüfung (BAM)](https://zenodo.org/communities/bam) in the framework of the digitization project [Innovationplatform MaterialDigital (PMD)](https://www.materialdigital.de/) which, amongst other activities, aims to store data in a semantically and machine-understandable way.

### References

- Schilling, M., Glaubitz, S., Matzak, K., Rehmer, B., & Skrotzki, B. (2022). Full dataset of several mechanical tests on an S355 steel sheet as reference data for digital representations (1.0.0) [Data set](https://doi.org/10.5281/zenodo.6778336)

Let's start with a visualisation of what such a workflow looks like:

<img src="img/workflow-dpg.png" width="700">

In the tensile test experiment, the force (load) and elongation values are recorded and saved in a csv file, which forms the dataset. We would like to read in this dataset and convert the load and elongation to stress and strain. Then we plot the results and calculate the Young's modulus, which is the slope of the linear, elastic part of the stress-strain curve. The calculated value can depend on the strain cutoff used to delimit this linear part, which is something we will explore.

<div class="admonition note" name="html-admonition" style="background: #FFEDD1; padding: 10px">
<p class="title"><b>Note</b></p>
Note that the stress and strain used in this notebook are actually <a href="https://en.wikipedia.org/wiki/Stress%E2%80%93strain_curve">engineering stress and strain</a>.
</div>
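
For reference, the standard definitions behind these quantities can be written as follows, where $F$ is the load, $A_0$ the initial cross-sectional area, $\Delta L$ the elongation and $L_0$ the initial gauge length of the specimen. The symbol names are chosen here for illustration and do not appear in the dataset.

$$
\sigma = \frac{F}{A_0}, \qquad \varepsilon = \frac{\Delta L}{L_0}, \qquad E = \frac{\Delta \sigma}{\Delta \varepsilon} \quad \text{(linear elastic region)}
$$

The elongation column in this dataset is already given in percent, so only the load needs an explicit conversion in the functions that follow.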

To create such a workflow, we start by defining functions that each perform one of these steps. We will use `pyiron_workflow` to compose them into a workflow, which can then be easily visualised and executed.

Before we move on to the actual workflow, a crash course on Jupyter notebooks.

### <font style="font-family:roboto;color:#455e6c"> Jupyter Crash Course </font>  

1. Select cells by clicking on them.
2. Navigate through with `up` and `down` keys (or `k` and `j` for you vimmers).
3. Press Enter to edit a cell.
4. Press Shift-Enter to execute it.
5. Create new cells above or below the current one with `a` or `b`.
6. Copy, cut and paste them with `c`, `x` and `v`.
7. Press `m` to turn a cell into a markdown cell.
8. See the `Help` in the toolbar for more.

In [None]:
from pyiron_workflow import as_function_node, as_macro_node, as_dataclass_node, Workflow

### <font style="font-family:roboto;color:#455e6c"> Reading in the experimental results </font>  

This function reads in the csv file:

In [None]:
@as_function_node("csv")
def ReadCSV(filename: str, header: list = [0, 1], decimal: str = ",", delimiter: str = ";"):
    import pandas as pd
    return pd.read_csv(filename, delimiter=delimiter, header=header, decimal=decimal)
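
To illustrate what these default arguments do, here is a minimal sketch that parses a small in-memory table with plain `pandas`. The two header rows (column name and unit) and the numbers are made up for illustration. The layout of the real data file is an assumption here.

```python
import io

import pandas as pd

# Hypothetical miniature file: two header rows, ";" as separator, "," as decimal mark
sample = io.StringIO(
    "Load;Extensometer elongation\n"
    "kN;%\n"
    "0,50;0,010\n"
    "1,25;0,025\n"
)

df = pd.read_csv(sample, delimiter=";", header=[0, 1], decimal=",")

# header=[0, 1] produces two-level (MultiIndex) columns: (name, unit)
print(df.columns.tolist())

# Selecting the top level and flattening yields a plain 1D array of floats
load = df["Load"].values.flatten()
print(load)
```

With a two-level header, `df["Load"]` is itself a one-column table, which is why the later steps call `.values.flatten()` before using the numbers.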

Then a function to convert the load to stress:

In [None]:
@as_function_node
def CovertLoadToStress(df, area):
    """
    Convert the load (kN) acting on a cross-section `area` (mm^2) to stress (MPa)
    and return it together with the offset-corrected strain (%).
    """
    kN_to_MN = 0.001  # convert kilonewtons to meganewtons
    mm2_to_m2 = 1e-6  # convert square millimeters to square meters
    # MN / m^2 = MPa
    df["Stress"] = df["Load"] * kN_to_MN / (area * mm2_to_m2)
    # although the column is called "Extensometer elongation", the values are in percent!
    strain = df["Extensometer elongation"].values.flatten()
    # subtract the initial offset so that the strain starts at zero
    strain = strain - strain[0]
    stress = df["Stress"].values.flatten()
    return stress, strain
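
As a quick sanity check of the units, the arithmetic can be worked through for a single value. The load of 12 kN is a hypothetical number chosen for round results, and 120 mm² is the area passed to the workflow later in this notebook. A factor of 0.001 takes kilonewtons to meganewtons, and MN/m² is the same as MPa.

```python
import math

load_kN = 12.0    # hypothetical load
area_mm2 = 120.0  # cross-sectional area in mm^2

# Route 1: kN -> MN and mm^2 -> m^2, so the quotient is in MN/m^2 = MPa
stress_MPa = load_kN * 0.001 / (area_mm2 * 1e-6)

# Route 2: kN -> N, then N/mm^2, which is also MPa
stress_check_MPa = load_kN * 1000 / area_mm2

print(stress_MPa, stress_check_MPa)
assert math.isclose(stress_MPa, stress_check_MPa)
```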

### <font style="font-family:roboto;color:#455e6c"> Calculate Young's modulus </font>  

The stress and strain values, which are the outputs of the previous function, are used for a linear fit in this function. The slope of the fit is the Young's modulus. The calculated value of Young's modulus will depend on the `strain_cutoff` parameter.

In [None]:
@as_function_node
def CalculateYoungsModulus(stress, strain, strain_cutoff=0.2):
    import numpy as np
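    percent_to_fraction = 100  # strain is in percent: slope per percent -> slope per unit strain
    MPa_to_GPa = 1 / 1000  # convert MPa to GPa
    # index of the data point whose strain is closest to the cutoff
    arg = np.argmin(np.abs(np.array(strain) - strain_cutoff))
    # linear fit of the data below the cutoff; fit[0] is the slope in MPa per percent
    fit = np.polyfit(strain[:arg], stress[:arg], 1)
    youngs_modulus = fit[0] * percent_to_fraction * MPa_to_GPa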
    return youngs_modulus
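
To see how the cutoff influences the result, here is a minimal sketch on synthetic data. It uses a plain function without the node decorator, `fit_youngs_modulus`, which is a hypothetical helper that mirrors the fit above. The curve is idealised as elastic and perfectly plastic, with an assumed modulus of 210 GPa and yield at 0.2 % strain. None of these numbers come from the dataset.

```python
import numpy as np


def fit_youngs_modulus(stress, strain, strain_cutoff=0.2):
    """Plain-Python stand-in for the node above (hypothetical helper)."""
    # index of the data point closest to the cutoff
    idx = np.argmin(np.abs(np.asarray(strain) - strain_cutoff))
    # slope of a straight-line fit, in MPa per percent strain
    slope = np.polyfit(strain[:idx], stress[:idx], 1)[0]
    # MPa per percent -> MPa per unit strain -> GPa
    return slope * 100 / 1000


strain = np.linspace(0.0, 1.0, 101)          # strain in percent
stress = np.minimum(2100.0 * strain, 420.0)  # stress in MPa, flat after yield

e_elastic = fit_youngs_modulus(stress, strain, strain_cutoff=0.1)
e_mixed = fit_youngs_modulus(stress, strain, strain_cutoff=0.5)
print(f"cutoff 0.1 %: {e_elastic:.1f} GPa")
print(f"cutoff 0.5 %: {e_mixed:.1f} GPa")
```

A cutoff inside the elastic region recovers the assumed modulus, while a cutoff beyond yield mixes in the flat part of the curve and gives a clearly lower value.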

### <font style="font-family:roboto;color:#455e6c"> Plotting the results </font>  

This function plots the stress and strain.

In [None]:
@as_function_node
def Plot(stress, strain, format="-"):
    import matplotlib.pyplot as plt
    plt.plot(strain, stress, format)
    plt.xlabel("Strain [%]")
    plt.ylabel("Stress [MPa]")
    return 1

### <font style="font-family:roboto;color:#455e6c"> Creating a workflow </font>  

Now we can combine all the functions together to compose a workflow. Each function corresponds to a step in the workflow and their inputs and outputs are linked.

In [None]:
wf = Workflow("youngs_modulus")
wf.read_csv = ReadCSV('data/dataset_1.csv')
wf.stresses = CovertLoadToStress(wf.read_csv, 120)

wf.youngs_modulus = CalculateYoungsModulus(
    stress=wf.stresses.outputs.stress,
    strain=wf.stresses.outputs.strain,
)

wf.plot = Plot(
    stress=wf.stresses.outputs.stress,
    strain=wf.stresses.outputs.strain,
)
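
The idea of linking the output of one step to the input of the next can be sketched in a few lines of plain Python. This is a toy illustration only: the `Node` class is hypothetical and is not how `pyiron_workflow` is implemented.

```python
class Node:
    """Hypothetical stand-in for a workflow node: a function plus its inputs."""

    def __init__(self, func, *inputs):
        self.func = func
        self.inputs = inputs

    def run(self):
        # Evaluate upstream nodes first, then call this node's own function
        args = [i.run() if isinstance(i, Node) else i for i in self.inputs]
        return self.func(*args)


# Three linked steps: provide loads (kN), convert to stress (MPa), take the maximum
loads = Node(lambda: [0.0, 6.0, 12.0])
stresses = Node(lambda values, area: [v * 1000 / area for v in values], loads, 120)
peak = Node(max, stresses)

result = peak.run()
print(result)
```

Nothing is computed when the nodes are defined. The chain is evaluated only when `run()` is called on the last node.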

Now we execute the workflow:

In [None]:
wf.run()

We can also visualise the workflow. The visualisation shows the individual steps, their inputs and outputs, and how they are linked together.

In [None]:
wf.draw(size=(12, 15))

### <font style="font-family:roboto;color:#455e6c"> A graphical user interface for running workflows </font>  

We can use the tool [pyironflow](https://github.com/pyiron/pyironflow) to visually compose and execute the workflow.

In [None]:
from pyironflow import PyironFlow

In [None]:
pf = PyironFlow()
pf.gui

We can also view the saved workflow:

In [None]:
pf = PyironFlow([Workflow('tensile_example')])
pf.gui

<div class="admonition note" name="html-admonition" style="background: #FFEDD1; padding: 10px">
<p class="title"><b>Note</b></p>
As we have seen, the ranges of stress and strain have to be chosen carefully. In practice, this is done by calculating the <a href="https://materion.com/-/media/files/alloy/newsletters/technical-tidbits/issue-no-47---yield-strength-and-other-near-elastic-properties.pdf">R<sub>P0,2</sub> yield stress</a>.
</div>
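
The 0.2 % offset construction behind this value can be sketched on synthetic data as follows. The curve, the elastic slope and the hardening slope are hypothetical values, and the elastic slope is simply assumed to be known instead of being fitted.

```python
import numpy as np

# Synthetic curve (hypothetical values): strain in percent, stress in MPa
strain = np.linspace(0.0, 1.0, 1001)
slope = 2100.0     # elastic slope in MPa per percent strain (210 GPa)
hardening = 100.0  # post-yield slope in MPa per percent strain
stress = np.where(
    strain <= 0.2,
    slope * strain,
    420.0 + hardening * (strain - 0.2),
)

# Line parallel to the elastic part, shifted by 0.2 % strain
offset_line = slope * (strain - 0.2)

# First data point where the curve falls to or below the offset line
idx = np.argmax(stress <= offset_line)
rp02 = stress[idx]
print(f"Rp0.2 = {rp02:.0f} MPa at {strain[idx]:.2f} % strain")
```

For this synthetic curve the two lines intersect analytically at 0.41 % strain and 441 MPa. The sketch picks the nearest grid point without interpolating.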

### <font style="font-family:roboto;color:#455e6c"> Software used in this notebook </font>  

- [pyiron](https://pyiron.org/)
- [pyiron_workflow](https://github.com/pyiron/pyiron_workflow)