Merge pull request #28 from BAMresearch/feature/simpleusecase-aiida
pdiercks committed May 4, 2022
2 parents a69befd + 6ddb2e9 commit 98f1114
Showing 5 changed files with 336 additions and 4 deletions.
101 changes: 97 additions & 4 deletions .github/workflows/main.yml
@@ -33,7 +33,6 @@ jobs:
conda install --channel conda-forge doit=0.33.1
cd $GITHUB_WORKSPACE/simple_use_case/pydoit
doit
- name: upload-paper-artifact
uses: actions/upload-artifact@v2
with:
@@ -65,7 +64,6 @@ jobs:
conda install --channel conda-forge cwltool
cd $GITHUB_WORKSPACE/simple_use_case/cwl
cwltool wf_run_use_case.cwl
- name: upload-paper-artifact
uses: actions/upload-artifact@v2
with:
@@ -97,7 +95,6 @@ jobs:
conda install --channel bioconda nextflow=21.04.0
cd $GITHUB_WORKSPACE/simple_use_case/nextflow
nextflow run simplecase.nf
- name: upload-paper-artifact
uses: actions/upload-artifact@v2
with:
@@ -130,7 +127,6 @@ jobs:
conda install snakemake
cd $GITHUB_WORKSPACE/simple_use_case/snakemake
snakemake --cores 1 --use-conda --conda-frontend conda ./paper.pdf
- name: upload-paper-artifact
uses: actions/upload-artifact@v2
with:
@@ -170,3 +166,100 @@ jobs:
path: ./simple_use_case/kadistudio/paper.pdf
retention-days: 1
if-no-files-found: error

run-aiida:
runs-on: ubuntu-latest

services:
postgres:
image: postgres:10
env:
POSTGRES_DB: test_aiida
POSTGRES_PASSWORD: ''
POSTGRES_HOST_AUTH_METHOD: trust
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
ports:
- 5432:5432
rabbitmq:
image: rabbitmq:3.8.14
ports:
- 5672:5672
slurm:
image: xenonmiddleware/slurm:17
ports:
- 5001:22

steps:
- name: checkout-repository
uses: actions/checkout@v2

- name: install-basic-deps
uses: ./.github/actions/install-basic-deps

- name: install-system-dependencies
run: |
sudo apt update
sudo apt install postgresql graphviz
- name: setup-conda-environment
uses: conda-incubator/setup-miniconda@v2
with:
environment-file: simple_use_case/source/envs/default_env.yaml
miniforge-version: latest
activate-environment: simple_use_case

- name: install-pypdf2
# It is crucial to update `setuptools` or the installation of `pymatgen` can break
shell: bash -l {0}
run:
conda install --channel conda-forge pypdf2 setuptools pip --yes

- name: install-aiida-core
working-directory: simple_use_case/aiida
# the shell directive is necessary to properly activate the shell
# see https://github.com/marketplace/actions/setup-miniconda
shell: bash -l {0}
run: |
git clone https://github.com/aiidateam/aiida-core.git
cd aiida-core
git checkout v2.0.0b1
pip install -e .
- name: install-aiida-shell
working-directory: simple_use_case/aiida
shell: bash -l {0}
run: |
git clone https://github.com/sphuber/aiida-shell.git
cd aiida-shell
git checkout v0.1.0
pip install -e .
- name: adjust-github-workspace
working-directory: simple_use_case/aiida
shell: bash -l {0}
run:
python3 ./set_workspace.py --input ./aiida-core/.github/workflows/setup.sh --output ./aiida-core/.github/workflows/mysetup.sh

- name: setup-aiida-environment
shell: bash -l {0}
run: |
chmod +x $GITHUB_WORKSPACE/simple_use_case/aiida/aiida-core/.github/workflows/mysetup.sh
$GITHUB_WORKSPACE/simple_use_case/aiida/aiida-core/.github/workflows/mysetup.sh
- name: run-workflow
working-directory: simple_use_case/aiida
shell: bash -l {0}
run:
./suc_v2b1.py

- name: upload-paper-artifact
uses: actions/upload-artifact@v2
with:
name: paper
path: ./simple_use_case/aiida/paper.pdf
retention-days: 1
if-no-files-found: error
5 changes: 5 additions & 0 deletions simple_use_case/aiida/.gitignore
@@ -0,0 +1,5 @@
outfile*
*.pvd
*.vtu
*.msh
*.pdf
88 changes: 88 additions & 0 deletions simple_use_case/aiida/README.md
@@ -0,0 +1,88 @@
# AiiDA
This directory contains an implementation of the simple use case with [AiiDA](https://www.aiida.net/).

## Implementation
Since the implementation of workflows in AiiDA is quite different from that of the other file-based
workflow managers (e.g. Snakemake or Nextflow), we briefly comment on the available options
and design choices in AiiDA.

### Calculation functions
According to the [documentation](https://aiida.readthedocs.io/projects/aiida-core/en/latest/topics/calculations/concepts.html#calculation-functions):
> The calcfunction in AiiDA is a function decorator that transforms a regular python function in a calculation process, which automatically stores the provenance of its output in the provenance graph when executed.

Typically, `calcfunction`s are used for short-running processes executed on the local machine, such as pre- and postprocessing steps.
One could think of a workaround, using the `subprocess` module inside a `calcfunction`, to run the processes of the simple use case.
However, `calcfunction`s are not intended for running external codes, and the use of `subprocess` is discouraged, since in this case the provenance cannot be properly captured by AiiDA.
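The call pattern of a `calcfunction` can be sketched with a plain-Python stand-in (the real decorator is `aiida.engine.calcfunction`, operates on AiiDA data nodes such as `orm.Int`, and requires a configured profile; none of that is reproduced here):

```python
def calcfunction(func):
    """Plain-Python stand-in for aiida.engine.calcfunction.

    The real decorator turns the function into a calculation process and
    records every call, with its inputs and outputs, in the provenance graph.
    """
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@calcfunction
def add(x, y):
    # with AiiDA, x and y would be orm.Int nodes and the result a new node
    return x + y
```

With the real decorator, calling `add(Int(1), Int(2))` would return an `Int` node whose provenance links back to both inputs.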

### Calculation jobs
> ... not all computations are well suited to be implemented as a python function, but rather are implemented as a separate code, external to AiiDA. To interface an external code with the engine of AiiDA, the CalcJob process class was introduced

The `CalcJob` is designed to run a `Code` on *any* computer through AiiDA.
While this is very powerful, apart from installing the code on the other computer, it is necessary to set up the `Code` with AiiDA and [write a plugin](https://aiida.readthedocs.io/projects/aiida-core/en/latest/howto/plugin_codes.html) which instructs AiiDA how to run the external `Code`.
For long-running processes (computationally expensive tasks) this is worthwhile, but for simple
shell commands the effort is too high.

### AiiDA shell plugin
The [AiiDA shell plugin](https://github.com/sphuber/aiida-shell) was developed to make it easier to run simple shell commands with AiiDA.
This way, any command-line tool (external code) installed on the *computer* can be run without the need to write a plugin.
Moreover, the `ShellJob` inherits from `CalcJob`, and thus it is possible to run commands on remote computers.
Instructions on how to set up a remote computer can be found in this [how-to guide](https://aiida.readthedocs.io/projects/aiida-core/en/latest/howto/run_codes.html#how-to-set-up-a-computer).
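
In `suc_v2b1.py` below, outputs of one `ShellJob` are passed to the next via keys such as `mesh_msh` or `plotoverline_csv`. Judging from the key names used in the script, `aiida-shell` appears to derive these labels from the output filenames by replacing non-alphanumeric characters with underscores; a sketch of that (assumed) mapping:

```python
import re


def output_key(filename: str) -> str:
    # assumed aiida-shell labelling rule: every character that is not
    # alphanumeric is replaced by an underscore, e.g. "mesh.msh" -> "mesh_msh"
    return re.sub(r"[^a-zA-Z0-9]", "_", filename)


print(output_key("mesh.msh"))  # mesh_msh
```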

## Installation
Please follow the instructions in the [documentation](https://aiida.readthedocs.io/projects/aiida-core/en/latest/)
to make yourself familiar with the installation process.
It is recommended to use the system-wide installation method, where you first install the prerequisite
services using a package manager (e.g. on Ubuntu)
```sh
sudo apt install \
git python3-dev python3-pip \
postgresql postgresql-server-dev-all postgresql-client rabbitmq-server
```
Next, we prepare a conda environment with all the software required to run the simple use case.
```sh
conda env create --name aiida_simplecase --file ../source/envs/default_env.yaml
conda activate aiida_simplecase
```
Make sure that the Python version is greater than 3.8, since this is required by the `aiida-shell` plugin.
Moreover, `aiida-core` version `2.0.0b1` is required, which was released on March 16th, 2022.
Therefore, we install `aiida-core` and the `aiida-shell` plugin from source.
Make sure that your conda environment is activated as above and run the following commands.
```sh
git clone git@github.com:aiidateam/aiida-core.git
cd aiida-core
git checkout v2.0.0b1
pip install -e .
```
```sh
git clone git@github.com:sphuber/aiida-shell.git
cd aiida-shell
git checkout v0.1.0
pip install -e .
```
Finally, run
```sh
verdi quicksetup
```
to set up a profile, and check that everything was installed correctly by running
```sh
verdi status
```

## Running the simple use case
If you are using `conda`, activate your environment.
```sh
conda activate aiida_simplecase
```
Make the workflow script executable (`chmod +x ./suc_v2b1.py`) and run it with
```sh
./suc_v2b1.py
```
By default, all `ShellJob`s are run on the `localhost`.
Some useful commands to inspect the status of the executed processes and the results stored in the database are listed below.
```sh
verdi process list -a # lists all processes
verdi process show <PK> # show info about process
verdi process report <PK> # log messages if something went wrong
verdi node show <PK> # show info about node
verdi node graph generate <PK> # generate provenance graph for node
```
17 changes: 17 additions & 0 deletions simple_use_case/aiida/set_workspace.py
@@ -0,0 +1,17 @@
import argparse


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="replace GITHUB_WORKSPACE")
parser.add_argument("--input", required=True, help="input file")
parser.add_argument("--output", required=True, help="output file")
args = parser.parse_args()

with open(args.output, "w") as outstream:
with open(args.input, "r") as instream:
content = instream.read()
new = content.replace(
"${GITHUB_WORKSPACE}",
"${GITHUB_WORKSPACE}/simple_use_case/aiida/aiida-core",
)
outstream.write(new)
129 changes: 129 additions & 0 deletions simple_use_case/aiida/suc_v2b1.py
@@ -0,0 +1,129 @@
#!/usr/bin/env runaiida

from aiida_shell import launch_shell_job
import PyPDF2

# ### generate mesh with gmsh
gmsh_results, gmsh_node = launch_shell_job(
"gmsh",
arguments=[
"-2",
"-setnumber",
"domain_size",
"2.0",
"{geometry}",
"-o",
"mesh.msh",
],
files={"geometry": "../source/unit_square.geo"},
outputs=["mesh.msh"],
)

# ### convert mesh from msh to xdmf format
meshio_results, meshio_node = launch_shell_job(
"meshio",
arguments=["convert", "{mesh}", "mesh.xdmf"],
files={"mesh": gmsh_results["mesh_msh"]},
filenames={"mesh": "mesh.msh"},
outputs=["*.xdmf", "*.h5"],
)

# ### solution of the poisson problem with fenics
fenics_results, fenics_node = launch_shell_job(
"python",
arguments=[
"{script}",
"--mesh",
"{mesh}",
"--degree",
"2",
"--outputfile",
"poisson.pvd",
],
files={
"script": "../source/poisson.py",
"mesh": meshio_results["mesh_xdmf"],
"mesh_h5": meshio_results["mesh_h5"],
},
filenames={"mesh": "mesh.xdmf", "mesh_h5": "mesh.h5"},
outputs=["poisson.pvd", "poisson000000.vtu"],
)

# ### postprocessing of the fenics job
paraview_results, paraview_node = launch_shell_job(
"pvbatch",
arguments=["{script}", "{pvdfile}", "plotoverline.csv"],
files={
"script": "../source/postprocessing.py",
"pvdfile": fenics_results["poisson_pvd"],
"vtufile": fenics_results["poisson000000_vtu"],
},
filenames={"pvdfile": "poisson.pvd", "vtufile": "poisson000000.vtu"},
outputs=["plotoverline.csv"],
)


def read_domain_size():
    """Parse the domain size from the stdout of the gmsh job."""
    stdout = gmsh_results["stdout"].get_content()
    s = stdout.split("Used domain size:")[1]
    size = float(s.split("Used mesh size")[0])
    return str(size)


def read_num_dofs():
    """Parse the number of degrees of freedom from the stdout of the fenics job."""
    stdout = fenics_results["stdout"].get_content()
    ndofs = stdout.split("Number of dofs used:")[1]
    return "".join(ndofs.split())


# ### prepare latex macros
macros, macros_node = launch_shell_job(
"python",
arguments=[
"{script}",
"--macro-template-file",
"{template}",
"--plot-data-path",
"{csvfile}",
"--domain-size",
read_domain_size(),
"--num-dofs",
read_num_dofs(),
"--output-macro-file",
"macros.tex",
],
files={
"script": "../source/prepare_paper_macros.py",
"template": "../source/macros.tex.template",
"csvfile": paraview_results["plotoverline_csv"],
},
filenames={"csvfile": "plotoverline.csv"},
outputs=["macros.tex"],
)

# ### compile paper
paper, paper_node = launch_shell_job(
"tectonic",
arguments=["{texfile}"],
files={
"texfile": "../source/paper.tex",
"macros": macros["macros_tex"],
"csvfile": paraview_results["plotoverline_csv"],
},
filenames={
"texfile": "paper.tex",
"macros": "macros.tex",
"csvfile": "plotoverline.csv",
},
outputs=["paper.pdf"],
)

# ### extract final PDF from database
pdf_writer = PyPDF2.PdfFileWriter()

with paper["paper_pdf"].open(mode="rb") as handle:
    reader = PyPDF2.PdfFileReader(handle)
    pdf_writer.appendPagesFromReader(reader)
    # write while the source stream is still open, since the copied
    # page objects may still reference it
    with open("./paper.pdf", "wb") as outstream:
        pdf_writer.write(outstream)
