# Biodiversity time series analyses from European marine ecosystems

![images/biotisan_euromarec_flowchart.png](images/biotisan_euromarec_flowchart.png)

To run the workflow, do the following:

#### Prepare an access key on MinIO
This virtual lab uses MinIO to upload the data that will be analyzed and to store the results of the workflow. You can log in to MinIO at https://scruffy.lab.uvalight.net:9001 using the same credentials you use to log in to NaaVRE. To store and retrieve data from MinIO, you need an access key. Create an access key [in MinIO](https://scruffy.lab.uvalight.net:9001/access-keys). Make sure to store both the access key and the secret key, because you will need them when you run the workflow.

> **_Upcoming Feature:_**  An upcoming feature in NaaVRE will make the generation of the MinIO secret unnecessary.

#### Prepare the data
You can run the workflow with sample data or with your own data. The sample data is fully prepared and can be used by setting _param_use_dummy_data_ to "true" at a later step. In that case, you can skip ahead to the next section: [Run the workflow in NaaVRE](#Run-the-workflow-in-NaaVRE). If you want to use your own data, do the following in NaaVRE:

Go to the _File Browser_.

Open _Cloud storage_ -> _naa-vre-public_ -> _vl-biotisan-euromarec_.

Download the _Template....xlsx_ file:

![images/download_template.png](images/download_template.png)

Fill in the Excel file with your own data.

In the _File Browser_, go to _Cloud storage_ -> _naa-vre-user-data_. Upload the _Excel_ file you just filled in with your own data.

![images/upload_data.png](images/upload_data.png)

The file should now be visible in your user data directory:

![images/data_in_cloud_storage.png](images/data_in_cloud_storage.png)

#### Run the workflow in NaaVRE
To run the workflow in NaaVRE, open the workflow file: [../workflows/Biodiversity_time_series_analyser.naavrewf](./../workflows/Biodiversity_time_series_analyser.naavrewf).

![images/open_workflow.png](images/open_workflow.png)

Optionally, you can drag the workflow window next to this tutorial window to view both at the same time.

Press "Run".

![images/run_workflow.png](images/run_workflow.png)

Press "Use default parameter values".

![images/default_parameter_values.png](images/default_parameter_values.png)

Set _param_use_dummy_data_ to "true" if you want to use the dummy data. Set it to "false" if you've uploaded your own data in **Prepare the data**.

Scroll to the bottom and fill in the e-mail address you've used to log into NaaVRE in the field _param_user_email_. If you're unsure which e-mail address you've used, you can check the URL in your browser: *https://beta.naavre.net/jupyter/user/[your_e_mail_address]/lab/workspaces/*
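If you prefer not to read the address off the URL by eye, the lookup can be sketched in Python. This is a minimal illustration, assuming the URL follows the pattern shown above, with the e-mail address as the path segment right after `user`. The function name `email_from_naavre_url` and the example address are hypothetical and not part of NaaVRE.

```python
from urllib.parse import unquote, urlparse


def email_from_naavre_url(url: str) -> str:
    """Return the path segment that follows 'user' in a NaaVRE Jupyter URL."""
    # Split the URL path into its non-empty segments.
    segments = [segment for segment in urlparse(url).path.split("/") if segment]
    try:
        raw = segments[segments.index("user") + 1]
    except (ValueError, IndexError):
        raise ValueError(f"No user segment found in URL: {url!r}") from None
    # Browsers may percent-encode characters such as '@' (as %40).
    return unquote(raw)


# Hypothetical address, used only for illustration.
example_url = "https://beta.naavre.net/jupyter/user/jane.doe%40example.org/lab/workspaces/"
print(email_from_naavre_url(example_url))  # prints: jane.doe@example.org
```

The returned value is what you would enter in _param_user_email_.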

Fill in the MinIO access key and secret key you've just generated in **Prepare an access key on MinIO**.

Press "Run".

![images/press_run.png](images/press_run.png)

Check the notifications at the bottom right to confirm whether the workflow is running:

![images/running_workflow.png](images/running_workflow.png)

Wait for the workflow to complete.

In case the workflow succeeds, you can proceed to **Inspect the outcome**:

![images/successful_run.png](images/successful_run.png)

In case the workflow fails, you can explore why in the next section, **Inspecting workflow errors**:

![images/failed_run.png](images/failed_run.png)

#### Inspecting workflow errors
In case the workflow run fails, go to the File Browser (Folder icon in the vertical menu on the left) and click on the Folder icon next to it to go to your home folder:

![images/home_folder.png](images/home_folder.png)

Navigate to _Cloud Storage -> naa-vre-user-data_. In case the workflow failed on validations, you will see a file *"[timestamp]_validation_log.txt"*:

![images/validation_log.png](images/validation_log.png)

In case this file appears, open it to check which validation errors occurred.
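When the directory contains many files, the validation logs can also be picked out programmatically. The following is a minimal sketch that only relies on the file name ending in `_validation_log.txt`, as described above. The function name `find_validation_logs`, the example listing, and the timestamp format in it are assumptions made for illustration.

```python
def find_validation_logs(file_names):
    """Return the names that end in '_validation_log.txt', keeping input order."""
    suffix = "_validation_log.txt"
    return [name for name in file_names if name.endswith(suffix)]


# Hypothetical directory listing; the real timestamp format may differ.
listing = [
    "20240101-120000_validation_log.txt",
    "my_data.xlsx",
    "notes.txt",
]
print(find_validation_logs(listing))  # prints: ['20240101-120000_validation_log.txt']
```

An empty result suggests that the failure was not caused by a validation error.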

In case no validation errors occurred but the workflow still failed, press "_Show in workflow engine_" to explore the errors:

![images/failed_run.png](images/failed_run.png)

The first time you might encounter an error "_Failed to load version/info Error_", which you can ignore. If you see a login prompt, use the leftmost login button:

![images/login_to_argo.png](images/login_to_argo.png)

Argo might then ask you what you are using Argo for. You can simply close this prompt. You should now see your workflow run:

![images/workflow_in_argo.png](images/workflow_in_argo.png)

Click on the failed node:

![images/click_failed_node.png](images/click_failed_node.png)

A pop-up should appear on the screen. Click on "LOGS" to inspect the output of the failed workflow component:

![images/click_logs.png](images/click_logs.png)

#### Inspect the outcome
In case the workflow run succeeded, go to the File Browser (Folder icon in the vertical menu on the left) and click on the Folder icon next to it to go to your home folder:

![images/home_folder.png](images/home_folder.png)

Navigate to _Cloud Storage -> naa-vre-user-data_. You should now see an output file from your workflow: *"[Timestamp]__[Data_filename]__final_results_all.csv"*

![images/view_results.png](images/view_results.png)
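To match a results file back to the run and input file that produced it, the file name can be split into its parts. This is a minimal sketch, assuming the name follows the pattern *"[Timestamp]__[Data_filename]__final_results_all.csv"* described above and that the timestamp itself contains no double underscore. The function name `parse_results_name` and the example file name are hypothetical.

```python
from typing import Tuple

RESULTS_SUFFIX = "__final_results_all.csv"


def parse_results_name(file_name: str) -> Tuple[str, str]:
    """Split a results file name into (timestamp, data_filename)."""
    if not file_name.endswith(RESULTS_SUFFIX):
        raise ValueError(f"Not a results file: {file_name!r}")
    stem = file_name[: -len(RESULTS_SUFFIX)]
    # Split at the first double underscore: timestamp on the left, data file name on the right.
    timestamp, separator, data_filename = stem.partition("__")
    if not separator or not timestamp or not data_filename:
        raise ValueError(f"Unexpected results file name: {file_name!r}")
    return timestamp, data_filename


# Hypothetical file name; the real timestamp format may differ.
print(parse_results_name("20240101-120000__my_data__final_results_all.csv"))
# prints: ('20240101-120000', 'my_data')
```

The timestamp is treated as an opaque string here, because its exact format is not specified in this tutorial.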

Additionally, if you've kept the parameter *"param_make_plot"* on *"true"*, you will see three plots in the directory in *.png* format similar to this:

![images/frequency_distribution.png](images/frequency_distribution.png)

If you set the parameter *"param_output_samples_ecological_parameters"* to *"true"*, you will also see an additional output file in the same directory.

> **_What if:_**  If you don't see the expected output file, double-check that the _param_user_email_ matches the user displayed in your browser URL and that the workflow has completed successfully. If both are correct but you still don't see an output file, please get in touch.
  
#### Adapt the workflow 
You can adapt the workflow in NaaVRE to suit your own research objectives. To do this, copy the content of _Virtual Labs -> Biodiversity Time Series Analyses -> Git public_ to  _Virtual Labs -> Biodiversity Time Series Analyses -> My data_, or fork and clone the [git repository](https://github.com/QCDIS/Biodiversity_time_series_analyses_from_European_marine_ecosystems) to _My data_. 

To adapt the workflow, change the source code available in this virtual lab: [codebase/Data_cleaning_analysis....ipynb](../codebase/Data_cleaning_analysis_Example-Carlos-Cano-Barbacil.ipynb). After changing the source code, you can recontainerize the Jupyter Notebook cell and update the adapted workflow node in [workflows/biodiversity_time_series_analyses.naavrewf](../workflows/Biodiversity_time_series_analyses.naavrewf). For documentation on how to make these changes, go to https://naavre.net/docs/tutorials/#from-notebook-to-workflow.