# Execute VIP Jobs from your Computer

This notebook uses the `VipSession` class from the [VIP Python client](https://github.com/virtual-imaging-platform/VIP-python-client/), which lets you run VIP applications on local datasets. In this tutorial you will learn how to:
- **Create** a VipSession instance;
- **Perform** an *upload-run-download* procedure (see image below) to get your outputs from VIP;
- **Parallelize** your executions on VIP;
- **Remove** your temporary data from VIP;
- **Resume & repeat** previous VIP executions.

For more advanced use of the Python client (*e.g.*, running different pipelines on the same dataset), please [read the doc on GitHub](https://github.com/virtual-imaging-platform/VIP-python-client/#vipsession).

<img src="imgs/Upload_Run_Download.png" alt="Procedure" height="250" title="Procedure for the Python Client"/>


*__N.B.__: In this Notebook, commands starting with "`!`" will only run on Linux distributions (including the Binder instance).*

In [None]:
# Python libraries used in this tutorial
from pathlib import Path
from vip_client import VipSession

## Initiate your Session 
*The VIP API key can be generated in your account settings. This short procedure is described in the [documentation](https://github.com/virtual-imaging-platform/VIP-python-client/#get-a-vip-api-key).* 

In [None]:
# Handshake with VIP
VipSession.init(api_key="VIP_API_KEY"); # Paste your VIP API key here

*VIP jobs will be launched through a persistent `VipSession` instance. Providing a session name is a good practice.*

In [None]:
# Instantiate a VipSession object with session name: "Demo-VipSession"
my_session = VipSession("Demo-VipSession")

## The *upload-run-download* procedure

### Upload your dataset on VIP

*The local folder `./data` contains the entire dataset.*

In [None]:
input_dir = Path("data")
! tree {input_dir}
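
On systems where the `tree` command is not available, a rough equivalent can be sketched with `pathlib` alone. The helper name `list_tree` is hypothetical and is not part of the VIP client; it only lists every file and folder under a directory, indented by depth.

```python
from pathlib import Path

def list_tree(root):
    """Return one indented line per entry under `root` (hypothetical helper)."""
    root = Path(root)
    lines = [root.name + "/"]
    # Sorting the paths keeps each folder right before its own content
    for path in sorted(root.rglob("*")):
        depth = len(path.relative_to(root).parts)
        suffix = "/" if path.is_dir() else ""
        lines.append("    " * depth + path.name + suffix)
    return lines

# Usage, assuming the `data` folder of this tutorial exists:
# print("\n".join(list_tree("data")))
```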

The `VipSession` method `upload_inputs()` uploads the full dataset to VIP.

In [None]:
my_session.upload_inputs(input_dir)

### Launch the Application

#### Get the Pipeline Identifier

Search for available applications on VIP 

In [None]:
VipSession.show_pipeline("cquest")

*If the previous cell displays "`No pipeline found`", please subscribe to the MR Spectroscopy group in your __account settings__ on the [VIP portal](https://vip.creatis.insa-lyon.fr/).*

In [None]:
# Pipeline identifier = App/Version (/!\ mind the case)
pipeline_id = "CQUEST/0.6"

#### Get your Input Settings

Show the pipeline description to know which inputs are required by the application

In [None]:
VipSession.show_pipeline(pipeline_id)

Provide the input settings as a dictionary

In [None]:
# Input values can take several formats:
input_settings = {
    "zipped_folder" : "data/basis.zip", # String
    "parameter_file" : input_dir / "parameters.txt", # pathlib.Path object
    "data_file" : list((input_dir / "signals").iterdir()) # List of Path objects
}

*A **list** of values (`[...]`) submits __parallel__ jobs on VIP. In this example, all signals (`data_file`) will be processed in parallel with the same `parameter_file` and `zipped_folder`.*
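
To make the idea of parallel jobs concrete, the sketch below expands a settings dictionary into one set of inputs per list element. This is only an illustration of the concept, not how VIP or the Python client work internally; the helper `expand_parallel_jobs` and the signal file names are hypothetical, and the sketch assumes exactly one list-valued input, as in this tutorial.

```python
def expand_parallel_jobs(settings):
    """Illustration only: one job per element of the single list-valued input."""
    list_keys = [key for key, value in settings.items() if isinstance(value, list)]
    if len(list_keys) != 1:
        raise ValueError("this sketch handles exactly one list-valued input")
    key = list_keys[0]
    # Every job shares the scalar inputs and gets one element of the list
    return [{**settings, key: item} for item in settings[key]]

example_settings = {
    "zipped_folder": "data/basis.zip",
    "parameter_file": "data/parameters.txt",
    "data_file": ["data/signals/a.mrui", "data/signals/b.mrui"],  # hypothetical names
}
jobs = expand_parallel_jobs(example_settings)
for job in jobs:
    print(job["data_file"], "with", job["parameter_file"])
```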

In [None]:
# Display the previous settings as strings to see the list of files in `data_file`
import json
print("input_settings =",
    json.dumps(indent=3, obj={
        key: [str(v) for v in value] if isinstance(value, list) else str(value) for key, value in input_settings.items()
        }
    )
)

### Launch & monitor executions on VIP

VIP executions are launched with `launch_pipeline()`. Parallel *jobs* (*e.g.* processing the signals) are launched at once and grouped in a single *workflow*.

In [None]:
my_session.launch_pipeline(pipeline_id, input_settings);

*You can monitor the workflow progression on https://vip.creatis.insa-lyon.fr/* ...

... or wait in this notebook until all jobs are over with `monitor_workflows()`:

In [None]:
my_session.monitor_workflows(refresh_time=10);

### Download your Results

When all jobs are over, the outputs are downloaded at once with `download_outputs()`. Output files that have already been downloaded will be ignored.

In [None]:
my_session.download_outputs();

*If `output_dir` was not specified when instantiating `my_session`, the outputs are stored at the default location (recommended for beginners).*

In [None]:
! tree {my_session.output_dir}

### Comments
1. This 4-step procedure (`upload_inputs()` -> `launch_pipeline()` -> `monitor_workflows()` -> `download_outputs()`) can be performed with a single command using `run_session()`:
```python 
    my_session = VipSession(
        session_name = "Demo-VipSession",
        input_dir = input_dir,
        pipeline_id = pipeline_id,
        input_settings = input_settings
    ).run_session(refresh_time=10);
```
2. Setting all properties (*e.g.*, `input_dir`, `pipeline_id`, `input_settings`) at instantiation (like above) allows early detection of common mistakes (*e.g.*, missing parameters or input files) before running VIP executions.
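
The kind of early check mentioned in the second comment can be approximated locally. The sketch below is not the client's own validation; `find_missing_files` is a hypothetical helper that assumes every value in the settings is a local file path (or a list of paths), which holds for the `input_settings` of this tutorial.

```python
from pathlib import Path

def find_missing_files(settings):
    """Return the input paths that do not exist locally (hypothetical helper)."""
    missing = []
    for value in settings.values():
        # Treat single values and lists the same way
        candidates = value if isinstance(value, list) else [value]
        for candidate in candidates:
            if not Path(candidate).exists():
                missing.append(str(candidate))
    return missing

# Usage, assuming `input_settings` is defined as above:
# missing = find_missing_files(input_settings)
# if missing:
#     print("Missing input files:", missing)
```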

## Remove temporary data from VIP

*After the download, **your input and output data are still on VIP** (https://vip.creatis.insa-lyon.fr/)*

In [None]:
print("Inputs are in:", my_session.vip_input_dir, "\n", 
      "Outputs are in:", my_session.vip_output_dir)

Please remove the temporary data from VIP with `finish()`

In [None]:
my_session.finish();

## Use the session backups

### Check the backup file

*After each step, the session data are automatically backed up in the session's output directory.* 

In [None]:
! tree {Path(my_session.output_dir).parent}

*The JSON file `session_data.json` contains everything you need to repeat the same executions on VIP*

In [None]:
! head {Path(my_session.output_dir) / "session_data.json"} 
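
Where the `head` command is not available, the backup can be inspected with the standard `json` module instead. This is a minimal sketch: `preview_json` is a hypothetical helper, and it assumes nothing about the keys stored in `session_data.json`.

```python
import json
from pathlib import Path

def preview_json(path, max_chars=500):
    """Return the first characters of a pretty-printed JSON file (hypothetical helper)."""
    data = json.loads(Path(path).read_text())
    return json.dumps(data, indent=2)[:max_chars]

# Usage, assuming `my_session` is defined as above:
# print(preview_json(Path(my_session.output_dir) / "session_data.json"))
```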

### Repeat previous executions

You can restore a previous VipSession instance using its session name

In [None]:
new_session = VipSession("Demo-VipSession") # Name of the previous session

Use the `run_session()` shortcut to launch the full *Upload-Run-Download* procedure from this new VipSession instance

In [None]:
new_session.run_session(refresh_time=10);

Check the output files

In [None]:
! tree {new_session.output_dir}

## Reuse the same dataset multiple times
You can reuse the same dataset in several sessions. To do so, keep track of where the inputs were uploaded on VIP.  
Do not run `finish()` at the end of the session that uploaded the dataset, otherwise the dataset will be deleted from VIP.  
When you reuse the data, adapt the paths in `inputs_settings` so that they point to the VIP folder where the files were initially uploaded.

In [None]:
session = VipSession("session-A")
session.upload_inputs(input_dir)

inputs_settings = {
    "file": "initial.file",
    "value": 5
}

input_folder_on_vip = session.vip_input_dir
print(f"The dataset is located on VIP here: {input_folder_on_vip}") # keep this information somewhere

session.launch_pipeline(pipeline_id, inputs_settings)
session.monitor_workflows()
# no finish for the first session

In [None]:
session = VipSession("session-B")
# do not forget to prepend the VIP folder to your input paths!
adapted_inputs_settings = {
    "file" : f"{input_folder_on_vip}/initial.file", # reuse the stored information !
    "value": 5
}
session.launch_pipeline(pipeline_id, adapted_inputs_settings)
session.monitor_workflows()
session.finish()
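
If the second session runs in another notebook or on another day, the VIP folder path has to be stored outside of Python's memory. The sketch below keeps it in a small local JSON file. All helper names and the record file are hypothetical, and the sketch assumes VIP paths use forward slashes, as in the f-string above.

```python
import json
from pathlib import Path, PurePosixPath

def save_vip_location(vip_input_dir, record_file):
    """Store the VIP folder path in a small local JSON file."""
    Path(record_file).write_text(json.dumps({"vip_input_dir": str(vip_input_dir)}))

def load_vip_location(record_file):
    """Read back the VIP folder path saved earlier."""
    return json.loads(Path(record_file).read_text())["vip_input_dir"]

def on_vip(vip_input_dir, relative_name):
    """Prepend the VIP folder to a file name relative to the dataset root."""
    return str(PurePosixPath(vip_input_dir) / relative_name)

# Usage, assuming `input_folder_on_vip` comes from the first session:
# save_vip_location(input_folder_on_vip, "vip_dataset_location.json")
# ... later, in the second session ...
# folder = load_vip_location("vip_dataset_location.json")
# adapted_inputs_settings = {"file": on_vip(folder, "initial.file"), "value": 5}
```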

> [!NOTE]
> Once you no longer need the dataset, run `VipSession("session-A").finish()` to remove the data from the VIP servers.
> The session name must be the one you used when uploading the dataset.

In [None]:
upload_session = VipSession("upload-session")
upload_session.upload_inputs(input_dir)
# * running the session * #

reuse_session_a = VipSession("reuse-session_a")
# * rerunning the session on the previous dataset * #

reuse_session_b = VipSession("reuse-session_b")
# * rerunning the session on the previous dataset * #

# finally deleting the dataset
VipSession("upload-session").finish()

## End this tutorial

The output data downloaded on your computer is yours to remove

In [None]:
! rm -r vip_outputs