## Setup for RENKU

*Skip this step if not using RENKU*

The Dockerfile compiles the packrat-managed R libraries from source and stores them in the home directory while the Docker image is built.
This removes the need to compile the R libraries each time the project is opened, or to push the "heavy" compiled library files to the git repository.
In addition, the library layer can be shared among Docker images, which can reduce the build time of new images.

To benefit from this setup, enable the use of the packrat R libraries on the Docker image using the [``renku-r-tools`` package](https://pypi.org/project/renku-r-tools).

The ``renku-r ln-packrat-lib`` command replaces the packrat R libraries of the R project with links to the compiled libraries in the home directory.

In [None]:
!renku-r ln-packrat-lib -p . -s /home/rstudio/packrat

Ensure that data can be pushed to the git repository by editing the ``.gitignore`` file.
For security, files in ``data/`` and ``figs/`` are not tracked by default.

CAUTION: Errors in git setup can lead to breaches in privacy.
 - Understand the project's privacy requirements
 - Know who has access to the git repository
 - If you witness a breach, immediately inform the responsible persons and fix the breach (make sure to also delete all sensitive information from previous versions and log files)

In [None]:
!sed -i "/data\/\*/d;/figs\/\*/d;/\*\.nb\.html/d" .gitignore
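The ``sed`` expression above deletes every line of ``.gitignore`` that contains ``data/*``, ``figs/*`` or ``*.nb.html``. The following is a minimal Python sketch of the same filtering, which can be used to preview the lines that would be removed before editing the file. The function name ``drop_ignore_rules`` and the sample content are assumptions made for illustration.

```python
def drop_ignore_rules(lines, patterns=("data/*", "figs/*", "*.nb.html")):
    """Split lines into (kept, dropped).

    A line is dropped when it contains one of the patterns literally,
    which mirrors the substring matching done by the sed expression.
    """
    kept, dropped = [], []
    for line in lines:
        if any(p in line for p in patterns):
            dropped.append(line)
        else:
            kept.append(line)
    return kept, dropped


# Hypothetical .gitignore content, for illustration only
sample = [".Rhistory", "data/*", "figs/*", "*.nb.html", "packrat/lib*/"]
kept, dropped = drop_ignore_rules(sample)
print("kept:", kept)
print("dropped:", dropped)
```

Reviewing the dropped lines first is one way to check that only the intended rules are removed, in line with the privacy caution above.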

## Import data from SLIMS

*Skip this step if not using SLIMS*

Import raw survival counts from SLIMS using the [``slims-lisp`` package](https://pypi.org/project/slims-lisp).

Use the ``slims-lisp fetch`` command to download the data collection .xlsx file from a SLIMS ELN attachment step.
The command also creates a file containing metadata about the origin of the file on SLIMS.
The ``subprocess`` module runs the command, and the ``getpass`` module prompts for the password without echoing it.

In [None]:
# Set ``slims-lisp fetch`` options
url = input("SLIMS url (ex: https://<address>/rest/rest): ")
proj = input("Project name: ")
exp = input("Experiment name: ")
step = input("Attachment step name: ")
attm = input("Attachment name: ")
output_fp = input("Output path: ")
user = input("SLIMS username: ")

In [None]:
# Download attachment with ``slims-lisp fetch``
import subprocess
import getpass
slims_fetch = subprocess.Popen(['slims-lisp', 'fetch',
                                '--url', url,
                                '--proj', proj,
                                '--exp', exp,
                                '--step', step,
                                '--attm', attm,
                                '--output', output_fp,
                                '-v',
                                '-u', user,
                                '-p', getpass.getpass("SLIMS password: ")],
                               stdout = subprocess.PIPE)
stdout = slims_fetch.stdout.read().decode()
print(stdout)
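The cell above reads only the standard output of the command, so a failed download can go unnoticed. The following is a hedged sketch of a small wrapper, here called ``run_and_report`` (a hypothetical helper, not part of ``slims-lisp``), that also captures the standard error and the exit code using ``subprocess.run``. The demonstration calls the Python interpreter rather than a real SLIMS server.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd (a list of arguments) and return (exit code, stdout, stderr)."""
    result = subprocess.run(cmd,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            universal_newlines=True)
    if result.returncode != 0:
        # Make failures visible instead of silently printing an empty output
        print("Command failed with exit code", result.returncode, file=sys.stderr)
    return result.returncode, result.stdout, result.stderr


# Demonstration with the Python interpreter instead of a real SLIMS call
code, out, err = run_and_report([sys.executable, "-c", "print('ok')"])
print(code, out.strip())
```

The same wrapper could be given the ``slims-lisp fetch`` argument list from the cell above, assuming the tool reports failures through a non-zero exit code.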

## Create a RENKU dataset

*Skip this step if not using RENKU*

Create a RENKU dataset and add the data file to it using the [``renku`` package](https://pypi.org/project/renku).
This will make the data searchable in RENKU and easier to export to repositories like Zenodo.

In [None]:
# Create a RENKU dataset
import subprocess  # needed here when the SLIMS import step was skipped
dataset_init = input("Dataset name: ")
dataset_create = subprocess.Popen(['/home/rstudio/.local/bin/renku', 'dataset', 'create',
                                   dataset_init],
                                  stdout = subprocess.PIPE)
stdout = dataset_create.stdout.read().decode()
print(stdout)

In [None]:
# Add a file to the RENKU dataset
# Ask for any value that was not set in a previous cell
if not globals().get("output_fp"):
    output_fp = input("File: ")
if not globals().get("dataset_init"):
    dataset_init = input("Dataset name: ")
dataset_add = subprocess.Popen(['/home/rstudio/.local/bin/renku', 'dataset', 'add',
                                dataset_init,
                                output_fp],
                               stdout = subprocess.PIPE)
stdout = dataset_add.stdout.read().decode()
print(stdout)

## Build survival curves using the ``bin/build_survival_curves.R`` R script

Run ``bin/build_survival_curves.R --help`` to see the options.

If using RENKU, prepend ``renku run`` to the command to track the results in the Knowledge Graph.

In [None]:
# Set the options for ``bin/build_survival_curves.R``
# Ask for the input file if it was not set in a previous cell
if not globals().get("output_fp"):
    output_fp = input("Path to the xlsx input file (ex: 'data/survival_data.xlsx'): ")
input_fp = output_fp
model = input("Model (ex: 'Strain+Treatment'): ")
result_dir = input("Path to the output directory (ex: 'data/results'): ")

In [None]:
# Run ``bin/build_survival_curves.R``
import os
import subprocess  # needed here when the previous steps were skipped
os.makedirs(result_dir, exist_ok=True)
# Output files are named after the input file (without its extension)
out_prefix = result_dir + "/" + os.path.splitext(os.path.basename(input_fp))[0]
build_surv = subprocess.Popen(['bin/build_survival_curves.R',
                               '--input', input_fp,
                               '--model', model,
                               '-t', out_prefix + ".txt",
                               '-f', "figs/" + os.path.basename(result_dir) + ".pdf",
                               '--coxph', out_prefix + "_coxph.txt",
                               '--km', out_prefix + "_km.txt"
                              ],
                              stdout = subprocess.PIPE)
stdout = build_surv.stdout.read().decode()
print(stdout)
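The argument list above can be assembled by a small function, which makes the optional ``renku run`` prefix explicit. The following is a sketch that assumes the same option names as the cell above. ``build_command`` is a hypothetical helper and ``use_renku`` is an illustrative parameter. The notebook calls RENKU through its full path (``/home/rstudio/.local/bin/renku``), while the sketch assumes ``renku`` is on the ``PATH``.

```python
import os


def build_command(input_fp, model, result_dir, use_renku=False):
    """Return the argument list for bin/build_survival_curves.R."""
    # Output files are named after the input file (without its extension)
    stem = os.path.splitext(os.path.basename(input_fp))[0]
    prefix = result_dir + "/" + stem
    cmd = ['bin/build_survival_curves.R',
           '--input', input_fp,
           '--model', model,
           '-t', prefix + ".txt",
           '-f', "figs/" + os.path.basename(result_dir) + ".pdf",
           '--coxph', prefix + "_coxph.txt",
           '--km', prefix + "_km.txt"]
    if use_renku:
        # Assumption: the renku executable is available on the PATH
        cmd = ['renku', 'run'] + cmd
    return cmd


cmd = build_command("data/survival_data.xlsx", "Strain+Treatment",
                    "data/results", use_renku=True)
print(" ".join(cmd))
```

The returned list can be passed directly to ``subprocess.Popen`` or ``subprocess.run`` in place of the inline list.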

## Create a RENKU dataset with the analysis results

*Skip this step if not using RENKU*

In [None]:
# Create a RENKU dataset
dataset_results = input("Dataset name: ")
dataset_create = subprocess.Popen(['/home/rstudio/.local/bin/renku', 'dataset', 'create',
                                   dataset_results],
                                  stdout = subprocess.PIPE)
stdout = dataset_create.stdout.read().decode()
print(stdout)

In [None]:
# Add results to the RENKU dataset
import glob

# Ask for any value that was not set in a previous cell
if not globals().get("result_dir"):
    result_dir = input("Results directory path: ")
if not globals().get("dataset_results"):
    dataset_results = input("Dataset name: ")
    
for f in os.listdir(result_dir):
    dataset_add = subprocess.Popen(['/home/rstudio/.local/bin/renku', 'dataset', 'add',
                                    dataset_results,
                                    result_dir + "/" + f],
                                   stdout = subprocess.PIPE)
    stdout = dataset_add.stdout.read().decode()
    print(stdout)

# Add figures to the RENKU dataset
for f in glob.glob("figs/" + os.path.basename(result_dir) + ".*"):
    dataset_add = subprocess.Popen(['/home/rstudio/.local/bin/renku', 'dataset', 'add',
                                    dataset_results,
                                    f],
                                   stdout = subprocess.PIPE)
    stdout = dataset_add.stdout.read().decode()
    print(stdout)

## Upload the results into SLIMS

*Skip this step if not using SLIMS*

Export the results to SLIMS using the [``slims-lisp`` package](https://pypi.org/project/slims-lisp).
Use the ``slims-lisp add-dataset`` command to upload the files to a new SLIMS ELN attachment step.
As before, the ``subprocess`` module runs the command, and the ``getpass`` module prompts for the password without echoing it.

In [None]:
# Get the list of data files
import glob
import os

# Ask for the results directory if it was not set in a previous cell
if not globals().get("result_dir"):
    result_dir = input("Results directory path: ")
    
results_files = glob.glob(result_dir + "/*")
results_files.extend(glob.glob("figs/" + os.path.basename(result_dir) + ".*"))

In [None]:
# If using RENKU, add the metadata from the RENKU dataset
# Skip this step if not using RENKU
%run helper
import re
if not globals().get("dataset_results"):
    dataset_results = input("Dataset name: ")

metadata = Command(['/home/rstudio/.local/bin/renku', 'dataset'])
metadata.pipe(['grep', '-w', re.sub("[^A-Za-z0-9]+", "", dataset_results)])
metadata.pipe(['cut', '-d', ' ', '-f1'])
metadata_file = ".renku/datasets/" + metadata.stdout.read().decode().splitlines()[0] + "/metadata.yml"
if metadata_file not in results_files:
    results_files.append(metadata_file)
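The ``grep -w`` and ``cut`` pipeline above can also be expressed in plain Python, which avoids the dependency on the ``helper`` script. The following is a sketch under the assumption that the dataset listing prints one dataset per line with the identifier in the first space-separated field, as the pipeline above implies. The function name ``first_field_of_matches`` is hypothetical, the sample listing is invented for illustration, and the exact ``renku dataset`` output format may differ.

```python
import re


def first_field_of_matches(text, word):
    """Mimic `grep -w word | cut -d ' ' -f1` on a block of text."""
    # Match `word` only when it is not surrounded by other word characters
    pattern = re.compile(r"(?<!\w)" + re.escape(word) + r"(?!\w)")
    return [line.split(" ")[0]
            for line in text.splitlines()
            if pattern.search(line)]


# Invented listing, for illustration only
listing = """ID        NAME
abc123    survivalresults
def456    rawcounts
"""
ids = first_field_of_matches(listing, "survivalresults")
print(ids)
```

As in the cell above, the first returned identifier would then be used to build the path to ``metadata.yml``.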

In [None]:
# Set ``slims-lisp add-dataset`` options
# Ask for any value that was not set in a previous cell
if not globals().get("url"):
    url = input("SLIMS url (ex: https://<address>/rest/rest): ")
if not globals().get("proj"):
    proj = input("Project name: ")
if not globals().get("exp"):
    exp = input("Experiment name: ")
title = input("Attachment step name: ")
if not globals().get("user"):
    user = input("User: ")

In [None]:
# Upload the files into a new SLIMS attachment step
import subprocess
import getpass
slims_add = subprocess.Popen(['slims-lisp', 'add-dataset',
                              '--url', url,
                              '--proj', proj,
                              '--exp', exp,
                              '--files', ','.join(results_files),
                              '--title', title,
                              '-v',
                              '-u', user,
                              '-p', getpass.getpass("SLIMS password:")],
                             stdout=subprocess.PIPE)
stdout = slims_add.stdout.read().decode()
print(stdout)