# Survival analysis

### General setup for working locally

*Skip this cell if using RENKU*

The R libraries first need to be compiled from the provided sources using the ``packrat::restore()`` command.

In [None]:
!Rscript -e "packrat::on(); packrat::restore()"

### Specific setup for working on RENKU

*Skip this cell if not using RENKU*

When opening the project in RENKU for the first time, migrate the packrat libraries on the docker image using the python [``renku-r-tools`` package](https://pypi.org/project/renku-r-tools).

The ``renku-r ln-packrat-lib`` command replaces the packrat R libraries of the R project with links to the compiled libraries in the home directory.

In [None]:
!renku-r ln-packrat-lib -p . -s /home/rstudio/packrat -v -f

## Import data from SLIMS

*Skip this step if not using SLIMS*

Import raw survival counts from SLIMS using the [``slims-lisp`` package](https://pypi.org/project/slims-lisp).

Use the ``slims-lisp fetch`` command to download the data collection .xlsx file from a SLIMS ELN attachment step.
A file containing metadata about the origin of the file on SLIMS will also be created by the command.
The ``subprocess`` and ``getpass`` modules are used to securely enter the access credentials.

In [None]:
# Set ``slims-lisp fetch`` options
import subprocess
import getpass
import os

if "url" not in vars():
    url = input("SLIMS url (ex: https://<address>/rest/rest): ")
if "proj" not in vars():
    proj = input("Project name: ")
if "exp" not in vars():
    exp = input("Experiment name: ")
if "step" not in vars():
    step = input("Attachment step name: ")
if "attm" not in vars():
    attm = input("Attachment name: ")
if "output_fp" not in vars():
    output_fp = input("Output file path (ex: data/survival_data.xlsx): ")
if "user" not in vars():
    user = input("SLIMS username: ")

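# Create the output directory if the path includes one
# (os.path.dirname returns '' for a bare file name, which os.makedirs rejects)
output_dir = os.path.dirname(output_fp)
if output_dir:
    os.makedirs(output_dir, exist_ok=True)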

# Download attachment with ``slims-lisp fetch``
slims_fetch = subprocess.Popen(['slims-lisp', 'fetch',
                                '--url', url,
                                '--proj', proj,
                                '--exp', exp,
                                '--step', step,
                                '--attm', attm,
                                '--output', output_fp,
                                '-v',
                                '-u', user,
                                '-p', getpass.getpass("SLIMS password: ")],
                               stdout = subprocess.PIPE,
                               stderr = subprocess.PIPE)
# communicate() reads both pipes together, so a full stderr buffer cannot block the read
stdout, stderr = slims_fetch.communicate()
print(stdout.decode() + stderr.decode())
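
The "run a command, then print its stdout and stderr" pattern recurs in most cells of this notebook. A minimal sketch of a reusable helper is shown below; the name `run_and_capture` is hypothetical and not part of the project, and the demonstration calls the Python interpreter rather than a real CLI.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd (a list of arguments) and return its stdout followed by its stderr as text."""
    # subprocess.run collects both streams together, so neither pipe can fill up and block
    done = subprocess.run(cmd, capture_output=True, text=True)
    return done.stdout + done.stderr


# Demonstration with the current Python interpreter instead of a real command-line tool
demo = run_and_capture([sys.executable, "-c",
                        "import sys; print('out'); print('err', file=sys.stderr)"])
print(demo)
```

The same helper could be given any of the argument lists used in this notebook, assuming Python 3.7 or later for `capture_output` and `text`.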

## Create an initial RENKU dataset

*Skip this step if not using RENKU*

Create a RENKU dataset and add the data collection file to it using the [``renku`` package](https://pypi.org/project/renku).
This will make the data searchable in RENKU and easier to export to repositories like Zenodo.

In [None]:
# Create an initial RENKU dataset
import subprocess

dataset_init = input("Dataset name: ")
dataset_create = subprocess.Popen(['/home/rstudio/.local/bin/renku', 'dataset', 'create',
                                   dataset_init],
                                  stdout = subprocess.PIPE)
stdout = dataset_create.stdout.read().decode()
print(stdout)

# Add a file to the initial RENKU dataset
if "output_fp" not in vars():
    output_fp = input("File to add: ")
dataset_add = subprocess.Popen(['/home/rstudio/.local/bin/renku', 'dataset', 'add',
                                dataset_init,
                                output_fp],
                               stdout = subprocess.PIPE,
                               stderr = subprocess.PIPE)
# communicate() reads both pipes together, so a full stderr buffer cannot block the read
stdout, stderr = dataset_add.communicate()
print(stdout.decode() + stderr.decode())

## Build survival curves

Run the ``bin/build_survival_curves.R`` script:

In [None]:
# Set the options for ``bin/build_survival_curves.R``
import os
import subprocess

if "output_fp" not in vars():
    output_fp = input("Path to the xlsx data collection file (ex: 'data/survival_data.xlsx'): ")
input_fp = output_fp
model = input("Statistical model (ex: 'Strain+Treatment'): ")
result_dir = input("Path to the output directory (ex: 'data/results'): ")
if not os.path.exists(result_dir):
    os.makedirs(result_dir)

# Run ``bin/build_survival_curves.R``
# ADVANCED: If using RENKU, prepend the ``renku run`` command to track the results in the Knowledge Graph.

build_surv = subprocess.Popen(['bin/build_survival_curves.R',
                               '--input_fp', input_fp,
                               '--model', model,
                               '--output', result_dir
                              ],
                              stdout = subprocess.PIPE,
                              stderr = subprocess.PIPE)
# communicate() reads both pipes together, so a full stderr buffer cannot block the read
stdout, stderr = build_surv.communicate()
print(stdout.decode() + stderr.decode())
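
The ADVANCED note in the cell above suggests prepending ``renku run`` to the script call. One possible way to build both variants of the argument list is sketched here; the function name `survival_command`, its `track_with_renku` switch and the default `renku` executable name are assumptions made for illustration, and only the script path and its three options come from the cell above.

```python
def survival_command(input_fp, model, result_dir, track_with_renku=False, renku="renku"):
    """Return the argument list for bin/build_survival_curves.R, optionally wrapped in `renku run`."""
    cmd = ['bin/build_survival_curves.R',
           '--input_fp', input_fp,
           '--model', model,
           '--output', result_dir]
    if track_with_renku:
        # Prepend `renku run` so that RENKU records the inputs and outputs of the script
        cmd = [renku, 'run'] + cmd
    return cmd


# Example values taken from the prompts in the cell above
plain = survival_command('data/survival_data.xlsx', 'Strain+Treatment', 'data/results')
tracked = survival_command('data/survival_data.xlsx', 'Strain+Treatment', 'data/results',
                           track_with_renku=True)
print(' '.join(tracked))
```

The returned list can be passed to ``subprocess.Popen`` exactly as in the cell above. On the RENKU docker image the executable may need to be given as ``/home/rstudio/.local/bin/renku``, as in the other cells.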

## Create a RENKU dataset with the analysis results

*Skip this step if not using RENKU*

In [None]:
# Create a RENKU dataset
import subprocess
import os

dataset_results = input("Dataset name: ")
dataset_create = subprocess.Popen(['/home/rstudio/.local/bin/renku', 'dataset', 'create',
                                   dataset_results],
                                  stdout = subprocess.PIPE,
                                  stderr = subprocess.PIPE)
# communicate() reads both pipes together, so a full stderr buffer cannot block the read
stdout, stderr = dataset_create.communicate()
print(stdout.decode() + stderr.decode())

if "result_dir" not in vars():
    result_dir = input("Results directory path: ")

# Add results to the RENKU dataset
for f in os.listdir(result_dir):
    print("Adding: " + f)
    dataset_add = subprocess.Popen(['/home/rstudio/.local/bin/renku', 'dataset', 'add',
                                    dataset_results,
                                    result_dir + "/" + f],
                                   stdout = subprocess.PIPE,
                                   stderr = subprocess.PIPE)
    # communicate() reads both pipes together, so a full stderr buffer cannot block the read
    stdout, stderr = dataset_add.communicate()
    print(stdout.decode() + stderr.decode())
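
The loop above passes every entry returned by ``os.listdir``, including any subdirectory, and joins paths with a literal "/". If only regular files should be added, the list could be built as in the following sketch; `list_result_files` is a hypothetical helper and the file names in the demonstration are made up.

```python
import os
import tempfile


def list_result_files(result_dir):
    """Return the sorted paths of the regular files directly inside result_dir."""
    paths = (os.path.join(result_dir, name) for name in os.listdir(result_dir))
    # Skip subdirectories so that only files are collected
    return sorted(path for path in paths if os.path.isfile(path))


# Demonstration on a throwaway directory holding one file and one subdirectory
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "curves.pdf"), "w").close()
    os.mkdir(os.path.join(demo_dir, "logs"))
    demo_files = [os.path.basename(p) for p in list_result_files(demo_dir)]
print(demo_files)
```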

## Upload the results into SLIMS

*Skip this step if not using SLIMS*

Export the results to SLIMS using the [``slims-lisp`` package](https://pypi.org/project/slims-lisp).
Use the ``slims-lisp add-dataset`` command to upload the files to a new SLIMS ELN attachment step.
Again, the ``subprocess`` and ``getpass`` modules are used to securely enter the access credentials.

In [None]:
# Get the list of data files
import glob

if "result_dir" not in vars():
    result_dir = input("Results directory path: ")
    
results_files = glob.glob(result_dir + "/*")
print("Data files:\n" + '\n'.join(results_files))

If using RENKU, add the metadata file of the RENKU dataset to the list of files to upload.

*Skip this cell if not using RENKU*

In [None]:
%run lib/command
import re

if "dataset_results" not in vars():
    dataset_results = input("Dataset name: ")

metadata = Command(['/home/rstudio/.local/bin/renku', 'dataset'])
metadata.pipe(['grep', '-w', re.sub("[^A-Za-z0-9]+", "", dataset_results)])
metadata.pipe(['cut', '-d', ' ', '-f1'])
metadata_file = ".renku/datasets/" + metadata.stdout.read().decode().splitlines()[0] + "/metadata.yml"
if metadata_file not in results_files:
    results_files.append(metadata_file)
    print("Adding '" + metadata_file + "'.")
else:
    print("'" + metadata_file + "' already added.")
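
The ``grep -w`` and ``cut`` pipeline above keeps the first space-separated field of the first line that contains the cleaned dataset name as a whole word. The same lookup can be sketched in plain Python, as below. The function name `find_dataset_id` is hypothetical, and the sample listing is invented for the demonstration: the real output of ``renku dataset`` may be laid out differently.

```python
import re


def find_dataset_id(listing, dataset_name):
    """Return the first field of the first line naming the dataset, or None if no line matches."""
    # Same cleaning as in the cell above: drop every non-alphanumeric character
    cleaned = re.sub("[^A-Za-z0-9]+", "", dataset_name)
    whole_word = re.compile(r"\b" + re.escape(cleaned) + r"\b")
    for line in listing.splitlines():
        if whole_word.search(line):
            # Equivalent of `cut -d ' ' -f1`
            return line.split(" ")[0]
    return None


# Invented listing, for illustration only
sample_listing = "ID      NAME\nabc123  survivalresults\ndef456  rawcounts\n"
print(find_dataset_id(sample_listing, "survival-results"))
```

Returning `None` when nothing matches makes it possible to report a missing dataset explicitly instead of failing on an empty list.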

In [None]:
# Set ``slims-lisp add-dataset`` options
import subprocess
import glob
import getpass

if "url" not in vars():
    url = input("SLIMS url (ex: https://<address>/rest/rest): ")
if "proj" not in vars():
    proj = input("Project name: ")
if "exp" not in vars():
    exp = input("Experiment name: ")
title = input("Attachment step name: ")
if "user" not in vars():
    user = input("SLIMS username: ")

# Upload the files into a new SLIMS attachment step
slims_add = subprocess.Popen(['slims-lisp', 'add-dataset',
                              '--url', url,
                              '--proj', proj,
                              '--exp', exp,
                              '--files', ','.join(results_files),
                              '--title', title,
                              '-v',
                              '-u', user,
                              '-p', getpass.getpass("SLIMS password:")],
                             stdout=subprocess.PIPE)
stdout = slims_add.stdout.read().decode()
print(stdout)

## Save your work!

Commit and push the changes onto the git repository.

Danger: Errors in git setup can lead to breaches of privacy.
 - Understand the project's privacy requirements
 - Know who has access to the git repository
 - If you witness a breach, immediately inform the responsible persons and fix the breach (make sure to also delete all sensitive information from previous versions and logs)

Tracking of data and figures may be disabled by default in the `.gitignore` file, so the cell below first removes the `data/*`, `figs/*` and `*.nb.html` rules from it before committing.

In [None]:
%%bash -s "{input('Git commit message: ')}"
sed -i "/data\/\*/d;/figs\/\*/d;/\*\.nb\.html/d" .gitignore
git add -A
git commit -m "$1"
git push
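
The ``sed`` call above deletes every `.gitignore` line that contains `data/*`, `figs/*` or `*.nb.html`. To preview the effect before running the cell, the same filter can be sketched in Python; `drop_ignore_rules` is a hypothetical helper and the sample file content is invented.

```python
def drop_ignore_rules(gitignore_text, rules=("data/*", "figs/*", "*.nb.html")):
    """Return gitignore_text without the lines that contain any of the given rules."""
    kept = [line for line in gitignore_text.splitlines()
            if not any(rule in line for rule in rules)]
    # Keep a trailing newline when at least one line remains
    return "\n".join(kept) + ("\n" if kept else "")


# Invented .gitignore content, for illustration only
sample_gitignore = "data/*\nfigs/*\n*.nb.html\n.Rhistory\n"
print(drop_ignore_rules(sample_gitignore))
```

The filter matches substrings, as ``sed`` does here, so a line such as `!data/*` would also be removed.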