
streamline/document update process #60

Closed · johnbradley opened this issue Feb 13, 2020 · 2 comments · Fixed by #65

Comments

@johnbradley (Collaborator)

The Broad releases new data every quarter, which is processed by this app.
Currently, updating for a new release is a multistep process:

  1. @hirscheylab performs some data validation, then creates a PR that updates the release name, URLs, and methods details. Example PR: 20Q1 update #57
  2. After the above PR is merged, a Docker image is automatically built. Wait for the Docker image to build.
  3. On our HPC cluster I generate the data and upload the results to a DukeDS project. This process uses a Singularity image created from the above Docker image and a clone of this repo.
  4. On my laptop I update the list of files to download into the OpenShift ddh-data volume based on the new contents of the DukeDS project. Then I rerun the job to download data and manually redeploy the app. Finally, I create a PR with the updates to openshift/file-list.json.

I would like to simplify this process or at least have these details recorded so I don't forget them.

Notes based on the steps above

Step 3 - Generate data - Docker Image

The Docker image supplies the R libraries used by data generation. I currently clone the repo as well and use the two in combination. I am wondering if I could use just the Docker/Singularity image.
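
For reference, a rough sketch of what running data generation entirely from the image might look like; the image name and script path below are assumptions, not the project's actual values:

# Pull a Singularity image from the (assumed) Docker Hub image name
singularity pull singularity/images/ddh.sif docker://hirscheylab/ddh:latest

# Run the (assumed) data-generation script baked into the image,
# binding a host data directory for the output
singularity exec --bind $(pwd)/data:/srv/ddh/data singularity/images/ddh.sif \
    Rscript /srv/ddh/code/generate_depmap_data.R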

Step 3 - Generate data - directories to create

I need to manually create the following directories after cloning the repo: logs, singularity/images, and data. I'm not sure why the Makefile doesn't create the data directory.
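
These could be created in one command (or added as a Makefile target); a minimal sketch:

# Create the working directories the data-generation jobs expect
mkdir -p logs singularity/images data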

Step 3 - Generate data - sbatch commands

We have some notes here: https://github.com/hirscheylab/ddh#singularity
Basically, I set up a config file, run sbatch build-slurm.sh, wait for it to finish successfully, then run sbatch upload-slurm.
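
In outline, the sequence looks something like this (the config step is paraphrased; see the notes linked above for the real file names and contents):

# Set up the config file the jobs read (details in the repo notes)
# Submit the data-generation job
sbatch build-slurm.sh
# Watch the queue until the build job completes successfully
squeue -u $USER
# Then submit the upload job to push results to the DukeDS project
sbatch upload-slurm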

Step 4 - Update website - Update list of files to download

To update the list of files to download into OpenShift we have openshift/make-file-list.py. This script lists every file in the DukeDS project, not just the files for the current release, so I manually remove the older files from the list.
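
A sketch of that step; the script's exact interface (arguments, output location) is an assumption here:

# Regenerate the file list from the DukeDS project (interface assumed)
python openshift/make-file-list.py > openshift/file-list.json
# Manually remove entries for older releases, then commit the result
git add openshift/file-list.json
git commit -m "Update file list for new release"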

Step 4 - Update website - Download

Downloading data into the OpenShift app requires installing and configuring the OpenShift oc command.
Rerunning the job to download data usually requires deleting the previous job first:

oc delete job download-ddh-data

Then creating/running the job to download data:

oc create -f DownloadJob.yaml 
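
Before redeploying, the job can be monitored until it completes, e.g. (assuming a reasonably recent oc client):

# Follow the download job's logs and check its completion status
oc logs -f job/download-ddh-data
oc get job download-ddh-data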

Once the job finishes, I redeploy the website using the OKD application console: depmap -> Applications -> ddh-shiny-app -> click Deploy.

FYI: @dleehr

@johnbradley (Collaborator, Author)

Another note for Step 4 - Update website - Update list of files to download:

Even though we no longer use the <release>_achilles_cor.Rds file in app.R:
dce7521#diff-934cea81792b89d50e43ec7534f8b30e

We are still downloading this 2G file into our OpenShift deployment:
https://github.com/hirscheylab/ddh/blob/3cd774c161ee7f9296dab6c33a07f4efc49cb27c/openshift/file-list.json#L4

@johnbradley (Collaborator, Author)

Based on code/app.R, this is the list of files we need to stage from the data directory:

gene_summary.Rds
<release>_achilles.Rds
<release>_expression_join.Rds
sd_threshold.Rds
achilles_lower.Rds
achilles_upper.Rds
mean_virtual_achilles.Rds
sd_virtual_achilles.Rds
master_bottom_table.Rds
master_top_table.Rds
master_positive.Rds
master_negative.Rds
