
streamline/document update process #60

Closed · johnbradley opened this issue Feb 13, 2020 · 2 comments · Fixed by #65

Comments

@johnbradley (Collaborator)

The Broad releases new data every quarter, which is processed by this app.
Currently, updating for a new release is a multistep process:

  1. @hirscheylab performs some data validation, then creates a PR that updates the release name, URLs, and methods details. Example PR: 20Q1 update #57
  2. After the above PR is merged, a Docker image is automatically built. Wait for the Docker image to build.
  3. On our HPC cluster I generate the data and upload the results to a DukeDS project. This process uses a Singularity image created from the above Docker image and a clone of this repo.
  4. On my laptop I update the list of files to download into the OpenShift ddh-data volume based on the new contents of the DukeDS project. Then I rerun the job to download data and manually redeploy the app. Finally, I create a PR with the updates to openshift/file-list.json.

I would like to simplify this process or at least have these details recorded so I don't forget them.

Notes based on the steps above

Step 3 - Generate data - Docker Image

The Docker image supplies the R libraries used by data generation. I currently clone the repo as well and use the two in combination. I am wondering if I could use just the Docker/Singularity image.
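
For reference, a rough sketch of what running data generation entirely from the image might look like; the image name and script path below are assumptions, not the project's actual values:

# Pull a Singularity image from the (assumed) Docker Hub image name
singularity pull singularity/images/ddh.sif docker://hirscheylab/ddh:latest

# Run the (assumed) data-generation script baked into the image,
# binding a host data directory for the output
singularity exec --bind $(pwd)/data:/srv/ddh/data singularity/images/ddh.sif \
    Rscript /srv/ddh/code/generate_depmap_data.R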

Step 3 - Generate data - directories to create

I need to manually create the following directories after cloning the repo: logs, singularity/images, and data. I'm not sure why the Makefile doesn't create the data directory.
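
These could be created in one command (or added as a Makefile target); a minimal sketch:

# Create the working directories the data-generation jobs expect
mkdir -p logs singularity/images data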

Step 3 - Generate data - sbatch commands

We have some notes here: https://github.com/hirscheylab/ddh#singularity
Basically, I set up a config file, run sbatch build-slurm.sh, wait for it to finish successfully, then run sbatch upload-slurm.
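
In outline, the sequence looks something like this (the config step is paraphrased; see the notes linked above for the real file names and contents):

# Set up the config file the jobs read (details in the repo notes)
# Submit the data-generation job
sbatch build-slurm.sh
# Watch the queue until the build job completes successfully
squeue -u $USER
# Then submit the upload job to push results to the DukeDS project
sbatch upload-slurm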

Step 4 - Update website - Update list of files to download

To update the list of files to download into OpenShift we have openshift/make-file-list.py. This script lists every file in the DukeDS project, not just the files for the current release, so I manually remove the older files from the list.
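
A sketch of that step; the script's exact interface (arguments, output location) is an assumption here:

# Regenerate the file list from the DukeDS project (interface assumed)
python openshift/make-file-list.py > openshift/file-list.json
# Manually remove entries for older releases, then commit the result
git add openshift/file-list.json
git commit -m "Update file list for new release"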

Step 4 - Update website - Download

Downloading data into the OpenShift app requires installing and configuring the OpenShift oc command.
Rerunning the job to download data usually requires deleting the previous job first:

oc delete job download-ddh-data

Then creating/running the job to download data:

oc create -f DownloadJob.yaml 
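
Before redeploying, the job can be monitored until it completes, e.g. (assuming a reasonably recent oc client):

# Follow the download job's logs and check its completion status
oc logs -f job/download-ddh-data
oc get job download-ddh-data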

Once the job finishes, I redeploy the website using the OKD application console: depmap -> Applications -> ddh-shiny-app -> click Deploy.

FYI: @dleehr

@johnbradley (Collaborator, Author)

Another note for Step 4 - Update website - Update list of files to download:

Even though we no longer use the <release>_achilles_cor.Rds file in app.R:
dce7521#diff-934cea81792b89d50e43ec7534f8b30e

We are still downloading this 2G file into our OpenShift deployment:
https://github.com/hirscheylab/ddh/blob/3cd774c161ee7f9296dab6c33a07f4efc49cb27c/openshift/file-list.json#L4

@johnbradley (Collaborator, Author)

Based on code/app.R, this is the list of files we need to stage from the data directory:

gene_summary.Rds
<release>_achilles.Rds
<release>_expression_join.Rds
sd_threshold.Rds
achilles_lower.Rds
achilles_upper.Rds
mean_virtual_achilles.Rds
sd_virtual_achilles.Rds
master_bottom_table.Rds
master_top_table.Rds
master_positive.Rds
master_negative.Rds
