Bundle the jbang-compiled JARs within the docker container and update empty-file usage for cloud storage #46

Merged
merged 17 commits into from
Aug 23, 2022

abhi18av
Contributor

This PR explores bundling the jbang-compiled JARs into the Docker container itself, as proposed in the second suggestion in #45.

I'd be happy to give this PR finishing touches, if you agree with the overall direction here.

NOTE: In this POC, I've hardcoded the path of the JAR in the process to avoid having to rebuild and push the Docker container again; this still needs to be tweaked accordingly.

As of https://github.com/abhi18av/nf-gwas/pull/2/commits/97d075d63fe41569016560713f179a0487aad4ff , this approach is working.

  • On local executor
N E X T F L O W  ~  version 22.04.5
Launching `main.nf` [backstabbing_liskov] DSL2 - revision: 6e13e51147
executor >  local (1)
[29/57e2f7] process > NF_GWAS:VALIDATE_PHENOTYPES [100%] 1 of 1 ✔
Pipeline completed at: 2022-08-19T10:35:17.167361542+02:00
Execution status: OK


  • On azure executor
N E X T F L O W  ~  version 22.08.2-edge
Launching `./main.nf` [cranky_noyce] DSL2 - revision: 6e13e51147
Uploading local `bin` scripts folder to az://batch-jobs/nf-gwas-workdir/tmp/2c/8b5df3e5d7a4229ba4e2a2d27a4beb/bin
executor >  azurebatch (1)
[94/c78916] process > NF_GWAS:VALIDATE_PHENOTYPES [100%] 1 of 1 ✔
Pipeline completed at: 2022-08-19T10:44:09.673403+02:00
Execution status: OK
Completed at: 19-Aug-2022 10:44:10
Duration    : 3m 20s
CPU hours   : (a few seconds)
Succeeded   : 1


@@ -10,7 +10,7 @@ dependencies:
   - r=4.1.0
   - r-rmarkdown=2.11
   - r-ggplot2=3.3.5
-  - r-dplyr
+  - r-dplyr=1.0.9
Contributor Author

@abhi18av abhi18av Aug 21, 2022

NOTE: I had to pin this dependency to a newer version because of the following error:


Command error:
  Loading required package: rmarkdown
  Warning message:
  package 'rmarkdown' was built under R version 4.1.2 
  
  
  processing file: gwas_report_template.Rmd
  Quitting from lines 30-39 (gwas_report_template.Rmd) 
  Error: package or namespace load failed for 'ramwas' in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
   namespace 'dplyr' 1.0.7 is already loaded, but >= 1.0.9 is required
  Execution halted



Comment on lines 39 to 50
//Optional covariates file
if (params.covariates_filename == []) {
covariates_file = []
} else {
covariates_file = file(params.covariates_filename, checkIfExists: true)
}

 //Optional sample file
-sample_file = file(params.regenie_sample_file)
-if (params.regenie_sample_file != 'NO_SAMPLE_FILE' && !sample_file.exists()){
-  exit 1, "Sample file ${params.regenie_sample_file} not found."
+if (params.regenie_sample_file == []) {
+  sample_file = []
+} else {
+  sample_file = file(params.regenie_sample_file, checkIfExists: true)
Contributor Author

@abhi18av abhi18av Aug 21, 2022

NOTE: These changes were made to accommodate empty files on cloud blob storage.
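
For illustration only, the same pattern (an empty list when the optional parameter is unset, otherwise a file that is checked for existence) could be factored into a small helper. This is a minimal sketch, not what the PR does: `optionalFile` is a hypothetical name, and it assumes the optional parameters default to `[]` in the pipeline configuration. The PR itself inlines the logic in `main.nf`.

```groovy
// Hypothetical helper, not part of this PR; intended for use inside a Nextflow script.
def optionalFile(value) {
    // Assumption: an unset optional parameter is represented as an empty list.
    if (value == []) {
        return []
    }
    // checkIfExists: true makes Nextflow fail early if the path does not exist.
    return file(value, checkIfExists: true)
}

// Usage with the two optional inputs touched in this PR
covariates_file = optionalFile(params.covariates_filename)
sample_file     = optionalFile(params.regenie_sample_file)
```

With this, each optional input takes one line and no placeholder file name is needed.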

@abhi18av
Contributor Author

abhi18av commented Aug 21, 2022

Quick update: As of e580e11, the pipeline is working locally and on cloud.

  • Execution on local executor 💻
~/data/projects/nf-gwas$ nextflow -c ../nf-gwas-local.config run main.nf -profile test,docker --outdir custom_results
N E X T F L O W  ~  version 22.04.5
Launching `main.nf` [tender_brahmagupta] DSL2 - revision: 6e13e51147
executor >  local (16)
[a4/355962] process > NF_GWAS:VALIDATE_PHENOTYPES          [100%] 1 of 1 ✔
[6f/abd8ce] process > NF_GWAS:QC_FILTER_GENOTYPED (1)      [100%] 1 of 1 ✔
[57/191e01] process > NF_GWAS:REGENIE_STEP1 (1)            [100%] 1 of 1 ✔
[98/65f8a9] process > NF_GWAS:REGENIE_LOG_PARSER_STEP1 (1) [100%] 1 of 1 ✔
[2f/4c2139] process > NF_GWAS:REGENIE_STEP2 (example)      [100%] 1 of 1 ✔
[92/20dab3] process > NF_GWAS:REGENIE_LOG_PARSER_STEP2     [100%] 1 of 1 ✔
[f6/c5f4be] process > NF_GWAS:FILTER_RESULTS (example_Y2)  [100%] 2 of 2 ✔
[1a/c58b92] process > NF_GWAS:MERGE_RESULTS_FILTERED (Y1)  [100%] 2 of 2 ✔
[e7/ac8a20] process > NF_GWAS:MERGE_RESULTS (Y1)           [100%] 2 of 2 ✔
[86/a00dd6] process > NF_GWAS:ANNOTATE_FILTERED (1)        [100%] 2 of 2 ✔
[56/ad0931] process > NF_GWAS:REPORT (1)                   [100%] 2 of 2 ✔
Pipeline completed at: 2022-08-21T22:06:02.551433621+02:00
Execution status: OK
Completed at: 21-Aug-2022 22:06:02
Duration    : 1m 25s
CPU hours   : (a few seconds)
Succeeded   : 16
  • Execution on azurebatch executor 🌩️
$ nextflow -c ../nf-gwas-azure.config run ./main.nf -profile test,docker,azb 
N E X T F L O W  ~  version 22.08.2-edge
Launching `./main.nf` [hopeful_plateau] DSL2 - revision: 6e13e51147
Uploading local `bin` scripts folder to az://batch-jobs/nf-gwas-workdir/tmp/0a/2a71a0f98537a0cf40df73be97319f/bin
executor >  azurebatch (16)
[56/b01abe] process > NF_GWAS:VALIDATE_PHENOTYPES          [100%] 1 of 1 ✔
[69/ae0aed] process > NF_GWAS:QC_FILTER_GENOTYPED (1)      [100%] 1 of 1 ✔
[06/f22a2a] process > NF_GWAS:REGENIE_STEP1 (1)            [100%] 1 of 1 ✔
[b7/e28cf7] process > NF_GWAS:REGENIE_LOG_PARSER_STEP1 (1) [100%] 1 of 1 ✔
[21/83d459] process > NF_GWAS:REGENIE_STEP2 (example)      [100%] 1 of 1 ✔
[d5/9e9522] process > NF_GWAS:REGENIE_LOG_PARSER_STEP2     [100%] 1 of 1 ✔
[f3/875964] process > NF_GWAS:FILTER_RESULTS (example_Y2)  [100%] 2 of 2 ✔
[75/898d5f] process > NF_GWAS:MERGE_RESULTS_FILTERED (Y2)  [100%] 2 of 2 ✔
[22/461918] process > NF_GWAS:MERGE_RESULTS (Y1)           [100%] 2 of 2 ✔
[fc/d09a56] process > NF_GWAS:ANNOTATE_FILTERED (2)        [100%] 2 of 2 ✔
[7b/ab6d0c] process > NF_GWAS:REPORT (2)                   [100%] 2 of 2 ✔
Pipeline completed at: 2022-08-21T22:22:16.397925+02:00
Execution status: OK
Completed at: 21-Aug-2022 22:22:16
Duration    : 5m 46s
CPU hours   : 0.2
Succeeded   : 16

I'm finalizing the nf-test related updates. In the meantime, it'd be great if you could test the pipeline on your end (with some real data) using this container: rg.fr-par.scw.cloud/nfcontainers/nf-gwas-with-jars:1.0.0 🙏

@abhi18av abhi18av marked this pull request as ready for review August 21, 2022 20:26
@abhi18av abhi18av changed the title from "Bundle the jbang-compiled JARs within the docker container" to "Bundle the jbang-compiled JARs within the docker container and update empty-file usage for cloud storage" Aug 21, 2022
@seppinho
Member

Hi guys,
We really appreciate the commits and improvements. We have already run the branch on real data (2 chromosomes) and everything looks good so far. We'll start a larger analysis on all chromosomes now, but we don't expect any major problems with that.

@seppinho
Member

Hi guys,
Analysis finished as expected.

If everything is done from your side, I'm happy to merge this. Once again, thanks for your time and the improvements. Curious to see how it performs on large datasets on Azure.

@drpatelh
Contributor

Awesome! Thanks for writing the pipeline and for your work on nf-test :bowtie: Great to be able to use a well-written pipeline off-the-shelf rather than re-inventing the wheel.

Will wait for @abhi18av to confirm that he is happy to merge.

Once merged, it would be awesome if you could push a new container and create a release for us 🙏🏽

@abhi18av
Contributor Author

As a final test for cloud execution, I've successfully 💚 tested the execution again - looks good from my side. Please go ahead with the merge and release 🙏

$ nextflow -c ../nf-gwas-azure.config run ./main.nf -profile test,docker,azb
N E X T F L O W  ~  version 22.08.2-edge
Launching `./main.nf` [irreverent_volta] DSL2 - revision: 6e13e51147
Uploading local `bin` scripts folder to az://batch-jobs/nf-gwas-workdir/tmp/16/88dad2486dfa0eaa530512b44ce5c6/bin
executor >  azurebatch (16)
[e9/9893a9] process > NF_GWAS:VALIDATE_PHENOTYPES          [100%] 1 of 1 ✔
[fb/c542ea] process > NF_GWAS:QC_FILTER_GENOTYPED (1)      [100%] 1 of 1 ✔
[48/bdacc8] process > NF_GWAS:REGENIE_STEP1 (1)            [100%] 1 of 1 ✔
[ce/ce57df] process > NF_GWAS:REGENIE_LOG_PARSER_STEP1 (1) [100%] 1 of 1 ✔
[2e/b80888] process > NF_GWAS:REGENIE_STEP2 (example)      [100%] 1 of 1 ✔
[60/d2fc7c] process > NF_GWAS:REGENIE_LOG_PARSER_STEP2     [100%] 1 of 1 ✔
[49/3fc2b5] process > NF_GWAS:FILTER_RESULTS (example_Y1)  [100%] 2 of 2 ✔
[69/37a7b6] process > NF_GWAS:MERGE_RESULTS_FILTERED (Y1)  [100%] 2 of 2 ✔
[3b/2aa82b] process > NF_GWAS:MERGE_RESULTS (Y1)           [100%] 2 of 2 ✔
[4e/8de33c] process > NF_GWAS:ANNOTATE_FILTERED (2)        [100%] 2 of 2 ✔
[55/7999c0] process > NF_GWAS:REPORT (2)                   [100%] 2 of 2 ✔
Pipeline completed at: 2022-08-23T11:01:25.425121+02:00
Execution status: OK
Completed at: 23-Aug-2022 11:01:25
Duration    : 5m 34s
CPU hours   : 0.1
Succeeded   : 16



@seppinho
Member

I fixed the test cases to reflect the changes you made in the pipeline. Thanks again!!

@seppinho seppinho merged commit e13dd07 into genepi:main Aug 23, 2022