[ld prune] ld_pruning task is hardcoded to not work on chrX #8

Open
aofarrel opened this issue May 1, 2021 · 4 comments
Labels
help wanted Extra attention is needed

Comments

@aofarrel
Collaborator

aofarrel commented May 1, 2021

It appears that neither the Python pipeline nor the CWL has an argument to change autosome_only, so it is effectively hardcoded to True in the R script. When the pipeline is run on chrX, the R script understandably skips it, but this causes Cromwell to panic because the expected RData output is never generated.
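
To illustrate the failure mode, here is a minimal Python sketch. It is not the pipeline's actual code: `ld_prune_step` is a hypothetical stand-in for the LD pruning step, `collect_outputs` is a hypothetical check that mimics a workflow engine requiring every declared output to exist, and the output file name is made up for the example.

```python
import os
import tempfile

AUTOSOMES = {str(i) for i in range(1, 23)}  # chromosomes 1-22


def ld_prune_step(chromosome, out_dir, autosome_only=True):
    """Hypothetical stand-in for the LD pruning step; returns the declared output path."""
    out_path = os.path.join(out_dir, f"pruned_variants_chr{chromosome}.RData")
    if autosome_only and chromosome not in AUTOSOMES:
        # Skipped: nothing is written, but the output is still declared.
        return out_path
    with open(out_path, "w") as handle:
        handle.write("placeholder for pruned variant IDs\n")
    return out_path


def collect_outputs(paths):
    """Mimic an engine that fails when a declared output file is missing."""
    missing = [p for p in paths if not os.path.exists(p)]
    if missing:
        raise FileNotFoundError(f"expected output(s) not found: {missing}")
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        collect_outputs([ld_prune_step("1", tmp)])  # autosome: output exists
        try:
            collect_outputs([ld_prune_step("X", tmp)])  # chrX: output was never written
        except FileNotFoundError as err:
            print("engine would fail:", err)
```

In this sketch the only way to get a chrX output is to pass `autosome_only=False`, which is the argument neither the Python pipeline nor the CWL currently exposes.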

@aofarrel aofarrel changed the title from "[b] ld_pruning does not work on chrX by default" to "[b] ld_pruning is hardcoded to not work on chrX" May 1, 2021
aofarrel added a commit that referenced this issue May 1, 2021
most of which are no-ops, but...
@aofarrel
Collaborator Author

aofarrel commented May 4, 2021

To do: Test whether the CWL errors out or just skips chrX. If it errors, the two are equivalent; if it skips, that's a divergence and the divergence label is appropriate.

@aofarrel
Collaborator Author

aofarrel commented May 5, 2021

The CWL does indeed error out.

Seven Bridges job.error.log

2021-05-05T18:29:18.585126888Z Loading required package: SeqArray
2021-05-05T18:29:18.587557445Z Loading required package: gdsfmt
2021-05-05T18:29:18.620759364Z SNPRelate -- supported by Streaming SIMD Extensions 2 (SSE2)
2021-05-05T18:29:18.646526603Z found parameters: gds_file, genome_build, out_file
2021-05-05T18:29:18.646667727Z using default values: autosome_only, exclude_pca_corr, ld_r_threshold, ld_win_size, maf_threshold, missing_threshold, sample_include_file, variant_include_file
2021-05-05T18:29:18.650421776Z Using all samples
2021-05-05T18:29:18.695280165Z Using 985 variants
2021-05-05T18:29:18.697779543Z Running with 1 thread(s).
2021-05-05T18:29:18.703736959Z Error in seqParallel(parallel, gdsfile, split = "by.variant", FUN = function(f, :
2021-05-05T18:29:18.703755062Z No variants selected.
2021-05-05T18:29:18.703761029Z Calls: snpgdsLDpruning ... .InitFile2 -> eval -> eval -> -> seqParallel
2021-05-05T18:29:18.703767758Z Execution halted

Seven Bridges job.out.log
https://pastebin.com/MfJ8Bq3A

The fact that the error seems to occur in the R script because no variants are selected, instead of in the CWL due to lack of an expected output, has me wondering: might the original pipeline, which by default takes in chr1-23 (where chr23 is parsed as chrX), break if all 23 chromosomes are passed into the LD pruning task? That being said, LD pruning not only has autosome_only on by default, it also skips chr23 by default!
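
The behaviour in the log can be sketched in Python. This is an assumption about the mechanism, not the R script's actual code: a chrX-only input of 985 variants is filtered with autosome_only on, nothing remains, and the pruning call raises instead of writing an empty output. The names `select_variants` and `ld_prune` are hypothetical.

```python
AUTOSOMES = {str(i) for i in range(1, 23)}  # chromosomes 1-22


def select_variants(variants, autosome_only=True):
    """Keep (chromosome, position) pairs, dropping non-autosomes if asked."""
    if not autosome_only:
        return list(variants)
    return [v for v in variants if v[0] in AUTOSOMES]


def ld_prune(variants, autosome_only=True):
    """Hypothetical pruning entry point; only the selection step is modelled."""
    selected = select_variants(variants, autosome_only)
    if not selected:
        # Mirrors the message seen in job.error.log.
        raise ValueError("No variants selected.")
    return selected


if __name__ == "__main__":
    chr_x_variants = [("X", pos) for pos in range(1, 986)]  # 985 variants, as in the log
    print(f"Using {len(chr_x_variants)} variants")
    try:
        ld_prune(chr_x_variants)
    except ValueError as err:
        print("Error:", err)
    kept = ld_prune(chr_x_variants, autosome_only=False)
    print(len(kept), "variants kept with autosome_only off")
```

If the real script works this way, the failure happens before any output is written, which would explain why the error surfaces in R rather than in the CWL's output check.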

@aofarrel aofarrel added the help wanted Extra attention is needed label May 10, 2021
@aofarrel aofarrel added this to Low priority in Issue Triage Jun 10, 2021
@aofarrel
Collaborator Author

aofarrel commented Jun 15, 2021

autosome_only has been added to the CWL, but its behavior seems a little confusing.

Iff SB does not treat tooldefaultvalue as something that is actually passed in, i.e. it's just for the user's information, then:

  • user does not set autosome_only: config says false
  • user sets autosome_only to true: config does not say anything, Rscript treats as true
  • user sets autosome_only to false: config says false
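
The three cases above can be written out as a small Python model. It encodes only the behaviour described in the list, under the same assumption that tooldefaultvalue is informational. `render_config_line` and `effective_autosome_only` are hypothetical names, the literal `autosome_only FALSE` line format is illustrative, and the R script's default of true is taken from the first comment in this thread.

```python
from typing import Optional

R_SCRIPT_DEFAULT = True  # assumed: the R script treats a missing autosome_only as true


def render_config_line(user_value: Optional[bool]) -> Optional[str]:
    """Return the config line the CWL would write, or None if it writes nothing."""
    if user_value is True:
        return None  # true is left out of the config entirely
    # Both "not set" (None) and an explicit false end up as false in the config.
    return "autosome_only FALSE"


def effective_autosome_only(config_line: Optional[str]) -> bool:
    """What the R script ends up using, given the rendered config line."""
    if config_line is None:
        return R_SCRIPT_DEFAULT
    return config_line.split()[1].upper() == "TRUE"


if __name__ == "__main__":
    for label, value in [("not set", None), ("true", True), ("false", False)]:
        line = render_config_line(value)
        print(f"user {label}: config={line!r}, effective={effective_autosome_only(line)}")
```

Under this model, an unset input ends up as false rather than falling back to the R script's default of true.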

Screenshot 2021-06-15 at 11 57 51 AM

Screenshot 2021-06-15 at 11 57 19 AM

@aofarrel
Collaborator Author

aofarrel commented Jul 8, 2021

Tested on SB, and the results confirm what I guessed from looking over the code. Setting nothing should default to true, but the config prints false. Setting it to false sets the config to false. Setting it to true leaves it out of the config.

This will be added to the WDL once CWL is modified to clarify what the default actually should be.

@aofarrel aofarrel moved this from Low priority to Won't Fix/Unfixable in Issue Triage Jul 21, 2021
@aofarrel aofarrel changed the title from "[b] ld_pruning is hardcoded to not work on chrX" to "[ld prune] ld_pruning task is hardcoded to not work on chrX" Feb 1, 2022