Skip to content

LDpred2: Append score to existing output file#150

Merged
deepchocolate merged 19 commits intomainfrom
deepchocolate/append_score
Apr 19, 2023
Merged

LDpred2: Append score to existing output file#150
deepchocolate merged 19 commits intomainfrom
deepchocolate/append_score

Conversation

@deepchocolate
Copy link
Contributor

@deepchocolate deepchocolate commented Mar 24, 2023

Fixes #144
Fixes #151
Fixes #153

Changes proposed in this pull request:

  • Added the possibility to append a score to an existing output file using the option --out-merge and optionally --out-merge-ids (if ID columns are not IID and FID).
  • Moved code into functions
  • Fixed error when character values are present in the chromosome column (filtered out)
  • Added tests to cover this and existing functionality
  • Added missing sumstat snippets to repo

I note that it is possible that this functionality may raise conflicts if the user runs several scorings in parallel, and both processes read the existing data just prior to the writing of a parallell process, but I'm guessing the chance of that happening is low enought to outweigh the benefits of this functionality.

Before submitting

  • I've read and followed all steps in the Making a pull request
    section of the CONTRIBUTING docs.
  • I've updated or added any relevant docstrings following the syntax described in the
    Writing docstrings section of the CONTRIBUTING docs.
  • If this PR fixes a bug, I've added a test that will fail without my fix.
  • If this PR adds a new feature, I've added tests that sufficiently cover my new functionality.

@deepchocolate deepchocolate marked this pull request as ready for review March 27, 2023 13:44
@deepchocolate deepchocolate requested review from espenhgn and ofrei March 27, 2023 13:46
@deepchocolate
Copy link
Contributor Author

deepchocolate commented Mar 29, 2023

Added clarification of documentation, and an incorrect error message in complementSumstats() in fun.R

Copy link
Contributor

@espenhgn espenhgn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks good, but some file is missing with the test script (see above), however.

@deepchocolate deepchocolate requested a review from espenhgn April 4, 2023 12:02
Copy link
Contributor

@espenhgn espenhgn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @deepchocolate;
I encountered another issue with a missing file:

### Testing ldpred2.R (various options, manually downloaded LD, output merge)
Downloading LD map
--2023-04-11 07:25:29--  https://figshare.com/ndownloader/files/37802721
Resolving figshare.com (figshare.com)... 2a05:d018:1f4:d000:e143:4beb:37ea:9af4, 2a05:d018:1f4:d003:c91e:f786:d93e:5593, 34.248.204.94, ...
Connecting to figshare.com (figshare.com)|2a05:d018:1f4:d000:e143:4beb:37ea:9af4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://s3-eu-west-1.amazonaws.com/pfigshare-u-files/37802721/map_hm3_plus.rds?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIYCQYOYV5JSSROOA/20230411/eu-west-1/s3/aws4_request&X-Amz-Date=20230411T072529Z&X-Amz-Expires=10&X-Amz-SignedHeaders=host&X-Amz-Signature=eec81eeb428c5ed9eeb0e896f415e27472906810f9a5364e86ab08fa01133a65 [following]
--2023-04-11 07:25:29--  https://s3-eu-west-1.amazonaws.com/pfigshare-u-files/37802721/map_hm3_plus.rds?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIYCQYOYV5JSSROOA/20230411/eu-west-1/s3/aws4_request&X-Amz-Date=20230411T072529Z&X-Amz-Expires=10&X-Amz-SignedHeaders=host&X-Amz-Signature=eec81eeb428c5ed9eeb0e896f415e27472906810f9a5364e86ab08fa01133a65
Resolving s3-eu-west-1.amazonaws.com (s3-eu-west-1.amazonaws.com)... 52.218.37.43, 52.218.26.107, 52.92.16.80, ...
Connecting to s3-eu-west-1.amazonaws.com (s3-eu-west-1.amazonaws.com)|52.218.37.43|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 43814795 (42M) [application/octet-stream]
Saving to: '/nrec/space/espenh/containers/usecases/LDpred2_test/data/ld/map_hm3_plus.rds'

/nrec/space/espenh/containers/usecases/LDpred2_t 100%[=========================================================================================================>]  41.78M  51.0MB/s    in 0.8s    

2023-04-11 07:25:30 (51.0 MB/s) - '/nrec/space/espenh/containers/usecases/LDpred2_test/data/ld/map_hm3_plus.rds' saved [43814795/43814795]

Test error: Character in chromosome column
Loading backingfile: /nrec/space/espenh/containers/usecases/LDpred2_test/data/EUR_imputed.rds 

### Reading LD reference meta-file from  /nrec/space/espenh/containers/usecases/LDpred2_test/data/ld/ldref/map.rds 
Error in gzfile(file, "rb") : cannot open the connection
Calls: readRDS -> gzfile
In addition: Warning message:
In gzfile(file, "rb") :
  cannot open compressed file '/nrec/space/espenh/containers/usecases/LDpred2_test/data/ld/ldref/map.rds', probable reason 'No such file or directory'
Execution halted

Are you pointing to the correct file, or did it fail to process/download?

@deepchocolate
Copy link
Contributor Author

@espenhgn
It's that the script downloads the map_hm3_plus.rds which is actually not used in the scripts. I think that's because I didn't find any file with LD blocks for this map. So I kept working with the old maps (map.rds), but didn't notice the error as that file was always available in my development environment. But I've added the download of the old map for now so it should solve your error.

@espenhgn
Copy link
Contributor

@espenhgn It's that the script downloads the map_hm3_plus.rds which is actually not used in the scripts. I think that's because I didn't find any file with LD blocks for this map. So I kept working with the old maps (map.rds), but didn't notice the error as that file was always available in my development environment. But I've added the download of the old map for now so it should solve your error.

Hi again. Sorry, the file is still missing. If you have issues due to the .gitignore file it may be forcefully added using the syntax git add --force my/ignore/file.foo. Hope that works :)

@deepchocolate
Copy link
Contributor Author

@espenhgn It's that the script downloads the map_hm3_plus.rds which is actually not used in the scripts. I think that's because I didn't find any file with LD blocks for this map. So I kept working with the old maps (map.rds), but didn't notice the error as that file was always available in my development environment. But I've added the download of the old map for now so it should solve your error.

Hi again. Sorry, the file is still missing. If you have issues due to the .gitignore file it may be forcefully added using the syntax git add --force my/ignore/file.foo. Hope that works :)

Hopefully fixed now. I've removed downloading of maps, and used maps inside the lpred2_ref repository. However, the tests now assume that this repository is available in the parent directory of this repository. I forgot that the unittests for the tutorial data actually doesn't use these files, because it calculates LD instead so those tests are unaffected.

Copy link
Contributor

@espenhgn espenhgn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi again, there is now another test that fails on my end:

Test appending score to output file
Runing unittests on output
-- Failure (???): Appended score output ----------------------------------------
hasAllColumns(tmp, cnamesExp) is not TRUE

`actual`:   FALSE
`expected`: TRUE 

Error in reporter$stop_if_needed() : Test failed
Calls: test_that -> <Anonymous>
Execution halted

@deepchocolate
Copy link
Contributor Author

Hi again, there is now another test that fails on my end:

Test appending score to output file
Runing unittests on output
-- Failure (???): Appended score output ----------------------------------------
hasAllColumns(tmp, cnamesExp) is not TRUE

`actual`:   FALSE
`expected`: TRUE 

Error in reporter$stop_if_needed() : Test failed
Calls: test_that -> <Anonymous>
Execution halted

Quite "funny", I had the same test fail. In my case it was the run of lpred2 prior to that test that failed without throwing the right error code. When I inspected it seemed that the process was terminated by the Ubuntu (22) in some way, likely due to a memory issue. Chromium also crashed simultaneously.

Can you do an echo of dump to see if there's anything weird happening?

dump=$( { $LDP --ldpred-mode inf --name-score $scoreNameNew --out $fileExisting --out-merge; } 2>&1 )

@espenhgn
Copy link
Contributor

Lack of memory could be a culprit. The "devbox" environment I use only have 8GB of RAM. I'll check dump.

@deepchocolate deepchocolate merged commit c36ae83 into main Apr 19, 2023
@deepchocolate deepchocolate deleted the deepchocolate/append_score branch April 25, 2023 13:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants