simplest vs bfile_path SAMPLESHEET #77
Comments
An example sample sheet is here: https://github.com/PGScatalog/pgsc_calc/blob/main/assets/examples/samplesheet.csv If your .bed/.fam/.bim files all start with plink_genome_test1 and contain all chromosomes, your sample sheet would be:
If the data is split across chromosomes, the files should have slightly longer root names but the same sampleset ID. An example for chromosomes 1 and 2:
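As a rough illustration of both layouts, the Python sketch below writes a sample sheet for a single fileset and for a fileset split by chromosome. The column names other than sampleset and bfile_path, the blank chrom value for a combined fileset, and the _chr1/_chr2 root names are assumptions based on the linked example, so check them against the example file for the pgsc_calc version you are running.

```python
import csv
import io

# Assumed column layout; verify against the linked example samplesheet.
COLUMNS = ["sampleset", "vcf_path", "bfile_path", "pfile_path", "chrom"]


def samplesheet_rows(sampleset, prefixes_by_chrom):
    """Build one row per PLINK 1 fileset.

    prefixes_by_chrom maps a chromosome label to a path prefix
    (no .bed/.bim/.fam extension). The key "" stands for a fileset
    that contains every chromosome (assumed convention).
    """
    return [
        {
            "sampleset": sampleset,
            "vcf_path": "",
            "bfile_path": prefix,
            "pfile_path": "",
            "chrom": chrom,
        }
        for chrom, prefix in prefixes_by_chrom.items()
    ]


def to_csv(rows):
    """Render the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


base = "/Users/myname/Desktop/PLINK/plink_mac"

# One fileset holding all chromosomes.
combined = samplesheet_rows(
    "plink_genome_test1", {"": f"{base}/plink_genome_test1"}
)

# Hypothetical per-chromosome root names; the sampleset ID stays the same.
split = samplesheet_rows(
    "plink_genome_test1",
    {
        "1": f"{base}/plink_genome_test1_chr1",
        "2": f"{base}/plink_genome_test1_chr2",
    },
)

print(to_csv(combined))
print(to_csv(split))
```

The point to take from the sketch is that the sampleset value is a label you choose and keep identical across rows, while bfile_path carries the file prefix.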
Dear @smlmbrt, thanks a lot for the quick response! It worked this time. The problem now seems to be that "the Score PGS001927_hmPOS_GRCh37 fails minimum matching threshold (22.01% variants match)". Is there any way to overcome that? (I already tried switching to GRCh38, where the match rate is even lower.) Thanks a lot again!
You can adjust the --min_overlap flag to score on the available 22% of variants in the score; however, it's probably best to investigate why the data is missing so many variants. Some options:
Dear @smlmbrt, thank you again very much for the info and the prompt reply! It now seems to have worked. However, I get the error: Error: --score variant ID '1:89479074:C:T' appears multiple times in main Strangely, I have checked the .bim files and the PGS scoring file, and that variant ID is not there. Thank you very much again!
This is because the pipeline relabels the variants for consistency and scoring file formatting. If you do a grep/lookup by position in the .bim file, you'll likely see multiple rows with those alleles. Implementing the response in this issue may fix the problem: #74 (comment)
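The lookup by position can be sketched in Python as follows. This is a minimal example, not the pipeline's own relabelling code: it groups .bim rows by chromosome, base-pair position and the sorted allele pair (an assumed stand-in for the relabelled ID, ignoring allele order) and reports any group with more than one row. The rs IDs in the sample data are hypothetical.

```python
from collections import defaultdict


def duplicated_variants(bim_lines):
    """Return {key: [original variant IDs]} for rows sharing a key.

    .bim columns are: chromosome, variant ID, cM position,
    base-pair position, allele 1, allele 2.
    The key is chrom:pos plus the two alleles in sorted order, which
    only approximates the relabelled ID (assumption).
    """
    seen = defaultdict(list)
    for line in bim_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        chrom, var_id, _cm, pos, a1, a2 = fields[:6]
        key = f"{chrom}:{pos}:" + ":".join(sorted([a1, a2]))
        seen[key].append(var_id)
    return {k: v for k, v in seen.items() if len(v) > 1}


# Hypothetical rows: two different IDs at the same position and alleles.
example = [
    "1 rs111 0 89479074 T C",
    "1 rs222 0 89479074 T C",
    "1 rs333 0 89480000 G A",
]
print(duplicated_variants(example))
```

To run it on real data, pass an open file handle instead of the list, for example `with open("plink_genome_test1.bim") as fh: duplicated_variants(fh)`.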
Hello,
Thanks again for the amazing tool and the amazing documentation.
I have been preparing the samplesheet and I believe I am getting the "sampleset" text string wrong.
I have my population in plink.bed, plink.fam and plink.bim files.
I understand the "bfile_path" is:
/Users/myname/Desktop/PLINK/plink_mac/plink_genome_test1
Could the "sampleset" then be "plink_genome_test1"? If not, what should the sampleset look like?
Thanks a lot in advance for your time, and apologies for the inconvenience.
Best