Read in data from getCausalSNPs() #10

Closed
blairzhang126 opened this issue Dec 21, 2018 · 3 comments

@blairzhang126 blairzhang126 commented Dec 21, 2018

Hi, Hannah,

Sorry to bother you again. I ran into a problem when reading data with getCausalSNPs(). My R version is 3.5.0. It seems that I have to run the same command twice to get the output in the format I want. To give you an idea of what I did:

library(PhenotypeSimulator)
set.seed(1234)
causalSNPsFromLines <- getCausalSNPs(10000,NrCausalSNPs=10,chr=22,genoFilePrefix="test_time_",genoFileSuffix="_10000_case_maf0.01_ld0.2_100.recodeA.raw.transpose",format="delim",delimiter=" ")

First round output:

head(causalSNPsFromLines)
X rs470766_C rs2329553_T rs8136076_T rs56150635_C 22-36398040_A
ID_1 "id2_10000" "0" "1" "0" "0" "0"
ID_2 "id2_10001" "0" "0" "1" "0" "0"
ID_3 "id2_10002" "0" "1" "0" "0" "0"
ID_4 "id2_10003" "0" "1" "0" "0" "0"
ID_5 "id2_10004" "0" "1" "0" "0" "0"
ID_6 "id2_10005" "1" "1" "0" "0" "0"

Here I don't want the quotes, and I don't want the IDs to appear twice.
If I run the exact same command again, I get the correct format, like this:

head(causalSNPsFromLines)
rs481709_T rs2329553_T rs4821946_G rs5759481_G rs957648_C rs3876055_A
ID_1 1 1 0 0 1 0
ID_2 0 0 0 0 1 1
ID_3 2 1 0 0 0 2
ID_4 2 1 0 0 2 0
ID_5 0 1 0 1 1 2
ID_6 1 1 0 0 0 0

I would rather not have to run the exact same command twice to get what I want. I just wanted to know whether you have encountered this before. I have uploaded my data to GitHub in case you want to look at it or test it:
https://github.com/blairzhang126/phenosim-sampledata (it's a space-delimited file.)

Best,
Blair


@blairzhang126 blairzhang126 commented Dec 21, 2018

Hi, Hannah, no worries! I found the problem!! I'm posting it here in case other people are wondering. (feel free to close the issue)

It turns out that only the random seed 1234 causes the problem. I tried 123 and 12345, and both gave me the correct format. That is why I had to run the command twice to get the correct format: by the second run, the random number generator is no longer in the state set by seed 1234. A run that starts directly from seed 1234 will not work!

I hope this is not confusing to people; feel free to comment if you have any questions.
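
The seed behaviour described above can be sketched in base R. This is a minimal illustration, not the package's code: `n_lines` is a made-up stand-in for the number of lines in a genotype file, and `sample()` is used only as an example of a function that consumes random numbers.

```r
# Illustrative only: 100 stands in for the number of lines in a genotype file.
n_lines <- 100

set.seed(1234)
first_draw  <- sample(n_lines, 10)  # uses the state set by the seed
second_draw <- sample(n_lines, 10)  # uses the state left behind by the first call

set.seed(1234)
repeat_draw <- sample(n_lines, 10)  # seed reset, so this repeats first_draw

identical(first_draw, repeat_draw)  # TRUE
first_draw
second_draw                         # drawn from an advanced RNG state
```

Running a command a second time without calling `set.seed()` again therefore samples a different set of lines, which is consistent with the second run avoiding the problematic line.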

@HannahVMeyer HannahVMeyer self-assigned this Mar 17, 2019
@HannahVMeyer HannahVMeyer added the bug label Mar 17, 2019
HannahVMeyer added a commit that referenced this issue Mar 17, 2019
HannahVMeyer added a commit that referenced this issue Mar 17, 2019

@HannahVMeyer HannahVMeyer commented Mar 17, 2019

Hi Blair,

thank you for pointing me to this issue and providing the sample data!

The problem occurred because getCausalSNPs was not designed to handle a header line when sampling from a delimited file:

If format== delim, the first column in each file needs to be the SNP_ID and files cannot contain a header. (from Details in ?getCausalSNPs)

The seed you chose happened to lead to sampling of the first row in the file, which was the header.
I have now included an option to specify whether the file contains a header, along with additional checks to make sure the right data is received when sampling from the genotypes file. This is available now in the current GitHub version (v0.3.2). I will keep this issue open until the fix is also up on CRAN.

Thank you for raising this issue,
Hannah
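
The effect of sampling a header line can be reproduced with a small base R sketch. Everything here is an assumption made for illustration: the toy file contents, the SNP names, and the helper `parse_lines()` are hypothetical and are not taken from PhenotypeSimulator's implementation. The sketch only shows why one text line among the sampled lines turns the whole result into quoted strings, and one simple way to keep the header out of the sampling pool.

```r
# Hypothetical toy file: a header line followed by one line per SNP,
# with the SNP ID in the first column (space-delimited).
toy_file <- tempfile(fileext = ".txt")
writeLines(c(
  "X ID_1 ID_2 ID_3",
  "rs1_A 0 1 2",
  "rs2_C 1 1 0",
  "rs3_T 2 0 0"
), toy_file)

all_lines <- readLines(toy_file)

# Illustrative helper (not a PhenotypeSimulator function): turn the selected
# lines into a samples-by-SNPs matrix, using the first field as column name.
parse_lines <- function(lines) {
  fields <- strsplit(lines, " ", fixed = TRUE)
  snp_ids <- vapply(fields, `[`, character(1), 1)
  values <- lapply(fields, function(f) type.convert(f[-1], as.is = TRUE))
  mat <- do.call(cbind, values)
  colnames(mat) <- snp_ids
  mat
}

with_header    <- parse_lines(all_lines[c(1, 2)])  # header line was "sampled"
without_header <- parse_lines(all_lines[c(2, 3)])  # only SNP lines

typeof(with_header)     # "character": the header text forces strings everywhere
typeof(without_header)  # "integer"

# One simple guard: sample only from the data lines, never from line 1.
set.seed(1234)
data_idx <- sample(seq_along(all_lines)[-1], 2)
```

In the first case the result has a column named `X` holding the sample IDs and every genotype is stored as a quoted string, which matches the shape of the unexpected output reported above.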


@HannahVMeyer HannahVMeyer commented May 15, 2019

The latest release including this fix is now on CRAN (v0.3.3), so I am closing this.
