Initial commit of read.cross.tidy() #20

Closed
aaronwolen wants to merge 12 commits

aaronwolen
Contributor

Hi Karl,

This PR adds support for a new read.cross() format I'm calling "tidy" because genotype, phenotype and map data are stored in separate files using a standard data.frame-like format:

Phenotypes:

|      | 1     | 2   | 3     | 4   | 5     |
|------|-------|-----|-------|-----|-------|
| T264 | 118.3 | 264 | 194.9 | 264 | 145.4 |

Genotypes:

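|        | 1 | 2 | 3 | 4 | 5 |
|--------|---|---|---|---|---|
| D10M44 | B | - | - | B | H |
| D1M3   | B | B | H | B | H |
| D1M75  | B | B | H | H | H |
| D1M215 | H | B | H | H | H |
| D1M309 | H | H | H | H | B |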

Map:

|        | chr | cm    |
|--------|-----|-------|
| D10M44 | 1   | 0     |
| D1M3   | 1   | 0.996 |
| D1M75  | 1   | 24.85 |
| D1M215 | 1   | 40.41 |
| D1M309 | 1   | 49.99 |

The advantage of this format is that each file can be loaded directly into R with read.table() for analysis or visualization outside of R/qtl. For projects using R/qtl I usually maintain two sets of genotype/phenotype files: one for general analysis and one formatted for R/qtl. Adding something like read.cross.tidy() would let me avoid that redundancy, so that's my selfish motivation behind this PR.
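
As a rough illustration of that point, the genotype and map tables above can be read with read.table() alone. This is only a sketch: it assumes whitespace-delimited text and treats "-" as the missing-genotype code, neither of which is specified in this thread, and it reads from inline strings rather than real files.

```r
# In-memory copies of the genotype and map tables shown above.
# Whitespace-delimited layout is an assumption for this sketch.
geno_txt <- "1 2 3 4 5
D10M44 B - - B H
D1M3 B B H B H
D1M75 B B H H H
D1M215 H B H H H
D1M309 H H H H B"

map_txt <- "chr cm
D10M44 1 0
D1M3 1 0.996
D1M75 1 24.85
D1M215 1 40.41
D1M309 1 49.99"

# The header line has one fewer field than the data rows, so read.table()
# uses the first column (marker names) as row names.
geno <- read.table(text = geno_txt, header = TRUE, check.names = FALSE,
                   na.strings = "-", stringsAsFactors = FALSE)
map <- read.table(text = map_txt, header = TRUE)

# Markers are rows and individuals are columns.
dim(geno)

# Fraction of missing genotypes per marker.
rowMeans(is.na(geno))

# Genotypes arranged in map order.
geno[rownames(map), ]
```

Setting check.names = FALSE keeps the numeric individual IDs as column names instead of letting R rewrite them as X1, X2, and so on.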

I'd call it draft code at this point; it works but lacks the extensive data checks found in the other read.cross.*() functions. I can bring it up to parity but wanted to gauge your level of interest before proceeding.

@kbroman
Owner

kbroman commented Aug 23, 2014

Great idea! I'll merge it into the devel branch. I'm reserving the master branch for the latest release.

@aaronwolen
Contributor Author

Cool! I'll integrate it with read.cross() and add some of the missing data checks. Let me know if you have any other suggestions.

@aaronwolen
Contributor Author

I made some improvements to the original PR:

  • added write.cross.tidy()
  • added "tidy" support to read.cross() and write.cross()
  • NA columns are inserted for individuals missing in the genotype or phenotype files
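
The NA-padding idea in the last item could look roughly like the following in base R. This is not the code from the PR: pad_individuals() is a hypothetical helper, the data are made up, and the individuals-as-columns layout is taken from the tables earlier in the thread.

```r
# Hypothetical data: individual 3 is absent from the phenotypes and
# individual 5 is absent from the genotypes.
pheno <- read.table(text = "1 2 4 5
T264 118.3 264 264 145.4",
                    header = TRUE, check.names = FALSE)

geno <- read.table(text = "1 2 3 4
D10M44 B - - B
D1M3 B B H B",
                   header = TRUE, check.names = FALSE,
                   na.strings = "-", stringsAsFactors = FALSE)

# Illustrative helper (not an R/qtl function): add an all-NA column for
# every individual in `ids` that `x` lacks, then order columns by `ids`.
pad_individuals <- function(x, ids) {
  for (id in setdiff(ids, colnames(x))) {
    x[[id]] <- NA
  }
  x[, ids, drop = FALSE]
}

# Individuals present in either file.
ids <- sort(union(colnames(pheno), colnames(geno)))

pheno <- pad_individuals(pheno, ids)
geno <- pad_individuals(geno, ids)
```

After padding, both tables have the same individuals in the same column order, so they can be combined without dropping anyone.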

I wrote a few simple tests to ensure that a cross created with read.cross.tidy() is identical to one created with an existing read.cross.*() function, and to verify that cross data can be round-tripped:

files -> `read.cross.tidy()` -> `write.cross.tidy()` -> files -> `read.cross.tidy()`
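
The shape of such a round-trip check can be sketched with base R only. The arguments of read.cross.tidy() and write.cross.tidy() are not given in this thread, so the sketch below substitutes write.csv() and read.csv() on a plain genotype table. It shows the testing pattern, not the actual tests.

```r
# A small genotype table in the tidy layout; "-" as the missing code is
# an assumption for this sketch.
geno <- read.table(text = "1 2 3
D10M44 B - -
D1M3 B B H",
                   header = TRUE, check.names = FALSE,
                   na.strings = "-", stringsAsFactors = FALSE)

# Write the table to a temporary file, then read it back.
path <- tempfile(fileext = ".csv")
write.csv(geno, path, na = "-", quote = FALSE)
geno2 <- read.csv(path, row.names = 1, check.names = FALSE,
                  na.strings = "-", stringsAsFactors = FALSE)

# The round trip should reproduce the original table.
round_trip_ok <- isTRUE(all.equal(geno, geno2))
stopifnot(round_trip_ok)

unlink(path)
```

Comparing with all.equal() rather than identical() tolerates harmless differences in how attributes such as row names are stored internally.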

I didn't want to muck up your existing testing infrastructure so I kept my tests in a separate branch, which isn't part of this PR.

@kbroman
Owner

kbroman commented Sep 3, 2014

Thanks, @aaronwolen! I've incorporated your code into the devel branch. I also added a test and added a small bit to the documentation for read.cross and write.cross.

@kbroman kbroman closed this Sep 3, 2014