A few issues with `genomic_converter()` function #10
Comments
Awesome, thanks Ido!
thanks
Hi Ido,
Cheers
The new version (0.0.6) now completely fails to import data while applying filters, with the following error:
I suspect it has something to do with the changes made to accommodate
Reverting back to the older version for now.
Currently checking this along with another problem I've detected. This mainly affects VCF files. Since stacks version 1.44, the position of the SNP on the haplotype/read is included in the ID column of the VCF file. As a result, the ID column is no longer unique and no longer corresponds to LOCUS; the column requires parsing to get back to the LOCUS info, which is really a pain. This should have been included in the POS column (the problem was raised on the Google group). The whitelists and blacklists created were intended to be used in R with the packages and a tidy dataset; before this stacks update, they could also be used with a stacks VCF file. I suspect the problem arises when a whitelist, blacklist or blacklist.genotype carrying locus info is used to filter the VCF file rather than the tidy dataset. I'll have to check this... Otherwise, using a tidy dataset, it works as intended.
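As a rough illustration of the parsing step described above, here is a minimal base R sketch. It assumes the ID values have the form `<locus>_<column>`; that separator, the toy table, and the column names `LOCUS` and `COL` are assumptions made for illustration, not taken from stacks or from the package.

```r
# Toy VCF-like table; the values are made up for illustration only.
vcf <- data.frame(
  CHROM = c("un", "un", "un"),
  POS   = c(101L, 135L, 560L),
  ID    = c("12_7", "12_41", "30_15"),
  stringsAsFactors = FALSE
)

# ASSUMPTION: ID is "<locus>_<column>". Everything before the trailing
# "_<digits>" is treated as the locus, the trailing digits as the SNP column.
vcf$LOCUS <- sub("_[0-9]+$", "", vcf$ID)
vcf$COL   <- as.integer(sub("^.*_", "", vcf$ID))

# A locus-level whitelist (hypothetical) can then be matched on the parsed column.
whitelist <- data.frame(LOCUS = "12", stringsAsFactors = FALSE)
vcf_kept  <- vcf[vcf$LOCUS %in% whitelist$LOCUS, ]
```

Once `LOCUS` is recovered this way, a locus-based whitelist or blacklist can be matched against the VCF rows again, which is the step that breaks when ID is compared directly.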
Works with the latest commit.
Best
Hi Thierry,
Working with the package, mainly to clean, import and convert SNP data to different formats, I've been trying to use the `genomic_converter()` function and came up with a few issues with its behaviour:
- With the `SNPRelate` format, it ignores the provided output filename and creates a date-signature based one (see related pull request).
- Using the `vcf.metadata=TRUE` argument with a VCF file resulted in an error (`object DP not found`).
- The `blacklist.id` argument can accept either a file or a data.frame object, while `blacklist.genotype` can only accept a filename containing a data.frame. I know it appears in the function documentation, but this inconsistency got me confused for a while until I double-checked the fine details. I suggest making both arguments work with R objects; it makes much more sense than relying on files.
- `snp.ld` lets you choose the first, last or a random SNP, while to me it makes sense to allow choosing a SNP that is NOT the first or the last, because the ones at the tag ends are often supported by fewer reads and are less usable in validation (if flanking primers are to be designed).

That's it for now, thanks, Ido
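To illustrate the `snp.ld` suggestion, here is a minimal base R sketch that keeps, for each locus, the SNP closest to the middle of the read. The helper `pick_middle_snp()`, the `LOCUS` and `COL` columns, and the `read_length` parameter are all hypothetical and not part of the package.

```r
# Hypothetical helper: for each locus, keep the SNP nearest the read centre.
# `snps` needs a LOCUS column and a COL column (SNP position on the read);
# `read_length` is the read length in bases.
pick_middle_snp <- function(snps, read_length) {
  centre <- read_length / 2
  dist   <- abs(snps$COL - centre)
  # Sort by locus, then by distance to the centre (ties keep input order),
  # and retain only the first row of each locus.
  snps <- snps[order(snps$LOCUS, dist), ]
  snps[!duplicated(snps$LOCUS), ]
}

# Toy data, made up for illustration.
snps <- data.frame(
  LOCUS = c("12", "12", "12", "30", "30"),
  COL   = c(3L, 48L, 92L, 10L, 80L),
  stringsAsFactors = FALSE
)
middle <- pick_middle_snp(snps, read_length = 100)
```

With these toy values the helper keeps column 48 for locus 12 and column 80 for locus 30, avoiding the SNPs sitting at the tag ends.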