Sequence dictionaries differ #126
Comments
By the way, you can see the full ribosomal intervals file and my script for creating it here: https://gist.github.com/slowkow/b11c28796508f03cdf4b
It would be nice to maintain a public repository of commonly used files such as this one.
The current implementation of Picard throws a new exception, discarding the information provided by htsjdk describing the difference between the two sequence dictionaries. The new exception hides information from the htsjdk exception:
The htsjdk exception includes information: https://github.com/samtools/htsjdk/blob/a30bc12df2aeb3b4312a9c236c6a639025d5b596/src/java/htsjdk/samtools/util/SequenceUtil.java#L100
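To illustrate the kind of detail that is useful in such an error, here is a minimal Python sketch that compares two sequence dictionaries and reports the first difference. It is not the htsjdk or Picard implementation; the function names `parse_sequence_dictionary` and `describe_difference` are hypothetical, and it assumes that `@SQ` header lines are tab-delimited with `SN:` and `LN:` tags.

```python
def parse_sequence_dictionary(header_text):
    """Return a list of (name, length) pairs from tab-delimited @SQ lines.

    Assumption: only lines whose first tab-separated field is exactly
    "@SQ" are treated as sequence dictionary entries.
    """
    entries = []
    for line in header_text.splitlines():
        fields = line.split("\t")
        if fields[0] != "@SQ":
            continue
        # Each remaining field looks like "SN:chr1" or "LN:249250621".
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        length = int(tags["LN"]) if "LN" in tags else None
        entries.append((tags.get("SN"), length))
    return entries


def describe_difference(dict_a, dict_b):
    """Return a message for the first mismatch, or None if both agree."""
    if len(dict_a) != len(dict_b):
        return f"sequence count differs: {len(dict_a)} vs {len(dict_b)}"
    for index, (a, b) in enumerate(zip(dict_a, dict_b)):
        if a != b:
            return f"entry {index} differs: {a} vs {b}"
    return None


# Illustrative headers: the second one uses spaces instead of tabs.
bam_header = "@HD\tVN:1.4\n@SQ\tSN:chr1\tLN:249250621\n@SQ\tSN:chr2\tLN:243199373\n"
interval_header = "@SQ SN:chr1 LN:249250621\n@SQ SN:chr2 LN:243199373\n"

message = describe_difference(
    parse_sequence_dictionary(bam_header),
    parse_sequence_dictionary(interval_header),
)
print(message)
```

In this sketch a space-delimited header yields no parsed entries at all, so the reported difference is in the sequence count even though the two headers look identical on screen.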
Can you please post the output of the following:
$ cat hg19.genome.dict
Thanks.
Yossi.
Hi Yossi, I believe I already posted it. See above. You can see my ribosomal intervals file, in full, at the gist link. The header of my bam file is already posted in the first comment of this thread, except for the single line that specifies the program which created the bam file. Here it is a second time, with that line included:
When you say |
@slowkow I only see the sequence dictionary from the ribosomal interval list, not the sequence dictionary associated with the FASTA file.
@nh13 There is no mention of a sequence dictionary associated with the FASTA file in the error, and also no mention in the documentation for Here are the chromosome names in my FASTA file:
Do you have a file
Regarding documentation and error messages, we would be grateful if you could submit a pull request so we can review (@vdauwera).
I created the file as you suggested. I also tried replacing the dictionary in my ribosomal intervals file with the generated dictionary, and I receive the same error as in my first post.
Ah, it says the interval list and |
The problem: my sequence dictionary lines are space-delimited, and they must be tab-delimited.
I apologize for the confusion, and for wasting your time.
I recommend that you document the format of the interval list file, or refer the user to an example of a properly formatted file. It would be useful to point them to a ready-to-use file like this one.
Strangely, the current documentation for
Wrong (each item is separated by a space):
Right (each item is separated by a tab):
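For anyone who hits the same problem, a small Python sketch along these lines could rewrite space-delimited `@SQ` lines as tab-delimited ones. The helper names `tabify_sq_line` and `tabify_header` are hypothetical, and the sketch assumes that no `@SQ` field value contains a space; all other lines are left untouched.

```python
import re


def tabify_sq_line(line):
    """Replace runs of spaces or tabs between @SQ fields with a single tab.

    Lines that do not start with "@SQ" are returned unchanged.
    Assumption: no @SQ field value contains a space.
    """
    if not line.startswith("@SQ"):
        return line
    body = line.rstrip("\n")
    newline = line[len(body):]  # keep the original line ending, if any
    return re.sub(r"[ \t]+", "\t", body) + newline


def tabify_header(text):
    """Apply tabify_sq_line to every line of a header or interval list."""
    return "".join(tabify_sq_line(l) for l in text.splitlines(keepends=True))


# Illustrative input: one space-delimited @SQ line.
fixed = tabify_header("@SQ SN:chr1 LN:249250621\n")
print(repr(fixed))
```

To apply it to a file, read the text, pass it through `tabify_header`, and write the result to a new file rather than overwriting the original.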
I'm running:
I get this error:
However, I copied the sequence dictionary from the BAM file. So, the sequence dictionaries are identical.
Here's the sequence dictionary along with the first 5 lines of the ribosomal intervals file: