Beauti 2.0.2 not accepting .nex file #45

Closed
tgvaughan opened this issue Jan 19, 2014 · 6 comments

Comments

@tgvaughan
Member

Reported by shvetanks, Jun 12, 2013

What steps will reproduce the problem?

  1. Import a Clustal Omega (EBI server) alignment that was saved as .nex in ClustalX. BEAUti reports the error "Something went wrong importing the alignment: Around line 2549
    Unknown code found in sequence: X"
  2. There is no X sequence.
  3. The same file works fine in Beauti 1.7.5.

What is the expected output? What do you see instead?

The file should be listed in Partitions with details of its sequences. Instead, the following error is shown: "Something went wrong importing the alignment:
Around line 2549
Unknown code found in sequence: X"

What version of the product are you using? On what operating system?

The error occurs in Beauti 2.0.2 on Mac OS X 10.8.3.

Please provide any additional information below.

This is an amino acid multiple alignment file.
Changing the extension to .nex or .nexus after conversion does not help.

Additional comments from GC issue

Sep 11, 2013
Project Member #1 dong.w.xie

Could you please attach an example of the .nex file? I cannot replicate the problem using our example.

Thanks,
Walter

@rbouckaert
Member

This may be due to the nexus file using X to represent missing values. Both nucleotide and amino acid data types should probably be able to deal with this.

@alexeid
Member

alexeid commented Feb 16, 2014

X is not a valid code for missing data in DNA/RNA. We should stick to established codes. It might be valid for amino acid sequences but that should be confirmed before we allow it.

Perhaps a better error message (i.e. popup saying illegal character in sequence) would be sufficient?

Cheers
Alexei
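
The suggestion of a clearer error message could look roughly like the sketch below. It is not BEAUti's importer: the function names, the alphabet constant, and the message wording are all hypothetical, and the allowed set is just the 20 standard amino acids plus gap and missing symbols. The idea is only that the message names the taxon, the site, and the offending character rather than an approximate line number.

```python
# Hypothetical validator; none of these names come from BEAST or BEAUti.
AMINO_ACID_CODES = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard residues
GAP_AND_MISSING = set("-?")


def find_illegal_characters(taxon, sequence, allowed):
    """Return (taxon, 1-based site, character) for every symbol not in `allowed`."""
    problems = []
    for index, char in enumerate(sequence.upper()):
        if char not in allowed:
            problems.append((taxon, index + 1, char))
    return problems


def format_error(problems):
    """Build a message naming the taxon, the site, and the offending character."""
    lines = ["Illegal character(s) found while importing the alignment:"]
    for taxon, site, char in problems:
        lines.append(f"  taxon '{taxon}', site {site}: '{char}'")
    return "\n".join(lines)


problems = find_illegal_characters("seq1", "MKTXLV", AMINO_ACID_CODES | GAP_AND_MISSING)
print(format_error(problems))
```

A GUI could show the same text in a popup; whether X should be in the allowed set at all is the open question in this thread.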


@rbouckaert
Member

http://en.wikipedia.org/wiki/FASTA_format suggests X is used as "masked" for nucleotide, and "any" for amino acid coding. Since "X" is not used for anything else in nucleotide and amino acid encoding in BEAST, we can keep to the same convention as for FASTA, and interpret an "X" the same as a question mark.
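
As a minimal sketch of this proposal (and only the proposal, which is disputed in this thread for nucleotide data), an importer could rewrite X to the missing-data symbol before the sequence reaches the data type check. The function name and the data type labels are hypothetical and do not correspond to BEAST classes.

```python
def x_as_missing(sequence, datatype):
    """Replace 'X'/'x' with '?' for nucleotide and amino acid data; leave other types alone."""
    if datatype not in ("nucleotide", "aminoacid"):
        # Other data types may use X as a real state, so do not touch them.
        return sequence
    return sequence.replace("X", "?").replace("x", "?")


print(x_as_missing("ACGXT", "nucleotide"))
print(x_as_missing("MKxLV", "aminoacid"))
```

Doing the substitution at import time keeps the data types themselves unchanged, at the cost of silently altering what the user's file said.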

@alexeid
Member

alexeid commented Mar 23, 2014

Let's not introduce our own conventions. IUPAC is the convention for nucleotides.

Alexei


@rbouckaert
Member

How do you suggest we interpret the 'X' in nucleotide alignments allegedly produced by ClustalX?
At the very least, the FASTA parser needs to convert 'X's.

@tgvaughan
Member Author

@alexeid seems to be right regarding X not being used this way under the IUPAC nucleotide convention (in fact, it represents "xanthine"). It does represent ambiguity in amino acid sequences though.

Still, if X is defined in FASTA this way, shouldn't the parser read it this way?
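
To make the two positions concrete, here is a rough sketch of a nucleotide lookup that is strict IUPAC by default and only accepts X when a parser explicitly opts in. The table lists the standard IUPAC nucleotide ambiguity codes; the function name, the `lenient_x` flag, folding U into T, and reading X as "any base" are assumptions for illustration, not BEAST behaviour.

```python
# Standard IUPAC nucleotide codes mapped to the bases they can stand for.
IUPAC_NUCLEOTIDE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "U": "T",  # assumption: uracil is folded into T for simplicity
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def possible_bases(code, lenient_x=False):
    """Return the bases a code can represent; X is accepted only if lenient_x is set."""
    code = code.upper()
    if code in IUPAC_NUCLEOTIDE:
        return IUPAC_NUCLEOTIDE[code]
    if code == "X" and lenient_x:
        # Hypothetical FASTA-style reading: treat X like fully missing data.
        return "ACGT"
    raise ValueError(f"'{code}' is not an IUPAC nucleotide code")


print(possible_bases("R"))
print(possible_bases("X", lenient_x=True))
```

With a switch like this, a FASTA reader could pass `lenient_x=True` while the NEXUS importer stays strict, so the convention is chosen per file format rather than globally.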
