Align Error: seq contains letters not in the alphabet #70

Closed
LeoBusse opened this issue Jun 2, 2021 · 2 comments
LeoBusse commented Jun 2, 2021

Hello, thank you for developing a great tool!

I've been trying to figure out where the "ValueError: sequence contains letters not in the alphabet" error comes from when I run my .gbf/.gb files through Clinker. I went through issue #68 and reinstalled Clinker 0.0.21 through Conda, to no avail. I also tried the pip install, but that didn't fix the problem either. I double-checked the align.py script on my local computer and it has the extend_matrix_alphabet addition, so I'm not sure what to do. You mentioned that a quick fix would be to go through the sequence and delete anything that isn't part of the extended IUPAC alphabet. Is there a particular way you recommend doing this? I have several sequences, so it seems like it would take a long time to find what is wrong in each one (I would be looking for numbers, right?).

I attached an image with the traceback in case it's helpful.

Thank you so much!

[Image: Screen Shot 2021-06-02 at 6 05 50 PM]

gamcil (Owner) commented Jun 3, 2021

Going by your image, it looks like your sequences have gaps in them (B starts with -), so I'll have to add gap characters to the extended set. In the meantime, you could search and replace the gap characters with X, or just delete them (usually it's only a few rogue sequences). I'll try to get a fix out soon.
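
The search-and-replace workaround described above could be sketched roughly as follows. This is only an illustration, not code from clinker: `ALLOWED` and `clean_translation` are hypothetical names, and the allowed set is assumed to be the 20 standard amino acids plus the extended IUPAC codes B, X, Z, J, U and O.

```python
# Hypothetical helper, not part of clinker.
# Assumption: the allowed letters are the 20 standard amino acids plus
# the extended IUPAC protein codes B, X, Z, J, U and O.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYBXZJUO")


def clean_translation(seq, replacement="X"):
    """Replace every character outside ALLOWED with `replacement`.

    Pass replacement="" to delete the offending characters instead.
    """
    return "".join(
        ch if ch in ALLOWED else replacement
        for ch in seq.upper()
    )


print(clean_translation("-MKT-LLV"))      # XMKTXLLV
print(clean_translation("-MKT-LLV", ""))  # MKTLLV
```

The function would be applied to each protein translation taken from the GenBank files before they are passed to the tool. Replacing with X keeps the sequence length unchanged, while deleting shortens it.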

Looks like I also forgot to remove some logging calls there so thanks for reminding me haha.

LeoBusse (Author) commented Jun 3, 2021

Thank you so much!

I looked for the gaps as suggested and it works perfectly now! I really appreciate the advice.
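
For anyone facing the same error with many sequences, finding the affected records first may save time. The sketch below is a hypothetical helper (the names `ALLOWED` and `find_offending` are not from clinker) that reports which sequences contain unexpected characters, whether gaps, digits or anything else. It assumes the translations have already been read into a dictionary of name to sequence, and that the allowed set is the extended IUPAC protein alphabet.

```python
# Hypothetical helper, not part of clinker.
# Assumption: allowed letters are the extended IUPAC protein alphabet.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYBXZJUO")


def find_offending(sequences):
    """Map each sequence name to the sorted unexpected characters it contains.

    Sequences made up only of allowed letters are left out of the report.
    """
    report = {}
    for name, seq in sequences.items():
        bad = sorted(set(seq.upper()) - ALLOWED)
        if bad:
            report[name] = bad
    return report


# Made-up example data for illustration only
example = {"geneA": "MKTLLV", "geneB": "-MKT1LV"}
print(find_offending(example))  # {'geneB': ['-', '1']}
```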

gamcil closed this as completed Jun 20, 2021