sequence alignment (any specific filtering needed or not)? #3

Open
curiousa opened this issue Apr 30, 2019 · 1 comment

Comments

@curiousa

Hello there,

I could run pconsC4 successfully for some FASTA alignments but not for others. I couldn't work out why certain alignments are rejected (see the error message below). Could you please let me know whether there are any constraints on alignment length, number of sequences, etc.?

Thanks,
Alina

The error message I get is the following:

pred_3 = pconsc4.predict(model, 'cadtry2.fasta')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/ifs/scratch/c2b2/bh_lab/aps2174/anaconda3/lib/python3.6/site-packages/pconsc4/run.py", line 97, in predict
return predict_contacts(model, alignment, verbose)
File "/ifs/scratch/c2b2/bh_lab/aps2174/anaconda3/lib/python3.6/site-packages/pconsc4/run.py", line 101, in predict_contacts
feat_dict, L = _generate_features(alignment, verbose)
File "/ifs/scratch/c2b2/bh_lab/aps2174/anaconda3/lib/python3.6/site-packages/pconsc4/run.py", line 37, in _generate_features
self_info, part_entr, seq = process_a3m(fname)
File "pconsc4/parsing/_load_data.pyx", line 184, in pconsc4.parsing._load_data.process_a3m
File "pconsc4/parsing/_load_data.pyx", line 67, in pconsc4.parsing._load_data.load_a3m
ValueError: setting an array element with a sequence.

@Dapid
Collaborator

Dapid commented Jun 20, 2019

Sorry, I missed this issue when you submitted it. There are no limits on depth or size, but the alignments must not be line-wrapped: that is, each sequence must be contained in one line. Your error suggests that your sequences are wrapped across several lines, like this:

>seq1
AAAFAPA
AA
>seq2
AABAPA
AB

Here are some examples: https://github.com/ElofssonLab/pyGaussDCA/tree/master/tests/data

If that is not the case, can you upload a failing example so I can look at it?
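
If the file does turn out to be wrapped, something along these lines could check for it and rewrite it with one sequence per line. This is only a rough sketch, not part of pconsc4: the helper names (`is_wrapped`, `unwrap_fasta`, `write_unwrapped`) and the output file name are made up for illustration, and it assumes a plain FASTA/a3m text file where every header line starts with `>`.

```python
def is_wrapped(lines):
    """Return True if any record spans more than one sequence line."""
    seq_lines = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            seq_lines = 0  # a new record starts here
        else:
            seq_lines += 1
            if seq_lines > 1:
                return True
    return False


def unwrap_fasta(lines):
    """Return (header, sequence) pairs with wrapped sequence lines joined."""
    records = []
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line
            chunks = []
        elif header is not None:
            chunks.append(line)  # continuation of the current sequence
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def write_unwrapped(src_path, dst_path):
    """Hypothetical helper: copy src_path to dst_path, one sequence per line."""
    with open(src_path) as src:
        records = unwrap_fasta(src)
    with open(dst_path, "w") as dst:
        for header, seq in records:
            dst.write(f"{header}\n{seq}\n")
```

For example, `write_unwrapped('cadtry2.fasta', 'cadtry2_unwrapped.fasta')` would write a new file (the second name is just a placeholder) that can then be passed to `pconsc4.predict` in place of the original.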
