Inconsistent Percentage of Identity Calculations with 'X' Characters in DIAMOND BLASTP Output #789

nvucic · 2024-02-23T21:44:46Z

When running the following command:

root@8a6cc3e3fbcb:/opt# diamond blastp -q test.fa -d test.dmnd -o out.tsv --more-sensitive --outfmt 6 qseqid sseqid full_qseq full_sseq pident qlen slen length gapopen --matrix PAM30 --no-self-hits --masking 0

produced the output:

sequence_0_counts_2260484 sequence_1_counts_226 XXXXSFFPILSYYSMSIYPSYGYTYXXXXXXXXXSHYGVWYGAM XXXQSFFPILSYYSMSIYPSYGYTYXXXXXXXXXSHYGVWYGAM 100 44 44 40 0
sequence_1_counts_226 sequence_0_counts_2260484 XXXQSFFPILSYYSMSIYPSYGYTYXXXXXXXXXSHYGVWYGAM XXXXSFFPILSYYSMSIYPSYGYTYXXXXXXXXXSHYGVWYGAM 97.6 44 44 41 0

Is there a way to take X characters into account when calculating percentage identity, and why are the identities for these two queries different?

Thank you,
Nemanja

bbuchfink · 2024-03-04T14:00:10Z

Is there a way to take into account X characters when calculating percentage identity

no

why the identities for these two queries are different?

the sequences are different
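
A note on the arithmetic in the two rows above: the reported alignment lengths differ (40 vs. 41 columns), and both pident values are consistent with the usual definition of percent identity as identical columns divided by alignment length. The sketch below is only a reading of the posted numbers, not DIAMOND's code. The helper name `percent_identity` is hypothetical, and the identical-column count is inferred from pident and length because the output does not include it.

```python
def percent_identity(identical: int, length: int) -> float:
    """Percent identity as identical columns over alignment length, to one decimal."""
    if length <= 0:
        raise ValueError("alignment length must be positive")
    return round(100.0 * identical / length, 1)


# (query, reported pident, reported alignment length) from the two output rows
rows = [
    ("sequence_0_counts_2260484", 100.0, 40),
    ("sequence_1_counts_226", 97.6, 41),
]

for query, pident, length in rows:
    # Assumption: recover the identical-column count from the reported values,
    # since no identical-match column was requested in --outfmt.
    identical = round(pident * length / 100)
    print(query, identical, length, percent_identity(identical, length))
```

Under that reading, both alignments contain 40 identical columns, and the second one spans one additional, non-identical column, which gives 40/41 ≈ 97.6.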
