gene presence/absence file having ";" without merging paralogs #257

Closed
mhjonathan opened this issue Nov 22, 2023 · 3 comments

Comments

@mhjonathan

Hi,

I'm running Panaroo on GFF3 files in strict mode, without merging paralogs.

panaroo \
  --threads 20 \
  --input {input} \
  --out_dir {out_dir} \
  --clean-mode strict \
  --remove-invalid-genes \
  --threshold 0.98 \
  --family_threshold 0.7

When I check the gene_presence_absence.csv file, some entries look like the ones below (marked in yellow):

[image: screenshot of gene_presence_absence.csv with the entries in question highlighted in yellow]

They are not paralogs; there is almost no similarity between the sequences in those entries. Can you tell me what they are?

@gtonkinhill
Owner

Hi,

This is usually caused by fragmented gene calls, which Panaroo merges into a single entry.
Depending on the reading frame in which the fragments were originally called, they can look different from the other genes in the cluster, so it is important to compare the DNA sequences as well.
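
To illustrate why the reading frame matters, here is a minimal sketch (not Panaroo code) that translates the same DNA in two frames using the standard genetic code. The `translate` helper and the example sequence are made up for illustration; the point is only that identical DNA can yield unrelated-looking protein sequences when the frame differs.

```python
# Standard genetic code, built from the usual TCAG ordering of bases.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate dna starting at offset `frame`, ignoring a trailing partial codon."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Hypothetical sequence: same DNA, two different frames.
dna = "ATGGCCAAGCTGACC"
print(translate(dna, 0))
print(translate(dna, 1))
```

The two translations share no residues even though the underlying DNA is the same, which is why a protein-level comparison alone can make merged fragments look unrelated.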

@mhjonathan
Author

Hi, thank you for the answer.

How can I deal with this problem, then? I expected fragmented genes to be filtered out by the --remove-invalid-genes option. Clustering a gene together with unrelated gene fragments could lead to incorrect conclusions, right?

@gtonkinhill
Owner

Hi,

Sorry, I missed your reply. The --remove-invalid-genes option filters out invalid GFF entries, but not fragmented gene calls. Instead, Panaroo merges the fragments into a single entry, using ';' as the delimiter.
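
If it helps to review these cases, the sketch below lists every cell that contains ';'-joined IDs so the underlying DNA sequences can be checked by hand. It is not part of Panaroo; `find_merged_entries` is a hypothetical helper, and it assumes the first three columns of gene_presence_absence.csv are metadata (Gene, Non-unique Gene name, Annotation) followed by one column per sample. Adjust `n_meta_cols` if your file is laid out differently. The example rows are invented.

```python
import csv
import io


def find_merged_entries(csv_text, n_meta_cols=3):
    """Return (gene, sample, ids) for each cell holding ';'-joined IDs.

    n_meta_cols is an assumption about the file layout: the number of
    leading metadata columns before the per-sample columns begin.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    samples = header[n_meta_cols:]
    hits = []
    for row in reader:
        gene = row[0]
        for sample, cell in zip(samples, row[n_meta_cols:]):
            if ";" in cell:
                hits.append((gene, sample, cell.split(";")))
    return hits


# Invented example data in the assumed layout.
example = (
    "Gene,Non-unique Gene name,Annotation,sampleA,sampleB\n"
    "group_1,,hypothetical protein,A_00010;A_00011,B_00020\n"
    "group_2,,kinase,A_00030,\n"
)

for gene, sample, ids in find_merged_entries(example):
    print(gene, sample, ids)
```

For a real run, read the file contents with `open("gene_presence_absence.csv").read()` and pass the text to the function.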
