Is it possible to detect recombination from a core gene alignment? #39

Closed
rndg20 opened this issue Jun 3, 2016 · 6 comments

Comments

@rndg20

rndg20 commented Jun 3, 2016

Hi

I have previously used ClonalFrameML (CFML) to detect recombination from a core genome alignment. I am now wondering whether it can also be used on a core gene alignment (I used Roary to determine the core genes).

Thanks

@xavierdidelot
Owner

Hello,

Yes, you can use ClonalFrameML to analyse alignments of core genes. You need to prepare an xmfa file in which each core gene is a multi-fasta block, with blocks separated by lines containing just the '=' sign. So if you have each of your core genes as a separate fasta file, you just need to add a '=' line at the end of each file and concatenate these files to produce your xmfa input.

Best wishes,
Xavier
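
For readers who want to script the concatenation described above, here is a minimal sketch in Python. It assumes one aligned multi-fasta file per core gene; the function name `build_xmfa` and the file paths are hypothetical and not part of ClonalFrameML.

```python
from pathlib import Path


def build_xmfa(gene_fasta_paths, out_path):
    """Concatenate per-gene multi-fasta files into one xmfa file.

    Each input file becomes one block, followed by a line holding only '='.
    (Hypothetical helper, not part of ClonalFrameML.)
    """
    with open(out_path, "w") as out:
        for path in gene_fasta_paths:
            text = Path(path).read_text()
            if not text.strip():
                continue  # skip empty files rather than emit an empty block
            # Make sure the block ends with exactly one newline before '='.
            out.write(text.rstrip("\n") + "\n")
            out.write("=\n")
```

It could be called as `build_xmfa(sorted(Path("core_genes").glob("*.fasta")), "core_genes.xmfa")`, where the directory and output names are placeholders.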

@rndg20
Author

rndg20 commented Jun 10, 2016

Thank you so much for your reply. Just to clarify: do I need to create a separate multi-fasta file for each of the core genes (so 4059 multi-fasta blocks) and then concatenate them, with an '=' line separating them?

As it stands, my MSA is in the standard format:

Sample1
......
Sample2
.....

and does not specify where genes start and end.

@xavierdidelot
Owner

Yes, that's right. You will need to find out where genes start and end, because otherwise the software will assume that they occur next to each other, which affects linkage and recombination detection.
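
Once the gene boundaries are known, splitting the concatenated alignment into blocks could look roughly like the sketch below. It assumes the alignment is already loaded as a mapping from sample name to aligned sequence, and that the boundaries are given as 0-based, end-exclusive `(start, end)` pairs. Where those coordinates come from is left open; `split_alignment` is a hypothetical helper.

```python
def split_alignment(records, boundaries):
    """Cut a concatenated alignment into per-gene xmfa blocks.

    records: dict of sample name -> aligned sequence (all the same length).
    boundaries: list of (start, end) pairs, 0-based and end-exclusive
                (assumed to be supplied by the user).
    Returns the xmfa text as a single string.
    """
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")

    blocks = []
    for start, end in boundaries:
        lines = []
        for name, seq in records.items():
            lines.append(f">{name}")
            lines.append(seq[start:end])
        lines.append("=")  # block separator
        blocks.append("\n".join(lines))
    return "\n".join(blocks) + "\n"
```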

@izabelcavassim

So the genes concatenated in xmfa format can only be used as input if the option "-xmfa_file true" is included in the command, right?

I am running it now, so am I going to get an estimate of R/theta for each gene? Is it unreliable to make an estimate per gene? Would you recommend LD based approaches instead?
Thank you!!

@xavierdidelot
Owner

Hi Maria,

Yes, the option "-xmfa_file true" allows you to use a xmfa file as input, which can contain multiple (unconcatenated) genes. R/theta is estimated for the whole set, not for each gene separately, but you could compute the number of recombination events that hit each gene from the list of all events contained in the output file with suffix importation_status.txt.

Best wishes,
Xavier
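
As a rough illustration of the per-gene count suggested above, the sketch below tallies how many events overlap each gene. It makes two assumptions that should be checked against your own output: that the importation_status.txt file is tab-separated with header columns `Node`, `Beg` and `End`, and that the gene intervals you supply use the same coordinate system as the events (inclusive on both ends). The function names are hypothetical.

```python
import csv


def read_events(path):
    """Read recombination events as (node, beg, end) tuples.

    Assumes a tab-separated file with a header row containing
    'Node', 'Beg' and 'End' columns.
    """
    events = []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            events.append((row["Node"], int(row["Beg"]), int(row["End"])))
    return events


def count_events_per_gene(events, genes):
    """Count events overlapping each gene.

    genes: dict of gene name -> (start, end), inclusive, in the same
           coordinate system as the events (an assumption to verify).
    """
    counts = {name: 0 for name in genes}
    for _node, beg, end in events:
        for name, (g_start, g_end) in genes.items():
            # Two inclusive intervals overlap if each starts before the other ends.
            if beg <= g_end and end >= g_start:
                counts[name] += 1
    return counts
```

An event spanning several genes is counted once for each gene it touches, which may or may not be what you want for a per-gene summary.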

@izabelcavassim

izabelcavassim commented May 5, 2020 via email
