mergePairs & justConcatenate #279

Closed
cresil opened this issue Jul 5, 2017 · 1 comment


cresil commented Jul 5, 2017

Hi,

Is it possible to both merge pairs and concatenate those which do not overlap?

The reason for the request is that I'm dealing with ITS data, which of course varies in amplicon size. Some of the read pairs therefore meet the minimum overlap criterion and merge, while others do not, and I lose them in the process.
The justConcatenate option concatenates everything, even pairs that could have been merged.

I think a combination of both would get the best results for further use in multiple alignments and phylogeny analysis.

Cheers

M

Owner

benjjneb commented Jul 5, 2017

Yes and no. You can get the result you are asking for with some R code and the returnRejects flag:

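# Merge as usual, but keep the pairs that failed to merge (accept == FALSE)
merger <- mergePairs(ddF, drpF, ddR, drpR, returnRejects=TRUE)
# Concatenate every pair instead of merging
concat <- mergePairs(ddF, drpF, ddR, drpR, justConcatenate=TRUE)
# Overwrite the rejected rows with their concatenated versions
merger[!merger$accept,] <- concat[!merger$accept,]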

The issue is that some pairs fail to merge because of mismatches in a true overlap region rather than because the reads don't overlap, and those pairs should not be replaced by concatenated versions of the sequences.

We don't have an automated solution for that, but it's certainly something you could work out yourself, probably based on the output in the merger data.frame, in particular the $nmatch, $nmismatch and $nindel columns.
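
As a rough sketch of what such a filter might look like: the idea below is to concatenate only those rejected pairs whose aligned region is very short, on the assumption that a short alignment means the reads don't really overlap, and to leave pairs that were rejected despite a long alignment as rejects. The cutoff `min_aligned` and the toy data.frames are hypothetical stand-ins, not dada2 defaults or real output; with real data, `merger` and `concat` would come from the two mergePairs calls above, and the cutoff would need tuning against your own amplicons.

```r
# Toy stand-in for mergePairs(..., returnRejects=TRUE) output (made-up values;
# real output has more columns, but these ones are present in it).
merger <- data.frame(
  sequence  = c("ACGTACGTAC", "", "", ""),
  nmatch    = c(40, 3, 25, 0),
  nmismatch = c(0, 1, 4, 0),
  nindel    = c(0, 0, 0, 0),
  accept    = c(TRUE, FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Toy stand-in for mergePairs(..., justConcatenate=TRUE): same rows,
# but every pair is "accepted" as forward + Ns + reverse.
concat <- merger
concat$sequence <- c("AAAANNNNNNNNNNCCCC", "GGGGNNNNNNNNNNTTTT",
                     "CCCCNNNNNNNNNNAAAA", "TTTTNNNNNNNNNNGGGG")
concat$accept <- TRUE

# Hypothetical cutoff: alignments shorter than this are treated as "no overlap".
min_aligned <- 10

# Length of the aligned region reported for each pair.
aligned_len <- merger$nmatch + merger$nmismatch + merger$nindel

# Rejected pairs with almost no aligned region: assume they truly don't overlap.
no_overlap <- !merger$accept & aligned_len < min_aligned

# Replace only those rows; pairs rejected for mismatches in a long overlap stay rejected.
merger[no_overlap, ] <- concat[no_overlap, ]

# Keep merged and concatenated pairs, drop the remaining rejects.
final <- merger[merger$accept, ]
```

In this toy example the second and fourth pairs are concatenated, while the third pair (29 aligned positions with 4 mismatches) stays rejected and is dropped from `final`.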
