Add option to trim all reads to a given length #19

Closed
fplazaonate opened this issue Mar 18, 2021 · 1 comment

Comments

@fplazaonate

Hi,

I would like to run Simka on multiple samples with the -max-reads option to deal with varying sequencing depths.
However, the samples also have varying read lengths.
I guess this may slightly bias the results, since longer reads increase the total number of k-mers per sample even when the number of reads is the same.
Would it be possible to add an option to trim all reads to a given length?
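
To make the concern concrete, here is a small sketch of the arithmetic: a read of length L contributes at most L - k + 1 k-mers, so two samples capped at the same number of reads can still differ in total k-mer count. The k-mer size of 21, the read lengths, and the read cap below are illustrative assumptions, not values taken from an actual Simka run.

```python
def kmers_per_read(read_len: int, k: int) -> int:
    """Number of k-mers in one read of length read_len (0 if the read is shorter than k)."""
    return max(0, read_len - k + 1)


def total_kmers(n_reads: int, read_len: int, k: int) -> int:
    """Total k-mers (with multiplicity) in n_reads reads that all have length read_len."""
    return n_reads * kmers_per_read(read_len, k)


if __name__ == "__main__":
    k = 21               # assumed k-mer size, for illustration only
    n_reads = 1_000_000  # hypothetical cap on the number of reads per sample
    for read_len in (100, 150):
        print(read_len, total_kmers(n_reads, read_len, k))
```

With these assumed values, each 150 bp read carries 130 k-mers against 80 for a 100 bp read, so the longer-read sample contributes noticeably more k-mers for the same number of reads.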

Florian

@clemaitre
Collaborator

Hi Florian,

Thank you for using Simka and for your interesting comment.
I agree, it would make more sense to also trim the reads to a common length, so this would be a relevant option for Simka. However, I looked at the code, and read handling is not part of Simka's own code but of the GATB library it is built on. I'm sorry, but at the moment we do not have enough people working on GATB to implement it.

Therefore, my only suggestion, which is not ideal, is to trim the reads beforehand as a pre-processing step with an independent tool (such as seqtk trimfq).
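
If a dedicated tool is not at hand, the pre-processing step can also be sketched in a few lines of Python. This is a minimal illustration, not part of Simka or GATB: the function name `trim_fastq` is hypothetical, and it assumes uncompressed FASTQ input with exactly four lines per record (no wrapped sequences).

```python
import io


def trim_fastq(src, dst, max_len):
    """Copy 4-line FASTQ records from src to dst, keeping at most max_len bases per read.

    Reads shorter than max_len are written unchanged. Returns the number of records copied.
    """
    n_records = 0
    while True:
        header = src.readline()
        if not header:
            break  # end of input
        seq = src.readline().rstrip("\n")
        plus = src.readline().rstrip("\n")
        qual = src.readline().rstrip("\n")
        # Sequence and quality string must be cut at the same position.
        dst.write(header.rstrip("\n") + "\n")
        dst.write(seq[:max_len] + "\n")
        dst.write(plus + "\n")
        dst.write(qual[:max_len] + "\n")
        n_records += 1
    return n_records


if __name__ == "__main__":
    # Tiny in-memory demonstration; real use would pass open file handles instead.
    demo_in = io.StringIO("@r1\nACGTACGT\n+\nIIIIIIII\n@r2\nAC\n+\nII\n")
    demo_out = io.StringIO()
    trim_fastq(demo_in, demo_out, 4)
    print(demo_out.getvalue(), end="")
```

Each sample would be trimmed to the same `max_len` before being listed in the Simka input file. Recent seqtk versions appear to offer a comparable length cap in `trimfq`; check the usage message of your installed version for the exact option.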

Regards,
Claire
