dealing with UMIs containing N #36

rmvpaeme · 2020-10-12T06:07:57Z

After running calib, about 30% of my reads contain a UMI with at least one 'N'. How do you suggest we deal with these? Currently, we filter out all of these reads.

Thanks.

rmvpaeme · 2020-10-13T16:32:04Z

In addition, would it be possible to add an option that outputs the base with the highest quality at positions where no base reaches a 50%+1 majority vote?

baraaorabi · 2020-10-16T17:43:27Z

I think you should examine the .msa files generated by the consensus step. Do they make sense to you? If many read clusters have a bad consensus (and therefore produce a lot of N's), it might be worth changing the clustering parameters.

Also, if you want your own consensus-building rules (e.g. using qualities, or keeping plurality bases that fall short of a majority), you can use the .msa file. Each cluster has its full multiple sequence alignment printed out, so you should be able to process it in your own custom way.
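
As a rough illustration of such a custom rule, here is a minimal Python sketch of the fallback asked about above: take the majority base where one exists, otherwise the base from the highest-quality call. It is not calib's own code. The function names, the in-memory input (aligned rows as strings plus matching per-base quality lists), and the choice to ignore gaps and N calls when counting are all assumptions made for the example. Parsing the .msa file and pairing it with qualities from the original reads is left out, since that depends on the file layout.

```python
from collections import Counter

GAP = "-"  # assumed gap character in the aligned rows


def consensus_column(bases, quals):
    """Pick one base for a single alignment column (hypothetical helper)."""
    # Assumption: gaps and N calls do not take part in the vote.
    observed = [(b, q) for b, q in zip(bases, quals) if b not in (GAP, "N")]
    if not observed:
        return "N"
    base, count = Counter(b for b, _ in observed).most_common(1)[0]
    # Strict majority: more than half of the informative calls.
    if 2 * count > len(observed):
        return base
    # No majority: fall back to the base of the highest-quality call.
    return max(observed, key=lambda call: call[1])[0]


def consensus(rows, qual_rows):
    """Build a consensus from aligned rows and matching quality lists."""
    columns = zip(zip(*rows), zip(*qual_rows))
    return "".join(consensus_column(b, q) for b, q in columns)


if __name__ == "__main__":
    rows = ["ACGT", "ACCT", "AGTT"]
    quals = [[30, 30, 10, 30], [30, 30, 40, 30], [30, 20, 20, 30]]
    # Column 3 has no majority (G, C, T), so the quality-40 call C wins.
    print(consensus(rows, quals))  # ACCT
```

When two calls tie for the highest quality, this sketch keeps whichever comes first in the cluster. Summing qualities per base instead would be a reasonable alternative.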

baraaorabi self-assigned this Oct 16, 2020

baraaorabi closed this as completed Oct 16, 2020
