
Develop QC trim plan for Oil samples #369

Closed
sr320 opened this Issue Sep 11, 2018 · 4 comments

sr320 commented Sep 11, 2018

Given the FastQC report:
http://owl.fish.washington.edu/Athaliana/20180910_Cvirginica_oil_fastqc/multiqc_report.html

What is the best approach to clean these libraries up? We should probably drop a couple of libraries.

kubu4 commented Sep 11, 2018

I say process them with TrimGalore! using default settings and see how they turn out (that should get rid of a LOT of adapter sequence; I think that's why we see that small peak in the Sequence Duplication Levels plot). I suspect that'll help a bit, but we'll still have to figure out if/why FastQC doesn't like them.

I'll go ahead and get them trimmed.

kubu4 commented Sep 13, 2018

Here's what things look like after default trimming parameters with TrimGalore!:

http://owl.fish.washington.edu/Athaliana/20180911_virginica_oil_trimgalore_01/multiqc_report.html

They're better, but I think we might need to implement more aggressive trimming. Specifically:

  • trim 26bp from 3' end

  • trim 8-10bp from 5' end (this would be Bismark recommendation, regardless)
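For illustration, here is a minimal sketch of how those two hard clips might be passed to TrimGalore!. It assumes paired-end reads, uses made-up file names, and picks 10bp from the 8-10bp range; the helper `build_trim_galore_cmd` and the `trimmed` output directory are hypothetical. It only builds the command line and does not run anything.

```python
import shlex


def build_trim_galore_cmd(r1, r2, five_prime_clip=10, three_prime_clip=26,
                          out_dir="trimmed"):
    """Build (but do not run) a Trim Galore command for one read pair.

    Assumptions: paired-end data, the same clip lengths on both reads,
    and an output directory name chosen only for this example.
    """
    return [
        "trim_galore",
        "--paired",
        # Hard-clip bases from the 5' end of each read.
        "--clip_R1", str(five_prime_clip),
        "--clip_R2", str(five_prime_clip),
        # Hard-clip bases from the 3' end, applied after adapter/quality trimming.
        "--three_prime_clip_R1", str(three_prime_clip),
        "--three_prime_clip_R2", str(three_prime_clip),
        "--output_dir", out_dir,
        r1,
        r2,
    ]


if __name__ == "__main__":
    # Hypothetical file names; print the command for inspection.
    cmd = build_trim_galore_cmd("sample_R1.fastq.gz", "sample_R2.fastq.gz")
    print(shlex.join(cmd))
```

Printing the command before running it makes it easy to check the clip values per library.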

sr320 commented Sep 13, 2018

Why 26 from 3'?

Sequence content seems informative.
http://owl.fish.washington.edu/Athaliana/20180911_virginica_oil_trimgalore_01/multiqc_report.html#fastqc_per_base_sequence_content
(the graphs not the heat maps).

TGACCA should likely be dropped.
[screenshot: screen shot 2018-09-13 at 08 55 25]

And the flatlining from 60-80 bp on other samples seems odd. Then there is the blip at 70 bp in the cutadapt graph.

kubu4 commented Sep 13, 2018

Agreed with that assessment.

And, now that I've looked at things further, I don't want to trim the last 26bp.

Actually, I want to filter out sequences that are shorter than 75bp. That should resolve this issue:

[image: selection_067]

[image: selection_068]
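As a rough sketch of the length filter described above: the function below keeps only FASTQ records whose sequence is at least 75bp. It assumes plain-text, four-line FASTQ records; `filter_fastq_by_length` and the demo reads are hypothetical. In practice, TrimGalore!'s own `--length` option (discard reads shorter than the given length after trimming) would probably be the simpler route.

```python
import io


def filter_fastq_by_length(handle, min_length=75):
    """Yield (header, seq, plus, qual) for reads at least min_length long.

    Assumes well-formed four-line FASTQ records read from a text handle.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of input
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        # Drop reads shorter than the cutoff; keep everything else.
        if len(seq) >= min_length:
            yield header, seq, plus, qual


if __name__ == "__main__":
    # Two made-up reads: one 60bp (dropped) and one 80bp (kept).
    demo = io.StringIO(
        "@short\n" + "A" * 60 + "\n+\n" + "I" * 60 + "\n"
        "@long\n" + "C" * 80 + "\n+\n" + "I" * 80 + "\n"
    )
    for header, seq, _, _ in filter_fastq_by_length(demo):
        print(header, len(seq))
```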

sr320 closed this Sep 25, 2018
