report #10

Closed
hoelzer opened this issue Jan 14, 2021 · 8 comments · Fixed by #51

@hoelzer
Collaborator

hoelzer commented Jan 14, 2021

It would be good to have at least a simple summary report for the reconstructed consensus sequence. This should include:

  • used version of poreCov
  • used tools and versions within poreCov
  • basic stats about the reconstructed consensus sequences (length, N50, number of Ns, maybe pairwise identity to the Wuhan strain ...)
  • if possible some stats about the called variants

E.g. in a single PDF report per run.

For the first part (technical stats) it might also be enough to use Nextflow's built-in reporting functions.
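
As a rough illustration of the basic consensus stats listed above, a minimal Python sketch could compute the length and the number of Ns per sequence. The names `read_fasta` and `consensus_stats` are hypothetical and not part of poreCov; N50 and pairwise identity to the reference are deliberately left out.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping sequence ID to sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            name = (line[1:].split() or [""])[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


def consensus_stats(seq):
    """Return length, N count, and N fraction for one consensus sequence."""
    length = len(seq)
    n_count = seq.count("N")
    n_fraction = n_count / length if length else 0.0
    return {"length": length, "n_count": n_count, "n_fraction": n_fraction}
```

Calling `consensus_stats` on every record returned by `read_fasta` would give one table row per sample for the report.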

@hoelzer hoelzer added the enhancement New feature or request label Jan 19, 2021
@hoelzer
Collaborator Author

hoelzer commented Jan 19, 2021

We can base the report on the following script:
https://gitlab.com/RKIBioinformaticsPipelines/ncov_minipipe/-/blob/master/ncov_minipipe.Rmd

and modify it according to the Nanopore needs.

@replikation replikation self-assigned this Jan 19, 2021
@hoelzer
Collaborator Author

hoelzer commented Jan 20, 2021

@oliverdrechsel @rekm welcome to the reporting issue! :)

So basically @replikation needs to know the inputs (and their formats) for the different parts that https://gitlab.com/RKIBioinformaticsPipelines/ncov_minipipe/-/blob/master/ncov_minipipe.Rmd already provides, so he can generate them.

Maybe we can start basic and try to implement the Rmd script with support for:

  • some raw read metrics
  • consensus coverage

@replikation
Owner

Yep, basically either an example input for this script or a summary of what the script needs to function properly.

@oliverdrechsel
Collaborator

The report currently takes a bunch of files. I think we can either modify the report to make all files optional, or butcher it and make a Nanopore version.

What I find quite important for the report output (HTML) is that figures and tables have a maximum size and get scroll bars. This keeps the report short and easy to scroll, even though each section might contain 100 samples.

*coverage.tsv

Loads of 0s, because the example is a negative control.

$ head NK-1xTE_1.coverage.tsv
NC_045512.2     1       0
NC_045512.2     2       0
NC_045512.2     3       0
NC_045512.2     4       0
NC_045512.2     5       0
NC_045512.2     6       0
NC_045512.2     7       0
NC_045512.2     8       0
NC_045512.2     9       0
NC_045512.2     10      0
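
For reference, a file in this three-column layout (reference, position, depth) could be summarized as sketched below. This is only an illustration in Python, not the Rmd report's code; `parse_coverage`, `coverage_summary`, and the `min_depth=20` threshold are assumptions made for the example.

```python
def parse_coverage(text):
    """Parse whitespace-separated 'reference position depth' lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        ref, pos, depth = fields
        rows.append((ref, int(pos), int(depth)))
    return rows


def coverage_summary(rows, min_depth=20):
    """Mean depth and fraction of positions with depth >= min_depth.

    min_depth is an assumed cutoff for illustration, not a pipeline setting.
    """
    depths = [depth for _, _, depth in rows]
    if not depths:
        return {"positions": 0, "mean_depth": 0.0, "fraction_covered": 0.0}
    covered = sum(1 for depth in depths if depth >= min_depth)
    return {
        "positions": len(depths),
        "mean_depth": sum(depths) / len(depths),
        "fraction_covered": covered / len(depths),
    }
```

For the negative control shown above, both the mean depth and the covered fraction come out as zero.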

fragment size

This is basically the fragment size column of the mapping BAM file.

$ head NK-1xTE_1.fragsize.tsv
0
102
-102
-121
121
102
-102
102
-102
93
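
The fragment size corresponds to the TLEN field, the ninth column of a SAM record. A minimal sketch of pulling it out of SAM text (for example the output of `samtools view`) follows; `template_lengths` is a hypothetical helper, not something from the pipeline. Per the SAM specification, TLEN is set to 0 for single-segment templates, which is what unpaired reads produce.

```python
def template_lengths(sam_text):
    """Collect the TLEN field (column 9) from SAM alignment lines."""
    tlens = []
    for line in sam_text.splitlines():
        if not line or line.startswith("@"):
            continue  # skip empty lines and header lines
        fields = line.split("\t")
        if len(fields) < 11:
            continue  # a SAM record has at least 11 mandatory fields
        tlens.append(int(fields[8]))
    return tlens
```

Writing one value per line from this list would reproduce the layout of the `.fragsize.tsv` file above.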

mapping statistics

Statistics for the bwa mem mapping (the layout is that of samtools flagstat output):

$ head NK-1xTE_1.bamstats.txt
794 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
2 + 0 supplementary
0 + 0 duplicates
718 + 0 mapped (90.43% : N/A)
792 + 0 paired in sequencing
396 + 0 read1
396 + 0 read2
526 + 0 properly paired (66.41% : N/A)
640 + 0 with itself and mate mapped

Transformed for easier reading in R:

$ head NK-1xTE_1.bamstats.pipe.txt
794|0|in total (QC-passed reads + QC-failed reads)
0|0|secondary
2|0|supplementary
0|0|duplicates
718|0|mapped (90.43% : N/A)
792|0|paired in sequencing
396|0|read1
396|0|read2
526|0|properly paired (66.41% : N/A)
640|0|with itself and mate mapped
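
The transformation from the first layout to the pipe-delimited one could look like the following sketch. How the original pipeline performs it is not shown in this thread; the regular expression and the name `flagstat_to_pipe` are assumptions.

```python
import re

# Matches lines such as "794 + 0 in total (QC-passed reads + QC-failed reads)".
_STAT_LINE = re.compile(r"^(\d+) \+ (\d+) (.*)$")


def flagstat_to_pipe(text):
    """Rewrite 'PASSED + FAILED description' lines as 'PASSED|FAILED|description'."""
    converted = []
    for line in text.splitlines():
        match = _STAT_LINE.match(line.strip())
        if match:
            converted.append("|".join(match.groups()))
    return "\n".join(converted)
```

The result has three pipe-separated fields per line, so it can be read as a three-column table with `|` as the separator.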

version

$ cat pipeline.version
v2.0.4

pangolin (optional)

$ cat NK-TE_1_21-00101.lineage.txt
taxon,lineage,probability,pangoLEARN_version,status,note
NK-TE_1_21-00101_iupac_consensus_v2.0.4,None,0,2021-01-16,fail,N_content:1.0
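
Since this file is plain CSV with a header row, a report could read it with Python's standard `csv` module, as in the sketch below. The column names are taken from the example above; `parse_lineage_csv` and `failed_samples` are hypothetical helper names.

```python
import csv
import io


def parse_lineage_csv(text):
    """Return one dict per row, keyed by the CSV header names."""
    return list(csv.DictReader(io.StringIO(text)))


def failed_samples(rows):
    """Return the taxon names whose 'status' column is 'fail'."""
    return [row["taxon"] for row in rows if row.get("status") == "fail"]
```

For the negative control above, the single row has `status` set to `fail`, so it would be listed by `failed_samples`.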

kraken (optional)

kraken2 result of classifying the reads against a human/SARS-CoV-2 database (https://zenodo.org/record/3854856).
Kraken read filtering improved our mapping tremendously.

$ head NK-TE_1_21-00101.kraken.report.txt
 22.58  56      56      U       0       unclassified
 77.42  192     0       R       1       root
 77.02  191     0       D       10239     Viruses
 77.02  191     0       D1      2559587     Riboviria
 77.02  191     0       K       2732396       Orthornavirae
 77.02  191     0       P       2732408         Pisuviricota
 77.02  191     0       C       2732506           Pisoniviricetes
 77.02  191     0       O       76804               Nidovirales
 77.02  191     0       O1      2499399               Cornidovirineae
 77.02  191     0       F       11118                   Coronaviridae
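
The six columns of this report are: percentage of reads in the clade, reads in the clade, reads assigned directly to the taxon, rank code, NCBI taxonomy ID, and the indented name. A minimal parsing sketch, with the hypothetical helpers `parse_kraken_report` and `percent_for_taxid`, might look like this:

```python
def parse_kraken_report(text):
    """Parse kraken2 report lines into dicts; indentation of names is dropped."""
    rows = []
    for line in text.splitlines():
        # Split on any whitespace, keeping multi-word names in the last field.
        fields = line.strip().split(None, 5)
        if len(fields) != 6:
            continue
        percent, clade_reads, direct_reads, rank, taxid, name = fields
        rows.append({
            "percent": float(percent),
            "clade_reads": int(clade_reads),
            "direct_reads": int(direct_reads),
            "rank": rank,
            "taxid": int(taxid),
            "name": name.strip(),
        })
    return rows


def percent_for_taxid(rows, taxid):
    """Return the clade percentage for a taxonomy ID, or 0.0 if absent."""
    for row in rows:
        if row["taxid"] == taxid:
            return row["percent"]
    return 0.0
```

With the example above, looking up taxonomy ID 11118 (Coronaviridae) gives the share of reads assigned to that family or below.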

@replikation
Owner

@oliverdrechsel thanks, I'll prepare the information on my end and check whether I can add all the mandatory items so you don't need to change anything.

@oliverdrechsel
Collaborator

> @oliverdrechsel thanks, I'll prepare the information on my end and check whether I can add all the mandatory items so you don't need to change anything.

Thanks a lot. I don't know whether things like fragment size make sense, as the MinION does not perform paired-end sequencing.

@replikation
Owner

replikation commented Jan 21, 2021

@oliverdrechsel just checked: it's 0 everywhere. So this might be good as an optional parameter, or do you have another idea for what I could supply here instead?
Edit: I'm not sure what the final report looks like (e.g. could I supply some more Nanopore-relevant things here?)

@replikation
Owner

@oliverdrechsel looking at your report script, I'm inclined to rewrite it: Nextflow is more "file"-oriented, so using directories and recursively searching them for inputs is counterintuitive here. Would it be okay if I fork your script and adjust it to the workflow?

@replikation replikation linked a pull request Feb 17, 2021 that will close this issue