Quality control for data generated by Illumina Sequencing.
We needed a scalable local solution for generating quality control reports in the 100k Genomes Project. We started with these measurements and visualizations and will likely add more later.
Building the program requires a C++11-compatible compiler. To view the graphs you'll need R and ggplot2.
./HTseqQA <options> -i <fastq or gzip'd fastq>
Other options
-o <int>, manually set the offset
-r, only print out the Rscript for figure generation
-g, create a greyscale version of the graphs
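For readers unsure when -o is needed: assuming the offset refers to the ASCII offset of the FASTQ quality scores (33 for current Illumina output, 64 for some older pipelines), the following is a rough sketch of how such an offset could be guessed from the quality strings. It is an illustration in Python, not HTseqQA's own code, and the function names are hypothetical.

```python
def guess_phred_offset(quality_lines):
    """Guess the ASCII offset (33 or 64) from FASTQ quality strings.

    Heuristic only: characters below ';' (ASCII 59) do not occur in
    offset-64 encodings, so seeing one implies offset 33. If none is
    seen, offset 64 is assumed, which can be wrong for a small sample
    of uniformly high-quality offset-33 reads.
    Raises ValueError if no quality characters are supplied.
    """
    lowest = min(ord(ch) for line in quality_lines for ch in line)
    return 33 if lowest < 59 else 64


def decode_qualities(quality_line, offset):
    """Convert one quality string into a list of integer scores."""
    return [ord(ch) - offset for ch in quality_line]


if __name__ == "__main__":
    sample = ["IIIIHHG#", "IIIIIIII"]  # made-up quality strings
    offset = guess_phred_offset(sample)
    print(offset, decode_qualities(sample[0], offset))
```

Because the guess is only a heuristic, setting the offset by hand remains the safer choice when the encoding of the input is known.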
If you need to process many files, I suggest using GNU parallel:
parallel HTseqQA -i {} ::: /path/to/all/*.fastq
On a standard computer we're able to process 4000+ bacterial genome files overnight.
Simple text file that tells you the number of reads seen.
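To sanity-check the reported read count independently, a minimal Python sketch along these lines could be used. It assumes standard four-line FASTQ records and treats any path ending in .gz as gzip-compressed. The function name is hypothetical and this is not how HTseqQA itself counts reads.

```python
import gzip


def count_reads(path):
    """Count reads in a plain or gzip'd FASTQ file.

    Assumes every record spans exactly four lines (header, sequence,
    '+', qualities); multi-line FASTQ records are not handled.
    """
    opener = gzip.open if path.endswith(".gz") else open
    lines = 0
    with opener(path, "rt") as handle:
        for _ in handle:
            lines += 1
    return lines // 4
```

For example, count_reads("sample.fastq.gz") returns the number of four-line records in that file, which should match the number in the text file if the input is well-formed.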