Q: How to make falco work with multiqc? #13

Closed
sklages opened this issue Feb 16, 2021 · 2 comments

sklages commented Feb 16, 2021

As I understand it, the output of the current falco version (0.2.4) should be compatible with current MultiQC; I am using MultiQC 1.9.

No matter which files I create with falco, multiqc always fails to find any analysis results.

sample_S17_L002_R1_001.fastq.gz_fastqc_data.txt
sample_S17_L002_R1_001.fastq.gz_fastqc_report.html
sample_S17_L002_R1_001.fastq.gz_summary.txt

Even when putting these files into sample_S17_L002_R1_001.fastq.gz.zip ... no success.

So I have obviously missed something here, probably something very basic/simple.

Any idea what I am doing wrong?

Collaborator

guilhermesena1 commented Feb 17, 2021

Hello,

I assume you ran falco on several FASTQ files, which makes it prepend the FASTQ name and an underscore (in this case sample_S17_L002_R1_001.fastq.gz_) to the output file names.

I believe MultiQC takes a directory as input and it looks for files named summary.txt, fastqc_data.txt and fastqc_report.html exactly (without the prefix), so if you move these files to a directory (e.g. named the same as your FASTQ sample) and rename the 3 files to remove the prefix, it should work. Please let me know though if you still have trouble or if that solves the problem.
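For readers who want to script the renaming described above, here is a minimal sketch in Python. It assumes the three output names mentioned in this thread and a prefix of the form `<FASTQ name>_`; the function name `stage_falco_outputs` and its parameters are hypothetical and not part of falco or MultiQC.

```python
import shutil
from pathlib import Path

# The three file names this thread says MultiQC looks for.
STANDARD_NAMES = ("fastqc_data.txt", "fastqc_report.html", "summary.txt")


def stage_falco_outputs(src_dir, prefix, dest_dir):
    """Move <prefix><name> from src_dir to dest_dir/<name>.

    Hypothetical helper: `prefix` is whatever falco prepended to the
    output names, e.g. "sample_S17_L002_R1_001.fastq.gz_".
    Returns the list of standard names that were found and moved.
    """
    src_dir, dest_dir = Path(src_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in STANDARD_NAMES:
        source = src_dir / f"{prefix}{name}"
        if source.exists():
            # Dropping the prefix leaves only the standard file name.
            shutil.move(str(source), str(dest_dir / name))
            moved.append(name)
    return moved
```

Calling `stage_falco_outputs(".", "sample_S17_L002_R1_001.fastq.gz_", "sample_S17_L002_R1_001")` and then running `multiqc .` in the parent directory would be one way to try this; the destination directory name is the caller's choice here.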

Author

sklages commented Feb 17, 2021

Hi,

I am running falco on individual fastq files (for better logging/timestamping) and rename the standard files according to my sample names.

Yes, it seems to take a directory or zip file to search for content.

I have now changed my pipeline so that a correct subdirectory is created containing the relevant files, like:

|-- sample_S17_L002_R1_001_fastqc
|   |-- fastqc_data.txt
|   |-- fastqc_report.html
|   `-- summary.txt
`-- sample_S17_L002_R1_001_fastqc_report.html -> sample_S17_L002_R1_001_fastqc/fastqc_report.html

together with a symlink (properly named for user convenience) pointing to fastqc_report.html.
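A minimal sketch of how such a convenience symlink could be created in Python, assuming the `<sample>_fastqc` subdirectory already holds `fastqc_report.html` as in the layout above. The helper name `link_report` is hypothetical and not taken from the actual pipeline.

```python
from pathlib import Path


def link_report(parent_dir, sample):
    """Create <sample>_fastqc_report.html -> <sample>_fastqc/fastqc_report.html.

    Hypothetical helper. The link target is kept relative, so the whole
    parent directory can be moved without breaking the link.
    """
    parent_dir = Path(parent_dir)
    target = Path(f"{sample}_fastqc") / "fastqc_report.html"
    link = parent_dir / f"{sample}_fastqc_report.html"
    # Replace a stale link or file left over from an earlier run.
    if link.is_symlink() or link.exists():
        link.unlink()
    link.symlink_to(target)
    return link
```

For the sample in this thread, the call would be `link_report(".", "sample_S17_L002_R1_001")`.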

Well, it works :-)

Testing on a small 4 GB fastq file shows that falco runs 2-2.5x faster than fastqc (single-threaded). Running fastqc with 4 threads on a single fastq file is not faster than running it single-threaded (as expected). Running fastqc with 4 threads on a pair of fastq files takes the same time as for one file, now using 2 cores.

So I think I will stick with falco for now, although the HTML report is not yet "perfect" (e.g. some tables do not match the page/report design).

Thanks for a nice piece of software!
