Enable s_name_filenames: true for fastqc #890

Closed
cgirardot opened this issue Jan 17, 2019 · 2 comments

Comments

cgirardot commented Jan 17, 2019

Is your feature request related to a problem? Please describe.
This is similar to issue #864, but for fastqc/data.

As explained in the MultiQC doc:

Sample names are discovered by parsing the line beginning Filename in fastqc_data.txt, not based on the FastQC report names.

I have fastqc/data report files that all have the same file name in the Filename field, while the report file name itself contains the sample name.

Here are 2 examples.
FastQC_Report_All_Reads_ATAC_DSP45.txt
FastQC_Report_All_Reads_ATAC_DSP60.txt
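
To make the collision concrete, here is a minimal sketch of reading the Filename line from report contents. The report body and the `fastqc_sample_name` helper are made up for illustration; this is not MultiQC's actual parser, and it only assumes the tab-separated `Filename` line mentioned in the docs.

```python
def fastqc_sample_name(report_text):
    """Return the value of the tab-separated 'Filename' line, or None if absent."""
    for line in report_text.splitlines():
        if line.startswith("Filename\t"):
            return line.split("\t", 1)[1].strip()
    return None


# Hypothetical report body: every report records the same input file name.
EXAMPLE_BODY = ">>Basic Statistics\tpass\nFilename\tinput.fastq\n>>END_MODULE\n"

reports = {
    "FastQC_Report_All_Reads_ATAC_DSP45.txt": EXAMPLE_BODY,
    "FastQC_Report_All_Reads_ATAC_DSP60.txt": EXAMPLE_BODY,
}

# Names taken from the contents collide; names taken from the report files do not.
names_from_contents = {fn: fastqc_sample_name(body) for fn, body in reports.items()}
distinct_names = set(names_from_contents.values())
print(len(reports), "reports,", len(distinct_names), "distinct sample name(s)")
```

With names taken from the contents, the two reports cannot be told apart, which is why the report file name is the only usable identifier here.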

Describe the solution you'd like
Similarly to trimmomatic, implementing

fastqc:
    s_name_filenames: true

would solve the issue.

Describe alternatives you've considered
There are no easy alternatives.

Additional context
Note that this issue will occur with all reports generated by Galaxy workflows that make use of collections. In other words, you might want to enable this for ALL modules that extract the sample name by parsing the file content.

Member

ewels commented Jan 17, 2019

Note to self: could be interesting to look into whether this could be a new command line flag which ties into the core self.clean_s_name() function. All modules which set sample name based on a variable would have to submit the entire f object (with full file path) to the function then. May be possible to tie into the existing root variable pass through too? Will need to think about how to set the config flags.

ewels added this to the MultiQC v1.9 milestone Nov 13, 2019
ewels modified the milestones: MultiQC v1.9, MultiQC v1.10 May 30, 2020
ewels added a commit that referenced this issue Jul 4, 2021
…i flag

Forces modules to use the log filename for the sample identifier, even if the module usually takes this from the file contents

See #949 #890 and #864
Member

ewels commented Jul 4, 2021

Hi @cgirardot,

This has now been added in fa84c47 and will be included in the MultiQC v1.11 release.

The documentation for this new feature is here: https://multiqc.info/docs/#using-log-filenames-as-sample-names

Pasted here for convenience:

Using log filenames as sample names

A substantial number of MultiQC modules take the sample name identifiers that you see in the report from the file contents - typically the filename of the input file. This is because log files can often be called things like mytool.log or even concatenated. Using the input filename used by the tool is typically safer and more consistent across modules.

However, sometimes this does not work well. For example, the input filename may not be relevant (e.g. when using a temporary file or FIFO, process substitution, or stdin). In these cases your log files may have useful filenames, but MultiQC will not be using them.

To force MultiQC to use the log filename as the sample identifier, you can use the --fn_as_s_name command line flag or set the use_filename_as_sample_name config option:

use_filename_as_sample_name: true

This affects all modules and all search patterns. If you want to limit this to one or more specific search patterns, you can do so by giving a list:

use_filename_as_sample_name:
  - cutadapt
  - picard/gcbias
  - picard/markdups

Note that this should be the search pattern key (see above) and not just the module name. This is because some modules search for multiple files.
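
As a rough sketch of the behaviour described above (not MultiQC's actual implementation), the option can be read as either a boolean or a list of search pattern keys. The function names `use_filename` and `pick_sample_name` are hypothetical and exist only for this illustration.

```python
def use_filename(config_value, search_pattern_key):
    """Decide whether the log filename should replace the name from the file contents.

    config_value mirrors use_filename_as_sample_name: True, False, or a list of
    search pattern keys. Hypothetical helper, for illustration only.
    """
    if config_value is True:
        return True
    if isinstance(config_value, (list, tuple, set)):
        # Exact match on the search pattern key, not on the module name.
        return search_pattern_key in config_value
    return False


def pick_sample_name(config_value, search_pattern_key, log_filename, name_from_contents):
    """Return whichever name the config selects for this search pattern."""
    if use_filename(config_value, search_pattern_key):
        return log_filename
    return name_from_contents


patterns = ["cutadapt", "picard/gcbias", "picard/markdups"]
print(use_filename(patterns, "picard/gcbias"))  # key is listed
print(use_filename(patterns, "picard"))         # module name alone is not a key
```

The second call shows why the note above matters: under this reading, listing `picard/gcbias` does not match a bare `picard`, and vice versa.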

The log filename will still be cleaned. To use the raw log filenames, combine with the --fullnames/-s flag or fn_clean_sample_names config option described above.
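
To illustrate the difference between cleaned and raw log filenames, here is a small sketch. The suffix list and the `clean_name` helper are assumptions made up for this example; they are not MultiQC's real cleaning rules or defaults.

```python
import os

# Hypothetical suffixes to strip; NOT MultiQC's actual default list.
EXAMPLE_CLEAN_EXTS = (".txt", ".log", ".fastq", ".gz")


def clean_name(log_path, clean=True, exts=EXAMPLE_CLEAN_EXTS):
    """Return the file's base name, optionally with known suffixes stripped."""
    name = os.path.basename(log_path)
    if not clean:
        # Roughly what disabling cleaning would give: the raw filename.
        return name
    stripped = True
    while stripped:
        stripped = False
        for ext in exts:
            # Never strip the whole name away.
            if name.endswith(ext) and len(name) > len(ext):
                name = name[: -len(ext)]
                stripped = True
    return name


path = "results/FastQC_Report_All_Reads_ATAC_DSP45.txt"
print(clean_name(path))               # cleaned name
print(clean_name(path, clean=False))  # raw filename
```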

I hope this is helpful! Sorry for the delay. Let me know if you hit any problems with it 👍🏻

ewels closed this as completed Jul 4, 2021