multiple fastq files #58

Closed
sandipmkale opened this issue Jan 20, 2022 · 15 comments

Comments

@sandipmkale

Hello,

I have 50 paired-end FASTQ files. Can I input them all together to chromap?

Thanks and regards

Sandip

@haowenz
Owner

haowenz commented Jan 20, 2022

As long as the command line does not exceed the length limit imposed by your operating system, you can use "," to concatenate the file paths, as in the following example.

chromap --preset atac -x index -r ref.fa -1 read1_1.fq,read2_1.fq,read3_1.fq -2 read1_2.fq,read2_2.fq,read3_2.fq -o output.bed

Make sure the files are listed in the same order for both read ends (in this example, both are in order 1->2->3).
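With 50 pairs, typing the two lists by hand is error-prone, so here is a minimal sketch of building and checking them in the shell. The helper names `join_by_comma` and `check_pairs` are made up for illustration, and the check assumes the `<name>_1.fq` / `<name>_2.fq` naming used in the example above; adjust the suffixes for other schemes.

```shell
# Join all arguments with commas: a b c -> a,b,c
# (the subshell body keeps the IFS change local to the function)
join_by_comma() (
  IFS=,
  printf '%s\n' "$*"
)

# Succeed only if the two comma-separated lists have the same length and
# every position pairs <name>_1.fq with <name>_2.fq (assumed naming scheme).
check_pairs() (
  r1_list=$1
  r2_list=$2
  IFS=,
  set -f                 # no filename expansion while splitting the lists
  set -- $r1_list        # read-1 files become the positional parameters
  for mate in $r2_list; do
    [ "$#" -gt 0 ] || exit 1                        # more mates than read-1 files
    [ "${1%_1.fq}" = "${mate%_2.fq}" ] || exit 1    # names differ at this position
    shift
  done
  [ "$#" -eq 0 ]         # fail if read-1 files are left over
)

# Possible usage (globs expand in sorted order, so both lists line up):
#   r1=$(join_by_comma read*_1.fq)
#   r2=$(join_by_comma read*_2.fq)
#   check_pairs "$r1" "$r2" && chromap --preset atac -x index -r ref.fa -1 "$r1" -2 "$r2" -o output.bed
```

Because the shell expands each glob in sorted order, the two lists should come out matched as long as the file names differ only in the `_1`/`_2` suffix; `check_pairs` is there to catch the cases where they do not.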

@sandipmkale
Author

sandipmkale commented Jan 20, 2022 via email

@haowenz
Owner

haowenz commented Jan 20, 2022

So far this is the only way.

@haowenz haowenz closed this as completed Jan 28, 2022
@jeremymsimon

Hi @haowenz, is it possible to enhance this so that users can specify * wildcards in their filenames instead? That would make the syntax similar to some other tools out there. For example:

chromap \
   --preset atac \
   -x index \
   -r ref.fa \
   -1 read*_1.fq \
   -2 read*_2.fq \
   -o output.bed

@haowenz
Owner

haowenz commented Aug 25, 2023

Chromap uses cxxopts to parse the command line, and it does not support that. The only relatively easy improvement I can think of is to accept a txt file listing the read file names as input.
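Until something like that exists in Chromap itself, a text file of read names can be turned into the comma-separated form on the user side. This is only a sketch of that workaround, not a Chromap feature; the function name `list_from_file` and the file names `r1_files.txt` / `r2_files.txt` are hypothetical, and it assumes one path per line.

```shell
# Turn a text file with one path per line into a single comma-separated
# string, skipping blank lines.
list_from_file() {
  grep -v '^[[:space:]]*$' "$1" | paste -sd, -
}

# Possible usage, assuming both files list the reads in the same order:
#   chromap --preset atac -x index -r ref.fa \
#     -1 "$(list_from_file r1_files.txt)" \
#     -2 "$(list_from_file r2_files.txt)" \
#     -o output.bed
```

The operating system's command-line length limit mentioned earlier in the thread still applies, since the expanded lists end up on the command line.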

@jeremymsimon

That could be helpful if the other option isn't possible! We often have multiple lanes worth of reads per sample, and multiple samples, so any way of helping do that programmatically would be great!

@mourisl
Collaborator

mourisl commented Aug 30, 2023

@jeremymsimon We have implemented wildcard support in read paths. Could you please try the https://github.com/swiftgenomics/chromap/tree/regexp-file-paths branch? If it works well on your data, we will merge it into the master branch.

@mourisl mourisl reopened this Aug 30, 2023
@jeremymsimon

Thanks @mourisl - I gave this a try but it didn't seem to read in the 2nd set of FASTQs. I ran this as:

chromap \
	-t 4 \
	--preset atac \
	-x GRCh38.primary_assembly.genome.chromap.idx \
	-r GRCh38.primary_assembly.genome.fa \
	-1 S19_*R1*.fastq.gz \
	-2 S19_*R3*.fastq.gz \
	-o ATAC_chromap_fragments_TEST.tsv \
	-b S19_*R2*.fastq.gz

but I suspect that because the * gets expanded to 2 filenames separated by a space, things don't get processed properly.

It doesn't error, but the log doesn't report reading anything other than the first set of files. Let me know if you intended for the wildcard to be specified differently

@haowenz
Owner

haowenz commented Sep 5, 2023

To allow Chromap to parse the wildcards, you have to put each path in quotation marks. Otherwise, the shell expands the pattern before Chromap sees it, and only the first file is used as input.
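The difference is easy to see with a small stand-in for the program. In this sketch `count_args` is a made-up helper that just reports how many arguments it received, and the lane file names are hypothetical.

```shell
# Print how many arguments a command would receive.
count_args() { printf '%s\n' "$#"; }

# Hypothetical lane files in a scratch directory, for illustration only.
demo_dir=$(mktemp -d)
touch "$demo_dir/S19_L001_R1.fastq.gz" "$demo_dir/S19_L002_R1.fastq.gz"

# Unquoted: the shell expands the pattern before the program starts.
count_args "$demo_dir"/S19_*R1*.fastq.gz     # prints 2
# Quoted: the pattern is passed through as one literal argument.
count_args "$demo_dir/S19_*R1*.fastq.gz"     # prints 1

rm -rf "$demo_dir"
```

Applied to the command above, that means writing the read options as `-1 "S19_*R1*.fastq.gz"`, `-2 "S19_*R3*.fastq.gz"` and `-b "S19_*R2*.fastq.gz"`, so that each option receives a single pattern for Chromap to expand itself.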

@jeremymsimon

Okay got it! It ran successfully for me, and gave an identical output to that of my previous run with each file named separately (separated by commas).

The only thing to note is that I needed to have my gcc v9.2.0 module loaded at execution, which seemingly was not needed before; otherwise I got a 'GLIBCXX_3.4.21' not found error.

Thanks!

@haowenz
Owner

haowenz commented Sep 5, 2023

Interesting. What is your default gcc version before loading the gcc module?

@jeremymsimon

v4.8.5 is the default on our system. I loaded v9.2.0 to install this branch of chromap and then needed to make sure it was loaded here at runtime

@haowenz
Owner

haowenz commented Sep 6, 2023

I see. GCC 4.8.5 is too old. Chromap needs at least GCC 7.3.0 to compile, so this is expected.
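For anyone who wants to check this before building, a rough sketch of a version comparison follows. The helper name `version_ge` is made up here, and it assumes plain dotted numeric versions such as 9.2.0; the 7.3.0 minimum is the figure given in this thread.

```shell
# Succeed if dotted version $1 is at least $2 (compares major.minor.patch;
# missing components count as 0).
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    split(a, x, "."); split(b, y, ".")
    for (i = 1; i <= 3; i++) {
      if (x[i] + 0 > y[i] + 0) exit 0
      if (x[i] + 0 < y[i] + 0) exit 1
    }
    exit 0
  }'
}

# Possible usage. Note that some GCC builds print only the major version for
# -dumpversion, in which case the minor and patch parts are treated as 0:
#   version_ge "$(gcc -dumpversion)" 7.3.0 || echo "gcc is older than 7.3.0" >&2
```

The runtime 'GLIBCXX_3.4.21' not found error is a separate symptom of the same mismatch: a binary built with the newer compiler needs the newer libstdc++ at run time too, which is why the gcc module had to stay loaded.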

@haowenz haowenz closed this as completed Sep 6, 2023
@jeremymsimon

Appreciate you implementing this! Will it get rolled into the next full release as well?

@haowenz
Owner

haowenz commented Sep 6, 2023

Yes. It has already been merged into the master branch, and it will be in the next release.
