Description
Bug report
This is a slightly strange one. I'm encountering a problem where I feed Nextflow a URL that has been percent-encoded, in this case https://tolit.cog.sanger.ac.uk/test-data/Undibacterium_unclassified/genomic_data/baUndUnlc1/hic-arima2/41741_2%237.sub.cram, where a # is encoded as %23. The file is not available at the un-encoded URL.
The pipeline runs fine the first time, but upon resuming, Nextflow decodes the URL (replacing %23 with #), tries to stage the file under the decoded name, and fails. This behaviour does not happen if I remove the file() call in the initial input channel (but in my real case, I want to run checkIfExists ASAP).
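For illustration, here is a minimal sketch of how the encoded and decoded forms of this URL differ when parsed with `java.net.URI` from the JDK. It only shows that a percent-decoded path, once turned back into a URL string, no longer points at the same resource, because `#` then starts a fragment. Whether the resume path in Nextflow goes through this exact API is an assumption I have not verified; the class and helper names (`EncodedUrlDemo`, `rawName`, `decodedName`, `fragment`) are hypothetical.

```java
import java.net.URI;

// Hypothetical demo class; not part of Nextflow.
public class EncodedUrlDemo {

    // Last path segment with percent-escapes kept exactly as written.
    static String rawName(String url) {
        String path = URI.create(url).getRawPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Last path segment after percent-decoding (%23 becomes #).
    static String decodedName(String url) {
        String path = URI.create(url).getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Fragment component of the URL, or null when there is none.
    static String fragment(String url) {
        return URI.create(url).getFragment();
    }

    public static void main(String[] args) {
        String base = "https://tolit.cog.sanger.ac.uk/test-data/Undibacterium_unclassified"
                + "/genomic_data/baUndUnlc1/hic-arima2/";
        String encoded = base + "41741_2%237.sub.cram";
        String decoded = base + "41741_2#7.sub.cram";

        System.out.println(rawName(encoded));     // 41741_2%237.sub.cram
        System.out.println(decodedName(encoded)); // 41741_2#7.sub.cram
        // Once the decoded name is used as a URL, '#' begins a fragment,
        // so the path that would be requested is truncated.
        System.out.println(decodedName(decoded)); // 41741_2
        System.out.println(fragment(decoded));    // 7.sub.cram
    }
}
```

If something similar happens when the cached output of TEST is read back, it would match the resumed-run log, where the staged URL contains `#` instead of `%23`.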
Expected behavior and actual behavior
Actual behaviour: resume fails as Nextflow tries to stage the wrong remote path
Expected behaviour: resume should work and both processes should be cached.
Steps to reproduce the problem
process TEST {
    input:
    val file

    output:
    tuple val(file), val(integer), emit: output

    exec:
    integer = 1
}

process TEST2 {
    input:
    tuple path(file), val(integer)

    script:
    """
    echo ${file.getName()} > ${integer}.txt
    """
}

workflow {
    ch_input = Channel.of(file("https://tolit.cog.sanger.ac.uk/test-data/Undibacterium_unclassified/genomic_data/baUndUnlc1/hic-arima2/41741_2%237.sub.cram"))
    TEST(ch_input)
    TEST2(TEST.out.output)
}
Program output
Initial run:
> nextflow run main.nf -ansi-log false
N E X T F L O W ~ version 25.04.2
Launching `main.nf` [curious_elion] DSL2 - revision: a4a37a61e7
[fe/ed5463] Submitted process > TEST (1)
Staging foreign file: https://tolit.cog.sanger.ac.uk/test-data/Undibacterium_unclassified/genomic_data/baUndUnlc1/hic-arima2/41741_2%237.sub.cram
[d2/8204f8] Submitted process > TEST2 (1)
Resumed run:
> nextflow run main.nf -ansi-log false -resume
N E X T F L O W ~ version 25.04.2
Launching `main.nf` [tender_hodgkin] DSL2 - revision: a4a37a61e7
[fe/ed5463] Cached process > TEST (1)
Staging foreign file: https://tolit.cog.sanger.ac.uk/test-data/Undibacterium_unclassified/genomic_data/baUndUnlc1/hic-arima2/41741_2#7.sub.cram
WARN: Unable to stage foreign file: https://tolit.cog.sanger.ac.uk/test-data/Undibacterium_unclassified/genomic_data/baUndUnlc1/hic-arima2/41741_2#7.sub.cram (try 1 of 3) -- Cause: Unable to access path: https://tolit.cog.sanger.ac.uk/test-data/Undibacterium_unclassified/genomic_data/baUndUnlc1/hic-arima2/41741_2#7.sub.cram
WARN: Unable to stage foreign file: https://tolit.cog.sanger.ac.uk/test-data/Undibacterium_unclassified/genomic_data/baUndUnlc1/hic-arima2/41741_2#7.sub.cram (try 2 of 3) -- Cause: Unable to access path: https://tolit.cog.sanger.ac.uk/test-data/Undibacterium_unclassified/genomic_data/baUndUnlc1/hic-arima2/41741_2#7.sub.cram
WARN: Unable to stage foreign file: https://tolit.cog.sanger.ac.uk/test-data/Undibacterium_unclassified/genomic_data/baUndUnlc1/hic-arima2/41741_2#7.sub.cram (try 3 of 3) -- Cause: Unable to access path: https://tolit.cog.sanger.ac.uk/test-data/Undibacterium_unclassified/genomic_data/baUndUnlc1/hic-arima2/41741_2#7.sub.cram
ERROR ~ Error executing process > 'TEST2 (1)'
Caused by:
Can't stage file https://tolit.cog.sanger.ac.uk/test-data/Undibacterium_unclassified/genomic_data/baUndUnlc1/hic-arima2/41741_2#7.sub.cram -- reason: Unable to access path: https://tolit.cog.sanger.ac.uk/test-data/Undibacterium_unclassified/genomic_data/baUndUnlc1/hic-arima2/41741_2#7.sub.cram
Command executed:
echo 41741_2#7.sub.cram > 1.txt
Command exit status:
-
Command output:
(empty)
Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`
-- Check '.nextflow.log' file for details
Environment
- Nextflow version: 25.04.2.5947
- Java version: openjdk 17.0.10 2024-01-16
- Operating system: macOS - but can replicate on Linux
Additional context
N/A