implement support for tuples with optional file outputs #2710
Signed-off-by: Robrecht Cannoodt <rcannood@gmail.com>
Hi @rcannood, the CI tests are failing. Please check the details to see which tests failed. It may be that they were already failing on master. If you can confirm that the same tests were failing before your changes, please let me know here and we can disregard them. Otherwise, you'll need to address those test failures.
Thanks for taking a look at this PR, @bentsherman. Apologies for the failing test; that was a mistake on my end. It should be fixed now.
I think that test may have been just a clever way of testing multiple cases at once, but feel free to hijack it 🙂
Sorry, I was assessing the impact of having The main concern is what happens when a downstream task is expecting a triple and the 3rd element is |
I see. To some extent, if users want optional file outputs, they should know how to deal with the resulting output. In our case, we will map the output of a process with optional outputs to catch null values. The following code works with the current implementation in the PR:

nextflow.enable.dsl=2

process test_process1 {
  input:
    tuple val(id)
  output:
    tuple val(id), path("proc1_output.txt"), path("proc1_output2.txt", optional: true)
  script:
    """
    echo $id > proc1_output.txt
    if [[ "$id" == "foo" ]]; then
      echo $id > proc1_output2.txt
    fi
    """
}

process test_process2 {
  input:
    tuple val(id), path(input), path(input2)
  output:
    tuple val(id), path("proc2_output.txt")
  script:
    """
    cat $input > proc2_output.txt
    ${ input2 instanceof List && input2.isEmpty() ? "" : "cat ${input2} >> proc2_output.txt" }
    """
}

workflow {
  Channel.fromList( ["foo", "bar"] )
    | test_process1
    | map { it ->
      // replace a missing optional output (null) with an empty list
      if (!it[2]) it[2] = []
      it
    }
    | test_process2
}

However, it's a bit cumbersome to have to map null values to empty lists to get the code to work. I think it might make sense to allow the following:

nextflow.enable.dsl=2

process test_process1 {
  input:
    tuple val(id)
  output:
    tuple val(id), path("proc1_output.txt"), path("proc1_output2.txt", optional: true)
  script:
    """
    echo $id > proc1_output.txt
    if [[ "$id" == "foo" ]]; then
      echo $id > proc1_output2.txt
    fi
    """
}

process test_process2 {
  input:
    tuple val(id), path(input), path(input2, optional: true) // this setting allows input2 to be null
  output:
    tuple val(id), path("proc2_output.txt")
  script:
    """
    cat $input > proc2_output.txt
    ${ input2 ? "cat ${input2} >> proc2_output.txt" : "" }
    """ // <- no need to check if input2 is a list
}

workflow {
  Channel.fromList( ["foo", "bar"] )
    | test_process1 // no need to map output of test_process1
    | test_process2
}

What are your thoughts, @pditommaso?
I noticed in the documentation that the 'join' operator can also result in the creation of tuples that have null values. Would it be reasonable to accept this PR already, and work out how to deal with null inputs in processes in a separate issue? :)
That's a good point, I think this feature is useful, but some points need to be taken into consideration:
We discussed a bit with @jorgeaguileraseqera how to move forward on this. The idea is to add the option In a symmetry manner the
Closing in favor of #2893
This is a suggested implementation for #2678.
I spent a few hours coming up with a solution which seems to work. Running the following main.nf (called 'attempt 1' in #2678):
Results in the following output:
The "attempt 2" approach (tuple val(id), path("output.txt"), path("output2.txt") optional true), which is not a solution to my problem, still yields the same result:
If I make both the tuple and the path non-optional (tuple val(id), path("output.txt"), path("output2.txt")), I get the following (expected) output:
Please let me know if the suggested implementation could be of use.