Temp file gets removed early if it is still an input of an unfinished rule after a checkpoint, resulting in an early finish #1930

Closed
yanhui09 opened this issue Oct 25, 2022 · 2 comments
Labels: bug (Something isn't working)


@yanhui09

Snakemake version

7.16.0

Describe the bug

Temp files get deleted too early even when they are still required as input of an unfinished rule after a checkpoint, resulting in an incomplete early finish like #823. It happens when the checkpoint output is a directory and the directory contents, together with a temp output of an earlier rule, are inputs of the downstream rules. It seems fine when the checkpoint output is just a file.

I'm not sure whether this is a bug similar to #823 or just a limitation of how the workflow is written, so I'm reporting it as a new issue.

Logs

Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job          count    min threads    max threads
---------  -------  -------------  -------------
all              1              1              1
process01        1              1              1
process02        1              1              1
somestep         1              1              1
total            4              1              1

Select jobs to execute...

[Tue Oct 25 17:23:02 2022]
rule process01:
    output: processed01.txt
    jobid: 3
    reason: Missing output files: processed01.txt
    resources: tmpdir=/tmp

[Tue Oct 25 17:23:02 2022]
Finished job 3.
1 of 4 steps (25%) done
Select jobs to execute...

[Tue Oct 25 17:23:02 2022]
rule process02:
    input: processed01.txt
    output: processed02.txt
    jobid: 2
    reason: Missing output files: processed02.txt; Input files updated by another job: processed01.txt
    resources: tmpdir=/tmp

[Tue Oct 25 17:23:02 2022]
Finished job 2.
2 of 4 steps (50%) done
Removing temporary output processed01.txt.
Select jobs to execute...

[Tue Oct 25 17:23:02 2022]
checkpoint somestep:
    input: processed02.txt
    output: my_directory
    jobid: 1
    reason: Missing output files: my_directory; Input files updated by another job: processed02.txt
    resources: tmpdir=/tmp
Downstream jobs will be updated after completion.

[Tue Oct 25 17:23:02 2022]
Finished job 1.
3 of 4 steps (75%) done
BUG: Out of jobs ready to be started, but not all files built yet. Please check https://github.com/snakemake/snakemake/issues/823 for more information.
Remaining jobs:
 - all: 
 - process: processed1/3.txt
 - process: processed1/2.txt
 - process: processed1/1.txt
Complete log: .snakemake/log/2022-10-25T172301.913665.snakemake.log

Minimal example

# a target rule to define the desired final output
rule all:
    input:
        lambda wildcards: aggregate_input(wildcards),

rule process01:
    output:
        temp("processed01.txt"),
    shell:
        "echo PROCESSED > {output}"

rule process02:
    input:
        "processed01.txt"
    output:
        "processed02.txt",
    shell:
        "echo PROCESSED > {output}"

# the checkpoint that shall trigger re-evaluation of the DAG
# a number of files is created in a defined directory
checkpoint somestep:
    input:
        "processed02.txt"
    output:
        directory("my_directory/"),
    shell:
        """
        mkdir my_directory/
        cd my_directory
        for i in 1 2 3; do touch $i.txt; done
        """

rule process:
    input:
        "processed01.txt",
        "my_directory/{i}.txt",
    output:
        "processed1/{i}.txt",
    shell:
        "echo PROCESSED > {output}"

# collect process output
def aggregate_input(wildcards):
    checkpoint_output = checkpoints.somestep.get(**wildcards).output[0]
    return expand(
        "processed1/{i}.txt",
        i=glob_wildcards(os.path.join(checkpoint_output, "{i}.txt")).i,
    )

Additional context

The unfinished jobs can be completed by a re-run, which re-creates the temp files. With the --keep-going flag, the workflow exits with code 0 but still stops at the same step. It only finishes successfully in a single run when the temp files are kept, e.g. with --notemp (see the sketch below).
This behaviour seems to violate the contract of temp(), which should only delete a file after all rules that use it as input have completed. It is also a bit unpleasant to have to re-collect the output. 😿
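
For clarity, the workarounds above correspond roughly to invocations like these:

# plain re-run after the early exit; the deleted temp files are re-created
snakemake --cores 1
# exits with code 0 despite the unfinished jobs, but still stops at the same step
snakemake --cores 1 --keep-going
# keeps temp() outputs, so the workflow finishes in one go
snakemake --cores 1 --notemp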

yanhui09 added the bug label on Oct 25, 2022
@yanhui09
Author

temp() works correctly if the temp file is also an input of the checkpoint. Maybe something related to DAG re-evaluation at the checkpoint?
The workflow below works.

# a target rule to define the desired final output
rule all:
    input:
        lambda wildcards: aggregate_input(wildcards),

rule process01:
    output:
        temp("processed01.txt"),
    shell:
        "echo PROCESSED > {output}"

rule process02:
    input:
        "processed01.txt"
    output:
        "processed02.txt",
    shell:
        "echo PROCESSED > {output}"

# the checkpoint that shall trigger re-evaluation of the DAG
# a number of files is created in a defined directory
checkpoint somestep:
    input:
        "processed01.txt",
        "processed02.txt"
    output:
        directory("my_directory/"),
    shell:
        """
        mkdir my_directory/
        cd my_directory
        for i in 1 2 3; do touch $i.txt; done
        """

rule process:
    input:
        "processed01.txt",
        "my_directory/{i}.txt",
    output:
        "processed1/{i}.txt",
    shell:
        "echo PROCESSED > {output}"

# collect process output
def aggregate_input(wildcards):
    checkpoint_output = checkpoints.somestep.get(**wildcards).output[0]
    return expand(
        "processed1/{i}.txt",
        i=glob_wildcards(os.path.join(checkpoint_output, "{i}.txt")).i,
    )

@yanhui09
Author

This can be avoided by adding the temp file to the checkpoint input, so I'm closing the issue.
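
For anyone skimming the thread, the relevant change compared with the first example is just the extra temp file in the checkpoint input:

checkpoint somestep:
    input:
        "processed01.txt",   # the added input; keeps the temp file around for the downstream process jobs
        "processed02.txt"
    output:
        directory("my_directory/"),
    shell:
        """
        mkdir my_directory/
        cd my_directory
        for i in 1 2 3; do touch $i.txt; done
        """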

ThomasBrazier pushed a commit to ThomasBrazier/snpArcher-dev that referenced this issue Apr 5, 2024