generate unit test with snakemake but missing quote for configfile in generated test scripts #843

Open
Grelot opened this issue Jan 18, 2021 · 4 comments
Labels
bug Something isn't working

Comments

@Grelot

Grelot commented Jan 18, 2021

snakemake 5.31.1

I first ran my workflow successfully:

snakemake --configfile config/config_test_rapidrun.yaml -s Snakefile --cores 8 --use-conda

Then I ran the workflow again to generate unit test scripts with --generate-unit-tests:

 snakemake --configfile config/config_test_rapidrun.yaml -s Snakefile  --cores 8 --use-conda --generate-unit-tests

Then I ran pytest with Python 3:

python3 -m pytest .tests/unit/

Then all the tests failed. It appears that the path of the configfile is not quoted inside the scripts:

Example error:

========================================================================== ERRORS ==========================================================================
____________________________________________________ ERROR collecting .tests/unit/test_trim_primers.py _____________________________________________________
../../../miniconda3/envs/snakemake_rapidrun/lib/python3.9/site-packages/_pytest/python.py:578: in _importtestmodule
    mod = import_path(self.fspath, mode=importmode)
../../../miniconda3/envs/snakemake_rapidrun/lib/python3.9/site-packages/_pytest/pathlib.py:531: in import_path
    importlib.import_module(module_name)
../../../miniconda3/envs/snakemake_rapidrun/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1030: in _gcd_import
    ???
<frozen importlib._bootstrap>:1007: in _find_and_load
    ???
<frozen importlib._bootstrap>:986: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:680: in _load_unlocked
    ???
../../../miniconda3/envs/snakemake_rapidrun/lib/python3.9/site-packages/_pytest/assertion/rewrite.py:161: in exec_module
    source_stat, co = _rewrite_test(fn, self.config)
../../../miniconda3/envs/snakemake_rapidrun/lib/python3.9/site-packages/_pytest/assertion/rewrite.py:354: in _rewrite_test
    tree = ast.parse(source, filename=fn_)
../../../miniconda3/envs/snakemake_rapidrun/lib/python3.9/ast.py:50: in parse
    return compile(source, filename, mode, flags,
E     File "/home/pguerin/working/edna/snakemake_rapidrun_swarm/.tests/unit/test_trim_primers.py", line 37
E       /home/pguerin/working/edna/snakemake_rapidrun_swarm/config/config_test_rapidrun.yaml
E       ^
E   SyntaxError: invalid syntax

And the faulty code:

        # Run the test job.
        sp.check_output([
            "python",
            "-m",
            "snakemake",
            "results/04_trim_primers/projet1/chond/sample2.fastq",
            "-F",
            "-j1",
            "--keep-target-files",
            "--configfile",
            /home/pguerin/working/edna/snakemake_rapidrun_swarm/config/config_test_rapidrun.yaml

            "--use-conda",
            "--directory",
            workdir,
        ])
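The SyntaxError arises because the path is written into the test file as bare text rather than as a string literal. The following is a minimal sketch of that difference using only the standard library. It does not reproduce Snakemake's actual generator; `render_args`, `is_valid_python`, and the example path are hypothetical names chosen for illustration.

```python
import ast


def render_args(args):
    """Render CLI arguments as the source text of a Python list literal."""
    body = "".join(f"    {arg!r},\n" for arg in args)
    return "[\n" + body + "]"


def is_valid_python(source):
    """Return True if the text parses as Python source."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# Hypothetical path, for illustration only.
configfile = "/home/user/project/config/config.yaml"

# Bare path pasted into the list, as in the generated test above.
broken = '[\n    "--configfile",\n    ' + configfile + "\n]"

# Path emitted through repr(), so it becomes a quoted string literal.
fixed = render_args(["--configfile", configfile])

print(is_valid_python(broken))
print(is_valid_python(fixed))
```

The unquoted variant fails to parse, while the variant rendered through `repr()` parses and round-trips to the original argument list.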
@Grelot Grelot added the bug Something isn't working label Jan 18, 2021
@sroener

sroener commented Feb 4, 2021

I have the same issue when defining a configfile with --configfile.

If I define a configfile in my Snakefile like so:

configfile: "config/config.yaml"

Pytest throws the following error:

E               subprocess.CalledProcessError: Command '['python', '-m', 'snakemake', 'results/workflow/GRCh38/A.fa.gz', '-F', '-j1', '--keep-target-files', '--use-conda', '--directory', PosixPath('/tmp/tmplu3iwqww/workdir')]' returned non-zero exit status 1.

../../anaconda3/envs/snakemake_tests/lib/python3.8/subprocess.py:512: CalledProcessError
------------------------------------------------------------------------------ Captured stderr call -------------------------------------------------------------------------------
results/workflow/GRCh38/A.fa.gz
WorkflowError in line 20 of /home/user/git_group/new_project/workflow/Snakefile:
Config file config/config.yaml not found.
  File "/home/user/git_group/new_project/workflow/Snakefile", line 20, in <module

Interestingly, I can reproduce the WorkflowError by specifying a temporary directory with -d temp/. I assume that snakemake looks for external config files relative to this new working directory and does not find any. Following up on this assumption, I tried using absolute paths, which led to the tests running (as in finding input files) but failing with the following error:

E           ValueError: Unexpected files:
E           results/logs/createRandomIntervals.GRCh38.A.log
E           results/logs/extendInterval.GRCh38.A.log
E           results/logs/getSequence.GRCh38.A.log
E           results/workflow/GRCh38/A.bed.gz

.tests/unit/common.py:41: ValueError

I hope this helps

EDIT:

After playing a while with an example workflow I figured these things out to make the tests pass:

  • Files not directly included in a rule (e.g. the configfile) have to be specified with a full path. I assume they are not copied to the temp dir because they are not direct inputs of the rules.
  • Rules must declare all files they produce (even by-products), otherwise tests will fail with ValueError: Unexpected files. This happened to me with bedtools getfasta, which created a .fai file that was missing from the rule.
  • Tests do not seem to work properly with log files.
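The first point can be illustrated with a small sketch. It assumes, as the comment does, that a relative config path is looked up against the test working directory. The directory layout and the `resolve_config` helper are hypothetical and not part of Snakemake.

```python
import tempfile
from pathlib import Path


def resolve_config(configfile, project_root):
    """Return an absolute path for a config file given relative to project_root."""
    path = Path(configfile)
    return path if path.is_absolute() else (Path(project_root) / path).resolve()


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: a project with a config file and a separate test workdir.
    project = Path(tmp) / "project"
    workdir = Path(tmp) / "workdir"
    (project / "config").mkdir(parents=True)
    workdir.mkdir()
    (project / "config" / "config.yaml").write_text("samples: samples.tsv\n")

    relative = Path("config/config.yaml")
    # Looked up against the workdir, the relative path does not exist.
    found_relative = (workdir / relative).exists()
    # Made absolute against the project root, it does.
    found_absolute = resolve_config(relative, project).exists()
    print(found_relative, found_absolute)
```

Under that assumption, the relative lookup fails in the temporary directory while the absolute path still points at the original file.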

@marrip

marrip commented Feb 23, 2021

Hey,

I have the same problem. I define my config.yaml in common.smk, ran my workflow successfully, generated the unit tests automatically, and then wanted to run them with pytest. There are a couple of issues with that:

  • My workflow workdir is usually not the local repository, but snakemake generates the .tests directory in the workdir and not in the repository. Running the tests then throws an error because the Snakefile cannot be found, so I have to move my tests to the repo after generation.
  • The next problem is that config.yaml, samples.tsv and units.tsv are not added to the test directories (data and expected). The tests fail because these files are required by the workflow. I had to add them manually to all directories.
  • I understand that log files cannot be compared. However, their generation is part of the rules, which causes problems when pytest compares the output of the rule to the expected output. I added os.remove(...) to remove them prior to the check, but that is an ugly hack and I would like to get rid of it.
  • I use singularity for all my rules to make my workflows more portable. The autogenerated tests ignore container definitions, which makes the tests fail.
  • For two rules I receive a MissingRuleException, which is weird as I could run the workflow successfully. The output from pytest is rather confusing and does not contain any helpful information. It says it could be due to input functions, but I don't use input functions for these specific rules. I add the definition of one of them below.
rule mark_duplicates:
  input:
    "{sample}/{unit}/bwa_mem.bam"
  output:
    bam = "{sample}/{unit}/mark_duplicates.bam",
    metrics = "{sample}/{unit}/mark_duplicates.metrics",
  shell:
    "gatk MarkDuplicates "
      "--java-options "
      "-Xmx30g "
      "-I {input} "
      "-O {output.bam} "
      "-M {output.metrics} "
      "--TMP_DIR {wildcards.sample}/{wildcards.unit}"
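As an alternative to calling os.remove(...) on log files before the comparison, the check itself could skip files that match an ignore pattern. The sketch below is an assumption about how such a filter might look, not the code in the generated common.py. `unexpected_files` and its `ignore` parameter are hypothetical, and the file names are borrowed from the "Unexpected files" report earlier in this thread.

```python
from fnmatch import fnmatch


def unexpected_files(produced, expected, ignore=("*.log",)):
    """Return produced files that are neither expected nor matched by an ignore pattern."""
    expected = set(expected)
    return sorted(
        f
        for f in produced
        if f not in expected and not any(fnmatch(f, pat) for pat in ignore)
    )


produced = [
    "results/logs/getSequence.GRCh38.A.log",
    "results/workflow/GRCh38/A.bed.gz",
    "results/workflow/GRCh38/A.fa.gz",
]
expected = ["results/workflow/GRCh38/A.fa.gz"]

# Log files are skipped; undeclared by-products are still reported.
print(unexpected_files(produced, expected))
```

With this approach the log stays on disk for debugging, and an undeclared by-product such as the .bed.gz file is still flagged.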

Any kind of feedback would be highly appreciated 😄

@nikostr
Contributor

nikostr commented Apr 27, 2021

I think the following line needs to be updated:

and replaced with

    "{{ configfile }}",

But there would also need to be some changes in how the config files are handled. Would it make sense to simply copy the config directory to the workdir? But there seem to be several additional issues here.
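The idea of copying the config directory to the workdir could be sketched roughly as follows. This is only an illustration under assumed names: `copy_config_dir`, the `config` directory name, and the temporary layout are hypothetical and do not reflect how Snakemake's test generator is implemented.

```python
import shutil
import tempfile
from pathlib import Path


def copy_config_dir(project_root, workdir, config_dirname="config"):
    """Copy project_root/config_dirname into workdir and return the new path."""
    source = Path(project_root) / config_dirname
    target = Path(workdir) / config_dirname
    shutil.copytree(source, target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: a project with a config directory and an empty test workdir.
    project = Path(tmp) / "project"
    workdir = Path(tmp) / "workdir"
    (project / "config").mkdir(parents=True)
    workdir.mkdir()
    (project / "config" / "config.yaml").write_text("samples: samples.tsv\n")

    copied = copy_config_dir(project, workdir)
    # A relative "config/config.yaml" now resolves inside the workdir.
    config_found = (workdir / "config" / "config.yaml").is_file()
    print(config_found)
```

After the copy, a workflow that refers to config/config.yaml by relative path would find it inside the test working directory, provided all files it needs live under that one directory.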

@nikostr
Contributor

nikostr commented May 5, 2021

Okay, I've fiddled around a bit with this in the past few days. A result is here:

https://github.com/nikostr/dna-seq-deepvariant-glnexus-variant-calling/tree/main/.tests/unit

The changes I made to get this working:

This is of course possible to do manually, but I'd propose that this is a sensible update, together with clear instructions on how to generate the unit tests. These could specify that the config directory should contain all the config files necessary to run the tests.

kelly-sovacool added a commit to SchlossLab/mikropml-snakemake-workflow that referenced this issue Nov 6, 2022