Tests failing because test.yml strings present in file are not found. #2118

Closed
mahesh-panchal opened this issue Dec 9, 2022 · 14 comments
Labels
bug Something isn't working

Comments

@mahesh-panchal
Member

Description of the bug

https://github.com/nf-core/modules/actions/runs/3657880825/jobs/6181972088

The test workflow fails to find the strings in a file even though the output is there:

  files:
    - path: output/purgedups/test.dups.bed
      md5sum: f5282a0c87eee32d5fe665284572b794
    - path: output/purgedups/test.purge_dups.log
      contains:  # These strings not found
        - "tp:"
        - "check overpuring"
    - path: output/purgedups/versions.yml

but the log file looks like:

...
tp:ERR5069949.3258358:�
tp:ERR5069949.1476386:ӰU
tp:ERR5069949.2415814:
[M::flt_by_bm_mm] check overpuring

Command used and terminal output

$ nf-core modules test purgedups/purgedups

                                          ,--./,-.
          ___     __   __   __   ___     /,-._.--~\
    |\ | |__  __ /  ` /  \ |__) |__         }  {
    | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                          `._,._,'

    nf-core/tools version 2.7.1 - https://nf-co.re


INFO     Press enter to use default values (shown in brackets) or type your own responses                                                                                                      
? Choose software profile Docker
INFO     Setting environment variable '$PROFILE' to 'docker'                                                                                                                                   
───────────────────────────────────────────────────────────────────────────────────── purgedups/purgedups ─────────────────────────────────────────────────────────────────────────────────────
INFO     Running pytest for module 'purgedups/purgedups'                                                                                                                                       
===================================================================================== test session starts =====================================================================================
platform linux -- Python 3.9.15, pytest-7.2.0, pluggy-1.0.0
rootdir: /workspace/modules, configfile: pytest.ini
plugins: workflow-1.6.0
collecting ... 
collected 1329 items                                                                                                                                                                          

purgedups purgedups test_purgedups_purgedups:
        command:   nextflow run ./tests/modules/nf-core/purgedups/purgedups -entry test_purgedups_purgedups -c ./tests/config/nextflow.config -c ./tests/modules/nf-core/purgedups/purgedups/nextflow.config
        directory: /tmp/pytest_workflow__eo11vac/purgedups_purgedups_test_purgedups_purgedups
        stdout:    /tmp/pytest_workflow__eo11vac/purgedups_purgedups_test_purgedups_purgedups/log.out
        stderr:    /tmp/pytest_workflow__eo11vac/purgedups_purgedups_test_purgedups_purgedups/log.err
'purgedups purgedups test_purgedups_purgedups' done.

tests/test_versions_yml.py Exception in thread Thread-3:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/threading.py", line 980, in _bootstrap_inner
ss    self.run()
s  File "/opt/conda/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
s  File "/opt/conda/lib/python3.9/site-packages/pytest_workflow/content_tests.py", line 149, in find_strings
    self.found_strings = check_content(
s  File "/opt/conda/lib/python3.9/site-packages/pytest_workflow/content_tests.py", line 52, in check_content
    for line in text_lines:
  File "/opt/conda/lib/python3.9/codecs.py", line 322, in decode
s    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 1886: invalid start byte
ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 11%]
sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 25%]
sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 39%]
sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 53%]
sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 66%]
sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss.sssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 80%]
sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 94%]
ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss                                                                                                                    [ 99%]
tests/modules/nf-core/purgedups/purgedups/test.yml ...FF..                                                                                                                              [100%] Keeping temporary directories and logs. Use '--kwd' or '--keep-workflow-wd' to disable this behaviour.


========================================================================================== FAILURES ===========================================================================================
________________________________________________________________________________________ test session _________________________________________________________________________________________
'tp:' was not found in /tmp/pytest_workflow__eo11vac/purgedups_purgedups_test_purgedups_purgedups/output/purgedups/test.purge_dups.log while it should be there.
________________________________________________________________________________________ test session _________________________________________________________________________________________
'check overpuring' was not found in /tmp/pytest_workflow__eo11vac/purgedups_purgedups_test_purgedups_purgedups/output/purgedups/test.purge_dups.log while it should be there.
====================================================================================== warnings summary =======================================================================================
../../opt/conda/lib/python3.9/site-packages/pytest_workflow/plugin.py:122: 773 warnings
  /opt/conda/lib/python3.9/site-packages/pytest_workflow/plugin.py:122: PytestRemovedIn8Warning: The (fspath: py.path.local) argument to YamlFile is deprecated. Please use the (path: pathlib.Path) argument instead.
  See https://docs.pytest.org/en/latest/deprecations.html#fspath-argument-for-node-constructors-replaced-with-pathlib-path
    return YamlFile.from_parent(parent, fspath=path)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=================================================================================== short test summary info ===================================================================================
FAILED tests/modules/nf-core/purgedups/purgedups/test.yml::purgedups purgedups test_purgedups_purgedups::output/purgedups/test.purge_dups.log::content::contains 'tp:'
FAILED tests/modules/nf-core/purgedups/purgedups/test.yml::purgedups purgedups test_purgedups_purgedups::output/purgedups/test.purge_dups.log::content::contains 'check overpuring'
================================================================== 2 failed, 6 passed, 1321 skipped, 773 warnings in 29.01s ===================================================================

System information

Nextflow version: version 22.10.1
Hardware: Cloud (Gitpod)
Executor: local
OS: Unix (Gitpod)
nf-core/tools: 2.7.1
python: 3.9.15

@mahesh-panchal mahesh-panchal added the bug Something isn't working label Dec 9, 2022
@awgymer
Contributor

awgymer commented Dec 9, 2022

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc6 in position 764: invalid continuation byte

Seems to me this may actually be a case where the file fails to decode, but the eventual "failure" is just reported as the value not being present?
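A minimal sketch of that hypothesis, assuming the content check runs in a worker thread as the traceback above suggests. The `ContentCheck` class and its attributes are hypothetical stand-ins, not pytest-workflow's actual code, and the sample bytes are made up to resemble the log:

```python
import threading


class ContentCheck:
    """Hypothetical threaded content check (illustration only)."""

    def __init__(self, data: bytes, strings: list):
        self.data = data
        self.strings = strings
        self.found_strings = set()  # stays empty if the worker thread dies
        self.thread = threading.Thread(target=self._find_strings)

    def _find_strings(self) -> None:
        # Raises UnicodeDecodeError on bytes that are not valid UTF-8.
        text = self.data.decode("utf-8")
        self.found_strings = {s for s in self.strings if s in text}

    def run(self) -> set:
        self.thread.start()
        # join() does not re-raise the worker's exception; the traceback
        # is only printed to stderr by the default threading excepthook.
        self.thread.join()
        return self.found_strings


wanted = ["tp:", "check overpuring"]
log = b"tp:ERR5069949.3258358:\xff\n[M::flt_by_bm_mm] check overpuring\n"
found = ContentCheck(log, wanted).run()
missing = [s for s in wanted if s not in found]
print(missing)  # both strings are reported missing although they are in the bytes
```

The decode error surfaces only as a traceback on stderr, while the caller sees an empty result and reports every string as "not found".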

@mahesh-panchal
Member Author

I'm not sure what I can do about those funky characters. What's the protocol here then?

@awgymer
Contributor

awgymer commented Dec 9, 2022

Probably the only actual solution is to open an issue with pytest-workflow.

They could maybe just use errors="ignore" in their open call, but that has downsides. More comprehensive attempts to sniff the encoding or fall back to other encodings are doable, but I don't know whether they'd implement them.
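Both options can be sketched roughly as follows. The helper name `decode_with_fallback` and the sample bytes are assumptions made for illustration; nothing here is taken from pytest-workflow:

```python
from typing import Iterable


def decode_with_fallback(data: bytes, encodings: Iterable[str] = ("utf-8", "latin-1")) -> str:
    """Try each encoding in order and return the first successful decode."""
    for enc in encodings:
        try:
            return data.decode(enc)
        except UnicodeDecodeError:
            continue
    # Last resort: decode as UTF-8 and mark undecodable bytes with U+FFFD.
    return data.decode("utf-8", errors="replace")


# Made-up bytes resembling the log: one valid UTF-8 pair, one invalid byte.
raw = b"tp:ERR5069949.1476386:\xd3\xb0U\xff\n[M::flt_by_bm_mm] check overpuring\n"

# Option 1: errors="ignore" silently drops the bytes it cannot decode.
ignored = raw.decode("utf-8", errors="ignore")

# Downside: dropping bytes can create matches that are not in the file.
false_match = "overpuring" in b"over\xffpuring".decode("utf-8", errors="ignore")

# Option 2: fall back to an encoding that accepts the bytes.
text = decode_with_fallback(raw)
```

The same `errors` argument is accepted by the built-in `open()`, so the first option amounts to a one-argument change, at the cost of hiding corruption.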

@fabianegli
Contributor

The problem is that some file is apparently written with utf-16 and then read as utf-8. I think it would make sense to try and define the encoding upon file creation.

@mahesh-panchal
Member Author

Not sure how to change that, then. The locale inside the Docker container is UTF-8:

LANG=C.UTF-8
LANGUAGE=
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_PAPER="C.UTF-8"
LC_NAME="C.UTF-8"
LC_ADDRESS="C.UTF-8"
LC_TELEPHONE="C.UTF-8"
LC_MEASUREMENT="C.UTF-8"
LC_IDENTIFICATION="C.UTF-8"
LC_ALL=

I even ran nf-core modules test purgedups/purgedups with export LANG=C.UTF-8 in the module script block; the file still shows strange characters, and the same error as above is thrown when the contains entries are added back to test.yml.

@awgymer
Contributor

awgymer commented Dec 19, 2022

I would assume it's down to the purgedups tool rather than anything you can control.

So personally I still think getting the testing framework to allow multiple encodings is better than trying to persuade each individual tool maintainer to output only in utf-8 encoding but 🤷🏼‍♂️

@mahesh-panchal
Member Author

I think you're right there. I'm not sure it's something I can chase up, though, particularly as I'm technically on leave now.

I think though, from the perspective here, that this is something that can't be solved. Checking for file existence is OK for the moment. Perhaps if/when nf-test is rolled out across modules, it'll be possible there. Thanks for investigating though.

@fabianegli
Contributor

I just tried running the test on my machine - and they all pass.

What profile did you use for the tests?

Here's my locale - maybe it can help you:

LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"

@mahesh-panchal
Member Author

> I just tried running the test on my machine - and they all pass.
>
> What profile did you use for the tests?
>
> Here's my locale - maybe it can help you:
>
> LANG="en_US.UTF-8"
> LC_COLLATE="en_US.UTF-8"
> LC_CTYPE="en_US.UTF-8"
> LC_MESSAGES="en_US.UTF-8"
> LC_MONETARY="en_US.UTF-8"
> LC_NUMERIC="en_US.UTF-8"
> LC_TIME="en_US.UTF-8"
> LC_ALL="en_US.UTF-8"

Including with the contains that was removed?

@fabianegli
Contributor

That's what I did:

mkdir testdir
cd testdir
python3.7 -m venv venv
. ./venv/bin/activate
pip install nf-core
git clone --depth 1 https://github.com/nf-core/modules.git
cd modules
export PROFILE=docker
nf-core modules test purgedups/purgedups

I think I know what you mean now - the failing tests are not in the module?

@mahesh-panchal
Member Author

Exactly. The part that fails is not included in the module. It's only described above.

@fabianegli
Contributor

OK, this log is malformed. This should probably be reported to the purgedups devs.

The generated file is a malformed UTF-8 file. See attachment.

test.purge_dups.log

@fabianegli
Contributor

[Image: Screenshot 2022-12-20 at 15 20 54]

Line 34 contains one of the problematic sequences ('E/\x14V').
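One way to list such lines without a hex viewer is to decode the file line by line and record where decoding fails. This is a rough sketch: the function name `invalid_utf8_lines` is hypothetical and the sample bytes are made up, since the real log's bytes are only available in the attachment. It reports the first failure on each line only:

```python
def invalid_utf8_lines(data: bytes) -> list:
    """Return (line_number, byte_offset_in_line, offending_bytes) per bad line."""
    problems = []
    # Splitting on b"\n" is safe: 0x0A never occurs inside a multi-byte
    # UTF-8 sequence.
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as err:
            # err.start / err.end delimit the bytes the decoder rejected.
            problems.append((lineno, err.start, line[err.start:err.end]))
    return problems


sample = (
    b"tp:ERR5069949.3258358:\xff\n"
    b"tp:ERR5069949.2415814:\n"
    b"[M::flt_by_bm_mm] check overpuring\n"
)
for lineno, offset, bad in invalid_utf8_lines(sample):
    print(f"line {lineno}, byte {offset}: {bad!r}")
```

For a file on disk, the bytes could be obtained with `pathlib.Path(path).read_bytes()` before calling the function.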

@fabianegli
Contributor

LUMC/pytest-workflow#161 (comment)

pytest-workflow now implements an encoding specifier. In this case I believe it to be "latin-1".
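A short illustration of why "latin-1" is a workable choice here: it maps each of the 256 byte values to exactly one character, so decoding cannot fail and plain ASCII substrings remain searchable. The sample bytes are made up. In test.yml the setting is assumed to be an `encoding: latin-1` entry next to `contains`; the exact key should be checked against the pytest-workflow documentation.

```python
# latin-1 accepts every byte value, so this decode can never raise.
every_byte = bytes(range(256))
as_text = every_byte.decode("latin-1")
round_trip_ok = as_text.encode("latin-1") == every_byte

# Made-up log bytes containing sequences that are invalid as UTF-8.
log = b"tp:ERR5069949.3258358:\xff\xc6\n[M::flt_by_bm_mm] check overpuring\n"
decoded = log.decode("latin-1")

# The strings from the test.yml contains list are now found.
results = {s: s in decoded for s in ("tp:", "check overpuring")}
print(results)
```

The non-ASCII bytes decode to arbitrary Latin-1 characters, so this makes the ASCII checks pass without implying the tool actually writes Latin-1 text.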
