
[pyos][tests] Most tests don't test anything #34

@sneakers-the-rat

Description

Part of: pyOpenSci/software-submission#267 (comment)

E.g. here:

```python
def test_doi_variants(self, run_onecite_process):
    """All common DOI notations should be recognised."""
    for doi in (
        "10.1038/nature14539",
        "doi:10.1038/nature14539",
        "DOI: 10.1038/nature14539",
        "https://doi.org/10.1038/nature14539",
    ):
        code, out, err, _ = run_onecite_process(doi, input_type="txt")
        assert code == 0, f"failed for {doi!r}: {err}"
        assert out.strip(), f"empty output for {doi!r}"
```

Allegedly we are testing whether these are correctly parsed as DOIs, but it tests nothing of the kind. It just asserts that processing didn't fail - which it arguably should, since the network requests that would fetch the metadata are mocked. This whole test module is like this, as are many of the other test modules.
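For contrast, a version that would actually pin down parsing might look something like this - a sketch, with `normalize_doi` as a hypothetical stand-in for whatever OneCite's parser actually exposes:

```python
import re

# Hypothetical normaliser standing in for OneCite's parser; a real test
# would import the package's own function instead of this stub.
def normalize_doi(raw: str) -> str:
    match = re.search(r"10\.\d{4,9}/\S+", raw)
    if match is None:
        raise ValueError(f"no DOI found in {raw!r}")
    return match.group(0)

def test_doi_variants_are_normalised():
    """Each notation should reduce to the same canonical DOI."""
    for raw in (
        "10.1038/nature14539",
        "doi:10.1038/nature14539",
        "DOI: 10.1038/nature14539",
        "https://doi.org/10.1038/nature14539",
    ):
        # assert on the parsed value, not just "nothing blew up"
        assert normalize_doi(raw) == "10.1038/nature14539", raw
```

The point being: the assertion is on the parsed *value*, so a regression in notation handling actually fails.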

E.g. here we don't mock the network, but we still don't test anything beyond "some bibtex was generated" - this does not test whether the input DOI was correctly parsed:

```python
code, out, err = _run(["process", f, "--quiet"])
```
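A sketch of what could be asserted instead - pull the doi field back out of the generated BibTeX and compare it to the input (the helper is hypothetical; `out` would come from the existing `_run` call):

```python
import re

# Hypothetical helper: extract the doi field from the generated BibTeX and
# compare it to the DOI that went in, so a mangled DOI actually fails.
def assert_bibtex_has_doi(bibtex: str, doi: str) -> None:
    match = re.search(r'doi\s*=\s*[{"]([^}"]+)[}"]', bibtex, re.IGNORECASE)
    assert match is not None, "no doi field in the generated BibTeX"
    assert match.group(1) == doi, f"expected {doi}, got {match.group(1)}"
```

Called as `assert_bibtex_has_doi(out, "10.1038/nature14539")`, this distinguishes "correct output" from "some output".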

this test's docstring says we shouldn't make any API calls, and yet, despite the means of actually testing whether that's the case being near at hand and used in other test modules, nothing is tested -

"""A fully-specified .bib entry shouldn't need any API calls."""

This whole test module tests nothing - https://github.com/HzaCode/OneCite/blob/main/tests/test_integration.py

Same with this one - https://github.com/HzaCode/OneCite/blob/main/tests/test_onecite_basic.py

Same with this one (those tests would be nice to have, and what should be tested there is whether the output formats are correct, rather than just that an error wasn't raised) - https://github.com/HzaCode/OneCite/blob/main/tests/test_output_formats.py
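E.g. format correctness can be checked at the grammar level without caring about the metadata values - a sketch, assuming the formats in play are BibTeX, CSL-JSON, and RIS:

```python
import json
import re

# Structural checks per format: grammar, balance, required tags -
# rather than just "no exception was raised".
def check_output_format(text: str, fmt: str) -> None:
    if fmt == "bibtex":
        assert re.match(r"@\w+\{[^,]+,", text.strip()), "bad entry header"
        assert text.count("{") == text.count("}"), "unbalanced braces"
    elif fmt == "csl-json":
        items = json.loads(text)  # must at least be valid JSON
        assert all("id" in item for item in items), "items need an id"
    elif fmt == "ris":
        lines = text.strip().splitlines()
        assert lines[0].startswith("TY  - "), "RIS must open with TY"
        assert lines[-1].startswith("ER  -"), "RIS must close with ER"
    else:
        raise ValueError(f"unknown format {fmt!r}")
```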

This module gets close and has some real parser tests in it, but given the LLM-generated nature of this package I actually have no confidence that the mocked API results are what they should be. This would be more believable using e.g. vcr - https://github.com/HzaCode/OneCite/blob/main/tests/test_pipeline_unit.py

Since most of the tests don't test anything, that explains why my results are so bad - true to form with LLMs, I get something, it's just that what I get is wrong.

Labels: bug