
Inconsistent test results when using different runners #28

Open · pfalcon opened this issue Sep 14, 2019 · 7 comments

Comments

@pfalcon (Contributor) commented Sep 14, 2019

The docs at https://ppci.readthedocs.io/en/latest/development.html#running-the-testsuite present three ways to run the testsuite, and the way it's worded there, one can only assume that they are all equivalent. However, trying them results in a different number of tests run:

  1. python -m unittest discover -s test. This would be the default way, as it uses the builtin Python module. But:
     Ran 165 tests in 0.483s

     OK (skipped=48)
  2. python -m pytest -v test/. Quite a different result:
     ====== 1338 passed, 672 skipped, 2 warnings in 16.79s ======
  3. tox -e py3. This gives the biggest coverage:
     1473 passed, 537 skipped, 2 warnings in 36.43s

It would be nice to know the reason for the discrepancies and do something about them (ideally, make them all run the same number of tests, rather than leaving only one way to run them ;-) ).

@pfalcon (Contributor, Author) commented Sep 14, 2019

I should also note that the Python standard convention of python3 setup.py test gives yet another outcome:

...
  File "/usr/lib/python3.6/unittest/loader.py", line 153, in loadTestsFromName
    module = __import__(module_name)
  File "/home/pfalcon/projects-3rdparty/Python/Python-compilers-for-non-python/ppci-mirror/ppci/arch/arm/vfp.py", line 29, in <module>
    class Vadd(VfpInstruction):
  File "/home/pfalcon/projects-3rdparty/Python/Python-compilers-for-non-python/ppci-mirror/ppci/arch/arm/vfp.py", line 33, in Vadd
    syntax = Syntax(["vadd.f64", " ", d, ",", " ", n, ",", " ", m])
  File "/home/pfalcon/projects-3rdparty/Python/Python-compilers-for-non-python/ppci-mirror/ppci/arch/encoding.py", line 508, in __init__
    raise TypeError('Invalid element "{}"'.format(element))
TypeError: Invalid element "vadd.f64"

@windelbouwman (Owner) commented:

Hmm, this is curious. I agree that all methods should result in the same total number of test cases.

@windelbouwman (Owner) commented:

Okay, so the first method, python -m unittest discover -s test, does not recurse into subdirectories (presumably because unittest discovery only descends into subdirectories that are importable packages). That's why it does not find all tests. I propose to remove this method from the docs.

The pytest and tox methods both actually discover 2010 test cases, but do not run them all. This depends on environment variables, which are set properly in the tox.ini file. I propose to document this behavior.
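
For illustration, a minimal sketch of how an environment variable could gate whether a test runs, so that plain pytest skips it while tox (which sets the variable in tox.ini) runs it. The variable name LONGTESTS is an assumption for illustration, not necessarily what ppci's suite actually uses:

    # Hedged sketch, not ppci's actual test code: a test case gated on an
    # environment variable. The name "LONGTESTS" is an assumption.
    import os
    import unittest


    @unittest.skipUnless(os.environ.get("LONGTESTS"), "set LONGTESTS=1 to enable")
    class LongRunningTests(unittest.TestCase):
        def test_something_slow(self):
            self.assertTrue(True)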

The last method, via setup.py, is deprecated; I propose to remove it.

@windelbouwman (Owner) commented:

I updated the docs as per the above comment: https://ppci.readthedocs.io/en/latest/development.html#running-the-testsuite

@pfalcon (Contributor, Author) commented Jan 5, 2020

Thanks for looking into this, and for the detailed analysis.

The last method, via setup.py, is deprecated; I propose to remove it.

To clarify, deprecated by whom/what? You see, "python setup.py test" is the standard, generalized way to test a Python package. It allows one to abstract away whatever particular runner a specific project may use behind a common interface. I'd recommend supporting it, with a proper tests_require directive, etc.

Beyond that, what can I say - you're the author of the testsuite, so you would know how to do it best. Choosing one test method makes good sense.

If you still want my 2 cents on that, then I find testing to be a rather boring area :-P. My favorite testing tool is nosetests, where I just write Python functions with asserts, voila. And that's when I need to test API stuff; in general I try to lean toward integration tests at the "command line app" level. For PPCI that would be: input C code, an expected IR output, and a shell driver to run the compile-C-to-IR command and diff the results. That's as close as possible to the way people actually use the stuff. Back to unit testing, tox is the least familiar tool for me; I always considered it too eerie and complicated ;-). But yeah, I'm definitely looking forward to learning new things from dealing with PPCI ;-).
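
As an illustration of that kind of compile-and-diff integration test, here is a minimal sketch; the ppci-cc invocation and its flags are placeholders, not the project's actual CLI:

    # Hedged sketch of a compile-and-diff driver. The command
    # ["ppci-cc", "-S", c_file] is a placeholder; substitute whatever
    # actually compiles C to IR in this project.
    import subprocess
    import sys


    def check_c_to_ir(c_file: str, expected_ir_file: str) -> bool:
        # Compile the C source and capture the emitted IR on stdout
        # (hypothetical invocation).
        result = subprocess.run(
            ["ppci-cc", "-S", c_file],
            capture_output=True, text=True, check=True,
        )
        with open(expected_ir_file) as f:
            expected = f.read()
        return result.stdout.strip() == expected.strip()


    if __name__ == "__main__":
        ok = check_c_to_ir(sys.argv[1], sys.argv[2])
        sys.exit(0 if ok else 1)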

@windelbouwman (Owner) commented:

The main testing tool used is pytest; tox is a sort of wrapper around venv and pytest.

To clarify, deprecated by whom/what?

When I run this:

$ python setup.py test
running test
WARNING: Testing via this command is deprecated and will be removed in a future version. Users looking for a generic test entry point independent of test runner are encouraged to use tox.
running egg_info
writing ppci.egg-info/PKG-INFO
writing dependency_links to ppci.egg-info/dependency_links.txt
...

I guess it is deprecated by setuptools?

Unittests are useful in this project; there is simply too much going on behind some high-level API calls, so it makes sense to test subsystems. Btw, there is a whole bunch of sample snippets with C code and corresponding output on stdout; this is what you mean by integration testing: having C source code and diffing its output with the expected output.

@windelbouwman (Owner) commented:

I found some information about setuptools here: https://setuptools.readthedocs.io/en/latest/setuptools.html#test-build-package-and-run-a-unittest-suite. Looks like that feature is being deprecated.
