Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/mitigate flaky integration tests #684

Closed
3 of 5 tasks
bhrutledge opened this issue Aug 16, 2020 · 12 comments · Fixed by #1128
Closed
3 of 5 tasks

Fix/mitigate flaky integration tests #684

bhrutledge opened this issue Aug 16, 2020 · 12 comments · Fixed by #1128
Assignees
Labels
testing Test frameworks, tests, etc.

Comments

@bhrutledge
Copy link
Contributor

bhrutledge commented Aug 16, 2020

As I write this, the checks on master are currently failing on Travis and GitHub actions due to test_integration.py.

https://travis-ci.org/github/pypa/twine/jobs/718433234
https://github.com/pypa/twine/runs/991209553

I've noticed this intermittently since the tests were added, but it seems to be more of a problem lately.

TODO:

@bhrutledge
Copy link
Contributor Author

Following up on questions from @sigmavirus24 in #703 (comment):

  • Why were we emulating a CLI call for an integration test?

  • What were we testing the integration of? As I understood it, these should be testing against some version of PyPI (TestPyPI, DevPI, etc.) but this is testing argparse as well ... Thus these are almost end-to-end tests (almost because they're not testing that setuptools builds the binary with an entrypoint correctly by shelling out to test things) and thus acceptance tests, not integration tests.

Wait, do we not all share the same definitions of "acceptance, "integration", and "unit" tests? Shocking! 😉

That said, these do feel like "acceptance"/"end-to-end"/"smoke" tests to me, and seem valuable in that capacity, as a way to simulate the user's experience and detect upstream issues.

  • Is testing exactly this way important, or are you okay with doing integration testing differently?

I'm definitely open ideas, and am not married to these tests. They were originally added by @jaraco, and my effort has been focused on keeping their value while reducing their costs, which I think manifest primarily as PR noise due to flakiness of the test servers. At best, that requires a contributor or maintainer to dive into the test results to discover what failed. At worst, it blocks a merge or release.

I'm keen to reduce the PR noise as soon as possible. Assuming we want some amount of testing against real servers, I'd rather not entirely disable these tests. If we do end up rewriting them, I think we'll still want them in place until that's done, which means they'll still be noisy. With that in mind, what do folks think about the "TODO" that I added to the issue description:

Use a separate GH Action job (and maybe tox testenv) for integration tests to clearly distinguish them from the "fast" tests

I think that'd be running pytest -k 'not integration' and pytest -k 'integration' separately, perhaps on different conditions (e.g., always run "fast" tests, only run "integration" tests on merge and/or schedule).

@sigmavirus24
Copy link
Member

I think that'd be running pytest -k 'not integration' and pytest -k 'integration' separately, perhaps on different conditions (e.g., always run "fast" tests, only run "integration" tests on merge and/or schedule).

👍 Let's do that. I like that option as an interim step while we think a bit harder about this

@bhrutledge
Copy link
Contributor Author

The tests against devpi and pypiserver are failing. I though it was related to tox>=4, but they're still failing with the same error with tox<4. Poking around the source of jaraco.envs, it looks like the tox environment setup isn't working as expected, because commands like devpi-init aren't being found.

I can reproduce this locally by running tox -e integration. I'm tempted to remove the dependency on jaraco.envs and coupling to tox, and try to isolate the venv setup as a fixture in test_integration.py.

This was referenced Dec 12, 2022
@jaraco
Copy link
Member

jaraco commented Dec 12, 2022

You’re welcome to proceed that way or assign to me and I can investigate the issue. Let me know.

@bhrutledge
Copy link
Contributor Author

@jaraco It might be worth verifying that jaraco.envs.ToxEnv is working as expected, in isolation. In my initial local testing, an assertion that .tox/devpi exists after the install() was failing.

That said, it looks like it might be relatively straightforward to use venv.create() in conjunction with pytest's temporary directory fixture, which sounds like a fun exercise, that I think would simplify/clarify the tests.

As of #956, the devpi and pypiserver tests are skipped until this is resolved.

@jaraco
Copy link
Member

jaraco commented Dec 13, 2022

It's working fine for me:

 twine main $ pip-run jaraco.envs
Python 3.11.0 (main, Oct 26 2022, 19:06:18) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from jaraco.envs import ToxEnv
>>> class DevPi(ToxEnv):
...   name = 'devpi'
... 
>>> env = DevPi().create()
devpi: install_deps> python -I -m pip install devpi devpi-server
.pkg: install_requires> python -I -m pip install 'setuptools>=45' 'setuptools_scm[toml]>=6.0' wheel
.pkg: _optional_hooks> python /var/folders/sx/n5gkrgfx6zd91ymxr2sr9wvw00n8zm/T/pip-run-goxxlf8x/pyproject_api/_backend.py True setuptools.build_meta
.pkg: get_requires_for_build_editable> python /var/folders/sx/n5gkrgfx6zd91ymxr2sr9wvw00n8zm/T/pip-run-goxxlf8x/pyproject_api/_backend.py True setuptools.build_meta
.pkg: install_requires_for_build_editable> python -I -m pip install wheel
.pkg: get_requires_for_build_sdist> python /var/folders/sx/n5gkrgfx6zd91ymxr2sr9wvw00n8zm/T/pip-run-goxxlf8x/pyproject_api/_backend.py True setuptools.build_meta
.pkg: build_wheel> python /var/folders/sx/n5gkrgfx6zd91ymxr2sr9wvw00n8zm/T/pip-run-goxxlf8x/pyproject_api/_backend.py True setuptools.build_meta
.pkg: build_sdist> python /var/folders/sx/n5gkrgfx6zd91ymxr2sr9wvw00n8zm/T/pip-run-goxxlf8x/pyproject_api/_backend.py True setuptools.build_meta
devpi: install_package_deps> python -I -m pip install 'importlib-metadata>=3.6' 'keyring>=15.1' 'pkginfo>=1.8.1' 'readme-renderer>=35.0' 'requests-toolbelt!=0.9.0,>=0.8.0' 'requests>=2.20' 'rfc3986>=1.4.0' 'rich>=12.0.0' 'urllib3>=1.26.0'
devpi: install_package> python -I -m pip install --force-reinstall --no-deps /Users/jaraco/code/pypa/twine/.tox/.tmp/package/1/twine-4.0.3.dev12+g4aba39c.tar.gz
.pkg: _exit> python /var/folders/sx/n5gkrgfx6zd91ymxr2sr9wvw00n8zm/T/pip-run-goxxlf8x/pyproject_api/_backend.py True setuptools.build_meta
  devpi: OK (16.43 seconds)
  congratulations :) (16.50 seconds)
>>> env.exe('devpi').isfile()
True
>>> import subprocess
>>> subprocess.run([env.exe('devpi')])
usage: devpi [-h] [--version] [--debug] [-y] [-v] [--clientdir DIR]
             {use,getjson,patchjson,list,remove,user,passwd,login,logoff,logout,index,upload,test,push,install,refresh} ...

The devpi commands (installed via devpi-client) wrap common Python packaging, uploading, installation and testing activities, using a remote devpi-server managed
index. For more information see http://doc.devpi.net

positional arguments:
  {use,getjson,patchjson,list,remove,user,passwd,login,logoff,logout,index,upload,test,push,install,refresh}
    use                 show/configure current index and target venv for install activities.
    getjson             show remote server and index configuration.
    patchjson           send a PATCH request with the specified json content to the specified path.
    list                list project versions and files for the current index.
    remove              removes project info/files from current index.
    user                add, remove, modify, list user configuration.
    passwd              change password of specified user or current user if not specified.
    login               login to devpi-server with the specified user.
    logoff              log out of the current devpi-server.
    logout              log out of the current devpi-server.
    index               create, delete and manage indexes.
    upload              (build and) upload packages to the current devpi-server index.
    test                download and test a package against tox environments.
    push                push a release and releasefiles to an internal or external index.
    install             install packages through current devpi index.
    refresh             invalidates the mirror caches for the specified package(s).

options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

generic options:
  --debug               show debug messages including more info on server requests
  -y                    assume 'yes' on confirmation questions
  -v, --verbose         increase verbosity
  --clientdir DIR       directory for storing login and other state
CompletedProcess(args=[Path('.tox/devpi/bin/devpi')], returncode=0)
>>> 
>>> subprocess.run([env.exe('devpi-init')])
INFO  NOCTX Loading node info from /Users/jaraco/.devpi/server/.nodeinfo
INFO  NOCTX generated uuid: c505e7279f634f29970a2da02b2396b5
INFO  NOCTX wrote nodeinfo to: /Users/jaraco/.devpi/server/.nodeinfo
INFO  NOCTX DB: Creating schema
INFO  [Wtx-1] setting password for user 'root'
INFO  [Wtx-1] created user 'root'
INFO  [Wtx-1] created root user
INFO  [Wtx-1] created root/pypi index
INFO  [Wtx-1] fswriter0: committed at 0
CompletedProcess(args=[Path('.tox/devpi/bin/devpi-init')], returncode=0)

Moreover, I ran tox -e integration and the devpi tests passed. The pypiserver tests failed though. Can you confirm the devpi tests fail for you locally the way they are failing in CI?

That said, it looks like it might be relatively straightforward to use venv.create() in conjunction with pytest's temporary directory fixture, which sounds like a fun exercise, that I think would simplify/clarify the tests.

My motivation for using tox was because it provides the following functionality:

  • Provides a home for virtualenvs (stores them in .tox based on the env name) that's already well-defined (ignored for test discovery, etc). Of course you could just hard-code '.tox/' but it feels a little dirty to be creating envs in .tox that aren't managed by tox.
  • Provides a well-defined mechanism to declare the dependencies that make up that environment. You could of course create a requirements.txt file for each of these envs, but that starts to get unwieldy when there's more than one.

I'm not super-happy with the interfaces exposed by jaraco.envs, so I'm not surprised it's not straightforward to test/exercise the functionality.

@bhrutledge
Copy link
Contributor Author

Can you confirm the devpi tests fail for you locally the way they are failing in CI?

Yep; tox -e integration fails with Tox 3 and Tox 4.

I have a branch that's passing after replacing jaraco.envs with fixture logic based on the venv module. I've got some refactoring to do, but will hopefully have a PR up this week.

@jaraco
Copy link
Member

jaraco commented Dec 14, 2022

The issue seems to be that under some circumstances, the env used to install pypiserver is .tox/integration and not .tox/pypiserver as expected.

(Pdb) env.create()
.pkg: _optional_hooks> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: get_requires_for_build_editable> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: get_requires_for_build_sdist> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: build_wheel> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: build_sdist> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
pypiserver: install_package> python -I -m pip install --force-reinstall --no-deps /Users/jaraco/code/pypa/twine/.tox/.tox/.tmp/package/9/twine-4.0.3.dev11+g815853f.d20221214.tar.gz
.pkg: _exit> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
  pypiserver: OK (3.73 seconds)
  congratulations :) (3.80 seconds)
<tests.test_integration.PypiserverEnv object at 0x104d7e790>

@jaraco
Copy link
Member

jaraco commented Dec 14, 2022

I'm able to replicate the issue without jaraco.envs. The issue is either with tox or more likely with running tox from a virtualenv:

 twine main $ tox -e integration --notest
integration: install_deps> python -I -m pip install build coverage jaraco.envs munch portend pretend pytest pytest-rerunfailures pytest-services pytest-socket
.pkg: install_requires> python -I -m pip install 'setuptools>=45' 'setuptools_scm[toml]>=6.0' wheel
.pkg: _optional_hooks> python /Users/jaraco/.local/pipx/venvs/tox/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: get_requires_for_build_editable> python /Users/jaraco/.local/pipx/venvs/tox/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: install_requires_for_build_editable> python -I -m pip install wheel
.pkg: get_requires_for_build_sdist> python /Users/jaraco/.local/pipx/venvs/tox/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: build_wheel> python /Users/jaraco/.local/pipx/venvs/tox/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: build_sdist> python /Users/jaraco/.local/pipx/venvs/tox/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
integration: install_package_deps> python -I -m pip install 'importlib-metadata>=3.6' 'keyring>=15.1' 'pkginfo>=1.8.1' 'readme-renderer>=35.0' 'requests-toolbelt!=0.9.0,>=0.8.0' 'requests>=2.20' 'rfc3986>=1.4.0' 'rich>=12.0.0' 'urllib3>=1.26.0'
integration: install_package> python -I -m pip install --force-reinstall --no-deps /Users/jaraco/code/pypa/twine/.tox/.tmp/package/1/twine-4.0.3.dev12+g4aba39c.tar.gz
.pkg: _exit> python /Users/jaraco/.local/pipx/venvs/tox/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
  integration: OK (10.81 seconds)
  congratulations :) (10.87 seconds)
 twine main $ .tox/integration/bin/python -m tox -e pypiserver --notest
pypiserver: install_deps> python -I -m pip install pypiserver
.pkg: _optional_hooks> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: get_requires_for_build_editable> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: get_requires_for_build_sdist> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: build_wheel> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: build_sdist> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
pypiserver: install_package_deps> python -I -m pip install 'importlib-metadata>=3.6' 'keyring>=15.1' 'pkginfo>=1.8.1' 'readme-renderer>=35.0' 'requests-toolbelt!=0.9.0,>=0.8.0' 'requests>=2.20' 'rfc3986>=1.4.0' 'rich>=12.0.0' 'urllib3>=1.26.0'
pypiserver: install_package> python -I -m pip install --force-reinstall --no-deps /Users/jaraco/code/pypa/twine/.tox/.tmp/package/2/twine-4.0.3.dev12+g4aba39c.tar.gz
.pkg: _exit> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
  pypiserver: OK (6.43 seconds)
  congratulations :) (6.48 seconds)

Edit: I'm not sure that shows what I think it shows. The fact that .tox/integration appears in the second execution isn't necessarily indicative of an issue. What I'm finding surprising, however, is that .tox/pypiserver doesn't exist after that's run.

@jaraco
Copy link
Member

jaraco commented Dec 14, 2022

I upgraded my copy of virtualenv as found in my $PATH and cleaned the twine checkout and now I'm able to replicate the failures in both devpi and pypiserver.

Then, I ran tox -e devpi,pypiserver --notest, followed by tox -e integration and the tests pass.

@jaraco
Copy link
Member

jaraco commented Dec 14, 2022

I am able to work around the issue by pinning to tox<4:

 twine main $ git diff
diff --git a/tox.ini b/tox.ini
index e526324..d41a594 100644
--- a/tox.ini
+++ b/tox.ini
@@ -26,6 +26,7 @@ deps =
     pytest-rerunfailures
     pytest-services
     build
+    tox<4
 passenv =
     PYTEST_ADDOPTS
 commands =

You have to pin in the integration environment, as that's the version of tox that gets used by jaraco.envs.ToxEnv.

@jaraco
Copy link
Member

jaraco commented Jun 25, 2024

Since my last comment, I root-caused and Gabor fixed the issue in tox that was leading to failures with pypiserver. Moreover, while investigating recent failures as part of #1120, I noticed that the failures tend to be 503 Service Unavailable from test.pypi.org (https://github.com/pypa/twine/actions/runs/9294853403). So it seems as if the flaky tests are due primarily to a remote service that's under-provisioned for our use-case.

I'm reluctant to switch to production pypi. I don't want to be polluting that repository with lots of artifacts from tests. I also don't want to expose a token for a project in pypi to the world.

Maybe we should just drop the integration test for PyPI and rely on twine's own self-upload during any release to catch any regressions. It's the other servers that are less likely to have visibility into regressions.

Maybe the best course of action is to simply mark these PyPI tests as xfail until there's a better solution.

jaraco added a commit that referenced this issue Jun 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
testing Test frameworks, tests, etc.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants