Tests non-deterministically fail due to unreliable timing #100

Closed
mejo- opened this issue Dec 26, 2017 · 8 comments

mejo- commented Dec 26, 2017

Hi, I'm forwarding a bug report that was filed against the python-blessed Debian package:

python-blessed's test suite appears to use method timing/benchmarking in such a way that it will non-deterministically FTBFS (fail to build from source):

[…]
________________________ test_esc_delay_cbreak_timout_0 ________________________

  def test_esc_delay_cbreak_timout_0():
      """esc_delay still in effect with timeout of 0 ("nonblocking")."""
      pid, master_fd = pty.fork()
      if pid == 0:  # child
          cov = init_subproc_coverage('test_esc_delay_cbreak_timout_0')
          term = TestTerminal()
          os.write(sys.__stdout__.fileno(), SEMAPHORE)
          with term.cbreak():
              stime = time.time()
              inp = term.inkey(timeout=0)
              measured_time = (time.time() - stime) * 100
              os.write(sys.__stdout__.fileno(), (
                  '%s %i' % (inp.name, measured_time,)).encode('ascii'))
              sys.stdout.flush()
          if cov is not None:
              cov.stop()
              cov.save()
          os._exit(0)
  
      with echo_off(master_fd):
          os.write(master_fd, u'\x1b'.encode('ascii'))
          read_until_semaphore(master_fd)
          stime = time.time()
          key_name, duration_ms = read_until_eof(master_fd).split()
  
      pid, status = os.waitpid(pid, 0)
      assert key_name == u'KEY_ESCAPE'
      assert os.WEXITSTATUS(status) == 0
      assert math.floor(time.time() - stime) == 0.0
>     assert 34 <= int(duration_ms) <= 45, int(duration_ms)
E     AssertionError: 71
E     assert 71 <= 45
E      +  where 71 = int('71')

blessed/tests/test_keyboard.py:528: AssertionError
==================== 1 failed, 305 passed in 75.27 seconds =====================
E: pybuild pybuild:283: test: plugin distutils failed with: exit code=1: cd /build/1st/python-blessed-1.14.1/.pybuild/pythonX.Y_3.5/build; python3.5 -m pytest
dh_auto_test: pybuild --test --test-pytest -i python{version} -p 3.5 returned exit code 13
debian/rules:7: recipe for target 'build' failed
make: *** [build] Error 25
dpkg-buildpackage: error: debian/rules build gave error exit status 2
I: copying local configuration
E: Failed autobuilding of package

[…]

The full build log is attached or can be viewed here:

https://tests.reproducible-builds.org/debian/logs/unstable/amd64/python-blessed_1.14.1-1.build1.log.gz

P-EB commented Dec 31, 2017

The assertion that fails is not relevant to this test. This test runs with no minimal duration and no timeout, so there is no way to predict how long it will take; it depends on the computer, the architecture, and whatever else.

The assertion above it,

assert math.floor(time.time() - stime) == 0.0

already makes sure that the test takes less than one second, which is reasonable enough.

I suggest removing the offending assertion.
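
For illustration only, a minimal sketch of what a more tolerant check could look like; the helper name and the 250 ms ceiling below are assumptions, not values taken from blessed's test suite:

```python
import math
import time


def assert_inkey_returned_promptly(stime, duration_ms, ceiling_ms=250):
    """Hypothetical replacement for the tight 34 <= duration_ms <= 45 window.

    Keeps the coarse guarantee that the whole exchange stayed well under one
    second, plus a generous ceiling, instead of an exact timing window.
    """
    # Coarse check already present in the test: less than one second elapsed.
    assert math.floor(time.time() - stime) == 0.0
    # Generous ceiling (assumed value); no lower bound, since returning early
    # on a fast machine is not a failure.
    assert int(duration_ms) <= ceiling_ms, int(duration_ms)
```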

jquast commented Jun 20, 2018

We will shortly delete/disable any such tests.

We can delete all tests, honestly. I don't know why you folks run them. The code isn't changing. We run tests when we change code. Why do you bother me about my tests?

jquast commented Jun 20, 2018

Tests are commented out in the next release.

jquast closed this as completed Jun 20, 2018

mejo- commented Jun 20, 2018

@jquast: Tests are there for quality assurance, i.e. to detect regressions. So it's actually a good thing that distributions which ship python-blessed (like Debian) run the test suite at build time.

jquast commented Jun 20, 2018

I am the one who wrote over 3,000 LOC in tests, which is more than the package itself, so I feel like I know better than you what the tests are there for.

We have successfully disabled a large swath of very useful tests. I hope that is helpful to you; it certainly is a detriment to me and all future contributors. Thanks for the help.

mejo- commented Jun 20, 2018

I don't want to question your authority on declaring the purpose of tests written by you 😁

I'm just stating what is becoming more and more common (and, in my view, for good reason): that test suites are run by distributions at build time in order to detect regressions.

If some of the tests are not intended to be run by anyone but upstream (yourself), what do you think about separating them from the ones that should succeed on all systems?
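
One possible shape for such a split, sketched under the assumption of a custom `timing_sensitive` pytest marker (the marker name and the example below are illustrative, not part of blessed's test suite):

```python
# conftest.py -- register a hypothetical marker for timing-dependent tests
def pytest_configure(config):
    config.addinivalue_line(
        "markers",
        "timing_sensitive: asserts on wall-clock timing; may fail on slow "
        "or heavily loaded build machines",
    )
```

```python
# blessed/tests/test_keyboard.py -- hypothetical usage on such a test
import pytest


@pytest.mark.timing_sensitive
def test_esc_delay_cbreak_timout_0():
    ...  # existing test body
```

A distribution build could then deselect those tests with `pytest -m "not timing_sensitive"`, while upstream CI keeps running the full suite.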

P-EB commented Jun 25, 2018

Dear @jquast,

We can delete all tests, honestly. I don't know why you folks run them. The code isn't changing. We run tests when we change code. Why do you bother me about my tests?

As @mejo- pointed out, we run these because it gives us better assurance that no regression happens, and not only regressions from your own code and changes, but also from the dependencies of your code. For example, tornado was recently updated and all of its reverse dependencies broke in Debian. Without this continuous-integration setup, which runs the tests periodically, we would have taken longer to notice.

Hence, it's a matter of "your tests are useful to yourself as a developer, but also useful for the distributions that rely on your software". Maybe some tests are not suitable for integration in GNU/Linux distributions. To tackle this issue there are two solutions: either you are willing to help us and split your tests between those specific to you as the upstream author and those useful for anyone who ships your software, or you consider that this is not your job (and honestly, I would understand), and we, as the maintainers, will disable or alter the tests that are not suited to our infrastructure.

Please keep in mind that we are not calling your work into question; we are only notifying you about a usage of your tests that you do not seem to have thought about before.

Cheers. :)
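
For the second option (maintainers disabling the timing-dependent tests themselves), here is a minimal sketch assuming a hypothetical `SKIP_TIMING_TESTS` environment variable that the packaging could export at build time; neither the variable nor this decorator exists in blessed today:

```python
# Hypothetical opt-out switch for timing assertions, e.g. a packager could
# export SKIP_TIMING_TESTS=1 from debian/rules before running pytest.
import os

import pytest

timing_sensitive = pytest.mark.skipif(
    os.environ.get("SKIP_TIMING_TESTS") == "1",
    reason="wall-clock timing is unreliable on busy build machines",
)


@timing_sensitive
def test_esc_delay_cbreak_timout_0():
    ...  # body as in blessed/tests/test_keyboard.py
```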
