New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

@vcr decorator for unit/doctests - recording/playback of network traffic via python sockets #1663

Open
wants to merge 110 commits into
base: master
from

Conversation

Projects
None yet
3 participants
@barsch
Member

barsch commented Feb 9, 2017

What does this PR do?

Any Python socket communication in unittests (decorated with the @vcr function) and/or doctests (containing a # doctest: +VCR) will be recorded on the first run and saved into a special 'vcrtapes' directory as single pickled file for each test case. Future test runs will reuse those recorded network session allowing for faster tests without any network connection. In order to create a new recording one just needs to remove/rename the pickled session file(s).

So pretty similar to VCRPy but on socket level instead of HTTP (request, urllib3, etc.) - so therefore it should be more powerful ...

Why was it initiated? Any relevant issues?

Network tests tend to fail sporadically, need usually a very long time (compared to other tests) and (surprise!) require a network connection (if not mocked).

This PR tackles all three issues mentioned above. Here some performance tests (network connection had been disabled during tests of the vcr branch):

(Python36) d:\Workspace\obspy>git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.

(Python36) d:\Workspace\obspy>obspy-runtests -d obspy.clients.syngine.tests.test_client
Running d:\workspace\obspy\obspy\scripts\runtests.py, ObsPy version '1.0.2.post0+902.gbfe6b3c90a.obspy.master'
..............
----------------------------------------------------------------------
Ran 14 tests in 9.743s
OK

(Python36) d:\Workspace\obspy>git checkout vcr
Switched to branch 'vcr'
Your branch is up-to-date with 'origin/vcr'.

(Python36) d:\Workspace\obspy>obspy-runtests -d obspy.clients.syngine.tests.test_client
Running d:\workspace\obspy\obspy\scripts\runtests.py, ObsPy version '1.0.2.post0+903.g151e12679f.obspy.vcr'
..............
----------------------------------------------------------------------
Ran 14 tests in 0.432s
OK

ToDo

  • ensure platform compatibility
  • clients.arclink: issues on Linux with current telnet implementation - I may have to rewrite that :/
  • clients.seishub: needs vcrtapes for client doctests (@megies)
  • clients.seedlink: needs reworking - current tests don't work - can't add @vcr therefore ...
  • clients.fdsn: could cover more tests - currently the service discovery thread (which is using sockets too) is intercepting occasionally the test case itself (@krischer)
  • ensure the whole test suite works without network connection
  • remove (non-)network tests options from obspy-runtests - all tests should be run as default setting
  • SSLSocket support (HTTPS)
  • no arclink specific hacks anymore -> more general option for recv_timeout and recv_endmarker
  • needs documentation
  • needs CHANGELOG.txt entry
  • move it into an extra Python package as its highly reusable - also to increase test coverage by using all appveyor env instead only selected ones due to huge number of tests we have in obspy -> see https://github.com/obspy/vcr/ - you need to pip install -U vcr on your local env
  • include vcr as a git submodule in core/util/testing ?
  • squash old vcr tapes in the git history

@barsch barsch changed the title from new @vcr decorator for unit/doctests - recording/playback of network traffic via python sockets to @vcr decorator for unit/doctests - recording/playback of network traffic via python sockets Feb 9, 2017

@megies megies requested review from krischer and megies Feb 9, 2017

@megies megies added the testing label Feb 9, 2017

@megies megies added this to the 1.1.0 milestone Feb 9, 2017

@megies megies added the .clients label Feb 9, 2017

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Feb 9, 2017

Member

+TESTS:ALL

Member

megies commented Feb 9, 2017

+TESTS:ALL

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Feb 9, 2017

Member

I just wanted to provide vcrtapes for seishub, but it seems seishub client was broken by #1325..

Member

megies commented Feb 9, 2017

I just wanted to provide vcrtapes for seishub, but it seems seishub client was broken by #1325..

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Feb 10, 2017

Member

The networking stuff of Stream test case test_read should be refactored out into a separate test test_read_with_url_via_network (see #1660 and specifically #1660 (comment)). I would even do it in maintenance branch, but I don't want to get you into problems with rebasing..

Member

megies commented Feb 10, 2017

The networking stuff of Stream test case test_read should be refactored out into a separate test test_read_with_url_via_network (see #1660 and specifically #1660 (comment)). I would even do it in maintenance branch, but I don't want to get you into problems with rebasing..

@barsch

This comment has been minimized.

Show comment
Hide comment
@barsch

barsch Feb 10, 2017

Member

@megies - go ahead - I didn't touch anything outside of the clients tests suites yet and I will merge master before doing so

Member

barsch commented Feb 10, 2017

@megies - go ahead - I didn't touch anything outside of the clients tests suites yet and I will merge master before doing so

megies added a commit that referenced this pull request Feb 10, 2017

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Feb 10, 2017

Member

@barsch, I pushed a commit with vcrtapes for seishub, no idea why the doctest vcrtapes weren't recorded..

Those vcrtapes were recorded with this commit added on top of this branch: dd8238c

Member

megies commented Feb 10, 2017

@barsch, I pushed a commit with vcrtapes for seishub, no idea why the doctest vcrtapes weren't recorded..

Those vcrtapes were recorded with this commit added on top of this branch: dd8238c

@barsch

This comment has been minimized.

Show comment
Hide comment
@barsch

barsch Feb 10, 2017

Member

@megies - doctests are currently monkey patched only via obspy-runtests script and not at module level - so run obspy-runtests -d clients.seishub and it should work (using #doctests: +VCR comment once somewhere within a single doctest)

Member

barsch commented Feb 10, 2017

@megies - doctests are currently monkey patched only via obspy-runtests script and not at module level - so run obspy-runtests -d clients.seishub and it should work (using #doctests: +VCR comment once somewhere within a single doctest)

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Feb 10, 2017

Member

Hmm.. somehow seishub/__init__.py doctests are not run on master or vcr branch. They are run on maintenance_1.0.x though...

Member

megies commented Feb 10, 2017

Hmm.. somehow seishub/__init__.py doctests are not run on master or vcr branch. They are run on maintenance_1.0.x though...

@barsch

This comment has been minimized.

Show comment
Hide comment
@barsch

barsch Feb 10, 2017

Member

so its already an issue on master ...

Member

barsch commented Feb 10, 2017

so its already an issue on master ...

@barsch

This comment has been minimized.

Show comment
Hide comment
@barsch

barsch Feb 10, 2017

Member

clients.seishub works for me on windows except for test_localcache as its not been decorated - any reason for that?

Member

barsch commented Feb 10, 2017

clients.seishub works for me on windows except for test_localcache as its not been decorated - any reason for that?

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Feb 11, 2017

Member

clients.seishub works for me on windows except for test_localcache as its not been decorated - any reason for that?

Ah, my bad, will add it soonish..

Member

megies commented Feb 11, 2017

clients.seishub works for me on windows except for test_localcache as its not been decorated - any reason for that?

Ah, my bad, will add it soonish..

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Feb 11, 2017

Member

@barsch added missing unittest vcrtape for seishub. but can't add doctest vcrtape because that test is failing, see #1664.

Member

megies commented Feb 11, 2017

@barsch added missing unittest vcrtape for seishub. but can't add doctest vcrtape because that test is failing, see #1664.

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Mar 11, 2017

Member

@barsch.. there's some issues with vcr.py highlighted by our flake8 check, see CircleCI fail log:

  280 class VCR(object):
  281     """
  282     """
  283     @classmethod
  284     def reset(cls, debug=False):
  285         global vcr_playlist, vcr_status, vcr_debug
  286         vcr_playlist = []
  287         vcr_status = VCR_RECORD
  288         vcr_debug = debug
>>289         vcr_recv_timeout = 2
>>290         vcr_recv_endmarker = None
>>291         vcr_recv_size = None
  292 

The last three variables are not declared global so these statements are without effect..

  364             # check for arclink module
  365             if 'arclink' in source_filename:
  366                 vcr_recv_endmarker = 'END\r\n'
  367             else:
>>368                 vcr_recv_endmarker = ''

This variable is not used inside this scope and also not declared global, so this statement seems to be without effect.

Member

megies commented Mar 11, 2017

@barsch.. there's some issues with vcr.py highlighted by our flake8 check, see CircleCI fail log:

  280 class VCR(object):
  281     """
  282     """
  283     @classmethod
  284     def reset(cls, debug=False):
  285         global vcr_playlist, vcr_status, vcr_debug
  286         vcr_playlist = []
  287         vcr_status = VCR_RECORD
  288         vcr_debug = debug
>>289         vcr_recv_timeout = 2
>>290         vcr_recv_endmarker = None
>>291         vcr_recv_size = None
  292 

The last three variables are not declared global so these statements are without effect..

  364             # check for arclink module
  365             if 'arclink' in source_filename:
  366                 vcr_recv_endmarker = 'END\r\n'
  367             else:
>>368                 vcr_recv_endmarker = ''

This variable is not used inside this scope and also not declared global, so this statement seems to be without effect.

@barsch

This comment has been minimized.

Show comment
Hide comment
@barsch

barsch Mar 11, 2017

Member

yeah thanks for poiting out - its pretty much WIP and I needed to sleep - so I accidently pushed it instead of local commit only ...

however I'm not a fan of globals at all - any better idea for a more elegant solution - having anything within the decorator as before works without globals but is pretty much untestable ...

Member

barsch commented Mar 11, 2017

yeah thanks for poiting out - its pretty much WIP and I needed to sleep - so I accidently pushed it instead of local commit only ...

however I'm not a fan of globals at all - any better idea for a more elegant solution - having anything within the decorator as before works without globals but is pretty much untestable ...

@barsch

This comment has been minimized.

Show comment
Hide comment
@barsch

barsch Mar 12, 2017

Member

ok this became now a new module see https://github.com/obspy/vcr - primary because its much faster testable than via an obspy branch and so it can be reused in other projects - download it via pip install -U vcr if you want to use it! Please use the vcr repository also for non-obspy specific feedback on vcr.

This PR is still valid as it integrates vcr into obspy - its so far looking pretty good ...

Member

barsch commented Mar 12, 2017

ok this became now a new module see https://github.com/obspy/vcr - primary because its much faster testable than via an obspy branch and so it can be reused in other projects - download it via pip install -U vcr if you want to use it! Please use the vcr repository also for non-obspy specific feedback on vcr.

This PR is still valid as it integrates vcr into obspy - its so far looking pretty good ...

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Mar 12, 2017

Member

Did all seishub vcr tapes from scratch.

This branch will need some cleanup before merging, once it's ready. Let me know when everything is done, I can rebase and squash then.

Member

megies commented Mar 12, 2017

Did all seishub vcr tapes from scratch.

This branch will need some cleanup before merging, once it's ready. Let me know when everything is done, I can rebase and squash then.

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Mar 13, 2017

Member

Seishub is fully vcr-ified now, verified on Py27 and Py36, tests run in under one second with network unplugged.

Member

megies commented Mar 13, 2017

Seishub is fully vcr-ified now, verified on Py27 and Py36, tests run in under one second with network unplugged.

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Mar 13, 2017

Member

If I remove the threading during service discovery and do a plain sequential download of wadls, then FDSN can be fully vcr-ified and tests run without network access in under 3 seconds.

I know that the threaded wadl download is a slick feat, but maybe it's worth to lose it to ease up testing? I tried and connecting to IRIS with threaded wadl download takes me 5.3 seconds (averaged across 10 connections), compared to ~8 seconds with plain sequential wadl download.. and connecting should be a one-off operation in most practical use cases?

Member

megies commented Mar 13, 2017

If I remove the threading during service discovery and do a plain sequential download of wadls, then FDSN can be fully vcr-ified and tests run without network access in under 3 seconds.

I know that the threaded wadl download is a slick feat, but maybe it's worth to lose it to ease up testing? I tried and connecting to IRIS with threaded wadl download takes me 5.3 seconds (averaged across 10 connections), compared to ~8 seconds with plain sequential wadl download.. and connecting should be a one-off operation in most practical use cases?

@barsch

This comment has been minimized.

Show comment
Hide comment
@barsch

barsch Mar 13, 2017

Member

at least an option to temporary disable it would be cool

Member

barsch commented Mar 13, 2017

at least an option to temporary disable it would be cool

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Mar 13, 2017

Member

Hmm.. but if we do this and use it in tests than the threaded download would be untested code.. :-/

Member

megies commented Mar 13, 2017

Hmm.. but if we do this and use it in tests than the threaded download would be untested code.. :-/

@krischer

This comment has been minimized.

Show comment
Hide comment
@krischer

krischer Mar 13, 2017

Member

We could add a slight delay for each thread so that the download start order would be deterministic. It would not really impact total runtime but it should be enough for the vcr decorator to do its thing. Would that work?

Member

krischer commented Mar 13, 2017

We could add a slight delay for each thread so that the download start order would be deterministic. It would not really impact total runtime but it should be enough for the vcr decorator to do its thing. Would that work?

@krischer

This comment has been minimized.

Show comment
Hide comment
@krischer

krischer Mar 13, 2017

Member

The threading is really useful for very slow datacenters were it improves things a lot. Sometimes one or more of the fdsn services also time-out while one works. In these cases the threading is a lot fsater.

Member

krischer commented Mar 13, 2017

The threading is really useful for very slow datacenters were it improves things a lot. Sometimes one or more of the fdsn services also time-out while one works. In these cases the threading is a lot fsater.

@barsch

This comment has been minimized.

Show comment
Hide comment
@barsch

barsch Mar 13, 2017

Member

I wouldn't want to remove it - just an option to disable/preload without threads - some code which I can put into TestCase.setUp to run service discovery only once at my desire and disable it for the actually test cases

Member

barsch commented Mar 13, 2017

I wouldn't want to remove it - just an option to disable/preload without threads - some code which I can put into TestCase.setUp to run service discovery only once at my desire and disable it for the actually test cases

@krischer

This comment has been minimized.

Show comment
Hide comment
@krischer

krischer Mar 13, 2017

Member

That would be easy enough - the results are already cached - just fill it before you run the tests: https://github.com/obspy/obspy/blob/master/obspy/clients/fdsn/client.py#L108

Member

krischer commented Mar 13, 2017

That would be easy enough - the results are already cached - just fill it before you run the tests: https://github.com/obspy/obspy/blob/master/obspy/clients/fdsn/client.py#L108

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Mar 13, 2017

Member

We could add a slight delay for each thread so that the download start order would be deterministic

So, essentially a short time.sleep() after thread.start()? Tried it but didn't work with vcr. Patching self.services is not a real solution since it can't cover our doctests.. One option would be an Client.__init__(..., threaded_wadl_downloads=True) kwarg, but then again, if we switch it to False in our tests that means that our default client behavior is untested..

Member

megies commented Mar 13, 2017

We could add a slight delay for each thread so that the download start order would be deterministic

So, essentially a short time.sleep() after thread.start()? Tried it but didn't work with vcr. Patching self.services is not a real solution since it can't cover our doctests.. One option would be an Client.__init__(..., threaded_wadl_downloads=True) kwarg, but then again, if we switch it to False in our tests that means that our default client behavior is untested..

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Mar 28, 2017

Member

@barsch do you have an overview of the current status on this one? Can we get it ready by Sunday? Any help needed?

Member

megies commented Mar 28, 2017

@barsch do you have an overview of the current status on this one? Can we get it ready by Sunday? Any help needed?

@barsch

This comment has been minimized.

Show comment
Hide comment
@barsch

barsch Mar 28, 2017

Member
  • current test cases in io.seedlink not working - therefore I can't vcr them
  • not sure about io.fdsn - don't know where I can safely disable service discovery and where not - probably need to split up into two test files - one which can be used with vcr and one which can be optional disabled for environment without network connection

everything else should be covered with current release of vcr (0.0.5) - found some non-trivial issue with requests package on Linux resulting into a failing vcr master but that shouldn't effect the current test cases in ObsPy

Member

barsch commented Mar 28, 2017

  • current test cases in io.seedlink not working - therefore I can't vcr them
  • not sure about io.fdsn - don't know where I can safely disable service discovery and where not - probably need to split up into two test files - one which can be used with vcr and one which can be optional disabled for environment without network connection

everything else should be covered with current release of vcr (0.0.5) - found some non-trivial issue with requests package on Linux resulting into a failing vcr master but that shouldn't effect the current test cases in ObsPy

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Mar 28, 2017

Member

Rebased on current master and force-pushed.

Also always switched on ALL modules for runtests on Appveyor+Travis via the ymls.

Member

megies commented Mar 28, 2017

Rebased on current master and force-pushed.

Also always switched on ALL modules for runtests on Appveyor+Travis via the ymls.

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Mar 28, 2017

Member

Appveyor runs ALL tests in 5mins, only broken seedlink tests failing: http://tests.obspy.org/75341/
:-)

Member

megies commented Mar 28, 2017

Appveyor runs ALL tests in 5mins, only broken seedlink tests failing: http://tests.obspy.org/75341/
:-)

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Mar 28, 2017

Member

We should add another option to obspy-runtests to skip using vcr tapes, something like obspy-runtests --no-vcr.

For that we would need an approach to control this: https://github.com/obspy/obspy/blob/vcr/obspy/clients/earthworm/tests/test_client.py#L19-L21 (which will also be added st seedlink tests)

Also, we should maybe deprecate --all and always run all test suites.. ?

Member

megies commented Mar 28, 2017

We should add another option to obspy-runtests to skip using vcr tapes, something like obspy-runtests --no-vcr.

For that we would need an approach to control this: https://github.com/obspy/obspy/blob/vcr/obspy/clients/earthworm/tests/test_client.py#L19-L21 (which will also be added st seedlink tests)

Also, we should maybe deprecate --all and always run all test suites.. ?

megies added some commits Apr 3, 2017

vcr tests: improve normalization in outgoing traffic checks
sort REST parameters in URLs appearing in HTTP Authorization headers
avoid dynamic requests header in one fdsn test
setting a fixed User-Agent string, otherwise requests sets it's own
version in there
vcr/testing: make sure no VCRExceptions are masked in our code based by
catch-all try/excepts. otherwise it can lead to very strange errors in
VCR testing
Revert "reverting one test case as it can use vcr too - at least it w…
…orks here"

This reverts commit 1ca414d.

Unnecessary network traffic outside of obspy API should be avoided in
VCR tests wherever possible. See http://tests.obspy.org/77048/#2
avoid doctest exception when adding docstrings w/o tests on old python
see http://tests.obspy.org/77048/#3

>>> Cannot import test suite for module obspy.clients.seedlink due to:
----------------------------------------------------------------------
File "/usr/local/bin/obspy-runtests", line 9, in <module>
load_entry_point('obspy==1.0.3.post0-1039.g43368f9f71.obspy.vcr',
'console_scripts', 'obspy-runtests')()
File "/obspy/obspy/scripts/runtests.py", line 945, in main
errors = run(argv, interactive)
File "/obspy/obspy/scripts/runtests.py", line 913, in run
vcr_overwrite=args.vcr_overwrite)
File "/obspy/obspy/scripts/runtests.py", line 715, in run_tests
suites, status, import_failures = _get_suites(verbosity, tests)
File "/obspy/obspy/scripts/runtests.py", line 179, in _get_suites
suite.append(ut.loadTestsFromName(test, None))
File "/usr/lib/python2.7/unittest/loader.py", line 113, in
loadTestsFromName
test = obj()
File "/obspy/obspy/clients/seedlink/tests/__init__.py", line 16, in
suite
add_doctests(suite, MODULE_NAME)
File "/obspy/obspy/core/util/testing.py", line 249, in add_doctests
testsuite.addTest(doctest.DocTestSuite(_module, **kwargs))
File "/usr/lib/python2.7/doctest.py", line 2384, in DocTestSuite
raise ValueError(module, "has no tests")
ValueError: (<module 'obspy.clients.seedlink.client.__init__' from
'/obspy/obspy/clients/seedlink/client/__init__.pyc'>, 'has no tests')
vcr: normalize Accept-Encoding header in outgoing HTTP requests
should not be problematic as tests will fail if server responds with an
encoding that was not offered by the client
vcr decorator: make it possible to pass in kwargs to vcr.vcr
while most options are handled globally by obspy-runtests this might be
good to have in debugging situations
vcr testing: move setting outgoing traffic check normalization function
to decorator/doctest wrapper, so that they are also in effect whan not
running tests via obspy-runtests
CI: force disable live network in tests
during releasing we should run all tests live through network but the
point of our vcr approach is to avoid random network fails, so don't run
network tests and rely on vcr tests
@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Apr 7, 2017

Member

Rebased and force-pushed for fresh CI

Member

megies commented Apr 7, 2017

Rebased and force-pushed for fresh CI

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Apr 7, 2017

Member

no idea why appveyor is hickupping with io.arclink??

https://ci.appveyor.com/project/obspy/obspy/build/1.0.4673-master/job/n873g5k5xtyb62hk#L509

all other tests pass on appveyor..

Member

megies commented Apr 7, 2017

no idea why appveyor is hickupping with io.arclink??

https://ci.appveyor.com/project/obspy/obspy/build/1.0.4673-master/job/n873g5k5xtyb62hk#L509

all other tests pass on appveyor..

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Jul 27, 2017

Member

Punting to 1.2.0.. otherwise we'll never even get to 1.1.0

Everybody feel free to tackle it of course..

Member

megies commented Jul 27, 2017

Punting to 1.2.0.. otherwise we'll never even get to 1.1.0

Everybody feel free to tackle it of course..

@megies megies modified the milestones: 1.2.0, 1.1.0 Jul 27, 2017

@megies megies modified the milestones: 1.2.0, 1.3.0 Apr 19, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment