Discrepancy in original headers and playback headers (bytes vs. strs) #122
Comments
Hi @wimglenn, Sorry for the delayed response. I'm a bit sick so my response times will be kind of bad. I also haven't had a chance to toy with your reproduction test case but I sincerely appreciate you putting together a tiny one like that. Could you also post the tracebacks you're seeing? Those would help me narrow this down and maybe guide you towards sending a pull request that fixes this. Would you be interested in doing that? |
Sure, I can submit a PR. I just cloned and ran Traceback below:
|
I'm working on it in my fork https://github.com/wimglenn/betamax/tree/bytes_vs_text_handling Unfortunately the fix will have to be quite intrusive, because I think what needs to be done simply can't be done with JSONSerializer. That's because JSON is a format which doesn't correctly preserve the types of the strings inside a Python dict. On Python 2, it's data destructive:
On Python 3, it bails out (more correct, but makes betamax fail hard):
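The snippets that originally accompanied this comment are not preserved in this copy. As a stand-in, here is a minimal sketch of the Python 3 behaviour using only the standard-library `json` module. The header names and values are hypothetical, and this does not reproduce Betamax's own serializer code.

```python
import json

# Hypothetical headers: one text value and one bytes value.
headers = {"Content-Type": "text/plain", "X-Token": b"caf\xc3\xa9"}

# On Python 3, json refuses to serialise bytes at all.
try:
    json.dumps(headers)
    dump_error = None
except TypeError as exc:
    dump_error = exc

# Decoding bytes before dumping makes the data serialisable, but the
# round trip can no longer tell which values were bytes originally.
decoded = {
    name: value.decode("utf-8") if isinstance(value, bytes) else value
    for name, value in headers.items()
}
round_tripped = json.loads(json.dumps(decoded))

print(type(dump_error).__name__)
print({name: type(value).__name__ for name, value in round_tripped.items()})
```

After the round trip every value is a `str`, so the played-back headers no longer match the types of the recorded ones.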
|
I ran into a similar problem when using Requests, Betamax, and some third library such as suds-jurko or requests-oauthlib. Those extra libraries can introduce byte strings into the flow, and the currently-available Betamax serializers can't handle bytes. I wrote up some documentation about the problem, and a possible fix. |
I think we have a partial fix but it's not yet been released. @hroncok would you have time to help prepare a release? |
Sure. Not today, but sometime during the weekend should be fine. Anything special, or just changelog + version bump + tag? |
I don't think we need anything beyond what you mentioned. Thank you in advance. I'm traveling next week and won't have time for this any time soon |
@sigmavirus24 and @christianmlong I wrote a test for a function that downloads a large zip file using the requests module. I've found a discrepancy in Content-Length when comparing test execution with Betamax and without it. Using Betamax, the Content-Length of the extracted binary string is way larger. Besides that, I need to pass that binary string to BytesIO and then to

My test setup:

import os
from io import BytesIO
from zipfile import ZipFile

import betamax
from betamax.fixtures import unittest

# Record mode is taken from the environment (None if the variable is unset).
mode = os.getenv('BETAMAX_RECORD_MODE')
with betamax.Betamax.configure() as config:
    config.cassette_library_dir = 'tests/test_funcs/cassettes'
    config.default_cassette_options['record_mode'] = mode
print(f'Using record mode <{mode}>')


def the_function(session):
    # session = requests.Session()
    response = session.get("https://ww2.stj.jus.br/docs_internet/processo/dje/xml/stj_dje_20211011_xml.zip")
    zip_in_memory = BytesIO(response.content)
    try:
        my_zip = ZipFile(zip_in_memory, 'r')
        # testzip() returns the name of the first corrupt member, or None if all are fine.
        result = my_zip.testzip() is None
    except Exception:
        result = False
    return result


class BaseTest(unittest.BetamaxTestCase):
    custom_headers = None
    custom_proxies = None
    _path_to_ignore = None
    _no_generator_return_search = False

    def setUp(self):
        super(BaseTest, self).setUp()
        if self.custom_headers:
            self.session.headers.update(self.custom_headers)
        if self.custom_proxies:
            self.session.proxies.update(self.custom_proxies)
        # worker_class is not defined in this snippet; presumably subclasses supply it.
        self.worker_under_test = self.worker_class()
        self.worker_under_test._session = self.session

    def test_search(self):
        result = the_function(self.session)
        assert result

I pass the

Related question: https://stackoverflow.com/questions/69653406/how-to-mock-a-function-that-downloads-a-large-binary-content-using-betamax |
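To narrow down a report like the one above, it can help to check the declared `Content-Length` against the actual body size, and to check the archive's integrity, without touching the network. The sketch below is one way to do that. The helper names `declared_length_matches` and `is_valid_zip` are hypothetical, and the text round trip at the end only illustrates one way a binary body can grow in general. It is not a confirmed explanation of the behaviour reported here.

```python
import io
import zipfile


def declared_length_matches(headers, body):
    """Return True if the Content-Length header equals len(body)."""
    return int(headers.get("Content-Length", -1)) == len(body)


def is_valid_zip(body):
    """Return True if body is a readable zip whose members all pass their CRC check."""
    try:
        with zipfile.ZipFile(io.BytesIO(body)) as archive:
            return archive.testzip() is None
    except zipfile.BadZipFile:
        return False


# Build a small archive in memory (stored uncompressed by default).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("sample.xml", "<a>caf\u00e9</a>".encode("utf-8"))
body = buffer.getvalue()
headers = {"Content-Length": str(len(body))}

# Treating binary data as text and re-encoding it changes its length:
# every byte >= 0x80 becomes two bytes in UTF-8.
mangled = body.decode("latin-1").encode("utf-8")

print(declared_length_matches(headers, body), is_valid_zip(body))
print(len(body), len(mangled), declared_length_matches(headers, mangled))
```

Running the same two checks on a live response and on a replayed one shows whether the body bytes themselves differ or only the header does.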
I don't believe your issue is related to this issue (and likely that we should close this one at this point). Please open an issue and include the headers from the response that you're recording. |
I think there is something whack in the way non-ascii data is handled. I'm getting discrepancies in the response objects from live server and playback, and this is causing some issues in my app. I've tried to make a minimal example to reproduce the bug in a script below:
It fails in both python2 and python3, but with different failure modes.
In python2 it fails because response1 and response2 have mutated.
In python3 it fails earlier, during the recording, whilst attempting to serialise a bytestring - an action which is a TypeError in python3 (and this gives a big hint at why the python2 version fails).
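The reproduction script is not preserved in this copy. As a rough illustration of the kind of discrepancy described, the following sketch compares the Python types of header values between a "live" and a "played back" set of headers. The helper `header_type_mismatches` and the sample headers are hypothetical, and plain dicts stand in for real response objects.

```python
def header_type_mismatches(live_headers, replayed_headers):
    """Map each shared header name to (live type, replayed type) where the types differ."""
    mismatches = {}
    for name in live_headers.keys() & replayed_headers.keys():
        live_type = type(live_headers[name])
        replayed_type = type(replayed_headers[name])
        if live_type is not replayed_type:
            mismatches[name] = (live_type.__name__, replayed_type.__name__)
    return mismatches


# Hypothetical data: the live value is bytes, the replayed value is text.
live = {"Content-Type": "text/plain", "X-Token": b"caf\xc3\xa9"}
replayed = {"Content-Type": "text/plain", "X-Token": "caf\u00e9"}

mismatches = header_type_mismatches(live, replayed)
print(mismatches)
```

A check like this in a test makes the bytes-versus-text difference explicit, rather than leaving it to surface later as an unexpected comparison failure.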