-
Notifications
You must be signed in to change notification settings - Fork 417
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use json instead of ujson in tests #1255
Conversation
to catch bytes instead of str passed to storage. Fixes Kinto#1238. A new boolean settings `storage_strict_json` is introduced to configure json instead of ujson in tests but keep ujson in prod. `strict_json` is True by default in `StorageBase` constructor, but False by default in `load_from_config`. strict_json is still not in effect in: - test_main.py, - tests/core/test_initialization.py - tests/plugins/test_{history,quotas}.py#test_a_statsd...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't followed the whole conversation about this, but the approach here adds a bit of complexity (new setting, new storage behaviour, ...).
Are we still sure that ujson brings us enough benefit compared to python 3.6 json native module? Maybe we could drop it completely.
What do you think?
CHANGELOG.rst
Outdated
@@ -40,6 +40,7 @@ This document describes changes between each past release. | |||
|
|||
- Make memory storage consistent with PostgreSQL with regard to bytes (#1237) | |||
- Some minor cleanups about the use of kinto.readonly (#1241) | |||
- use json instead of ujson in storage in tests (#1255) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: Use (capitalized)
@@ -17,7 +17,8 @@ class PluginSetup(unittest.TestCase): | |||
def test_a_statsd_timer_is_used_for_history_if_configured(self): | |||
settings = { | |||
"statsd_url": "udp://127.0.0.1:8125", | |||
"includes": "kinto.plugins.history" | |||
"includes": "kinto.plugins.history", | |||
"storage_strict_json": True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's necessary to change this for the history plugin tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not? The goal is to make sure the JSON used by the history public are serializable. It should not hurt to activate it.
docs/configuration/settings.rst
Outdated
@@ -122,6 +122,8 @@ Storage | |||
+------------------------------+-------------------------------+--------------------------------------------------------------------------+ | |||
| kinto.storage_max_backlog | ``-1`` | Number of threads that can be in the queue waiting for a connection. | | |||
+------------------------------+-------------------------------+--------------------------------------------------------------------------+ | |||
| kinto.storage_strict_json | ``False`` | (test only) validate that records are serializable using the json module.| | |||
+------------------------------+-------------------------------+--------------------------------------------------------------------------+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is meant to be used in tests only I don't think it's necessary to document it here. No?
Kinto speed benefit heavily from the use of ujson. We should definitely keep it. |
The rationale is that PostgreSQL backend is using ujson while the memory backend was not. We discovered that sometime ujson doesn't behave exactly as the json module. @elelay upgraded the memory backend to use ujson so that both backend behave the same way with regards to JSON handling. To be honest I would just use ujson everywhere without adding a new setting. |
Do we have numbers? We did benchmarks 2 years ago with Python 2.7. I continuously read about python 3.6 super performances. We should probably confirm that with updated figures.
+1 (all or nothing) |
Sure if you feel like doing so.
We are not using Python 3.6 yet in production as far as I know. But I really doubt JSON handling in Python could be faster than in libc. |
The problem is that the behavior of ujson is arguably broken and lets a developer accidentally serialize things that shouldn't be serializable. Any arbitrary object will be serialized by using its Personally I don't think this is too much complexity to avoid those kinds of bugs. (The PR only changes ~75ish lines of code.) The only thing that I might consider is trying to localize the complexity to the |
I see an up to 5x speed improvement (decode 256 UTF8 strings) for ujson/json, with an average of 2x for encode and decode.
|
+1 this is why I took the trouble to implement this PR. Now, choosing between ujson and json could maybe be implemented more elegantly. Is it possible to mock an attribute in any future instance of a class or sthing? Then we could switch the json attribute of StorageBase before tests and not have to explicitly configure it. |
Meanwhile, I'm modifying ujson to add a 'reject_bytes' parameter. |
Refs ultrajson/ultrajson#266. I tried "simplifying" this by centralizing the logic to dispatch between JSON flavors in @leplatrem @Natim Would you veto if I wanted to land it? I see a +1 for using ujson on everything, but no strong objection against the patch as it stands now. Do you have an alternative implementation in mind that would be cleaner? This PR needs a rebase in the meantime, plus there are a few remaining nits from the review. |
No, go ahead.
I am not confident handling the conflicts myself. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved the merge conflict using the fact that I was the person who wrote the conflicting code. Otherwise LGTM with a couple remaining nits that @leplatrem pointed out.
Thanks @glasserc. I think all nits are addressed. |
Thanks for your work and for bearing with us while we sorted this out! |
Fixes #1238
r? @glasserc @Natim