Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Ensure that calls to json.dumps are compatible with the standard library json. #7836

Merged
merged 9 commits into from
Jul 15, 2020

Conversation

clokep
Copy link
Contributor

@clokep clokep commented Jul 13, 2020

This is essentially a continuation of #7802 (so part of #7674) and grabs a bunch more instances where we were import simplejson (indirectly via canonicaljson) to handle converting bytes to JSON objects.

After this PR, there's still ~20 instances to handle, all in the store code.

@clokep
Copy link
Contributor Author

clokep commented Jul 13, 2020

Hmm...looks like this might have broken something. 😢 Will need to do some more work here.

@@ -213,7 +215,7 @@ async def query_keys(self, request, query, query_remote_on_cache_miss=False):
else:
signed_keys = []
for key_json in json_results:
key_json = json.loads(key_json)
key_json = json.loads(key_json.decode("utf-8"))
Copy link
Contributor Author

@clokep clokep Jul 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should be using db_to_json instead (and the byte() calls above could be removed)? Feels really weird to put that into the REST layer though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, arguably get_server_keys_json should be manipulating the results so as not to return memoryview objects.

I think this is fine though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect it is doing it for efficiency since only some of the results actually end up being deserialized into JSON objects? Not sure though.

@clokep clokep requested a review from a team July 14, 2020 11:53
@clokep
Copy link
Contributor Author

clokep commented Jul 14, 2020

Hmm...looks like this might have broken something. 😢 Will need to do some more work here.

I fixed the issue, there was a place where postgresql was returning a memoryview, while sqlite returned bytes.

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks ok to me; however...

The intention of using from canonicaljson import json was not so much a sneaky way to get simplejson without using the word "simple" but more an attempt to use a consistent json de/serialiser everywhere. (We'd previously had problems where we validated an event using one json deserialiser, but then used a different deserialiser when reading it from the database, and somebody sent an event which broke synapse for everyone in Matrix HQ.)

so I chose my words carefully in #7674 (comment): my idea was that we would switch canonicaljson over to stdlib json, and allow that to propagate into synapse.

WDYT?

@clokep
Copy link
Contributor Author

clokep commented Jul 15, 2020

The intention of using from canonicaljson import json was not so much a sneaky way to get simplejson without using the word "simple" but more an attempt to use a consistent json de/serialiser everywhere. (We'd previously had problems where we validated an event using one json deserialiser, but then used a different deserialiser when reading it from the database, and somebody sent an event which broke synapse for everyone in Matrix HQ.)

Ah, that makes a bit more sense. I kind of assumed it was done in a couple of spots and then spread via copying/pasting, etc. I didn't realize it was done purposefully.

I find it confusing that we do sometimes import from the std lib JSON and other times we import from canonicaljson (which is usually simplejson). I was hoping to make this all consistent. Maybe doing it ahead of time could cause issues though! I was hoping to eventually make the canonicaljson's JSON implementation private (_json), but that would likely be very difficult now that I'm thinking about it.

I think this somewhat came up in #7372 too where it wold be beneficial to have a single place where we can import json from and develop a consistent config to use.

Regardless, I think the majority of this PR is valid: we could essentially just not change the imports as part of it, but still do the decoding. Does that sound reasonable as a first step @richvdh? We potentially would want to back out part of #7802 too then, which moves some imports as well.

@clokep
Copy link
Contributor Author

clokep commented Jul 15, 2020

We talked a bit about this in #synapse-dev:matrix.org and decided to rollback the changes to the imports (and also rollback the import changes in #7802).

@clokep clokep changed the title Stop using simplejson to handle converting bytes to json Ensure that calls to json.dumps are compatible with the standard library json. Jul 15, 2020
@clokep clokep requested a review from richvdh July 15, 2020 16:16
Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@clokep clokep merged commit 3545051 into develop Jul 15, 2020
@clokep clokep deleted the clokep/json-unicode branch July 15, 2020 17:40
babolivier pushed a commit that referenced this pull request Sep 1, 2021
* commit 'a973bcb8a':
  Add some tiny type annotations (#7870)
  Remove obsolete comment.
  Ensure that calls to `json.dumps` are compatible with the standard library json. (#7836)
  Avoid brand new rooms in `delete_old_current_state_events` (#7854)
  Allow accounts to be re-activated from the admin APIs. (#7847)
  Fix tests
  Fix typo
  Newsfile
  Use get_users_in_room rather than state handler in typing for speed
  Fix client reader sharding tests (#7853)
  Convert E2E key and room key handlers to async/await. (#7851)
  Return the proper 403 Forbidden error during errors with JWT logins. (#7844)
  remove `retry_on_integrity_error` wrapper for persist_events (#7848)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants