Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Be stricter about JSON that is accepted by Sygnal #216

Merged
merged 5 commits into from
Apr 13, 2021
Merged

Conversation

richvdh
Copy link
Member

@richvdh richvdh commented Apr 12, 2021

This disables the JSON extensions which Python supports by default (parsing of
Infinity / -Infinity and NaN). These shouldn't be accepted since they're
not technically valid JSON and other languages won't be able to interpret it
properly.

This disables the JSON extensions which Python supports by default (parsing of
`Infinity` / `-Infinity` and `NaN`). These shouldn't be accepted since they're
not technically valid JSON and other languages won't be able to interpret it
properly.
@richvdh richvdh requested a review from a team April 12, 2021 17:28
sygnal/gcmpushkin.py Show resolved Hide resolved
Signed-off-by: Dan Callahan <danc@element.io>
Signed-off-by: Dan Callahan <danc@element.io>
Signed-off-by: Dan Callahan <danc@element.io>
@callahad
Copy link
Contributor

I rewrote the json decoder to use functools.partial to bind json.loads instead of instantiating a new JSONDecoder. Tests pass now, and it behaves as expected:

>>> safe_json_loads('{"hello": "world"}')
{'hello': 'world'}

>>> safe_json_loads('{"hello": Infinity}')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/json/__init__.py", line 361, in loads
    return cls(**kw).decode(s)
  File "/usr/local/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/lib/python3.7/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "<stdin>", line 3, in _reject_invalid_json
ValueError: Invalid JSON value: 'Infinity'

@callahad callahad requested review from callahad and a team and removed request for callahad April 12, 2021 21:44
callahad
callahad previously approved these changes Apr 12, 2021
@callahad callahad dismissed their stale review April 12, 2021 21:47

Trying in vain to clear my review and replace it with a request for review by synapse-core... but apparently GitHub doesn't make that easy if you're a member of synapse-core

@callahad callahad requested a review from a team April 12, 2021 21:47
sygnal/utils.py Outdated


# a custom JSON decoder which will reject Python extensions to JSON.
safe_json_loads = partial(json.loads, parse_constant=_reject_invalid_json)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given that json.loads instantiates a JSONDecoder, why this is an improvement?

(Also: we do the local json_decoder thing in Sydent and Synapse, and there is value in being consistent)

Copy link
Contributor

@callahad callahad Apr 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this is an improvement?

Because it works 🙂 cf failures in CI with the previous solution.

Keeping aligned with loads fits my brain better, but I don't have strong feelings at all so long as the tests pass.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon further reflection, I think I have a very slight preference for the partial solution, since it is literally a drop-in replacement for json.loads, while JSONDecoder.decode, empirically, is not.

The actual difference is that json.loads accepts str | bytes and handles decoding while JSONDecoder.decode only accepts str.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I'd rather avoid constructing a new JSONDecoder every call if possible. I've tweaked this to use a static json_decoder whilst still (hopefully) passing CI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather avoid constructing a new JSONDecoder every call if possible

Seems like a premature optimization, but tests pass, so if you're happy I'm happy :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, our comments have been crossing.

Personally, I would rather we were type-strict and were certain about whether we had bytes or strs, rather than using interfaces which handle both. (Incidentally: I now realise the reason this didn't bite us on Synapse or Sydent is that both of those support python 3.5, where json.loads does not accept bytes, so we had already done the explicit .decode() calls).

So, to summarise, I see the following in favour of this approach:

  • consistency with Synapse and Sydent
  • type-strictness
  • avoids constructing a JSONDecoder on each call (yes, a tiny optimisation, but stuff like this adds up, especially on Synapse)

So, thank you for humouring me :)

@richvdh richvdh merged commit 214497a into master Apr 13, 2021
@richvdh richvdh deleted the rav/json-decoder branch April 13, 2021 10:19
awesome-michael added a commit to Awesome-Technologies/sygnal that referenced this pull request May 6, 2021
Sygnal 0.9.3 (2021-04-22)
=========================

Features
--------

- Prevent the push key from being rejected for temporary errors and oversized payloads, add TTL logging, and support `events_only` push data flag. ([\matrix-org#212](matrix-org#212))
- WebPush: add support for Urgency and Topic header ([\matrix-org#213](matrix-org#213))

Bugfixes
--------

- Fix a long-standing bug where invalid JSON would be accepted over the HTTP interfaces. ([\matrix-org#216](matrix-org#216))
- Limit the size of requests received from HTTP clients. ([\matrix-org#220](matrix-org#220))

Updates to the Docker image
---------------------------

- Remove manually added GeoTrust Root CA certificate from docker image as Apple is no longer using it. ([\matrix-org#208](matrix-org#208))

Improved Documentation
----------------------

- Make `CONTIBUTING.md` more explicit about how to get tests passing. ([\matrix-org#188](matrix-org#188))
- Update `CONTRIBUTING.md` to specify how to run code style and type checks with Tox, and add formatting to code block samples. ([\matrix-org#193](matrix-org#193))
- Document how to work around pip installation timeout errors. Contributed by Omar Mohamed. ([\matrix-org#215](matrix-org#215))

Internal Changes
----------------

- Update Tox to run in the installed version of Python (instead of specifying Python 3.7) and to consider specific paths and folders while running checks, instead of the whole repository (potentially including unwanted files and folders, e.g. the virtual environment). ([\matrix-org#193](matrix-org#193))
- Make development dependencies available as extras. Contributed by Hillery Shay. ([\matrix-org#194](matrix-org#194))
- Update `setup.py` to specify that a minimum version of Python greater or equal to 3.7 is required. Contributed by Tawanda Moyo. ([\matrix-org#207](matrix-org#207))
- Port CI checks to Github Actions. ([\matrix-org#210](matrix-org#210), [\matrix-org#219](matrix-org#219))
- Upgrade development dependencies. Contributed by Omar Mohamed ([\matrix-org#214](matrix-org#214))
- Set up `coverage.py` to run in tox environment, and add html reports ([\matrix-org#217](matrix-org#217))

Change-Id: I14ae821ff2a1562e91fd87ce25f73baaa0b9430b
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants