Add pretty printing, key sorting, and better performance to to_json in Jinja #91253
homeassistant/helpers/template.py
Outdated
+def to_json(value, ensure_ascii=True, indent=None, sort_keys=False):
     """Convert an object to a JSON string."""
-    return json.dumps(value, ensure_ascii=ensure_ascii)
+    return json.dumps(
+        value, ensure_ascii=ensure_ascii, indent=indent, sort_keys=sort_keys
+    )
We are generally moving away from json because of performance reasons. We should use homeassistant.helpers.json here.
I can take a look at the json helpers API, but I'm using this here because it's actually how the existing to_json Jinja filter is implemented, and I would be worried about introducing quirks by using a different API.
Looking at it a bit, it looks like the helper version is made for serializing Home Assistant objects and handles things like custom types, date encodings, etc. I'm not sure that's the right fit for Jinja, but I'm open to input.
Even using orjson directly would be fine. The performance issues are with the built-in json library.
Got it. Will explore making the change.
One way out of this would be to deprecate the existing to_json (leaving it in place but deprecated/not recommended) and replace it with as_json using the new library. The existing conversion APIs like as_datetime, as_local use the as naming scheme instead of the to naming scheme, so it would be more consistent anyway. Deserializing may be possible compatibly already.
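For illustration, the deprecate-and-replace idea in this comment could look roughly like the sketch below. It only uses the standard library; the function bodies, the warning text, and the choice of `DeprecationWarning` are assumptions made for the example, not code from the PR.

```python
import json
import warnings
from typing import Any


def as_json(value: Any) -> str:
    """Hypothetical replacement filter; the name comes from the proposal."""
    # Assumption: the new filter emits unicode as-is instead of escaping it.
    return json.dumps(value, ensure_ascii=False)


def to_json(value: Any, ensure_ascii: bool = True) -> str:
    """Existing filter, left in place but flagged as deprecated."""
    warnings.warn(
        "to_json is deprecated; use as_json instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return json.dumps(value, ensure_ascii=ensure_ascii)
```

Old templates would keep rendering the same output under this scheme; the cost is that every caller eventually has to migrate to the new name.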
> My biggest question would be whether the quirks of orjson are the ones we want to live with in the API -- Hyrum's law and all 😉.
90-95% of the common paths in the code base have already moved to it, so I think that's not a concern we need to retread.
> It seems at a glance like a fairly inflexible API (sacrificing some usability and flexibility for performance, which is often the right trade off as internal implementation detail but maybe less nice for developer experience) and would take things like uuids and datetimes without a straightforward way to turn those features off -- quirks we'd need to support in perpetuity.
I don't think templates are going to have dataclass or UUID objects, but we already serialize datetime to isoformat almost everywhere else, so it lines up nicely.
> One way out of this would be to deprecate the existing to_json (leaving it in place but deprecated/not recommended) and replace it with as_json using the new library. The existing conversion APIs like as_datetime, as_local use the as naming scheme instead of the to naming scheme, so it would be more consistent anyway. Deserializing may be possible compatibly already.
That seems like a reasonable path forward 👍
actually -- from_json is already using orjson. So it's just to_json. I think I can set a default handler and turn on passthroughs to disable all of the nonstandard types (at least at the outset so we wouldn't be stuck with them). I'm going to give that a shot. Will update this PR with the proposal.
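A minimal sketch of what "a default handler plus passthroughs" might look like is shown below. The `OPT_PASSTHROUGH_*` flags and the `default` argument are part of orjson's public API, but which flags the PR actually enabled is not shown in this thread, so the selection here is an assumption. The helper names `strict_dumps` and `_reject_unknown` are hypothetical, and the standard-library fallback exists only so the sketch runs where orjson is not installed.

```python
import json
from datetime import datetime
from typing import Any

try:
    import orjson  # third-party; provides the fast path
except ImportError:  # keep the sketch runnable without it
    orjson = None


def _reject_unknown(obj: Any) -> Any:
    """Default handler: refuse anything that is not a plain JSON type."""
    raise TypeError(f"Unable to serialize {type(obj).__name__} to JSON")


def strict_dumps(value: Any) -> str:
    """Serialize while opting out of the serializer's extra native types."""
    if orjson is None:
        # Standard library: non-JSON types already reach the default handler.
        return json.dumps(value, ensure_ascii=False, default=_reject_unknown)
    # Passthrough flags hand these types to `default` instead of
    # serializing them natively (assumed selection, see lead-in).
    option = (
        orjson.OPT_PASSTHROUGH_DATETIME
        | orjson.OPT_PASSTHROUGH_DATACLASS
        | orjson.OPT_PASSTHROUGH_SUBCLASS
    )
    return orjson.dumps(value, default=_reject_unknown, option=option).decode("utf-8")


if __name__ == "__main__":
    print(strict_dumps({"state": "on", "brightness": 128}))
    try:
        strict_dumps({"when": datetime(2023, 4, 1)})
    except TypeError as err:  # orjson.JSONEncodeError subclasses TypeError
        print("rejected:", err)
```

Starting strict keeps the option of allowing more types later, whereas accepting them from day one would make their output format part of the template API.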
OK -- gave it a shot. Please take a look. I'm not sure if I did all the things right for deprecation here (and certainly we'd need to document the change).
Co-authored-by: J. Nick Koston <nick@koston.org>
Ok -- added a documentation PR and hopefully addressed outstanding comments. Let me know if there's more to do!
LGTM. I need to do manual testing but may run out of time tonight.
Awesome. Thanks for the review and all the back and forth! Appreciate you working through it with me.
Small comment here: I suggest adding type hints to the new functions.
Co-authored-by: epenet <6771947+epenet@users.noreply.github.com>
I don't understand why this is a breaking change, or why we need to make it breaking.
This will break every single template out there (in forums, Google, YouTube videos, tutorials, whatever) that uses it.
If we want a faster JSON serializer, I suggest applying it to to_json instead.
One more option just came to mind:
I may sketch that out briefly -- see if it makes it more palatable, but @frenck, I'm definitely interested in feedback on that approach.
@depoll IMHO, removing/deprecating/replacing is not an option or consideration. Templates are used a lot, including blueprints for which end-users aren't provided updates.
Option 5 seems like a reasonable approach if we don't want to accept a breaking change. Personally, I am not a fan of backwards compatibility forever here, considering I had trouble finding an example of someone using this on the forums (not that it means it's not being used), but it isn't especially difficult to maintain it here if that's the goal.
Ok, so I gave option 5 a shot. It's still technically breaking because If that's still too breaky, there are two paths forward:
That seems like a reasonable approach to me. I think
I think it would also be fine to not deprecate
Another option would be to keep
I think this would still technically be breaking (because it's effectively changing
That seems like the best compromise, as we want it to perform well by default and users shouldn't have to figure out how to make it performant.
I agree on the latter parts; I think that is acceptable. I also agree that we can't keep backward compatibility forever. That said, looking at the resulting solution, the larger part of use cases won't be affected (which IMHO is a good compromise).
I think we have a good path forward here. I'm happy with the result. Thanks everyone for the input and collaboration.
Tried to update the PR description and text to match where we landed. It definitely resulted in a few things getting bundled together, but I think we landed in a good spot. Also updated the docs PR.
Let's separate the text into proposed changes and breaking changes, with the new functionality in the proposed changes section. The breaking changes section should be limited to what an affected user needs to do to handle the change and under what conditions they would need to do it. It's what will appear in the release notes.
Ok -- moved the text into the proposed change section and called out the breaking part explicitly in the Breaking Change section.
Thanks @depoll
Manual testing looks good. (not that I expected it to work any other way than the test cases but always good to go)
Thanks, @depoll 👍
../Frenck
…n Jinja (home-assistant#91253) Co-authored-by: J. Nick Koston <nick@koston.org> Co-authored-by: epenet <6771947+epenet@users.noreply.github.com>
Breaking change
The ensure_ascii argument for to_json in Jinja templates now defaults to False, allowing us to use a faster JSON encoder by default. For most people, this should not be an issue as JSON parsers broadly accept unicode input. If you still need to encode unicode characters inside of JSON strings, set ensure_ascii to True explicitly to restore the old behavior.
Proposed change
Adds pretty_print and sort_keys to to_json in Jinja templates. to_json now uses the faster orjson serializer by default, which requires us to default ensure_ascii to false, though this shouldn't impact many people as parsers following the JSON standard should support unicode in strings anyway. ensure_ascii continues to exist for people to set explicitly if they still need this compatibility option.
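The behavior described above can be sketched as a small dispatch: take the fast serializer unless the caller explicitly asks for ASCII escaping. This is an illustration under stated assumptions, not the merged implementation. In particular, mapping pretty_print to a two-space indent is a guess, and the fallback to the standard library when orjson is missing is only there to keep the example runnable.

```python
import json
from typing import Any

try:
    import orjson  # third-party; provides the fast path
except ImportError:  # keep the sketch runnable without it
    orjson = None


def to_json(
    value: Any,
    ensure_ascii: bool = False,
    pretty_print: bool = False,
    sort_keys: bool = False,
) -> str:
    """Convert an object to a JSON string (illustrative sketch)."""
    if ensure_ascii or orjson is None:
        # Compatibility path: only the standard library can escape
        # non-ASCII characters as \uXXXX sequences.
        return json.dumps(
            value,
            ensure_ascii=ensure_ascii,
            indent=2 if pretty_print else None,  # assumed mapping
            sort_keys=sort_keys,
        )
    option = 0
    if pretty_print:
        option |= orjson.OPT_INDENT_2
    if sort_keys:
        option |= orjson.OPT_SORT_KEYS
    # orjson returns bytes; `option or None` avoids passing a bare 0.
    return orjson.dumps(value, option=option or None).decode("utf-8")


if __name__ == "__main__":
    print(to_json({"name": "Küche"}, ensure_ascii=True))  # {"name": "K\u00fcche"}
    print(to_json({"name": "Küche"}))  # unicode is emitted unescaped
```

Both outputs parse to the same value in any standards-compliant JSON parser, which is why the changed default is expected to affect few users.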