Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use lowercase strings for bool dict keys #614

Merged
merged 1 commit into from Nov 27, 2023

Conversation

eltoder
Copy link
Contributor

@eltoder eltoder commented Nov 18, 2023

Fixes #613

Also,

  • Consolidate key conversion for sorted and unsorted cases.
  • Fix memory leak of the "null" string when handling None dict key.

Fixes ultrajson#613

Also,

* Consolidate key conversion for sorted and unsorted cases.
* Fix memory leak of the "null" string when handling None dict key.
@eltoder
Copy link
Contributor Author

eltoder commented Nov 18, 2023

One thing I noticed is that objects of unknown types are usually passed through repr for serialization, but when used as dict keys they are passed through str instead. This seems like a weird inconsistency, but not sure if we should do anything about it.

@codecov-commenter
Copy link

codecov-commenter commented Nov 18, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (1c8188d) 91.67% compared to head (a1c4a9a) 91.68%.

❗ Current head a1c4a9a differs from pull request most recent head 834c33e. Consider uploading reports for the commit 834c33e to get more accurate results

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #614      +/-   ##
==========================================
+ Coverage   91.67%   91.68%   +0.01%     
==========================================
  Files           6        6              
  Lines        1946     1949       +3     
==========================================
+ Hits         1784     1787       +3     
  Misses        162      162              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@bwoodsend
Copy link
Collaborator

bwoodsend commented Nov 19, 2023

@hugovk Reckon this constitutes a breaking change? It's not that unreasonable for someone to be using ast.literal_eval(key) to deserialise string-ified keys. That said, this is undefined behaviour territory so you could argue that the backwards compatibility rules don't really apply.

To be honest, I'm not particularly comfortable with interchanging one undefined behaviour for another.

@hugovk
Copy link
Member

hugovk commented Nov 19, 2023

If we go for this, I would err on calling it a breaking change: the major bump will draw attention to the change, for those who don't care it doesn't matter, but for those who do, they were warned.

@eltoder
Copy link
Contributor Author

eltoder commented Nov 19, 2023

@bwoodsend Do you mean that someone would iterate over all dict keys and call literal_eval on each? This seems like a terrible idea. It won't work for the the majority of key types. It's also highly non-portable. All other json implementations listed in the README produce lowercase strings:

>>> json.dumps({True: 1, False: 2, None: 3})
'{"true": 1, "false": 2, "null": 3}'
>>> simplejson.dumps({True: 1, False: 2, None: 3})
'{"true": 1, "false": 2, "null": 3}'
>>> orjson.dumps({True: 1, False: 2, None: 3}, option=orjson.OPT_NON_STR_KEYS)
b'{"true":1,"false":2,"null":3}'

It also seems odd to special case None to convert it to "null", but not do the same for True and False. We can call it a breaking change, though AFAICT, the same was not done for the change to handle None.

@bwoodsend
Copy link
Collaborator

Do you mean that someone would iterate over all dict keys and call literal_eval on each? This seems like a terrible idea. It won't work for the the majority of key types.

It would work for booleans, integers and floats. If that's all your data's keys contain then I don't see any reason why people wouldn't do it. I'm not saying it's a good idea but people do do weird stuff with this library. We don't have to change much before we get feature requests demanding that we add an option to enable some previous behaviour.

I do agree that None mapping to "null" but booleans being capitalised is weird.

@eltoder
Copy link
Contributor Author

eltoder commented Nov 19, 2023

If we are discussing a real use case, then it seems highly unlikely that someone would have keys of mixed types of specifically just numbers and bools and nothing else. If they have other types in the mix, literal_eval won't work. If they have just numbers, which does seem reasonable, it is much faster and safer to use int or float instead.

Of course, every change can break something12, but that's not a reason to not fix issues :)

Footnotes

  1. https://www.hyrumslaw.com/

  2. https://xkcd.com/1172/

@bwoodsend bwoodsend added the changelog: Changed For changes in existing functionality label Nov 25, 2023
@bwoodsend bwoodsend merged commit a08b75b into ultrajson:main Nov 27, 2023
22 of 23 checks passed
@eltoder eltoder deleted the feature/dumps-bool-keys branch December 1, 2023 11:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog: Changed For changes in existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bool values produce capitalized strings when used as dict keys
4 participants