-
-
Notifications
You must be signed in to change notification settings - Fork 33.6k
gh-117391: Add convert_keys to JSON encoder to enable encoding of dict keys not supported in JSON
#117392
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
… dictionary keys through the `.default()` method for conversion to a supported type.
…ry keys that are not of a JSON-supported simple type through the encoder's .default() method.
…vert_keys` as documented.
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
| Add `convert_keys` parameter to JSON encoding, to allow passing dict keys that | ||
| are not supported types in JSON through the encoder's `.default()` method. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Add `convert_keys` parameter to JSON encoding, to allow passing dict keys that | |
| are not supported types in JSON through the encoder's `.default()` method. | |
| Add ``convert_keys`` parameter to JSON encoding, to allow passing dict keys that | |
| are not supported types in JSON through the encoder's ``.default()`` method. |
|
Sorry for the noise; it took me a few tries to get the docs formatting correct. Example of use: bytes, pathlib.Path, uuid.UUID, and ipaddress objects are "enough like" str that they are used as keys. Dumping such a dict to JSON doesn't work without this feature, which can be used like: class CustomEncoder(json.JSONEncoder):
def default(self, obj):
if isinstance(obj, bytes):
return obj.hex(":")
elif isinstance(obj, (pathlib.Path, uuid.UUID, ipaddress._IPAddressBase)):
return str(obj)
# Let the base class default method raise the TypeError
return super().default(obj)
d = {
b"foobar": "a bytes key",
pathlib.Path("/tmp"): "a Path key",
uuid.uuid4(): "a UUID key",
ipaddress.ip_address("127.0.0.1"): "an IP address key",
}
>>> print(json.dumps(d, cls=CustomEncoder, indent=4, convert_keys=True))
{
"66:6f:6f:62:61:72": "a bytes key",
"/tmp": "a Path key",
"724cd681-d8ea-4946-98bd-aecca19c0311": "a UUID key",
"127.0.0.1": "an IP address key"
}I've contributed before, many moons ago, but this is the first time I've added something to a C speedup module, so I may not have the refcounting correct. I'd appreciate corrections on this. |
gaogaotiantian
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put whether we should have this feature aside, I don't believe your python implementation and C implementation are equivalent. In your Python encoder, the keys could be converted by the default function to bool or None, but in _make_iterencode they should all be converted to str. You did that in the C implementation.
There's some twisted logic here - basically whether you allow default to convert to "serializable" objects like int or bool. As the function is currently used for values, it's fine, but if you want to use it for both keys and values, that would raise some issue. Do you want to double-convert from those to strings? Then you'll have another corner case - there are serializable objects that can't be converted to "basic types", for example, tuples.
If we truly want to do this, I think a better approach is to simply pass a function for convert_keys which converts everything to a string. The implementation would be much cleaner as well with the clearer restriction.
Thank you, you're right that the code in That's easy enough to do - but would involve either repeating some code, or factoring it out into a separate method, and I'm not sure which would be preferred here. Thoughts?
Yes. The way
I don't really see this as a corner case. Presumably the author who is customizing serialization of a dict knows that if they want to convert keys from other types to int, bool, None, float, or str, they know how to handle this. The only way it would seem to be a problem is if they wanted to handle a specific type differently when it occurs as a value (serializing to a tuple/list, say) than when it occurs as a key (serialize to str). That seems like a pretty rare case, in which case the author can handle it differently (say, by not setting
Separate conversion functions are certainly possible, but it looks to me like a rare case for which the additional complexity is perhaps more than necessary. |
…ssing the key through `.default()` to a str.
The reason I thought about alternatives was because this, the duplicated code is quite ugly.
Your implementation actually deviates the documentation a bit:
With For example, def default(val):
if isinstance(val, tuple):
return str(val)
s = json.dumps(v, convert_keys=True, default=default)
print(s)will be I believe a separate convert function will make the code for CPython cleaner, not more complex - because you don't have to do the double conversion anymore. For the user code, the only scenario that it's worse than the current implementation is when the user needs to convert a certain type to a basic type other than Maybe when using with Personally I do not use customized json encoder that much so I believe we should hear from the json experts. |
|
Thank you, Tian - I appreciate your thoughts and the time you spent on this. |
This adds
convert_keysas an optional parameter to the JSON encoder methods, which causes unsupported dict key types to be passed to the encoder's.default()method. They can then be encoded if.default()returns one of the supported JSON types..default()before throwing TypeError #117391📚 Documentation preview 📚: https://cpython-previews--117392.org.readthedocs.build/