json module should issue warning about duplicate keys #89217
Comments
The json module will allow the following without complaint:

    import json
    d1 = {1: "fromstring", "1": "fromnumber"}
    string = json.dumps(d1)
    print(string)
    d2 = json.loads(string)
    print(d2)

And it prints:

    {"1": "fromstring", "1": "fromnumber"}

This would be extremely confusing to anyone who doesn't already know that JSON keys have to be strings. Not only does [...]

I suggest that if json.dump or json.dumps notices that it is producing a JSON document with duplicate keys, it should issue a warning. Similarly, if json.load or json.loads notices that it is reading a JSON document with duplicate keys, it should also issue a warning. |
Sorry to the people I'm pinging, but I just noticed the initial dictionary in my example code is wrong. I figured I should fix it before anybody tested it and got confused about it not matching up with my description of the results. It should've been:

    import json
    d1 = {"1": "fromstring", 1: "fromnumber"}
    string = json.dumps(d1)
    print(string)
    d2 = json.loads(string)
    print(d2) |
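The dump-side warning requested in this issue can be approximated in user code. The following is only a sketch, not something the json module provides: `_json_key`, `find_key_collisions`, and `dumps_checked` are hypothetical names. It assumes that `str` keys are emitted unchanged and that `int`, `float`, `bool`, and `None` keys are stringified the same way `json.dumps` renders those values on their own.

```python
import json
import warnings


def _json_key(key):
    """Return the string the encoder is assumed to emit for a dict key."""
    if isinstance(key, str):
        return key
    if key is None or isinstance(key, (bool, int, float)):
        # json.dumps(1) -> '1', json.dumps(True) -> 'true', json.dumps(None) -> 'null'
        return json.dumps(key)
    raise TypeError(f"unsupported key type: {type(key).__name__}")


def find_key_collisions(obj, path="$"):
    """Yield (path, json_key, original_keys) for keys that collide once stringified."""
    if isinstance(obj, dict):
        seen = {}
        for key in obj:
            seen.setdefault(_json_key(key), []).append(key)
        for json_key, keys in seen.items():
            if len(keys) > 1:
                yield path, json_key, keys
        # Recurse into the values so nested objects are checked too.
        for key, value in obj.items():
            yield from find_key_collisions(value, f"{path}.{_json_key(key)}")
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            yield from find_key_collisions(value, f"{path}[{index}]")


def dumps_checked(obj, **kwargs):
    """Like json.dumps, but warn when distinct keys map to the same JSON key."""
    for path, json_key, keys in find_key_collisions(obj):
        warnings.warn(f"keys {keys!r} at {path} all serialize to {json_key!r}")
    return json.dumps(obj, **kwargs)


# Record the warning instead of printing it to stderr, so it can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    text = dumps_checked({"1": "fromstring", 1: "fromnumber"})

print(text)
print([str(w.message) for w in caught])
```

The check walks the whole structure before encoding. That extra pass is the kind of overhead an opt-in helper can accept but a default code path might not.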
In general this sounds reasonable, but a couple of thoughts / comments:
|
Another good option would be to use typed dict like |
-0 on doing this. The suggested warning/error adds overhead that everyone would pay for but would almost never be of benefit.

I haven't seen this particular problem arise in practice. The likely reasons it doesn't come up are 1) that generated data doesn't normally produce mixed type keys, 2) because mixed type keys don't round-trip, and 3) even using numeric keys only (not mixed) is uncommon because it results in poor outcomes that fail round-trip invariants.

Andrei Kulakov is right in saying that such data suggests deeper problems with the design and that static typing would be beneficial.

One last thought: Even with regular dicts, we don't normally warn about encountering duplicate keys:

    >>> dict([(1, 'run'), (1, 'zoo'), (3, 'tree')])
    {1: 'zoo', 3: 'tree'} |
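To make the round-trip points concrete, here is a small illustration that uses only the standard `json` module. The variable names are arbitrary and chosen for this sketch.

```python
import json

# Numeric keys only, no mixing: the encoder still turns them into strings,
# so the decoded dict no longer equals the original.
original = {1: "run", 3: "tree"}
restored = json.loads(json.dumps(original))

print(restored)              # {'1': 'run', '3': 'tree'}
print(original == restored)  # False

# Mixed str/int keys: two entries go in, one comes back (last value wins).
mixed = {"1": "fromstring", 1: "fromnumber"}
print(json.loads(json.dumps(mixed)))  # {'1': 'fromnumber'}
```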
I second this, especially on load. I have run into this issue more than once when dealing with JSON from less experienced sources. Yes, it's a bad choice of keys, but if someone is doing that, it is hard to detect because you aren't even warned about it. I also don't think it's fair to compare the internal behavior of Python data types to the behavior when pulling external structures into an internal structure. It's also arguably bad practice that dict doesn't warn about the example @rhettinger gave. It seems reasonable to me to expect some kind of warning when an imperfect mapping occurs between data structures, especially when data is lost. I find this particularly annoying with JSON because I use the jsonschema package to validate JSON quite a bit, and this is something you essentially have to validate on read-in. For now, I use the solution in this SO post, but it feels bad having to opt in rather than out. |
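One opt-in way to catch duplicates on load is the `object_pairs_hook` parameter of `json.loads`, which hands each JSON object to a callback as a list of `(key, value)` pairs before any keys are collapsed. The sketch below may or may not match the linked post; the hook names `warn_on_duplicate_keys` and `reject_duplicate_keys` are hypothetical.

```python
import json
import warnings


def warn_on_duplicate_keys(pairs):
    """object_pairs_hook that warns when one JSON object repeats a key."""
    result = {}
    for key, value in pairs:
        if key in result:
            warnings.warn(f"duplicate key {key!r} in JSON object")
        result[key] = value  # keep the default "last value wins" behaviour
    return result


def reject_duplicate_keys(pairs):
    """Stricter variant: refuse the document instead of warning."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key {key!r} in JSON object")
        result[key] = value
    return result


text = '{"1": "fromstring", "1": "fromnumber"}'

# Record the warning instead of printing it to stderr, so it can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    data = json.loads(text, object_pairs_hook=warn_on_duplicate_keys)

print(data)         # {'1': 'fromnumber'}
print(len(caught))  # 1

try:
    json.loads(text, object_pairs_hook=reject_duplicate_keys)
except ValueError as exc:
    print(exc)
```

The hook is called for every object in the document, nested ones included, so the check covers the whole structure. It has to be passed on every `load`/`loads` call, which is the opt-in cost described above.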