Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect serialization of float subclass with custom __str__ method #57

Closed
ryanmarquardt opened this issue Jul 31, 2022 · 5 comments
Closed
Labels
available Indicates which bugs have been triaged and reproduced, rather than just filed bug

Comments

@ryanmarquardt
Copy link

I have a float subclass with a customized str method, and its str output is appearing in the results of json5.dumps. Here's a simplified example:

>>> import json5
>>> class MyFloat(float):
...     def __str__(self):
...         return f"<{super().__str__()}>"

>>> print(json5.dumps(MyFloat(1)))
<1.0>

>>> json5.loads(json5.dumps(MyFloat(1))
Traceback (most recent call last):
...
ValueError: <string>:1 Unexpected "<" at column 1
@dpranke
Copy link
Owner

dpranke commented Jul 31, 2022

Thanks for reporting the issue. I have reproduced it locally (thanks to the easy repro!). I am a bit surprised that this hasn't been reported before.

It looks like json.dumps(MyFloat(1)) returns 1.0 instead of <1.0>. I'm not immediately sure why that is; it looks like the json module is calling repr(MyFloat(1)) instead of str(MyFloat(1)) (and json5 is doing the latter).

My first reaction is that json5 is right and json is wrong, i.e., we should be calling str here, not repr.

That said, I view matching json's behavior to be much more important than trying to say which one is "right". So, I'll definitely change it, but I'll also poke into this a bit more and see if there's something that explains why json might be correct to use repr.

Also, I'll point out that you shouldn't count on loads(dumps())he to always work in the presence of custom classes, at least not if you want to preserve the fact that the object is a MyFloat and not a str, at least not unless you're using a custom parse_float method. But you probably already know that :).

@dpranke dpranke added bug available Indicates which bugs have been triaged and reproduced, rather than just filed labels Jul 31, 2022
@ryanmarquardt
Copy link
Author

Thanks for taking a look at this so quick. It looks like the same issue crops up for int subclasses too.

I did some searching, and found that str and repr used to be different for float before around python 3.2, according to https://bugs.python.org/issue9337. repr was more precise, which is probably why it was used when json was originally written. Either way, the methods are identical now, so it doesn't make any difference.

Is there any reason why using float.__str__(MyFloat(1)) directly to sidestep any customization would cause problems?

@dpranke
Copy link
Owner

dpranke commented Jul 31, 2022

I did some searching ...

Thanks for those leads! :).

Is there any reason why using float.__str__(MyFloat(1)) directly to sidestep any customization would cause problems?

I have no idea. Changing the default behavior of any public function can usually cause problems. More importantly, reaching deep into a class's private methods doesn't feel great to me.

I think I can probably get equivalent behavior by actually calling JSONEncoder().encode(MyFloat(1)) and that would provide a stronger guarantee that json5's behavior matches json's. I'll probably code that up and see if it flies.

@dpranke
Copy link
Owner

dpranke commented Aug 1, 2022

I forgot that back in 10e4e82 I removed (the last) dependency on json.
I don't remember why I didn't want the dependency, exactly: it might have been just for aesthetics, or it might
have been by request from someone.

At any rate, in order to not reintroduce the dependency on json now, I'll call float.repr and int.repr directly
as you suggested instead.

dpranke added a commit that referenced this issue Aug 1, 2022
[GitHub issue #57](#57)
Fixed serialization for objects that subclass `int` or `float`:
Previously we would use the objects __str__ implementation, but
that might result in an illegal JSON5 value if the object had
customized __str__ to return something illegal. Instead,
we follow the lead of the `JSON` module and call `int.__repr__`
or `float.__repr__` directly.

While I was at it, I added tests for dumps(-inf) and dumps(nan)
when those were supposed to be disallowed by `allow_nan=False`.
@dpranke
Copy link
Owner

dpranke commented Aug 1, 2022

Okay, I've fixed this in 1f7f806 and pushed that to PyPI as v0.9.9.

Please let me know if you find any issues, and thanks again!

@dpranke dpranke closed this as completed Aug 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
available Indicates which bugs have been triaged and reproduced, rather than just filed bug
Projects
None yet
Development

No branches or pull requests

2 participants