-
Notifications
You must be signed in to change notification settings - Fork 320
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fix] Convert NaNs and Infs in responses to strings #2634
Conversation
@@ -152,6 +153,8 @@ async def run_params_api(run_id: str, | |||
'traces': run.collect_sequence_info(sequence, skip_last_value=True), | |||
'props': get_run_props(run) | |||
} | |||
# Convert NaN and Inf to strings | |||
response = convert_nan_and_inf_to_str(response) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @n-gao! Thanks a lot for contribution.
My concern is the efficiency here, as for larger run params, this might take a while actually.
My suggestion is to do something like this:
response_string = json.dumps(
response,
ensure_ascii=False,
allow_nan=True,
indent=None,
separators=(",", ":"),
)
parsed_response_string = re.sub(r'(NaN|Infinity|-Infinity)', r'"\1"', response_string)
return Response(parsed_response_string.encode('utf-8'), media_type="application/json")
As Fastapi's JSONResponse
dumps the dict into a string anyways, so we can do that step manually, and add quotes around the values that are not json encodable, I guess this will do the trick much more efficiently.
What do you think about this approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @mihran113, thanks for the feedback! I wouldn't fear much about the performance of the metric as it is very similar to the python JSON encoder. So, using json.dumps
will roughly have the same runtime. So, first replacing NaNs and then encoding will have at most double the runtime then first encoding and then replacing. If one accounts for the extra time to search through the whole string to replace substrings, the runtimes are probably going to be very close. But, the main reason I'd avoid such an implementation is that if any string contains the word NaN
, Infinity
, or -Infinity
this approach would break the string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, makes sense, and after all this will be a temporary fix, before we move this endpoint to streaming response as well. So, can I ask you to update changelog as well, so I can merge the PR?
Here's an example how to do it, as there's no other changes yet after the patches:
## Unreleased
### fixes
- Convert NaNs and Infs in responses to strings (n-gao)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, seems there was just a change on changelog, sorry about that, but there's a conflict now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! Thanks for the contribution.
Resolves #2629 |
# Conflicts: # CHANGELOG.md
This PR introduces a recursive function that replaces all
np.nan
byNaN
,np.inf
byinf
and-np.inf
by-inf
as strings to avoid JSON encoding errors. An alternative would be to use the orjson but then these instances would be replaced byNone
.