Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix] Convert NaNs and Infs in responses to strings #2634

Merged
merged 3 commits into from
Apr 6, 2023

Conversation

n-gao
Copy link
Contributor

@n-gao n-gao commented Mar 30, 2023

This PR introduces a recursive function that replaces all np.nan by NaN, np.inf by inf and -np.inf by -inf as strings to avoid JSON encoding errors. An alternative would be to use the orjson but then these instances would be replaced by None.

@mihran113 mihran113 requested review from mihran113 and alberttorosyan and removed request for mihran113 April 5, 2023 15:18
@@ -152,6 +153,8 @@ async def run_params_api(run_id: str,
'traces': run.collect_sequence_info(sequence, skip_last_value=True),
'props': get_run_props(run)
}
# Convert NaN and Inf to strings
response = convert_nan_and_inf_to_str(response)
Copy link
Contributor

@mihran113 mihran113 Apr 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @n-gao! Thanks a lot for contribution.
My concern is the efficiency here, as for larger run params, this might take a while actually.
My suggestion is to do something like this:

    response_string = json.dumps(
            response,
            ensure_ascii=False,
            allow_nan=True,
            indent=None,
            separators=(",", ":"),
        )
    parsed_response_string = re.sub(r'(NaN|Infinity|-Infinity)', r'"\1"', response_string)
    return Response(parsed_response_string.encode('utf-8'), media_type="application/json")

As Fastapi's JSONResponse dumps the dict into a string anyways, so we can do that step manually, and add quotes around the values that are not json encodable, I guess this will do the trick much more efficiently.
What do you think about this approach?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mihran113, thanks for the feedback! I wouldn't fear much about the performance of the metric as it is very similar to the python JSON encoder. So, using json.dumps will roughly have the same runtime. So, first replacing NaNs and then encoding will have at most double the runtime then first encoding and then replacing. If one accounts for the extra time to search through the whole string to replace substrings, the runtimes are probably going to be very close. But, the main reason I'd avoid such an implementation is that if any string contains the word NaN, Infinity, or -Infinity this approach would break the string.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, makes sense, and after all this will be a temporary fix, before we move this endpoint to streaming response as well. So, can I ask you to update changelog as well, so I can merge the PR?

Here's an example how to do it, as there's no other changes yet after the patches:

## Unreleased 

### fixes
-  Convert NaNs and Infs in responses to strings (n-gao)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done! :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, seems there was just a change on changelog, sorry about that, but there's a conflict now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved

Copy link
Contributor

@mihran113 mihran113 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Thanks for the contribution.

@mihran113 mihran113 changed the title Convert NaNs and Infs in responses to strings. Fixes #2629 [fix] Convert NaNs and Infs in responses to strings Apr 6, 2023
@mihran113
Copy link
Contributor

Resolves #2629

@mihran113 mihran113 merged commit 962d42c into aimhubio:main Apr 6, 2023
mihran113 pushed a commit that referenced this pull request May 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants