Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(#1559): handle api connection err gracefully #1636

Merged
merged 8 commits into from Aug 8, 2022
Merged

fix(#1559): handle api connection err gracefully #1636

merged 8 commits into from Aug 8, 2022

Conversation

Ankush-Chander
Copy link
Contributor

@Ankush-Chander Ankush-Chander commented Jul 23, 2022

Please refer: #1559

Changes made:

@Ankush-Chander
Copy link
Contributor Author

Hi @frascuchon

The error trace now looks like:

Traceback (most recent call last):
  File "/home/ankushchander/workplace/open_source/rubrix-tryout/test_rubrix.py", line 2, in <module>
    rb.init(api_url="http://localhost:69000")
  File "/home/ankushchander/workplace/open_source/rubrix/src/rubrix/client/api.py", line 576, in wrapped_func
    return func(*args, **kwargs)
  File "/home/ankushchander/workplace/open_source/rubrix/src/rubrix/client/api.py", line 590, in init
    __ACTIVE_API__ = Api(*args, **kwargs)
  File "/home/ankushchander/workplace/open_source/rubrix/src/rubrix/client/api.py", line 150, in __init__
    self._user: User = whoami(client=self._client)
  File "/home/ankushchander/workplace/open_source/rubrix/src/rubrix/client/sdk/users/api.py", line 23, in whoami
    raise Exception(f"Could not connect to api_url:{client.base_url}") from None
Exception: Could not connect to api_url:http://localhost:69000

Let me know if further changes required.

@frascuchon
Copy link
Member

Thanks, @Ankush-Chander for your work.

Indeed the referenced issue #1559 talks about the dataset load with unexpected errors. If you want to take a look, you could reproduce the error with the following code snippet:

import rubrix as rb

rb.log(name="mock-ds", records=rb.TextClassificationRecord(text="My text"))
rb.load(name="mock-ds") # The load works fine
rb.load(name="mock-ds", query="!!") # Here an error will be raise since a malformed query.

The error should be related to how the client handles the response, but I have little time to go deeper.

And again, thanks for your time!

@Ankush-Chander
Copy link
Contributor Author

Thanks @frascuchon!

I am able to replicate the bug with above steps.
Let me investigate and get back.

@Ankush-Chander
Copy link
Contributor Author

Hi @frascuchon ,
I was able to do handle the exception in case of malformed query. The error can be traced back to web server not being able to send the right status_code back to client since the StreamResponse has already started.

So I have handled the exception on the client side.

The error trace currently looks like:

Traceback (most recent call last):
  File "/home/ankushchander/workplace/open_source/rubrix-tryout/test_rubrix.py", line 11, in <module>
    rb.load(name="mock-ds", query="!!")  # Here an error will be raise since a malformed query.
  File "/home/ankushchander/workplace/open_source/rubrix/src/rubrix/client/api.py", line 576, in wrapped_func
    return func(*args, **kwargs)
  File "/home/ankushchander/workplace/open_source/rubrix/src/rubrix/client/api.py", line 625, in load
    return active_api().load(*args, **kwargs)
  File "/home/ankushchander/workplace/open_source/rubrix/src/rubrix/client/api.py", line 473, in load
    response = get_dataset_data(
  File "/home/ankushchander/workplace/open_source/rubrix/src/rubrix/client/sdk/text_classification/api.py", line 50, in data
    return build_data_response(
  File "/home/ankushchander/workplace/open_source/rubrix/src/rubrix/client/sdk/commons/api.py", line 126, in build_data_response
    raise Exception(f"Malformed query!") from None
Exception: Malformed query!

Reference:
encode/starlette#1739 (comment)

Copy link
Member

@frascuchon frascuchon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work @Ankush-Chander

I think we need to rethink it. Let me have some thoughts about it and then let's discuss the solution.

Comment on lines 123 to 124
except httpx.RemoteProtocolError as err:
raise Exception(f"Malformed query!") from None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, that's mean that all error of this kind received from the server will be considered as a malformed query, but we can have errors of different nature, right?

Maybe, we should work on an error mechanism exchange when streaming has already started. Let's me investigate and make you a design proposal, ok?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @frascuchon
Thanks for the review.

So, that's mean that all error of this kind received from the server will be considered as a malformed query, but we can have errors of different nature, right?

RemoteProtocolError comes mostly when protocol is violated by the server for eg: when it"s send the malformed HTTP. Error of different nature are most likely to be captured via explicit non 2XX error codes being sent back from server. Hence I attributed this particular exception to "Malformed query" but definitely there can be some other cases which might lead to same exception.

I suppressed the exception(raise Exception(f"Malformed query!") from None) here because the original error trace was nested, too verbose and not actionable by the user.

Maybe, we should work on an error mechanism exchange when streaming has already started. Let's me investigate and make you a design proposal, ok?

Design proposal would be very helpful. Looking forward to it.

Comment on lines 22 to 23
except httpx.ConnectError as err:
raise Exception(f"Could not connect to api_url:{client.base_url}") from None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, could we include some details about the connection error ??

@frascuchon
Copy link
Member

Hi @Ankush-Chander. I hope you're well.

I've been thinking about the solution and I will my thoughts with you. Tell me if make sense for you.

The problem is based on how the data streaming is handled in the scan_data_response function. So, we can include a try-except block to handle the error situations and send them to the client as a normal response.

try:
        for batch in grouper(
            n=chunk_size,
            iterable=stream,
        ):
            filtered_records = filter(lambda r: r is not None, batch)
            yield "\n".join(
                map(
                    lambda r: r.json(by_alias=True, exclude_none=True), filtered_records
                )
            ) + "\n"
except Exception as error:
        yield errors.exception_to_rubrix_error(err) # imported from rubrix.server.errors

If the error occurs at the beginning of streaming data, the response sent to the client will contain a standard HTTP error response (the StreamingResponse class support this mechanism). But, if the error occurs during the data streaming, clients will be responsible to handle the error sent.

We can start by including the block try-except and keep the python client as is. We should also include the changes for other tasks (see here and here), or find a way to reuse the function cross different tasks.

What do you think about it?

@Ankush-Chander
Copy link
Contributor Author

Hi @frascuchon,
It looks neat.
Let me try it on my local and get back to you.

Thanks
-Ankush

@frascuchon frascuchon linked an issue Aug 1, 2022 that may be closed by this pull request
@Ankush-Chander
Copy link
Contributor Author

Hi @frascuchon
I tried out the above option and realized that scan_data_response is expected to yield json strings of compatible records seperated by newline which are being decoded back on client side in build_data_response.

So if we yield an exception from scan_data_response then it will force us to make adjustments in build_data_response.

Here are some options I thought about(given the fact there is not much I could do in the stream_generator to propagate exception back to client):

  1. In case of exception we don"t yield anything, just return. End user doesn"t get any error(same as empty response) but never get to know about malformed query either.
try:
        for batch in grouper(
            n=chunk_size,
            iterable=stream,
        ):
            filtered_records = filter(lambda r: r is not None, batch)
            yield "\n".join(
                map(
                    lambda r: r.json(by_alias=True, exclude_none=True), filtered_records
                )
            ) + "\n"
except Exception as error:
        return
  1. We yield a custom dictionary with error info in it from scan_data_response and handle validationError here when that object is being converted into the pydantic model.
    I find it hacky. We are sending an object as if it"s a valid record which our server should not and then relying on client to handle it in a "clever" way.
try:
        for batch in grouper(
            n=chunk_size,
            iterable=stream,
        ):
            filtered_records = filter(lambda r: r is not None, batch)
            yield "\n".join(
                map(
                    lambda r: r.json(by_alias=True, exclude_none=True), filtered_records
                )
            ) + "\n"
except Exception as error:
        yield json.dumps({"error":"malformed query"})
        return

@frascuchon
Copy link
Member

Hi @Ankush-Chander

I prefer option 2 since clients will be aware that something was wrong on the server side. Then, we can adapt the client to handle those situations with error handling during response parsing:

        try:
            parsed_responses = []
            for r in response.iter_lines():
                parsed_responses.append(data_type(**json.loads(r)))
            return Response(
                status_code=response.status_code,
                content=b"",
                headers=response.headers,
                parsed=parsed_response,
            )
        except Exception as err:
            raise GenericApiError(message="Cannot process response", record=r, error=err) 

Does make sense for you?

Note: to normalize the error format sent by the server you can use the errors.exception_to_rubrix_error(err) helper

@Ankush-Chander
Copy link
Contributor Author

Hi @frascuchon
I made following changes in order to achieve step 2.

  1. add encode method to GenericRubrixServerError class in order for it to be sent from server.
  2. yield exception in case of error from server side
  3. check for record sanity on client side. Raise exception for end user in case of exception object received as record .

Please note exception raised in generator function can"t be caught normally on client side so I had to apply try catch where its being parsed and converted into standard type.

Error message for end user now looks like:

Traceback (most recent call last):
  File "/home/ankushchander/workplace/open_source/rubrix-tryout/test_rubrix.py", line 11, in <module>
    rb.load(name="mock-ds", query="!!")  # Here an error will be raise since a malformed query.
  File "/home/ankushchander/workplace/open_source/rubrix/src/rubrix/client/api.py", line 574, in wrapped_func
    return func(*args, **kwargs)
  File "/home/ankushchander/workplace/open_source/rubrix/src/rubrix/client/api.py", line 623, in load
    return active_api().load(*args, **kwargs)
  File "/home/ankushchander/workplace/open_source/rubrix/src/rubrix/client/api.py", line 471, in load
    response = get_dataset_data(
  File "/home/ankushchander/workplace/open_source/rubrix/src/rubrix/client/sdk/text_classification/api.py", line 46, in data
    return build_data_response(
  File "/home/ankushchander/workplace/open_source/rubrix/src/rubrix/client/sdk/commons/api.py", line 122, in build_data_response
    raise GenericApiError(message=parsed_record.get("message", "Cannot process response!"), error=parsed_record.get("type","ServerError")) from None
rubrix.client.sdk.commons.errors.GenericApiError: Rubrix server returned an error with http status: 500
Error details: [{'message': "RequestError(400, 'search_phase_execution_exception', 'Failed to parse query [!!]')", 'error': 'opensearchpy.exceptions.RequestError'}]

Please suggest.

@frascuchon
Copy link
Member

I see. Yes, indeed we need to handle several aspects regarding the response flow (capture the error, encode as bytes, send as the last response message).

Thinking a bit more about it, I found a solution that could simplify this flow by adding a few lines of code: Extending the StreammingResponse class to handle the errors. Here is my proposal:

from starlette.responses import JSONResponse, StreamingResponse
from starlette.types import Send

from rubrix.server.errors import APIErrorHandler


class StreamingResponseWithErrorHandling(StreamingResponse):

    async def stream_response(self, send: Send) -> None:
        try:
            return await super().stream_response(send)
        except Exception as ex:
            json_response: JSONResponse = await APIErrorHandler.common_exception_handler(send, error=ex)
            await send({"type": "http.response.body", "body": json_response.body, "more_body": False})

Here, if there is some error during the streaming, the server communication will end by sending a last message including the normalized rubrix error response body.

By having this, the only change we need to apply in the server part is to change the type of response we send to the client on a data stream.

On the other hand, the client must handle the received error as you propose. Just take into account that the rendered error has the following schema:

# This is just an example. Keep in mind the message structure
{
  "detail": {
    "code": "rubrix.api.errors::ValidationError",
    "params": {
      "extra": "error parameters"
    }
  }
}

You should readjust the client error handling that you've implemented.

Anyway, great work !!! Thanks a lot and sorry for all this back and forth in running the solution

Copy link
Member

@frascuchon frascuchon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@frascuchon frascuchon merged commit 4ab30b9 into argilla-io:master Aug 8, 2022
frascuchon pushed a commit that referenced this pull request Aug 18, 2022
frascuchon pushed a commit that referenced this pull request Aug 22, 2022
frascuchon pushed a commit that referenced this pull request Aug 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Load records] Handle server errors that occurred in rb.load
2 participants