Unable to run Amazon Augmented AI (A2I) and SageMaker Endpoint locally #5

Closed
papagala opened this issue May 19, 2020 · 7 comments

papagala commented May 19, 2020

It's a bit unclear how to run this locally, but I managed to get past the credentials-not-found errors by passing sagemaker.Session a boto3.Session created with one of my profile names (profile_name).
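
For anyone hitting the same credential errors, the workaround described above can be sketched roughly as follows. This is a minimal sketch, not the notebook's own code: the helper name `make_local_session` and the profile and region values are placeholders, and it assumes the profile exists in your local AWS config.

```python
def make_local_session(profile_name, region_name=None):
    """Build a sagemaker.Session backed by a named AWS CLI profile.

    Hypothetical helper for illustration; not part of the notebook.
    """
    # Imported inside the function so this sketch can be loaded even where
    # the AWS SDKs are not installed; both packages are needed to call it.
    import boto3
    import sagemaker

    # boto3 resolves credentials from the named profile in ~/.aws/.
    boto_sess = boto3.Session(profile_name=profile_name, region_name=region_name)
    # The SageMaker SDK creates its clients from the boto3 session it is given.
    return sagemaker.Session(boto_session=boto_sess)


# Example (profile and region are placeholders):
# sess = make_local_session("my-profile", region_name="us-east-1")
```

The resulting `sess` is what gets passed as `sagemaker_session=sess` in the predictor call below.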

However, I get this error every time I try to make a prediction. Specifically, when I run

object_detector = sagemaker.predictor.RealTimePredictor(endpoint=endpoint_name,sagemaker_session=sess)
with open(test_photos[2], 'rb') as image:
    f = image.read()
    b = bytearray(f)
results = object_detector.predict(b)

I get

---------------------------------------------------------------------------
BrokenPipeError                           Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    599                                                   body=body, headers=headers,
--> 600                                                   chunked=chunked)
    601 

~/anaconda3/lib/python3.7/site-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    353         else:
--> 354             conn.request(method, url, **httplib_request_kw)
    355 

~/anaconda3/lib/python3.7/http/client.py in request(self, method, url, body, headers, encode_chunked)
   1228         """Send a complete request to the server."""
-> 1229         self._send_request(method, url, body, headers, encode_chunked)
   1230 

~/anaconda3/lib/python3.7/site-packages/botocore/awsrequest.py in _send_request(self, method, url, body, headers, *args, **kwargs)
     91         rval = super(AWSConnection, self)._send_request(
---> 92             method, url, body, headers, *args, **kwargs)
     93         self._expect_header_set = False

~/anaconda3/lib/python3.7/http/client.py in _send_request(self, method, url, body, headers, encode_chunked)
   1274             body = _encode(body, 'body')
-> 1275         self.endheaders(body, encode_chunked=encode_chunked)
   1276 

~/anaconda3/lib/python3.7/http/client.py in endheaders(self, message_body, encode_chunked)
   1223             raise CannotSendHeader()
-> 1224         self._send_output(message_body, encode_chunked=encode_chunked)
   1225 

~/anaconda3/lib/python3.7/site-packages/botocore/awsrequest.py in _send_output(self, message_body, *args, **kwargs)
    142             # we must run the risk of Nagle.
--> 143             self.send(message_body)
    144 

~/anaconda3/lib/python3.7/site-packages/botocore/awsrequest.py in send(self, str)
    202             return
--> 203         return super(AWSConnection, self).send(str)
    204 

~/anaconda3/lib/python3.7/http/client.py in send(self, data)
    976         try:
--> 977             self.sock.sendall(data)
    978         except TypeError:

~/anaconda3/lib/python3.7/ssl.py in sendall(self, data, flags)
   1014                 while count < amount:
-> 1015                     v = self.send(byte_view[count:])
   1016                     count += v

~/anaconda3/lib/python3.7/ssl.py in send(self, data, flags)
    983                     self.__class__)
--> 984             return self._sslobj.write(data)
    985         else:

BrokenPipeError: [Errno 32] Broken pipe

During handling of the above exception, another exception occurred:

ProtocolError                             Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/botocore/httpsession.py in send(self, request)
    262                 decode_content=False,
--> 263                 chunked=self._chunked(request.headers),
    264             )

~/anaconda3/lib/python3.7/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    637             retries = retries.increment(method, url, error=e, _pool=self,
--> 638                                         _stacktrace=sys.exc_info()[2])
    639             retries.sleep()

~/anaconda3/lib/python3.7/site-packages/urllib3/util/retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
    343             # Disabled, indicate to re-raise the error.
--> 344             raise six.reraise(type(error), error, _stacktrace)
    345 

~/anaconda3/lib/python3.7/site-packages/urllib3/packages/six.py in reraise(tp, value, tb)
    684         if value.__traceback__ is not tb:
--> 685             raise value.with_traceback(tb)
    686         raise value

~/anaconda3/lib/python3.7/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    599                                                   body=body, headers=headers,
--> 600                                                   chunked=chunked)
    601 

~/anaconda3/lib/python3.7/site-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    353         else:
--> 354             conn.request(method, url, **httplib_request_kw)
    355 

~/anaconda3/lib/python3.7/http/client.py in request(self, method, url, body, headers, encode_chunked)
   1228         """Send a complete request to the server."""
-> 1229         self._send_request(method, url, body, headers, encode_chunked)
   1230 

~/anaconda3/lib/python3.7/site-packages/botocore/awsrequest.py in _send_request(self, method, url, body, headers, *args, **kwargs)
     91         rval = super(AWSConnection, self)._send_request(
---> 92             method, url, body, headers, *args, **kwargs)
     93         self._expect_header_set = False

~/anaconda3/lib/python3.7/http/client.py in _send_request(self, method, url, body, headers, encode_chunked)
   1274             body = _encode(body, 'body')
-> 1275         self.endheaders(body, encode_chunked=encode_chunked)
   1276 

~/anaconda3/lib/python3.7/http/client.py in endheaders(self, message_body, encode_chunked)
   1223             raise CannotSendHeader()
-> 1224         self._send_output(message_body, encode_chunked=encode_chunked)
   1225 

~/anaconda3/lib/python3.7/site-packages/botocore/awsrequest.py in _send_output(self, message_body, *args, **kwargs)
    142             # we must run the risk of Nagle.
--> 143             self.send(message_body)
    144 

~/anaconda3/lib/python3.7/site-packages/botocore/awsrequest.py in send(self, str)
    202             return
--> 203         return super(AWSConnection, self).send(str)
    204 

~/anaconda3/lib/python3.7/http/client.py in send(self, data)
    976         try:
--> 977             self.sock.sendall(data)
    978         except TypeError:

~/anaconda3/lib/python3.7/ssl.py in sendall(self, data, flags)
   1014                 while count < amount:
-> 1015                     v = self.send(byte_view[count:])
   1016                     count += v

~/anaconda3/lib/python3.7/ssl.py in send(self, data, flags)
    983                     self.__class__)
--> 984             return self._sslobj.write(data)
    985         else:

ProtocolError: ('Connection aborted.', BrokenPipeError(32, 'Broken pipe'))

During handling of the above exception, another exception occurred:

ConnectionClosedError                     Traceback (most recent call last)
<ipython-input-13-ea07fcaa5d16> in <module>
      3     f = image.read()
      4     b = bytearray(f)
----> 5 results = object_detector.predict(b,)

~/anaconda3/lib/python3.7/site-packages/sagemaker/predictor.py in predict(self, data, initial_args, target_model)
    108 
    109         request_args = self._create_request_args(data, initial_args, target_model)
--> 110         response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
    111         return self._handle_response(response)
    112 

~/anaconda3/lib/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    314                     "%s() only accepts keyword arguments." % py_operation_name)
    315             # The "self" in this scope is referring to the BaseClient.
--> 316             return self._make_api_call(operation_name, kwargs)
    317 
    318         _api_call.__name__ = str(py_operation_name)

~/anaconda3/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    620         else:
    621             http, parsed_response = self._make_request(
--> 622                 operation_model, request_dict, request_context)
    623 
    624         self.meta.events.emit(

~/anaconda3/lib/python3.7/site-packages/botocore/client.py in _make_request(self, operation_model, request_dict, request_context)
    639     def _make_request(self, operation_model, request_dict, request_context):
    640         try:
--> 641             return self._endpoint.make_request(operation_model, request_dict)
    642         except Exception as e:
    643             self.meta.events.emit(

~/anaconda3/lib/python3.7/site-packages/botocore/endpoint.py in make_request(self, operation_model, request_dict)
    100         logger.debug("Making request for %s with params: %s",
    101                      operation_model, request_dict)
--> 102         return self._send_request(request_dict, operation_model)
    103 
    104     def create_request(self, params, operation_model=None):

~/anaconda3/lib/python3.7/site-packages/botocore/endpoint.py in _send_request(self, request_dict, operation_model)
    135             request, operation_model, context)
    136         while self._needs_retry(attempts, operation_model, request_dict,
--> 137                                 success_response, exception):
    138             attempts += 1
    139             # If there is a stream associated with the request, we need

~/anaconda3/lib/python3.7/site-packages/botocore/endpoint.py in _needs_retry(self, attempts, operation_model, request_dict, response, caught_exception)
    254             event_name, response=response, endpoint=self,
    255             operation=operation_model, attempts=attempts,
--> 256             caught_exception=caught_exception, request_dict=request_dict)
    257         handler_response = first_non_none_response(responses)
    258         if handler_response is None:

~/anaconda3/lib/python3.7/site-packages/botocore/hooks.py in emit(self, event_name, **kwargs)
    354     def emit(self, event_name, **kwargs):
    355         aliased_event_name = self._alias_event_name(event_name)
--> 356         return self._emitter.emit(aliased_event_name, **kwargs)
    357 
    358     def emit_until_response(self, event_name, **kwargs):

~/anaconda3/lib/python3.7/site-packages/botocore/hooks.py in emit(self, event_name, **kwargs)
    226                  handlers.
    227         """
--> 228         return self._emit(event_name, kwargs)
    229 
    230     def emit_until_response(self, event_name, **kwargs):

~/anaconda3/lib/python3.7/site-packages/botocore/hooks.py in _emit(self, event_name, kwargs, stop_on_response)
    209         for handler in handlers_to_call:
    210             logger.debug('Event %s: calling handler %s', event_name, handler)
--> 211             response = handler(**kwargs)
    212             responses.append((handler, response))
    213             if stop_on_response and response is not None:

~/anaconda3/lib/python3.7/site-packages/botocore/retryhandler.py in __call__(self, attempts, response, caught_exception, **kwargs)
    181 
    182         """
--> 183         if self._checker(attempts, response, caught_exception):
    184             result = self._action(attempts=attempts)
    185             logger.debug("Retry needed, action of: %s", result)

~/anaconda3/lib/python3.7/site-packages/botocore/retryhandler.py in __call__(self, attempt_number, response, caught_exception)
    249     def __call__(self, attempt_number, response, caught_exception):
    250         should_retry = self._should_retry(attempt_number, response,
--> 251                                           caught_exception)
    252         if should_retry:
    253             if attempt_number >= self._max_attempts:

~/anaconda3/lib/python3.7/site-packages/botocore/retryhandler.py in _should_retry(self, attempt_number, response, caught_exception)
    275             # If we've exceeded the max attempts we just let the exception
    276             # propogate if one has occurred.
--> 277             return self._checker(attempt_number, response, caught_exception)
    278 
    279 

~/anaconda3/lib/python3.7/site-packages/botocore/retryhandler.py in __call__(self, attempt_number, response, caught_exception)
    315         for checker in self._checkers:
    316             checker_response = checker(attempt_number, response,
--> 317                                        caught_exception)
    318             if checker_response:
    319                 return checker_response

~/anaconda3/lib/python3.7/site-packages/botocore/retryhandler.py in __call__(self, attempt_number, response, caught_exception)
    221         elif caught_exception is not None:
    222             return self._check_caught_exception(
--> 223                 attempt_number, caught_exception)
    224         else:
    225             raise ValueError("Both response and caught_exception are None.")

~/anaconda3/lib/python3.7/site-packages/botocore/retryhandler.py in _check_caught_exception(self, attempt_number, caught_exception)
    357         # the MaxAttemptsDecorator is not interested in retrying the exception
    358         # then this exception just propogates out past the retry code.
--> 359         raise caught_exception

~/anaconda3/lib/python3.7/site-packages/botocore/endpoint.py in _do_get_response(self, request, operation_model)
    198             http_response = first_non_none_response(responses)
    199             if http_response is None:
--> 200                 http_response = self._send(request)
    201         except HTTPClientError as e:
    202             return (None, e)

~/anaconda3/lib/python3.7/site-packages/botocore/endpoint.py in _send(self, request)
    267 
    268     def _send(self, request):
--> 269         return self.http_session.send(request)
    270 
    271 

~/anaconda3/lib/python3.7/site-packages/botocore/httpsession.py in send(self, request)
    292                 error=e,
    293                 request=request,
--> 294                 endpoint_url=request.url
    295             )
    296         except Exception as e:

ConnectionClosedError: Connection was closed before we received a valid response from endpoint URL: "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/DEMO-object-detection-augmented-ai-2020-05-19-00-18-57/invocations".

If I run
results = object_detector.predict("dummy text")

I get something I can understand and expect:

---------------------------------------------------------------------------
ModelError                                Traceback (most recent call last)
<ipython-input-22-dd7ea88a4b12> in <module>
      3     f = image.read()
      4     b = bytearray(f)
----> 5 results = object_detector.predict("dummy tex")

~/anaconda3/lib/python3.7/site-packages/sagemaker/predictor.py in predict(self, data, initial_args, target_model)
    108 
    109         request_args = self._create_request_args(data, initial_args, target_model)
--> 110         response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
    111         return self._handle_response(response)
    112 

~/anaconda3/lib/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    314                     "%s() only accepts keyword arguments." % py_operation_name)
    315             # The "self" in this scope is referring to the BaseClient.
--> 316             return self._make_api_call(operation_name, kwargs)
    317 
    318         _api_call.__name__ = str(py_operation_name)

~/anaconda3/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    633             error_code = parsed_response.get("Error", {}).get("Code")
    634             error_class = self.exceptions.from_code(error_code)
--> 635             raise error_class(parsed_response, operation_name)
    636         else:
    637             return parsed_response

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from model with message "unable to evaluate payload provided". See https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/sagemaker/Endpoints/DEMO-object-detection-augmented-ai-2020-05-19-00-18-57 in account 488507749156 for more information.

Name: boto3
Version: 1.13.12

Name: sagemaker
Version: 1.58.2

Name: botocore
Version: 1.16.12

Python 3.7.3

@papagala
Author

I think I found the solution, though I still think it's a bug. The files seem to be too large: with smaller pictures it works fine.

The error message you get back says nothing about this, which is confusing.

@papagala
Author

I compressed all the images and the problem went away.
pexels-photo-276517
pexels-photo-980382
pexels-photo-1571457

@papagala
Author

OK, I discovered something else. It looks like if you rerun the notebook, the images double, then triple in size (they become n times larger, where n is the number of times you have run the notebook).

This fixes that problem:

for ind in test_photos_index:
    !rm sample-a2i-images/pexels-photo-{ind}.jpeg

Maybe with some bash magic we could check whether the file already exists and NOT redownload it, but this is fine for me.
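
The "skip if it already exists" idea could also be done in Python instead of bash. The sketch below is only an illustration under assumptions: `download_if_missing` is a made-up helper, the image URLs are whatever the notebook already uses, and `urlretrieve` stands in for the notebook's own download command. It overwrites rather than appends, so a rerun cannot grow the file.

```python
from pathlib import Path
from typing import Callable
from urllib.request import urlretrieve


def download_if_missing(
    url: str,
    dest: Path,
    fetch: Callable[[str, str], object] = urlretrieve,
) -> bool:
    """Download url to dest unless dest already exists and is non-empty.

    Returns True if a download happened, False if the existing file was kept.
    """
    dest = Path(dest)
    if dest.exists() and dest.stat().st_size > 0:
        # A previous run already fetched this image; leave it untouched.
        return False
    dest.parent.mkdir(parents=True, exist_ok=True)
    # fetch writes a fresh file at dest (it does not append to an old one).
    fetch(url, str(dest))
    return True
```

Called once per image in the download loop, this would make the cell safe to rerun without the `rm` step.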

@michaelhsieh42
Contributor

Hello @papagala, thanks for your feedback. Indeed the SageMaker error

ConnectionClosedError: Connection was closed before we received a valid response from endpoint URL: "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/DEMO-object-detection-augmented-ai-2020-05-19-00-18-57/invocations".

could stem from the image payload sent to the endpoint exceeding the 5 MB limit. The original files are around 2-3 MB. Thanks for reporting that curl appends to the existing image file and so grows its size. It's something we can address easily.

We will also clarify the description of the running environment. Thanks again for your feedback.
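
Since the ConnectionClosedError itself gives no hint about size, a local check before calling predict can surface the problem earlier. This is a rough sketch, assuming the 5 MB figure quoted above still applies; `read_payload` and `MAX_PAYLOAD_BYTES` are hypothetical names, not part of the SageMaker SDK.

```python
from pathlib import Path

# Assumption: the 5 MB limit quoted in this thread; check the current
# SageMaker documentation before relying on this number.
MAX_PAYLOAD_BYTES = 5 * 1024 * 1024


def read_payload(path, limit=MAX_PAYLOAD_BYTES) -> bytes:
    """Read an image file and fail early if it is over the payload limit."""
    data = Path(path).read_bytes()
    if len(data) > limit:
        raise ValueError(
            f"{path} is {len(data)} bytes, over the {limit}-byte limit; "
            "compress or resize it before calling predict()"
        )
    return data
```

With the predictor from the original report, this would be used along the lines of `results = object_detector.predict(read_payload(test_photos[2]))`, so an oversized file raises a readable ValueError instead of a broken pipe.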

@papagala
Author

papagala commented Jun 5, 2020

Thanks a lot!

@samuel-henry
Contributor

Closing based on @michaelhsieh42's merge.

@papagala
Author

papagala commented Aug 6, 2020

Thank you all!
