Skip to content

Conversation

@bveeramani
Copy link
Contributor

@bveeramani bveeramani commented Jul 28, 2020

Issue #, if available:

Description of changes:

  • Add support for Accept headers with multiple MIME types

Testing done:

  • Added a test to test_utils, and updated a test in test_default_inference_handler

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

  • I have read the CONTRIBUTING doc
  • I used the commit message format described in CONTRIBUTING
  • I have used the regional endpoint when creating S3 and/or STS clients (if appropriate)
  • I have updated any necessary documentation, including READMEs

Tests

  • I have added tests that prove my fix is effective or that my feature works (if appropriate)
  • I have checked that my tests are not configured for a specific region or account (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Comment on lines 88 to 91
for content_type in utils.parse_accept(accept):
if content_type in encoder.SUPPORTED_CONTENT_TYPES:
return encoder.encode(prediction, content_type), content_type
return encoder.encode(prediction, content_types.JSON), content_types.JSON
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented this in default_inference_handler, but I could've just as easily put it in transformer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, could use recommendations on how to more robustly test this behavior

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented this in default_inference_handler, but I could've just as easily put it in transformer.

I think it makes more sense here - that way, users can still override it if they want to for some reason

Comment on lines 27 to 35
@pytest.mark.parametrize(
"accept, expected_content_type",
[
("text/csv", "text/csv"),
("text/csv, application/json", "text/csv"),
("unsupported/type, text/csv", "text/csv"),
("unsupported/type", "application/json"),
],
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I should split this up into separate functions. Benefit would be that I could use descriptive test names describing the intent of the test, but adding more functions might be unnecessarily verbose.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what split were you considering?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate test for each pair of accept and expected_content_type arguments.

From https://docs.microsoft.com/en-us/dotnet/core/testing/unit-testing-best-practices, the scenario under test and the expected behavior should be obvious from the name. But with a parameterized test, we're testing multiple scenarios. Not sure how applicable these practices are for Python, though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what you have here is fine

@bveeramani bveeramani changed the title [WIP] feature: Support multiple accept types feature: Support multiple accept types Jul 28, 2020
@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-inference-toolkit-pr
  • Commit ID: 310e9cb
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

for content_type in utils.parse_accept(accept):
if content_type in encoder.SUPPORTED_CONTENT_TYPES:
return encoder.encode(prediction, content_type), content_type
return encoder.encode(prediction, content_types.JSON), content_types.JSON
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, not sure if content_types.JSON should be a named constant like DEFAULT_CONTENT_TYPE. If so, which module would I put it in?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd make more sense to raise an error with having an unsupported content type - presumably if we're at this point in the code, JSON wasn't one of the supported content types

basically the same logic in encode: https://github.com/aws/sagemaker-inference-toolkit/blob/master/src/sagemaker_inference/encoder.py#L107

(list): A list containing the MIME types that the client is able to
understand.
"""
return accept.split(", ")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible it might be comma-delimited without spaces?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find anything about that in the specification https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html, but in all of the examples they use commas with spaces. Also, we would be encoding the Accept header with the Python SDK for most use cases (I think), in which case we would use spaces.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went and looked at a bunch of Stack Overflow posts and Mozilla's documentation, and it does seem like most people use a space. However, https://developer.mozilla.org/en-US/docs/Glossary/quality_values (linked from this page) seems to imply that the space isn't always needed 🤷‍♀️

Comment on lines 88 to 91
for content_type in utils.parse_accept(accept):
if content_type in encoder.SUPPORTED_CONTENT_TYPES:
return encoder.encode(prediction, content_type), content_type
return encoder.encode(prediction, content_types.JSON), content_types.JSON
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented this in default_inference_handler, but I could've just as easily put it in transformer.

I think it makes more sense here - that way, users can still override it if they want to for some reason

for content_type in utils.parse_accept(accept):
if content_type in encoder.SUPPORTED_CONTENT_TYPES:
return encoder.encode(prediction, content_type), content_type
return encoder.encode(prediction, content_types.JSON), content_types.JSON
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd make more sense to raise an error with having an unsupported content type - presumably if we're at this point in the code, JSON wasn't one of the supported content types

basically the same logic in encode: https://github.com/aws/sagemaker-inference-toolkit/blob/master/src/sagemaker_inference/encoder.py#L107

Comment on lines 27 to 35
@pytest.mark.parametrize(
"accept, expected_content_type",
[
("text/csv", "text/csv"),
("text/csv, application/json", "text/csv"),
("unsupported/type, text/csv", "text/csv"),
("unsupported/type", "application/json"),
],
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what split were you considering?

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-inference-toolkit-pr
  • Commit ID: ad7e583
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

(list): A list containing the MIME types that the client is able to
understand.
"""
return accept.replace(" ", "").split(",")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(optional) I have no opinion on which is more optimal/Pythonic, but figured I'd throw it out there:

[s.strip() for s in accept.split(",")]

@bveeramani bveeramani merged commit b205d06 into aws:master Jul 29, 2020
@bveeramani bveeramani deleted the support-multiple-accept branch July 29, 2020 17:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants