Add exception for CSV without headers. #25

Merged
merged 4 commits into from
Aug 23, 2021

philschmid
Collaborator

What does this PR do?

This PR adds an exception when a CSV is sent without headers or with the wrong headers, since in that case the CSV is not guaranteed to be parsed correctly. See #24.

Explanation:

When using batch_transform, SageMaker sends exactly 1 row as input to the endpoint, e.g.

united thx off the response, finally got through the 45 min wait and talked to someone.

The issue with this is that csv.Sniffer would identify the , as the delimiter, split the row into two columns, and create an input for question-answering instead of text-classification. There is no way to tell whether the input has 1 column or 2 columns from a single row alone. Compare, e.g., this question-answering input:

the flight was delayed 45 minutes, How long was the flight delayed?

If someone used MultiRecord instead, the sniffer would identify the delimiter correctly, but for me these are too many constraints and ifs for it to work properly.
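To make the ambiguity concrete, here is a minimal sketch using only Python's standard csv module. It does not call the sniffer; it assumes the comma has already been chosen as the delimiter and shows that the two example rows above then look structurally identical. The helper name split_row is made up for illustration.

```python
import csv
from io import StringIO


def split_row(line):
    """Parse a single CSV line, assuming ',' is the delimiter."""
    return next(csv.reader(StringIO(line), delimiter=","))


# One text-classification row and one question-answering row from the description.
text_classification = "united thx off the response, finally got through the 45 min wait and talked to someone."
question_answering = "the flight was delayed 45 minutes, How long was the flight delayed?"

tc_cols = split_row(text_classification)
qa_cols = split_row(question_answering)

# Both rows split into two columns, so the column count alone cannot tell the tasks apart.
print(len(tc_cols), len(qa_cols))
```

With a header row present, the column names rather than the column count decide how the payload is interpreted.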

Tested with

where do i live?,My Name is Philipp and I live in Nuremberg.
What is the capital?,Berlin is the capital of Germany.

got as response

{
  "code": 400,
  "type": "InternalServerException",
  "message": "You need to provide the correct CSV with Header columns to use it with the inference toolkit default handler. : 400"
}

Also added a unit test for it.
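For readers who want to see the general idea, a header check could look roughly like the sketch below. This is not the toolkit's actual code: the function name, the error class, and the accepted header names (inputs, or question plus context) are all assumptions made for illustration.

```python
import csv
from io import StringIO

# Assumed header sets for illustration only; the PR text does not list them.
EXPECTED_HEADERS = ({"inputs"}, {"question", "context"})


class MissingHeaderError(ValueError):
    """Hypothetical error raised when the CSV has no recognised header row."""


def decode_csv_with_header(payload):
    """Return the CSV rows as dicts, or raise if the first row is not a known header."""
    reader = csv.DictReader(StringIO(payload))
    # DictReader takes the first row as field names; fieldnames is None for empty input.
    fieldnames = {name.strip().lower() for name in (reader.fieldnames or [])}
    if fieldnames not in EXPECTED_HEADERS:
        raise MissingHeaderError("CSV payload must start with a known header row")
    return list(reader)
```

Called with the headerless two-row payload shown under "Tested with", this sketch raises MissingHeaderError, because the first data row is taken as the header and matches none of the assumed header sets. Prepending a question,context line makes the same payload parse into two rows.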

@philschmid philschmid requested a review from vdantu August 16, 2021 07:13
@philschmid philschmid mentioned this pull request Aug 16, 2021
@philschmid philschmid merged commit 2743a73 into aws:main Aug 23, 2021
@philschmid philschmid deleted the add-error-for-csv-without-header branch August 23, 2021 08:49