Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate external mapping size #291

Merged
merged 6 commits into from
Mar 8, 2022
Merged

Conversation

matvey-mtn
Copy link
Contributor

  • add checks for external mapping size
  • ignore empty lines

Copy link
Collaborator

@DanMelman DanMelman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!
Small comments inside


private void ensureContentLengthIsBelowTheLimit(HttpURLConnection conn) {
String contentLength = conn.getHeaderField(HttpHeaders.CONTENT_LENGTH);
if(contentLength == null) return;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be null while the file is still valid and big?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the "Content-Length" header is not required by HTTP protocol.
I cannot assume it's always present.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if the url is not configured to return this header, this validation is skipped.
So if you have a file with "only" 50,000 lines, but each line is more than 1MB, we will pass EXTERNAL_MAPPING_MAX_BYTES

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, but there is nothing we can do to avoid this.

for s3 objects Content-Length is always present
for other cases we can instruct users to add Content-Lenght header to protect their systems

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are two things we can do:

  1. do not allow urls without that header (makes things more complex for users, but protects our system better)
  2. add some validation while iterating the file - like we count the lines, we can roughly count the bytes/chars per line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added fallback for counting length of the input if Content-Lenght is absent

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't do it as a fallback, I would just remove the header-based validation.
Remember that anyone can manually tweak that header and send you fake length value.
Also, this will keep things simpler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uh, this will require adding ResponseTransformer to the tests. Hold on.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you anyway needed to test the fallback, no?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try finding a way to override the constants in the test

Copy link
Collaborator

@DanMelman DanMelman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking very good!

@matvey-mtn matvey-mtn merged commit 8ca8341 into master Mar 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants