Ensure all size limits are inclusive, and are internally consistent. #74

Merged: 1 commit, Sep 22, 2017

Conversation

@rfk (Contributor) commented Sep 21, 2017

Fixes #73, adds explicit functional tests to assert https://bugzilla.mozilla.org/show_bug.cgi?id=1401707, and includes "max_request_bytes" in the configuration report even when batches are enabled.
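For the last point, the intent is that a functional test can assert something along these lines (a sketch in the style of the existing tests; the test name is illustrative):

def test_configuration_reports_max_request_bytes(self):
    # The request-size limit should be advertised via /info/configuration
    # even when the server has batch uploads enabled.
    res = self.app.get(self.root + '/info/configuration')
    self.assertTrue('max_request_bytes' in res.json)
    self.assertTrue(res.json['max_request_bytes'] > 0)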

@mhammond r? The changes here are mostly to test cases so if you could check whether they make sense from your perspective that'd be awesome.

@rfk force-pushed the inclusive-max branch 2 times, most recently from de8b836 to aa7ec1b on September 21, 2017 04:50
res = self.app.post_json(self.root + '/storage/col2', wbos)
res = res.json
self.assertEquals(len(res['success']), 4)
self.assertEquals(len(res['failed']), 1)
@rfk (author):

These tests just assert compatibility with the old sync1.1 API, which we no longer care about, so I opted to delete this test rather than fix it.

@@ -780,15 +780,20 @@ def test_get_collection_ttl(self):
self.assertEquals(len(res.json), 0)

def test_batch(self):
# This can't be run against a live server
@rfk (author):

It can now be run against a live server, if the server returns the proper config info \o/
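Roughly, the test can now discover the limits from the server itself instead of assuming compiled-in defaults (a sketch only; the helper name below is made up, but /info/configuration is the endpoint that reports the limits):

def get_configuration(self):
    # Hypothetical helper: fetch the server's advertised limits so the same
    # assertions work against a live server or the in-process test app.
    res = self.app.get(self.root + '/info/configuration')
    return res.json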

)
limits = {}
for name in LIMIT_NAMES:
limits[name] = get_limit_config(request, name)
# This limit is hard-coded for now.
limits["max_record_payload_bytes"] = MAX_PAYLOAD_SIZE
@rfk (author):

I moved this under the control of get_limit_config so that it can be adjusted via config file, which is important for the memcached tests.
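The idea being roughly the following (a sketch of the intent; whether the default is wired up exactly like this is an assumption):

# After the change, the payload limit is looked up like every other limit,
# falling back to the old constant only when the config file does not
# override it.
limits["max_record_payload_bytes"] = get_limit_config(
    request, "max_record_payload_bytes")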

@@ -303,12 +304,13 @@ def parse_multiple_bsos(request):
logger.info(logmsg, userid, collection, id, msg, bso)
continue

if count >= BATCH_MAX_COUNT:
@rfk (author):

This was accidentally an inclusive range, because enumerate was starting the count at 0.

@thomcc:

To be clear, it's still an inclusive range now; it's just more explicitly inclusive, no?

@rfk (author):

Correct, it's now more obvious that it's an inclusive range on purpose.
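To spell out the off-by-one: enumerate() counts from 0, so the item with count == BATCH_MAX_COUNT is already the (BATCH_MAX_COUNT + 1)-th item, and rejecting at >= is what caps a batch at exactly BATCH_MAX_COUNT items. A toy illustration (not the module's actual loop):

BATCH_MAX_COUNT = 3  # toy value for illustration

accepted, failed = [], []
for count, bso in enumerate(["a", "b", "c", "d", "e"]):
    if count >= BATCH_MAX_COUNT:
        failed.append(bso)  # the real code records a per-item failure
        continue
    accepted.append(bso)

assert len(accepted) == BATCH_MAX_COUNT  # exactly 3 accepted, 2 rejected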

@mhammond left a comment:

👍

self.assertEquals(max_bytes, 1024 * 1024)
bsos = [{'id': str(i), 'payload': "X" * (210 * 1024)}
for i in range(5)]
# Uploading N+1 210kB items should produce one failure.

Reviewer:

This test seems less valuable now that the max_bytes constraint isn't a constant (and had me confused for a while). IIUC we are now starting with an arbitrary item_size, then calculating how many items we believe will fit given our max_bytes constraint, and uploading one more than that limit? I'd be inclined to adjust the comment to make that clearer, as it's not clear what "N" is, or why "210kB" is relevant given the existing comment.

(Indeed, there seem to be other item_sizes which would be interesting: 1, max_bytes, max_bytes/2, max_bytes/2 ± 1 :) But yeah, I'm not sure that would actually add value...

@rfk (author):

You understand correctly; I'll adjust the comment to make it more obvious what this is about.
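Something like this for the reworded comment, making it explicit how item_size, max_bytes and the single expected failure relate (a sketch; only the comment wording is new relative to the diff below):

# Choose an arbitrary item size, compute how many such items fit under the
# server's max_post_bytes limit, then upload one more than will fit;
# exactly one item should come back in the "failed" list.
item_size = (210 * 1024)
max_items = max_bytes / item_size
bsos = [{'id': str(i), 'payload': 'X' * item_size}
        for i in range(max_items + 1)]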

# Uploading N+1 210kB items should produce one failure.
item_size = (210 * 1024)
max_items = max_bytes / item_size
self.assertTrue(max_items * item_size < max_bytes)

Reviewer:

this seems to just be checking |(max_bytes / item_size) * item_size < max_bytes| (and similarly below)

@rfk (author):

Yes, the idea was to check that max_bytes is not an exact multiple of item_size, but I don't think that's actually necessary; I'll remove it.

DEFAULT_LIMITS["max_total_records"] = 100 * DEFAULT_LIMITS["max_post_records"]
DEFAULT_LIMITS["max_total_bytes"] = 100 * DEFAULT_LIMITS["max_post_bytes"]

# In production, the request-size limit is actually controlled by nginx.

Reviewer:

I got confused in IRC about this and still am :) Is nginx's size only talking about payloads too? (Note I've no reason to believe this is wrong, I'm just confused :)

@thomcc:

I'm pretty sure nginx's size is concerned with total request size, and not payloads.

@rfk (author):

Correct, nginx will basically reject any request with content-length over its configured limit.
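For reference, the proxy-side limit is on the whole request body, set via nginx's client_max_body_size directive; anything larger is rejected with a 413 before the app ever sees it (the value below is illustrative, not the production setting):

# nginx.conf (illustrative value only)
client_max_body_size 2m;  # caps total request size, not individual BSO payloads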

@thomcc left a comment:

Looks fine. A couple nits/concerns but the code itself is fine.


res = self.app.post_json(endpoint, bsos)
self.assertEquals(res.json['failed']['toomany'], 'retry bso')

# `max_total_records` is an (inclusive) limit on the
@thomcc:

I don't think desktop will say this. Maybe "depend" is the wrong word then? (Or we need a client fix...) Same for X-Weave-Total-Bytes below.

@rfk (author):

"We can only enforce it if..." is probably a more accurate phrasing. Of course iif desktop can send this it would be helpful :-)

res = self.app.post_json(self.root + '/storage/col2', bsos)
res = res.json
self.assertEquals(len(res['success']), 4)
self.assertEquals(len(res['success']), max_items)
self.assertEquals(len(res['failed']), 1)
@thomcc:

Arguably, everything should fail inside a batch if a single record fails... (Normally the client could just not commit the batch, but that's not an option for batch=1234&commit=true where the body has records, which for desktop, and probably the other clients, it almost always will.) But that's certainly a discussion for a different time...

I guess if we respect all the config limits, there should be no failures, so it probably doesn't matter.

@rfk (author):

Yeah, I agree, and in fact I feel like it would be simpler all round if any bad item in a request just failed the whole request. But I think we kept it this way just so we didn't have to mess with existing client code, and particularly so we didn't have different behaviour between batch and non-batch cases.
