Skip to content

Limit STDOUT to prevent OOM events in container#81

Merged
MarkKoz merged 10 commits into
masterfrom
sebastiaan/bug/stdout-flood
Nov 20, 2020
Merged

Limit STDOUT to prevent OOM events in container#81
MarkKoz merged 10 commits into
masterfrom
sebastiaan/bug/stdout-flood

Conversation

@SebastiaanZ
Copy link
Copy Markdown
Contributor

@SebastiaanZ SebastiaanZ commented Nov 20, 2020

Recently, we discovered that for some code inputs, snekbox would get into an OOM event on the container level, seemingly bypassing the memory restrictions laid on code execution by NSJail.

After investigating the issue, we identified the culprit to be the STDOUT pipe we use to get output back from NSJail: As output is piped out of the jailed process, it will be collected outside of the NSJail in the main container process instead. This meant that our initial attempts of limiting the allowed filesize within the NSJail failed, as the OOM happened outside of the jailed environment.

To mitigate the issue, I've written a loop that consumes the STDOUT pipe in chunks of 100 characters. Once the size of the accrued output reaches a certain limit (currently set to 1 MB), we send a SIGTERM signal to NSJail to terminate itself. The output up to that point will be relayed back to the caller.

A minimal code snippet to trigger the event and the mitigation:

while True:
    print("1234567890")

I've included a test for this vulnerability in tests/test_nsjail.py.

Recently, we discovered that for some code inputs, snekbox would get
into an OOM event on the container level, seemingly bypassing the memory
restrictions laid on code execution by NSJail.

After investigating the issue, we identified the culprit to be the
STDOUT pipe we use to get output back from NSJail: As output is piped
out of the jailed process, it will be gathered outside of the NSJail in
the main container process instead. This meant that our initial attempts
of limiting the allowed filesize within the NSJail failed, as the OOM
happened outside of the jailed environment.

To mitigate the issue, I've written a loop that consumes the STDOUT pipe
in chunks of 100 characters. Once the size of the accrued output reaches
a certain limit (currently set to 1 MB), we send a SIGTERM signal to
NSJail to terminate itself. The output up to that point will be relayed
back to the caller.

A minimal code snippet to trigger the event and the mitigation:

```py
while True:
    print(" ")
```

I've included a test for this vulnerability in `tests/test_nsjail.py`.
@SebastiaanZ SebastiaanZ added type: bug Something isn't working area: backend Related to internal functionality and utilities priority: 0 - critical Needs to be addressed ASAP labels Nov 20, 2020
@SebastiaanZ SebastiaanZ requested a review from a team as a code owner November 20, 2020 13:24
@SebastiaanZ SebastiaanZ requested review from jerbob and mbaruh November 20, 2020 13:24
@coveralls
Copy link
Copy Markdown

coveralls commented Nov 20, 2020

Coverage Status

Coverage increased (+1.2%) to 93.377% when pulling f3d6b39 on sebastiaan/bug/stdout-flood into b3513df on master.

Comment thread snekbox/nsjail.py Outdated
Comment on lines +150 to +151
# Ask nsjail to terminate itself using SIGTERM
nsjail.terminate()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SIGKILL is better since it'll result in a consistent error on the front end. Furthermore, SIGKILL cannot be handled by the child process, so it's more robust in this case.

Comment thread snekbox/nsjail.py Outdated
Comment thread snekbox/nsjail.py Outdated
# We'll consume STDOUT as long as the subprocess is running
while nsjail.poll() is None:
# Read 100 characters from the STDOUT stream
chars = nsjail.stdout.read(100)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100 seems really low. I suggest something in the 10's of kilobytes. Since an overflow of that magnitude is still insignificant, it should not discard the overflowed bytes it reads.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using io.DEFAULT_BUFFER_SIZE or setting a custom bufsize for Popen.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can increase the number of characters read from the stream, although this increases the uncertainty about the size in bytes we're reading due to UTF-8 characters being up to 4 bytes long.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using io.DEFAULT_BUFFER_SIZE or setting a custom bufsize for Popen.

This is the default value for Popen due to the negative default value for bufsize (-1):

negative bufsize (the default) means the system default of io.DEFAULT_BUFFER_SIZE will be used.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this increases the uncertainty about the size in bytes we're reading due to UTF-8 characters being up to 4 bytes long.

Well, even being 40 KB over is not a big deal.

This is the default value for Popen due to the negative default value for bufsize

That was worded poorly. I mean that when you read from the stream, use read(io.DEFAULT_BUFFER_SIZE) or set the bufsize to the same number you use to read from the stream. Not sure if it really makes a difference.

Comment thread snekbox/nsjail.py Outdated
Comment on lines +139 to +155
output_size = 0
output = []

# We'll consume STDOUT as long as the subprocess is running
while nsjail.poll() is None:
# Read 100 characters from the STDOUT stream
chars = nsjail.stdout.read(100)
chars_size = sys.getsizeof(chars)

# Check if these characters take us over the output limit
if output_size + chars_size > OUTPUT_MAX:
# Ask nsjail to terminate itself using SIGTERM
nsjail.terminate()
break

output_size += chars_size
output.append(chars)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd help readability to move this to a separate function that returns the output.

@MarkKoz
Copy link
Copy Markdown
Contributor

MarkKoz commented Nov 20, 2020

Just to double check, there will always be a front-end indication of truncation since 1 MB is 250,000 - 1 million characters, but pastebin has a limit of 100,000 characters or 0.1 - 0.4 MB. Is this correct?

@SebastiaanZ
Copy link
Copy Markdown
Contributor Author

SebastiaanZ commented Nov 20, 2020

Just to double check, there will always be a front-end indication of truncation since 1 MB is 250,000 - 1 million characters, but pastebin has a limit of 100,000 characters or 0.1 - 0.4 MB. Is this correct?

That is correct. My idea was that I did not want to go lower than the pastebin limit, so I used a safe margin. Now I'm thinking that we could also count the number of characters we consume and limit that to, say, 99,900. That means we can always upload the results. We could even include a message like "\n[output too long: truncated]\n" at the bottom. What do you think?

Edit: Hmm, on second thought, that would mean that all the joke runs that generate a lot of output are uploaded to the bin as well, filling it up with large outputs that don't make sense.

@MarkKoz
Copy link
Copy Markdown
Contributor

MarkKoz commented Nov 20, 2020

I think it's fine as-is, though it should be documented that it truncates at around 1 megabyte.

The function now returns a single, joined string instead of a list of a
list of strings. That way, we don't have to join the list in two
different code branches.
Previously, the chunk of output that took us over the output limit was
dismissed. As we've already consumed it and it's not going to have a
very large size, we might as well include it in the final output we
return.
I've increased the number of characters in each chunk we read from
stdout to 10_000. This means we now read roughly 10 KB - 40 KB in each
chunk.
This new behavior matches how other limiters terminate the subprocess,
resulting in a more consistency in the front-end for the end users as
well.
Comment thread tests/test_nsjail.py Outdated
""").strip()

result = self.nsjail.python3(stdout_flood)
self.assertEqual(result.returncode, -9)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is -9 equivalent to 137? Is this some sort of two's complement thing around 127 or 1 byte?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's what you get back when you terminate a subprocess with a signal in Popen: the signal code, but negative. So, that's what we're getting here as well, a -9, because SIGKILL is 9.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then this needs to be adjusted somewhere so that the bot will understand that it's still SIGKILL, because right now I believe it will only understand 137.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could return a manual return code from the consume function:

  • If we exceed the limit, we return a manual 137
  • If we don't, we propagate the actual exit code.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something more generic to normalise it:

if result.returncode < 0:
    result.returncode = abs(result.returncode) + 128

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess - can be used instead of abs()...

Copy link
Copy Markdown
Contributor

@MarkKoz MarkKoz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. I'd just like a test for the output truncation. The test you added only tests for whether the process is killed.

When you send a signal `N` to a subprocess using Popen, it will return
`-N` as its exit code. As the rest of the code returns signal exit codes
as `128 + N`, we convert those negative exit codes into the standard
form used by the rest of the code.
I've added a test that checks if output exceeding the limit is
correctly truncated. To make the test more robust, I've defined a
constant for the read chunk size.
Comment thread snekbox/nsjail.py Outdated
# When you send signal `N` to a subprocess to terminate it using Popen, it
# will return `-N` as its exit code. As we normally get `N + 128` back, we
# convert negative exit codes to the `N + 128` form.
returncode = -nsjail.returncode + 128 if nsjail.returncode < 0 else nsjail.returncode
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why here and not in the main function? I'd rather avoid returning a tuple.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather avoid returning a tuple.

I'll change it, but what is your motivation for disliking it? Just personal preference?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to associate it with functions that are broader in scope than they should be. It could also be a sign of clumsiness, like when returning some sort of boolean in addition to data. It may indicate a function can be further refactored. It's also more difficult to document, especially when not using named tuples. Returning two ostensibly unrelated things in a tuple makes the purpose of the function less clear.

In fact, ideally this function would only be concerned with the output and not do anything to the process, but killing the process within doesn't seem avoidable. This also adds to the trouble of clearly documenting it: this function mainly returns the output, but, by the way, it may also kill the process. That may be an important distinction which wouldn't be clear at a glance.

Comment thread snekbox/nsjail.py Outdated

Once the subprocess has exited, either naturally or because it was terminated,
we return the exit code as an int and the output as a single string.
we rturn the output as a single string.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know where the suggestion button went, but there is a typo here.

Comment thread snekbox/nsjail.py Outdated
@MarkKoz MarkKoz merged commit 9ccfe94 into master Nov 20, 2020
@MarkKoz MarkKoz deleted the sebastiaan/bug/stdout-flood branch November 20, 2020 23:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: backend Related to internal functionality and utilities priority: 0 - critical Needs to be addressed ASAP type: bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants