-
Notifications
You must be signed in to change notification settings - Fork 4.2k
Restructure codejail darklaunch to better capture errors, fix globals pollution #36521
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
robrap
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noting that no test changes make me wonder if we have poor test coverage of our darklaunch features. Could that be enhanced?
4e194c8 to
fa48536
Compare
| ), | ||
| call( | ||
| "Local execution in darklaunch mode produces globals={}, " | ||
| "emsg='division by zero', exception=ZeroDivisionError('division by zero')" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this ZeroDivisionError?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...huh. That's left over from a previous iteration of the code. I'll have to figure out why that didn't cause an assertion error. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be fixed now. I pushed up a commit, but checks aren't running. I'm going to rebase onto master and squash the fixups -- maybe this will also cause actions to re-run?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(github is being weird and not showing the fix commit from before I squash it, so here's an alternative link to see the latest change: fc79748)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it safe to assume that the old assertion for ZeroDivisionError started failing after you made your test fix, or did you change everything together without ensuring that the assertion actually works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I left the bug in place long enough to check that it failed under the new structure.
It's so weird. Before making the fix, I set a breakpoint to find out what was going on, and I still don't understand. The assertion for the logs was raising an AssertionError, and it propagated up to the pytest.raises... but somehow pytest.raises was saying "yup, that's a BaseException with message matching unexpected".
Now that I'm not using pytest.raises, the assertion error propagates normally.
…exec
During dark launch of remote codejail, we want to ensure we always run both
local and remote execution -- otherwise we're missing data for the remote
side in an important situation.
This will help answer the question of whether the unexpected exception
happens on both sides, even though it may not look exactly the same due to
differences in how unexpected errors are handled.
An example input that provokes this in unsafe execution mode is
`raise BaseException("hi")`; in safe execution mode, printing to
`sys.__stdout__` should also produce an appropriate error.
- Catch all exceptions, not just Exception, to better prevent errors from
interfering with mainline responses.
- Introduce a separate try block around the monitoring code so that bugs
there don't cause issues.
- Print exception information as well for both sides (but only if not a
SafeExecException, which is redundant with emsg).
Some formatting changes to log messages as well.
Example outputs:
For `1/0`:
```
2025-04-14 17:26:34,239 INFO 10232 [xmodule.capa.safe_exec.safe_exec] [user 3] [ip 172.18.0.1] safe_exec.py:240 - Remote execution in darklaunch mode produces globals={'expect': None, 'ans': '1/0'}, emsg=None, exception=None
2025-04-14 17:26:34,239 INFO 10232 [xmodule.capa.safe_exec.safe_exec] [user 3] [ip 172.18.0.1] safe_exec.py:245 - Local execution in darklaunch mode produces globals={'expect': None, 'ans': '1/0'}, emsg='ZeroDivisionError: division by zero', exception=None
```
For `raise BaseException("hi")`:
```
2025-04-14 17:26:13,359 INFO 10232 [xmodule.capa.safe_exec.safe_exec] [user 3] [ip 172.18.0.1] safe_exec.py:240 - Remote execution in darklaunch mode produces globals={'expect': None, 'ans': 'raise BaseException("hi")'}, emsg=None, exception=None
2025-04-14 17:26:13,359 INFO 10232 [xmodule.capa.safe_exec.safe_exec] [user 3] [ip 172.18.0.1] safe_exec.py:245 - Local execution in darklaunch mode produces globals={'expect': None, 'ans': 'raise BaseException("hi")'}, emsg='hi', exception=BaseException('hi')
```
With codejail-service down, and `out = 1 + 2`:
```
2025-04-14 17:30:28,597 INFO 12484 [xmodule.capa.safe_exec.safe_exec] [user 3] [ip 172.18.0.1] safe_exec.py:241 - Remote execution in darklaunch mode produces globals={'expect': None, 'ans': 'out = 1 + 2', 'out': 3, 'cfn_return': {'input_list': [{'ok': True, 'msg': 'Output:\n3', 'grade_decimal': 1}]}}, emsg=None, exception=CodejailServiceUnavailable('Codejail API Service is unavailable. Please try again in a few minutes.')
2025-04-14 17:30:28,597 INFO 12484 [xmodule.capa.safe_exec.safe_exec] [user 3] [ip 172.18.0.1] safe_exec.py:246 - Local execution in darklaunch mode produces globals={'expect': None, 'ans': 'out = 1 + 2', 'out': 3, 'cfn_return': {'input_list': [{'ok': True, 'msg': 'Output:\n3', 'grade_decimal': 1}]}}, emsg=None, exception=None
```
- Separate test for misconfiguration - Add helper method for generic dark launch testing - Test two darklaunch scenarios: Globals interference, and error that would previously have caused the remote side not to run - Rename mocks to have our usual `mock_` prefix
We were running local exec before making the copy of globals_dict for remote_exec, so remote exec has been getting a polluted version of the globals.
fc79748 to
bf2f8c3
Compare
robrap
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One question to ensure the test does what we want now. Other than that, looks good. Thanks.
|
2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production. |
|
2U Release Notice: This PR has been deployed to the edX production environment. |
1 similar comment
|
2U Release Notice: This PR has been deployed to the edX production environment. |
Uh oh!
There was an error while loading. Please reload this page.