Hotfix: LTI grade passback #1464
when $lti_check_prior is enabled. This bug was causing grade passback to fail when $lti_check_prior was not enabled after the first call until the LMS expired the nonce which was being reused.
Note: In the second commit I am increasing the threshold which determines when a time difference between the WeBWorK server and the LMS is reported, as there are reports of slow Wifi connections leading to differences above 5 seconds but under 10 seconds. This is intended to reduce false alarms about the time-difference issue. |
Thanks @taniwallach I will apply this to the Runestone server and ask the two cases I know of to see if things change. |
@Alex-Jordan - Thanks. Lots of credit to Larry Riddle whose debug data was what got me to see what is going wrong. |
At least one time, there was a report of an 11 second difference between a WW clock and an LMS clock. As far as I can tell, it is a harmless difference. The WW clock is synced to NIST. I can't say what's going on with the LMS clock, but this was from some company that manages Canvas installations for multiple schools. It's only one data point, but is it enough to suggest raising that tolerance to something like 15 seconds? |
…he LMS from 5 seconds to 15 seconds (f835b75 to 969f3d6)
Done. At https://webwork.maa.org/moodle/mod/forum/discuss.php?d=4770 Larry Riddle reported that "Canvas only allows for 1 minute ahead and 5 minutes behind." so it certainly suffices to try to detect that an issue may occur when the difference is somewhat larger, but not too close to say 50 seconds (so reports can be made of the warning before the drift passes the threshold for causing real trouble). |
I have one report from a school using Runestone hosting and Canvas as the LMS that this has fixed grade passback for them.
With your understanding of all of this, would you expect a grade passback issue with LMSs other than Canvas? My own school uses D2L Brightspace. I just tested grade passback without this patch, and it succeeds. I am guessing that Canvas's nonce handling is just more thorough/secure?
Canvas's LTI is certainly quite strict about enforcing the non-reuse of the nonce. That seems to be a reason why many of the reports of "duplicate nonce" problems were from Canvas users. Larry Riddle also reported that the fix works for him with the Canvas LMS: https://webwork.maa.org/moodle/mod/forum/discuss.php?d=5628#p16376

I think these 2 reports justify merging this hotfix as soon as reasonably possible (and after some people look at the code change and agree to the fix).

@Alex-Jordan Your report about D2L Brightspace working without the patch seems to indicate that it is not enforcing the non-reuse of nonces. Just to be certain, you did test the pre-hotfix code when

The thread at https://webwork.maa.org/moodle/mod/forum/discuss.php?d=5002 refers to testing a version of

About other LMSs: It depends on if and how well the LMS enforces non-reuse of nonces when processing grade submissions from LTI. It is possible that Moodle, at least in my institution, may also not really enforce the non-reuse of the nonces properly, which could explain having failed to detect the bug during testing. Another possibility is that the testing of the "final version" simply was not thorough enough. I do recall testing the code from #1177, but it is long enough ago that I cannot recall the details of the tests I did. The

I just checked on an additional server I run on which grade-passback is used quite heavily. It is still on WW 2.15 and has a somewhat older version of

My main server had the newer code since the middle of January 2021, and there were no reports of problems. However, I looked now and found that there were no Spring semester courses using grade-passback, and the change might have been made once the Winter semester courses using grade-passback were essentially over.
Yes, assuming that is something I can set to 0 at the very end of course.conf and it would still take effect. |
Yes - setting it in |
Tim Payer also confirmed that this fix solves problems with Canvas: https://webwork.maa.org/moodle/mod/forum/discuss.php?d=5628#p16383 I think we should try to merge this soon, as the bug causes real problems for grade-passback for Canvas. |
There is something not right with how this is implemented. I clearly didn't review this very carefully before. The TIME_DIFF_THRESHOLD constant should not exist. There is a $NonceLifeTime variable defined in authen_LTI.conf that should be used instead. This is used for the actual verification (in the |
The purpose of the At https://webwork.maa.org/moodle/mod/forum/discuss.php?d=4770 Larry Riddle reported that "Canvas only allows for 1 minute ahead and 5 minutes behind" which was a cause of major problems with grade passback. The shorter Whether this test should be where it is (near where |
The thing is that that is not what it is actually doing. All the code that utilizes the TIME_DIFF_THRESHOLD is doing is checking the time difference in the same way already done with the NonceLifeTime variable, and then giving a warning if the time difference is more. Since the hard coded constant is 15 seconds and the default NonceLifeTime is 60 seconds this results in a warning being displayed when authentication succeeds. This is incorrect. Furthermore, this is not code that is really even related to grade passback, so it is not at all doing what you claim it is doing. |
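The mismatch described here can be made concrete with a small sketch. Python is used for illustration only (the actual code is Perl); the function names and the use of an absolute difference with inclusive comparison are assumptions, and the two numbers are the ones quoted in this thread.

```python
# Values quoted in this thread; everything else here is a hypothetical model.
TIME_DIFF_THRESHOLD = 15  # hard-coded warning threshold, in seconds
NONCE_LIFETIME = 60       # default $NonceLifeTime, in seconds


def classify(time_diff_seconds):
    """Return (authenticated, warned) for a given clock/timestamp difference.

    Assumption: both checks compare the absolute difference to a limit.
    """
    diff = abs(time_diff_seconds)
    authenticated = diff <= NONCE_LIFETIME
    warned = diff > TIME_DIFF_THRESHOLD
    return authenticated, warned


def spurious_warning(time_diff_seconds):
    """True when a warning is shown even though authentication succeeded."""
    authenticated, warned = classify(time_diff_seconds)
    return authenticated and warned
```

Under these assumptions, any difference between 15 and 60 seconds authenticates successfully and still triggers the warning, which is the case being objected to.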
The main code change in the PR fixes a serious bug in grade passback (which is my fault, having relocated an important line of code from an earlier version improperly). Getting that bug fix in, with or without adjusting if and how

The change to

About the suitability of testing for the clock difference in a manner similar to what is being done now using
The purpose of the test using

Were we to change the triggering threshold to

If we decide to change this, I can think of several approaches:
Lots of other options/combinations are possible. I was curious to what extent there are forum posts about "Nonce Expired. Your NonceLifeTime may be too short"):
What you are trying to do is provide diagnostic information for a server that is not correctly configured. That may be the WeBWorK server or the Canvas server. The thing is that that is not WeBWorK's job. That information may be useful, and certainly is good to be provided if asked for, but absolutely should not be displayed in any case where it is not actually an issue, as is the case of a successful LTI authentication. This diagnostic information should only be displayed in the case that |
I concede. The last commit drops the use of |
I am sorry to have been so stubborn about this, but displaying this warning when it is not an issue will make users think there is an issue. @Alex-Jordan and I both saw this warning when we had slightly slow internet connections. In that case the 15 second time diff would have been enough to prevent the warning, but 15 seconds will still not be enough in many cases. Many students have even slower connections and there are other things that can increase this time.
For instance, the way that Canvas works when you have an external tool set to open in a new window is that it presents a submit button to click on to open the external tool. The oauth timestamp (and all other necessary oauth parameters) are in hidden inputs in the form that contains the submit button, and are generated at the time that the Canvas page is loaded. Canvas sets a timer and expires the button and form after a certain amount of time (this actually occurs after 2 minutes and 30 seconds) and removes it from the page, replacing it with a note to the user to reload the page to get a new button. So the user may wait and click the button after 30 seconds or so (or even up to 2 minutes and 30 seconds later). In that case the time difference that WeBWorK will see will definitely be more than 15 seconds. This obviously has nothing to do with network lag, and even if the now removed TIME_DIFF_THRESHOLD were set to 15 seconds, the warning would still be displayed.
In this case the time differential can be a combination of network latency during the initial Canvas page load, the time the student takes to click on the button, and network latency during the WeBWorK page load. This can easily exceed 15 seconds.
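A rough worked example of that sum, as a Python sketch. The latency and wait figures passed in are made-up illustrative values and the function name is hypothetical; only the 15 second threshold and the 2 minute 30 second button lifetime come from the discussion above.

```python
CANVAS_BUTTON_LIFETIME = 150  # seconds (2 min 30 s), as described above
OLD_WARNING_THRESHOLD = 15    # seconds, the removed TIME_DIFF_THRESHOLD value


def observed_time_diff(canvas_load_latency, wait_before_click, webwork_load_latency):
    """Seconds between the oauth timestamp being generated and WeBWorK seeing it.

    Simplified model: the three contributions just add up, and the wait is
    bounded by the lifetime of the Canvas launch button.
    """
    if wait_before_click > CANVAS_BUTTON_LIFETIME:
        raise ValueError("Canvas would already have expired the launch button")
    return canvas_load_latency + wait_before_click + webwork_load_latency


# Hypothetical student: 2 s Canvas load, 30 s pause before clicking, 1 s WeBWorK load.
diff = observed_time_diff(2, 30, 1)
print(diff, diff > OLD_WARNING_THRESHOLD)
```

With these sample figures the difference is dominated by the pause before clicking, not by clock drift or network lag.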
@drgrice1 Thanks for the additional explanation. I was too stubborn to concede earlier - sorry. The modality of operation of Canvas for LTI in an external window is something of which I was not aware, and will certainly cause time differences which do not really relate to server clock differences. Now that this has 2 approvals, I think it would be good to merge in the bug fix. It might make sense to leave all 3 commits so it is clear that the changes about
Unless there are objections, I will merge this tomorrow. |
I have not done actual testing on the current version of this. Has anything meaningful changed with respect to addressing the grade passback bug? Or was the only thing that changed since I did testing the part about reporting the clock difference when it surpassed a threshold? |
The only change was about reporting the clock difference. That is now only done if debugging is enabled. |
Bug: When $lti_check_prior is false, then the generation of the time-dependent portion of the nonce is not occurring, as it was incorrectly moved inside the if ( $lti_check_prior ) block in the version in #1177. (My fault.)

This bug is causing grade passback to fail when $lti_check_prior is not enabled after the first grade-passback call until the LMS expires the nonce which is constantly being reused.

Reports:

Fix: Move the generation of $uuid_p2 back where it belongs (outside the if block).

History: The code has once done it in the correct location (see: https://webwork.maa.org/moodle/mod/forum/discuss.php?d=4906#p14846 and https://webwork.maa.org/moodle/mod/forum/discuss.php?d=4770#p14184).

For some reason when the $lti_check_prior control switch was added, the code ended up inside the if block. See for example the file posted in https://webwork.maa.org/moodle/mod/forum/discuss.php?d=4770#p15256 .

Apparently that version was only tested sufficiently when $lti_check_prior=1 was set.
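To illustrate the bug and the fix, here is a minimal Python model (not the actual WeBWorK Perl code). Only the names lti_check_prior and uuid_p2 mirror the PR; the helper functions, the nonce layout, the counter standing in for the time-dependent value, and the StrictLMS class are all hypothetical, chosen so that the effect of a reused nonce is easy to see.

```python
import itertools

# Hypothetical stand-in for a time-dependent value; a counter keeps the demo deterministic.
_ticks = itertools.count(1)


def time_part():
    return f"{next(_ticks):08d}"


def nonce_buggy(fixed_part, lti_check_prior):
    # Modeled bug: uuid_p2 is only generated inside the if block.
    uuid_p2 = ""
    if lti_check_prior:
        uuid_p2 = time_part()
        # ... the prior-grade check would happen here ...
    return f"{fixed_part}__{uuid_p2}"


def nonce_fixed(fixed_part, lti_check_prior):
    # Modeled fix: uuid_p2 is generated unconditionally, outside the if block.
    uuid_p2 = time_part()
    if lti_check_prior:
        pass  # ... the prior-grade check would happen here ...
    return f"{fixed_part}__{uuid_p2}"


class StrictLMS:
    """Toy LMS that rejects any nonce it has already seen."""

    def __init__(self):
        self.seen = set()

    def submit(self, nonce):
        if nonce in self.seen:
            return False
        self.seen.add(nonce)
        return True


def passback_results(make_nonce, lti_check_prior, calls=3):
    lms = StrictLMS()
    return [lms.submit(make_nonce("course-user", lti_check_prior)) for _ in range(calls)]


print(passback_results(nonce_buggy, False))  # first call accepted, later ones rejected
print(passback_results(nonce_fixed, False))  # every call accepted
```

In this model an LMS that does not track nonces would accept every call from either version, which is consistent with the bug going unnoticed on less strict LMSs.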