Surveys: Monitor embedded JotForm load success rates #27764

Merged
merged 1 commit into from Apr 3, 2019

islemaster (Contributor) commented Mar 28, 2019

PLC-10: Adds a small JavaScript snippet to views where we embed JotForm. The snippet monitors whether the form loads successfully within five seconds and reports the result to New Relic.

Specifically:

  • Expects a JotFormFrameLoaded callback from the JotForm embed script to be called within five seconds of this initial script running. If called, reports a JotFormFrameLoaded event to New Relic with performance.now() page time in milliseconds. If not called, reports JotFormLoadFailed instead.

    This depends on this feature in the JotForm loader script, which I can't find referenced anywhere in the official docs but which clearly looks designed to interface with a host application. (The JotFormFrameLoaded function is not defined anywhere within the loader script itself.)

  • Independently checks reachability of a couple of JotForm domains, www.jotform.com and cdn.jotfor.ms, which I noticed while monitoring network activity on these pages. Results from these checks are reported with the New Relic event regardless of success or failure, as they may be useful for spotting patterns in load failures.

    I've documented some additional domains we might check but haven't included them because (a) they're all *.jotform.com domains and likely to be reachable if www.jotform.com is reachable, and (b) I haven't yet found small image files to use for a reachability check on these domains.
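
The callback-plus-timeout logic in the first bullet could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code merged in this PR: `watchJotFormLoad`, the `host` parameter (standing in for `window`), and the injectable `now` clock are hypothetical names chosen so the example is self-contained.

```javascript
// Hypothetical helper: resolves with the name of the event to report.
// `host` stands in for `window`; `timeoutMs` defaults to the five seconds
// described above; `now` defaults to the page time from performance.now().
function watchJotFormLoad(host, timeoutMs = 5000, now = () => performance.now()) {
  return new Promise(resolve => {
    // If the callback never fires, report a failure once the timeout elapses.
    const timeoutKey = setTimeout(() => {
      resolve({event: 'JotFormLoadFailed'});
    }, timeoutMs);

    // The JotForm loader script is expected to call this global on load.
    host.JotFormFrameLoaded = () => {
      clearTimeout(timeoutKey);
      resolve({event: 'JotFormFrameLoaded', time: now()});
    };
  });
}
```

In a browser this might be called as `watchJotFormLoad(window).then(result => console.log(result.event))`, with the logging replaced by whatever reporting call the page uses. A callback that arrives after the timeout has no effect, because a promise settles only once.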

Tested on my local server by loading http://localhost-studio.code.org:3000/pd/post_course_survey/csp with and without having blocked various domains from devtools. I have not fully verified the New Relic reporting behavior yet.

Prior work

if (timeoutKey) {
  clearTimeout(timeoutKey);
}
console.log(`JotFormFrameLoaded fired at ${time}ms`);
Contributor Author

Intentionally leaving console.log statements in, since we may be able to ask teachers to pop open their dev console in the future and this information might be helpful.

Promise.all([
  checkJotFormFrameLoaded(context),
  // Domains used by Jotform:
  //   cdn.jotfor.ms https://cdn.jotfor.ms/images/calendar.png 817B
Member

817B is the file size?

Contributor Author

Yep! Just wanted to document that we found a small file to check on the target domain.
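
For readers unfamiliar with the small-file approach, a reachability check of this kind can be sketched as loading the image and watching for `onload` or `onerror`. This is an assumption-laden sketch rather than the helper used in this PR: `checkDomainReachable` and the injectable `createImage` factory are hypothetical names, and the five-second default simply mirrors the timeout discussed in this thread.

```javascript
// Hypothetical helper: resolves true if the small image loads, false if it
// errors or does not load within `timeoutMs`.
// `createImage` defaults to the browser's Image constructor and is injectable
// so the logic can be exercised outside a browser.
function checkDomainReachable(imageUrl, timeoutMs = 5000, createImage = () => new Image()) {
  return new Promise(resolve => {
    const img = createImage();
    const timeoutKey = setTimeout(() => resolve(false), timeoutMs);

    img.onload = () => {
      clearTimeout(timeoutKey);
      resolve(true);
    };
    img.onerror = () => {
      clearTimeout(timeoutKey);
      resolve(false);
    };

    // Setting src starts the request; a tiny file keeps the check cheap.
    img.src = imageUrl;
  });
}
```

For example, `checkDomainReachable('https://cdn.jotfor.ms/images/calendar.png')` would resolve to a boolean that can be attached to the reported event.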

const timeoutKey = setTimeout(function() {
  console.log(`JotForm failed to load in 5s`);
  resolve(false);
}, 5000);
Member

I wonder if it might take more than 5 seconds on some slower computers and connections...

Contributor Author

Maybe. What would you recommend? We could turn this up, start gathering timings, and then tune it back down once we have a rough sense of our bell curve.

The risk of waiting too long to report is that the teacher gets impatient and closes/reloads the page before we've reported the failure to our servers. I suppose I could attach an onbeforeunload event to try and catch that, but no guarantees it'll get the request off in time.
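
To make the unload idea concrete, here is a rough sketch of what such a handler might look like. It is not part of this PR: `reportIfUnloadedEarly`, the `JotFormLoadAbandoned` event name, and the `send` transport are all hypothetical, and `target` stands in for `window`.

```javascript
// Hypothetical sketch: if the page unloads before the load check has settled,
// send one last event. Returns a function to call once success or failure
// has been reported normally, which disarms the handler.
function reportIfUnloadedEarly(target, send) {
  let settled = false;

  const onUnload = () => {
    if (!settled) {
      send({event: 'JotFormLoadAbandoned'});
    }
  };
  target.addEventListener('beforeunload', onUnload);

  return () => {
    settled = true;
    target.removeEventListener('beforeunload', onUnload);
  };
}
```

In a browser, `send` could wrap `navigator.sendBeacon` with a reporting endpoint of our own (an assumption, no such endpoint is described here). Even then, delivery during unload is best effort.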

We could also set up a 'heartbeat' and record every second since page load that the form hasn't successfully loaded. You'd get a sort of histogram, but I worry it gets complicated to separate actual failures from slow loads - maybe if we can capture sessions that don't include a load success event, that would do it?
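
A heartbeat along those lines might look roughly like the sketch below. The names are hypothetical (`startLoadHeartbeat`, the `JotFormStillLoading` event, and the `record` callback are not from this PR), and it only illustrates the mechanism, not how the resulting events would be analyzed.

```javascript
// Hypothetical sketch: record an event every `intervalMs` while the form has
// not yet loaded. Call the returned function when the form loads to stop.
function startLoadHeartbeat(record, intervalMs = 1000) {
  let ticks = 0;
  const intervalKey = setInterval(() => {
    ticks += 1;
    record({event: 'JotFormStillLoading', elapsedMs: ticks * intervalMs});
  }, intervalMs);

  return () => clearInterval(intervalKey);
}
```

The stop function would be called from the JotFormFrameLoaded callback, so a session with heartbeats but no load event would indicate either a failure or a page closed mid-load.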

Member

Agreed it's better to not complicate.

My gut says somewhere between 7 and 10 seconds, but I have absolutely no data for that :)

Oh, I apparently chose 10 seconds for video reachability: https://github.com/code-dot-org/code-dot-org/pull/21885/files#diff-5180bfba62d8bd9766dc4a7ea63a943eR22

But the default for our reachability checks is 5 seconds: https://github.com/code-dot-org/code-dot-org/pull/21885/files#diff-96896ecf6bb59632fb9eba07868c0c08R3

In the case of this change, I guess we're not changing user functionality so much as gathering data on problems, and so choosing the lower timing - 5 seconds - will actually gather us more data about issues, so 5 seconds actually sounds good.
