Skip to content
This repository has been archived by the owner on Jan 17, 2023. It is now read-only.

Attempt to make frontend-to-addon comms more robust #1102

Merged

Conversation

lmorchard
Copy link
Contributor

@lmorchard lmorchard commented Jul 19, 2016

  • Perform initial experiments API fetch and add-on installed experiment
    message happen in series, rather than in parallel
  • app.waitForMessage will now retry several times before finally
    timing out with a rejection (default 3 times)
  • If all else fails on app init, bounce to error page instead of blank
    white page

Fixes #1101

@lmorchard
Copy link
Contributor Author

lmorchard commented Jul 19, 2016

I still can't reliably reproduce this issue - and I have never been able to do so on local development. It happens for me like 1 out of 20 attempts on staging. My strong hunch is that this is a race condition, but I don't know where the race is happening.

So, this PR attempts to stop the number of things happening at once and retry things a few times before reporting an error. And then, on error, show the error page reload the page rather than just show a blank screen with an error in the console.

I'm a little worried that that reload might turn into an infinite loop if there's another more persistent issue here that depends on environmental context I don't have on my machine.

});
},

// Send webChannel message to addon, use a Promise to wait for the answer.
waitForMessage(type, data, timeout = 10000) {
waitForMessage(type, data, timeout = 5000, retries = 3) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, reduced timeout to 5 seconds from 10 - because the retries should actually lead to a total timeout of 15 seconds before rejection.

@lmorchard
Copy link
Contributor Author

@meandavejustice @6a68 @chuckharmston - this could use some additional eyes and sanity checking, because I'm kind of handwaving at the problem here 😿

@lmorchard
Copy link
Contributor Author

@ckprice - oh and hey, if you can try this out, it could be helpful since you reported being able to reliably reproduce this!

@chuckharmston
Copy link

If all else fails on app init, just try reloading the page.

I'm not a fan of this; it could lead to a loop of endless refreshes if the user is like @ckprice and can reproduce reliably. Maybe ship them to the error page, instead?

@chuckharmston
Copy link

Otherwise, code looks good and logic is sound. Unfortunately, like you I can't reproduce the original issue, so I'm reticent to merge anything. I'll let someone else check it out.

- Perform initial experiments API fetch and add-on installed experiment
  message happen in series, rather than in parallel

- `app.waitForMessage` will now retry several times before finally
  timing out with a rejection (default 3 times)

- If all else fails on app init, bounce to error page instead of blank
  white page

Fixes mozilla#1101
@lmorchard lmorchard force-pushed the 1101-more-robust-addon-messaging branch from 3b2a286 to 4d033c0 Compare July 19, 2016 20:37
@lmorchard
Copy link
Contributor Author

lmorchard commented Jul 19, 2016

I'm not a fan of this; it could lead to a loop of endless refreshes if the user is like @ckprice and can reproduce reliably. Maybe ship them to the error page, instead?

Yeah, the more I think about that, the less I like it. Just pushed a tweak to bounce to the error view, rather than reload. Better than a blank white page, at least.

app.pageManager = new PageManager({
pageContainer: document.querySelector('[data-hook=page-container]')
});
if (!app.router.history.started()) {
Copy link
Contributor Author

@lmorchard lmorchard Jul 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, moving page manager & router init outside of the experiments fetch means we can show the error page.

Side effect will be that the page will start to render before the API actually returns results, so typically a sub-second lag until filling in with experiments. Previously, the whole page would just stay white until the experiments came in.

@ckprice
Copy link

ckprice commented Jul 19, 2016

@ckprice - oh and hey, if you can try this out, it could be helpful since you reported being able to reliably reproduce this!

Pulled this down locally and was unable to trigger the error.

My "pretty reliable" STR's on production have become less reliable in the past couple of hours FYI. One in every 10-12 tries or so.

@lmorchard
Copy link
Contributor Author

lmorchard commented Jul 20, 2016

Unless there are objections, I may end up self-merging this tomorrow just to get it in there for next deploy.

@meandavejustice
Copy link
Contributor

@lmorchard pulling this down this morning for a quick check then I'm fine to merge for you

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants