Cordova: Hot code push failing due to main javascript file not found #8063

Closed
matdurand opened this issue Nov 16, 2016 · 16 comments

Comments

@matdurand

Hi. Using Meteor 1.4.1.2 on Galaxy, we have intermittent errors with hot code push on mobile devices, on both Android and iOS. The main symptom is the hot code push reverting after the webapp timeout expires, with the new version being blacklisted. This happens because the minified JS file from my Meteor build is not found, so I get the following error:

11-16 10:55:06.177 19002 19002 I chromium: [INFO:CONSOLE(1)] "Uncaught SyntaxError: Unexpected token <", source: http://

We get this error because the requested file http://localhost:12648/4c01dfeef82fca6ece82954ec616be43d34cfba3.js is not found, so the WebAppLocalServer returns index.html as a fallback. The first character of index.html is <, which is not valid at the start of a script.
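To illustrate why an index.html fallback produces this particular error, here is a minimal sketch. It assumes a hypothetical in-memory lookup (`serve`) and made-up file names; it is not the plugin's actual code, only a model of "unknown path returns index.html".

```javascript
// Hypothetical in-memory stand-in for the local server's lookup.
// Not the plugin's actual code; file names are made up.
function serve(bundle, path) {
  // Known asset: return it as-is.
  if (Object.prototype.hasOwnProperty.call(bundle, path)) {
    return bundle[path];
  }
  // Unknown path: fall back to index.html, as described above.
  return bundle['/index.html'];
}

const newBundle = {
  '/index.html': '<!DOCTYPE html><html><head></head><body></body></html>',
  '/new-hash.js': 'var loaded = true;',
};

// Compile (but do not run) a source string to see whether it is valid script.
function compilesAsScript(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// A stale script URL that only existed in the previous bundle.
const staleResponse = serve(newBundle, '/old-hash.js');

console.log(staleResponse[0]);                                    // "<"
console.log(compilesAsScript(staleResponse));                     // HTML is not valid script
console.log(compilesAsScript(serve(newBundle, '/new-hash.js')));  // the real asset is
```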

This is just a guess, but here is the sequence of events I think is happening when the error happens:

  1. We have an app that was at version 1 and received a hot code push to version 2 in the past, so it's now running version 2.
  2. A hot code push (version 3) becomes available and is downloaded on the device.
  3. The Meteor reload package triggers a window reload to engage the new version.
  4. At the same time:
     a. The onReset callback is called.
     b. The index.html is requested by the webview.
  5. The index.html from version 2 is returned. This version references skkj4h23kj.js.
  6. The onReset callback completes and the new AssetBundle is now version 3.
  7. The webview requests skkj4h23kj.js, but it's not found because it's in version 2, so index.html is returned.
  8. We get the error above.
  9. After the 20-second timeout, the version is blacklisted.

My guess is that there is a concurrency issue between step 4a and 4b. When 4b happens first, all is good. When 4a happens first, we get the error. I may be wrong, but I'm guessing that onReset is triggered and the webview doesn't block; it proceeds to request the URL and load the HTML.
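The suspected ordering problem can be modelled deterministically. The sketch below is only an illustration of the hypothesis: `LocalServerModel`, `switchToPending`, and the file names are hypothetical, and the real plugin is native code. It compares a reload where the bundle switch lands before index.html is served with one where it lands afterwards.

```javascript
// Hypothetical model of the local server; names are illustrative only.
class LocalServerModel {
  constructor(current, pending) {
    this.current = current;
    this.pending = pending;
  }

  // Stands in for the bundle switch that the issue says happens in onReset.
  switchToPending() {
    if (this.pending) {
      this.current = this.pending;
      this.pending = null;
    }
  }

  // Serve a file from the current bundle, falling back to index.html.
  request(path) {
    return this.current.files[path] ?? this.current.files['/index.html'];
  }
}

const v2 = { files: { '/index.html': '<script src="/v2.js">', '/v2.js': 'v2 code' } };
const v3 = { files: { '/index.html': '<script src="/v3.js">', '/v3.js': 'v3 code' } };

// Load the page: fetch index.html, then the script it references.
// `switchAt` decides whether the switch lands before or after index.html.
function reload(server, switchAt) {
  if (switchAt === 'before-index') server.switchToPending();
  const html = server.request('/index.html');
  if (switchAt === 'after-index') server.switchToPending();
  const scriptPath = html.match(/src="([^"]+)"/)[1];
  return server.request(scriptPath);
}

const good = reload(new LocalServerModel(v2, v3), 'before-index');
const bad = reload(new LocalServerModel(v2, v3), 'after-index');

console.log(good); // the version 3 script
console.log(bad);  // HTML served in place of the version 2 script
```

In this model the failure only appears when the old index.html is served and the bundle is switched before its script is requested, which matches the sequence guessed at above.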

I tried with both the default Android webview and Crosswalk, and got the same result. The only time this doesn't happen is for the first hot code push after a fresh app install, but in this particular case, the hot code push also fails because the old minified JS is loaded (and found, since it's part of the parent AssetBundle). The new code is downloaded on the device, but not served, and the old code runs again. Then the next time the app starts, the new bundle is loaded correctly.

To reproduce the issue, start the project in Android Studio, and add a breakpoint on the first line of the onReset callback in WebAppLocalServer. I also added logs to display when the index.html file is returned. With the breakpoint in place, execution stops there, but you'll see that the index.html file is requested anyway, before the AssetBundles are switched in onReset.

We had similar behavior on iOS, but I haven't debugged the issue in Xcode.

@matdurand
Author

A possible solution would be for the Meteor reload package to call WebAppLocalServer via an API to perform the current/pending AssetBundle switch before triggering a window reload, instead of doing the switch in the onReset code.

@martijnwalraven
Contributor

@matdurand: Thanks for the investigation and in-depth analysis! If the web view indeed doesn't wait for the onReset callback to complete, that would definitely lead to a concurrency issue. The behavior might even have changed between Cordova versions, because I haven't seen this happen before.

I think the solution you propose makes sense. Could you open an issue on cordova-plugin-meteor-webapp? Or even better of course, a PR :)

@matdurand
Author

I'll be testing a potential fix in the coming days, and if it works, I will create a PR. Thanks.

@abernix
Contributor

abernix commented Dec 16, 2016

@matdurand Any updates on this? Sorry I don't have more help to provide on this matter, but do you mind opening this (or the PR, if you've already got it sorted!) over in the https://github.com/meteor/cordova-plugin-meteor-webapp repo? I think this is definitely worth keeping track of, but I want to make sure it doesn't get lost here in the larger Meteor repo. 😄

@matdurand
Author

The fix we made helped a bit, as we seem to have the problem less often, but we are still seeing gray screens, indicating a failure to load the right JavaScript file. Since it doesn't happen very often, we haven't found more clues as to what might be happening. I'll report back here once I have new information to help fix this issue.

@rjakobsson
Contributor

@matdurand what fix did you test?

@matdurand
Author

What we did is add a new callback in the Meteor webapp package. It is called from the JavaScript just before the window reload that loads the new hot code push, to tell the native code that it's now time to point to the new pending version folder from which the files are served. We did that because we had issues with some old files being served along with the new ones. This was obviously a race condition, since it didn't happen all the time.

The fix seems to help a bit, but we are still seeing gray screens, mostly on iOS. There is probably a second issue here.

@levinse

levinse commented Jan 12, 2017

We just did our 1.2 to 1.4 upgrade and have similar issues. We've done extensive testing (over 350 HCPs by over 10 people on over 20 Android and iOS devices).

Variations include:
a. The HTML-instead-of-JS + blacklisting issue on Android
b. Calling Reload._reload() sometimes fixes the problem, but that seems to be the case where the new version was downloaded, the web app startup timeout did not fire, and the version was not blacklisted.
c. iOS sometimes has an infinite splash and it's unclear if it's these same HCP issues or the wkwebview issues that have been around for a while

I'm attaching one set of test results in the hopes it helps resolve this. We've played around with the autoupdate package and the meteor webapp Cordova plugin with no improvement in the end.

This is a major issue at high release velocity on a large user base. For example, at 10K or 100K users with 1-2 HCPs per day, if 3% don't receive an HCP twice in a row, or end up on a stuck splash or two HCP versions back, that's not good, whereas on 1K users with one HCP every other day, you can't even tell there's an issue.

Overall, we're only getting HCP perfectly 91% of the time, one missed HCP about 3.3% of the time (a mix of reload-able and timed-out/blacklisted cases, but all work on the next HCP, which is fine), and the rest are a mix of the variants of the issues.
[Screenshot: test results, 2017-01-12]
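To put the percentages above in absolute terms, here is a small worked calculation. The 3% rate and the user counts are the hypothetical figures from the comment; the 91% and 3.3% figures are the reported test results. The function name is made up for illustration.

```javascript
// Number of users hit by a failure at a given rate, rounded to whole users.
function usersAffected(totalUsers, failureRate) {
  return Math.round(totalUsers * failureRate);
}

// Hypothetical 3% failure rate from the comment above.
const hypotheticalRate = 0.03;
const at1K = usersAffected(1000, hypotheticalRate);     // 30
const at10K = usersAffected(10000, hypotheticalRate);   // 300
const at100K = usersAffected(100000, hypotheticalRate); // 3000

// Share of test runs that were neither perfect (91%) nor a single missed HCP (3.3%).
const otherIssuesPercent = Number((100 - 91 - 3.3).toFixed(1)); // 5.7

console.log({ at1K, at10K, at100K, otherIssuesPercent });
```

So the same rate that affects 30 users on a small app affects thousands at 100K users, and about 5.7% of the reported test runs fell into the remaining failure variants.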

@lorensr
Contributor

lorensr commented Feb 1, 2017

@matdurand Would you mind pasting a snippet of what the code you suggest looks like? I'd love to try it out and see if it resolves the problem for my Android users.

@matdurand
Author

Hi,
I created a fork of the webapp plugin here: https://github.com/matdurand/cordova-plugin-meteor-webapp
I'm also using the reloader package to trigger the reload after an HCP, so I also created a fork here: https://github.com/matdurand/reloader

If you're not using reloader, you would have to call
cordova.exec(callback, console.error, 'WebAppLocalServer', 'switchPendingVersion', []);
before the reload. The reload is usually done inside the Meteor JS code, but you can use the reload hook to execute switchPendingVersion before it reloads. I'm not sure about the exact code, but I guess it would look like this:

Reload._onMigrate(function (retry) {
  if (Meteor.isCordova) {
    cordova.exec(callback, console.error, 'WebAppLocalServer', 'switchPendingVersion', []);
  }
  return [true, {}];
});

As I stated before, my fix didn't resolve the issue completely, but HCP failures are happening less often with it. I wasn't able to pinpoint the issue more accurately.

Hope this helps.

@matdurand
Author

Forgot to mention that my webapp fork also includes a retry for blacklisted versions. It retries 2 times before blacklisting, instead of once as in vanilla Meteor.
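As a rough illustration of the retry-before-blacklisting idea, here is a minimal sketch. It is not the fork's code (the plugin itself is native); `VersionTracker`, `recordStartupFailure`, and the `maxAttempts` parameter are hypothetical names, and the exact attempt count is left configurable because the thread does not pin it down.

```javascript
// Hypothetical sketch: blacklist a version only after several failed startups.
// Counts are kept in memory only; nothing here is persisted.
class VersionTracker {
  constructor(maxAttempts) {
    this.maxAttempts = maxAttempts; // assumption: failures allowed before blacklisting
    this.failures = new Map();      // version -> failed startup count
    this.blacklist = new Set();
  }

  // Record a startup timeout for `version`; returns true once it is blacklisted.
  recordStartupFailure(version) {
    const count = (this.failures.get(version) || 0) + 1;
    this.failures.set(version, count);
    if (count >= this.maxAttempts) {
      this.blacklist.add(version);
    }
    return this.blacklist.has(version);
  }

  // A version may be tried again as long as it has not been blacklisted.
  shouldTry(version) {
    return !this.blacklist.has(version);
  }
}

const tracker = new VersionTracker(2);
const afterFirstFailure = tracker.recordStartupFailure('version-3');
const retryAllowed = tracker.shouldTry('version-3');
const afterSecondFailure = tracker.recordStartupFailure('version-3');

console.log(afterFirstFailure, retryAllowed, afterSecondFailure);
```

With `maxAttempts` set to 1 this degenerates to blacklisting on the first failure; higher values trade slower rejection of a genuinely broken version for tolerance of intermittent load failures.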

@lorensr
Contributor

lorensr commented Feb 3, 2017

Thanks a lot Mat! The retrying is working for me. One thing to note is that the retry count must not be persisted, because when I do a HCP that crashes the app, whenever I restart the app, it retries the crash version.

Here's the correct _onMigrate syntax:

Reload._onMigrate('myMigrateFn', (retry) => {
  if (Meteor.isCordova) {
    const onSuccess = () => console.log('called switchPendingVersion');
    cordova.exec(onSuccess, console.error, 'WebAppLocalServer', 'switchPendingVersion', []);
  }
  return [true, {}];
});
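One possible variation, offered only as a sketch: the callback above returns `[true, {}]` immediately, so the reload may start before the native side has finished switching. Because a `Reload._onMigrate` callback can report that it is not ready and later call `retry`, the migration can be held until the native success callback fires. `makeMigrateCallback` is a hypothetical helper, `exec` is injected so the sketch can run outside Cordova, and `switchPendingVersion` exists only in the fork mentioned above. This has not been tested against the plugin.

```javascript
// Hypothetical helper: builds a migration callback that holds the reload
// until the native bundle switch has been acknowledged.
function makeMigrateCallback(exec) {
  let requested = false; // has the native call been issued?
  let switched = false;  // has the native side answered?

  return function onMigrate(retry) {
    if (switched) {
      return [true, {}]; // ready: let the reload proceed
    }
    if (!requested) {
      requested = true;
      const done = () => {
        switched = true;
        retry(); // ask the reload package to poll the callbacks again
      };
      const onError = (err) => {
        console.error(err);
        done(); // design choice: do not block the reload forever on failure
      };
      exec(done, onError, 'WebAppLocalServer', 'switchPendingVersion', []);
    }
    return [false]; // not ready yet
  };
}

// In a Meteor Cordova client this might be wired up as:
// if (Meteor.isCordova) {
//   Reload._onMigrate('switchBundle', makeMigrateCallback(cordova.exec));
// }
```

Unblocking on error keeps a failed native call from preventing reloads entirely; whether that is the right trade-off depends on how the app wants to treat a failed switch.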

@wojtkowiak
Contributor

Hey everybody, just recently this started hitting my product badly, especially on slower tablets. What is very upsetting is that I cannot afford frequent HCPs, which usually help, because my app needs to run 24/7 with minimal downtime and I really need all the clients to migrate within a certain time window. The scenario is exactly what @matdurand described in the first post, and it is a really serious issue because my client bought a rather large number of devices on which this seems to happen very often.

@matdurand I will try out your changes soon. Did you make any other changes in the last 3 months?

@abernix could you take a look at this and see whether it is something we could work on to get it merged?

@matdurand
Author

We haven't had a failing hot code push for a while now, but we are not actively developing at the moment, so there are far fewer hot code pushes, which could be the reason for the lack of failures. In the last month of development, we had a couple of HCP errors, affecting less than 1% of users.

@stale

stale bot commented Dec 11, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale-bot label Dec 11, 2019
@stale

stale bot commented Dec 19, 2019

This issue has been automatically closed because it has not had recent activity.

@stale stale bot closed this as completed Dec 19, 2019
7 participants