Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent Firefox iOS < 20.0 from crashing when pointed to Durable Sync #293

Closed
pjenvey opened this issue Oct 18, 2019 · 15 comments · Fixed by #577
Closed

Prevent Firefox iOS < 20.0 from crashing when pointed to Durable Sync #293

pjenvey opened this issue Oct 18, 2019 · 15 comments · Fixed by #577
Assignees
Labels
5 Estimate - l - Moderately complex, will require some effort but clearly defined. p1 Stuff we gotta do before we ship!

Comments

@pjenvey
Copy link
Member

pjenvey commented Oct 18, 2019

The problem

We found when testing against staging during the week of 10/14 that Durable Sync completely crashes iOS. We did some digging and found it to be a result of how GCP handles HTTP/2 headers. Fixing this causes iOS not to crash (although at the moment Sync still does not work).

We plan to move forward enabling for new users after determining that number of new users on ff-ios < 20.0 (version with fix) is ~20 or so. However, we need to keep this open in order to make sure we address for existing sync users on legacy iOS versions prior to migrating existing users.

Possible Solutions

Current favorite is to adjust server logic to return error code preventing sync from crashing browser (but causing it to still fail) and then message users to upgrade.
Documented here

Requirements:

  • User-Agent string for iOS versions we want to identify

    • Format: Firefox-iOS-Sync/<appVersion>b<buildNumber> (<deviceModel>; iPhone OS <systemVersion>) (<app displayName>). E.g. Firefox-iOS-Sync/18.0b1 (iPhone; iPhone OS 13.2.2) (Fennec (synctesting))
  • An appropriate error code/payload to return

Related links:

Fix for firefox-ios
Related Sync crashing issue
Support case with GCP

@jrconlin
Copy link
Member

Is this for iOS? Are we sure that this is because of HTTP/2 headers?

I feel like we may need to add a bit more detail here.

@tublitzed
Copy link
Contributor

@jrconlin, yes we need a lot more detail. I'm filing this to track progress as we investigate. AFAIK, @fzzzy confirmed last week that it is indeed the headers issue that's crashing iOS. That's the blocker here; that iOS completely crashes.

Sync not working is one thing (which is also bad), but completely crashing Firefox for iOS is going to be a blocker here.

@tublitzed tublitzed added this to Prioritized in Services Engineering Oct 21, 2019
@fzzzy
Copy link
Contributor

fzzzy commented Oct 21, 2019

From the support case with GCP, it looks like this is a solution:

https://cloud.google.com/community/tutorials/nginx-ingress-gke

@tublitzed tublitzed moved this from Prioritized to In Progress in Services Engineering Oct 22, 2019
@tublitzed
Copy link
Contributor

Thanks @fzzzy! I just filed this bugzilla issue for ops to take a look.

@fzzzy
Copy link
Contributor

fzzzy commented Oct 22, 2019

Bob says this arrangement is very undesirable and should only be used as a last resort.

@fzzzy fzzzy self-assigned this Oct 28, 2019
@tublitzed tublitzed changed the title Find a workaround for HTTP/2 headers issue within GCP. Find a workaround for HTTP/2 headers issue within GCP (for existing sync users) Nov 1, 2019
@tublitzed tublitzed added p2 Post "release" follow-up enhancements and removed p1 Stuff we gotta do before we ship! labels Nov 1, 2019
@tublitzed
Copy link
Contributor

Downgrading the severity of this one, and sticking into the TODO list for when we migrate existing users.

@tublitzed tublitzed moved this from In Progress to Prioritized in Services Engineering Nov 1, 2019
@tublitzed tublitzed moved this from Prioritized to Backlog: Sync in Services Engineering Nov 4, 2019
@jchaupitre
Copy link

jchaupitre commented Mar 9, 2020

Alex shared FxA: Firefox iOS WAU by version >20 or <20 indicating 4-5% of iOS users are still on version <20.

@tublitzed
Copy link
Contributor

tublitzed commented Mar 12, 2020

Since it's been awhile since this issue cropped up, I want to test one more time that it's actually still a problem before we dedicate time to solving this one.

Testing locally against the last tagged release prior to 20.x which is 17.3. Looks like we did have other releases between 17.x and 20.x but they weren't tagged.

17.x requires Swift 5.0.1 which in turn requires XCode 10.2.1 which I found here and am downloading now. Also requires command line tools for Xcode 10.2.1

I'll build locally and test using my personal FxA account since I know that one is routed to Spanner. Will add notes here as to what I find.

@tublitzed tublitzed self-assigned this Mar 12, 2020
@tublitzed tublitzed added p1 Stuff we gotta do before we ship! and removed p2 Post "release" follow-up enhancements labels Mar 12, 2020
@tublitzed tublitzed moved this from Backlog: Sync to In Progress in Services Engineering Mar 12, 2020
@tublitzed
Copy link
Contributor

tublitzed commented Mar 13, 2020

And...it looks like I can't actually build this locally because my version of MacOS is too new to install the required Xcode 10.2.1 dev tools. 😢

Will sync up with @rbillings to see if we can get some QA time here otherwise investigate VM options for running an old mac OS version. Or connect with ios team to see if we can't get a version closer to 19.x which may not be so outdated.

@tublitzed tublitzed moved this from In Progress to Scheduled in Services Engineering Mar 13, 2020
@tublitzed
Copy link
Contributor

tublitzed commented Mar 13, 2020

Just found this 19.x branch poking around the ios repo. Going to try that now just in case. 🤞 Building from this commit.

Misc notes:

  • Manually changed the app services dependency here to v0.39.1 which is a more recent release closer to this commit hash timeline which is roughly Sept 23, 2019.

System requirements:

@tublitzed tublitzed moved this from Scheduled to In Progress in Services Engineering Mar 13, 2020
@tublitzed
Copy link
Contributor

tublitzed commented Mar 16, 2020

Ok, it looks like this will need to get some official QA testing. My machine is not old enough to test legacy builds :(. Luckily, it sounds like the QA team has access to older ios builds, which is good news. To be clear, here's what we need specifically:

  • Clarify what happens to the browser when attempting to sync using Firefox-ios 19.x while pointed at Durable Sync. Ie:

    • Does the sync succeed?
    • Does anything happen with the browser that's unexpected during this time?

@rbillings I'm going to re-assign this one to you to coordinate QA testing with your team. Once that's done, feel free to re-assign this one back over to me with the results linked/etc.

@tublitzed tublitzed assigned rbillings and unassigned tublitzed Mar 16, 2020
@tublitzed tublitzed moved this from In Progress to Prioritized in Services Engineering Mar 16, 2020
@tublitzed
Copy link
Contributor

Related PI request: https://jira.mozilla.com/browse/PI-536

@tublitzed
Copy link
Contributor

It's also worth noting that this chart today indicates that we're down to ~3.6% of users on FF iOS < 20.0

@tublitzed
Copy link
Contributor

Confirmed. It's still an issue.

@tublitzed tublitzed moved this from Scheduled to Prioritized in Services Engineering Mar 20, 2020
@tublitzed tublitzed changed the title Find a workaround for HTTP/2 headers issue within GCP (for existing sync users) Prevent Firefox iOS < 20.0 from crashing when pointed to Durable Sync Mar 20, 2020
@tublitzed tublitzed moved this from Prioritized to Scheduled in Services Engineering Mar 23, 2020
@tublitzed tublitzed added the 5 Estimate - l - Moderately complex, will require some effort but clearly defined. label Mar 30, 2020
@pjenvey
Copy link
Member Author

pjenvey commented Apr 7, 2020

Some kind of 5xx would be best here, we'll opt for returning a 503 here because:

  • The nginix frontend converts most 5xx responses to 503s anyway
  • iOS handles 503s similar to other 5xxes (giving up and not retrying for another AFAICT 15 minutes)

Despite returning a 503, a Retry-After header is not applicable here as the client won't be able to read it due to the case insensitivity issue.

pjenvey added a commit that referenced this issue Apr 8, 2020
to prevent them from triggering a bug that crashes them

and update Dockerfile builder to latest rust

Closes #293
Services Engineering automation moved this from Scheduled to Done Apr 8, 2020
@tublitzed tublitzed moved this from Done to Archived in Services Engineering Apr 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 Estimate - l - Moderately complex, will require some effort but clearly defined. p1 Stuff we gotta do before we ship!
Projects
Development

Successfully merging a pull request may close this issue.

6 participants