Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Android fetch hangs indefinitely with IPv6 hosts on some devices (Happy Eyeballs) #32730

Open
andreialecu opened this issue Dec 9, 2021 · 48 comments
Labels
Needs: Triage 🔍 🌐Networking Related to a networking API. Platform: Android Android applications.

Comments

@andreialecu
Copy link

andreialecu commented Dec 9, 2021

Description

This is a very bizarre issue that has been previously reported a bunch of times, and this is basically a continuation of:

#29608

I initially started running into this on RN 0.66 with AWS Cognito. Bumping to 0.66.3 didn't help.

I'm also pretty sure this used to work before and I'm not sure when it broke. It's on an app that has been shelved for a while.

The problem is very strange because the network request does not seem to be issued, but simply hitting CMD+S to save any file so that a hot-reload is issued will immediately dispatch the network request.

I discovered the promise hanging issue by adding some logs to the fetch calls the cognito library was doing:
image
Notice how the .then is not executed.

While troubleshooting I came across a mention here of a workaround: #29608 (comment) (courtesy of @danmaas) which seems to completely resolve the issue.

Here's the same .then correctly being executed after applying that patch:
image

Version

0.66.3

Output of react-native info

System:
    OS: macOS 12.0.1
    CPU: (8) arm64 Apple M1
    Memory: 142.27 MB / 16.00 GB
    Shell: 5.8 - /bin/zsh
  Binaries:
    Node: 16.13.0 - /private/var/folders/9p/k1yqxx0d7rn1nlztg_wm7sbw0000gn/T/xfs-2dcae145/node
    Yarn: 2.4.0-git.20210330.hash-ebcd71d5 - /private/var/folders/9p/k1yqxx0d7rn1nlztg_wm7sbw0000gn/T/xfs-2dcae145/yarn
    npm: 7.20.1 - ~/.nvm/versions/node/v16.13.0/bin/npm
    Watchman: 2021.11.01.00 - /opt/homebrew/bin/watchman
  Managers:
    CocoaPods: 1.11.0 - /Users/andreialecu/.rbenv/shims/pod
  SDKs:
    iOS SDK:
      Platforms: DriverKit 21.0.1, iOS 15.0, macOS 12.0, tvOS 15.0, watchOS 8.0
    Android SDK: Not Found
  IDEs:
    Android Studio: 4.2 AI-202.7660.26.42.7351085
    Xcode: 13.1/13A1030d - /usr/bin/xcodebuild
  Languages:
    Java: 11.0.8 - /Applications/Android Studio.app/Contents/jre/jdk/Contents/Home/bin/javac
  npmPackages:
    @react-native-community/cli: Not Found
    react: 17.0.2 => 17.0.2 
    react-native: 0.66.3 => 0.66.3 
    react-native-macos: Not Found
  npmGlobalPackages:
    *react-native*: Not Found

Steps to reproduce

I'm able to reproduce it with this:

  React.useEffect(() => {
    console.log('confirm start', new Date());
    fetch('https://cognito-idp.eu-west-1.amazonaws.com/', {
      method: 'POST',
      mode: 'cors',
    })
      .then(() => console.log('then', new Date()))
      .catch(() => console.log('catch', new Date()));
    setTimeout(() => {
      console.log('5 seconds passed');
    }, 5000);
  }, []);

Output:
Screenshot 2021-12-09 at 19 29 06

After applying #29608 (comment):
Screenshot 2021-12-09 at 19 30 32

Snack, code example, screenshot, or link to a repository

No response

Skip to this comment for the actual cause: #32730 (comment)

@andreialecu
Copy link
Author

andreialecu commented Dec 9, 2021

Note that this doesn't repro on the emulator, this is the output on the emulator:

 LOG  confirm x 2021-12-09T18:44:08.229Z
 LOG  then x 2021-12-09T18:44:08.607Z
 LOG  5 seconds passed

❗ The device where this is happening, and where I initially noticed it on is an Oppo A72 (CPH2067) running Android 11

@andreialecu
Copy link
Author

I'm also not able to reproduce it on a Samsung Galaxy Tab A (SM-T290) device.

@andreialecu andreialecu changed the title Android fetch hangs indefinitely with IPv6 hosts Android fetch hangs indefinitely with IPv6 hosts on some devices Dec 9, 2021
@danmaas
Copy link

danmaas commented Dec 9, 2021

In my case, this issue also affected only some devices. I think it has something to do with the network stack or driver on the device. Or how they interact with IPv6 devices out on the network.

@andreialecu
Copy link
Author

Interesting. I left it running for a while and it actually seems to time out on the connection:

 LOG  confirm x 2021-12-09T19:00:10.557Z
 LOG  5 seconds passed
 LOG  then x 2021-12-09T19:04:23.177Z

It took 4 minutes to time out.

I just noticed that the device does not have IPv6 connectivity at all based on https://test-ipv6.com/

Might be a router problem on my end, usually I have IPv6. However, I'm not sure why it would even attempt to use an IPv6 address considering it does not have IPv6 connectivity at all.

@andreialecu
Copy link
Author

Very interesting, I rebooted the router and now I have IPv6 again.

❗ The Oppo now no longer has the error. It works perfectly:

 LOG  confirm x 2021-12-09T19:28:07.714Z
 LOG  then x 2021-12-09T19:28:09.217Z
 LOG  5 seconds passed

I'm not sure what to make of this. 🤔

@vamper424
Copy link

Yo mismo me encargo de resolver lo de sus vidas, ni hay ni habrá problema alguno, temor alguno o alguien que me agreda saludo sigan igual y cada vez mejoren más, ok. Si se puede. Atte: Jesus Francisco Urias García.

@vamper424
Copy link

Y nuevamente tu gol, esta vez artificial.

@vamper424
Copy link

Jajaja mediocres ellos, sobre ellos.

@andreialecu
Copy link
Author

The root cause of this issue seems to be the lack of "Happy Eyeballs" in the underlying okhttp library that react-native is using on Android.

Here's the issue in okhttp that is tracking this:
square/okhttp#506

I think most of the random networking Android slowdowns reported all over the ecosystem are probably related. This is the one I ran into in particular:
aws-amplify/amplify-js#5539

Possibly related issues:
#32467
#29782

This thread also mentions this exact same issue:
#28283 (comment)

I believe a PR to implement Happy Eyeballs into okhttp shouldn't be too hard, if someone would be able to contribute one.

From my understanding a TCP connection to both the IPv6 and IPv4 address needs to be attempted, and the first one to respond is used, while the other one is closed.

This would solve this ubiquitous Android networking problem once and for all.

@marcesengel
Copy link
Contributor

@andreialecu would you be so kind to review my PR?

@andreialecu
Copy link
Author

andreialecu commented Jan 21, 2022

@marcesengel I think this should be addressed in the underlying okhttp library. Perhaps you might want to contribute it there instead?

There is some movement there that indicates this problem might finally get some attention.

See:
square/okhttp#7009

@marcesengel
Copy link
Contributor

@andreialecu looking at the old issue regarding this a change on their end seemed unrealistic without a new major release (due to interceptors needing to be raced, see for example this comment, which reads "It's an invasive feature to add because there's extra complexity. If you need it, don't wait for us."), which in turn would mean that it probably takes some time to adapt this to react-native.

I don't see any issue with implementing it for RN in the meantime. I'd have added it to okhttp if it didn't look like it'd take ages to catch on - in case it ever did.

@andreialecu
Copy link
Author

I see. That makes sense.

Not sure who would need to take a look at this. I can't review it myself since I'm not familiar with the RN code dealing with this.

Perhaps @yungsters would be able to take a look?

@cortinico
Copy link
Contributor

@marcesengel I think this should be addressed in the underlying okhttp library. Perhaps you might want to contribute it there instead?

Essentially this. Please take a look at my answer here: #33045 (comment) as I believe that's the most feasible solution we could take at this stage.

@marcesengel
Copy link
Contributor

I've built a patched version for use until OkHTTP 5.x is stable (now that since the first of February the implementation of Happy Eyeballs is confirmed). It's available as marcesengel/react-native#0.66-patched-3.

@cortinico
Copy link
Contributor

Cross-posting here as I realized I accidentally left the comment on the PR instead of the Issue: #33045 (comment)

Would you be up for opening a draft PR with the OkHTTP bump to 5.x alpha? It could help us spot early breaking changes and we could test it against the internal infra to make sure nothing breaks.
(That's also a task that someone else in the community can pickup if they wish).

@marcesengel
Copy link
Contributor

I'd be open to do the transition once major version 5 is stable, as for now there's a fix and I wouldn't want to change something multiple times in case more breaking changes are introduced before v5 hits stable.
If somebody else would like to do it beforehand and take the risk in order to make it available faster that of course would be much appreciated 👍

Do you know if there's an ETA for the stable release?

@andreialecu andreialecu changed the title Android fetch hangs indefinitely with IPv6 hosts on some devices Android fetch hangs indefinitely with IPv6 hosts on some devices (Happy Eyeballs) Feb 5, 2022
@cortinico
Copy link
Contributor

Nope but you can subscribe here to get updates

@cortinico
Copy link
Contributor

I was wondering how the builds at Facebook are working?

I'm not on a M1 so I haven't faced those issues at all. Thanks for pointing them out and addressing them though 👍

@marcesengel
Copy link
Contributor

marcesengel commented Mar 4, 2022

Sorry for not putting it better - I wasn't meaning the M1 issues but the clang version included in the NDK, as RN is on NDK 21.4.xxx now which ships with clang 9.0.9 while Buck is looking for clang 9.0.8 vor every NDK major > 20.

Regardless I still have issues running the tests, as Buck tries to compile yoga with only -fno-omit-frame-pointer -O3 while in YGEnums.h for example constexpr is used, which would require at least C++ 11. When also passing -std=c++17 to fix this, I get String table does not end at end of file at react-native/buck-out/gen/ReactAndroid/src/main/jni/first-party/yogajni/jni#default,shared/libyoga.dylib 😖

I'd really like to get local unit tests working before I try to bump OkHttp3. Do you want me to create a new issue to tackle my problems with this?

Edit: On the official Clang website it says by default C++ 14 is used? Why am I having issues with this, maybe some Apple Clang specifics? https://clang.llvm.org/cxx_status.html

@cortinico
Copy link
Contributor

I'd really like to get local unit tests working before I try to bump OkHttp3. Do you want me to create a new issue to tackle my problems with this?

Have you considered using the docker image instead?

@marcesengel
Copy link
Contributor

Yes, sadly that didn't work. Using the image I got a SIGILL (Could not load hsdis-amd64.so; library not loadable; PrintAssembly is disabled). According to some sources this can be because the CPU (or in this case the x86 emulator) is missing an instruction, which could be fixed using the java flag -XX:UseSSE=2. I tried supplying it using .buckjavaargs to no avail.
Alternatively hsdis-amd64.so could just actually be missing, which is why I tried building the reactnativecommunity/react-native-android image on my machine. On arch amd64 this got stuck without showing any errors etc. on the step [javac] Compiling 109 source files to /buck/ant-out/src-gen/classes [writing all of this down that might have been due to the problem described above, maybe I should try building the container with -XX:UseSSE=2]. Then I tried building the, on my machine native, arm64v8 arch which completed the step the other arch was stuck at, but ultimately failed to install the android emulator due to missing binaries iirc.

Then I concluded: my machine has an emulator and everything set up, why don't I just do everything on the host? And then I ran into the issues described in my messages above. Thinking about it again the unit tests might not require an emulator, do they? If not I could emit the emulator from my local arm64v8 image and use it to execute the unit tests.

@cortinico
Copy link
Contributor

I'm not sure if you even need buck to try this, honestly. You can do everything on the Gradle side of things (on Buck you'll have to bump the version number of OkHTTP).

I would replicate this commit and fix any build failures you might face via the CI or via a Gradle build: d6db5c5

@marcesengel
Copy link
Contributor

Yup been thinking the same recently, will do so until everything is ready for M1. It also seems like it's a Buck issue with the bumped 3rd party libs as someone else ran into the same problem building something else. Will also look into this.

@andreialecu
Copy link
Author

andreialecu commented Aug 26, 2022

Here's an excellent write-up from Reddit about this issue: https://www.reddit.com/r/RedditEng/comments/v1upr8/ipv6_support_on_android/

As you can see, it can affect things on a large scale.

This really needs to be prioritized by the core react-native team, I can't imagine it's not affecting their own apps at Meta.

@yschimke
Copy link

At a guess, I'd expect Meta will be exclusively using proxygen or mvfst as their client.

@cortinico
Copy link
Contributor

This really needs to be prioritized by the core react-native team

As I already mentioned above, there is nothing to handle here. You can bump your OkHTTP version to 5 alpha and you'll get the Happy Eyeball support out of the box.

Once 5 becomes stable, we'll be bumping the default OkHTTP version in the template.

@andreialecu
Copy link
Author

andreialecu commented Sep 16, 2022

I have this sorted out in my projects.

Still, I think it needs to be prioritized because almost all React Native Android apps will run into this problem sooner or later, and it's not an obvious issue.

As I already mentioned above, there is nothing to handle here.

I think something should be done about it to at least make it obvious to the community that their Android apps will not work consistently 100% of the time. See all the threads I linked, and the Reddit story.

@andreialecu
Copy link
Author

And if I may rehash my previous comment:

Also, as mentioned here: square/okhttp#506 (comment)

If you're prepared to deal with small API changes between now and 5.0 final you can put it into production today. For OkHttp ‘alpha’ means ‘API instability’ but the code is extremely stable.

@hadeskers
Copy link

hadeskers commented Feb 3, 2023

Add below dependencies to android/app/build.gradle to force use okhttp 5.0.0-alpha.11

  dependencies {
        // others dependencies 
  
        implementation 'com.squareup.okhttp3:logging-interceptor:5.0.0-alpha.11'
        implementation 'com.squareup.okhttp3:okhttp:5.0.0-alpha.11'
        implementation 'com.squareup.okhttp3:okhttp-urlconnection:5.0.0-alpha.11'
  }

@lukawolf
Copy link

If you are using expo and your project is managed, you can not edit android/app/build.gradle yourself. Since I've not found any expo plugin, which would make the changes for you, I just made one: expo-android-app-gradle-dependencies

@JKKholmatov
Copy link

JKKholmatov commented Aug 30, 2023

@hadeskers @lukawolf @andreialecu @cortinico I created an Expo app and run expo prebuild, how should I implement okhttp:5.0.0-alpha.11 version? is there any instruction? please help!

@JKKholmatov
Copy link

Add below dependencies to android/app/build.gradle to force use okhttp 5.0.0-alpha.11

  dependencies {
        // others dependencies 
  
        implementation 'com.squareup.okhttp3:logging-interceptor:5.0.0-alpha.11'
        implementation 'com.squareup.okhttp3:okhttp:5.0.0-alpha.11'
        implementation 'com.squareup.okhttp3:okhttp-urlconnection:5.0.0-alpha.11'
  }

It did not work for me, is there any other way to implement it?

@irisjae
Copy link

irisjae commented Oct 20, 2023

Add below dependencies to android/app/build.gradle to force use okhttp 5.0.0-alpha.11

  dependencies {
        // others dependencies 
  
        implementation 'com.squareup.okhttp3:logging-interceptor:5.0.0-alpha.11'
        implementation 'com.squareup.okhttp3:okhttp:5.0.0-alpha.11'
        implementation 'com.squareup.okhttp3:okhttp-urlconnection:5.0.0-alpha.11'
  }

May I ask, having applied this, it is necessary to further modify the OkHttpClientProvider by adding .fastFallback(true) square/okhttp#506 (comment) , and setting React Native to build from source https://reactnative.dev/contributing/how-to-build-from-source#update-your-project-to-build-from-source ?

@yschimke
Copy link

fastFallback is the default in alpha.11 already

https://github.com/square/okhttp/blob/2344b87c59456088ed1346221d45648634f7f0c8/CHANGELOG.md?plain=1#L12C1-L12C1

@irisjae
Copy link

irisjae commented Oct 20, 2023

Strange, I'm still having issues... Fetch with IP works, but not with domain names. Let me look further.

@irisjae
Copy link

irisjae commented Oct 20, 2023

I've gotten to the bottom of my issue (hopefully), leaving it here in case it helps anyone. As I reported, after applying the Happy Eyeballs patch, I still had issues.

Fetching with IP seemed to be more reliable than with domain names. Still, with more systematic testing, I found that even when fetching via IP, my app exhibited stochastic hanging, just less frequently. Still seeing stochasticity, I thought to adjust the connection pool, so I set my app to build from source for Android, and modified OkHttpClientProvider. I knew that OkHttp kept some kind of connection pool, and assumed it was fixed in size. As my app had some long-running HTTP requests, I thought that OkHttp was waiting for the long-running requests to complete before allowing my new requests to use them from the pool. Assuming 'maxIdleConnections' was the number of connections kept in the pool, I turned the connections up. It was to no avail.

Reading more about the connection pool, it seems like instead of keeping a fixed number of connections, OkHttp tries to keep the number of connections close to Number of connections in use + Number of idle connections. It seems that OkHttp tries to somehow place 'similar' requests on to a shared connection to minimise time reestablishing connection, but on the other hand square/okhttp#4354 (comment), sharing may cause failures.

I thought that perhaps with long-running requests in my app, I would be better served with no connection pool. This can be effectively done by setting 'maxIdleConnections' to 0.

My randomly hanging fetches ceased.

Edit: Attempting to disable the Connection Pool did not eliminate the issue.

As of now, I believe that the culprit is OkHttp's connection sharing somehow breaking things. For now, I removed my patch on OkHttpClientProvider.java, and instead applied a patch on NetworkingModule.java, which violently creates a separate OkHttpClient for each request and destroys the client after use (against all the advice of OkHttp), to eliminate any connection sharing. Nothing seems to hang randomly anymore so far (hopefully thats it...).

i have to say, it frustrates me greatly that the network stack on React Native Android is causing me so much trouble and tinkering with network code, when all I'm using is plain fetches with JSON. I wonder if some of the default configurations should need further improvement for React Native Android - contrasting with the networking on iOS, which has worked flawlessly.

Edit: For the record, both disabling the connection pool and violently preventing connection sharing were crucial to making network connections work reliably on low-end devices. Removing either of the patches reintroduces the issue.

@yschimke
Copy link

OkHttp 5 has a connection listener you can configure on the pool. It should expose all this so it's not surprising.

@irisjae
Copy link

irisjae commented Oct 21, 2023

How might the connection listener help?

@yschimke
Copy link

Maybe not solve the problem. But let you see what's going on, when connections are reused, released, fail.

@consuelo-sanna
Copy link

I have this sorted out in my projects.

Still, I think it needs to be prioritized because almost all React Native Android apps will run into this problem sooner or later, and it's not an obvious issue.

As I already mentioned above, there is nothing to handle here.

I think something should be done about it to at least make it obvious to the community that their Android apps will not work consistently 100% of the time. See all the threads I linked, and the Reddit story.

Good morning, my project bump on this issue since Android 14 officially come out and it took me one week before being able to connect the dots and realize it was an IPv6 problem and end up here, so I totally agree that this it's not an obvious issue and should be at least pointed out somewhere in the documentation (if there is, I could not find it) or on troubleshooting/common issues.
The app I work on is for US people only, so in order for me to test it I need to go through a VPN and this made me impossible reproduce the issue since the one I chose does not support IPv6 and this made it difficult to debug

Since Android 14, every user of our app based in the USA could not use the app since the first api call instead of taking less of a second it took more than 2 minutes, I have a reproducible repo with latest RN version here if you are interested.

In order to reproduce the issue, the only thing you have to do is use an Android 14 phone with data connection and an ISP provider that supports IPv6, Verizon and AT&T for example

The only thing that solved it, was leveraging okhttp3 okhttp:5.0.0-alpha.12 (alpha 11 did not solved for us) OR patching following danmaas's example here https://gist.github.com/danmaas/c60af5fed9f55d2bc616ce302696540d

I really think this issue should be more visible, especially since Android 14 is out now and I think more and more people/apps will start experiencing this, let me know if there's something I can do to help with that, I was not sure opening a new issue on the repo was a good idea or not since there are already quite a few on the argument, like:

#28022
#28038
#29608

@shubhkhanna
Copy link

shubhkhanna commented Apr 10, 2024

For me it solved the problem for IPv6 support on android.

dependencies {
      // others dependencies 

      // define a BOM and its version
      implementation(platform("com.squareup.okhttp3:okhttp-bom:5.0.0-alpha.12"))

      // define any required OkHttp artifacts without version
      implementation("com.squareup.okhttp3:okhttp")
      implementation("com.squareup.okhttp3:logging-interceptor")
      implementation("com.squareup.okhttp3:okhttp-urlconnection")
}

Add following dependencies to android/app/build.gradle to force use okhttp 5.0.0-alpha.12

"react": "18.2.0",
"react-native": "0.71.11"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs: Triage 🔍 🌐Networking Related to a networking API. Platform: Android Android applications.
Projects
None yet
Development

Successfully merging a pull request may close this issue.