Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Firestore] hangs indefinitely #1162

Closed
ryanccn opened this issue Feb 9, 2021 · 26 comments
Closed

[Firestore] hangs indefinitely #1162

ryanccn opened this issue Feb 9, 2021 · 26 comments

Comments

@ryanccn
Copy link

ryanccn commented Feb 9, 2021

[REQUIRED] Step 2: Describe your environment

  • Operating System version: macOS Big Sur v11.2
  • Firebase SDK version: 8.13.0
  • Firebase Product: Firestore
  • Node.js version: v15.8.0
  • NPM version: 7.5.0

[REQUIRED] Step 3: Describe the problem

Steps to reproduce:

Doing anything with Firestore hangs indefinitely. After testing with resolutions of @grpc/grpc-js,

  • <0.6.0: returns "Protocol https: unsupported"
  • 0.6.0: API returns {"error":{}}
  • >0.6.0: hangs indefinitely (returns an error message like this:)
node:events:355
      throw er; // Unhandled 'error' event
      ^

Error: 14 UNAVAILABLE: No connection established
    at Object.callErrorFromStatus (/Users/ryanccn/firebase/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/call.js:30:26)
    at Object.onReceiveStatus (/Users/ryanccn/firebase/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/client.js:328:49)
    at Object.onReceiveStatus (/Users/ryanccn/firebase/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:304:181)
    at Http2CallStream.outputStatus (/Users/ryanccn/firebase/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/call-stream.js:114:27)
    at Http2CallStream.maybeOutputStatus (/Users/ryanccn/firebase/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/call-stream.js:153:22)
    at Http2CallStream.endCall (/Users/ryanccn/firebase/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/call-stream.js:140:18)
    at Http2CallStream.cancelWithStatus (/Users/ryanccn/firebase/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/call-stream.js:442:14)
    at ChannelImplementation.tryPick (/Users/ryanccn/firebase/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/channel.js:232:32)
    at ChannelImplementation._startCallStream (/Users/ryanccn/firebase/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/channel.js:262:14)
    at Http2CallStream.start (/Users/ryanccn/firebase/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/call-stream.js:419:22)
    at BaseStreamingInterceptingCall.start (/Users/ryanccn/firebase/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:276:19)
    at ServiceClientImpl.makeServerStreamRequest (/Users/ryanccn/firebase/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/client.js:311:14)
    at ServiceClientImpl.<anonymous> (/Users/ryanccn/firebase/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/make-client.js:92:19)
    at /Users/ryanccn/firebase/node_modules/@google-cloud/firestore/build/src/v1/firestore_client.js:201:29
    at /Users/ryanccn/firebase/node_modules/google-gax/build/src/streamingCalls/streamingApiCaller.js:38:28
    at /Users/ryanccn/firebase/node_modules/google-gax/build/src/normalCalls/timeout.js:44:16
Emitted 'error' event on ClientReadableStreamImpl instance at:
    at Object.onReceiveStatus (/Users/ryanccn/firebase/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/client.js:328:28)
    at Object.onReceiveStatus (/Users/ryanccn/firebase/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:304:181)
    [... lines matching original stack trace ...]
    at /Users/ryanccn/firebase/node_modules/google-gax/build/src/normalCalls/timeout.js:44:16
    at Object.request (/Users/ryanccn/firebase/node_modules/google-gax/build/src/streamingCalls/streaming.js:105:36) {
  code: 14,
  details: 'No connection established',
  metadata: Metadata { internalRepr: Map(0) {}, options: {} }
}

Relevant Code:

http://github.com/leerob/nextjs-vercel-firebase

// lib/firebase.js

import admin from 'firebase-admin';

try {
  admin.initializeApp({
    credential: admin.credential.cert({
      project_id: process.env.FIREBASE_PROJECT_ID,
      private_key: process.env.FIREBASE_PRIVATE_KEY,
      client_email: process.env.FIREBASE_CLIENT_EMAIL
    }),
    databaseURL: 'https://ryanccn-dev-36130-default-rtdb.firebaseio.com/'
  });
} catch (error) {
  /*
   * We skip the "already exists" message which is
   * not an actual error when we're hot-reloading.
   */
  if (!/already exists/u.test(error.message)) {
    // eslint-disable-next-line no-console
    console.error('Firebase admin initialization error', error.stack);
  }
}

export default admin.firestore();

(This is a Next.js API route)

// pages/api/city/[name].js

import firebase from '../../../lib/firebase';

export default (req, res) => {
  console.log()
  firebase
    .collection('cities')
    .doc(req.query.name)
    .get()
    .then((doc) => {
      res.json(doc.data());
    })
    .catch((error) => {
      res.json({ error });
    });
};
@google-oss-bot
Copy link

I found a few problems with this issue:

  • I couldn't figure out how to label this issue, so I've labeled it for a human to triage. Hang tight.
  • This issue does not seem to follow the issue template. Make sure you provide all the required information.

@ryanccn
Copy link
Author

ryanccn commented Feb 9, 2021

@google-oss-bot I deleted the first step thinking it was unnecessary

@schmidt-sebastian
Copy link
Contributor

Firestore uses @grpc/grpc-js. Are you able to share your output of npm list?

@ryanccn
Copy link
Author

ryanccn commented Feb 11, 2021

Firestore uses @grpc/grpc-js. Are you able to share your output of npm list?

Yeah I meant grpc-js 😅

@ryanccn
Copy link
Author

ryanccn commented Feb 11, 2021

nextjs-vercel-firebase@0.0.0
├── firebase-admin@8.12.1
├── next@9.4.0
├── react-dom@16.13.1
├── react@16.13.1
└── swr@0.2.0

@cprichmond
Copy link

cprichmond commented Feb 12, 2021

I'm having a similar issue.

When I follow the instructions found here: https://firebase.google.com/docs/admin/setup I get the same Error: 14 UNAVAILABLE: No connection established after about a 60-second wait.

I have created a service account and have the resulting json file available to my node project.

This is a very, very simple node app that just tries to auth and get a document from firestore. It uses...

node 12.19.0 (have also tried 15.x)
firebase-admin: ^9.4.2

I am just performing a simple get on a document that exists in a collection that exists (very similar to the code above). I have used firestore successfully in the past with node without issue.

NOTE: In the code below I preset the GOOGLE_APPLICATION_CREDENTIALS to the serviceAccount JSON. I have also tried to import the file and pass it to the creds function. Both have the same issue.

import admin from 'firebase-admin';

admin.initializeApp({
  credential: admin.credential.applicationDefault(),
  databaseURL: 'https://project-name.firebaseio.com', // project-name subbed for my specific project
});

const db = admin.firestore();

export const getDoc = async (): Promise<void> => {

  try {
    const dbRef = db.collection('collection'); // collection subbed for my existing collection
    const result = await dbRef.doc('doc-id').get(); // doc-id subbed for an existing doc-id
    console.log(result);
  } catch (err) {
    console.error("error connecting", err);
  }
};

@ryanccn
Copy link
Author

ryanccn commented Feb 13, 2021

@schmidt-sebastian interestingly this problem doesn't exist on a Vultr server I created for the purpose 🧐

Screen Shot 2021-02-13 at 19 14 41

Perhaps this is because I use a proxy in my development?

@cprichmond
Copy link

@ryanccn perhaps this is mac related? I am on macOS Catalina 10.15.7

I dropped the Firebase Admin SDK back to 7.x and got the following error:

Handshake failed with fatal error SSL_ERROR_SSL: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed.

Could be a red herring, but thought it was worth mentioning.

@ryanccn
Copy link
Author

ryanccn commented Feb 14, 2021

Hmm.

@schmidt-sebastian
Copy link
Contributor

Note that you can turn on additional logging by setting the following environment variables:

GRPC_VERBOSITY= DEBUG GRPC_TRACE=all

This should show extensive logging output and might point us to the root cause.

@cprichmond
Copy link

cprichmond commented Feb 18, 2021

Thanks for the tip @schmidt-sebastian. Here are the subsequent logs and errors which look very similar to those I received when I dropped back to 7.x of the SDK...

I can paste the full logs but here are the relevant repeating parts. Appears the issue is that connection closed with error self signed certificate in certificate chain.

2021-02-18T02:42:45.479Z | pick_first | CONNECTING -> CONNECTING
2021-02-18T02:42:45.479Z | resolving_load_balancer | dns:firestore.googleapis.com:443 CONNECTING -> CONNECTING
2021-02-18T02:42:45.479Z | channel | Pick result: QUEUE subchannel: undefined status: undefined undefined
2021-02-18T02:42:45.480Z | connectivity_state | dns:firestore.googleapis.com:443 CONNECTING -> CONNECTING
2021-02-18T02:42:45.649Z | subchannel | 2404:6800:4015:802::200a:443 connection closed with error self signed certificate in certificate chain
2021-02-18T02:42:45.649Z | subchannel | 2404:6800:4015:802::200a:443 CONNECTING -> TRANSIENT_FAILURE
2021-02-18T02:42:45.649Z | pick_first | CONNECTING -> CONNECTING
2021-02-18T02:42:45.649Z | resolving_load_balancer | dns:firestore.googleapis.com:443 CONNECTING -> CONNECTING
2021-02-18T02:42:45.649Z | channel | Pick result: QUEUE subchannel: undefined status: undefined undefined
2021-02-18T02:42:45.649Z | connectivity_state | dns:firestore.googleapis.com:443 CONNECTING -> CONNECTING
2021-02-18T02:42:45.650Z | subchannel | 142.250.70.234:443 connection closed with error self signed certificate in certificate chain
2021-02-18T02:42:45.650Z | subchannel | 142.250.70.234:443 CONNECTING -> TRANSIENT_FAILURE
2021-02-18T02:42:45.650Z | pick_first | CONNECTING -> TRANSIENT_FAILURE
2021-02-18T02:42:45.650Z | resolving_load_balancer | dns:firestore.googleapis.com:443 CONNECTING -> TRANSIENT_FAILURE
2021-02-18T02:42:45.650Z | channel | Pick result: TRANSIENT_FAILURE subchannel: undefined status: 14 No connection established
2021-02-18T02:42:45.650Z | call_stream | [0] cancelWithStatus code: 14 details: "No connection established"
2021-02-18T02:42:45.650Z | call_stream | [0] ended with status: code=14 details="No connection established"
2021-02-18T02:42:45.651Z | connectivity_state | dns:firestore.googleapis.com:443 CONNECTING -> TRANSIENT_FAILURE
2021-02-18T02:42:45.651Z | call_stream | [0] cancelWithStatus code: 1 details: "Cancelled on client"
2021-02-18T02:42:46.484Z | subchannel | 2404:6800:4015:802::200a:443 TRANSIENT_FAILURE -> CONNECTING
2021-02-18T02:42:46.484Z | pick_first | TRANSIENT_FAILURE -> CONNECTING

@cprichmond
Copy link

cprichmond commented Feb 18, 2021

Upon further reading of the above issue. I think I am the victim of corporate malware. This is a work MacBook and subsequently has Netskope and god knows what else monitoring and messing with SSL traffic. I added export NODE_TLS_REJECT_UNAUTHORIZED=0 and this got me further down the road. I am going to run this on a non-work regulated Mac; if that works then it's pretty obvious where the issue lies. Not sure if there is a fix for it though as policy restricts me from disabling the corporate malware.

Are there other ways to authenticate the firebase admin client if there are no workarounds available? As it seems the issue here is some software between my app and the internet is tampering with the cert.

@ryanccn
Copy link
Author

ryanccn commented Feb 19, 2021

2021-02-19T09:13:14.252Z | resolving_load_balancer | dns:firestore.googleapis.com:443 IDLE -> IDLE
2021-02-19T09:13:14.252Z | connectivity_state | dns:firestore.googleapis.com:443 IDLE -> IDLE
2021-02-19T09:13:14.253Z | dns_resolver | Resolver constructed for target dns:firestore.googleapis.com:443
2021-02-19T09:13:14.262Z | channel | dns:firestore.googleapis.com:443 createCall [0] method="/google.firestore.v1.Firestore/BatchGetDocuments", deadline=Fri Feb 19 2021 17:18:14 GMT+0800 (China Standard Time)
2021-02-19T09:13:14.264Z | call_stream | [0] Sending metadata
2021-02-19T09:13:14.264Z | channel | Pick result: QUEUE subchannel: undefined status: undefined undefined
2021-02-19T09:13:14.266Z | call_stream | [0] write() called with message of length 213
2021-02-19T09:13:14.266Z | call_stream | [0] end() called
2021-02-19T09:13:14.267Z | call_stream | [0] deferring writing data chunk of length 218
2021-02-19T09:13:14.268Z | dns_resolver | Resolution update requested for target dns:firestore.googleapis.com:443
2021-02-19T09:13:14.271Z | resolving_load_balancer | dns:firestore.googleapis.com:443 IDLE -> CONNECTING
2021-02-19T09:13:14.271Z | channel | Pick result: QUEUE subchannel: undefined status: undefined undefined
2021-02-19T09:13:14.271Z | connectivity_state | dns:firestore.googleapis.com:443 IDLE -> CONNECTING
2021-02-19T09:13:14.271Z | resolving_load_balancer | dns:firestore.googleapis.com:443 CONNECTING -> CONNECTING
2021-02-19T09:13:14.271Z | channel | Pick result: QUEUE subchannel: undefined status: undefined undefined
2021-02-19T09:13:14.271Z | connectivity_state | dns:firestore.googleapis.com:443 CONNECTING -> CONNECTING
2021-02-19T09:13:14.452Z | dns_resolver | Resolved addresses for target dns:firestore.googleapis.com:443: [172.217.4.138:443]
2021-02-19T09:13:14.452Z | pick_first | IDLE -> IDLE
2021-02-19T09:13:14.453Z | resolving_load_balancer | dns:firestore.googleapis.com:443 CONNECTING -> IDLE
2021-02-19T09:13:14.453Z | channel | Pick result: QUEUE subchannel: undefined status: undefined undefined
2021-02-19T09:13:14.453Z | connectivity_state | dns:firestore.googleapis.com:443 CONNECTING -> IDLE
2021-02-19T09:13:14.453Z | pick_first | Connect to address list 172.217.4.138:443
2021-02-19T09:13:14.454Z | subchannel | 172.217.4.138:443 refcount 0 -> 1
2021-02-19T09:13:14.454Z | subchannel | 172.217.4.138:443 refcount 1 -> 2
2021-02-19T09:13:14.454Z | pick_first | Start connecting to subchannel with address 172.217.4.138:443
2021-02-19T09:13:14.454Z | pick_first | IDLE -> CONNECTING
2021-02-19T09:13:14.455Z | resolving_load_balancer | dns:firestore.googleapis.com:443 IDLE -> CONNECTING
2021-02-19T09:13:14.455Z | channel | Pick result: QUEUE subchannel: undefined status: undefined undefined
2021-02-19T09:13:14.455Z | connectivity_state | dns:firestore.googleapis.com:443 IDLE -> CONNECTING
2021-02-19T09:13:14.455Z | subchannel | 172.217.4.138:443 IDLE -> CONNECTING
2021-02-19T09:13:14.455Z | pick_first | CONNECTING -> CONNECTING
2021-02-19T09:13:14.456Z | resolving_load_balancer | dns:firestore.googleapis.com:443 CONNECTING -> CONNECTING
2021-02-19T09:13:14.456Z | channel | Pick result: QUEUE subchannel: undefined status: undefined undefined
2021-02-19T09:13:14.456Z | connectivity_state | dns:firestore.googleapis.com:443 CONNECTING -> CONNECTING
2021-02-19T09:14:29.844Z | subchannel | 172.217.4.138:443 connection closed with error connect ETIMEDOUT 172.217.4.138:443
2021-02-19T09:14:29.847Z | subchannel | 172.217.4.138:443 CONNECTING -> TRANSIENT_FAILURE
2021-02-19T09:14:29.847Z | pick_first | CONNECTING -> TRANSIENT_FAILURE
2021-02-19T09:14:29.848Z | resolving_load_balancer | dns:firestore.googleapis.com:443 CONNECTING -> TRANSIENT_FAILURE
2021-02-19T09:14:29.848Z | channel | Pick result: TRANSIENT_FAILURE subchannel: undefined status: 14 No connection established
2021-02-19T09:14:29.848Z | call_stream | [0] cancelWithStatus code: 14 details: "No connection established"
2021-02-19T09:14:29.849Z | call_stream | [0] ended with status: code=14 details="No connection established"
2021-02-19T09:14:29.850Z | call_stream | [0] cancelWithStatus code: 1 details: "Cancelled on client"
2021-02-19T09:14:29.850Z | connectivity_state | dns:firestore.googleapis.com:443 CONNECTING -> TRANSIENT_FAILURE
2021-02-19T09:14:32.654Z | channel | dns:firestore.googleapis.com:443 createCall [1] method="/google.firestore.v1.Firestore/BatchGetDocuments", deadline=Fri Feb 19 2021 17:19:32 GMT+0800 (China Standard Time)
2021-02-19T09:14:32.655Z | call_stream | [1] Sending metadata
2021-02-19T09:14:32.656Z | channel | Pick result: TRANSIENT_FAILURE subchannel: undefined status: 14 No connection established
2021-02-19T09:14:32.656Z | call_stream | [1] cancelWithStatus code: 14 details: "No connection established"
2021-02-19T09:14:32.656Z | call_stream | [1] ended with status: code=14 details="No connection established"

These debug messages, uh, seem to just say "No connection established" because of a timeout?

@ryanccn
Copy link
Author

ryanccn commented Feb 19, 2021

For one thing, I'm on a personal MacBook Pro so my problem can't be because of corporate malware.

@zoheiry
Copy link

zoheiry commented Feb 22, 2021

I am also experiencing the same issue, any requests to firebase through proxy, hand indefinitely then timeout after 60000ms

@lnoir
Copy link

lnoir commented Feb 22, 2021

Just in case it's useful to anyone that comes across this issue, I also had the problem with the connection timing out. After turning on additional logging, as recommended by @schmidt-sebastian, I saw firebase-admin was trying to connect to the host set in the FIRESTORE_EMULATOR_HOST environment variable. Removing it (or setting it to an empty string) fixed it.

@ryanccn
Copy link
Author

ryanccn commented Feb 23, 2021

Looks like lots of people are having this problem but with different reasons 😅

@schmidt-sebastian
Copy link
Contributor

@zoheiry If you are experiencing connection problems with the Web SDK (in a browser), take a look here: firebase/firebase-js-sdk#1674

I am pointing at this because the 60s timeout+Proxy problem used to be very common on our other platforms. It is not applicable to the firebase-admin-node SDK however.

@ryanccn
Copy link
Author

ryanccn commented Feb 27, 2021

After switching to using Shadowsocks there's now a different error:

{
  "code":"ERR_INVALID_PROTOCOL",
  "details":"Getting metadata from plugin failed with error: Protocol \"https:\" not supported. Expected \"http:\"",
  "metadata":{"internalRepr":{},"options":{}}
}

Here's the snippet in my .zshrc that configures the proxies:

export https_proxy=http://127.0.0.1:7890
export http_proxy=http://127.0.0.1:7890
export all_proxy=socks5://127.0.0.1:7890

@Ymirke
Copy link

Ymirke commented Mar 1, 2021

I had the same issue. docRef.set() and .get() methods timing out and getting error 14 messages.

Firebase emulator was setup with: export FIRESTORE_EMULATOR_HOST="localhost:8000" in cli initially.

Using extra logs - thank you @schmidt-sebastian!

GRPC_VERBOSITY=DEBUG
GRPC_TRACE=all

I saw that admin sdk was trying to access port 8000 without emulators running.

Removing it in CLI resolved it. Thanks @lnoir


Setting FIRESTORE_EMULATOR_HOST is currently the recommended (only?) way in the firebase admin sdk docs to enable the emulator. This can cause issues if you run tests with the emulator and set the ENVAR using CLI and don't remember to remove it. - It is fixed by setting FIRESTORE_EMULATOR_HOST programmatically in a beforeAll and removing it in an afterAll

Maybe having .useEmulator() in the admin sdk would be a better dev experience:

export const db =
  process.env.NODE_ENV === 'production' ? admin.firestore() : admin.firestore().useEmulator(port);

At the same time everything is working as it is intended as far as I can tell.

@hiranya911
Copy link
Contributor

Personally I'm not a fan of having to add an environment check to application code. I think what might be useful is if the Firestore client just logged a message out whenever it's connecting to the local emulator. That way there's a clear feedback of what's going on, in case somebody forgot to clear their env variables.

@elucidsoft
Copy link

I have identified the issue causing this, and its been introduced in the past year or so, I do not know at what point. But basically what's happening is it seems admin is spinning up a background task, and runs with that context. When your app shutsdown, that background task can sometimes hang around, so on the next time initializeApp is called it appears to work fine but any method calls will just hang forever. I have played around with app().delete() which does fix the problem but then you get even more strange behavior. Because even though app().delete() was called by the previous instance on app shutdown, and an entirely new instance is spun up and a new initializeApp() is called, the new instance will sometimes throw an exception saying "The ["DEFAULT"] app has been deleted.". I think the only way to handle this is to create instances of the app with a unique id each time, and then dispose of those that way you don't encounter this race condition on the DEFAULT app.

@inspire-lts
Copy link

@ryanccn hello, do you fixed this problem? I also used proxy,I'm in china.

@ryanccn
Copy link
Author

ryanccn commented Dec 14, 2021

@ryanccn hello, do you fixed this problem? I also used proxy,I'm in china.

No... I've stopped using Firebase. But I would certainly like to see this problem solved 😇

@simbathesailor
Copy link

Yes looks like it is some issue with corporate vpn or any other software companies have installed. Worked on personal laptop

@ryanccn
Copy link
Author

ryanccn commented Mar 18, 2022

Using TUN can solve the problem.

@ryanccn ryanccn closed this as completed Mar 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests