[Firestore] #3161

@t0mstah

[READ] Step 1: Are you in the right place?

I believe this belongs in nodejs-firestore because the issue is specifically with Firestore Admin SDK document reads,
and firebase-admin is using @google-cloud/firestore underneath.

[REQUIRED] Step 2: Describe your environment

* Operating System version: macOS 26.3.1
* Firebase SDK version: `firebase-admin@13.10.0`
  * `@google-cloud/firestore@7.11.6`
  * `@grpc/grpc-js@1.14.4`
  * `firebase-functions@7.2.5`
  * `firebase-tools@15.18.0`
* Firebase Product: Firestore
* Node.js version: v22.22.2
* NPM version: 10.9.7

Runtime environment: Firebase Cloud Functions Gen 2 / Cloud Run, Node.js 22.

[REQUIRED] Step 3: Describe the problem

Steps to reproduce:

I am seeing intermittent, very slow Firestore Admin SDK reads in public Firebase Gen 2 onRequest functions.

The function handles high-volume redirect/callback traffic. Inside the handler, it performs a simple Firestore document
read:

await admin.firestore().collection('users').doc(userId).get()

Most reads are normal, usually tens of milliseconds. However, a small percentage intermittently take multiple seconds,
sometimes 10s+, and I recently observed one simple users/{userId}.get() taking ~23s.

The slow reads do not appear to be explained by application-level document contention or document size:

- The slow reads are not concentrated on a single userId.
- The slow user documents are normal-sized, roughly 4KB-14KB in the samples checked.
- A sample of users with fast reads had similar or larger documents, with a p95 size around 20KB.
- Firestore audit logs did not show same-document writes during the slow read windows.
- CPU and memory on Cloud Run are not saturated.
- Cloud Run request concurrency is low.
- The same collection and similar reads from onCall functions appear to return normally.
- The issue seems concentrated in high-traffic public Gen 2 onRequest functions.
- Replacing the Cloud Run instances with a `gcloud run services update` that changes no code clears the issue temporarily.

This makes it look like some warm instances may get into a degraded Firestore client/channel state where individual
reads stall for seconds, but the Node process remains otherwise healthy.
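One way to check the "some warm instances" hypothesis would be to tag every timing log with a per-process identifier, so slow reads can be grouped by instance. The following is a minimal sketch, not part of the production code: `instanceTag`, `instanceStartedAtMs`, and `buildReadTiming` are hypothetical names. It assumes the function runs on Cloud Run, where the `K_REVISION` environment variable is set.

```typescript
import { randomUUID } from 'node:crypto'

// Generated once at module load, so every log line emitted by the same
// warm instance carries the same tag.
const instanceTag: string = randomUUID()
const instanceStartedAtMs: number = Date.now()

interface ReadTiming {
  message: string
  instanceTag: string
  instanceAgeMs: number
  revision: string | undefined
  userGetMs: number
}

// Hypothetical helper: builds the structured log entry for one timed read.
function buildReadTiming(userGetMs: number, nowMs: number = Date.now()): ReadTiming {
  return {
    message: 'firestoreUserGetTiming',
    instanceTag,
    instanceAgeMs: nowMs - instanceStartedAtMs,
    revision: process.env.K_REVISION, // set by Cloud Run; undefined elsewhere
    userGetMs
  }
}
```

If the slow reads cluster on a few `instanceTag` values with a high `instanceAgeMs`, that would support the degraded-instance theory. An even spread across tags would point elsewhere.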

Example timing from production instrumentation:

{
  "operation": "users/{userId}.get()",
  "userGetMs": 23220,
  "documentSizeBytes": 10278,
  "sameDocumentWritesDuringRead": 0
}

Other slow examples in the same window included userGetMs around 3s-6s, while p50/p95 stayed normal.
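To illustrate why p50/p95 can stay normal while individual reads stall, here is a small nearest-rank percentile sketch. The `percentile` helper and the sample values are hypothetical. Only the 23220 ms outlier is taken from the timing above.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error('no samples')
  if (p <= 0 || p > 100) throw new RangeError('p must be in (0, 100]')
  const sorted = [...samplesMs].sort((a, b) => a - b)
  const rank = Math.ceil((p * sorted.length) / 100)
  return sorted[rank - 1]
}

// 19 illustrative fast reads (20..38 ms) plus the one 23220 ms stall.
const samplesMs: number[] = [
  ...Array.from({ length: 19 }, (_, i) => 20 + i),
  23220
]

const p50 = percentile(samplesMs, 50) // 29
const p95 = percentile(samplesMs, 95) // 38
const max = percentile(samplesMs, 100) // 23220
```

With one stalled read in twenty, p95 still lands on a fast sample, so the stall only shows up in p99 or max.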

Expected behavior:

A simple doc().get() for a small Firestore document should not intermittently hang for 5s-20s+ on otherwise healthy warm
Cloud Run / Gen 2 function instances. If the Firestore client/channel becomes unhealthy, I would expect it to recover,
fail clearly, or expose an actionable error rather than continuing to serve rare but very slow reads until the instance
is replaced.
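Until the client fails clearly on its own, an application-level deadline is one possible workaround for turning a stalled read into an explicit error. This is a sketch under assumptions: `withDeadline` is a hypothetical helper, not an SDK API, and it only stops waiting for the promise. It does not cancel the underlying RPC.

```typescript
// Races a promise against a timer and rejects with a named error if the
// timer fires first. The timer is always cleared once the race settles.
function withDeadline<T>(work: Promise<T>, deadlineMs: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      const err = new Error(`${label} exceeded ${deadlineMs} ms`)
      err.name = 'DeadlineExceededError'
      reject(err)
    }, deadlineMs)
  })
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer))
}
```

In the handler this could wrap the read, for example `withDeadline(admin.firestore().collection('users').doc(userId).get(), 2000, 'users.get')`, where the 2000 ms budget is an arbitrary illustrative value. Catching `DeadlineExceededError` gives a point to log or retry.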

Actual behavior:

Most reads are fast, but a small long tail of reads stalls for seconds. Restarting/replacing Cloud Run instances
temporarily resolves it.

#### Relevant Code:

import * as admin from 'firebase-admin'
import { onRequest } from 'firebase-functions/v2/https'
import { performance } from 'perf_hooks'

admin.initializeApp()

export const routerRedirect = onRequest(async (req, res) => {
  const userId: string = String(req.query.userId)

  const userGetStartMs: number = performance.now()
  const userDoc: admin.firestore.DocumentSnapshot = await admin
    .firestore()
    .collection('users')
    .doc(userId)
    .get()
  const userGetMs: number = performance.now() - userGetStartMs

  console.log({
    message: 'firestoreUserGetTiming',
    userId,
    userGetMs,
    exists: userDoc.exists
  })

  res.status(200).send('ok')
})
