[GCP] [Firestore] Instances metadata (database id/location) #8620

Open
Tracked by #7555
gpop63 opened this issue Nov 30, 2023 · 3 comments

@gpop63
Contributor

gpop63 commented Nov 30, 2023

While working on enabling TSDB for GCP metrics data streams, @agithomas found that it's impossible to uniquely identify the Firestore database that produced metrics.

Each GCP Firestore document has labels, but they're generic and don't specify a particular Firestore database.

"labels": {
  "metrics": {
    "module": "__unknown__",
    "type": "QUERY",
    "version": "__unknown__"
  }
}

"labels": {
  "metrics": {
    "op": "CREATE",
    "module": "__unknown__",
    "version": "__unknown__"
  }
}

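If a user has several Firestore databases and performs the same action (QUERY/LOOKUP/CREATE) on two or more of them at the same time, the resulting documents carry identical labels and timestamps. With TSDB enabled, such documents are treated as duplicates, which can result in lost documents.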
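The collision can be sketched in a few lines of Python. This is only a simplified model, assuming a document's identity is its timestamp plus a set of dimension fields; the helper names, the flattened field paths, and the second count value are made up for illustration and are not taken from the integration.

```python
from collections import defaultdict


def dimension_key(doc, dimension_fields):
    """Build a hashable identity from the timestamp plus the chosen dimension fields."""
    return (doc["@timestamp"],) + tuple(doc.get(f) for f in dimension_fields)


def find_collisions(docs, dimension_fields):
    """Group documents by identity and keep only the groups with more than one document."""
    groups = defaultdict(list)
    for doc in docs:
        groups[dimension_key(doc, dimension_fields)].append(doc)
    return {key: group for key, group in groups.items() if len(group) > 1}


# Two CREATE measurements from two different databases in the same minute.
# With the current metrics, nothing in either document names its database.
# The count of 7 is a hypothetical value for the second database.
docs = [
    {"@timestamp": "2023-11-30T16:15:00.000Z",
     "gcp.labels.metrics.op": "CREATE",
     "gcp.labels.metrics.module": "__unknown__",
     "gcp.labels.metrics.version": "__unknown__",
     "gcp.firestore.document.write.count": 21},
    {"@timestamp": "2023-11-30T16:15:00.000Z",
     "gcp.labels.metrics.op": "CREATE",
     "gcp.labels.metrics.module": "__unknown__",
     "gcp.labels.metrics.version": "__unknown__",
     "gcp.firestore.document.write.count": 7},
]

# Assumed dimension set: only the generic metric labels are available.
DIMENSIONS = [
    "gcp.labels.metrics.op",
    "gcp.labels.metrics.module",
    "gcp.labels.metrics.version",
]

collisions = find_collisions(docs, DIMENSIONS)
print(f"{len(collisions)} identity shared by {len(docs)} documents")
```

Both documents map to the same identity even though their counts differ, so only one of them could be kept.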

The metrics we collect in the current configuration don't include database metadata.

To solve this, we need to add more metadata about the Firestore databases in the documents, like database id and location.

I see two potential solutions:

  1. Modify the agent configuration for Firestore to gather different metrics that contain these labels. This can be done by:

    • Replacing document/delete_count with document/delete_ops_count
    • Replacing document/read_count with document/read_ops_count
    • Replacing document/write_count with document/write_ops_count

    This approach would resolve the issue because the new metrics include database metadata. The downside is that they are currently in BETA, since having several Firestore databases under one project seems to be a new feature.

  2. Implement additional code in beats to retrieve metadata about Firestore databases (might not be possible).

Test with first approach

Changes I made to test this:

diff --git a/packages/gcp/data_stream/firestore/agent/stream/stream.yml.hbs b/packages/gcp/data_stream/firestore/agent/stream/stream.yml.hbs
index eaf3f9ab68..644dd1889e 100644
--- a/packages/gcp/data_stream/firestore/agent/stream/stream.yml.hbs
+++ b/packages/gcp/data_stream/firestore/agent/stream/stream.yml.hbs
@@ -23,6 +23,6 @@ exclude_labels: {{exclude_labels}}
 metrics:
   - service: firestore
     metric_types:
-        - "document/delete_count"
-        - "document/read_count"
-        - "document/write_count"
\ No newline at end of file
+        - "document/delete_ops_count"
+        - "document/read_ops_count"
+        - "document/write_ops_count"
\ No newline at end of file
diff --git a/packages/gcp/data_stream/firestore/elasticsearch/ingest_pipeline/default.yml b/packages/gcp/data_stream/firestore/elasticsearch/ingest_pipeline/default.yml
index c79d82a622..fa779926c0 100644
--- a/packages/gcp/data_stream/firestore/elasticsearch/ingest_pipeline/default.yml
+++ b/packages/gcp/data_stream/firestore/elasticsearch/ingest_pipeline/default.yml
@@ -2,15 +2,15 @@
 description: Pipeline for parsing GCP Firestore metrics.
 processors:
   - rename:
-      field: gcp.metrics.document.delete.count
+      field: gcp.metrics.document.delete_ops_count.value
       target_field: gcp.firestore.document.delete.count
       ignore_missing: true
   - rename:
-      field: gcp.metrics.document.read.count
+      field: gcp.metrics.document.read_ops_count.value
       target_field: gcp.firestore.document.read.count
       ignore_missing: true
   - rename:
-      field: gcp.metrics.document.write.count
+      field: gcp.metrics.document.write_ops_count.value
       target_field: gcp.firestore.document.write.count
       ignore_missing: true
   - remove:
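
As a rough illustration of what the changed rename processors do to an event, here is a small Python emulation. It is a sketch under assumptions, not Elasticsearch's implementation: the `rename` helper is hypothetical, the event is assumed to be a nested dict, and empty parent objects left behind by the move are not pruned.

```python
def rename(doc, field, target_field, ignore_missing=False):
    """Move the value at dotted path `field` to dotted path `target_field`."""
    *parents, leaf = field.split(".")
    node = doc
    for part in parents:
        node = node.get(part) if isinstance(node, dict) else None
        if node is None:
            break
    if not isinstance(node, dict) or leaf not in node:
        if ignore_missing:
            return doc  # same effect as ignore_missing: true, the event is left untouched
        raise KeyError(field)
    value = node.pop(leaf)

    # Create intermediate objects on the way to the target path.
    *target_parents, target_leaf = target_field.split(".")
    target = doc
    for part in target_parents:
        target = target.setdefault(part, {})
    target[target_leaf] = value
    return doc


# Assumed shape of an incoming event that only carries the write metric.
event = {"gcp": {"metrics": {"document": {"write_ops_count": {"value": 21}}}}}

rename(event, "gcp.metrics.document.write_ops_count.value",
       "gcp.firestore.document.write.count", ignore_missing=True)
# The read metric is absent from this event, so this call is a no-op.
rename(event, "gcp.metrics.document.read_ops_count.value",
       "gcp.firestore.document.read.count", ignore_missing=True)

print(event["gcp"]["firestore"])
```

Because every processor sets `ignore_missing`, an event that carries only one of the three metrics passes through the other two renames unchanged.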

Document sample:

{
        "_index": ".ds-metrics-gcp.firestore-default-2023.11.30-000001",
        "_id": "1o8HIYwBc7GYHoqGhPZM",
        "_score": 1,
        "_source": {
          "cloud": {
            "provider": "gcp",
            "account": {
              "name": "robust-catalyst-399814",
              "id": "robust-catalyst-399814"
            }
          },
          "agent": {
            "name": "d55861746c32",
            "id": "f32d3e98-fbd0-47d8-af95-b53fd748551d",
            "type": "metricbeat",
            "ephemeral_id": "1f46ddde-367e-42d5-bf7e-584961d71c4a",
            "version": "8.12.0"
          },
          "@timestamp": "2023-11-30T16:15:00.000Z",
          "ecs": {
            "version": "8.0.0"
          },
          "gcp": {
            "firestore": {
              "document": {
                "write": {
                  "count": 21
                }
              }
            },
            "labels": {
              "resource": {
                "database_id": "(default)",
                "location": "eur3"
              },
              "metrics": {
                "op": "CREATE"
              }
            }
          },
          "service": {
            "type": "gcp"
          },
          "data_stream": {
            "namespace": "default",
            "type": "metrics",
            "dataset": "gcp.firestore"
          },
          "elastic_agent": {
            "id": "f32d3e98-fbd0-47d8-af95-b53fd748551d",
            "version": "8.12.0",
            "snapshot": true
          },
          "host": {
            "hostname": "d55861746c32",
            "os": {
              "kernel": "5.10.102.1-microsoft-standard-WSL2",
              "codename": "focal",
              "name": "Ubuntu",
              "type": "linux",
              "family": "debian",
              "version": "20.04.6 LTS (Focal Fossa)",
              "platform": "ubuntu"
            },
            "containerized": true,
            "ip": [
              "172.24.0.4"
            ],
            "name": "d55861746c32",
            "mac": [
              "02-42-AC-18-00-04"
            ],
            "architecture": "x86_64"
          },
          "metricset": {
            "period": 60000,
            "name": "metrics"
          },
          "event": {
            "duration": 554636949,
            "agent_id_status": "verified",
            "ingested": "2023-11-30T16:19:49Z",
            "module": "gcp",
            "dataset": "gcp.firestore"
          }
        }
      }
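
To make the difference concrete, the labels in the sample above can be flattened into dotted field paths, which is the form in which they would be considered as dimensions. The snippet is a minimal sketch: `flatten` is a hypothetical helper, and the second database id is invented for the comparison.

```python
def flatten(node, prefix=""):
    """Flatten a nested dict into a {dotted.path: value} mapping."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# The gcp.labels object copied from the document sample.
sample_labels = {
    "resource": {"database_id": "(default)", "location": "eur3"},
    "metrics": {"op": "CREATE"},
}

labels = flatten(sample_labels, "gcp.labels")

# Hypothetical second database in the same project performing the same operation.
other = dict(labels)
other["gcp.labels.resource.database_id"] = "second-db"

for path, value in sorted(labels.items()):
    print(path, "=", value)
print("distinguishable:", labels != other)
```

With `database_id` and `location` present, two databases performing the same operation in the same period no longer produce identical label sets.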

@agithomas @lalit-satapathy

@agithomas
Contributor

With change 1, does this still support the stable version of the metrics that is currently available in GCP?

@gpop63
Contributor Author

gpop63 commented Dec 11, 2023

@agithomas Currently, the new metrics are in BETA. Typically, we don't include BETA metrics in GA data streams, and I'm not sure when exceptions are acceptable. We might have to wait for them to become GA. However, since having multiple Firestore databases under a GCP project is a new feature, that might take some time.

@gpop63 gpop63 self-assigned this Dec 13, 2023
@lalit-satapathy
Collaborator

This issue is blocked and not planned for now. Details from @gpop63:

We are blocked until the metrics on the GCP side become GA. We can't enable TSDB right now because we would risk losing some documents.
