[GCP] Add dimensions for metrics data streams #8314

Merged
merged 17 commits into from Dec 6, 2023

@gpop63 gpop63 commented Oct 26, 2023

Overview

This change adds dimension mappings, new fields, and fixes, and changes the type of the gcp field to group across all metrics data streams.

Without group as the type, Elasticsearch did not recognize gcp as an object containing subfields, so those subfields were left out of the mapping.

The period of the storage data stream had to be changed from 60s to 5m because some of its metrics have a 5m sampling period. With a shorter period the same sample is fetched more than once (GCP generates the metric every 5 minutes, but we fetch it every minute), which produces duplicated documents. These would get dropped, because they share the same @timestamp and metric values and differ only in collection-side values such as event.ingested.
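The effect of the period mismatch can be sketched with a small simulation. This is a simplified model, not the Metricbeat implementation: it assumes each fetch returns the latest sample, whose timestamp is the fetch time rounded down to the sampling period. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def sample_timestamp(fetch_time, sampling_period):
    """Timestamp of the latest sample at fetch_time (assumption: aligned down to the sampling period)."""
    periods = (fetch_time - EPOCH) // sampling_period  # whole sampling periods elapsed
    return EPOCH + periods * sampling_period


def count_duplicates(collection_period, sampling_period, fetches):
    """Count fetches that return a sample timestamp already seen by an earlier fetch."""
    start = datetime(2023, 10, 17, 17, 30, tzinfo=timezone.utc)
    seen = set()
    duplicates = 0
    for i in range(fetches):
        ts = sample_timestamp(start + i * collection_period, sampling_period)
        if ts in seen:
            duplicates += 1
        seen.add(ts)
    return duplicates


# Fetching every 60s a metric that is sampled every 5m: most fetches repeat a sample.
print(count_duplicates(timedelta(seconds=60), timedelta(minutes=5), 10))
# Fetching every 5m: each fetch sees a new sample.
print(count_duplicates(timedelta(minutes=5), timedelta(minutes=5), 10))
```

Under this model, matching the collection period to the sampling period removes the repeated samples, which is the motivation for the 5m period.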

Duplicated storage documents example:

Document 1

{
    "_index": ".ds-metrics-gcp.storage-default-2023.10.17-000001",
    "_id": "Cou9PosBY5p1KHvoLiu7",
    "_score": null,
    "_source": {
        "cloud": {
            "provider": "gcp",
            "account": {
                "name": "elastic-obs-integrations-dev",
                "id": "elastic-obs-integrations-dev"
            }
        },
        "agent": {
            "name": "5506d89e26e7",
            "id": "64d8d07b-1695-4b23-ac74-b10969e07a9c",
            "type": "metricbeat",
            "ephemeral_id": "75afe95d-f1bb-4154-8bfc-50ca3f0b0da0",
            "version": "8.10.4"
        },
        "@timestamp": "2023-10-17T17:30:00.000Z",
        "ecs": {
            "version": "8.0.0"
        },
        "gcp": {
            "storage": {
                "storage": {
                    "total": {
                        "bytes": 2584690366.0
                    },
                    "total_byte_seconds": {
                        "bytes": 223317247622144.0
                    },
                    "object": {
                        "count": 84
                    }
                }
            },
            "labels_hash": "3aOy2p1nLq3mU9OOBkDbjTJYc6u02W8+sWGHVMSTODE=",
            "labels": {
                "resource": {
                    "bucket_name": "elastic-obs-integrations-dev-giz-2-packages-99412",
                    "location": "us"
                },
                "metrics": {
                    "storage_class": "MULTI_REGIONAL"
                }
            }
        },
        "data_stream": {
            "namespace": "default",
            "type": "metrics",
            "dataset": "gcp.storage"
        },
        "service": {
            "type": "gcp"
        },
        "elastic_agent": {
            "id": "64d8d07b-1695-4b23-ac74-b10969e07a9c",
            "version": "8.10.4",
            "snapshot": true
        },
        "host": {
            "hostname": "5506d89e26e7",
            "os": {
                "kernel": "5.10.102.1-microsoft-standard-WSL2",
                "codename": "focal",
                "name": "Ubuntu",
                "type": "linux",
                "family": "debian",
                "version": "20.04.6 LTS (Focal Fossa)",
                "platform": "ubuntu"
            },
            "containerized": true,
            "ip": [
                "172.21.0.4"
            ],
            "name": "5506d89e26e7",
            "mac": [
                "02-42-AC-15-00-04"
            ],
            "architecture": "aarch64"
        },
        "metricset": {
            "period": 60000,
            "name": "metrics"
        },
        "event": {
            "duration": 873564254,
            "agent_id_status": "verified",
            "ingested": "2023-10-17T17:44:27Z",
            "module": "gcp",
            "dataset": "gcp.storage",
            "metric_names_hash": "9eE4YKXcP7hKZqYPVrTCDmySLL4jTymXE0bPcxnLzrs="
        }
    },
    "sort": [
        1697563800000
    ]
}

Document 2

{
    "_index": ".ds-metrics-gcp.storage-default-2023.10.17-000001",
    "_id": "iou8PosBY5p1KHvoQiH8",
    "_score": null,
    "_source": {
        "cloud": {
            "provider": "gcp",
            "account": {
                "name": "elastic-obs-integrations-dev",
                "id": "elastic-obs-integrations-dev"
            }
        },
        "agent": {
            "name": "5506d89e26e7",
            "id": "64d8d07b-1695-4b23-ac74-b10969e07a9c",
            "type": "metricbeat",
            "ephemeral_id": "75afe95d-f1bb-4154-8bfc-50ca3f0b0da0",
            "version": "8.10.4"
        },
        "@timestamp": "2023-10-17T17:30:00.000Z",
        "ecs": {
            "version": "8.0.0"
        },
        "gcp": {
            "storage": {
                "storage": {
                    "total": {
                        "bytes": 2584690366.0
                    },
                    "total_byte_seconds": {
                        "bytes": 223317247622144.0
                    },
                    "object": {
                        "count": 84
                    }
                }
            },
            "labels_hash": "3aOy2p1nLq3mU9OOBkDbjTJYc6u02W8+sWGHVMSTODE=",
            "labels": {
                "resource": {
                    "bucket_name": "elastic-obs-integrations-dev-giz-2-packages-99412",
                    "location": "us"
                },
                "metrics": {
                    "storage_class": "MULTI_REGIONAL"
                }
            }
        },
        "data_stream": {
            "namespace": "default",
            "type": "metrics",
            "dataset": "gcp.storage"
        },
        "service": {
            "type": "gcp"
        },
        "elastic_agent": {
            "id": "64d8d07b-1695-4b23-ac74-b10969e07a9c",
            "version": "8.10.4",
            "snapshot": true
        },
        "host": {
            "hostname": "5506d89e26e7",
            "os": {
                "kernel": "5.10.102.1-microsoft-standard-WSL2",
                "codename": "focal",
                "name": "Ubuntu",
                "type": "linux",
                "family": "debian",
                "version": "20.04.6 LTS (Focal Fossa)",
                "platform": "ubuntu"
            },
            "containerized": true,
            "ip": [
                "172.21.0.4"
            ],
            "name": "5506d89e26e7",
            "mac": [
                "02-42-AC-15-00-04"
            ],
            "architecture": "aarch64"
        },
        "metricset": {
            "period": 60000,
            "name": "metrics"
        },
        "event": {
            "duration": 547719500,
            "agent_id_status": "verified",
            "ingested": "2023-10-17T17:43:27Z",
            "module": "gcp",
            "dataset": "gcp.storage",
            "metric_names_hash": "9eE4YKXcP7hKZqYPVrTCDmySLL4jTymXE0bPcxnLzrs="
        }
    },
    "sort": [
        1697563800000
    ]
}
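To make the duplication concrete, here is a minimal sketch that compares two documents on @timestamp plus the dimension fields only. The dimension list is copied from the TSDB toolkit output further down in this description (which is marked outdated), and the helper names are hypothetical; this is not how Elasticsearch itself computes the time series ID.

```python
DIMENSIONS = [
    "cloud.account.id",
    "cloud.account.name",
    "cloud.availability_zone",
    "cloud.instance.id",
    "cloud.machine.type",
    "cloud.region",
    "event.metric_names_hash",
    "gcp.labels_hash",
]


def get_path(doc, dotted):
    """Resolve a dotted field name in a nested dict; return None if any part is missing."""
    node = doc
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def series_key(source):
    """Tuple of @timestamp and all dimension values for one document's _source."""
    return (source.get("@timestamp"),) + tuple(get_path(source, d) for d in DIMENSIONS)


def is_duplicate(a, b):
    """Two documents collide when timestamp and every dimension value match."""
    return series_key(a) == series_key(b)


# Abbreviated versions of Document 1 and Document 2 above.
doc1 = {
    "@timestamp": "2023-10-17T17:30:00.000Z",
    "cloud": {"account": {"id": "elastic-obs-integrations-dev", "name": "elastic-obs-integrations-dev"}},
    "gcp": {"labels_hash": "3aOy2p1nLq3mU9OOBkDbjTJYc6u02W8+sWGHVMSTODE="},
    "event": {
        "ingested": "2023-10-17T17:44:27Z",
        "metric_names_hash": "9eE4YKXcP7hKZqYPVrTCDmySLL4jTymXE0bPcxnLzrs=",
    },
}
doc2 = {
    "@timestamp": "2023-10-17T17:30:00.000Z",
    "cloud": {"account": {"id": "elastic-obs-integrations-dev", "name": "elastic-obs-integrations-dev"}},
    "gcp": {"labels_hash": "3aOy2p1nLq3mU9OOBkDbjTJYc6u02W8+sWGHVMSTODE="},
    "event": {
        "ingested": "2023-10-17T17:43:27Z",
        "metric_names_hash": "9eE4YKXcP7hKZqYPVrTCDmySLL4jTymXE0bPcxnLzrs=",
    },
}

print(is_duplicate(doc1, doc2))
```

The two documents compare as duplicates even though event.ingested differs, because event.ingested is not part of the key.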

Dimensions and fields added:

Data streams updated

  • cloudrun_metrics
  • cloudsql_mysql
  • cloudsql_postgresql
  • cloudsql_sqlserver
  • compute
  • dataproc
  • firestore
  • gke
  • loadbalancing_metrics
  • pubsub
  • redis
  • storage

ECS fields:

  • cloud.account.id
  • cloud.account.name
  • cloud.availability_zone
  • cloud.instance.id
  • cloud.machine.type
  • cloud.region

Package fields:

  • gcp.metric_names_fingerprint
    • Added as a dimension so that each document can be uniquely identified, even when documents from different batches share the same timestamp. The hashing is done on the Beats side.
  • gcp.labels_fingerprint
    • A hash of the gcp.labels field, created with the fingerprint processor.
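The idea behind a labels fingerprint can be illustrated with a short sketch. It assumes SHA-256 over a canonical JSON serialization, encoded as Base64; the fingerprint processor and Beats use their own serialization, so this will not reproduce the hash values shown in the documents above. The function name is hypothetical.

```python
import base64
import hashlib
import json


def labels_fingerprint(labels):
    """Hash a labels object so that equal label sets map to the same short value."""
    # Sorting keys makes the result independent of key order in the input.
    canonical = json.dumps(labels, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


labels_a = {
    "resource": {"bucket_name": "elastic-obs-integrations-dev-giz-2-packages-99412", "location": "us"},
    "metrics": {"storage_class": "MULTI_REGIONAL"},
}
# Same labels, different key order.
labels_b = {
    "metrics": {"storage_class": "MULTI_REGIONAL"},
    "resource": {"location": "us", "bucket_name": "elastic-obs-integrations-dev-giz-2-packages-99412"},
}

print(labels_fingerprint(labels_a) == labels_fingerprint(labels_b))
```

Using one fixed-length hash as the dimension, rather than every individual label, keeps the number of dimension fields constant regardless of how many labels a metric carries.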

TSDB toolkit tests (outdated)

pubsub

Testing data stream metrics-gcp.pubsub-default.
Index being used for the documents is .ds-metrics-gcp.pubsub-default-2023.10.23-000001.
Index being used for the settings and mappings is .ds-metrics-gcp.pubsub-default-2023.10.23-000001.

The time series fields for the TSDB index are: 
        - dimension (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash
        - gauge (46 fields):
                - gcp.pubsub.snapshot.backlog.bytes
                - gcp.pubsub.snapshot.backlog_bytes_by_region.bytes
                - gcp.pubsub.snapshot.config_updates.count
                - gcp.pubsub.snapshot.num_messages.value
                - gcp.pubsub.snapshot.num_messages_by_region.value
                - gcp.pubsub.snapshot.oldest_message_age.sec
                - gcp.pubsub.snapshot.oldest_message_age_by_region.sec
                - gcp.pubsub.subscription.ack_message.count
                - gcp.pubsub.subscription.backlog.bytes
                - gcp.pubsub.subscription.byte_cost.bytes
                - gcp.pubsub.subscription.config_updates.count
                - gcp.pubsub.subscription.dead_letter_message.count
                - gcp.pubsub.subscription.mod_ack_deadline_message.count
                - gcp.pubsub.subscription.mod_ack_deadline_message_operation.count
                - gcp.pubsub.subscription.mod_ack_deadline_request.count
                - gcp.pubsub.subscription.num_outstanding_messages.value
                - gcp.pubsub.subscription.num_undelivered_messages.value
                - gcp.pubsub.subscription.oldest_retained_acked_message_age.sec
                - gcp.pubsub.subscription.oldest_retained_acked_message_age_by_region.value
                - gcp.pubsub.subscription.oldest_unacked_message_age.sec
                - gcp.pubsub.subscription.oldest_unacked_message_age_by_region.value
                - gcp.pubsub.subscription.pull_ack_message_operation.count
                - gcp.pubsub.subscription.pull_ack_request.count
                - gcp.pubsub.subscription.pull_message_operation.count
                - gcp.pubsub.subscription.pull_request.count
                - gcp.pubsub.subscription.push_request.count
                - gcp.pubsub.subscription.retained_acked.bytes
                - gcp.pubsub.subscription.retained_acked_bytes_by_region.bytes
                - gcp.pubsub.subscription.seek_request.count
                - gcp.pubsub.subscription.sent_message.count
                - gcp.pubsub.subscription.streaming_pull_ack_message_operation.count
                - gcp.pubsub.subscription.streaming_pull_ack_request.count
                - gcp.pubsub.subscription.streaming_pull_message_operation.count
                - gcp.pubsub.subscription.streaming_pull_mod_ack_deadline_message_operation.count
                - gcp.pubsub.subscription.streaming_pull_mod_ack_deadline_request.count
                - gcp.pubsub.subscription.streaming_pull_response.count
                - gcp.pubsub.subscription.unacked_bytes_by_region.bytes
                - gcp.pubsub.topic.byte_cost.bytes
                - gcp.pubsub.topic.config_updates.count
                - gcp.pubsub.topic.oldest_retained_acked_message_age_by_region.value
                - gcp.pubsub.topic.oldest_unacked_message_age_by_region.value
                - gcp.pubsub.topic.retained_acked_bytes_by_region.bytes
                - gcp.pubsub.topic.send_message_operation.count
                - gcp.pubsub.topic.send_request.count
                - gcp.pubsub.topic.streaming_pull_response.count
                - gcp.pubsub.topic.unacked_bytes_by_region.bytes
        - routing_path (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash

Index tsdb-index-enabled successfully created.

Copying documents from .ds-metrics-gcp.pubsub-default-2023.10.23-000001 to tsdb-index-enabled...
All 2415 documents taken from index .ds-metrics-gcp.pubsub-default-2023.10.23-000001 were successfully placed to index tsdb-index-enabled.

gke

Testing data stream metrics-gcp.gke-default.
Index being used for the documents is .ds-metrics-gcp.gke-default-2023.10.23-000001.
Index being used for the settings and mappings is .ds-metrics-gcp.gke-default-2023.10.23-000001.

The time series fields for the TSDB index are: 
        - dimension (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash
        - counter (9 fields):
                - gcp.gke.container.cpu.core_usage_time.sec
                - gcp.gke.container.memory.page_fault.count
                - gcp.gke.container.restart.count
                - gcp.gke.node.cpu.core_usage_time.sec
                - gcp.gke.node.network.received_bytes.count
                - gcp.gke.node.network.sent_bytes.count
                - gcp.gke.node_daemon.cpu.core_usage_time.sec
                - gcp.gke.pod.network.received.bytes
                - gcp.gke.pod.network.sent.bytes
        - gauge (31 fields):
                - gcp.gke.container.cpu.limit_cores.value
                - gcp.gke.container.cpu.limit_utilization.pct
                - gcp.gke.container.cpu.request_cores.value
                - gcp.gke.container.cpu.request_utilization.pct
                - gcp.gke.container.ephemeral_storage.limit.bytes
                - gcp.gke.container.ephemeral_storage.request.bytes
                - gcp.gke.container.ephemeral_storage.used.bytes
                - gcp.gke.container.memory.limit.bytes
                - gcp.gke.container.memory.limit_utilization.pct
                - gcp.gke.container.memory.request.bytes
                - gcp.gke.container.memory.request_utilization.pct
                - gcp.gke.container.memory.used.bytes
                - gcp.gke.container.uptime.sec
                - gcp.gke.node.cpu.allocatable_cores.value
                - gcp.gke.node.cpu.allocatable_utilization.pct
                - gcp.gke.node.cpu.total_cores.value
                - gcp.gke.node.ephemeral_storage.allocatable.bytes
                - gcp.gke.node.ephemeral_storage.inodes_free.value
                - gcp.gke.node.ephemeral_storage.inodes_total.value
                - gcp.gke.node.ephemeral_storage.total.bytes
                - gcp.gke.node.ephemeral_storage.used.bytes
                - gcp.gke.node.memory.allocatable.bytes
                - gcp.gke.node.memory.allocatable_utilization.pct
                - gcp.gke.node.memory.total.bytes
                - gcp.gke.node.memory.used.bytes
                - gcp.gke.node.pid_limit.value
                - gcp.gke.node.pid_used.value
                - gcp.gke.node_daemon.memory.used.bytes
                - gcp.gke.pod.volume.total.bytes
                - gcp.gke.pod.volume.used.bytes
                - gcp.gke.pod.volume.utilization.pct
        - routing_path (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash

Index tsdb-index-enabled successfully created.

Copying documents from .ds-metrics-gcp.gke-default-2023.10.23-000001 to tsdb-index-enabled...
All 15519 documents taken from index .ds-metrics-gcp.gke-default-2023.10.23-000001 were successfully placed to index tsdb-index-enabled.

compute

Testing data stream metrics-gcp.compute-default.
Index being used for the documents is .ds-metrics-gcp.compute-default-2023.10.23-000001.
Index being used for the settings and mappings is .ds-metrics-gcp.compute-default-2023.10.23-000001.

The time series fields for the TSDB index are: 
        - dimension (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash
        - gauge (19 fields):
                - gcp.compute.firewall.dropped.bytes
                - gcp.compute.firewall.dropped_packets_count.value
                - gcp.compute.instance.cpu.reserved_cores.value
                - gcp.compute.instance.cpu.usage.pct
                - gcp.compute.instance.cpu.usage_time.sec
                - gcp.compute.instance.disk.read.bytes
                - gcp.compute.instance.disk.read_ops_count.value
                - gcp.compute.instance.disk.write.bytes
                - gcp.compute.instance.disk.write_ops_count.value
                - gcp.compute.instance.memory.balloon.ram_size.value
                - gcp.compute.instance.memory.balloon.ram_used.value
                - gcp.compute.instance.memory.balloon.swap_in.bytes
                - gcp.compute.instance.memory.balloon.swap_out.bytes
                - gcp.compute.instance.network.egress.bytes
                - gcp.compute.instance.network.egress.packets.count
                - gcp.compute.instance.network.ingress.bytes
                - gcp.compute.instance.network.ingress.packets.count
                - gcp.compute.instance.uptime.sec
                - gcp.compute.instance.uptime_total.sec
        - routing_path (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash

Index tsdb-index-enabled successfully created.

Copying documents from .ds-metrics-gcp.compute-default-2023.10.23-000001 to tsdb-index-enabled...
All 1640 documents taken from index .ds-metrics-gcp.compute-default-2023.10.23-000001 were successfully placed to index tsdb-index-enabled.

redis

Testing data stream metrics-gcp.redis-default.
Index being used for the documents is .ds-metrics-gcp.redis-default-2023.10.23-000001.
Index being used for the settings and mappings is .ds-metrics-gcp.redis-default-2023.10.23-000001.

The time series fields for the TSDB index are: 
        - dimension (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash
        - gauge (31 fields):
                - gcp.redis.clients.blocked.count
                - gcp.redis.clients.connected.count
                - gcp.redis.commands.calls.count
                - gcp.redis.commands.total_time.us
                - gcp.redis.commands.usec_per_call.sec
                - gcp.redis.keyspace.avg_ttl.sec
                - gcp.redis.keyspace.keys.count
                - gcp.redis.keyspace.keys_with_expiration.count
                - gcp.redis.persistence.rdb.bgsave_in_progress
                - gcp.redis.replication.master.slaves.lag.sec
                - gcp.redis.replication.master.slaves.offset.bytes
                - gcp.redis.replication.master_repl_offset.bytes
                - gcp.redis.replication.offset_diff.bytes
                - gcp.redis.replication.role
                - gcp.redis.server.uptime.sec
                - gcp.redis.stats.cache_hit_ratio
                - gcp.redis.stats.connections.total.count
                - gcp.redis.stats.cpu_utilization.sec
                - gcp.redis.stats.evicted_keys.count
                - gcp.redis.stats.expired_keys.count
                - gcp.redis.stats.keyspace_hits.count
                - gcp.redis.stats.keyspace_misses.count
                - gcp.redis.stats.memory.maxmemory.mb
                - gcp.redis.stats.memory.system_memory_overload_duration.us
                - gcp.redis.stats.memory.system_memory_usage_ratio
                - gcp.redis.stats.memory.usage.bytes
                - gcp.redis.stats.memory.usage_ratio
                - gcp.redis.stats.network_traffic.bytes
                - gcp.redis.stats.pubsub.channels.count
                - gcp.redis.stats.pubsub.patterns.count
                - gcp.redis.stats.reject_connections.count
        - routing_path (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash

Index tsdb-index-enabled successfully created.

Copying documents from .ds-metrics-gcp.redis-default-2023.10.23-000001 to tsdb-index-enabled...
All 635 documents taken from index .ds-metrics-gcp.redis-default-2023.10.23-000001 were successfully placed to index tsdb-index-enabled.

cloudrun_metrics

Testing data stream metrics-gcp.cloudrun_metrics-default.
Index being used for the documents is .ds-metrics-gcp.cloudrun_metrics-default-2023.10.23-000001.
Index being used for the settings and mappings is .ds-metrics-gcp.cloudrun_metrics-default-2023.10.23-000001.

The time series fields for the TSDB index are: 
        - dimension (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash
        - gauge (7 fields):
                - gcp.cloudrun_metrics.container.billable_instance_time
                - gcp.cloudrun_metrics.container.cpu.allocation_time.sec
                - gcp.cloudrun_metrics.container.instance.count
                - gcp.cloudrun_metrics.container.memory.allocation_time
                - gcp.cloudrun_metrics.container.network.received.bytes
                - gcp.cloudrun_metrics.container.network.sent.bytes
                - gcp.cloudrun_metrics.request.count
        - routing_path (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash

Index tsdb-index-enabled successfully created.

Copying documents from .ds-metrics-gcp.cloudrun_metrics-default-2023.10.23-000001 to tsdb-index-enabled...
All 374 documents taken from index .ds-metrics-gcp.cloudrun_metrics-default-2023.10.23-000001 were successfully placed to index tsdb-index-enabled.

loadbalancing_metrics

Testing data stream metrics-gcp.loadbalancing_metrics-default.
Index being used for the documents is .ds-metrics-gcp.loadbalancing_metrics-default-2023.10.23-000001.
Index being used for the settings and mappings is .ds-metrics-gcp.loadbalancing_metrics-default-2023.10.23-000001.

The time series fields for the TSDB index are: 
        - dimension (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash
        - gauge (19 fields):
                - gcp.loadbalancing_metrics.https.backend_request.bytes
                - gcp.loadbalancing_metrics.https.backend_request.count
                - gcp.loadbalancing_metrics.https.backend_response.bytes
                - gcp.loadbalancing_metrics.https.request.bytes
                - gcp.loadbalancing_metrics.https.request.count
                - gcp.loadbalancing_metrics.https.response.bytes
                - gcp.loadbalancing_metrics.l3.external.egress.bytes
                - gcp.loadbalancing_metrics.l3.external.egress_packets.count
                - gcp.loadbalancing_metrics.l3.external.ingress.bytes
                - gcp.loadbalancing_metrics.l3.external.ingress_packets.count
                - gcp.loadbalancing_metrics.l3.internal.egress.bytes
                - gcp.loadbalancing_metrics.l3.internal.egress_packets.count
                - gcp.loadbalancing_metrics.l3.internal.ingress.bytes
                - gcp.loadbalancing_metrics.l3.internal.ingress_packets.count
                - gcp.loadbalancing_metrics.tcp_ssl_proxy.closed_connections.value
                - gcp.loadbalancing_metrics.tcp_ssl_proxy.egress.bytes
                - gcp.loadbalancing_metrics.tcp_ssl_proxy.ingress.bytes
                - gcp.loadbalancing_metrics.tcp_ssl_proxy.new_connections.value
                - gcp.loadbalancing_metrics.tcp_ssl_proxy.open_connections.value
        - routing_path (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash

Index tsdb-index-enabled successfully created.

Copying documents from .ds-metrics-gcp.loadbalancing_metrics-default-2023.10.23-000001 to tsdb-index-enabled...
All 136 documents taken from index .ds-metrics-gcp.loadbalancing_metrics-default-2023.10.23-000001 were successfully placed to index tsdb-index-enabled.

cloudsql_postgresql

Testing data stream metrics-gcp.cloudsql_postgresql-default.
Index being used for the documents is .ds-metrics-gcp.cloudsql_postgresql-default-2023.10.23-000001.
Index being used for the settings and mappings is .ds-metrics-gcp.cloudsql_postgresql-default-2023.10.23-000001.

The time series fields for the TSDB index are: 
        - dimension (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash
        - counter (16 fields):
                - gcp.cloudsql_postgresql.database.insights.aggregate.execution_time
                - gcp.cloudsql_postgresql.database.insights.aggregate.io_time
                - gcp.cloudsql_postgresql.database.insights.aggregate.latencies
                - gcp.cloudsql_postgresql.database.insights.aggregate.lock_time
                - gcp.cloudsql_postgresql.database.insights.aggregate.row.count
                - gcp.cloudsql_postgresql.database.insights.aggregate.shared_blk_access.count
                - gcp.cloudsql_postgresql.database.insights.perquery.execution_time
                - gcp.cloudsql_postgresql.database.insights.perquery.io_time
                - gcp.cloudsql_postgresql.database.insights.perquery.lock_time
                - gcp.cloudsql_postgresql.database.insights.perquery.row.count
                - gcp.cloudsql_postgresql.database.insights.perquery.shared_blk_access.count
                - gcp.cloudsql_postgresql.database.insights.pertag.execution_time
                - gcp.cloudsql_postgresql.database.insights.pertag.io_time
                - gcp.cloudsql_postgresql.database.insights.pertag.lock_time
                - gcp.cloudsql_postgresql.database.insights.pertag.row.count
                - gcp.cloudsql_postgresql.database.insights.pertag.shared_blk_access.count
        - gauge (27 fields):
                - gcp.cloudsql_postgresql.database.auto_failover_request.count
                - gcp.cloudsql_postgresql.database.available_for_failover
                - gcp.cloudsql_postgresql.database.cpu.reserved_cores.count
                - gcp.cloudsql_postgresql.database.cpu.usage_time.sec
                - gcp.cloudsql_postgresql.database.cpu.utilization.pct
                - gcp.cloudsql_postgresql.database.disk.bytes_used.bytes
                - gcp.cloudsql_postgresql.database.disk.quota.bytes
                - gcp.cloudsql_postgresql.database.disk.read_ops.count
                - gcp.cloudsql_postgresql.database.disk.utilization.pct
                - gcp.cloudsql_postgresql.database.disk.write_ops.count
                - gcp.cloudsql_postgresql.database.memory.quota.bytes
                - gcp.cloudsql_postgresql.database.memory.total_usage.bytes
                - gcp.cloudsql_postgresql.database.memory.usage.bytes
                - gcp.cloudsql_postgresql.database.memory.utilization.pct
                - gcp.cloudsql_postgresql.database.network.connections.count
                - gcp.cloudsql_postgresql.database.network.received_bytes.count
                - gcp.cloudsql_postgresql.database.network.sent_bytes.count
                - gcp.cloudsql_postgresql.database.num_backends.count
                - gcp.cloudsql_postgresql.database.replication.network_lag.sec
                - gcp.cloudsql_postgresql.database.replication.replica_byte_lag.bytes
                - gcp.cloudsql_postgresql.database.replication.replica_lag.sec
                - gcp.cloudsql_postgresql.database.transaction.count
                - gcp.cloudsql_postgresql.database.transaction_id.count
                - gcp.cloudsql_postgresql.database.transaction_id_utilization.pct
                - gcp.cloudsql_postgresql.database.up
                - gcp.cloudsql_postgresql.database.uptime.sec
                - gcp.cloudsql_postgresql.database.vacuum.oldest_transaction_age
        - routing_path (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash

Index tsdb-index-enabled successfully created.

Copying documents from .ds-metrics-gcp.cloudsql_postgresql-default-2023.10.23-000001 to tsdb-index-enabled...
All 500 documents taken from index .ds-metrics-gcp.cloudsql_postgresql-default-2023.10.23-000001 were successfully placed to index tsdb-index-enabled.

cloudsql_mysql

Testing data stream metrics-gcp.cloudsql_mysql-default.
Index being used for the documents is .ds-metrics-gcp.cloudsql_mysql-default-2023.10.23-000001.
Index being used for the settings and mappings is .ds-metrics-gcp.cloudsql_mysql-default-2023.10.23-000001.

The time series fields for the TSDB index are: 
        - dimension (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash
        - gauge (35 fields):
                - gcp.cloudsql_mysql.database.auto_failover_request.count
                - gcp.cloudsql_mysql.database.available_for_failover
                - gcp.cloudsql_mysql.database.cpu.reserved_cores.count
                - gcp.cloudsql_mysql.database.cpu.usage_time.sec
                - gcp.cloudsql_mysql.database.cpu.utilization.pct
                - gcp.cloudsql_mysql.database.disk.bytes_used.bytes
                - gcp.cloudsql_mysql.database.disk.quota.bytes
                - gcp.cloudsql_mysql.database.disk.read_ops.count
                - gcp.cloudsql_mysql.database.disk.utilization.pct
                - gcp.cloudsql_mysql.database.disk.write_ops.count
                - gcp.cloudsql_mysql.database.innodb_buffer_pool_pages_dirty.count
                - gcp.cloudsql_mysql.database.innodb_buffer_pool_pages_free.count
                - gcp.cloudsql_mysql.database.innodb_buffer_pool_pages_total.count
                - gcp.cloudsql_mysql.database.innodb_data_fsyncs.count
                - gcp.cloudsql_mysql.database.innodb_os_log_fsyncs.count
                - gcp.cloudsql_mysql.database.innodb_pages_read.count
                - gcp.cloudsql_mysql.database.innodb_pages_written.count
                - gcp.cloudsql_mysql.database.memory.quota.bytes
                - gcp.cloudsql_mysql.database.memory.total_usage.bytes
                - gcp.cloudsql_mysql.database.memory.usage.bytes
                - gcp.cloudsql_mysql.database.memory.utilization.pct
                - gcp.cloudsql_mysql.database.network.connections.count
                - gcp.cloudsql_mysql.database.network.received_bytes.count
                - gcp.cloudsql_mysql.database.network.sent_bytes.count
                - gcp.cloudsql_mysql.database.queries.count
                - gcp.cloudsql_mysql.database.questions.count
                - gcp.cloudsql_mysql.database.received_bytes.count
                - gcp.cloudsql_mysql.database.replication.last_io_errno
                - gcp.cloudsql_mysql.database.replication.last_sql_errno
                - gcp.cloudsql_mysql.database.replication.network_lag.sec
                - gcp.cloudsql_mysql.database.replication.replica_lag.sec
                - gcp.cloudsql_mysql.database.replication.seconds_behind_master.sec
                - gcp.cloudsql_mysql.database.sent_bytes.count
                - gcp.cloudsql_mysql.database.up
                - gcp.cloudsql_mysql.database.uptime.sec
        - routing_path (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash

Index tsdb-index-enabled successfully created.

Copying documents from .ds-metrics-gcp.cloudsql_mysql-default-2023.10.23-000001 to tsdb-index-enabled...
All 192 documents taken from index .ds-metrics-gcp.cloudsql_mysql-default-2023.10.23-000001 were successfully placed to index tsdb-index-enabled.

cloudsql_sqlserver

Testing data stream metrics-gcp.cloudsql_sqlserver-default.
Index being used for the documents is .ds-metrics-gcp.cloudsql_sqlserver-default-2023.10.23-000001.
Index being used for the settings and mappings is .ds-metrics-gcp.cloudsql_sqlserver-default-2023.10.23-000001.

The time series fields for the TSDB index are: 
        - dimension (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash
        - gauge (23 fields):
                - gcp.cloudsql_sqlserver.database.audits_size.bytes
                - gcp.cloudsql_sqlserver.database.audits_upload.count
                - gcp.cloudsql_sqlserver.database.auto_failover_request.count
                - gcp.cloudsql_sqlserver.database.available_for_failover
                - gcp.cloudsql_sqlserver.database.cpu.reserved_cores.count
                - gcp.cloudsql_sqlserver.database.cpu.usage_time.sec
                - gcp.cloudsql_sqlserver.database.cpu.utilization.pct
                - gcp.cloudsql_sqlserver.database.disk.bytes_used.bytes
                - gcp.cloudsql_sqlserver.database.disk.quota.bytes
                - gcp.cloudsql_sqlserver.database.disk.read_ops.count
                - gcp.cloudsql_sqlserver.database.disk.utilization.pct
                - gcp.cloudsql_sqlserver.database.disk.write_ops.count
                - gcp.cloudsql_sqlserver.database.memory.quota.bytes
                - gcp.cloudsql_sqlserver.database.memory.total_usage.bytes
                - gcp.cloudsql_sqlserver.database.memory.usage.bytes
                - gcp.cloudsql_sqlserver.database.memory.utilization.pct
                - gcp.cloudsql_sqlserver.database.network.connections.count
                - gcp.cloudsql_sqlserver.database.network.received_bytes.count
                - gcp.cloudsql_sqlserver.database.network.sent_bytes.count
                - gcp.cloudsql_sqlserver.database.replication.network_lag.sec
                - gcp.cloudsql_sqlserver.database.replication.replica_lag.sec
                - gcp.cloudsql_sqlserver.database.up
                - gcp.cloudsql_sqlserver.database.uptime.sec
        - routing_path (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash

Index tsdb-index-enabled successfully created.

Copying documents from .ds-metrics-gcp.cloudsql_sqlserver-default-2023.10.23-000001 to tsdb-index-enabled...
All 216 documents taken from index .ds-metrics-gcp.cloudsql_sqlserver-default-2023.10.23-000001 were successfully placed to index tsdb-index-enabled.

dataproc

Testing data stream metrics-gcp.dataproc-default.
Index being used for the documents is .ds-metrics-gcp.dataproc-default-2023.10.24-000001.
Index being used for the settings and mappings is .ds-metrics-gcp.dataproc-default-2023.10.24-000001.

The time series fields for the TSDB index are: 
        - dimension (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash
        - gauge (18 fields):
                - gcp.dataproc.batch.spark.executors.count
                - gcp.dataproc.cluster.hdfs.datanodes.count
                - gcp.dataproc.cluster.hdfs.storage_capacity.value
                - gcp.dataproc.cluster.hdfs.storage_utilization.value
                - gcp.dataproc.cluster.hdfs.unhealthy_blocks.count
                - gcp.dataproc.cluster.job.failed.count
                - gcp.dataproc.cluster.job.running.count
                - gcp.dataproc.cluster.job.submitted.count
                - gcp.dataproc.cluster.operation.failed.count
                - gcp.dataproc.cluster.operation.running.count
                - gcp.dataproc.cluster.operation.submitted.count
                - gcp.dataproc.cluster.yarn.allocated_memory_percentage.value
                - gcp.dataproc.cluster.yarn.apps.count
                - gcp.dataproc.cluster.yarn.containers.count
                - gcp.dataproc.cluster.yarn.memory_size.value
                - gcp.dataproc.cluster.yarn.nodemanagers.count
                - gcp.dataproc.cluster.yarn.pending_memory_size.value
                - gcp.dataproc.cluster.yarn.virtual_cores.count
        - routing_path (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash

Index tsdb-index-enabled successfully created.

Copying documents from .ds-metrics-gcp.dataproc-default-2023.10.24-000001 to tsdb-index-enabled...
All 576 documents taken from index .ds-metrics-gcp.dataproc-default-2023.10.24-000001 were successfully placed to index tsdb-index-enabled.

firestore

Testing data stream metrics-gcp.firestore-default.
Index being used for the documents is .ds-metrics-gcp.firestore-default-2023.10.24-000001.
Index being used for the settings and mappings is .ds-metrics-gcp.firestore-default-2023.10.24-000001.

The time series fields for the TSDB index are: 
        - dimension (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash
        - gauge (3 fields):
                - gcp.firestore.document.delete.count
                - gcp.firestore.document.read.count
                - gcp.firestore.document.write.count
        - routing_path (8 fields):
                - cloud.account.id
                - cloud.account.name
                - cloud.availability_zone
                - cloud.instance.id
                - cloud.machine.type
                - cloud.region
                - event.metric_names_hash
                - gcp.labels_hash

Index tsdb-index-enabled successfully created.

Copying documents from .ds-metrics-gcp.firestore-default-2023.10.24-000001 to tsdb-index-enabled...
All 142 documents taken from index .ds-metrics-gcp.firestore-default-2023.10.24-000001 were successfully placed to index tsdb-index-enabled.

storage

All 152 documents taken from index .ds-metrics-gcp.storage-default-2023.10.24-000001 were successfully placed to index tsdb-index-enabled.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

@elasticmachine

elasticmachine commented Oct 26, 2023

💚 Build Succeeded

Build stats

  • Start Time: 2023-11-30T10:41:24.498+0000

  • Duration: 20 min 27 sec

Test stats 🧪

Test Results
Failed 0
Passed 69
Skipped 0
Total 69

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@gpop63 gpop63 self-assigned this Oct 26, 2023
changes gcp field type to group
add labels ffingerprint in ingest pipeline
@elasticmachine

elasticmachine commented Oct 26, 2023

🌐 Coverage report

Name Metrics % (covered/total) Diff
Packages 100.0% (6/6) 💚
Files 100.0% (6/6) 💚 13.636
Classes 100.0% (6/6) 💚 13.636
Methods 87.931% (102/116) 👍 4.598
Lines 95.139% (1507/1584) 👍 10.118
Conditionals 100.0% (0/0) 💚

@gpop63 gpop63 marked this pull request as ready for review October 27, 2023 00:19
@gpop63 gpop63 requested review from a team as code owners October 27, 2023 00:19
@constanca-m
Contributor

constanca-m commented Oct 27, 2023

Hey @gpop63, thanks for opening this PR! This is the related issue for GCP: #7555.

Do you think we need all those ECS fields in every data stream? I checked (not for all data streams), and I found this:

  • GKE:

    • cloud.account.id
    • cloud.provider
    • Other cloud.* fields never hold data
  • Load Balancing

    • cloud.account.id
    • cloud.provider
    • Other cloud.* fields don't hold data
  • PubSub

    • cloud.account.id
    • cloud.provider
    • Other cloud.* fields never hold data
  • Storage

    • cloud.account.id
    • cloud.provider
    • Other cloud.* fields never hold data

Other data streams, such as Compute, do have values for all of those fields.

@constanca-m constanca-m mentioned this pull request Oct 27, 2023
4 tasks
@zmoog
Contributor

zmoog commented Oct 27, 2023

Does the cloud.account.id field contain the account ID [of the host where] the agent [is running] or the metric event?

@gpop63
Contributor Author

gpop63 commented Oct 27, 2023

Does the cloud.account.id field contain the account ID for the agent or the metric event?

It should be the account ID for the metric, basically the GCP project id. I actually should have added agent.id as a dimension in ECS fields, I missed that one.

 "cloud": {
            "provider": "gcp",
            "account": {
                "name": "elastic-obs-integrations-dev",
                "id": "elastic-obs-integrations-dev"
            }
        },

@zmoog
Contributor

zmoog commented Oct 27, 2023

Good to know.

I noticed that some inputs/metricsets set the cloud.account.id field to the account ID where the agent runs. I'm glad that is not the case for this metricset.

@zmoog
Contributor

zmoog commented Oct 27, 2023

I actually should have added agent.id as a dimension in ECS fields, I missed that one.

Are we adding the agent.id as a dimension? I never considered the agent identity for dimensions. Is this a common practice?

@gpop63
Contributor Author

gpop63 commented Oct 27, 2023

Are we adding the agent.id as a dimension? I never considered the agent identity for dimensions. Is this a common practice?

Yes, https://github.com/elastic/integrations/blob/main/docs/developer_tsdb_migration_guidelines.md#ecs-fiels

I think agent.id is added as a dimension in all TSDB-enabled data streams.

@gpop63
Contributor Author

gpop63 commented Oct 27, 2023

Do you think we need all those ECS fields in every data stream? I checked (not for all data streams), and I found this:

@constanca-m you're right. Could be due to having additional metadata logic in beats for compute. I added them for all data streams because that's what we consider "dimensions" and what we use for grouping on the beats side.

Should we maybe keep all of them as dimensions only for compute while removing the ones that have no values from other data streams? The tests would still pass since there were never values in the first place.

@constanca-m
Contributor

Should we maybe keep all of them as dimensions only for compute while removing the ones that have no values from other data streams? The tests would still pass since there were never values in the first place.

I think it would be best. I remember that some data streams in other packages already followed this logic.

@mlunadia

@lalit-satapathy can you please help get this one reviewed to unblock the GCP packages?

@lalit-satapathy
Collaborator

@lalit-satapathy can you please help get this one reviewed to unblock the GCP packages?

CC: @ishleenk17 @agithomas

@agithomas
Contributor

@gpop63, kindly refer to the comment here for the reasoning behind selecting some of the dimension PRs. Also, note that a GCP resource identifier such as instance_id is unique within an availability_zone. I see you have used a subset of the fields under For Cloud-only Integration Packages / Managed Services Packages. Could you explain why this approach was taken?

@gpop63
Contributor Author

gpop63 commented Nov 24, 2023

@gpop63, kindly refer to the comment here for the reasoning behind selecting some of the dimension PRs. Also, note that a GCP resource identifier such as instance_id is unique within an availability_zone. I see you have used a subset of the fields under For Cloud-only Integration Packages / Managed Services Packages. Could you explain why this approach was taken?

I guess you are referring to the compute and redis data streams. Documents are dropped if cloud.instance.id is not a dimension.

Copying documents from .ds-metrics-gcp.compute-default-2023.11.24-000001 to tsdb-index-enabled...
WARNING: Out of 126 documents from the index .ds-metrics-gcp.compute-default-2023.11.24-000001, 36 of them were discarded.
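
The discards reported above can be illustrated with a small Python sketch. It assumes, as a simplification, that a TSDB index identifies a document by its dimension values plus @timestamp, so two documents that differ only in a non-dimension field collide and one is discarded. The sample documents, flattened field layout, and the helper name count_discarded are hypothetical.

```python
def count_discarded(docs, dimensions):
    """Count documents whose (@timestamp, dimension values) key was already seen."""
    seen = set()
    discarded = 0
    for doc in docs:
        key = (doc["@timestamp"],) + tuple(doc.get(name) for name in dimensions)
        if key in seen:
            discarded += 1
        else:
            seen.add(key)
    return discarded


# Two instances in the same project and zone reporting at the same timestamp
# (made-up values, flattened field names for brevity).
docs = [
    {"@timestamp": "2023-11-24T10:00:00Z", "cloud.account.id": "example-project",
     "cloud.availability_zone": "us-central1-c", "cloud.instance.id": "instance-a"},
    {"@timestamp": "2023-11-24T10:00:00Z", "cloud.account.id": "example-project",
     "cloud.availability_zone": "us-central1-c", "cloud.instance.id": "instance-b"},
]

without_instance_id = ["cloud.account.id", "cloud.availability_zone"]
with_instance_id = without_instance_id + ["cloud.instance.id"]

print(count_discarded(docs, without_instance_id))  # 1
print(count_discarded(docs, with_instance_id))     # 0
```

With cloud.instance.id in the dimension list, the two documents get distinct keys and both are kept.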

@@ -28,7 +28,7 @@ streams:
   - name: period
     type: text
     title: Period
-    default: 60s
+    default: 5m
Contributor

Is the default collection period 5 minutes rather than 1 minute, unlike the other GCP data streams?

Contributor Author

The data stream period for storage had to be adjusted from 60s to 5m as some metrics have a sampling period of 5m. A shorter period would result in duplicated documents (a metric is generated every 5 minutes on GCP side but we fetch it every minute), which would get dropped as the only difference between them would be event.ingested.
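
To make the duplication concrete, here is a minimal Python sketch. It assumes a hypothetical metric that GCP samples every 5 minutes and a collector that receives the most recent sample each time it polls; latest_sample_time and count_duplicates are illustrative names, not part of Metricbeat.

```python
from datetime import datetime, timedelta, timezone

SAMPLING_PERIOD = timedelta(minutes=5)  # assumed GCP sampling period for the metric


def latest_sample_time(now: datetime) -> datetime:
    """Timestamp of the most recent sample available at `now` (simplified model)."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    periods = (now - epoch) // SAMPLING_PERIOD
    return epoch + periods * SAMPLING_PERIOD


def count_duplicates(start: datetime, fetch_period: timedelta, duration: timedelta) -> int:
    """Count polls that return a sample timestamp that was already fetched."""
    seen = set()
    duplicates = 0
    now = start
    while now < start + duration:
        ts = latest_sample_time(now)
        if ts in seen:
            duplicates += 1
        seen.add(ts)
        now += fetch_period
    return duplicates


start = datetime(2023, 10, 17, 17, 30, tzinfo=timezone.utc)
one_hour = timedelta(hours=1)
print(count_duplicates(start, timedelta(seconds=60), one_hour))  # 48
print(count_duplicates(start, timedelta(minutes=5), one_hour))   # 0
```

In this model, polling every 60s for an hour fetches 60 points but only 12 distinct samples, while polling every 5m fetches each sample once.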

@agithomas
Contributor

@gpop63,

We discussed that not all of the common dimension fields are included as dimensions. I learnt from you that GCP does not populate values for these common dimension fields even when an entry exists in fields.yml.

As discussed, kindly share which of the common dimension fields have values and which do not. You may use a table to share this information.

@gpop63
Contributor Author

gpop63 commented Nov 28, 2023

@agithomas

Data stream cloud.account.id cloud.region cloud.availability_zone agent.id
compute
gke
loadbalancing_metrics
pubsub
redis
storage
cloudrun_metrics
cloudsql_mysql
cloudsql_postgresql
cloudsql_sqlserver
firestore
dataproc

I will add actual documents for each data stream so we can check them at a later time if needed.
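
A check like this can also be scripted against the sample documents. The following Python sketch is one possible approach, not the method used in this PR: it reports which candidate dimension fields are present in a document. The helper names are hypothetical, and the sample is a trimmed-down version of the gke document.

```python
CANDIDATES = ["cloud.account.id", "cloud.region", "cloud.availability_zone", "agent.id"]


def get_path(doc, dotted):
    """Return the value at a dotted path in a nested dict, or None if absent."""
    current = doc
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def populated_fields(doc, candidates=CANDIDATES):
    """Map each candidate field to True if the document holds a value for it."""
    return {field: get_path(doc, field) is not None for field in candidates}


# Trimmed sample: only the fields relevant to the check are kept.
sample = {
    "cloud": {"provider": "gcp", "account": {"id": "elastic-obs-integrations-dev"}},
    "agent": {"id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296"},
}

print(populated_fields(sample))
# {'cloud.account.id': True, 'cloud.region': False, 'cloud.availability_zone': False, 'agent.id': True}
```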

compute

{
    "cloud": {
      "availability_zone": "us-central1-c",
      "instance": {
        "name": "gke-miguel-kubecon-default-pool-8d586473-uj1d",
        "id": "7442613444431818544"
      },
      "provider": "gcp",
      "machine": {
        "type": "e2-medium"
      },
      "region": "us-central1",
      "account": {
        "name": "elastic-obs-integrations-dev",
        "id": "elastic-obs-integrations-dev"
      }
    },
    "agent": {
      "name": "docker-fleet-agent",
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "ephemeral_id": "1de59397-2d84-4e9e-8ec2-8becfa2dccd3",
      "type": "metricbeat",
      "version": "8.12.0"
    },
    "@timestamp": "2023-11-28T11:13:00.000Z",
    "ecs": {
      "version": "8.0.0"
    },
    "gcp": {
      "compute": {
        "instance": {
          "network": {
            "ingress": {
              "bytes": 9414,
              "packets": {
                "count": 134
              }
            },
            "egress": {
              "bytes": 11485,
              "packets": {
                "count": 89
              }
            }
          }
        }
      },
      "labels": {
        "metrics": {
          "loadbalanced": "true"
        },
        "user": {
          "division": "engineering",
          "goog-k8s-cluster-name": "miguel-kubecon",
          "org": "obs",
          "goog-k8s-node-pool-name": "default-pool",
          "goog-k8s-cluster-location": "us-central1-c",
          "project": "miguelluna",
          "team": "cloud-native-monitoring",
          "goog-gke-node": ""
        }
      }
    },
    "service": {
      "type": "gcp"
    },
    "data_stream": {
      "namespace": "default",
      "type": "metrics",
      "dataset": "gcp.compute"
    },
    "host": {
      "hostname": "docker-fleet-agent",
      "os": {
        "kernel": "5.10.102.1-microsoft-standard-WSL2",
        "codename": "focal",
        "name": "Ubuntu",
        "type": "linux",
        "family": "debian",
        "version": "20.04.6 LTS (Focal Fossa)",
        "platform": "ubuntu"
      },
      "containerized": true,
      "ip": [
        "172.22.0.7"
      ],
      "name": "docker-fleet-agent",
      "id": "d03b51e638e64b05b5cf16c41d2058c0",
      "mac": [
        "02-42-AC-16-00-07"
      ],
      "architecture": "x86_64"
    },
    "elastic_agent": {
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "version": "8.12.0",
      "snapshot": true
    },
    "metricset": {
      "period": 60000,
      "name": "metrics"
    },
    "event": {
      "duration": 3126038318,
      "agent_id_status": "verified",
      "ingested": "2023-11-28T11:17:09Z",
      "module": "gcp",
      "dataset": "gcp.compute"
    }
  }

gke

{
    "cloud": {
      "provider": "gcp",
      "account": {
        "name": "elastic-obs-integrations-dev",
        "id": "elastic-obs-integrations-dev"
      }
    },
    "agent": {
      "name": "docker-fleet-agent",
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "ephemeral_id": "1de59397-2d84-4e9e-8ec2-8becfa2dccd3",
      "type": "metricbeat",
      "version": "8.12.0"
    },
    "@timestamp": "2023-11-28T11:14:05.736Z",
    "ecs": {
      "version": "8.0.0"
    },
    "gcp": {
      "gke": {
        "container": {
          "memory": {
            "request": {
              "bytes": 10485760
            },
            "limit": {
              "bytes": 52428800
            }
          },
          "restart": {
            "count": 0
          },
          "cpu": {
            "request_cores": {
              "value": 0.005
            },
            "limit_cores": {
              "value": 0
            }
          },
          "ephemeral_storage": {
            "request": {
              "bytes": 0
            },
            "limit": {
              "bytes": 0
            }
          }
        }
      },
      "labels": {
        "resource": {
          "cluster_name": "tetiana-prometheus",
          "container_name": "csi-driver-registrar",
          "location": "europe-west1",
          "pod_name": "pdcsi-node-7b6hq",
          "namespace_name": "kube-system"
        }
      }
    },
    "service": {
      "type": "gcp"
    },
    "data_stream": {
      "namespace": "default",
      "type": "metrics",
      "dataset": "gcp.gke"
    },
    "host": {
      "hostname": "docker-fleet-agent",
      "os": {
        "kernel": "5.10.102.1-microsoft-standard-WSL2",
        "codename": "focal",
        "name": "Ubuntu",
        "type": "linux",
        "family": "debian",
        "version": "20.04.6 LTS (Focal Fossa)",
        "platform": "ubuntu"
      },
      "containerized": true,
      "ip": [
        "172.22.0.7"
      ],
      "name": "docker-fleet-agent",
      "id": "d03b51e638e64b05b5cf16c41d2058c0",
      "mac": [
        "02-42-AC-16-00-07"
      ],
      "architecture": "x86_64"
    },
    "elastic_agent": {
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "version": "8.12.0",
      "snapshot": true
    },
    "metricset": {
      "period": 60000,
      "name": "metrics"
    },
    "event": {
      "duration": 1244946929,
      "agent_id_status": "verified",
      "ingested": "2023-11-28T11:18:19Z",
      "module": "gcp",
      "dataset": "gcp.gke"
    }
  }

loadbalancing_metrics

{
    "cloud": {
      "provider": "gcp",
      "account": {
        "name": "elastic-obs-integrations-dev",
        "id": "elastic-obs-integrations-dev"
      }
    },
    "agent": {
      "name": "docker-fleet-agent",
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "ephemeral_id": "1de59397-2d84-4e9e-8ec2-8becfa2dccd3",
      "type": "metricbeat",
      "version": "8.12.0"
    },
    "@timestamp": "2023-11-28T11:11:00.000Z",
    "ecs": {
      "version": "8.0.0"
    },
    "gcp": {
      "loadbalancing_metrics": {
        "https": {
          "backend_request": {
            "bytes": 417,
            "count": 1
          },
          "backend_response": {
            "bytes": 488
          }
        }
      },
      "labels": {
        "resource": {
          "backend_scope_type": "INVALID_BACKEND",
          "matched_url_path_rule": "UNMATCHED",
          "backend_target_name": "tas-demo-elastic-http-lb",
          "backend_type": "INVALID_BACKEND",
          "backend_scope": "INVALID_BACKEND",
          "target_proxy_name": "tas-demo-elastic-http-lb",
          "forwarding_rule_name": "tas-demo-elastic-http-lb",
          "backend_name": "INVALID_BACKEND",
          "url_map_name": "tas-demo-elastic-https-lb",
          "backend_target_type": "BACKEND_SERVICE",
          "region": "global"
        },
        "metrics": {
          "response_code": "502",
          "proxy_continent": "America",
          "cache_result": "DISABLED",
          "response_code_class": "500"
        }
      }
    },
    "data_stream": {
      "namespace": "default",
      "type": "metrics",
      "dataset": "gcp.loadbalancing_metrics"
    },
    "service": {
      "type": "gcp"
    },
    "elastic_agent": {
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "version": "8.12.0",
      "snapshot": true
    },
    "host": {
      "hostname": "docker-fleet-agent",
      "os": {
        "kernel": "5.10.102.1-microsoft-standard-WSL2",
        "codename": "focal",
        "name": "Ubuntu",
        "family": "debian",
        "type": "linux",
        "version": "20.04.6 LTS (Focal Fossa)",
        "platform": "ubuntu"
      },
      "containerized": true,
      "ip": [
        "172.22.0.7"
      ],
      "name": "docker-fleet-agent",
      "id": "d03b51e638e64b05b5cf16c41d2058c0",
      "mac": [
        "02-42-AC-16-00-07"
      ],
      "architecture": "x86_64"
    },
    "metricset": {
      "period": 60000,
      "name": "metrics"
    },
    "event": {
      "duration": 476084876,
      "agent_id_status": "verified",
      "ingested": "2023-11-28T11:15:18Z",
      "module": "gcp",
      "dataset": "gcp.loadbalancing_metrics"
    }
  }

pubsub

{
    "cloud": {
      "provider": "gcp",
      "account": {
        "name": "elastic-obs-integrations-dev",
        "id": "elastic-obs-integrations-dev"
      }
    },
    "agent": {
      "name": "docker-fleet-agent",
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "ephemeral_id": "1de59397-2d84-4e9e-8ec2-8becfa2dccd3",
      "type": "metricbeat",
      "version": "8.12.0"
    },
    "@timestamp": "2023-11-28T11:15:00.000Z",
    "ecs": {
      "version": "8.0.0"
    },
    "gcp": {
      "pubsub": {
        "subscription": {
          "retained_acked_bytes_by_region": {
            "bytes": 0
          },
          "unacked_bytes_by_region": {
            "bytes": 0
          },
          "oldest_unacked_message_age_by_region": {
            "value": 0
          },
          "oldest_retained_acked_message_age_by_region": {
            "value": 0
          }
        }
      },
      "labels": {
        "resource": {
          "subscription_id": "filebeat-gcp-audit"
        },
        "metrics": {
          "region": "northamerica-northeast2"
        }
      }
    },
    "data_stream": {
      "namespace": "default",
      "type": "metrics",
      "dataset": "gcp.pubsub"
    },
    "service": {
      "type": "gcp"
    },
    "elastic_agent": {
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "version": "8.12.0",
      "snapshot": true
    },
    "host": {
      "hostname": "docker-fleet-agent",
      "os": {
        "kernel": "5.10.102.1-microsoft-standard-WSL2",
        "codename": "focal",
        "name": "Ubuntu",
        "family": "debian",
        "type": "linux",
        "version": "20.04.6 LTS (Focal Fossa)",
        "platform": "ubuntu"
      },
      "containerized": true,
      "ip": [
        "172.22.0.7"
      ],
      "name": "docker-fleet-agent",
      "id": "d03b51e638e64b05b5cf16c41d2058c0",
      "mac": [
        "02-42-AC-16-00-07"
      ],
      "architecture": "x86_64"
    },
    "metricset": {
      "period": 60000,
      "name": "metrics"
    },
    "event": {
      "duration": 434512489,
      "agent_id_status": "verified",
      "ingested": "2023-11-28T11:19:18Z",
      "module": "gcp",
      "dataset": "gcp.pubsub"
    }
  }

redis

{
    "cloud": {
      "instance": {
        "name": "redis1",
        "id": "projects/elastic-obs-integrations-dev/locations/us-central1/instances/redis1"
      },
      "provider": "gcp",
      "machine": {
        "type": "BASIC"
      },
      "account": {
        "name": "elastic-obs-integrations-dev",
        "id": "elastic-obs-integrations-dev"
      }
    },
    "agent": {
      "name": "docker-fleet-agent",
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "type": "metricbeat",
      "ephemeral_id": "1de59397-2d84-4e9e-8ec2-8becfa2dccd3",
      "version": "8.12.0"
    },
    "@timestamp": "2023-11-28T11:15:26.810Z",
    "ecs": {
      "version": "8.0.0"
    },
    "gcp": {
      "metrics": {
        "stats": {
          "reject_connections_count": {}
        }
      },
      "redis": {
        "stats": {
          "reject_connections": {
            "count": 0
          }
        }
      },
      "labels": {
        "resource": {
          "region": "us-central1",
          "node_id": "node-0"
        },
        "metrics": {
          "role": "primary"
        }
      }
    },
    "service": {
      "type": "gcp"
    },
    "data_stream": {
      "namespace": "default",
      "type": "metrics",
      "dataset": "gcp.redis"
    },
    "elastic_agent": {
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "version": "8.12.0",
      "snapshot": true
    },
    "host": {
      "hostname": "docker-fleet-agent",
      "os": {
        "kernel": "5.10.102.1-microsoft-standard-WSL2",
        "codename": "focal",
        "name": "Ubuntu",
        "family": "debian",
        "type": "linux",
        "version": "20.04.6 LTS (Focal Fossa)",
        "platform": "ubuntu"
      },
      "containerized": true,
      "ip": [
        "172.22.0.7"
      ],
      "name": "docker-fleet-agent",
      "id": "d03b51e638e64b05b5cf16c41d2058c0",
      "mac": [
        "02-42-AC-16-00-07"
      ],
      "architecture": "x86_64"
    },
    "metricset": {
      "period": 60000,
      "name": "metrics"
    },
    "event": {
      "duration": 1381430449,
      "agent_id_status": "verified",
      "ingested": "2023-11-28T11:19:39Z",
      "module": "gcp",
      "dataset": "gcp.redis"
    }
  }

storage

{
    "cloud": {
      "provider": "gcp",
      "account": {
        "name": "elastic-obs-integrations-dev",
        "id": "elastic-obs-integrations-dev"
      }
    },
    "agent": {
      "name": "docker-fleet-agent",
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "type": "metricbeat",
      "ephemeral_id": "1de59397-2d84-4e9e-8ec2-8becfa2dccd3",
      "version": "8.12.0"
    },
    "@timestamp": "2023-11-28T11:10:00.000Z",
    "ecs": {
      "version": "8.0.0"
    },
    "gcp": {
      "storage": {
        "storage": {
          "object": {
            "count": 6
          }
        }
      },
      "labels": {
        "resource": {
          "bucket_name": "dataproc-temp-us-central1-774712120909-e8nuxv73",
          "location": "us-central1"
        },
        "metrics": {
          "storage_class": "REGIONAL"
        }
      }
    },
    "data_stream": {
      "namespace": "default",
      "type": "metrics",
      "dataset": "gcp.storage"
    },
    "service": {
      "type": "gcp"
    },
    "host": {
      "hostname": "docker-fleet-agent",
      "os": {
        "kernel": "5.10.102.1-microsoft-standard-WSL2",
        "codename": "focal",
        "name": "Ubuntu",
        "type": "linux",
        "family": "debian",
        "version": "20.04.6 LTS (Focal Fossa)",
        "platform": "ubuntu"
      },
      "containerized": true,
      "ip": [
        "172.22.0.7"
      ],
      "name": "docker-fleet-agent",
      "id": "d03b51e638e64b05b5cf16c41d2058c0",
      "mac": [
        "02-42-AC-16-00-07"
      ],
      "architecture": "x86_64"
    },
    "elastic_agent": {
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "version": "8.12.0",
      "snapshot": true
    },
    "metricset": {
      "period": 60000,
      "name": "metrics"
    },
    "event": {
      "duration": 815694657,
      "agent_id_status": "verified",
      "ingested": "2023-11-28T11:20:18Z",
      "module": "gcp",
      "dataset": "gcp.storage"
    }
  }

cloudrun_metrics

{
    "cloud": {
      "provider": "gcp",
      "account": {
        "name": "elastic-obs-integrations-dev",
        "id": "elastic-obs-integrations-dev"
      }
    },
    "agent": {
      "name": "docker-fleet-agent",
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "type": "metricbeat",
      "ephemeral_id": "1de59397-2d84-4e9e-8ec2-8becfa2dccd3",
      "version": "8.12.0"
    },
    "@timestamp": "2023-11-28T11:18:21.590Z",
    "ecs": {
      "version": "8.0.0"
    },
    "gcp": {
      "cloudrun_metrics": {
        "container": {
          "billable_instance_time": 2.7
        }
      },
      "labels": {
        "resource": {
          "revision_name": "damien-test-hello-00001-ped",
          "service_name": "damien-test-hello",
          "location": "us-central1",
          "configuration_name": "damien-test-hello"
        }
      }
    },
    "data_stream": {
      "namespace": "default",
      "type": "metrics",
      "dataset": "gcp.cloudrun_metrics"
    },
    "service": {
      "type": "gcp"
    },
    "host": {
      "hostname": "docker-fleet-agent",
      "os": {
        "kernel": "5.10.102.1-microsoft-standard-WSL2",
        "codename": "focal",
        "name": "Ubuntu",
        "type": "linux",
        "family": "debian",
        "version": "20.04.6 LTS (Focal Fossa)",
        "platform": "ubuntu"
      },
      "ip": [
        "172.22.0.7"
      ],
      "containerized": true,
      "name": "docker-fleet-agent",
      "id": "d03b51e638e64b05b5cf16c41d2058c0",
      "mac": [
        "02-42-AC-16-00-07"
      ],
      "architecture": "x86_64"
    },
    "elastic_agent": {
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "version": "8.12.0",
      "snapshot": true
    },
    "metricset": {
      "period": 60000,
      "name": "metrics"
    },
    "event": {
      "duration": 503983018,
      "agent_id_status": "verified",
      "ingested": "2023-11-28T11:21:34Z",
      "module": "gcp",
      "dataset": "gcp.cloudrun_metrics"
    }
  }

cloudsql_mysql

{
    "cloud": {
      "provider": "gcp",
      "account": {
        "name": "elastic-observability",
        "id": "elastic-observability"
      }
    },
    "agent": {
      "name": "docker-fleet-agent",
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "ephemeral_id": "1de59397-2d84-4e9e-8ec2-8becfa2dccd3",
      "type": "metricbeat",
      "version": "8.12.0"
    },
    "@timestamp": "2023-11-28T11:20:00.000Z",
    "ecs": {
      "version": "8.0.0"
    },
    "gcp": {
      "cloudsql_mysql": {
        "database": {
          "instance_state": false
        }
      },
      "labels": {
        "resource": {
          "database_id": "elastic-observability:mysql",
          "region": "us-central"
        },
        "cloudsql": {
          "name": "mysql",
          "version": "8.0.31"
        },
        "metrics": {
          "state": "RUNNABLE"
        }
      }
    },
    "service": {
      "type": "gcp"
    },
    "data_stream": {
      "namespace": "default",
      "type": "metrics",
      "dataset": "gcp.cloudsql_mysql"
    },
    "elastic_agent": {
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "version": "8.12.0",
      "snapshot": true
    },
    "host": {
      "hostname": "docker-fleet-agent",
      "os": {
        "kernel": "5.10.102.1-microsoft-standard-WSL2",
        "codename": "focal",
        "name": "Ubuntu",
        "family": "debian",
        "type": "linux",
        "version": "20.04.6 LTS (Focal Fossa)",
        "platform": "ubuntu"
      },
      "containerized": true,
      "ip": [
        "172.22.0.7"
      ],
      "name": "docker-fleet-agent",
      "id": "d03b51e638e64b05b5cf16c41d2058c0",
      "mac": [
        "02-42-AC-16-00-07"
      ],
      "architecture": "x86_64"
    },
    "metricset": {
      "period": 60000,
      "name": "metrics"
    },
    "event": {
      "duration": 1074356670,
      "agent_id_status": "verified",
      "ingested": "2023-11-28T11:23:29Z",
      "module": "gcp",
      "dataset": "gcp.cloudsql_mysql"
    }
  }

cloudsql_postgresql

{
    "cloud": {
      "provider": "gcp",
      "account": {
        "name": "elastic-observability",
        "id": "elastic-observability"
      }
    },
    "agent": {
      "name": "docker-fleet-agent",
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "ephemeral_id": "1de59397-2d84-4e9e-8ec2-8becfa2dccd3",
      "type": "metricbeat",
      "version": "8.12.0"
    },
    "@timestamp": "2023-11-28T11:20:00.000Z",
    "ecs": {
      "version": "8.0.0"
    },
    "gcp": {
      "cloudsql_postgresql": {
        "database": {
          "num_backends": {
            "count": 2
          }
        }
      },
      "labels": {
        "resource": {
          "database_id": "elastic-observability:postgres",
          "region": "us-central"
        },
        "cloudsql": {
          "name": "postgres",
          "version": "14"
        },
        "metrics": {
          "database": "cloudsqladmin"
        }
      }
    },
    "data_stream": {
      "namespace": "default",
      "type": "metrics",
      "dataset": "gcp.cloudsql_postgresql"
    },
    "service": {
      "type": "gcp"
    },
    "elastic_agent": {
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "version": "8.12.0",
      "snapshot": true
    },
    "host": {
      "hostname": "docker-fleet-agent",
      "os": {
        "kernel": "5.10.102.1-microsoft-standard-WSL2",
        "codename": "focal",
        "name": "Ubuntu",
        "type": "linux",
        "family": "debian",
        "version": "20.04.6 LTS (Focal Fossa)",
        "platform": "ubuntu"
      },
      "containerized": true,
      "ip": [
        "172.22.0.7"
      ],
      "name": "docker-fleet-agent",
      "id": "d03b51e638e64b05b5cf16c41d2058c0",
      "mac": [
        "02-42-AC-16-00-07"
      ],
      "architecture": "x86_64"
    },
    "metricset": {
      "period": 60000,
      "name": "metrics"
    },
    "event": {
      "duration": 1140922692,
      "agent_id_status": "verified",
      "ingested": "2023-11-28T11:23:18Z",
      "module": "gcp",
      "dataset": "gcp.cloudsql_postgresql"
    }
  }

cloudsql_sqlserver

{
    "cloud": {
      "provider": "gcp",
      "account": {
        "name": "elastic-observability",
        "id": "elastic-observability"
      }
    },
    "agent": {
      "name": "docker-fleet-agent",
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "type": "metricbeat",
      "ephemeral_id": "1de59397-2d84-4e9e-8ec2-8becfa2dccd3",
      "version": "8.12.0"
    },
    "@timestamp": "2023-11-28T11:20:00.000Z",
    "ecs": {
      "version": "8.0.0"
    },
    "gcp": {
      "cloudsql_sqlserver": {
        "database": {
          "disk": {
            "write_ops": {
              "count": 110
            },
            "bytes_used": {
              "bytes": 651927552
            },
            "read_ops": {
              "count": 0
            },
            "quota": {
              "bytes": 105089261568
            },
            "utilization": {
              "pct": 0.006203560119015185
            }
          },
          "memory": {
            "quota": {
              "bytes": 27331235840
            },
            "total_usage": {
              "bytes": 3081568256
            },
            "usage": {
              "bytes": 2311217152
            },
            "utilization": {
              "pct": 0.08456321424798038
            }
          },
          "cpu": {
            "usage_time": {
              "sec": 11.019461885036435
            },
            "utilization": {
              "pct": 0.045281727500696436
            },
            "reserved_cores": {
              "count": 4
            }
          },
          "up": 1,
          "network": {
            "received_bytes": {
              "count": 315087
            },
            "connections": {
              "count": 9
            },
            "sent_bytes": {
              "count": 3681782
            }
          },
          "uptime": {
            "sec": 60
          }
        }
      },
      "labels": {
        "resource": {
          "database_id": "elastic-observability:ms-sql",
          "region": "us-west1"
        },
        "cloudsql": {
          "name": "sqlserver",
          "version": "2019_standard"
        }
      }
    },
    "service": {
      "type": "gcp"
    },
    "data_stream": {
      "namespace": "default",
      "type": "metrics",
      "dataset": "gcp.cloudsql_sqlserver"
    },
    "elastic_agent": {
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "version": "8.12.0",
      "snapshot": true
    },
    "host": {
      "hostname": "docker-fleet-agent",
      "os": {
        "kernel": "5.10.102.1-microsoft-standard-WSL2",
        "codename": "focal",
        "name": "Ubuntu",
        "type": "linux",
        "family": "debian",
        "version": "20.04.6 LTS (Focal Fossa)",
        "platform": "ubuntu"
      },
      "ip": [
        "172.22.0.7"
      ],
      "containerized": true,
      "name": "docker-fleet-agent",
      "id": "d03b51e638e64b05b5cf16c41d2058c0",
      "mac": [
        "02-42-AC-16-00-07"
      ],
      "architecture": "x86_64"
    },
    "metricset": {
      "period": 60000,
      "name": "metrics"
    },
    "event": {
      "duration": 1086857363,
      "agent_id_status": "verified",
      "ingested": "2023-11-28T11:23:29Z",
      "module": "gcp",
      "dataset": "gcp.cloudsql_sqlserver"
    }
  }

firestore

{
    "cloud": {
      "provider": "gcp",
      "account": {
        "name": "robust-catalyst-399814",
        "id": "robust-catalyst-399814"
      }
    },
    "agent": {
      "name": "docker-fleet-agent",
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "type": "metricbeat",
      "ephemeral_id": "1de59397-2d84-4e9e-8ec2-8becfa2dccd3",
      "version": "8.12.0"
    },
    "@timestamp": "2023-11-28T11:28:00.000Z",
    "ecs": {
      "version": "8.0.0"
    },
    "gcp": {
      "firestore": {
        "document": {
          "delete": {
            "count": 18
          }
        }
      },
      "labels": {
        "metrics": {
          "module": "__unknown__",
          "version": "__unknown__"
        }
      }
    },
    "data_stream": {
      "namespace": "default",
      "type": "metrics",
      "dataset": "gcp.firestore"
    },
    "service": {
      "type": "gcp"
    },
    "elastic_agent": {
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "version": "8.12.0",
      "snapshot": true
    },
    "host": {
      "hostname": "docker-fleet-agent",
      "os": {
        "kernel": "5.10.102.1-microsoft-standard-WSL2",
        "codename": "focal",
        "name": "Ubuntu",
        "type": "linux",
        "family": "debian",
        "version": "20.04.6 LTS (Focal Fossa)",
        "platform": "ubuntu"
      },
      "containerized": true,
      "ip": [
        "172.22.0.7"
      ],
      "name": "docker-fleet-agent",
      "id": "d03b51e638e64b05b5cf16c41d2058c0",
      "mac": [
        "02-42-AC-16-00-07"
      ],
      "architecture": "x86_64"
    },
    "metricset": {
      "period": 60000,
      "name": "metrics"
    },
    "event": {
      "duration": 397651706,
      "agent_id_status": "verified",
      "ingested": "2023-11-28T11:32:39Z",
      "module": "gcp",
      "dataset": "gcp.firestore"
    }
  }

dataproc

  {
    "cloud": {
      "provider": "gcp",
      "account": {
        "name": "elastic-obs-integrations-dev",
        "id": "elastic-obs-integrations-dev"
      }
    },
    "agent": {
      "name": "docker-fleet-agent",
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "ephemeral_id": "1f9a913c-dd2f-4347-a5fc-5746e4b9f819",
      "type": "metricbeat",
      "version": "8.12.0"
    },
    "@timestamp": "2023-11-28T11:47:00.000Z",
    "ecs": {
      "version": "8.0.0"
    },
    "gcp": {
      "labels": {
        "resource": {
          "cluster_name": "cluster-aa34",
          "cluster_uuid": "89f8d591-cb18-4778-9c14-8be71fb578f1",
          "region": "us-central1"
        },
        "metrics": {
          "operation_type": "START_CLUSTER",
          "error_type": "UNKNOWN_ERROR"
        }
      },
      "dataproc": {
        "cluster": {
          "operation": {
            "failed": {
              "count": 0
            }
          }
        }
      }
    },
    "service": {
      "type": "gcp"
    },
    "data_stream": {
      "namespace": "default",
      "type": "metrics",
      "dataset": "gcp.dataproc"
    },
    "elastic_agent": {
      "id": "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
      "version": "8.12.0",
      "snapshot": true
    },
    "host": {
      "hostname": "docker-fleet-agent",
      "os": {
        "kernel": "5.10.102.1-microsoft-standard-WSL2",
        "codename": "focal",
        "name": "Ubuntu",
        "type": "linux",
        "family": "debian",
        "version": "20.04.6 LTS (Focal Fossa)",
        "platform": "ubuntu"
      },
      "containerized": true,
      "ip": [
        "172.22.0.7"
      ],
      "name": "docker-fleet-agent",
      "id": "d03b51e638e64b05b5cf16c41d2058c0",
      "mac": [
        "02-42-AC-16-00-07"
      ],
      "architecture": "x86_64"
    },
    "metricset": {
      "period": 60000,
      "name": "metrics"
    },
    "event": {
      "duration": 539443435,
      "agent_id_status": "verified",
      "ingested": "2023-11-28T11:50:03Z",
      "module": "gcp",
      "dataset": "gcp.dataproc"
    }
  }

Some data streams have the region/AZ under gcp.labels.
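To illustrate the point above, here is a minimal Go sketch of reading the region from `gcp.labels.resource`. The sample documents use either a `region` key (cloudsql, dataproc) or a `location` key (storage, cloudrun), and the Firestore sample has no resource labels at all. The helper name `resourceRegion` and the lookup order are assumptions made for illustration, not part of the integration.

```go
package main

import "fmt"

// resourceRegion returns the region or location found in the
// gcp.labels.resource object of a document. It checks the two keys that
// appear in the sample documents. Hypothetical helper, for illustration only.
func resourceRegion(resource map[string]string) (string, bool) {
	for _, key := range []string{"region", "location"} {
		if v, ok := resource[key]; ok && v != "" {
			return v, true
		}
	}
	return "", false
}

func main() {
	samples := []struct {
		dataset  string
		resource map[string]string
	}{
		{"gcp.cloudrun_metrics", map[string]string{"service_name": "damien-test-hello", "location": "us-central1"}},
		{"gcp.cloudsql_mysql", map[string]string{"database_id": "elastic-observability:mysql", "region": "us-central"}},
		{"gcp.firestore", map[string]string{}}, // no resource labels in the sample
	}

	for _, s := range samples {
		if region, ok := resourceRegion(s.resource); ok {
			fmt.Printf("%s: %s\n", s.dataset, region)
		} else {
			fmt.Printf("%s: no region or location label\n", s.dataset)
		}
	}
}
```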

@agithomas
Contributor

Hi @gpop63 ,

I have created two resources, as shown below:

image

Two services with the same name, nginx-elastic, have been created in two regions. Please verify that you receive two separate documents and that neither is dropped because the cloud.region field is missing.

@agithomas
Contributor

Two services with the same name, nginx-elastic, have been created in two regions. Please verify that you receive two separate documents and that neither is dropped because the cloud.region field is missing.

I see that the labels include the location information, as shown below. Since labels.* are part of the dimensions, I believe this case is covered.

    "labels": {
      "resource": {
        "revision_name": "damien-test-hello-00001-ped",
        "service_name": "damien-test-hello",
        "location": "us-central1",
        "configuration_name": "damien-test-hello"
      }
    }
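A rough Go sketch of why the location label keeps the two same-named services apart: if the labels feed a fingerprint that is used as a dimension (the TSDB test output in this thread lists `gcp.labels_fingerprint`), then two label sets that differ only in `location` yield different fingerprints. The hashing scheme, the function name `labelsFingerprint`, and the second region `europe-west1` are assumptions for illustration; this is not the integration's actual fingerprint implementation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// labelsFingerprint hashes the labels in sorted key order so that the
// result does not depend on map iteration order.
// Hypothetical scheme, for illustration only.
func labelsFingerprint(labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for _, k := range keys {
		b.WriteString(k)
		b.WriteByte('=')
		b.WriteString(labels[k])
		b.WriteByte('\n')
	}

	sum := sha256.Sum256([]byte(b.String()))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Two services with the same name in two regions (region values assumed).
	first := map[string]string{"service_name": "nginx-elastic", "location": "us-central1"}
	second := map[string]string{"service_name": "nginx-elastic", "location": "europe-west1"}

	fmt.Println("fingerprints differ:", labelsFingerprint(first) != labelsFingerprint(second))
}
```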

@agithomas
Contributor

@gpop63 ,

Can you check firestore? I see no labels here.


      "labels": {
        "metrics": {
          "module": "__unknown__",
          "version": "__unknown__"
        }
      }


@gpop63
Contributor Author

gpop63 commented Nov 29, 2023

@agithomas

Testing Firestore:

Our internal GCP projects use Firestore in Datastore mode, so I had to use my personal GCP account.

Within a GCP project, you can create multiple Firestore databases (this test uses Native mode). Each project comes with a default Firestore database named (default).

I added two additional databases, so my current databases are:

  • (default)
  • mydb2
  • mydb3

image

Command to create a Firestore database:
gcloud alpha firestore databases create --database=<name> --location=<loc> --type=firestore-native --delete-protection

Generating Metrics with Go:

Firestore Library Code

package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"cloud.google.com/go/firestore"
	"google.golang.org/api/iterator"
	"google.golang.org/api/option"
)

func main() {
	ctx := context.Background()

	// Service account key file and the project it belongs to.
	saKey := "robust-catalyst-399814-e6dbc28a6a02.json"
	projectID := "robust-catalyst-399814"

	// NewClient connects to the project's (default) database.
	client, err := firestore.NewClient(ctx, projectID, option.WithCredentialsFile(saKey))
	if err != nil {
		log.Fatalf("Failed to create client: %v", err)
	}

	defer client.Close()

	// Every 3 seconds: add a document, delete it, then read the whole
	// collection, so that write, delete and read metrics are generated.
	for {
		time.Sleep(3 * time.Second)

		docRef, _, err := client.Collection("users").Add(ctx, map[string]interface{}{
			"first": "Ada",
			"last":  "Lovelace",
			"born":  1815,
		})
		if err != nil {
			log.Fatalf("Failed adding a new user: %v", err)
		}

		_, err = docRef.Delete(ctx)
		if err != nil {
			log.Fatalf("Failed deleting user: %v", err)
		}

		iter := client.Collection("users").Documents(ctx)
		for {
			doc, err := iter.Next()
			if err == iterator.Done {
				break // all documents have been read
			}
			if err != nil {
				// Report the error instead of silently ending the read.
				log.Printf("Failed reading users: %v", err)
				break
			}
			fmt.Println(doc.Data())
		}
		iter.Stop()
	}
}

HTTP Requests Code

package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
	"time"

	"golang.org/x/oauth2/google"
)

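// db is the target Firestore database ID (mydb2 or mydb3 in this test).
var db = "mydb2"

// createJWT returns an OAuth2 access token obtained through the service
// account's JWT flow. Despite the name, the returned value is the bearer
// token, not the signed JWT. The token is fetched once and never refreshed,
// and access tokens expire (typically after about an hour).
func createJWT(serviceAccountPath string) (string, error) {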
	data, err := os.ReadFile(serviceAccountPath)
	if err != nil {
		return "", err
	}

	config, err := google.JWTConfigFromJSON(data, "https://www.googleapis.com/auth/datastore")
	if err != nil {
		return "", err
	}

	token, err := config.TokenSource(context.Background()).Token()
	if err != nil {
		return "", err
	}

	return token.AccessToken, nil
}

func getDocuments(url string, data []byte, jwt string) error {
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return err
	}

	req.Header.Set("Authorization", "Bearer "+jwt)
	req.Header.Set("x-goog-request-params", "project_id=robust-catalyst-399814&database_id="+db)
	resp, err := http.DefaultClient.Do(req)

	if err != nil {
		return err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return err
	}

	fmt.Println(len(body))

	return nil
}

func createDocument(url string, data []byte, jwt string) error {
	req, err := http.NewRequest("POST", url, bytes.NewBuffer(data))
	if err != nil {
		return err
	}

	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+jwt)
	req.Header.Set("x-goog-request-params", "project_id=robust-catalyst-399814&database_id="+db)
	resp, err := http.DefaultClient.Do(req)

	if err != nil {
		return err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return err
	}

	fmt.Println(string(body))

	return nil
}

func main() {
	jwt, err := createJWT("robust-catalyst-399814-e6dbc28a6a02.json")
	if err != nil {
		log.Fatal(err)
	}

	for {
		time.Sleep(3 * time.Second)

		err := createDocument(fmt.Sprintf(`https://firestore.googleapis.com/v1/projects/robust-catalyst-399814/databases/%s/documents/artists`, db), []byte(`{"fields": {"first": {"stringValue": "Ada"}, "last": {"stringValue": "Lovelace"}}}`), jwt)
		if err != nil {
			log.Fatal(err)
		}

		err = getDocuments(fmt.Sprintf(`https://firestore.googleapis.com/v1/projects/robust-catalyst-399814/databases/%s/documents/artists`, db), nil, jwt)
		if err != nil {
			log.Fatal(err)
		}
	}

}

I ran two instances of the direct HTTP request program, one for mydb2 and one for mydb3, and one instance of the program that uses the official Firestore Go library against the (default) database. All of them ran for a few hours.
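The two HTTP instances differ only in the database ID that appears in the request URL. A small sketch of a helper that builds that URL, following the same pattern as the `fmt.Sprintf` calls in the program above; the function name `documentsURL` is hypothetical, and the IDs are assumed to be URL-safe, so no escaping is applied.

```go
package main

import "fmt"

// documentsURL builds the Firestore REST endpoint for a collection in a
// given database, using the URL pattern from the test program.
// Hypothetical helper; IDs are assumed to need no URL escaping.
func documentsURL(projectID, databaseID, collection string) string {
	return fmt.Sprintf(
		"https://firestore.googleapis.com/v1/projects/%s/databases/%s/documents/%s",
		projectID, databaseID, collection,
	)
}

func main() {
	// One URL per additional database used in the test.
	for _, db := range []string{"mydb2", "mydb3"} {
		fmt.Println(documentsURL("robust-catalyst-399814", db, "artists"))
	}
}
```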

I also noticed a variation of gcp.labels:

"labels": {
  "metrics": {
    "module": "__unknown__",
    "type": "QUERY",
    "version": "__unknown__"
  }
}

"labels": {
  "metrics": {
    "op": "CREATE",
    "module": "__unknown__",
    "version": "__unknown__"
  }
}

TSDB Test

Testing data stream metrics-gcp.firestore-default.
Index being used for the documents is .ds-metrics-gcp.firestore-default-2023.11.29-000001.
Index being used for the settings and mappings is .ds-metrics-gcp.firestore-default-2023.11.29-000001.

The time series fields for the TSDB index are: 
        - dimension (3 fields):
                - agent.id
                - cloud.account.id
                - gcp.labels_fingerprint
        - gauge (3 fields):
                - gcp.firestore.document.delete.count
                - gcp.firestore.document.read.count
                - gcp.firestore.document.write.count
        - routing_path (3 fields):
                - agent.id
                - cloud.account.id
                - gcp.labels_fingerprint

Index tsdb-index-enabled successfully created.

Copying documents from .ds-metrics-gcp.firestore-default-2023.11.29-000001 to tsdb-index-enabled...
All 442 documents taken from index .ds-metrics-gcp.firestore-default-2023.11.29-000001 were successfully placed to index tsdb-index-enabled.
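For context, an Elasticsearch time series index is configured through the `index.mode` and `index.routing_path` settings. The Go sketch below builds a settings body from the routing path fields listed in the test output. The request the test tool actually sends is not shown in this thread, so the struct and function names are assumptions, and the field mappings (dimension and metric annotations) that a real index also needs are omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// timeSeriesSettings holds the two index settings that switch an index to
// time series mode and define how documents are routed to shards.
type timeSeriesSettings struct {
	Mode        string   `json:"index.mode"`
	RoutingPath []string `json:"index.routing_path"`
}

// createIndexBody is a partial create-index request body (settings only).
type createIndexBody struct {
	Settings timeSeriesSettings `json:"settings"`
}

// tsdbIndexBody returns the JSON settings body for a time series index
// routed on the given dimension fields. Hypothetical helper.
func tsdbIndexBody(routingPath []string) ([]byte, error) {
	return json.Marshal(createIndexBody{
		Settings: timeSeriesSettings{Mode: "time_series", RoutingPath: routingPath},
	})
}

func main() {
	body, err := tsdbIndexBody([]string{
		"agent.id",
		"cloud.account.id",
		"gcp.labels_fingerprint",
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(body))
}
```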

@gpop63
Contributor Author

gpop63 commented Nov 30, 2023

@lalit-satapathy

Regarding the Firestore data stream, @agithomas identified a significant issue: we lack a database or location identifier in both the cloud.* fields and gcp.labels. Even though the TSDB tests pass, documents could still be dropped in some cases.

If a user has multiple Firestore databases and performs the same action (QUERY/LOOKUP/CREATE) on two or more of them at the same time, the resulting documents would share the same dimensions and timestamp, which could lead to dropped documents.

"labels": {
  "metrics": {
    "module": "__unknown__",
    "type": "QUERY",
    "version": "__unknown__"
  }
}

Having several Firestore databases under one account is a new feature currently in preview mode. However, it's safer to not enable TSDB for this data stream until we can add database id and location identifiers.
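The collision described above can be modelled with a short Go sketch. It assumes a simplified view of TSDB in which a document is identified by its dimension values plus its timestamp, and a later document with the same identity is dropped. The types, the `ingest` function, and the sample values are hypothetical; the real behaviour is implemented inside Elasticsearch.

```go
package main

import "fmt"

// seriesKey models the identity of a document: the three dimensions from
// the TSDB test output plus the timestamp.
type seriesKey struct {
	AgentID           string
	AccountID         string
	LabelsFingerprint string
	Timestamp         string
}

// doc is a metric document. Database is NOT part of the key because the
// data stream currently has no field that identifies the database.
type doc struct {
	Key      seriesKey
	Database string
	Count    int
}

// ingest keeps the first document seen for each key and drops the rest.
// Simplified model, for illustration only.
func ingest(docs []doc) (kept, dropped []doc) {
	seen := make(map[seriesKey]bool)
	for _, d := range docs {
		if seen[d.Key] {
			dropped = append(dropped, d)
			continue
		}
		seen[d.Key] = true
		kept = append(kept, d)
	}
	return kept, dropped
}

func main() {
	// The same QUERY action on two databases in the same minute yields the
	// same labels, hence the same fingerprint (placeholder value here).
	key := seriesKey{
		AgentID:           "8442dd66-0f0b-4b54-93f6-6dcf25f45296",
		AccountID:         "robust-catalyst-399814",
		LabelsFingerprint: "fingerprint-of-QUERY-labels",
		Timestamp:         "2023-11-28T11:28:00.000Z",
	}
	docs := []doc{
		{Key: key, Database: "mydb2", Count: 18},
		{Key: key, Database: "mydb3", Count: 7},
	}

	kept, dropped := ingest(docs)
	fmt.Printf("kept %d, dropped %d\n", len(kept), len(dropped))
}
```

Adding a database or location identifier to the key would give the two documents distinct identities, which is the reason given above for postponing TSDB on this data stream.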

@agithomas
Contributor

Having several Firestore databases under one account is a new feature currently in preview mode. However, it's safer to not enable TSDB for this data stream until we can add database id and location identifiers

@gpop63, can you create a backlog issue for Firestore and list it in the meta issue under the Blocked packages section? Please include the reason for the blocked status, as described in your previous comment.

This will help the team be aware of the pending TSDB packages/dataset.

Contributor

@agithomas agithomas left a comment


LGTM!

@agithomas
Contributor

@ishleenk17 , can you please do a final review of the PR?

@lalit-satapathy
Collaborator

@gpop63, can you create a backlog issue for Firestore and list it in the meta issue under the Blocked packages section? Please include the reason for the blocked status, as described in your previous comment.

This will help the team be aware of the pending TSDB packages/dataset.

Agreed. @gpop63, I am fine with keeping "firestore" as a backlog item. Let's also update the top-level gcp issue with the details so that it gives a clear picture at a glance.

@lalit-satapathy
Collaborator

Having several Firestore databases under one account is a new feature currently in preview mode. However, it's safer to not enable TSDB for this data stream until we can add database id and location identifiers.

Are these new fields that need to be added? We can track in the backlog issue what needs to be done to get TSDB working on this data stream.

@gpop63
Contributor Author

gpop63 commented Nov 30, 2023

Are these new fields that need to be added? We can track in the backlog issue what needs to be done to get TSDB working on this data stream.

Yes, we would have to add new fields in Beats specifically for the firestore data stream. I'm not sure yet whether this is possible; one option might be to collect additional metadata about each Firestore database instance. I will look into it.

@gpop63
Contributor Author

gpop63 commented Dec 4, 2023

@elastic/obs-cloud-monitoring could someone please take a look? The TSDB review was done by @agithomas.

@gpop63 gpop63 merged commit 98f4a01 into elastic:main Dec 6, 2023
4 checks passed
@elasticmachine

Package gcp - 2.32.1 containing this change is available at https://epr.elastic.co/search?package=gcp

8 participants