Sudden high cpu usage from stackdriver-metadata-agent-cluster-level pod #95

babygoat · 2021-06-11T10:02:09Z

I would like to report an issue with the stackdriver-metadata-agent on our production GKE cluster (1.18.17-gke.700) with Cloud Logging and Cloud Monitoring enabled. The node machine type is n1-standard-1 (1 vCPU, 3.75 GB memory).

A few days ago (2021-06-08 9:31:xx GMT+08:00), the CPU usage of the stackdriver-metadata-agent-cluster-level pod suddenly increased drastically. As a result, my production services on the same node suffered severe timeouts. See the attached CPU chart for reference.

The containers within the pod are:
metadata-agent: gcr.io/stackdriver-agents/metadata-agent-go:1.2.0
metadata-agent-nanny: gke.gcr.io/addon-resizer:1.8.11-gke.1

During that time, the containers reported no suspicious logs.

metadata-agent logs

Since I could not find the corresponding repository for the metadata agent, I would like to know whether a similar CPU load issue has been raised before and whether there are any possible resolutions. Without a concrete root cause, I'm concerned that it might happen again. If this report should instead be filed in a different repository, please let me know.

Thanks for your consideration!
