
Common Fields for Container Inventory Schema #22179

Open
jsoriano opened this issue Oct 27, 2020 · 22 comments
Assignees: jsoriano, MichaelKatsoulis
Labels: ext-goal (External goal of an iteration), meta, Team:Platforms (Label for the Integrations - Platforms team)

Comments

@jsoriano
Member

jsoriano commented Oct 27, 2020

This issue tracks work related to the definition of common fields for container inventory schema.

The output will be a set of recommended or required fields to be added to any event related to containers.

The purpose of these fields is to have a minimal set of valuable data that can be used for inventory. The focus will be on metadata and metrics fields.

Integrations possibly affected:

Related issues

@jsoriano jsoriano added the meta, Team:Platforms (Label for the Integrations - Platforms team), and ext-goal (External goal of an iteration) labels on Oct 27, 2020
@elasticmachine
Collaborator

Pinging @elastic/integrations-platforms (Team:Platforms)

@jsoriano jsoriano self-assigned this Oct 27, 2020
@exekias
Contributor

exekias commented Oct 27, 2020

We may also want to check what CloudWatch/Stackdriver/Azure Monitor provide out of the box.

@kaiyan-sheng
Contributor

@jsoriano @ChrsMark Question about docker network metrics: I see we have both docker.network.in.bytes and docker.network.inbound.bytes. My understanding is docker.network.in.bytes is a gauge and docker.network.inbound.bytes is a counter?

Also do you think it's useful to calculate one value container.network.ingress.bytes to represent an aggregated value across all network interfaces?

@jsoriano
Member Author

jsoriano commented Feb 5, 2021

My understanding is docker.network.in.bytes is a gauge and docker.network.inbound.bytes is a counter?

Yes, docker.network.in.bytes is actually bytes per second during the last collection period, calculated from the current and previous values of docker.network.inbound.bytes. The idea was to deprecate docker.network.in.bytes, but this could be reconsidered.
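For illustration, the derivation is along these lines (a minimal sketch; the function name and the counter-reset handling are assumptions, not the actual Metricbeat code):

```python
def bytes_per_second(prev_counter, curr_counter, period_seconds):
    """Derive a rate gauge (docker.network.in.bytes) from two consecutive
    samples of a monotonic counter (docker.network.inbound.bytes)."""
    delta = curr_counter - prev_counter
    if delta < 0:
        # Counter reset (e.g. the container restarted); fall back to the
        # current value as a best effort. This handling is an assumption.
        delta = curr_counter
    return delta / period_seconds

# Two samples taken 10 seconds apart:
rate = bytes_per_second(1_000_000, 1_250_000, 10)  # -> 25000.0 bytes/s
```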

Also do you think it's useful to calculate one value container.network.ingress.bytes to represent an aggregated value across all network interfaces?

Yes, this can be interesting, especially for inventory UIs.

@kaiyan-sheng
Contributor

@simianhacker Does the UI do any aggregation right now for docker network metrics?

@sorantis
Contributor

@neptunian FYI

@simianhacker
Member

@kaiyan-sheng Yes... we use docker.network.inbound.bytes and docker.network.outbound.bytes, and we treat them as counters.

@kaiyan-sheng
Contributor

@simianhacker Thanks! Is Kibana doing aggregation across all network interfaces for these two counters?

@simianhacker
Member

Yes, we take the derivative of the max of each interface and sum the rates together.
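As a rough sketch, that aggregation could be expressed like this in Elasticsearch query DSL (written as a Python dict; the interval, terms size, and overall shape are assumptions, not the exact Kibana query):

```python
# Derivative of the max counter value per interface; the resulting
# per-interface rates are then summed for each time bucket.
aggs = {
    "interfaces": {
        "terms": {"field": "docker.network.interface", "size": 50},
        "aggs": {
            "timeseries": {
                "date_histogram": {"field": "@timestamp", "fixed_interval": "10s"},
                "aggs": {
                    "max_bytes": {"max": {"field": "docker.network.inbound.bytes"}},
                    "rate": {"derivative": {"buckets_path": "max_bytes"}},
                },
            },
        },
    },
}
# The per-interface "rate" values are then added together per time bucket,
# either client-side or with a pipeline aggregation.
```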

@kaiyan-sheng
Contributor

Proposing new container fields to add into ECS:

  • container.cpu.usage: Percent CPU, normalized by the number of CPU cores; it ranges from 0 to 1.
    (same as docker.cpu.total.norm.pct)
  • container.memory.usage: Memory usage percentage.
    (same as docker.memory.usage.pct)
  • container.disk.read.bytes: Bytes read during this collection period.
    (derived from docker.diskio.read.bytes)
  • container.disk.write.bytes: Bytes written during this collection period.
    (derived from docker.diskio.write.bytes)
  • container.network.ingress.bytes: The number of bytes (gauge) received on all network interfaces (aggregated) by the container in a given period of time.
    (derived from docker.network.inbound.bytes)
  • container.network.egress.bytes: The number of bytes (gauge) sent out on all network interfaces (aggregated) by the container in a given period of time.
    (derived from docker.network.outbound.bytes)
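A minimal sketch of how the aggregated ingress value could be computed from per-interface counter snapshots (the function and snapshot shape are hypothetical, not the actual module code):

```python
def network_ingress_bytes(prev_snapshot, curr_snapshot):
    """Sum the bytes received on all interfaces during one collection period.
    Snapshots are hypothetical {interface: counter} mappings sampled from
    docker.network.inbound.bytes."""
    total = 0
    for iface, curr in curr_snapshot.items():
        prev = prev_snapshot.get(iface, 0)
        total += max(curr - prev, 0)  # guard against counter resets
    return total

# eth0 received 2000 bytes and eth1 received 500 bytes this period:
ingress = network_ingress_bytes(
    {"eth0": 10_000, "eth1": 4_000},
    {"eth0": 12_000, "eth1": 4_500},
)  # -> 2500
```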

@simianhacker Will container.network.ingress.bytes and container.network.egress.bytes be useful to the UI?

@ChrsMark @jsoriano @sorantis I would like to hear your opinion on this before starting an RFC in ECS. Thanks!!

@sorantis
Contributor

Aligning container and host ECS metric fields sounds like a good start.
I'm wondering if it's possible to extend the list in a generic way with:

  • container.state (up/down)
  • container.uptime

Question about container.memory.usage. What memory metric will this field be derived from? For both kubernetes and docker we collect memory usage as well as memory rss metrics. Based on the proposed mapping we will only be exposing memory usage at the ECS level. Should we also consider adding memory rss to the list? Both metrics are reported by our Kubernetes/Docker/cgroup integrations.

@jsoriano
Member Author

I would like to hear your opinion on this before starting an RFC in ECS.

LGTM, thanks!

I'm wondering if it's possible to extend the list in a generic way with:

  • container.state (up/down)
  • container.uptime

It could be interesting to add them, but we would need to define them well. For the state we would need to define the valid values, which can differ between platforms, and decide whether healthchecks should be considered. Uptime can also differ between platforms; perhaps we should report creation and/or start times instead.

Question about container.memory.usage. What memory metric will this field be derived from? For both kubernetes and docker we collect memory usage as well as memory rss metrics. Based on the proposed mapping we will only be exposing memory usage at the ECS level. Should we also consider adding memory rss to the list? Both metrics are reported by our Kubernetes/Docker/cgroup integrations.

As these metrics are going to be used for UIs and inventory purposes I think we should keep it simple and report a single value for memory usage. I guess that other more specific metrics will still be available in reported events (depending on the availability in the platform), so users can still check them if needed.

Regarding the memory metric to derive the common field from, I think it should be derived from the same metric the platform uses to enforce memory limits. This way what users see is consistent with how the platform behaves. For example in #25428 we saw that the only metric kubernetes reports for Windows containers is workingSetBytes, and we decided to use this to calculate the memory percentage usage.
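For illustration, that derivation is just the enforced metric divided by the limit (a sketch; names are illustrative and the no-limit behavior is an assumption):

```python
def memory_usage_pct(working_set_bytes, memory_limit_bytes):
    """Derive container.memory.usage from the metric the platform uses to
    enforce limits (e.g. workingSetBytes for Kubernetes Windows containers)."""
    if not memory_limit_bytes:
        return None  # no limit configured; a percentage is not well defined
    return working_set_bytes / memory_limit_bytes

memory_usage_pct(256 * 1024**2, 512 * 1024**2)  # -> 0.5
```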

@sorantis
Contributor

+1 on defining the state/uptime metrics well before adding them to ECS.

The reason I was asking about including RSS is that I'm seeing it used as an equally important metric for monitoring container memory. We can review this once we have more customer feedback.

@ChrsMark
Member

ChrsMark commented Jul 14, 2021

@kaiyan-sheng @sorantis I guess this will help us solve elastic/kibana#100229, right?

@kaiyan-sheng
Contributor

@ChrsMark Yes, I think this is a perfect use case for defining and adopting the inventory schema.

@ChrsMark ChrsMark mentioned this issue Jul 22, 2021
@kaiyan-sheng
Contributor

Hi @akshay-saraswat and @ChrsMark, here are the two issues I created:
Testing new container fields in docker: elastic/integrations#2119
Testing new container fields in kubernetes: elastic/integrations#2120

@ChrsMark
Member

@MichaelKatsoulis since you are planning to work on this, do you think we can take ownership? @jsoriano @kaiyan-sheng any objections?

@jsoriano
Member Author

No objections 🙂 Thanks a lot!

@MichaelKatsoulis MichaelKatsoulis self-assigned this Nov 15, 2021
@simianhacker
Member

Will container.network.ingress.bytes and container.network.egress.bytes be useful to the UI?

Yes

@MichaelKatsoulis
Contributor

Continuing the discussion on this.
Currently we have suggested the following ECS fields:

container.cpu.usage
container.memory.usage
container.network.ingress.bytes
container.network.egress.bytes
container.disk.read.bytes
container.disk.write.bytes

After looking at the Metrics UI Inventory page, I can see that the dropdown list has Kubernetes Pods hardcoded, using fields from kubernetes.pod.*, and also Docker Containers, which uses fields from docker.* populated by the docker module. More specifically, the fields used are:

docker.cpu.total.pct
docker.memory.usage.pct
docker.network.inbound.bytes
docker.network.interface
docker.network.outbound.bytes
docker.diskio.read.bytes
docker.diskio.write.bytes
docker.diskio.read.ops
docker.diskio.write.ops

Comparing the CPU percentages of the docker containers and kubernetes pods inventories, I noticed wide differences for the same containers. Investigating this, I figured out that for kubernetes pods we use a value normalized by the number of node CPUs for the pod CPU percentage (kubernetes.pod.cpu.usage.node.pct), while for docker the non-normalized value is used (docker.cpu.total.pct), although we also calculate docker.cpu.total.norm.pct. The difference is that the non-normalized value equals the normalized one multiplied by the number of node CPUs.
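In other words (a sketch of the relationship only; the function name is illustrative):

```python
def normalized_cpu_pct(total_pct, node_cpus):
    """Convert the non-normalized CPU percentage (docker.cpu.total.pct, which
    can exceed 1.0 on multi-core nodes) into the normalized 0..1 value
    (docker.cpu.total.norm.pct)."""
    return total_pct / node_cpus

normalized_cpu_pct(2.4, 4)  # -> 0.6, i.e. 60% of the node's total CPU
```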

The best step forward would be to make the docker containers inventory generic. We could rename it to Containers and use only ECS fields.

Those fields would be populated by docker when the docker runtime is used, by kubelet when we are on k8s, or by containerd when the containerd runtime is used.

As different modules can populate the same fields at the same time, it is vital that all the modules (currently kubernetes and docker, and in the future containerd as well) report the same values for those fields.

So I suggest that the container.cpu.usage ECS field be renamed to container.cpu.usage.pct, to make clearer what it is, and that it hold the normalized CPU percentage.
It would also be nice to add some more ECS fields so we can drive the Docker Containers inventory with ECS fields only.

Those additional fields are:

container.memory.usage.pct
container.disk.write.ops
container.disk.read.ops
container.network.interface

@ChrsMark
Member

ChrsMark commented Nov 23, 2021

Good job breaking this down, @MichaelKatsoulis!

Regarding the new fields you propose, can you also share the types for them? We can upvote for them in this issue and then I guess we can go ahead and open a PR to ECS.

Regarding the views, I find your approach valid. The Docker view could be renamed to Containers and use container ECS fields, which would be populated mainly by the docker and containerd modules. In addition, this view could also be populated by the k8s module, but we need to think about whether the k8s module will report kubernetes.* fields, container.* ECS fields, or both. I think we would need both, since the Kubernetes view should be populated only when we have actual k8s metrics, and not just metrics about containers that could be coming from the runtime alone. Also, what will happen if both the k8s and docker modules are running? Which one would populate the Containers view?

@jasonrhodes I think the UI team needs to pair with us on this decision-making. Is anyone available to work on this?

@MichaelKatsoulis
Contributor

Regarding the new fields you propose, can you also share the types for them? We can upvote for them in this issue and then I guess we can go ahead and open a PR to ECS.

@ChrsMark,
The types of the new fields are:

container.memory.usage.pct        type: scaled_float        format: percent
container.disk.write.ops          type: long
container.disk.read.ops           type: long
container.network.interface       type: keyword

and the updated container.cpu.usage.pct is type: scaled_float
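For reference, the proposed mappings could be summarized like this (a flat Python dict for readability; note that scaled_float also requires a scaling_factor in Elasticsearch, and the value below is only an assumption):

```python
proposed_container_fields = {
    "container.cpu.usage.pct":     {"type": "scaled_float", "scaling_factor": 1000},
    "container.memory.usage.pct":  {"type": "scaled_float", "scaling_factor": 1000},
    "container.disk.write.ops":    {"type": "long"},
    "container.disk.read.ops":     {"type": "long"},
    "container.network.interface": {"type": "keyword"},
}
```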

I also believe that the k8s module should populate both sets of fields (kubernetes.* and container.*).
The Kubernetes view will remain as is and use the values from kubernetes.*.
Regarding what happens when both the docker and k8s modules populate the same field, I think we will be OK as long as the values match.
Or is there a way in Kibana to specify priorities, such as which dataset the fields should come from? That way, if the docker or containerd modules are running, the view would use their fields; if not, it would use the ones from kubernetes. Maybe @jasonrhodes could answer this.
