[MongoDB Atlas] Add Hardware data stream #9689

Merged
merged 4 commits into from May 9, 2024

Conversation

niraj-elastic
Contributor

@niraj-elastic niraj-elastic commented Apr 23, 2024

What does this PR do?

  • Added 1 data stream (Hardware Metrics).
  • Added data collection logic for the data streams.
  • Added the ingest pipeline for the data streams.
  • Mapped fields according to the ECS schema and added Fields metadata in the appropriate YAML files.
  • Added dashboards and visualizations.
  • Added system test cases for the data stream.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

How to test this PR locally

  • Clone the integrations repo.
  • Install elastic-package locally.
  • Start the Elastic Stack using elastic-package.
  • Move to the integrations/packages/mongodb_atlas directory.
  • Run the tests with the following command: elastic-package test

Screenshots

mongodb-atlas-hardware-dashboard

@niraj-elastic niraj-elastic requested a review from a team as a code owner April 23, 2024 17:19
@elasticmachine

🚀 Benchmarks report

To see the full report comment with /test benchmark fullreport

Comment on lines 24 to 25
- `mongod_database`: This data stream collects a running log of events, including entries such as incoming connections, commands run, and issues encountered. Generally, database log messages are useful for diagnosing issues, monitoring your deployment, and tuning performance.
- `process`: This data stream collects host metrics per process for all the hosts of the specified group. Metrics like measurements for the host, such as CPU usage, number of I/O operations and memory are available on this data stream.
Member

Suggested change
- `mongod_database`: This data stream collects a running log of events, including entries such as incoming connections, commands run, and issues encountered. Generally, database log messages are useful for diagnosing issues, monitoring your deployment, and tuning performance.
- `process`: This data stream collects host metrics per process for all the hosts in the specified group. Metrics like measurements for the host, such as CPU usage, number of I/O operations, and memory usage are available in this data stream.

@@ -101,6 +100,13 @@ This is the `mongod_database` data stream. This datastream collects a running lo

## Metrics reference

### Hardware
This data stream collects hardware and status metrics per process of the specified group. Metrics like measurements for the hardware and status, such as CPU usage and JVM memory usage are available on this data stream.
Member

Suggested change
This data stream collects hardware and status metrics for each process in the specified group. It includes measurements such as CPU usage, memory consumption, JVM memory usage, disk usage, etc.

@@ -1,4 +1,9 @@
# newer versions go on top
- version: "0.0.4"
changes:
- description: MongoDB Atlas integration package with "hardware" data stream.
Member

Suggested change
- description: Add "hardware" data stream to MongoDB Atlas package.

value: mongodb_atlas
- set:
field: event.category
value: ["driver"]
Member

Suggested change
value: ["driver"]
value: ["driver"]

Shouldn't this category be "database"?

https://www.elastic.co/guide/en/ecs/current/ecs-allowed-values-event-category.html#ecs-event-category-database

Find the correct event.type accordingly, too.
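
For illustration, the reviewer's suggestion could look roughly like the processors below. This is only a sketch: the `database` category comes from the linked ECS page, while the `info` event type is an assumption (it is one of the event types ECS lists for that category) and is not confirmed anywhere in this thread.

```yaml
# Sketch only: the event.type value is an assumption, not taken from the PR.
- set:
    field: event.category
    value: ["database"]
- set:
    field: event.type
    value: ["info"]
```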

Comment on lines 59 to 66
- rename:
field: status.JVM_MAX_MEMORY
target_field: mongodb_atlas.hardware.jvm.memory.heap.available.mb
ignore_missing: true
- rename:
field: status.JVM_CURRENT_MEMORY
target_field: mongodb_atlas.hardware.jvm.memory.heap.used.mb
ignore_missing: true
Member

Don't you think we should stick to mongodb_atlas.hardware.jvm.max.memory and mongodb_atlas.hardware.jvm.current.memory? I know the current names are correct by definition, but what's the problem with keeping the target fields similar to the source fields?

Also, the docs don't mention the unit. The unit could probably be anything, so we shouldn't assume that it is always mb. There's also a difference between {m,M}{b,B}, i.e., megabits and megabytes. We should avoid this.

Contributor Author

The problem with mongodb_atlas.hardware.jvm.max.memory is that, even though it is similar to the source field, its name is very different from the definition: the metric itself collects the total amount of available memory in the JVM heap. So including max in its name and excluding available may confuse users. mongodb_atlas.hardware.jvm.current.memory does not have the same problem, so we can use something like mongodb_atlas.hardware.jvm.memory.heap.current.mb. The main issue with keeping field names similar to the source fields is that we have our own approach to field names, such as using suffixes and grouping similar fields together, which results in a distinction between the source field name and our field name.

We receive the unit type of all the fields in the raw response we get from MongoDB Atlas, so we have a valid source to identify that these fields are in megabytes. But I understand your concern with Mb and MB; to solve that, we can add Megabytes to the description of the fields.

Let me know your thoughts.

Member

I agree with your first point that the definition does not exactly match the field name. We can keep it as it is, then. But we have to remove the unit.

> We receive the unit type of all the fields in raw response we get from MongoDB Atlas.

But it supports other types too. Did you test it in a setup where GBs of memory are available to the JVM? What if the response then has gb as the unit? I could find the unit neither in the field nor in the definition. My +1 would be to remove the unit.

> but I understand your concern with Mb & MB, to solve that we can add Megabytes in the description of the fields.

Yeah, cool. Thanks!

Contributor Author

> But it supports other types too. Did you test it in a setup where GB's of memory is available to JVM?

I agree with your point. Let me remove the unit type.

Contributor Author

Hey @shmsr, I checked the behavior of the metric. Even if the size of the data increases, the metric is still presented in its specified unit. Below is an example.
image

Member

Thanks for checking.

description: Average rate of page faults on this process per second over the selected sample period.
- name: process_id
type: keyword
description: Combination of hostname and Internet Assigned Numbers Authority (IANA) port that serves the MongoDB process.
Member

I think we should not use "IANA port" in the description. Pretty sure people run it on ports other than the IANA-assigned port, i.e., 27017.

Better to use something like: "MongoDB process port"
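
As a rough sketch of that suggestion, the field definition might be reworded as follows. The exact description text is an assumption built from the reviewer's proposed phrase; the thread does not show the final wording.

```yaml
# Sketch only: description wording is assumed from the review comment.
- name: process_id
  type: keyword
  description: Combination of hostname and MongoDB process port that serves the MongoDB process.
```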

metric_type: gauge
unit: percent
description: Percentage of time that the CPU spent servicing user calls for the search process.
- name: jvm
Member

Shouldn't this be under a status group? page_faults too?

Contributor Author

I can create one group for status under hardware.

Member

Sure.
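
A minimal sketch of what a `status` group under `hardware` could look like in the fields file is shown below. The nesting is inferred from this thread and from the rename targets discussed earlier; the `type` and `metric_type` values marked as assumed are hypothetical and do not come from the PR.

```yaml
# Sketch only: the nesting is inferred from the review thread, and values
# marked "assumed" do not appear in the PR.
- name: status
  type: group
  fields:
    - name: jvm
      type: group
      fields:
        - name: memory
          type: group
          fields:
            - name: heap
              type: group
              fields:
                - name: available.mb
                  type: long
                - name: used.mb
                  type: long
    - name: page_faults
      type: double          # assumed
      metric_type: gauge    # assumed
      description: Average rate of page faults on this process per second over the selected sample period.
```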

Comment on lines 77 to 84
- name: available.mb
type: long
metric_type: counter
description: Total amount of available memory in the JVM heap.
- name: used.mb
type: long
metric_type: gauge
description: Amount of memory that the JVM heap is currently using.
Member

Units are not mentioned here.

Contributor Author

We don't have mb as a supported unit currently.

Member

Probably it should be byte.

As per this, they call it "byte value".

See if you can verify? I did not double-check.

Contributor Author

I think setting the unit to byte will create confusion for users: if the real data is not in bytes, the README will still show it as byte. Also, I cannot find any solid documentation that defines the unit type byte and says whether it can be used for mb.

Member

Okay, cool.
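
One way to reflect the outcome of this thread, sketched under the assumption that the unit is documented in the description (as proposed earlier in the review) rather than in a `unit` key, might be:

```yaml
# Sketch only: descriptions are reworded to state the unit; `unit` is omitted
# because the thread notes that mb is not a supported unit value.
- name: available.mb
  type: long
  metric_type: counter  # as in the diff under review
  description: Total amount of available memory in the JVM heap, in megabytes.
- name: used.mb
  type: long
  metric_type: gauge
  description: Amount of memory that the JVM heap is currently using, in megabytes.
```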

@@ -0,0 +1,54 @@
title: Collect Hardware metrics from MongoDB Atlas
type: logs
Member

I thought it was metrics?

Contributor Author

Yes, the data we are collecting is metrics, but we are using the CEL input here, which falls under Filebeat. So unfortunately the type here cannot be changed.

Member

Cool!

default: 10m
multi: false
required: true
show_user: false
Member

Ideally, shouldn't the period be shown to the user by default?
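
A sketch of the change the reviewer is asking about follows. Only `default`, `multi`, `required`, and `show_user` appear in the diff above; the `name`, `type`, and `title` values are assumptions added so the fragment is complete.

```yaml
# Sketch only: `name`, `type`, and `title` are assumed; the other keys are from the diff.
vars:
  - name: period
    type: text
    title: Period
    default: 10m
    multi: false
    required: true
    show_user: true  # flipped from false so the setting is visible by default
```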

@shmsr shmsr added the enhancement New feature or request label May 3, 2024
type: group
fields:
- name: group_id
description: Identifier for the project of the event.
Contributor

Can we replace the description for group_id with "Unique identifier that identifies the project."?


Data streams:
- `hardware`: This data stream collects all the Atlas Search hardware and status data series within the provided time range for one process in the specified project.
Contributor

Can you help me understand the term "Atlas Search hardware and status data series"? While going through the Atlas documentation, I see it has search metrics and hardware metrics. Are we referring to the search metrics and hardware metrics?

Contributor Author

No, we are not referring to search and hardware metrics here. Here is documentation which will help you understand the hardware and status metrics.

@niraj-elastic niraj-elastic requested review from a team as code owners May 8, 2024 06:59
@shmsr
Member

shmsr commented May 8, 2024

@niraj-elastic Hey, let's fix the merge conflicts and rebase on main, as other changes also came into this PR. Because of the unrelated changes, multiple teams are tagged on this. For now, I'm unassigning the ones I can until this is fixed.

@shmsr shmsr removed request for gizas and constanca-m May 8, 2024 07:19
@shmsr shmsr changed the title [MongoDB Atlas] Hardware data stream [MongoDB Atlas] Add Hardware data stream May 8, 2024
@shmsr shmsr removed request for a team May 8, 2024 09:41
@shmsr
Member

shmsr commented May 8, 2024

Removed review requests for unrelated teams. Also, it looks like you have addressed all the review comments. The changes look good.

@elasticmachine

💚 Build Succeeded

History

cc @niraj-elastic

Contributor

@muthu-mps muthu-mps left a comment

LGTM!

@shmsr shmsr merged commit 47b76fa into elastic:main May 9, 2024
5 checks passed
@elasticmachine

Package mongodb_atlas - 0.0.4 containing this change is available at https://epr.elastic.co/search?package=mongodb_atlas
