Support monitoring DynamoDB. #10380

Closed
yswdqz wants to merge 19 commits into apache:master from yswdqz:DynamoDB

Conversation

@yswdqz
Member

@yswdqz yswdqz commented Feb 11, 2023

  • If this is non-trivial feature, paste the links/URLs to the design doc.

  • Update the documentation to include this new feature.

  • Tests(including UT, IT, E2E) are added to verify the new feature.

  • If it's UI related, attach the screenshots below.

  • If this pull request closes/resolves/fixes an existing issue, replace the issue number. Closes #.

  • Update the CHANGES log.

This PR is not finished yet, but I've encountered some problems, so I'm sending this PR first in order to ask some questions and report on my progress.

Problems:

  1. I designed the UI with reference to Amazon CloudWatch, which has a concept called PERIOD. Is there a similar function in MAL?
    image
    doc is here
  2. The read usage chart looks like this:
    image
    But the real data is always 1:
    image
    I think it should be a horizontal line rather than a diagonal one. Is this caused by the avg function in my MAL expression?
    (Also, I'm still not familiar with the metric model. Are there any articles that explain it in detail?)
  3. About how to set the service level and instance level: DynamoDB is organized around tables, so I set the table as the instance level and the AWS account ID as the service level. Is that OK?
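
For readers unfamiliar with the term in problem 1: a CloudWatch period is the length of the time window over which a statistic is computed. The Python sketch below illustrates that idea under simple assumptions. It is not MAL and not CloudWatch's implementation, and the function name `aggregate_by_period` is hypothetical. It also shows that averaging a series whose value is always 1 gives a flat line, whereas a running total of the same series climbs steadily. Whether a running total is what produces the diagonal line in the screenshot is not known from this thread.

```python
from collections import defaultdict
from itertools import accumulate
from statistics import mean


def aggregate_by_period(points, period_seconds, statistic=mean):
    """Group (epoch_seconds, value) points into fixed-length periods.

    Hypothetical helper for illustration only: each point is assigned to the
    period that starts at the largest multiple of `period_seconds` not
    exceeding its timestamp, and `statistic` is applied to each period.
    """
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    buckets = defaultdict(list)
    for timestamp, value in points:
        period_start = timestamp - (timestamp % period_seconds)
        buckets[period_start].append(value)
    return [(start, statistic(values)) for start, values in sorted(buckets.items())]


# One data point per minute for ten minutes, always with the value 1.
points = [(t, 1) for t in range(0, 600, 60)]

# Average per 5-minute period: stays at 1 (a horizontal line).
flat = aggregate_by_period(points, 300)

# Sum per 5-minute period: 5 in each period (still horizontal, different scale).
sums = aggregate_by_period(points, 300, statistic=sum)

# Running total of the raw values: 1, 2, 3, ... (a diagonal line).
ramp = list(accumulate(value for _, value in points))
```

Comparing `flat` with `ramp` is one way to check whether a chart is plotting a per-period statistic or an accumulated value.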

TODO:

  1. set up service level dashboard.
  2. test the remaining metrics.
    Metrics I have not tested yet:
    ①read/write_throttled_requests
    ②read/write_throttle_events
    ③read/write system error
    ④user error
    ⑤conditional_check_failed_requests
    ⑥transaction_conflict
  3. add ui menu
  4. remove redundant comments

@yswdqz yswdqz added backend OAP backend related. enhancement Enhancement on performance or codes AWS AWS Infrastructure Monitoring labels Feb 11, 2023
Comment thread docs/menu.yml Outdated
path: "/en/setup/backend/backend-mysql-monitoring"
- name: "PostgreSQL Server"
path: "/en/setup/backend/backend-postgresql-monitoring"
- name: "AWS-DynamoDB Server"
Member

Suggested change
- name: "AWS-DynamoDB Server"
- name: "AWS DynamoDB"

This should be placed in the AWS monitoring too.

image

Also, you should update the UI repository to add this feature in two places (the database and AWS catalogs).

1. Enable [AWS CloudWatch](https://aws.amazon.com/cn/cloudwatch/)
2. Create [Amazon Kinesis Data Filehose](https://aws.amazon.com/cn/kinesis/data-firehose/), set source to `Direct PUT`, set destination to `HTTP Endpoint`, and set `HTTP EndPoint url` to `aws-firehose-receiver`'s port (refer to [aws-firehose-receiver](aws-firehose-receiver.md))

Note that AWS requires that the `HTTP Endpoint url` must be HTTPS and the port needs to be 443, so you can load the certificate in [aws-firehose-receiver](aws-firehose-receiver.md) and set the port to 443. Also, you can use another gateway to accept the request and route it to `aws-filehose-receiver`.
Member

Suggested change
Note that AWS requires that the `HTTP Endpoint url` must be HTTPS and the port needs to be 443, so you can load the certificate in [aws-firehose-receiver](aws-firehose-receiver.md) and set the port to 443. Also, you can use another gateway to accept the request and route it to `aws-filehose-receiver`.
Note that AWS requires that the `HTTP Endpoint URL` must be through HTTPS listening at 443, therefore need to load the certificate in [aws-firehose-receiver](aws-firehose-receiver.md) and set the port to 443.
Or, you can use another gateway to accept the requests and route them to `aws-filehose-receiver`.

I remember @pg-yang mentioned that listening on 443 requires the process to run under a particular OS user group. Should we mention this? @kezhenxu94
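
For context on what the receiver or an intermediate gateway has to accept: to my understanding of the Firehose HTTP endpoint delivery format, each delivery is a POST whose JSON body carries a `requestId`, a `timestamp` in epoch milliseconds, and a `records` list with base64-encoded `data`, and the endpoint is expected to reply with the same `requestId` and its own processing `timestamp`. The Python sketch below only illustrates that exchange. It is not the `aws-firehose-receiver` implementation, and the function name and `now_ms` parameter are hypothetical.

```python
import base64
import json
import time


def handle_firehose_delivery(body, now_ms=None):
    """Decode one Firehose HTTP endpoint delivery and build the reply.

    Hypothetical helper for illustration. `body` is the raw JSON request
    body; `now_ms` lets a caller fix the reply timestamp (useful in tests).
    Returns (payloads, response), where payloads are the decoded record
    bytes and response is the JSON-serializable acknowledgement.
    """
    request = json.loads(body)
    # Each record carries its payload as a base64 string under "data".
    payloads = [base64.b64decode(record["data"]) for record in request["records"]]
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    # The acknowledgement echoes the request ID so Firehose can match it.
    response = {"requestId": request["requestId"], "timestamp": now_ms}
    return payloads, response


# A made-up delivery with a single record, for illustration only.
sample_body = json.dumps({
    "requestId": "example-request-id",
    "timestamp": 1676100000000,
    "records": [{"data": base64.b64encode(b"metric-bytes").decode("ascii")}],
}).encode("utf-8")

payloads, response = handle_firehose_delivery(sample_body, now_ms=1676100000500)
```

A gateway that terminates HTTPS on 443 can forward the body unchanged, since the acknowledgement depends only on fields inside the body.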


### DynamoDB Monitoring
DynamoDB monitoring provides monitoring of the status and resources of the DynamoDB server. AWS user id is cataloged as a `Layer: AWS_DYNAMODB` `Service` in OAP.
Each DynamoDB table is cataloged as an `Instance` in OAP.
Member

The table is closer to the endpoint concept, because a table is a logical concept rather than a physical deployment concept like pods or processes in a cluster.

@wu-sheng
Member

Is #10302 (comment) addressed and fixed? I think we need that fix.

@pg-yang Is the e2e mock data load for AWS firehose ready to use? This PR should be covered by that style e2e.

@pg-yang
Member

pg-yang commented Feb 12, 2023

@pg-yang Is the e2e mock data load for AWS firehose ready to use? This PR should be covered by that style e2e.

I use the OTEL mock service to test S3, just as for EKS, because if we mocked the Firehose service directly, we would have to copy OTEL v0.7.0 into the mock service to rewrite some fields such as time, and we already have OtelMetricsConvertorTest to test the conversion logic.

wu-sheng and others added 8 commits February 14, 2023 07:53
…pache#10390)

- `./mvnw test ...` by its nature only runs unit tests, whose names match the pattern `*Test`, and never runs integration tests, whose names match the pattern `IT*` or `*IT`, so in this PR we use `./mvnw clean test ...` to run only unit tests.
- `./mvnw integration-test ...` runs both integration tests and unit tests, so we have `skipUTs` to control whether to skip the UTs when running ITs; we already had this before.
- As for ITs, we have two groups: normal tests without any `@Tag`s, and slow integration tests annotated with `@Tag("slow")`. We therefore divided the integration tests into two workflow jobs:
  - `./mvnw -DskipUTs=true clean integration-test -DexcludedGroups=slow ...` runs the ITs except the slow ones and skips the UTs; `-DexcludedGroups=slow` excludes tests annotated with `@Tag("slow")`.
  - `./mvnw -DskipUTs=true clean integration-test -Dcheckstyle.skip -Dtest=${{ matrix.test.class }}` runs the slow ITs (because `excludedGroups` is not set) one case at a time by setting `-Dtest={{ class }}`, and does not run the UTs.
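
The selection rules in the commit message above can be summarized with a small model. The Python sketch below is a simplification based only on the name patterns and the `slow` tag described there. It does not invoke Maven, and the class names and helper functions are hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class JavaTestClass:
    """A test class described by its simple name and its JUnit tags."""
    name: str
    tags: frozenset = frozenset()


def is_unit_test(cls):
    # Unit tests follow the `*Test` naming pattern.
    return fnmatchcase(cls.name, "*Test")


def is_integration_test(cls):
    # Integration tests follow the `IT*` or `*IT` naming pattern.
    return fnmatchcase(cls.name, "IT*") or fnmatchcase(cls.name, "*IT")


def select_unit_tests(classes):
    """Model of `./mvnw clean test`: unit tests only."""
    return [c.name for c in classes if is_unit_test(c)]


def select_fast_its(classes):
    """Model of `-DskipUTs=true ... integration-test -DexcludedGroups=slow`."""
    return [c.name for c in classes
            if is_integration_test(c) and "slow" not in c.tags]


def select_single_it(classes, class_name):
    """Model of `-DskipUTs=true ... integration-test -Dtest=<class>`."""
    return [c.name for c in classes
            if is_integration_test(c) and c.name == class_name]


# Hypothetical class names, for illustration only.
classes = [
    JavaTestClass("ConvertorTest"),
    JavaTestClass("StorageIT"),
    JavaTestClass("ITSlowStorage", frozenset({"slow"})),
]

unit_job = select_unit_tests(classes)                      # ['ConvertorTest']
fast_it_job = select_fast_its(classes)                     # ['StorageIT']
slow_it_job = select_single_it(classes, "ITSlowStorage")   # ['ITSlowStorage']
```

Running the slow ITs one class per matrix entry keeps each job short, at the cost of repeating the build setup for every class.
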
@wu-sheng
Member

This PR context seems broken. It reports that code has been changed upstream.

@yswdqz yswdqz closed this Feb 20, 2023
Labels

AWS AWS Infrastructure Monitoring backend OAP backend related. enhancement Enhancement on performance or codes

6 participants