Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Celerdata] integration with CelerData #2326

Merged

Conversation

yandongxiao
Copy link
Contributor

What does this PR do?

The integration work between Celerdata and Datadog has been completed.

Motivation

We want to submit an official integration.

Review checklist

  • PR has a meaningful title or PR has the no-changelog label attached
  • Feature or bugfix has tests
  • Git history is clean
  • If PR impacts documentation, docs team has been notified or an issue has been opened on the documentation repo
  • If this PR includes a log pipeline, please add a description describing the remappers and processors.

Additional Notes

Anything else we should know when reviewing?

This is the first version of our integrated work, mainly completing:

  1. Reporting all Prometheus metrics to Datadog
  2. Providing comments in conf.yaml.example explaining the log file paths that need to be collected.

@yandongxiao yandongxiao requested review from a team as code owners March 19, 2024 02:24
@yandongxiao yandongxiao force-pushed the celerdata-datadog-integration2 branch from f53931f to a83816c Compare March 19, 2024 03:07
Copy link

github-actions bot commented Mar 19, 2024

Test Results

2 tests   2 ✅  1m 38s ⏱️
1 suites  0 💤
1 files    0 ❌

Results for commit 27f512d.

♻️ This comment has been updated with latest results.

@yandongxiao yandongxiao force-pushed the celerdata-datadog-integration2 branch 6 times, most recently from 2ba0375 to 04f47ea Compare March 19, 2024 12:30
@jhgilbert jhgilbert added the editorial review Waiting on a more in-depth review from a docs team editor label Mar 19, 2024
Signed-off-by: yandongxiao <yandongxiao@starrocks.com>
celerdata/README.md Outdated Show resolved Hide resolved
celerdata/README.md Outdated Show resolved Hide resolved
celerdata/images/IMAGES_README.md Outdated Show resolved Hide resolved
celerdata/manifest.json Outdated Show resolved Hide resolved
celerdata/manifest.json Show resolved Hide resolved
Signed-off-by: yandongxiao <dxyan06@gmail.com>
Copy link
Contributor

@alai97 alai97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some doc feedback~

celerdata/README.md Outdated Show resolved Hide resolved
celerdata/README.md Outdated Show resolved Hide resolved
celerdata/README.md Outdated Show resolved Hide resolved
celerdata/README.md Outdated Show resolved Hide resolved
celerdata/README.md Outdated Show resolved Hide resolved
celerdata/README.md Outdated Show resolved Hide resolved
celerdata/README.md Outdated Show resolved Hide resolved
celerdata/README.md Outdated Show resolved Hide resolved
celerdata/README.md Outdated Show resolved Hide resolved
@@ -0,0 +1,543 @@
metric_name,metric_type,interval,unit_name,per_unit_name,description,orientation,integration,short_name,curated_metric
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please reformat the description as complete sentences, for example:

Job statistics.
The maximum journal ID of this frontend.

Also noting that there are missing metric descriptions in here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

‌‌‌‌‌‌‌As we do not have documentation to fully describe these metrics, the descriptions of these metrics are missing and may not be immediately fixable at this time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

‌‌The reformatting issue has been resolved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, it looks like a lot of these have 11 columns instead of 10 and isn't a valid CSV file, could you please fix this?
(Validations have been improved, so future commits on this PR should catch those and report errors back to you)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, have Fixed it

yandongxiao and others added 6 commits March 22, 2024 15:38
Co-authored-by: Austin Lai <76412946+alai97@users.noreply.github.com>
Co-authored-by: Austin Lai <76412946+alai97@users.noreply.github.com>
Co-authored-by: Austin Lai <76412946+alai97@users.noreply.github.com>
Co-authored-by: Austin Lai <76412946+alai97@users.noreply.github.com>
Co-authored-by: Austin Lai <76412946+alai97@users.noreply.github.com>
Co-authored-by: Austin Lai <76412946+alai97@users.noreply.github.com>
@thibaultkrebs thibaultkrebs added assets/deploy-logs-staging ONLY USED BY Logs Backend - Validates that a PR is OK to go to staging and removed assets/no-deploy Prevents APW from deploying this PR in staging labels Apr 22, 2024
- "source:LOGS_SOURCE"
timestamp: 1712716343937
-
sample: "I0328 10:22:09.303526 1066 olap_server.cpp:844] try to perform path gc by tablet!"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an issue to test this one in staging. There seems there is no date (MMdd) in this log sample, but there is an expected one in the grok parser?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

39003729303 for the timestamp means it is definitely wrong

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The date (MMdd) does exist, 'I0328' includes the log level and date information.

During the deployment process of Celerdata, different components, FE/BE, will be deployed on different host machines. This type of log is from the BE component.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your solid review! I will check why the timestamp is wrong

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you are right, I got mislead by believing the number after was just a code. But this is correct, it is just the date MMdd being parsed.

I replicated the grok parser with your sample in Datadog and indeed this is parsing the timestamp well.
image

We have a problem with our internal tool to generate this result in the CI. I will need to investigate. Is your integration urgent to be deployed, or if we delay for few days the time I find the issue is fine on your side?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can accept to wait few days. During the waiting period, Can you continue to review other parts of this PR? I don't know how long it will take to merge PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will give you an update by this week on the log part. For other files included in this PR, some other people will be in charge to review. I invite you to check with the partner team if you think some areas are still pending reviews.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have discussed internally with our team, and it appears that we won't be addressing this issue immediately. To avoid holding up this PR, please go ahead and remove this test for now. We plan to revisit this within a few weeks to resolve the issue and reintroduce the test.

I apologize for any inconvenience, especially after I had requested you to increase the test coverage...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, thank you for your feedback

Signed-off-by: yandongxiao <dxyan06@gmail.com>
Signed-off-by: yandongxiao <dxyan06@gmail.com>
Signed-off-by: yandongxiao <dxyan06@gmail.com>
@thibaultkrebs
Copy link
Contributor

Looks good for me in our staging environment. I stamp my approval!

image
image

@nmuesch nmuesch added the assets/no-deploy Prevents APW from deploying this PR in staging label Apr 25, 2024
Signed-off-by: yandongxiao <dxyan06@gmail.com>
Signed-off-by: yandongxiao <dxyan06@gmail.com>
@emarsha94 emarsha94 assigned nmuesch and bgoldberg122 and unassigned nmuesch May 2, 2024
@bgoldberg122 bgoldberg122 removed the assets/no-deploy Prevents APW from deploying this PR in staging label May 2, 2024
.github/CODEOWNERS Outdated Show resolved Hide resolved
.github/CODEOWNERS Outdated Show resolved Hide resolved
celerdata/README.md Outdated Show resolved Hide resolved
yandongxiao and others added 3 commits May 4, 2024 08:39
Co-authored-by: bgoldberg122 <ben.goldberg@datadoghq.com>
Co-authored-by: bgoldberg122 <ben.goldberg@datadoghq.com>
Co-authored-by: bgoldberg122 <ben.goldberg@datadoghq.com>
@bgoldberg122 bgoldberg122 merged commit 663a7ef into DataDog:master May 14, 2024
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet