@hehe7318 hehe7318 commented Dec 9, 2025

Description of changes

The CloudWatch Agent configuration was using the default timestamp format (%Y-%m-%d %H:%M:%S,%f) for chef-client.log, but Chef/Cinc outputs ISO 8601 timestamps in the format [YYYY-MM-DDTHH:MM:SS+TZ].

This mismatch caused CloudWatch to fail parsing timestamps, resulting in log lines being associated with incorrect timestamps.

  • Add new 'chef' timestamp format: [%Y-%m-%dT%H:%M:%S
    (Note: CloudWatch Agent's %z only supports timezone without colon like -0700, but Chef outputs +02:00 format. We only match up to seconds and let CloudWatch handle the rest.)
  • Update chef-client.log configuration to use the new 'chef' format
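
To illustrate why a format that stops at the seconds can still locate the timestamp, here is a minimal Python sketch. It is not the CloudWatch Agent's implementation: the directive-to-regex table and the helper names `format_to_regex` and `extract_timestamp` are hypothetical, and the sketch simply assumes the format is searched for as a prefix pattern within the log line.

```python
import re
from datetime import datetime

# Format strings taken from the PR; the rest of this sketch is hypothetical.
CHEF_FORMAT = "[%Y-%m-%dT%H:%M:%S"
DEFAULT_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

# Assumed mapping from strptime-style directives to regex fragments.
DIRECTIVE_PATTERNS = {
    "%Y": r"\d{4}",
    "%m": r"\d{2}",
    "%d": r"\d{2}",
    "%H": r"\d{2}",
    "%M": r"\d{2}",
    "%S": r"\d{2}",
    "%f": r"\d{1,6}",
}


def format_to_regex(fmt):
    """Turn a timestamp format into a regex, escaping literal characters."""
    parts = re.split(r"(%[A-Za-z])", fmt)
    pattern = "".join(
        DIRECTIVE_PATTERNS[part] if part in DIRECTIVE_PATTERNS else re.escape(part)
        for part in parts
    )
    return re.compile(pattern)


def extract_timestamp(line, fmt=CHEF_FORMAT):
    """Return the datetime found in the line, or None if the format does not match."""
    match = format_to_regex(fmt).search(line)
    if match is None:
        return None
    # Only the matched text is parsed; the offset and "]" are left in the message.
    return datetime.strptime(match.group(0), fmt)


line = "[2025-12-12T20:14:16+00:00] INFO: Waiting for static fleet capacity provisioning"
print(extract_timestamp(line))                  # matched with the 'chef' format
print(extract_timestamp(line, DEFAULT_FORMAT))  # the 'default' format finds nothing
```

In this sketch the 'chef' format matches `[2025-12-12T20:14:16` and ignores the trailing `+00:00]`, while the 'default' format finds no timestamp at all in a Chef log line.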

Tests

Manual testing done: I created a cluster and exported the cluster logs. In the chef-client log, the CloudWatch timestamp now matches the Chef timestamp, e.g.:

2025-12-12T20:14:16.000Z [2025-12-12T20:14:16+00:00] INFO: Retrying execution of execute[check if clustermgtd heartbeat is available], 29 attempts left
2025-12-12T20:15:38.000Z [2025-12-12T20:15:38+00:00] INFO: Waiting for static fleet capacity provisioning
2025-12-12T20:15:07.000Z [2025-12-12T20:15:07+00:00] INFO: Waiting for static fleet capacity provisioning

Reference

https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-Configuration-File-Details.html#CloudWatch-Agent-Configuration-File-Logssection
Expand the "CloudWatch agent configuration file: Logs section" section:

%z
Time zone, expressed as the offset between the local time zone and UTC. For example, -0700. Only this format is supported. For example, -07:00 isn't a valid format.
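
As a rough model of the rule quoted above, the following Python sketch checks whether an offset string has the colon-free shape. The pattern `[+-]\d{4}` and the helper `offset_is_supported` are assumptions made for illustration, based only on the documentation excerpt, not on the agent's source code.

```python
import re

# Assumed model of the documented %z rule: a sign followed by exactly four digits.
Z_PATTERN = re.compile(r"[+-]\d{4}")


def offset_is_supported(offset):
    """Return True if the offset has the colon-free shape, e.g. -0700."""
    return Z_PATTERN.fullmatch(offset) is not None


print(offset_is_supported("-0700"))   # shape given as valid in the documentation
print(offset_is_supported("-07:00"))  # shape given as invalid in the documentation
print(offset_is_supported("+02:00"))  # shape of the offsets Chef writes
```

Under this assumption, the offsets Chef writes cannot be consumed by %z, which is why the 'chef' format stops before the offset.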

Checklist

  • Make sure you are pointing to the right branch.
  • If you're creating a patch for a branch other than develop add the branch name as prefix in the PR title (e.g. [release-3.6]).
  • Check all commits' messages are clear, describing what and why vs how.
  • Make sure to have added unit tests or integration tests to cover the new/modified code.
  • Check if documentation is impacted by this change.

Please review the guidelines for contributing and Pull Request Instructions.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@hehe7318 hehe7318 added the 3.x label Dec 9, 2025
@hehe7318 hehe7318 requested review from a team as code owners December 9, 2025 21:03
The CloudWatch Agent configuration was using the 'default' timestamp format
(%Y-%m-%d %H:%M:%S,%f) for chef-client.log, but Chef/Cinc outputs timestamps
in ISO 8601 format: [YYYY-MM-DDTHH:MM:SS+TZ].

This mismatch caused CloudWatch to fail parsing timestamps, resulting in log
lines being associated with incorrect timestamps (often shifted by hours),
which significantly complicated root cause analysis during incident
investigations.

Changes:
- Add new 'chef' timestamp format: [%Y-%m-%dT%H:%M:%S
  (Note: CloudWatch Agent's %z only supports timezone without colon like -0700,
  but Chef outputs +02:00 format. We match up to seconds and let CloudWatch
  handle the rest.)
- Update chef-client.log configuration to use the new 'chef' format

Reference: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-Configuration-File-Details.html#CloudWatch-Agent-Configuration-File-Logssection
@hehe7318 hehe7318 force-pushed the wip/fix-log-timestamp-3141 branch from 9e16558 to f415f1d Compare December 9, 2025 21:08
@himani2411
Contributor

Can you add more details on how this was tested?

codecov bot commented Dec 15, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 75.22%. Comparing base (f4ed435) to head (4a68a7f).
⚠️ Report is 2 commits behind head on release-3.14.

Additional details and impacted files
@@              Coverage Diff              @@
##           release-3.14    #3061   +/-   ##
=============================================
  Coverage         75.22%   75.22%           
=============================================
  Files                24       24           
  Lines              2446     2446           
=============================================
  Hits               1840     1840           
  Misses              606      606           
Flag Coverage Δ
unittests 75.22% <ø> (ø)

@hehe7318
Contributor Author

Can you add more details on how this was tested?

Done

"default": "%Y-%m-%d %H:%M:%S,%f",
"bracket_default": "[%Y-%m-%d %H:%M:%S]",
"slurm": "%Y-%m-%dT%H:%M:%S.%f",
"chef": "[%Y-%m-%dT%H:%M:%S",
Contributor

Missing final square bracket?

Contributor

@gmarciani gmarciani Dec 15, 2025

According to the description this is intentional:

(Note: CloudWatch Agent's %z only supports timezone without colon like -0700, but Chef outputs +02:00 format. We only match up to seconds and let CloudWatch handle the rest.)

I'm ok with the current approach. However, did you check whether Chef allows writing the offset in the format accepted by CW?

Contributor Author

We have synced: there is a way to customize the Chef timestamp format, but it isn't worth the effort. So we will accept the current approach.

gmarciani previously approved these changes Dec 16, 2025
CHANGELOG.md Outdated
- Open MPI: openmpi40-aws-4.1.7-2 and openmpi50-aws-5.0.8-11

**BUG FIXES**
- Fix incorrect timestamp parsing for chef-client.log in CloudWatch Agent configuration. The timestamp format now correctly matches Chef's output format `[YYYY-MM-DDTHH:MM:SS+TZ]`.
Contributor

@gmarciani gmarciani Dec 16, 2025

I think we can remove the detail about the format because it is a low-level detail.
I'd limit the line to:

Fix incorrect timestamp parsing for chef-client.log in CloudWatch Agent configuration.

@hehe7318 hehe7318 enabled auto-merge (squash) December 16, 2025 17:54
@hehe7318 hehe7318 disabled auto-merge December 16, 2025 21:17
@hehe7318 hehe7318 merged commit 4aa21a5 into aws:release-3.14 Dec 16, 2025
28 of 30 checks passed