
awscloudwatch input drops data #38918

Closed
faec opened this issue Apr 13, 2024 · 1 comment · Fixed by #38953

faec (Contributor) commented Apr 13, 2024

The awscloudwatch input can skip data in its target log groups, with severity depending on configuration and log size. This seems to apply to all platforms and all versions at least since 8.0.

Easy reproduction:

  • Set number_of_workers to 1 (the default)
  • Set start_position to beginning (the default)
  • Set log_group_name_prefix to a value matching 2 or more log groups

The first matching log group will ingest data starting from the beginning, but all other log groups will only include data from after ingestion began.

This loss continues during ingestion: events from any time span will only include data from at most one log group at a time.
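
For concreteness, a minimal Filebeat config sketch matching these steps, assuming the aws-cloudwatch input's standard options; the region and prefix values are hypothetical placeholders:

```yaml
filebeat.inputs:
  - type: aws-cloudwatch
    # Hypothetical values; the prefix must match 2 or more log groups.
    region_name: us-east-1
    log_group_name_prefix: /aws/lambda/
    # Both of the following are the defaults, shown explicitly.
    number_of_workers: 1
    start_position: beginning
```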

More finicky reproduction with a single log group:

  • Set number_of_workers to 1 (the default)
  • Set start_position to beginning (the default)
  • Target a single log group with a significant amount of past data (enough that ingesting it takes significantly longer than scan_frequency; optionally set scan_frequency to 1s to make this easier)

Data added to the log group between the start of ingestion and the completion of the first scan will be skipped.
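
A sketch of the single-group variant, again with a hypothetical group name, lowering scan_frequency to 1s as suggested above:

```yaml
filebeat.inputs:
  - type: aws-cloudwatch
    # Hypothetical values; the group should hold enough backlog that the
    # initial scan takes much longer than scan_frequency to complete.
    region_name: us-east-1
    log_group_name: /hypothetical/large-backlog-group
    number_of_workers: 1        # default
    start_position: beginning   # default
    scan_frequency: 1s
```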

@faec faec added bug Team:Elastic-Agent Label for the Agent team Team:Obs-InfraObs Label for the Observability Infrastructure Monitoring team labels Apr 13, 2024
@faec faec self-assigned this Apr 13, 2024
elasticmachine (Collaborator) commented

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@faec faec added Team:Cloud-Monitoring Label for the Cloud Monitoring team and removed Team:Obs-InfraObs Label for the Observability Infrastructure Monitoring team labels Apr 15, 2024
@faec faec added Team:Obs-InfraObs Label for the Observability Infrastructure Monitoring team and removed Team:Cloud-Monitoring Label for the Cloud Monitoring team labels Apr 15, 2024
faec added a commit that referenced this issue Apr 23, 2024
Fix a bug in cloudwatch worker allocation that could cause data loss (#38918).

The previous behavior wasn't really tested, since worker tasks were computed in cloudwatchPoller's polling loop, which required live AWS connections. So in addition to the basic logical fix, I did some refactoring to cloudwatchPoller that makes the task iteration visible to unit tests.
mergify bot pushed a commit that referenced this issue Apr 23, 2024
Fix a bug in cloudwatch worker allocation that could cause data loss (#38918).

(cherry picked from commit deece39)
faec added a commit that referenced this issue Apr 24, 2024
Fix a bug in cloudwatch worker allocation that could cause data loss (#38918).

(cherry picked from commit deece39)

Co-authored-by: Fae Charlton <fae.charlton@elastic.co>