Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report resource usage counts by handling heartbeat events #35968

Merged
merged 2 commits into from Jan 3, 2024

Conversation

zmb3
Copy link
Collaborator

@zmb3 zmb3 commented Dec 21, 2023

Buddy PR for #34954
Closes #34954

Copy link

The PR changelog entry failed validation: Changelog entry not found in the PR body. Please add a "no-changelog" label to the PR, or changelog lines starting with changelog: followed by the changelog entries for the PR.

@zmb3 zmb3 added the no-changelog Indicates that a PR does not require a changelog entry label Dec 21, 2023
@zmb3 zmb3 force-pushed the zmb3/buddy-resource-reporting branch from 457c248 to f37ed08 Compare December 21, 2023 16:57
proto/prehog/v1/teleport.proto Outdated Show resolved Hide resolved
lib/usagereporter/teleport/aggregating/reporter.go Outdated Show resolved Hide resolved
for _, item := range result.Items {
report := &prehogv1.ResourcePresenceReport{}
if err := proto.Unmarshal(item.Value, report); err != nil {
return nil, trace.Wrap(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to abort the listing operation if there is one bad report in storage? Can we log the failure and keep trying the rest of the reports instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially, but if we make that change we should do it for user activity reports too, so for now I think it's best to be consistent with current behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably something we want to look into. I know we've had various bugs caused by getting resources failing due to one bad resource aborting the entire operation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failing usage data submission will result in a cluster alert, which will hopefully prompt the customer into calling us. That'd still work if we skipped over invalid data, but we would need to tweak the logic around creating and deleting the cluster alerts (to still create one), since we don't want to keep ignoring some logic bug.

lib/usagereporter/teleport/aggregating/service_test.go Outdated Show resolved Hide resolved
@rosstimothy rosstimothy self-requested a review December 21, 2023 20:14
@zmb3 zmb3 force-pushed the zmb3/buddy-resource-reporting branch from f37ed08 to a03f93d Compare December 24, 2023 18:28
proto/prehog/v1/teleport.proto Outdated Show resolved Hide resolved
Copy link
Contributor

@espadolini espadolini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do-not-merge because the .protos need to be updated in cloud master first, then copied here, other than that LGTM.

Buddy PR for #34954
Closes #34954

Co-authored-by: Edoardo Spadolini <edoardo.spadolini@goteleport.com>
Signed-off-by: Zac Bergquist <zac.bergquist@goteleport.com>
@zmb3
Copy link
Collaborator Author

zmb3 commented Jan 2, 2024

do-not-merge because the .protos need to be updated in cloud master first, then copied here, other than that LGTM.

Protos were already merged in cloud master (see https://github.com/gravitational/cloud/pull/6823) but I pulled the latest in here (only differences were in comments).

@zmb3 zmb3 force-pushed the zmb3/buddy-resource-reporting branch from 6e930ff to 89c35c9 Compare January 2, 2024 21:47
@zmb3
Copy link
Collaborator Author

zmb3 commented Jan 2, 2024

cc @timothyb89 - this pulls in some of your new bot protos. Let me know if that's okay.

@timothyb89
Copy link
Contributor

cc @timothyb89 - this pulls in some of your new bot protos. Let me know if that's okay.

I think it's fine, it'll just be competing with #35881 so one of us will have a conflict to resolve 🙂

Merged via the queue into master with commit 711fa4e Jan 3, 2024
41 checks passed
@zmb3 zmb3 deleted the zmb3/buddy-resource-reporting branch January 3, 2024 20:20
@public-teleport-github-review-bot

@zmb3 See the table below for backport results.

Branch Result
branch/v12 Failed
branch/v13 Failed
branch/v14 Failed

Envek added a commit to Envek/teleport that referenced this pull request Jan 4, 2024
…se-anon-key

* origin/master: (344 commits)
  Undelete CreateHostUserMode_HOST_USER_MODE_DROP (gravitational#36273)
  allow cwd to be changed in difftest (gravitational#35946)
  Auth device list component (gravitational#36235)
  make unified resources responsive (gravitational#35961)
  Support running Teleport in a "hot reload" mode (gravitational#35040)
  Prevent deleting enum values, allow deleting enum reservations in types.proto (gravitational#36248)
  Remove support for legacy (Amazon Linux 2) AMIs (gravitational#36153)
  Bump version(s) used for teleport-lab and teleport-quickstart (gravitational#36167)
  Allow Reconciler update handler to examine old value during update (gravitational#36171)
  Validate the user still exists during account reset (gravitational#35676)
  ButtonTextWithAddIcon shared component (gravitational#36103)
  Refactor hostname resolution for SSH connections via the WebUI (gravitational#35773)
  add structuredClone to jest JSDOMEnvironment (gravitational#36213)
  fix flaky `lib/auth` cache-enabled tests (gravitational#36216)
  Report resource usage counts by handling heartbeat events (gravitational#35968)
  Reviewer bot should use the stable version of Go (gravitational#36242)
  RFD 0153 Resource Guidelines (gravitational#34103)
  Use cmp and cmpots properly in operator tests (gravitational#36215)
  Relax Kubernetes CRD discovery when building cache (gravitational#36214)
  Add Access List messages to TAG protobuf (gravitational#36176)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants