Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a fallback for EmitAuditEvents failure due to event conflicts (DynamoDB backend) #40854

Merged
merged 2 commits into from
Apr 25, 2024

Conversation

gabrielcorado
Copy link
Contributor

Closes #40126.

Adds a fallback for when the put item fails due to the condition exception (duplicate events).

In addition, we're adding a new option to disable condition checking, which can be configured through the DynamoDB URL. This option can be used to restore the old behavior.

NOTE: This is being solved on the DynamoDB events "layer" because multiple parts of Teleport are subject to this failure (not only the one described on the issue).

changelog: Fix audit event failures when using DynamoDB event storage.

@gabrielcorado gabrielcorado self-assigned this Apr 24, 2024
@github-actions github-actions bot added audit-log Issues related to Teleports Audit Log size/sm labels Apr 24, 2024
func TestEmitSessionEventsSameIndex(t *testing.T) {
ctx := context.Background()
tt := setupDynamoContext(t)
sessionID := session.NewID()

require.NoError(t, tt.log.EmitAuditEvent(ctx, generateEvent(sessionID, 0)))
require.NoError(t, tt.log.EmitAuditEvent(ctx, generateEvent(sessionID, 1)))
require.Error(t, tt.log.EmitAuditEvent(ctx, generateEvent(sessionID, 1)))
require.NoError(t, tt.log.EmitAuditEvent(ctx, generateEvent(sessionID, 1)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, do we want the audit log to be able to be overwritten.

I feel like the original behavior of "first write wins" is more correct for an audit log.

Should we instead just not treat already exists as an error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, do we want the audit log to be able to be overwritten.

This will only happen if the disable_conflict_check is provided. Otherwise, the event will always need a unique session_id/event_index pair. The fallback logic only sets the event index to a different value and tries to put the item again. If there is still a conflict (very unlikely to happen since we're using unix nano and this is attached to the session), the event emission will still fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it makes it clearer, we can add another test that causes the fallback to also fail. In this case, the EmitAuditEvent will return an error.

@gabrielcorado gabrielcorado requested a review from zmb3 April 24, 2024 16:29
Copy link
Collaborator

@zmb3 zmb3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me.

@gabrielcorado gabrielcorado added this pull request to the merge queue Apr 25, 2024
Merged via the queue into master with commit af4a627 Apr 25, 2024
40 checks passed
@gabrielcorado gabrielcorado deleted the gabrielcorado/dynamoevents-condition-handling branch April 25, 2024 17:36
@public-teleport-github-review-bot

@gabrielcorado See the table below for backport results.

Branch Result
branch/v14 Create PR
branch/v15 Create PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

session.command event hits a ConditionalCheckFailedException on dynamoevents
3 participants