
feat(issues): Add GroupActionLogEntry #115771

Open
kcons wants to merge 7 commits into master from kcons/thelog

Conversation

@kcons
Member

@kcons kcons commented May 19, 2026

Adds GroupActionLogEntry, with no meaningful actions or uses yet.

Admittedly, this isn't very meaningful without a mechanism to derive data or any real events, but it is a precursor to both of those and needs to make sense as a log independent of them.
The aim here is to make it possible to record arbitrary actions on Groups and consume them sequentially.
There aren't specific planned query patterns for actor or project_id as provided here, but they are core pieces of information about an event that should always be present and stored efficiently, and requiring them out of the gate allows for a variety of useful analyses we may need to do.

A core risk here is that this scheme is ultimately mutable. There'll be a backfill helper method with invalidation hooks and another one for merges (both can render log-derived data no longer canonical); sequential autoincrement IDs through an ordered log give a permanent record of this, which is nice, but it's always possible to add code that doesn't follow the rules.
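As a rough illustration of "record arbitrary actions on Groups and consume them sequentially", here is a minimal in-memory sketch. `InMemoryActionLog`, `LogEntry`, `record`, and `entries_after` are hypothetical names invented for this example, not the PR's code; only the field names mirror the migration's columns (`action_type` stands in for the `type` column).

```python
from dataclasses import dataclass
from itertools import count
from typing import Any, Dict, Iterator, List, Optional


@dataclass(frozen=True)
class LogEntry:
    # Field names follow the migration's columns; the class itself is hypothetical.
    id: int
    group_id: int
    project_id: int
    actor_type: int
    actor_id: int
    action_type: int
    data: Dict[str, Any]


class InMemoryActionLog:
    """Hypothetical stand-in for the table: append-only, with sequential IDs."""

    def __init__(self) -> None:
        self._entries: List[LogEntry] = []
        self._ids = count(1)  # plays the role of the autoincrement primary key

    def record(
        self,
        *,
        group_id: int,
        project_id: int,
        actor_type: int,
        actor_id: int,
        action_type: int,
        data: Optional[Dict[str, Any]] = None,
    ) -> LogEntry:
        # actor and project are required, matching the NOT NULL columns.
        entry = LogEntry(
            id=next(self._ids),
            group_id=group_id,
            project_id=project_id,
            actor_type=actor_type,
            actor_id=actor_id,
            action_type=action_type,
            data=data or {},
        )
        self._entries.append(entry)
        return entry

    def entries_after(self, group_id: int, last_seen_id: int = 0) -> Iterator[LogEntry]:
        # A consumer remembers the last ID it processed and resumes from there.
        for entry in self._entries:
            if entry.group_id == group_id and entry.id > last_seen_id:
                yield entry


log = InMemoryActionLog()
log.record(group_id=1, project_id=9, actor_type=0, actor_id=5, action_type=1)
log.record(group_id=2, project_id=9, actor_type=0, actor_id=5, action_type=1)
log.record(group_id=1, project_id=9, actor_type=0, actor_id=6, action_type=2)
seen = [entry.id for entry in log.entries_after(group_id=1)]
```

The sketch deliberately has no update or delete path; the backfill and merge helpers mentioned above are exactly the places where a real implementation would break that property.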

@github-actions github-actions Bot added the Scope: Backend Automatically applied to PRs that change backend components label May 19, 2026
@kcons
Member Author

kcons commented May 19, 2026

@shashjar fyi

@github-actions
Contributor

This PR has a migration; here is the generated SQL for src/sentry/migrations/1100_add_issue_action_log_entry.py

for 1100_add_issue_action_log_entry in sentry

--
-- Create model IssueActionLogEntry
--
CREATE TABLE "sentry_issueactionlogentry" ("id" bigint NOT NULL PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY, "group_id" bigint NOT NULL, "project_id" bigint NOT NULL, "original_group_id" bigint NULL, "type" integer NOT NULL CHECK ("type" >= 0), "actor_type" integer NOT NULL CHECK ("actor_type" >= 0), "actor_id" bigint NOT NULL, "data" jsonb NOT NULL, "date_added" timestamp with time zone DEFAULT (STATEMENT_TIMESTAMP()) NOT NULL, "idempotency_key" varchar(64) NULL);
CREATE UNIQUE INDEX CONCURRENTLY "uniq_issueactionlogentry_group_idempotency_key" ON "sentry_issueactionlogentry" ("group_id", "idempotency_key") WHERE "idempotency_key" IS NOT NULL;
CREATE INDEX CONCURRENTLY "sentry_issu_group_i_ff3639_idx" ON "sentry_issueactionlogentry" ("group_id", "date_added", "id");

@kcons kcons marked this pull request as ready for review May 19, 2026 17:38
@kcons kcons requested review from a team as code owners May 19, 2026 17:38
Comment thread src/sentry/issues/derived/types.py Outdated
Comment thread src/sentry/models/issueactionlogentry.py Outdated
Comment thread src/sentry/issues/groupactionlogentry.py
Comment thread src/sentry/issues/groupactionlogentry.py
Comment thread src/sentry/issues/groupactionlogentry.py
Comment thread tests/sentry/deletions/test_group.py Outdated
Comment thread src/sentry/issues/derived/recording.py Outdated
Comment thread src/sentry/issues/derived/types.py Outdated
Comment thread src/sentry/models/groupactionlogentry.py Outdated
Comment thread src/sentry/models/groupactionlogentry.py Outdated
Comment thread src/sentry/issues/groupactionlogentry.py
@kcons kcons marked this pull request as draft May 19, 2026 22:05
Comment thread src/sentry/deletions/defaults/group.py Outdated
Comment thread src/sentry/issues/derived/types.py Outdated
Comment thread src/sentry/issues/derived/types.py Outdated
@thetruecpaul
Contributor

Okay, potentially hot take that you can feel free to ignore: have we considered using "Group" instead of "Issue" throughout? (So it'd be GroupActionLog, etc.)

The reason for this is pretty simple: consistency with everything else. All of our tables use Group instead of Issue (e.g. sentry_groupredirect), as do all of our Models (e.g. GroupRedirect)... generally we've used Issues for user-facing stuff but kept Group around internally.

Personally I think the consistency here is worth it — but will leave this up to you.

@kcons
Member Author

kcons commented May 19, 2026

> Okay, potentially hot take that you can feel free to ignore: have we considered using "Group" instead of "Issue" throughout? (So it'd be GroupActionLog, etc.)
>
> The reason for this is pretty simple: consistency with everything else. All of our tables use Group instead of Issue (e.g. sentry_groupredirect), as do all of our Models (e.g. GroupRedirect)... generally we've used Issues for user-facing stuff but kept Group around internally.
>
> Personally I think the consistency here is worth it — but will leave this up to you.

I find the argument for Group compelling; I used "Issue" since that's what all of the discussion prior to this PR has used. If I can force us to set policy on "Group is what we call it in code, not an old name we haven't managed to change", I will.

@kcons kcons changed the title feat(issues): Add IssueActionLogEntry feat(issues): Add GroupActionLogEntry May 20, 2026
@github-actions
Contributor

github-actions Bot commented May 20, 2026

This PR has a migration; here is the generated SQL for src/sentry/migrations/1100_add_group_action_log_entry.py

for 1100_add_group_action_log_entry in sentry

--
-- Create model GroupActionLogEntry
--
CREATE TABLE "sentry_groupactionlogentry" ("id" bigint NOT NULL PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY, "group_id" bigint NOT NULL, "project_id" bigint NOT NULL, "original_group_id" bigint NULL, "type" integer NOT NULL CHECK ("type" >= 0), "actor_type" integer NOT NULL CHECK ("actor_type" >= 0), "actor_id" bigint NOT NULL, "data" jsonb NOT NULL, "date_added" timestamp with time zone DEFAULT (STATEMENT_TIMESTAMP()) NOT NULL, "date_updated" timestamp with time zone NOT NULL, "idempotency_key" varchar(64) NULL);
CREATE UNIQUE INDEX CONCURRENTLY "uniq_groupactionlogentry_group_idempotency_key" ON "sentry_groupactionlogentry" ("group_id", "idempotency_key") WHERE "idempotency_key" IS NOT NULL;
CREATE INDEX CONCURRENTLY "sentry_grou_group_i_cc465f_idx" ON "sentry_groupactionlogentry" ("group_id", "date_added", "id");
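To illustrate what the partial unique index on (group_id, idempotency_key) is for, here is a small sketch using Python's built-in sqlite3 module rather than PostgreSQL. The table is cut down to three columns, and the `record` helper is hypothetical; it only shows the constraint's behaviour, not how the PR writes entries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE entry (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        group_id INTEGER NOT NULL,
        idempotency_key TEXT NULL
    )
    """
)
# Same shape as the generated index: unique per group, and only for rows
# that actually carry a key.
conn.execute(
    """
    CREATE UNIQUE INDEX uniq_entry_group_idempotency_key
    ON entry (group_id, idempotency_key)
    WHERE idempotency_key IS NOT NULL
    """
)


def record(group_id, idempotency_key=None):
    """Insert a row; return False if this key was already used for the group."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO entry (group_id, idempotency_key) VALUES (?, ?)",
                (group_id, idempotency_key),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = record(1, "retry-abc")      # new key for group 1
duplicate = record(1, "retry-abc")  # same key, same group: rejected
other_group = record(2, "retry-abc")  # same key, different group: allowed
no_key_a = record(1)                # rows without a key are not in the index
no_key_b = record(1)
```

A retried write that reuses its key becomes a no-op, while entries recorded without a key stay outside the index and are never constrained by it.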

Member

@wedamija wedamija left a comment

Isn't this similar to the existing GroupHistory table?

@kcons
Member Author

kcons commented May 20, 2026

> Isn't this similar to the existing GroupHistory table?

It is fairly similar. It's also very similar to Activity, and to a few others.
The theory is that it should be possible, and ultimately more manageable, for us to derive that data from this table.
That's not the near-term target: we still need to make sure that we have sufficient info (i.e. release data) for it to be possible, and performance constraints may well push us to something that looks suspiciously like GroupHistory. The goal, though, is to have a canonical log so we can produce whatever derived data we need (like GroupHistory or other features) from a stable foundation.
Whether that'll work out is yet to be determined, but at worst we should find it easier to derive data based on fairly arbitrary event sequences.
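The "derive data from a canonical log" idea can be sketched as a fold over entries in ID order. Everything named here (`ActionType`, `Entry`, `GroupSummary`, `apply_entry`, `derive`) is hypothetical and exists only for illustration; the PR itself defines no meaningful actions yet.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Any, Dict, Iterable, Optional


class ActionType(IntEnum):
    # Hypothetical action types; the PR ships without real ones.
    RESOLVED = 1
    UNRESOLVED = 2
    ASSIGNED = 3


@dataclass(frozen=True)
class Entry:
    id: int
    group_id: int
    type: ActionType
    data: Dict[str, Any]


@dataclass
class GroupSummary:
    """Derived view of one group, rebuilt purely from log entries."""

    status: str = "unresolved"
    assignee_id: Optional[int] = None
    last_applied_id: int = 0


def apply_entry(summary: GroupSummary, entry: Entry) -> GroupSummary:
    # Skipping already-applied IDs makes re-processing a batch harmless.
    if entry.id <= summary.last_applied_id:
        return summary
    if entry.type is ActionType.RESOLVED:
        summary.status = "resolved"
    elif entry.type is ActionType.UNRESOLVED:
        summary.status = "unresolved"
    elif entry.type is ActionType.ASSIGNED:
        summary.assignee_id = entry.data["assignee_id"]
    summary.last_applied_id = entry.id
    return summary


def derive(entries: Iterable[Entry]) -> GroupSummary:
    summary = GroupSummary()
    # Sequential IDs define the order in which entries are applied.
    for entry in sorted(entries, key=lambda e: e.id):
        apply_entry(summary, entry)
    return summary


entries = [
    Entry(id=3, group_id=1, type=ActionType.RESOLVED, data={}),
    Entry(id=1, group_id=1, type=ActionType.ASSIGNED, data={"assignee_id": 7}),
    Entry(id=2, group_id=1, type=ActionType.UNRESOLVED, data={}),
]
summary = derive(entries)
```

Tracking `last_applied_id` is one simple way to let an incremental consumer pick up where it left off; it does not by itself handle the backfill and merge cases, which would need the invalidation hooks described in the PR description.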

One of the main risks here is

@kcons kcons marked this pull request as ready for review May 20, 2026 22:21
@kcons kcons requested review from shashjar and thetruecpaul May 20, 2026 23:24
Member

@shashjar shashjar left a comment

LGTM!

@kcons
Member Author

kcons commented May 21, 2026

The secondary database plan to move it later is here. I've started the process of getting it set up, but it'll take a bit; in the meantime I'll configure the getsentry router to yell at us if the table is used in any context that is incompatible with a separate database.

Comment on lines +52 to +56
date_added = models.DateTimeField(db_default=Now())

# Primarily intended for debugging; not intended to be relied upon
# for invalidation.
date_updated = models.DateTimeField(auto_now=True)
Member

Nit: Should we add these by using the DefaultFieldsModel base class instead?

Comment on lines +51 to +52
# DB-defaulted; backfill code may pass an explicit value.
date_added = models.DateTimeField(db_default=Now())
Member

Will we evict these over time, or just rely on Group to cascade delete? If we want to expire them after a period of time, an index could be useful here.

app_label = "sentry"
db_table = "sentry_groupactionlogentry"
indexes = [
models.Index(fields=["group_id", "date_added", "id"]),
Member

@wedamija wedamija May 21, 2026

What's the intent of this index? jfyi, usually if you need to filter on two really high cardinality columns an index isn't going to help too much, since there typically won't be more than one row in each leaf.

The main reason this can be helpful is if you're making a query that only returns group_id, date_added, id. If that's not the intent, probably I'd just remove id here and keep the index size a little smaller.
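The thread does not record the author's answer. One common reason to keep a trailing `id` in an index like this, offered here only as an assumption about possible intent, is to use it as a tie-breaker so that entries sharing a `date_added` still have a deterministic order for cursor-style reads. The sketch below uses Python's built-in sqlite3 module (not PostgreSQL), a cut-down table, and a hypothetical `page` helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entry ("
    "id INTEGER PRIMARY KEY, group_id INTEGER NOT NULL, date_added TEXT NOT NULL)"
)
# Same column order as the index under discussion.
conn.execute("CREATE INDEX entry_group_date_id ON entry (group_id, date_added, id)")
conn.executemany(
    "INSERT INTO entry (id, group_id, date_added) VALUES (?, ?, ?)",
    [
        (1, 10, "2026-05-19T00:00:00"),
        (2, 10, "2026-05-19T00:00:00"),  # same timestamp as id 1
        (3, 10, "2026-05-20T00:00:00"),
        (4, 11, "2026-05-19T00:00:00"),
    ],
)


def page(group_id, after=None, limit=2):
    """Return up to `limit` (date_added, id) pairs following the cursor."""
    if after is None:
        sql = (
            "SELECT date_added, id FROM entry WHERE group_id = ? "
            "ORDER BY date_added, id LIMIT ?"
        )
        params = (group_id, limit)
    else:
        # Row-value comparison: strictly after the (date_added, id) cursor.
        sql = (
            "SELECT date_added, id FROM entry WHERE group_id = ? "
            "AND (date_added, id) > (?, ?) "
            "ORDER BY date_added, id LIMIT ?"
        )
        params = (group_id, after[0], after[1], limit)
    return conn.execute(sql, params).fetchall()


first_page = page(10)
second_page = page(10, after=first_page[-1])
```

Without `id` in the cursor, the two rows that share a timestamp could be skipped or repeated across pages. Whether that access pattern justifies the extra index size is the judgment call the review comment raises.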

Comment on lines +32 to +34
group_id = BoundedBigIntegerField()
# The project the group belongs to.
project_id = BoundedBigIntegerField()
Member

You should probably add indexes on all of these fk columns

# The project the group belongs to.
project_id = BoundedBigIntegerField()
# The group_id before any merges, if this entry was migrated.
original_group_id = BoundedBigIntegerField(null=True)
Member

Probably this also needs an index?

actor_type = BoundedPositiveIntegerField(
choices=[(t.value, t.name) for t in GroupActorType],
)
actor_id = BoundedBigIntegerField()
Member

Might also need an index?
