feat(analytics): analytics subcommand group (digest shim, quiet, trends) #47
Open
mvanhorn wants to merge 4 commits into steipete:main
Adds discrawl digest, a per-channel activity summary over a time window. discrawl already has report for repo-wide README dumps and messages / search for retrieval; digest answers "what happened in this guild over the last N days, per channel?"

Ports vincentkoc/slacrawl#9 (merged 2026-04-22). Same SQL recipe, adapted to discrawl's Discord schema (guild_id, members, mention_events) and the existing stdlib-flag CLI dispatch.

This contribution was developed with AI assistance.
RankedCount is reused by digest's top_posters/top_mentions slices.
Without JSON tags, those nested entries serialized as {Name, Count}
while the rest of the digest schema uses snake_case ({channel_id,
messages, ...}). Tag the fields so --json output is consistent.
Surfaced by codex review on the previous commit.
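The effect of the tags can be illustrated with a minimal, self-contained sketch. The field names `Name` and `Count` come from the commit message; the tag values `name` and `count`, the `untaggedCount` type, and the `encode` helper are assumptions made for illustration, not the PR's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RankedCount mirrors the type named in the commit message. The tag values
// are assumed from the snake_case convention the commit describes.
type RankedCount struct {
	Name  string `json:"name"`
	Count int    `json:"count"`
}

// untaggedCount is a hypothetical copy without tags, showing the pre-fix
// behaviour: encoding/json uses the exported Go field names verbatim.
type untaggedCount struct {
	Name  string
	Count int
}

// encode marshals v to a JSON string. It panics on error because the
// plain structs used here always marshal successfully.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(untaggedCount{Name: "alice", Count: 3})) // {"Name":"alice","Count":3}
	fmt.Println(encode(RankedCount{Name: "alice", Count: 3}))   // {"name":"alice","count":3}
}
```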
Introduces discrawl analytics as a namespace for activity-style queries. Three subcommands ship:

- analytics digest: delegates to the existing digest implementation, so discrawl digest is unchanged
- analytics quiet: channels with no activity in the lookback window (archive candidates), default --since 30d
- analytics trends: week-bucketed message counts per channel, zero-filled across the window, default --weeks 8

Ports vincentkoc/slacrawl#13 (merged 2026-04-23). Same SQL recipes adapted to discrawl's Discord schema.

Stacked on top of the digest PR so analytics digest can shim to runDigest and share the implementation.

This contribution was developed with AI assistance.
Both quiet and trends queries left-joined the full channels table, which includes category and voice channels. Those rows can never have synced messages, so quiet surfaced them as never-active archive candidates and trends emitted all-zero rows for them.

Filter to the message-bearing kinds the syncer ingests:

- text
- announcement
- thread_public
- thread_private
- thread_announcement

Forum parents are excluded since the syncer's messageChannelKinds() also excludes them. Forum threads (kind='thread_public') are still included.

Surfaced by codex review on the previous commit.
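One way such a filter could be attached to a query is sketched below: it builds a parameterized `IN (...)` fragment from the five kinds listed above. The `messageBearingKinds` variable, the `kindFilter` helper, and the `c.kind` column alias are assumptions for illustration; the PR's actual SQL is not shown in this conversation.

```go
package main

import (
	"fmt"
	"strings"
)

// messageBearingKinds lists the channel kinds named in the commit message.
var messageBearingKinds = []string{
	"text",
	"announcement",
	"thread_public",
	"thread_private",
	"thread_announcement",
}

// kindFilter returns a SQL fragment of the form "col IN (?, ?, ...)" and the
// matching bind arguments, so the kinds are passed as parameters rather than
// interpolated into the query text.
func kindFilter(column string) (string, []any) {
	placeholders := make([]string, len(messageBearingKinds))
	args := make([]any, len(messageBearingKinds))
	for i, k := range messageBearingKinds {
		placeholders[i] = "?"
		args[i] = k
	}
	return column + " IN (" + strings.Join(placeholders, ", ") + ")", args
}

func main() {
	clause, args := kindFilter("c.kind") // "c" is a hypothetical table alias
	fmt.Println(clause)                  // c.kind IN (?, ?, ?, ?, ?)
	fmt.Println(len(args))               // 5
}
```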
Summary

Introduces `discrawl analytics` as a namespace for activity-style queries. Three subcommands ship:

- `analytics digest` - delegates to the existing `digest` implementation, so `discrawl digest` is unchanged
- `analytics quiet` - channels with no activity in the lookback window (archive candidates), default `--since 30d`
- `analytics trends` - week-bucketed message counts per channel, zero-filled across the window, default `--weeks 8`

Ports vincentkoc/slacrawl#13 (merged 2026-04-23). Same SQL recipes adapted to discrawl's Discord schema.

Stacking

This PR is stacked on top of #46 (`feat(cli): digest command`). Until #46 merges, the diff here includes both sets of changes. Once #46 lands, this PR narrows to just the analytics work. Happy to wait for #46 review before this one moves; happy to combine into a single PR if that is preferred.

Why this matters

`quiet` and `trends` answer them. `discrawl` already has the schema, indexes (`idx_messages_guild_created_id`), and the `internal/report/` package needed. The SQL is one query each; no schema changes.

Output

Flags

- `analytics quiet`: `--since` (default `30d`), `--guild`. Inherits `--json` and `--plain` from the root CLI.
- `analytics trends`: `--weeks` (default `8`), `--guild`, `--channel`. Same root-CLI inheritance.
- `analytics digest`: same flags as `digest` (delegates to `runDigest`, identical behavior).

Scope

- `messages`/`channels` tables.
- `quiet` and `trends` filter to message-bearing channel kinds (`text`, `announcement`, `thread_public`, `thread_private`, `thread_announcement`) so category and voice channels do not appear as never-active or all-zero rows. Mirrors the syncer's `messageChannelKinds()` predicate.
- `internal/report/quiet.go`, `internal/report/quiet_test.go`, `internal/report/trends.go`, `internal/report/trends_test.go`, `internal/cli/analytics.go`, `internal/cli/analytics_test.go`.
- `internal/cli/cli.go` (one switch case), `internal/cli/output.go` (printPlain + printHuman cases for `Quiet` and `Trends`, plus `analytics` in usage), `README.md`, `SPEC.md`.

Test plan

- `gofmt -l .` clean
- `go vet ./...` clean
- `go build ./cmd/discrawl`
- `go test ./...` passes (11 packages, all green)

Self-reviewed via `codex review` before pushing. The first review pass flagged that `quiet` and `trends` would surface category and voice channels as never-active, which would be misleading. Fixed in a follow-up commit on this branch by adding the kind filter described in Scope.

Open questions

- `analytics` vs `insights` vs `stats`. Slacrawl went with `analytics`; happy to rename.
- `internal/report/` vs new `internal/analytics/` package: kept everything in `internal/report/` alongside `report.go` and `digest.go` to match the current organization. If you would prefer to carve out an `internal/analytics/` package, that is an easy follow-up PR.
- `--since` for `quiet`: went with `30d` (slacrawl's default). For Discord's larger guilds with many low-activity channels, `60d` or `90d` may be more useful. Happy to change.
- `quiet` order: zero-activity (`silent: -`) first, then by name alphabetically. The slacrawl version sorts by last-message ascending, which puts oldest first; this implementation puts never-active first since they are the strongest archive candidates. Happy to flip if you prefer slacrawl's ordering.

Subsequent phases (not in this PR)

Slacrawl's plan doc had `health` and `response-times` as phase 3, then `threads-stale` and `activity` as phase 4. Same idea here; happy to follow up if these land well.

This contribution was developed with AI assistance.