feat(discord): validate staging JSON with Pydantic; document schema; align exporter fields by leostar0412 · Pull Request #198 · cppalliance/boost-data-collector

leostar0412 · 2026-05-12T18:51:39Z

Summary

Adds a documented Discord staging JSON model, validates envelopes and normalized messages with Pydantic before DB writes in backfill and run_discord_activity_tracker, ships a generated JSON Schema (discord_staging_v1.json), and tightens exporter output (canonical timestamps, actor/source fields where applicable). Shared parsing and truncation live in core so Discord and GitHub trackers stay consistent.

Changes

Docs: docs/discord-tracker-schema.md — envelope, normalized message, reactions, limitations, JSON Schema vs runtime validation.
Discord: staging_schema.py, schemas/discord_staging_v1.json, scripts/write_staging_json_schema.py; validation wired in backfill + run command; chat_exporter / workspace iteration updates; tests for schema, workspace ordering, exporter, sync utils.
Core: format_instant_iso_z / parse_iso_datetime_lenient (and tests); truncate_content helper + tests.
GitHub: sync/utils.py uses core’s lenient datetime parser (dedupe).
Deps: pydantic in requirements.in with lockfile updates.

How to test

pytest discord_activity_tracker/tests/test_staging_schema.py \
  discord_activity_tracker/tests/test_sync_chat_exporter.py \
  discord_activity_tracker/tests/test_run_discord_activity_tracker_command.py \
  core/tests/test_datetime_parsing.py \
  core/tests/test_text_processing.py

Closes #188

Summary by CodeRabbit

New Features
- Lenient ISO datetime parsing/formatting with safe fallbacks
- Text truncation utility
- Discord staging JSON Schema, runtime validation, and schema export script
Improvements
- Consistent datetime normalization across syncs and earlier validation of incoming exports
- Active-channel logic now derives activity from stored messages; per-channel timestamp fields removed
- Deterministic ordering of message JSON processing; admin list display simplified
Tests
- Expanded coverage for datetime parsing/formatting, truncation, conversion, staging-schema validation, and sync flows
Documentation
- Added/updated Discord tracker schema and service API docs

…chema support

coderabbitai · 2026-05-12T18:54:48Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds Pydantic staging models and a committed JSON Schema + writer; centralizes lenient ISO datetime parsing, canonical-Z formatting, and text truncation; integrates envelope/normalized-message validation into Discord sync/backfill/export; replaces channel timestamp fields with message-derived activity queries; updates tests, docs, and a migration.

Changes

Discord Staging Schema & Utility Consolidation

Layer / File(s)	Summary
Core DateTime & Text Utilities `core/utils/datetime_parsing.py`, `core/utils/text_processing.py`, `core/tests/test_datetime_parsing.py`, `core/tests/test_text_processing.py`	Adds `parse_iso_datetime_lenient()`, `format_instant_iso_z()`, `CANONICAL_INSTANT_UTC_Z_PATTERN`, module `logger`, and `truncate_content()` with tests covering empty/valid/invalid/edge cases.
Pydantic Schema Models & Validation `discord_activity_tracker/staging_schema.py`	Pydantic `DiscordChatExporterEnvelope` and `NormalizedDiscordMessage` models, validators, `_validation_error` translation to `ValueError`, JSON-schema bundle builder, and `write_staging_json_schema()` helper.
Committed JSON Schema & Generation Script `discord_activity_tracker/schemas/discord_staging_v1.json`, `discord_activity_tracker/scripts/write_staging_json_schema.py`, `discord_activity_tracker/scripts/__init__.py`	Committed draft 2020-12 JSON Schema for envelope and normalized message; script to write the schema from models and package docstring.
Chat Exporter Conversion & Normalization `discord_activity_tracker/sync/chat_exporter.py`, `discord_activity_tracker/tests/test_sync_chat_exporter.py`	`convert_exporter_message_to_dict()` accepts `server_id`/`channel_id`, normalizes/coerces and filters reactions, normalizes timestamps, sets `occurred_at` for canonical timestamps, derives `actor_id`, and conditionally sets `source_url`; tests updated.
Sync Messages & Management Commands Integration `discord_activity_tracker/sync/messages.py`, `discord_activity_tracker/management/commands/*.py`	Sync and management commands validate envelopes, use `parse_iso_datetime_lenient()` for timestamps, convert messages with explicit server/channel ids, validate normalized messages, and stop persisting per-channel last-activity/synced timestamps.
Services, Models & Migration `discord_activity_tracker/services.py`, `discord_activity_tracker/models.py`, `discord_activity_tracker/migrations/0006_remove_channel_last_timestamps.py`	Removes `DiscordChannel.last_activity_at`/`last_synced_at` and index; adds `get_channel_latest_message_at`, `queryset_channels_with_recent_messages`, and updates `get_active_channels` to use message-derived activity; migration added; admin/docs updated.
Export, Workspace Ordering, Tests & Docs `discord_activity_tracker/sync/export.py`, `discord_activity_tracker/workspace.py`, `discord_activity_tracker/tests/*`, `docs/discord-tracker-schema.md`	Export delegates active-channel selection to service helper; `iter_existing_message_jsons()` yields sorted filenames deterministically; adds tests for staging validation, normalized messages, reaction handling, datetime regex/formatting, truncation, and schema documentation.
GitHub Adapter & Requirements `github_activity_tracker/sync/utils.py`, `requirements.in`	GitHub adapter re-exports `parse_iso_datetime_lenient` from core; `pydantic>=2,<3` added to dependencies.
Misc: scripts & cleanup `discord_activity_tracker/scripts/*`, `discord_activity_tracker/sync/utils.py`	Adds script entrypoint to write schema, module docstring, and removes local `parse_datetime`/`truncate_content` helpers in favour of core utilities.

Sequence Diagram(s)

sequenceDiagram
  participant Cmd as Management Command
  participant JSONFile as Discord JSON
  participant ValidateEnv as validate_envelope
  participant Convert as convert_exporter_message_to_dict
  participant ValidateMsg as validate_normalized_message
  participant Parser as parse_iso_datetime_lenient
  participant DB as Database

  Cmd->>JSONFile: Load JSON export
  JSONFile-->>Cmd: Raw envelope dict
  Cmd->>ValidateEnv: validate_envelope(raw, source)
  ValidateEnv-->>Cmd: DiscordChatExporterEnvelope model
  Cmd->>Convert: convert_exporter_message_to_dict(msg, server_id, channel_id)
  Convert->>Parser: parse created_at/edited_at -> datetime
  Parser-->>Convert: datetime (or None)
  Convert-->>Cmd: Normalized dict (actor_id, occurred_at, source_url)
  Cmd->>ValidateMsg: validate_normalized_message(normalized, source)
  ValidateMsg-->>Cmd: NormalizedDiscordMessage model
  Cmd->>DB: Persist validated message

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

jonathanMLDev
wpak-ai

"🐰 Hop hooray — schemas, dates, and text aligned!
I trimmed long tails and Z-times I signed,
Validators guard each normalized line,
Reactions clean, and source URLs shine,
Hop—tests pass and trackers hum in kind!"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 40.86% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title directly addresses the primary objective: it announces the addition of Pydantic-based staging JSON validation, schema documentation, and alignment of exporter fields to canonical form—exactly what the PR delivers.
Linked Issues check	✅ Passed	All coding-related requirements from issue `#188` are met: JSON schema documented in docs/discord-tracker-schema.md [`#188`], Pydantic validation models in staging_schema.py with validate_envelope and validate_normalized_message [`#188`], snake_case fields and ISO Z timestamps [`#188`], test coverage for well-formed messages/reactions and malformed rejection [`#188`], and deterministic workspace file ordering [`#188`].
Out of Scope Changes check	✅ Passed	All changes align with issue `#188` scope: core datetime/text utilities support the validation layer, admin/model/service/sync refactoring removes obsolete timestamp-tracking fields, migration removes deprecated columns, and tests validate the new schema/validation infrastructure.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

discord_activity_tracker/management/commands/run_discord_activity_tracker.py (1)

202-207: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Same validation-bypass bug here: consume validate_envelope(...) output.

validate_envelope is currently verification-only in this path. Its normalized/coerced values are discarded, so malformed-but-coercible payloads can still fail later when reading raw data.

Suggested fix

-            data = parse_exported_json(json_path)
-            validate_envelope(data, source=json_path.name)
-            guild_info = data.get("guild", {})
-            channel_info = data.get("channel", {})
-            messages = data.get("messages", [])
+            data = parse_exported_json(json_path)
+            envelope = validate_envelope(data, source=json_path.name)
+            guild_info = envelope.guild.model_dump(by_alias=True)
+            channel_info = envelope.channel.model_dump(by_alias=True)
+            messages = envelope.messages

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@discord_activity_tracker/management/commands/run_discord_activity_tracker.py`
around lines 202 - 207, validate_envelope(...) is being used only for
side-effect validation while the code continues to use the original raw `data`,
discarding normalized/coerced values; change the flow to use the function's
returned/normalized envelope (e.g. assign the result of
validate_envelope(json_path.name) to a variable or replace `data` with that
return value) and then read `guild_info`, `channel_info`, and `messages` from
that validated object (referencing parse_exported_json, validate_envelope, and
the variables `data`, `guild_info`, `channel_info`, `messages`) so you consume
the coerced values rather than the original raw payload.

🧹 Nitpick comments (2)

core/tests/test_text_processing.py (1)

73-86: ⚡ Quick win

Add a regression test for negative max_length behavior

These tests cover the positive-path boundaries well. Add one case for negative max_length to pin expected behavior after the utility guard is introduced.

Suggested test case

+import pytest
+
 def test_truncate_content_max_length_at_most_three():
     assert truncate_content("abcdef", max_length=2) == "ab"
+
+
+def test_truncate_content_negative_max_length_raises():
+    with pytest.raises(ValueError):
+        truncate_content("abcdef", max_length=-1)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/tests/test_text_processing.py` around lines 73 - 86, Add a regression
test for truncate_content when max_length is negative: create a new test (e.g.,
test_truncate_content_negative_max_length) that calls
truncate_content(some_string, max_length=-1) and asserts the expected behavior
(either returns the original string, an empty string, or clamps to 0 depending
on the implemented guard). Locate truncate_content usage in the test module
(functions like test_truncate_content_short_unchanged,
test_truncate_content_long_adds_ellipsis,
test_truncate_content_max_length_at_most_three) and add the negative-case
assertion consistent with the utility's guard.

discord_activity_tracker/tests/test_staging_schema.py (1)

25-35: ⚡ Quick win

Add a negative timestamp-format test to lock canonical Z behavior.

Current tests don’t assert that malformed/non-Z timestamps are rejected, so this contract can regress silently.

🧪 Suggested test addition

+def test_validate_normalized_rejects_non_canonical_timestamp():
+    raw = _minimal_exporter_message()
+    converted = convert_exporter_message_to_dict(raw, server_id=1, channel_id=2)
+    converted["created_at"] = "2026-01-01 12:00:00"  # not ISO Z
+
+    with pytest.raises(ValueError, match="Invalid normalized Discord message"):
+        validate_normalized_message(converted, source="unit")

Also applies to: 55-81

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@discord_activity_tracker/tests/test_staging_schema.py` around lines 25 - 35,
Add a negative test that ensures timestamps without the canonical trailing "Z"
are rejected: create a new test (e.g.,
test_validate_normalized_rejects_non_z_timestamp) that builds a message via
_minimal_exporter_message() and convert_exporter_message_to_dict(...), then
mutate converted["occurred_at"] to a non-`Z` form (e.g., replace trailing "Z"
with "+00:00" or drop the "Z"), and assert
validate_normalized_message(converted, source="test") raises the expected
validation error (use pytest.raises or the project’s ValidationError type).
Repeat the same style of negative assertion for the other related tests in the
55-81 area so the canonical `Z` contract is locked.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@core/utils/text_processing.py`:
- Around line 249-255: The function truncate_content currently allows negative
max_length which triggers Python negative slicing; add an explicit guard at the
start of truncate_content to handle max_length < 0 (e.g., raise a ValueError
with a clear message like "max_length must be non-negative" or otherwise
normalize negative values to 0) before the existing len/content checks so the
function has deterministic behavior for negative limits; keep the existing
branches (<= max_length, <= 3, and the "..."/trimming logic) intact.

In
`@discord_activity_tracker/management/commands/backfill_discord_activity_tracker.py`:
- Around line 83-89: validate_envelope is called but its normalized return value
is ignored, so downstream reads (guild_info/channel_info/messages) still use the
raw data and can be None (e.g. messages). Replace usage of raw data with the
validated envelope result: capture the return of validate_envelope(data,
source=rel) (e.g. validated = validate_envelope(...)) and then derive guild_info
= validated.get("guild", {}), channel_info = validated.get("channel", {}), and
messages = validated.get("messages", []) so the normalized values from
validate_envelope are used.

In `@discord_activity_tracker/schemas/discord_staging_v1.json`:
- Around line 131-135: The schema uses unresolved refs like
"#/$defs/DiscordExporterGuild" and "#/$defs/DiscordExporterChannel"; update
those $ref pointers to the actual locations where the defs live (e.g., change
"#/$defs/DiscordExporterGuild" and "#/$defs/DiscordExporterChannel" to
"#/discord_chat_exporter_envelope/$defs/DiscordExporterGuild" and
"#/discord_chat_exporter_envelope/$defs/DiscordExporterChannel"), and similarly
update refs that point into normalized_discord_message to
"#/normalized_discord_message/$defs/YourDefName" (replace YourDefName with the
exact symbols such as DiscordMessage, etc.) so all $ref values point to the
correct nested paths (use the exact symbol names DiscordExporterGuild,
DiscordExporterChannel, and the normalized_discord_message defs to locate and
fix each reference).
- Around line 233-249: The schema currently allows any non-empty string for
created_at, edited_at, and occurred_at; tighten these fields to enforce
canonical ISO-8601 UTC timestamps by replacing the loose string constraint with
a pattern that requires the full-dateTime with trailing Z (e.g., use a regex
like ^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z$). For created_at and
occurred_at set type: "string" plus the pattern; for edited_at keep the anyOf
allowing null but change the string branch to require the same pattern (and
remove or keep default null as appropriate). Update the schema entries for
created_at, edited_at, and occurred_at accordingly.

In `@discord_activity_tracker/sync/chat_exporter.py`:
- Around line 649-665: The reaction count parsing can raise ValueError when
external JSON supplies a non-numeric "count"; in the loop that builds
reactions_out from msg_data (look for the for reaction in
msg_data.get("reactions", []) block) wrap the conversion of
reaction.get("count", 0) to an int in a safe guard: validate the raw value
(e.g., check isinstance(...) for int/str digits or try/except
ValueError/TypeError) and on failure fall back to 0, then use max(0,
parsed_count) when appending to reactions_out so malformed exporter payloads do
not abort message conversion.

---

Outside diff comments:
In
`@discord_activity_tracker/management/commands/run_discord_activity_tracker.py`:
- Around line 202-207: validate_envelope(...) is being used only for side-effect
validation while the code continues to use the original raw `data`, discarding
normalized/coerced values; change the flow to use the function's
returned/normalized envelope (e.g. assign the result of
validate_envelope(json_path.name) to a variable or replace `data` with that
return value) and then read `guild_info`, `channel_info`, and `messages` from
that validated object (referencing parse_exported_json, validate_envelope, and
the variables `data`, `guild_info`, `channel_info`, `messages`) so you consume
the coerced values rather than the original raw payload.

---

Nitpick comments:
In `@core/tests/test_text_processing.py`:
- Around line 73-86: Add a regression test for truncate_content when max_length
is negative: create a new test (e.g., test_truncate_content_negative_max_length)
that calls truncate_content(some_string, max_length=-1) and asserts the expected
behavior (either returns the original string, an empty string, or clamps to 0
depending on the implemented guard). Locate truncate_content usage in the test
module (functions like test_truncate_content_short_unchanged,
test_truncate_content_long_adds_ellipsis,
test_truncate_content_max_length_at_most_three) and add the negative-case
assertion consistent with the utility's guard.

In `@discord_activity_tracker/tests/test_staging_schema.py`:
- Around line 25-35: Add a negative test that ensures timestamps without the
canonical trailing "Z" are rejected: create a new test (e.g.,
test_validate_normalized_rejects_non_z_timestamp) that builds a message via
_minimal_exporter_message() and convert_exporter_message_to_dict(...), then
mutate converted["occurred_at"] to a non-`Z` form (e.g., replace trailing "Z"
with "+00:00" or drop the "Z"), and assert
validate_normalized_message(converted, source="test") raises the expected
validation error (use pytest.raises or the project’s ValidationError type).
Repeat the same style of negative assertion for the other related tests in the
55-81 area so the canonical `Z` contract is locked.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f47ecc0a-1aeb-48c9-802a-7fdd433b38db

📥 Commits

Reviewing files that changed from the base of the PR and between 43679a0 and 064181b.

⛔ Files ignored due to path filters (2)

requirements-dev.lock is excluded by !**/*.lock
requirements.lock is excluded by !**/*.lock

📒 Files selected for processing (21)

core/tests/test_datetime_parsing.py
core/tests/test_text_processing.py
core/utils/datetime_parsing.py
core/utils/text_processing.py
discord_activity_tracker/management/commands/backfill_discord_activity_tracker.py
discord_activity_tracker/management/commands/run_discord_activity_tracker.py
discord_activity_tracker/schemas/discord_staging_v1.json
discord_activity_tracker/scripts/__init__.py
discord_activity_tracker/scripts/write_staging_json_schema.py
discord_activity_tracker/staging_schema.py
discord_activity_tracker/sync/chat_exporter.py
discord_activity_tracker/sync/messages.py
discord_activity_tracker/sync/utils.py
discord_activity_tracker/tests/test_staging_schema.py
discord_activity_tracker/tests/test_sync_chat_exporter.py
discord_activity_tracker/tests/test_sync_utils.py
discord_activity_tracker/tests/test_workspace.py
discord_activity_tracker/workspace.py
docs/discord-tracker-schema.md
github_activity_tracker/sync/utils.py
requirements.in

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@discord_activity_tracker/tests/test_staging_schema.py`:
- Around line 106-110: The test test_validate_envelope_rejects_non_list_messages
should use valid guild and channel shapes so the failure comes from messages not
being a list; update the test's call to validate_envelope to pass realistic
guild and channel dicts (matching shapes used elsewhere in tests) instead of
empty {} for "guild" and "channel" while keeping "messages": "nope" and the same
pytest.raises assertion so the error is triggered by the non-list messages.
- Around line 83-103: The test
test_validate_normalized_rejects_created_at_without_z_suffix currently builds a
full hand-crafted payload that may fail for reasons other than the timestamp;
instead, obtain a known-valid normalized message (e.g., from the existing valid
fixture or conversion helper used elsewhere in tests) and only mutate the
created_at field to "2026-01-01T00:00:00+00:00" before calling
validate_normalized_message(source="unit"), so the assertion specifically checks
the Z-suffix rejection and nothing else.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 24720052-20ac-44a4-9a10-072bc9679155

📥 Commits

Reviewing files that changed from the base of the PR and between 064181b and 1efb055.

📒 Files selected for processing (11)

core/tests/test_datetime_parsing.py
core/tests/test_text_processing.py
core/utils/datetime_parsing.py
core/utils/text_processing.py
discord_activity_tracker/management/commands/backfill_discord_activity_tracker.py
discord_activity_tracker/management/commands/run_discord_activity_tracker.py
discord_activity_tracker/schemas/discord_staging_v1.json
discord_activity_tracker/staging_schema.py
discord_activity_tracker/sync/chat_exporter.py
discord_activity_tracker/tests/test_staging_schema.py
discord_activity_tracker/tests/test_sync_chat_exporter.py

🚧 Files skipped from review as they are similar to previous changes (8)

core/tests/test_text_processing.py
discord_activity_tracker/tests/test_sync_chat_exporter.py
discord_activity_tracker/management/commands/backfill_discord_activity_tracker.py
discord_activity_tracker/schemas/discord_staging_v1.json
core/utils/datetime_parsing.py
discord_activity_tracker/management/commands/run_discord_activity_tracker.py
discord_activity_tracker/sync/chat_exporter.py
discord_activity_tracker/staging_schema.py

…for message creation and improve validation checks.

…st_synced_at fields from DiscordChannel model. Update related services and tests to utilize message timestamps instead. Enhance datetime parsing functions for improved error handling and normalization.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@discord_activity_tracker/management/commands/run_discord_activity_tracker.py`:
- Around line 201-204: validate_envelope invocation can raise validation errors
but current generic error handling unlinks (deletes) the input file; change the
flow so validation failures do not remove the source JSON. Wrap the call to
validate_envelope (and subsequent envelope.* access) in a specific try/except
for validation errors (the exception type raised by validate_envelope) and in
that handler preserve the file (e.g., leave it in place or move it to a "failed"
folder) instead of unlinking; keep the existing generic error path for
non-validation exceptions that still perform unlinking. Refer to
validate_envelope, envelope, and json_path to locate the change.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 94aaa79b-d3df-43eb-81f3-e9ae929da71f

📥 Commits

Reviewing files that changed from the base of the PR and between 2165660 and af07fd7.

📒 Files selected for processing (18)

core/utils/datetime_parsing.py
discord_activity_tracker/admin.py
discord_activity_tracker/management/commands/backfill_discord_activity_tracker.py
discord_activity_tracker/management/commands/run_discord_activity_tracker.py
discord_activity_tracker/migrations/0006_remove_channel_last_timestamps.py
discord_activity_tracker/models.py
discord_activity_tracker/services.py
discord_activity_tracker/staging_schema.py
discord_activity_tracker/sync/chat_exporter.py
discord_activity_tracker/sync/export.py
discord_activity_tracker/sync/messages.py
discord_activity_tracker/tests/test_messages_more.py
discord_activity_tracker/tests/test_services_core.py
discord_activity_tracker/tests/test_services_extras.py
discord_activity_tracker/tests/test_sync_messages.py
discord_activity_tracker/tests/test_sync_utils.py
docs/Schema.md
docs/service_api/discord_activity_tracker.md

💤 Files with no reviewable changes (4)

discord_activity_tracker/admin.py
discord_activity_tracker/models.py
discord_activity_tracker/tests/test_services_extras.py
docs/Schema.md

🚧 Files skipped from review as they are similar to previous changes (2)

discord_activity_tracker/sync/chat_exporter.py
discord_activity_tracker/staging_schema.py

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

discord_activity_tracker/tests/test_run_discord_activity_tracker_command.py (1)
44-64: ⚡ Quick win

Add a command-level test for file retention on validation failures.

These tests only lock down the helper’s current string matching. The user-visible contract lives in task_discord_sync: staging validation failures should keep the JSON file, while unrelated failures should still delete it. A small command-level test for those two paths would catch regressions that this unit coverage won’t.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@discord_activity_tracker/tests/test_run_discord_activity_tracker_command.py`
around lines 44 - 64, Add a command-level test in
tests/test_run_discord_activity_tracker_command.py that exercises the
task_discord_sync command path (the user-visible contract) to assert that when a
staging validation failure occurs (use
_is_discord_staging_validation_error-friendly input or mock task_discord_sync to
raise that specific ValueError) the original JSON file is preserved on disk, and
that when an unrelated failure occurs the file is deleted; target the task entry
point (task_discord_sync or the command runner used by the tests) and use
temporary files/fixtures to create a JSON file, invoke the command, then assert
file existence for the validation-error case and file absence for the other
error case so regressions of the retention behaviour are caught.
discord_activity_tracker/management/commands/run_discord_activity_tracker.py (1)
376-382: The validation call doesn't cause data loss, but could be clearer.

validate_normalized_message(...) is called for error-checking side effects and its return value (a NormalizedDiscordMessage model) is discarded. The function does transform blank timestamp strings to None, but _prepare_message_data calls parse_iso_datetime_lenient() which already handles both empty strings and None gracefully, so no data is lost.

However, the intent here is unclear: either wire the validated model into converted for consistency, or make the validation call's purpose explicit with a comment (validate early to catch bad data).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@discord_activity_tracker/management/commands/run_discord_activity_tracker.py`
around lines 376 - 382, The call to validate_normalized_message(...) currently
discards its return value which makes intent unclear; either replace each
converted entry with the validated NormalizedDiscordMessage returned by
validate_normalized_message (i.e., assign converted[idx] =
validate_normalized_message(...)) so downstream uses the canonical model, or if
you only want early error checking, add a short comment above the loop stating
"validate only for early error checking; do not replace converted entries" and
keep the call for side-effects; reference validate_normalized_message,
convert_exporter_message_to_dict, _prepare_message_data,
parse_iso_datetime_lenient, converted, and _process_messages_in_batches when
making the change so reviewers can find the logic.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@discord_activity_tracker/management/commands/run_discord_activity_tracker.py`:
- Around line 60-67: Replace the brittle string-based check in
_is_discord_staging_validation_error with a concrete typed exception: add a new
StagingValidationError class in staging_schema.py and have validate_envelope()
and validate_normalized_message() raise StagingValidationError on validation
failures, then change the caller that currently uses
_is_discord_staging_validation_error (the delete-guard logic around the Discord
staging processing) to catch StagingValidationError directly instead of matching
ValueError text; remove or simplify _is_discord_staging_validation_error to use
isinstance(exc, staging_schema.StagingValidationError) (or inline the isinstance
check where the guard runs) so malformed staging payloads are reliably
preserved.

---

Nitpick comments:
In
`@discord_activity_tracker/management/commands/run_discord_activity_tracker.py`:
- Around line 376-382: The call to validate_normalized_message(...) currently
discards its return value which makes intent unclear; either replace each
converted entry with the validated NormalizedDiscordMessage returned by
validate_normalized_message (i.e., assign converted[idx] =
validate_normalized_message(...)) so downstream uses the canonical model, or if
you only want early error checking, add a short comment above the loop stating
"validate only for early error checking; do not replace converted entries" and
keep the call for side-effects; reference validate_normalized_message,
convert_exporter_message_to_dict, _prepare_message_data,
parse_iso_datetime_lenient, converted, and _process_messages_in_batches when
making the change so reviewers can find the logic.

In `@discord_activity_tracker/tests/test_run_discord_activity_tracker_command.py`:
- Around line 44-64: Add a command-level test in
tests/test_run_discord_activity_tracker_command.py that exercises the
task_discord_sync command path (the user-visible contract) to assert that when a
staging validation failure occurs (use
_is_discord_staging_validation_error-friendly input or mock task_discord_sync to
raise that specific ValueError) the original JSON file is preserved on disk, and
that when an unrelated failure occurs the file is deleted; target the task entry
point (task_discord_sync or the command runner used by the tests) and use
temporary files/fixtures to create a JSON file, invoke the command, then assert
file existence for the validation-error case and file absence for the other
error case so regressions of the retention behaviour are caught.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: eabbdbf5-b74a-454a-8bc7-87497d139ad3

📥 Commits

Reviewing files that changed from the base of the PR and between af07fd7 and b82b25f.

📒 Files selected for processing (2)

discord_activity_tracker/management/commands/run_discord_activity_tracker.py
discord_activity_tracker/tests/test_run_discord_activity_tracker_command.py

…idationError for validation failures in envelope and normalized message processing.

…ent tests and updating docstring references for clarity.

feat: enhance Discord activity tracker with Pydantic validation and s…

064181b

…chema support

leostar0412 self-assigned this May 12, 2026

coderabbitai Bot reviewed May 12, 2026

View reviewed changes

Add changelog, project configuration, and versioning support

1efb055

coderabbitai Bot reviewed May 12, 2026

View reviewed changes

Comment thread discord_activity_tracker/tests/test_staging_schema.py

Comment thread discord_activity_tracker/tests/test_staging_schema.py

Refactor tests in test_staging_schema.py to utilize helper functions …

2165660

…for message creation and improve validation checks.

leostar0412 requested a review from jonathanMLDev May 12, 2026 19:57

jonathanMLDev reviewed May 13, 2026

View reviewed changes

Refactor Discord activity tracker by removing last_activity_at and la…

af07fd7

…st_synced_at fields from DiscordChannel model. Update related services and tests to utilize message timestamps instead. Enhance datetime parsing functions for improved error handling and normalization.

coderabbitai Bot reviewed May 13, 2026

View reviewed changes

Comment thread discord_activity_tracker/management/commands/run_discord_activity_tracker.py

Implement staging validation error handling in Discord activity tracker.

b82b25f

coderabbitai Bot reviewed May 13, 2026

View reviewed changes

Comment thread discord_activity_tracker/management/commands/run_discord_activity_tracker.py Outdated

Refactor error handling in Discord activity tracker to use StagingVal…

fa4b028

…idationError for validation failures in envelope and normalized message processing.

leostar0412 requested a review from jonathanMLDev May 13, 2026 16:38

Refactor tests in test_sync_utils.py by removing unused truncate_cont…

0c0e125

…ent tests and updating docstring references for clarity.

jonathanMLDev approved these changes May 13, 2026

View reviewed changes

leostar0412 requested a review from wpak-ai May 13, 2026 17:00

wpak-ai approved these changes May 13, 2026

View reviewed changes

wpak-ai merged commit 4027f51 into cppalliance:develop May 13, 2026
6 of 7 checks passed

Conversation

leostar0412 commented May 12, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

How to test

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

leostar0412 commented May 12, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 12, 2026 •

edited

Loading