Add discord_activity_tracker app with DiscordChatExporter CLI integration #50
Conversation
snowfox1003
left a comment
I reviewed your result (https://github.com/CppDigest/discord-cplusplus-together-context) and noticed a few issues:
- Timestamps are not unique. It appears the time may be based on your local time zone. Could you standardize all timestamps to UTC (or clearly specify the time zone)?
- Some content appears corrupted or incorrectly formatted.
Example: https://github.com/CppDigest/discord-cplusplus-together-context/blob/main/2026/2026-02/2026-02-c-help-text.md#1416---mxtreme-1
- Files such as 2026-02-c-cpp-discussion.md are too large. Could you split them into separate files per day if a single day's content is too big?
- The Markdown formatting should be cleaner and more readable in preview mode. Please review the llvm-project-context repository and align the formatting style accordingly.
- Update reply references from
  Reply to: @roemy2826 (01:59)
  to a linkable format such as:
  Reply to: [@roemy2826 (01:59)](<filename>#<date>-@<username>)
- Bot messages are missing. Is it not possible to retrieve bot-generated content?
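The linkable reply format requested above could be produced by a small helper along these lines. This is only a sketch: the function name `format_reply_reference` and its parameters are hypothetical, the date value in the usage note is made up for illustration, and whether the resulting anchor resolves on GitHub depends on how the exported headings are actually written.

```python
def format_reply_reference(username: str, time_hhmm: str, filename: str, date: str) -> str:
    """Build a Markdown reply link following the template
    Reply to: [@<username> (<time>)](<filename>#<date>-@<username>).

    All four arguments are plain strings supplied by the caller; this
    helper does not check that the target heading exists in the file.
    """
    anchor = f"{date}-@{username}"
    return f"Reply to: [@{username} ({time_hhmm})]({filename}#{anchor})"
```

For example, `format_reply_reference("roemy2826", "01:59", "2026-02-c-help-text.md", "2026-02-14")` yields a link into that file with the anchor `2026-02-14-@roemy2826`.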
Hello @snowfox1003
01:34:12.223 UTC — @TartarusFire
jensen's background music makes it better
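A minimal sketch of how timestamps might be normalized to UTC and messages bucketed per UTC day, as the review asks. It assumes each message is a dict with an ISO 8601 `timestamp` string that carries an explicit offset; that shape and the helper names are assumptions for illustration, not the app's actual data model.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Iterable, List


def to_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp with an explicit offset and convert it to UTC."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Refuse naive values: guessing the zone is how local-time output creeps in.
        raise ValueError(f"timestamp has no UTC offset: {timestamp!r}")
    return parsed.astimezone(timezone.utc)


def utc_time_label(timestamp: str) -> str:
    """Render a time of day with millisecond precision, e.g. '01:34:12.223 UTC'."""
    moment = to_utc(timestamp)
    millis = moment.microsecond // 1000
    return f"{moment:%H:%M:%S}.{millis:03d} UTC"


def group_by_utc_day(messages: Iterable[dict]) -> Dict[str, List[dict]]:
    """Bucket messages by their UTC calendar date (YYYY-MM-DD)."""
    days: Dict[str, List[dict]] = defaultdict(list)
    for message in messages:
        day = to_utc(message["timestamp"]).date().isoformat()
        days[day].append(message)
    return dict(days)
```

Note that a message sent at 20:34 in a UTC-05:00 zone lands in the next UTC day, so grouping has to happen after conversion rather than on the local date.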
… per-day export
…lector into feature/issue-26
snowfox1003
left a comment
Review feedback
Performance
- Exporting only 10 days of Discord history can take over an hour. That is a known limitation when you pull a lot of history at once.
Intended use vs full scrape
- The design is for daily-incremental runs (like in boost-data-collector). Each run only grabs new data since the last sync, so one day's worth is small and runs stay fast.
- For full-historical scrapes, a different approach may work better instead of using the same flow with a big --days-back or "all history" from the CLI.
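The incremental window described above can be sketched as follows. The PR description mentions using `min(sync_date, days_back_date)` so that `--days-back` can widen the window; the function name `compute_sync_start`, the treatment of `--full-sync` as "no lower bound", and the behavior on a first run are assumptions made for this illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def compute_sync_start(
    now: datetime,
    last_synced_at: Optional[datetime],
    days_back: Optional[int] = None,
    full_sync: bool = False,
) -> Optional[datetime]:
    """Return the earliest timestamp to fetch from, or None for no lower bound."""
    if full_sync:
        # Assumption: a full sync ignores any stored sync state.
        return None
    candidates = []
    if last_synced_at is not None:
        candidates.append(last_synced_at)
    if days_back is not None:
        candidates.append(now - timedelta(days=days_back))
    # Taking the earlier of the two lets --days-back widen the window,
    # but never narrow it past the last successful sync.
    return min(candidates) if candidates else None
```

With a daily schedule, `last_synced_at` is roughly one day old, so the window stays small unless `days_back` is passed explicitly.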
Data and service layer
- All user/identity profiles in this project live in cppa_user_tracker.
- When you add or update user-related records (identities, profiles, emails, etc.), use cppa_user_tracker.services. Do not write to models or other apps directly.
- See docs/Contributing.md and docs/service_api/cppa_user_tracker.md.
Documentation
- Project docs live under the docs/ folder.
- Add (or extend) service API docs: add docs/service_api/discord_activity_tracker.md and link it from docs/service_api/README.md so the Discord tracker is in the same service API index as the other apps.
Workspace and raw files
- Use the app-level workspace (each app's workspace module or config.workspace.get_workspace_path(app_slug)). Do not hardcode project-root paths like workspace/exporter_temp.
- Put raw export files in that app's workspace under a raw subfolder (e.g. workspace/raw/discord_activity_tracker).
- This follows the pattern in docs/Workspace.md: one workspace root, one subfolder per app. Temp and raw data stay under the app that owns them.
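The layout described in these points can be sketched with `pathlib`. The helpers below are hypothetical stand-ins written for illustration; they are not the project's `config.workspace.get_workspace_path`, and the root directory name `workspace` is taken from the example paths in the review rather than from the actual settings.

```python
from pathlib import Path

# Assumed workspace root, relative to the project directory.
WORKSPACE_ROOT = Path("workspace")


def app_workspace_path(app_slug: str, root: Path = WORKSPACE_ROOT) -> Path:
    """Return the per-app working directory, e.g. workspace/<app_slug>."""
    return root / app_slug


def app_raw_path(app_slug: str, root: Path = WORKSPACE_ROOT) -> Path:
    """Return the per-app raw-data directory, e.g. workspace/raw/<app_slug>."""
    return root / "raw" / app_slug
```

Deriving every path from the app slug keeps temp and raw files under the app that owns them, so nothing needs to refer to a shared location such as `workspace/exporter_temp`.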
@coderabbitai review
✅ Actions performed: Review triggered.
📝 Walkthrough
Adds a new discord_activity_tracker Django app with models, migrations, services, async sync clients and DiscordChatExporter wrappers, markdown export and git push tooling, management commands, workspace utilities, tests, docs, settings/env entries, and migrations that move DiscordUser to DiscordProfile and delete DiscordUser.
Sequence Diagram(s)
sequenceDiagram
participant User
participant Cmd as Management Command
participant Client as DiscordSyncClient
participant Exporter as DiscordChatExporter
participant Parser as Export Parser
participant Services as Service Layer
participant DB as Database
participant Git as Git Repo
User->>Cmd: invoke (sync / import / export / debug)
alt sync via discord.py
Cmd->>Client: connect & fetch guild/channels/messages
Client-->>Services: raw message dicts
else sync via DiscordChatExporter
Cmd->>Exporter: export_guild_to_json(user_token, guild_id)
Exporter-->>Parser: JSON files
Parser-->>Services: parsed message dicts
end
Services->>DB: bulk upsert users, messages, reactions
DB-->>Services: created/updated results
Services->>DB: update channel last_activity/last_synced
Cmd->>Services: export_and_push(context_repo_path, server)
Services->>Git: commit_and_push_context_repo
Git-->>Cmd: push result
Cmd-->>User: summary / logs
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
…nd discord workspace refs
… clean up wrappers
snowfox1003
left a comment
I have a question:
Why did you delete run_all_collectors.py, fixtures.py, and test_commands.py in the workflow app?
Closes #36
Summary
- discord_activity_tracker Django app to archive C/C++ Together Discord server (ID: 331718482485837825) chat history
- run_discord_exporter with sync modes: sync, export, all, import-only
- last_synced_at tracking, with --days-back, --full-sync overrides
What changed
- sync/chat_exporter.py: Popen (no timeout), stderr streaming, proxy stripping
- run_discord_exporter: tasks: sync, export, all, import-only
- --days-back, --full-sync, --months, --active-days
- --task import-only for pre-exported JSON (full history export)
- run_all_collectors.py → uses run_discord_exporter
- settings.py, .env.example, requirements.txt
- README.md, EXPORTER_INTEGRATION.md, tools/README.md (user token guide)
Bugs fixed
- subprocess.run(timeout=3600) hard-kills long exports | subprocess.Popen: no timeout, streams stderr progress
- --days-back ignored when last_synced_at exists | min(sync_date, days_back_date) so --days-back can widen window
- timestamp during DB import | export_and_parse_guild converted messages, then _persist_exported_data converted again | export_and_parse_guild
File structure
discord_activity_tracker/
├── management/commands/
│ ├── run_discord_exporter.py # CLI integration command (primary)
│ └── run_discord_activity_tracker.py # Bot API command (blocked)
├── migrations/
│ └── 0001_initial.py
├── sync/
│ ├── chat_exporter.py # DiscordChatExporter CLI wrapper
│ ├── client.py # Discord HTTP API client (bot method)
│ ├── export.py # Markdown generation + git operations
│ ├── messages.py # Message processing + DB persistence
│ └── utils.py # Date parsing, URL formatting
├── tools/
│ └── README.md # User token extraction guide
├── models.py # Server, Channel, Author, Message, Reaction
├── services.py # DB helper functions
├── workspace.py
├── README.md # Usage docs, all commands, full history setup
└── EXPORTER_INTEGRATION.md # CLI integration details
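The "Bugs fixed" notes say the CLI wrapper in sync/chat_exporter.py switched from `subprocess.run(timeout=3600)` to `subprocess.Popen` with no timeout, stderr streaming, and proxy stripping. A rough sketch of that pattern is shown below. The function names, the choice to discard stdout, and the specific proxy environment variables removed are assumptions for illustration, not the actual wrapper.

```python
import os
import subprocess
import sys
from typing import Callable, Dict, Sequence

# Assumed set of proxy variables to drop from the child environment.
PROXY_VARS = {"http_proxy", "https_proxy", "all_proxy"}


def env_without_proxies() -> Dict[str, str]:
    """Copy the current environment, leaving out proxy settings (any letter case)."""
    return {k: v for k, v in os.environ.items() if k.lower() not in PROXY_VARS}


def run_streaming(command: Sequence[str], on_progress: Callable[[str], None] = print) -> int:
    """Run a command with no timeout, forwarding each stderr line as it arrives.

    Returns the process exit code once the command finishes on its own.
    """
    with subprocess.Popen(
        command,
        stdout=subprocess.DEVNULL,  # not read here, so it must not be a pipe
        stderr=subprocess.PIPE,
        text=True,
        env=env_without_proxies(),
    ) as process:
        for line in process.stderr:
            on_progress(line.rstrip("\n"))
        return process.wait()


if __name__ == "__main__":
    # Stand-in child process that reports progress on stderr.
    demo = "import sys; print('exporting...', file=sys.stderr)"
    exit_code = run_streaming([sys.executable, "-c", demo])
    print("exit code:", exit_code)
```

Reading stderr line by line keeps long exports visible in the logs, and dropping the timeout means a slow export is never killed partway through; the trade-off is that a hung process has to be stopped by the operator.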
Modified existing files
- config/settings.py
- .env.example | DISCORD_USER_TOKEN, DISCORD_SERVER_ID, DISCORD_CONTEXT_REPO_PATH
- .gitignore | tools/README.md
- requirements.txt | aiohttp, asgiref
- workflow/.../run_all_collectors.py | run_discord_exporter
Exported data
yyyy/yyyy-MM/yyyy-MM-channel-name.md
Test plan
--task import-only