TG-Digest is a self-hosted Telegram morning digest service. It reads selected Telegram chats through a user session, filters noise, summarizes with an OpenAI-compatible LLM, archives the full report on the built-in web UI, and can deliver a compact briefing through Telegram, ntfy, or email.
- Telegram user-source adapter with incremental cursor support.
- SQLModel repositories for groups, messages, cursors, reports, runs, usage, and UI settings.
- Noise, deduplication, and link metadata filters.
- Media enrichment for Telegram image posts: downloaded images are combined into a contact sheet and passed to the configured vision model as best-effort image context.
- Bounded external link context fetching for accessible non-Telegram URLs.
- Prompt registry with base + category + optional group override templates.
- Category prompts include news, medical, group chat, generic, and military/security.
- LiteLLM client wrapper, map/reduce summarizer, and per-run usage recording.
- Markdown, HTML, and JSON report renderers.
- Web, Telegram bot, ntfy, and email deliverers.
- FastAPI management UI with HTMX, Alpine.js, Tailwind CDN, dark mode, local demo mode, report history, costs, run status, group configuration, and LLM/budget settings.
- Split management pages for home, groups, settings, costs, and report history.
- Telegram account identity detection for self-related message prioritization.
- Like/dislike feedback on report items to bias future summaries.
- APScheduler daily cron job with missed-run backfill.
- Loguru console/file logging with secret redaction patterns.
- Dockerfile and Docker Compose deployment skeleton.
Real credentials belong only in .env, which is ignored by Git. The repository includes .env.example with key names and empty placeholders. Runtime data, session files, logs, and virtual environments are also ignored.
Never put these values in source files, config YAML, tests, commit messages, or logs:
- TG_API_ID, TG_API_HASH, TG_PHONE
- Telegram login codes or session files
- TG_BOT_TOKEN, TG_TARGET_CHAT_ID
- LLM_API_KEY, LLM_BASE_URL
- WEB_AUTH_USER, WEB_AUTH_PASS
- NTFY_*, SMTP_*
- BARK_*
The current workspace uses the project-local virtual environment at venv/.
venv\Scripts\python.exe -m pip install -e ".[dev]"
venv\Scripts\python.exe -m pytest -q
venv\Scripts\python.exe -m ruff check src tests
venv\Scripts\python.exe -m src.main
Open:
- http://localhost:8080/ for the management UI
- http://localhost:8080/healthz for the health check
- http://localhost:8080/reports/latest for the latest report
Without .env, the UI still loads and can seed demo dialogs/reports for local validation.
Copy .env.example to .env and fill only the values needed for the gate you are testing.
Copy-Item .env.example .env
Non-secret defaults live in config/settings.yaml:
- schedule cron and time window
- default models and concurrency
- filter limits
- renderers and enabled delivery channels
- daily budget and warning ratio
- web host/port
The management UI can persist runtime overrides for LLM and budget settings in the database.
It also includes a write-only credential panel for .env: configured values are shown only as
configured/missing, input fields stay blank, and empty submissions keep existing values.
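The write-only behavior described above can be sketched roughly as follows. This is an illustration only, not the project's actual code: `merge_credentials` and `credential_status` are hypothetical names, and credentials are modeled as a plain dictionary rather than a real .env file.

```python
def merge_credentials(existing: dict[str, str], submitted: dict[str, str]) -> dict[str, str]:
    """Return updated credentials; blank submissions keep the stored value."""
    merged = dict(existing)
    for key, value in submitted.items():
        cleaned = value.strip()
        if cleaned:  # an empty form field never overwrites a stored secret
            merged[key] = cleaned
    return merged


def credential_status(credentials: dict[str, str], keys: list[str]) -> dict[str, str]:
    """Report only configured/missing, never the secret value itself."""
    return {key: "configured" if credentials.get(key) else "missing" for key in keys}


stored = {"LLM_API_KEY": "old-secret"}
updated = merge_credentials(stored, {"LLM_API_KEY": "", "TG_BOT_TOKEN": "abc"})
print(credential_status(updated, ["LLM_API_KEY", "NTFY_URL"]))
```

Because the status view only ever returns the two labels, a stored secret cannot leak back into the page through this path.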
For networks that require a local VPN/proxy, configure proxy URLs in .env.
PROXY_URL=
TG_PROXY_URL=socks://127.0.0.1:10808
LLM_PROXY_URL=http://127.0.0.1:10808
TG_PROXY_URL is used by Telethon. LLM_PROXY_URL is applied to LiteLLM/OpenAI-compatible HTTP calls and optional HTTP delivery clients. If a channel-specific value is empty, PROXY_URL is used as the fallback.
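The fallback rule can be illustrated with a small sketch. The helper name `resolve_proxy` is hypothetical and is not taken from the project; only the three variable names come from the text above.

```python
import os
from collections.abc import Mapping
from typing import Optional


def resolve_proxy(channel_key: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Prefer the channel-specific proxy; fall back to PROXY_URL; else None."""
    specific = env.get(channel_key, "").strip()
    if specific:
        return specific
    fallback = env.get("PROXY_URL", "").strip()
    return fallback or None


example_env = {
    "PROXY_URL": "http://127.0.0.1:10808",
    "TG_PROXY_URL": "",  # empty, so the shared fallback applies
    "LLM_PROXY_URL": "http://10.0.0.1:8080",
}
print(resolve_proxy("TG_PROXY_URL", example_env))
print(resolve_proxy("LLM_PROXY_URL", example_env))
```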
Telegram user collection requires a one-time MTProto session.
- Create Telegram API credentials at https://my.telegram.org.
- Put TG_API_ID, TG_API_HASH, and TG_PHONE in .env.
- Run:
venv\Scripts\python.exe -m src.ingest.login
Enter the Telegram verification code when prompted. If the account has 2FA, enter the 2FA password. The session is written under data/session/, which is ignored by Git.
If interactive stdin is unreliable on Windows terminals, use the two-step flow:
venv\Scripts\python.exe -m src.ingest.login --request-code
After Telegram sends the code:
"123456" | venv\Scripts\python.exe -m src.ingest.login --complete
For accounts with 2FA, pipe the code and password on separate lines.
The same request-code and complete-login flow is available in the web UI. Verification codes and
2FA passwords are one-time inputs only and are not written to .env, the database, or logs.
The dashboard exposes two manual actions:
- 立即生成 ("Generate now"): incremental mode. It uses each group's cursor, fetches only messages after that cursor, and advances the cursor after saving messages.
- 根据过去 N 小时生成 ("Generate from the past N hours"): replay mode. It ignores cursors, re-summarizes the configured time window, and does not advance cursors. Use this for testing prompts or regenerating today's report.
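The difference between the two modes comes down to how the cursor is treated. The sketch below is a simplified model, not the project's implementation: `Message`, `run_incremental`, and `run_replay` are hypothetical names, and message age is reduced to a number of hours.

```python
from dataclasses import dataclass


@dataclass
class Message:
    id: int
    age_hours: float  # how long ago the message was sent


def run_incremental(messages: list[Message], cursor: int) -> tuple[list[Message], int]:
    """Select messages after the cursor and return the advanced cursor."""
    selected = [m for m in messages if m.id > cursor]
    new_cursor = max((m.id for m in selected), default=cursor)
    return selected, new_cursor


def run_replay(messages: list[Message], cursor: int, window_hours: float) -> tuple[list[Message], int]:
    """Select everything inside the window; the cursor is returned unchanged."""
    selected = [m for m in messages if m.age_hours <= window_hours]
    return selected, cursor


history = [Message(1, 30.0), Message(2, 10.0), Message(3, 1.0)]
print(run_incremental(history, cursor=2))
print(run_replay(history, cursor=2, window_hours=24))
```

Replay can be repeated any number of times without affecting what the next incremental run picks up, which is why it suits prompt testing.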
The scheduled daily job uses incremental mode when Telegram and LLM credentials are configured:
- select groups where enabled = true
- fetch messages newer than the per-group cursor and inside schedule.time_window_hours
- cap each group by max_messages and prompt size by max_tokens
- filter noise, deduplicate, map summaries per group, then reduce into one report
Unread counts are only displayed and used for UI sorting. They do not decide what gets summarized.
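The selection steps of the scheduled job (enabled groups, cursor, time window, per-group cap) can be sketched as below. All class and function names are hypothetical. The sketch also assumes that the newest messages are kept when a group exceeds max_messages, which the text does not specify. Filtering, summarizing, and the max_tokens cap are left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class Msg:
    id: int
    sent_at: datetime


@dataclass
class Group:
    name: str
    enabled: bool
    cursor: int
    max_messages: int
    messages: list[Msg] = field(default_factory=list)


def select_for_digest(groups: list[Group], now: datetime, time_window_hours: int) -> dict[str, list[Msg]]:
    """Pick the messages each enabled group contributes to the digest."""
    earliest = now - timedelta(hours=time_window_hours)
    selected: dict[str, list[Msg]] = {}
    for group in groups:
        if not group.enabled:
            continue
        fresh = [m for m in group.messages if m.id > group.cursor and m.sent_at >= earliest]
        fresh.sort(key=lambda m: m.id)
        # Assumption: when over the cap, keep the newest messages.
        selected[group.name] = fresh[-group.max_messages:] if group.max_messages > 0 else []
    return selected


now = datetime(2026, 1, 2, 7, 30, tzinfo=timezone.utc)
news = Group("news", True, cursor=10, max_messages=2, messages=[
    Msg(11, now - timedelta(hours=30)),  # newer than the cursor but outside the window
    Msg(12, now - timedelta(hours=3)),
    Msg(13, now - timedelta(hours=2)),
    Msg(14, now - timedelta(hours=1)),
])
print([m.id for m in select_for_digest([news], now, 24)["news"]])
```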
The default schedule is 30 7 * * * in Asia/Shanghai, i.e. 07:30 every day. The cron
expression, timezone, and time window can be changed from the settings page.
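To make the default concrete, the sketch below computes the next 07:30 run in Shanghai time using only the standard library. It is not how the service schedules jobs (the feature list names APScheduler for that). `next_daily_run` is a hypothetical helper, and Asia/Shanghai is approximated as a fixed UTC+8 offset, which holds because the zone currently observes no daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC+8 stands in for the Asia/Shanghai zone.
SHANGHAI = timezone(timedelta(hours=8), "Asia/Shanghai")


def next_daily_run(now: datetime, hour: int = 7, minute: int = 30) -> datetime:
    """Return the next occurrence of hour:minute in Shanghai time after `now`."""
    local_now = now.astimezone(SHANGHAI)
    candidate = local_now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= local_now:
        candidate += timedelta(days=1)  # today's slot has passed, use tomorrow's
    return candidate


# 00:00 UTC is 08:00 in Shanghai, so the next run falls on the following day.
print(next_daily_run(datetime(2026, 1, 1, 0, 0, tzinfo=timezone.utc)))
```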
When dialogs are synced, TG-Digest detects the logged-in Telegram account ID and username. Messages
sent by the account, Telegram-marked mentions, or text mentioning @username are marked as
self-related and get higher prompt priority. Self-related report items are rendered in a dedicated
section instead of being mixed into the must-read list.
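The three self-related conditions can be expressed as a single predicate. This is a minimal sketch with hypothetical names (`IncomingMessage`, `is_self_related`). The whole-handle matching rule for @username is an assumption, not something the project documents.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class IncomingMessage:
    sender_id: int
    text: str
    mentioned: bool = False  # Telegram's own "you were mentioned" flag


def is_self_related(msg: IncomingMessage, account_id: int, username: Optional[str]) -> bool:
    """True if the account sent the message, was mentioned, or is named by @username."""
    if msg.sender_id == account_id or msg.mentioned:
        return True
    if username:
        # Assumption: match @username as a whole handle, case-insensitively,
        # so "@alice_bot" does not count as a mention of "alice".
        pattern = rf"(?<![A-Za-z0-9_])@{re.escape(username)}(?![A-Za-z0-9_])"
        return re.search(pattern, msg.text, flags=re.IGNORECASE) is not None
    return False


print(is_self_related(IncomingMessage(7, "ping @Alice please"), 42, "alice"))
print(is_self_related(IncomingMessage(7, "ping @alice_bot"), 42, "alice"))
```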
For image-heavy Telegram posts, TG-Digest downloads available images to data/media, builds a
single contact sheet with message IDs, and sends that sheet to media.vision_model as optional
vision context. The checked-in default is Qwen/Qwen3.6-35B-A3B. If the provider or model does not
support image input, the run falls back to text, media metadata, and source links. Video posts are
not transcoded; the report keeps the original Telegram message link so they can be opened in
Telegram. When a summarized item is matched to downloaded images, the Web/HTML/Markdown report
renders those images inline; Telegram bot delivery sends the matched images as photo messages after
the text digest. Accessible non-Telegram links are fetched only for bounded title/description
context.
Report items include like/dislike controls. Feedback is stored locally and summarized into future LLM prompts as preference memory; urgent, safety, medical, deadline, or direct-to-user items are not suppressed solely because of dislikes.
Use any OpenAI-compatible provider supported by LiteLLM.
Set in .env:
LLM_API_KEY=
LLM_BASE_URL=
Set models in the UI or config/settings.yaml:
llm:
  litellm_provider: "openai"
  map_model: "provider/model"
  reduce_model: "provider/model"
media:
  vision_model: "Qwen/Qwen3.6-35B-A3B"
The checked-in default model split is:
- deepseek-ai/DeepSeek-V4-Flash for map summaries
- deepseek-ai/DeepSeek-V4-Pro for final reduce reports
The UI cost panel reads token usage from the usage table and estimates spend with the
per-1k-token rates in config/settings.yaml. The checked-in defaults match the SiliconFlow
USD prices for the default models as of 2026-06-20:
- map / deepseek-ai/DeepSeek-V4-Flash: input $0.00013 per 1k tokens, output $0.00028 per 1k tokens
- reduce / deepseek-ai/DeepSeek-V4-Pro: input $0.0016 per 1k tokens, output $0.003135 per 1k tokens
Existing historical runs are not rewritten when rates change. If an older usage row has tokens but
stored cost is 0, the UI marks it with * and estimates it with the current rates.
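As a worked example of the estimate, the sketch below applies the per-1k-token rates listed above and mimics the * marker for rows whose stored cost is 0. The function names and the dictionary layout are hypothetical. The real settings file and usage table may be organized differently.

```python
# Per-1k-token USD rates, copied from the defaults listed above.
RATES_PER_1K = {
    "map": {"input": 0.00013, "output": 0.00028},
    "reduce": {"input": 0.0016, "output": 0.003135},
}


def estimate_cost(stage: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate spend in USD for one usage row at the current rates."""
    rate = RATES_PER_1K[stage]
    return input_tokens / 1000 * rate["input"] + output_tokens / 1000 * rate["output"]


def display_cost(stage: str, input_tokens: int, output_tokens: int, stored_cost: float) -> str:
    """Show the stored cost, or a starred estimate when tokens exist but cost is 0."""
    if stored_cost == 0 and (input_tokens or output_tokens):
        return f"${estimate_cost(stage, input_tokens, output_tokens):.6f}*"
    return f"${stored_cost:.6f}"


# 200k input tokens and 20k output tokens on the map model:
# 200 * 0.00013 + 20 * 0.00028 = 0.026 + 0.0056 = 0.0316 USD
print(display_cost("map", 200_000, 20_000, stored_cost=0.0))
```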
Delivery channels are configured in config/settings.yaml:
deliver:
  enabled: ["web", "telegram", "bark"]
Credential requirements:
- telegram: TG_BOT_TOKEN, TG_TARGET_CHAT_ID
- ntfy: NTFY_URL
- bark: BARK_URL, optional BARK_GROUP. Use a full Bark endpoint such as https://api.day.app/<device-key>.
- email: SMTP_HOST, SMTP_FROM or SMTP_USER, SMTP_TO, optional SMTP_PASS
Telegram bot delivery uses Bot API HTML formatting and splits long reports into multiple messages below the Telegram message length limit. Matched report images are delivered as Telegram photo messages. The web report remains the complete archive; Telegram is a readable mobile preview with source links and inline media.
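Splitting a long report can be sketched as a line-based chunker. The 4096-character figure is Telegram's text message limit. The function name `split_message` is hypothetical, and this simplified version does not keep HTML tags balanced across chunks, which a real HTML-formatted sender would need to handle.

```python
TELEGRAM_LIMIT = 4096  # maximum characters in one Telegram text message


def split_message(text: str, limit: int = TELEGRAM_LIMIT) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring line breaks."""
    chunks: list[str] = []
    current = ""
    for line in text.splitlines(keepends=True):
        # Hard-split any single line that is longer than the limit on its own.
        while len(line) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:limit])
            line = line[limit:]
        # Start a new chunk when adding this line would exceed the limit.
        if len(current) + len(line) > limit:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


report = "\n".join(f"Item {i}: " + "x" * 200 for i in range(100))
parts = split_message(report)
print(len(parts), max(len(p) for p in parts))
```

Joining the chunks back together reproduces the original text, so no content is dropped at the boundaries.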
docker compose up --build
The compose file mounts:
- data/session for Telethon session files
- data/db for SQLite
- data/reports for rendered reports
- data/logs for rotating application logs
- config for editable prompts and settings
The .env file is optional for booting the zero-secret UI, but required for real Telegram/LLM/delivery integration.
Phase A is designed to run without real credentials. Phase B begins with Telegram user login and real network validation. At that point the required values are TG_API_ID, TG_API_HASH, and TG_PHONE in .env.