fix(start): lazy-init languages on /start to unblock cold-start orphans #222
Merged
Sega (#370728472) hit /start, immediately got "memes ended". Root cause:
no rows in user_language for him → recommendations query filters every
candidate out → cold start surfaces the empty-feed message.
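
To make the failure chain concrete, here is a minimal sketch of how a language-filtered candidate query returns nothing for a user with no user_language rows. It assumes a simplified schema: the meme table, its columns, and the candidate_memes helper are hypothetical, and the real recommendations query is not shown in this PR.

```python
import sqlite3

# In-memory stand-in schema. Only the user_language table name comes from
# the PR; the meme table and all column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_language (
        user_id INTEGER NOT NULL,
        language_code TEXT NOT NULL,
        PRIMARY KEY (user_id, language_code)
    );
    CREATE TABLE meme (id INTEGER PRIMARY KEY, language_code TEXT NOT NULL);
    INSERT INTO meme (id, language_code) VALUES (1, 'ru'), (2, 'ru'), (3, 'en');
""")


def candidate_memes(user_id):
    """Return meme ids whose language matches one of the user's languages."""
    rows = conn.execute(
        """
        SELECT m.id
        FROM meme AS m
        JOIN user_language AS ul ON ul.language_code = m.language_code
        WHERE ul.user_id = ?
        ORDER BY m.id
        """,
        (user_id,),
    ).fetchall()
    return [row[0] for row in rows]


# An orphan has no user_language rows, so the join matches nothing.
orphan_before = candidate_memes(370728472)

# After a backfill the same query yields candidates again.
conn.execute("INSERT INTO user_language VALUES (370728472, 'ru')")
orphan_after = candidate_memes(370728472)
```

The inner join is what turns "no language rows" into "no candidates": the user is not rejected explicitly, every meme simply fails to match.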
How he ended up there: he registered on 2026-04-01 with deep_link
'kitchen'. The kitchen branch in handle_start returns BEFORE any
init_user_languages_from_tg_user call:
    if deep_link == "kitchen":
        return await handle_show_kitchen(update, context)
The wrapped + main "if created" branches both call init; kitchen
silently skipped it. Six new users in the last 30 days (4 via 'kitchen',
2 via deep_link=None) ended up the same way — all backfilled by hand
just now.
Fix: hoist a single idempotent check above the deep_link branching:
    if not await get_user_languages(user_id):
        await init_user_languages_from_tg_user(update.effective_user)
This covers every branch (current + future deep_links) and self-heals
any historical orphan the next time they /start. Removed the per-branch
init calls in wrapped + new-user paths since the new check supersedes
them. Idempotent guarantee: add_user_languages uses ON CONFLICT DO NOTHING.
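
The control-flow change can be sketched as follows. This is a simplified model and not the code in start.py: the in-memory USER_LANGUAGES store, the init_user_languages helper, and the reduced handle_start signature are all assumptions made so the example runs on its own.

```python
import asyncio

# Hypothetical in-memory stand-in for the user_language table:
# user_id -> set of language codes.
USER_LANGUAGES = {}


async def get_user_languages(user_id):
    return USER_LANGUAGES.get(user_id, set())


async def init_user_languages(user_id, language_code):
    # setdefault + set.add never duplicates an entry, which plays the role
    # of ON CONFLICT DO NOTHING in this sketch.
    USER_LANGUAGES.setdefault(user_id, set()).add(language_code)


async def handle_start(user_id, language_code, deep_link=None):
    # Universal side effect runs first, above every deep_link branch,
    # so an early return below can no longer skip it.
    if not await get_user_languages(user_id):
        await init_user_languages(user_id, language_code)

    if deep_link == "kitchen":
        return "kitchen"  # early return, languages already initialised
    if deep_link == "wrapped":
        return "wrapped"
    return "feed"


async def demo():
    return [
        await handle_start(1, "ru", deep_link="kitchen"),  # new user
        await handle_start(1, "ru", deep_link="kitchen"),  # repeat /start
        await handle_start(2, "en"),                       # no deep_link
    ]


results = asyncio.run(demo())
```

Because the guard only fires when no languages exist, a repeat /start costs one read and no write, and a historical orphan is repaired on whichever branch they happen to enter.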
Note on extent: 893 historical orphans exist (mostly pre-init code),
but only 6 were active in the last 30d / 3 in the last 7d — the bug
was not blocking the main onboarding funnel (38/38 new users in last
7d had languages). It was a long-tail correctness issue that bit any
share-link / deep-link path that returned early.
STAFF ENGINEER REVIEW: APPROVED. Hoist + idempotent guard above all deep_link branches is correct.
This was referenced May 2, 2026
ohld added a commit that referenced this pull request on May 2, 2026:

Targets users registered in the last 12 months who never had a single meme
delivered (no row in user_meme_reaction). They were silently locked out by
onboarding bugs: most via the kitchen deep_link path that returned before
init_user_languages_from_tg_user (fixed forward in PR #222), some via other
early-return drift.

Sega (#370728472) was case zero: registered 2026-04-01 via ?start=kitchen,
no language rows, "мемы кончились" ("memes ended") on every /start. After
manual backfill + apology DM he immediately produced a healthy session
(4 likes / 1 dislike / 7 sent in 22 min, 80% positive).

Scope:
- 66 candidates total (40 RU, 26 EN) at the time of writing
- Filtered: blocked_bot_at IS NULL, type NOT IN blocked/banned/waitlist
- Dedup: reuses send_broadcast Redis-set marker per broadcast_id
- Default 0.5s delay (~2/s), very conservative for a small list

Run:
    PYTHONPATH=/src python scripts/broadcast_ghost_recovery.py \
        ghost-recovery-2026-05 --dry-run
    PYTHONPATH=/src python scripts/broadcast_ghost_recovery.py \
        ghost-recovery-2026-05

Same shape as scripts/broadcast_wrapped.py, no new infra.
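
The dedup marker, send delay, and dry-run behaviour described in the scope list can be sketched like this. It is an illustration under assumptions: run_broadcast is a hypothetical function, a plain dict of sets stands in for the Redis set keyed by broadcast_id, and the real script's internals are not shown in this thread.

```python
import time


def run_broadcast(broadcast_id, user_ids, send, sent_markers,
                  delay_s=0.5, dry_run=False):
    """Send to each user at most once per broadcast_id.

    sent_markers maps broadcast_id -> set of user ids already messaged;
    it is a stand-in for a Redis set (assumption).
    """
    already_sent = sent_markers.setdefault(broadcast_id, set())
    targeted = []
    for user_id in user_ids:
        if user_id in already_sent:
            continue  # dedup: this user already got this broadcast
        targeted.append(user_id)
        if dry_run:
            continue  # report only, send nothing and mark nothing
        send(user_id)
        already_sent.add(user_id)
        time.sleep(delay_s)  # 0.5 s between sends is roughly 2 messages/s
    return targeted


markers = {}
outbox = []
preview = run_broadcast("ghost-recovery-2026-05", [1, 2, 3], outbox.append,
                        markers, delay_s=0, dry_run=True)
first = run_broadcast("ghost-recovery-2026-05", [1, 2, 3], outbox.append,
                      markers, delay_s=0)
# Re-running with an overlapping list only reaches the new user.
second = run_broadcast("ghost-recovery-2026-05", [2, 3, 4], outbox.append,
                       markers, delay_s=0)
```

Keying the marker on broadcast_id is what makes a re-run after a crash safe: users who already received the message are skipped, while a later broadcast with a different id starts from an empty set.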
ohld added a commit that referenced this pull request on May 2, 2026:

UserType has no 'banned' value (src/tgbot/constants.py), so the filter is a
no-op. Drop it so the WHERE clause reflects actual reachable types.

Branch was also rebased onto production to pull in PR #222's lazy
language-init in handle_start. The recommended SE fix B (repair-on-start) is
now active for both new and existing users (start.py:124-125), so the
broadcast's "/start" CTA will trigger language backfill and unblock the
recommendation queue for ghost recipients.

Addresses Staff Engineer review on #223.
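
For orientation, a candidate query with the dead 'banned' value removed might look like the sketch below. The schema is assumed: only user_meme_reaction, blocked_bot_at, and the blocked/waitlist type values appear in the commit messages, while the "user" table layout, the created_at column, and the 'user' type value are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "user" (
        id INTEGER PRIMARY KEY,
        type TEXT NOT NULL,
        created_at TEXT NOT NULL,   -- ISO date, assumed column
        blocked_bot_at TEXT
    );
    CREATE TABLE user_meme_reaction (user_id INTEGER, meme_id INTEGER);
    INSERT INTO "user" VALUES
        (1, 'user',     '2026-04-01', NULL),          -- ghost
        (2, 'user',     '2026-04-02', NULL),          -- has a reaction row
        (3, 'user',     '2026-04-03', '2026-04-10'),  -- blocked the bot
        (4, 'waitlist', '2026-04-04', NULL),          -- excluded type
        (5, 'user',     '2024-01-01', NULL);          -- older than 12 months
    INSERT INTO user_meme_reaction VALUES (2, 10);
""")

# 'banned' is intentionally absent from the NOT IN list.
GHOST_QUERY = """
    SELECT u.id
    FROM "user" AS u
    WHERE u.created_at >= date(:today, '-12 months')
      AND u.blocked_bot_at IS NULL
      AND u.type NOT IN ('blocked', 'waitlist')
      AND NOT EXISTS (
          SELECT 1 FROM user_meme_reaction AS r WHERE r.user_id = u.id
      )
    ORDER BY u.id
"""

ghost_ids = [row[0] for row in conn.execute(GHOST_QUERY, {"today": "2026-05-02"})]
```

Listing a value the enum can never produce does not change the result set, which is why dropping it is purely a readability fix.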
ohld added a commit that referenced this pull request on May 2, 2026:
* ops(broadcast): one-shot ghost-user recovery script
* ops(broadcast): drop dead 'banned' filter from ghost recovery query
ohld added a commit that referenced this pull request on May 5, 2026:

…s (FFM-907) (#224)

PR #222 plugged the kitchen-branch leak (Sega's case). This audit ensures no
future deep_link branch can re-introduce the same drift, and adds a runtime
alarm so we don't have to wait for a friend to complain.

- Contributor-facing comment in handle_start documenting the rule: every
  universal onboarding side effect lives ABOVE the deep_link ladder. New
  side effects must be hoisted, not buried inside per-branch blocks.
- src/flows/monitors/ghost_users.py: Prefect flow runs every minute, counts
  new users (1-5min ago = WARN, >5min ago = ERROR) with no
  user_meme_reaction row, posts a single summary to admin chat. Filters out
  blocked-acquisition deep_links (intentional silent drop).
- tests/tgbot/test_start.py: regression coverage for each known deep_link
  variant (none / kitchen / wrapped / giveaway_77 / s_*_* /
  blocked-acquisition / existing user). Asserts user_tg + user +
  user_language + user_deep_link_log rows after every created=True path,
  plus idempotency of the lazy lang init.

Co-authored-by: Paperclip <noreply@paperclip.ing>
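
The WARN/ERROR bucketing described for the monitor can be sketched as a pure function. This is a guess at the shape of the logic only: classify_ghosts and its inputs are hypothetical, the Prefect flow and admin-chat posting are left out, and the blocked-acquisition filter and any upper bound on user age are not modelled.

```python
from datetime import datetime, timedelta, timezone


def classify_ghosts(users, reacted_user_ids, now):
    """Bucket users with no reaction row by account age.

    users: iterable of (user_id, created_at) pairs.
    reacted_user_ids: ids that have at least one user_meme_reaction row.
    """
    warn, error = [], []
    for user_id, created_at in users:
        if user_id in reacted_user_ids:
            continue  # received a meme, not a ghost
        age = now - created_at
        if age > timedelta(minutes=5):
            error.append(user_id)   # more than 5 minutes without a meme
        elif age >= timedelta(minutes=1):
            warn.append(user_id)    # 1 to 5 minutes without a meme
        # younger than 1 minute: onboarding may still be in flight
    return {"WARN": warn, "ERROR": error}


now = datetime(2026, 5, 5, 12, 0, tzinfo=timezone.utc)
users = [
    (1, now - timedelta(seconds=30)),
    (2, now - timedelta(minutes=2)),
    (3, now - timedelta(minutes=10)),
    (4, now - timedelta(minutes=3)),
]
summary = classify_ghosts(users, reacted_user_ids={4}, now=now)
```

Keeping the classification free of I/O makes the thresholds easy to unit-test separately from the scheduler and the chat integration.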
Summary
User #370728472 (Sega, RU) ran /start and instantly got "мемы кончились" ("memes ended"). The DB showed user_language empty for his row. Recommendations filter on user_language → no candidates → cold-start "empty feed" message.

He landed there because he registered on 2026-04-01 via ?start=kitchen. The kitchen branch in handle_start returns handle_show_kitchen(update, context) before any call to init_user_languages_from_tg_user. wrapped and the main if created: branches both init; kitchen silently skipped it.

Blast radius
Already-active orphans backfilled manually on prod:
- 6 new users in the last 30 days (4 via kitchen deep_link, 2 via None)
- user_language backfilled from user_tg.language_code (5 → ru, 1 → en)

Long tail (not in scope): 893 total historical orphans, but only 3 active in last 7d. They self-heal on their next /start once this PR ships (the new lazy-init covers them).

Critically: 38/38 new users in the last 7 days have language rows, so the main onboarding funnel was not blocked. This was a long-tail correctness bug on share-link / deep-link branches that returned early.

What I changed
src/tgbot/handlers/start.py: one idempotent check hoisted above the deep_link branching. If get_user_languages(user_id) returns nothing, handle_start calls init_user_languages_from_tg_user(update.effective_user).

Removed the duplicate per-branch calls in wrapped and if created: (the new check supersedes them). add_user_languages uses ON CONFLICT DO NOTHING, so this is safe to re-run.

Test plan
- /start from a fresh account → confirm user_language row created
- /start ?start=kitchen from a fresh account → confirm user_language row created (was the bug)
- /start again → confirm no double-insert (idempotent path)
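
The idempotent path in the test plan can be exercised in isolation with a small sketch. It uses SQLite's ON CONFLICT DO NOTHING (available in SQLite 3.24 and later) as a stand-in for the production database, and insert_languages_idempotently is a hypothetical helper: the real add_user_languages signature and the actual table constraints are not shown in this PR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The composite primary key is an assumption; some unique constraint is
# needed for ON CONFLICT to have anything to detect.
conn.execute("""
    CREATE TABLE user_language (
        user_id INTEGER NOT NULL,
        language_code TEXT NOT NULL,
        PRIMARY KEY (user_id, language_code)
    )
""")


def insert_languages_idempotently(user_id, language_codes):
    """Insert language rows, silently skipping ones that already exist."""
    conn.executemany(
        "INSERT INTO user_language (user_id, language_code) "
        "VALUES (?, ?) ON CONFLICT DO NOTHING",
        [(user_id, code) for code in language_codes],
    )
    conn.commit()


insert_languages_idempotently(1, ["ru"])
insert_languages_idempotently(1, ["ru"])        # repeat /start: no error
insert_languages_idempotently(1, ["ru", "en"])  # only 'en' is new

rows = conn.execute(
    "SELECT user_id, language_code FROM user_language ORDER BY language_code"
).fetchall()
```

With the conflict clause in place, the guard in handle_start does not need a lock: two concurrent /start calls that both see an empty result will each attempt the insert, and the second attempt becomes a no-op.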