Releases: fat32al1ty/HR-assist
v0.23.0 — admin metrics + observability + dashboard rework
Merged release combining two parallel work-streams that were both in flight on feat/v0.23-dashboard-rework.
(A) Admin metrics + observability layer
After Phase 6 closed search latency + cost, the operator still couldn't answer questions like "how fast is search today?", "how much OpenAI did we burn?", "is the funnel converting?". v0.23 closes that gap.
New persistence
- openai_call_log: every OpenAI call (model, tokens, cost, duration, request_id). Replaces the stdout-only JSON-line audit so cost can be SQL'd, not grep'd.
- match_event: POST /api/telemetry/event was previously accept-and-drop. Now persists.
- freshness_sweep_log: history of nightly vacancy_freshness sweeps (was a single in-memory timestamp).
New endpoints
- GET /metrics: Prometheus exposition format. Counters: search_requests_total{job_type,status}, openai_calls_total{model,status}, hh_api_requests_total{status}, segment_warmup_jobs_total{status}, freshness_archived_total{source}, match_events_total{event}. Histograms: search_duration_seconds{job_type}, openai_call_duration_seconds{model}. Operator can plug in any Prometheus-compatible scraper.
- GET /api/admin/metrics/{latency|cost|activation-funnel|retention|quality|segment-warmup|freshness|match-events} with ?range=24h|7d|30d: backs the new dashboards.
request_id middleware
Every HTTP request gets a UUID (or trusts inbound X-Request-ID). Surfaced via current_request_id() to service code (already wired into match_event, openai_call_log). Pure ASGI implementation — BaseHTTPMiddleware had a regression that re-buffered request bodies and broke Pydantic body parsing for downstream POSTs.
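A minimal sketch of what a pure ASGI request-id middleware can look like, using only the standard library. The class name RequestIdMiddleware, the context variable, and the choice to echo the id back in an X-Request-ID response header are assumptions made for illustration; only current_request_id() and the "trust inbound X-Request-ID, else generate a UUID" behaviour come from the notes above.

```python
import contextvars
import uuid
from typing import Optional

# Holds the id of the request currently being handled (hypothetical name).
_request_id: contextvars.ContextVar = contextvars.ContextVar("request_id", default=None)


def current_request_id() -> Optional[str]:
    """Return the id of the in-flight request, or None outside a request."""
    return _request_id.get()


class RequestIdMiddleware:
    """Pure ASGI middleware: it never reads or buffers the request body."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        # ASGI headers are (bytes, bytes) pairs with lowercase names.
        headers = dict(scope.get("headers") or [])
        inbound = headers.get(b"x-request-id", b"").decode("latin-1").strip()
        request_id = inbound or str(uuid.uuid4())
        token = _request_id.set(request_id)

        async def send_with_id(message):
            # Assumption: the id is echoed back on the response for debugging.
            if message["type"] == "http.response.start":
                out = list(message.get("headers") or [])
                out.append((b"x-request-id", request_id.encode("latin-1")))
                message = {**message, "headers": out}
            await send(message)

        try:
            await self.app(scope, receive, send_with_id)
        finally:
            _request_id.reset(token)
```

Because the middleware passes receive through untouched, downstream body parsing sees the original stream, which is the property the BaseHTTPMiddleware version lost.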
Frontend
- MetricsDashboard section on /admin with 8 cards.
- 24h / 7d / 30d range picker, manual refresh.
- Per-card ? help popover (native <details>) with: what it measures / good / bad / source in the DB.
- No new charting libs: sparklines/bars are inline <svg>.
Diagnosis we want to remember
from __future__ import annotations + Pydantic v2 + FastAPI body inference is a footgun: with deferred evaluation of annotations, FastAPI registered payload: EventPayload as a query parameter, and every POST started returning 422 "field required (query)". Removed the pragma from telemetry.py and left a header comment so nobody re-adds it.
Tests
+21 new tests across test_metrics_endpoint.py, test_request_id_middleware.py, test_match_event_persistence.py, test_openai_call_log.py, test_admin_metrics_endpoints.py.
(B) Dashboard rework + seen feedback + requirement overrides
Parallel session shipped under the same release banner:
- /strategy feature dropped entirely (route, components, services, schemas, tests, DB tables): migration 0039_drop_vacancy_strategy_tables.
- New "seen" feedback dimension on user_vacancy_feedback + endpoint + repository: migration 0040_feedback_seen.
- New requirement_overrides table for per-vacancy "I don't have this skill but bring me anyway": migration 0042_requirement_overrides.
- frontend/components/match/RequirementsChecklist.tsx + types.
- Matcher tweaks (llm_rerank, rerank_cache, matching_service) consume the new override signal.
Migrations
0039_drop_vacancy_strategy_tables → 0040_feedback_seen + 0040_metrics_layer → 0041_merge_seen_and_metrics (mergepoint) → 0042_requirement_overrides.
Test status
644 pass, 5 fail. The 5 are: 1 pre-existing (test_applications_resume_badge from the 1-resume cap, out of scope) and 4 still being tuned in the parallel matcher work. Zero regressions in (A).
v0.22.1 — HH page ceiling + segment_warmup orphan sweep
Patch on top of v0.22.0 after a prod-state diagnostic found two issues with how the search worker was actually behaving on 2026-04-29.
Bug A — HH 400-storm
_build_rotation_offset could return start_page up to 90 to spread retries across deeper pages. But hh.ru public API refuses requests where (page+1) * per_page > 2000 — for per_page=100 that's a hard cap at page 19. With our 8-page parallel wave any rotation past page 11 turned into 8+ guaranteed 400s.
Logs from a single worker cycle showed 185 such 400s with messages like:
hh_api_http_error status=400 page=43 query='Backend Engineer python'
body={"description":"you can't look up more than 2000 items in the list"...}
This burned hh.ru rate budget (real risk of an IP ban) and worker cycle time on guaranteed-bad requests.
Fix: lowered rotation cap 90 → 11 in _build_rotation_offset and added an explicit short-circuit at the top of _search_hh_public_api_vacancies — if start_page is already past the ceiling, return [] without touching the API.
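The ceiling arithmetic can be sketched as below. The function names and the decision to truncate a wave that straddles the ceiling are assumptions for illustration; the notes above only confirm the 2000-item rule, the rotation cap of 11, and the early return of [] when start_page is already past the ceiling.

```python
HH_MAX_ITEMS = 2000  # hh.ru refuses requests where (page + 1) * per_page > 2000


def max_valid_page(per_page: int) -> int:
    """Highest zero-based page index that hh.ru will still serve."""
    return HH_MAX_ITEMS // per_page - 1


def pages_for_wave(start_page: int, wave_size: int = 8, per_page: int = 100) -> list:
    """Pages to request in one parallel wave, never crossing the ceiling."""
    last = max_valid_page(per_page)
    if start_page > last:
        return []  # short-circuit: do not touch the API at all
    # Assumption: a wave that would cross the ceiling is truncated, not dropped.
    return list(range(start_page, min(start_page + wave_size, last + 1)))


print(max_valid_page(100))   # 19
print(pages_for_wave(11))    # [11, 12, 13, 14, 15, 16, 17, 18]
print(pages_for_wave(43))    # []
```

With per_page=100 and an 8-page wave, a start page of 11 keeps the whole wave at or below page 18, which is why 11 is a safe rotation cap.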
Bug B — segment_warmup orphans never released their dedup lock
v0.22.0's hardening item B2 deliberately excluded segment_warmup jobs from sweep_stale_running_jobs because a 60-vacancy crawl + LLM analyse legitimately exceeds the 180-second deep_scan timeout, and sweeping them was killing valid work mid-crawl.
But the v0.22 fix had no replacement timeout. Result: a worker that died mid-crawl (container restart, OOM, deploy) left the row stuck in running forever, and the unique partial index on segment_key WHERE status IN ('queued','running') blocked any re-enqueue. The segment was permanently dead until manual DB cleanup.
In prod we found recoveryf5a88c stuck in running for 10+ hours by the time we noticed.
Fix: new sweep_stale_segment_warmup_jobs with a separate, configurable segment_warmup_timeout_seconds = 1800 (30 minutes — long enough that it never touches a slow-but-legitimate crawl, short enough that an orphan from a worker restart self-heals within one cycle). Wired into vacancy_warmup._worker_loop next to the existing deep_scan sweep.
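As an in-memory illustration of the sweep rule (the real function presumably runs as a DB query), the sketch below marks only long-running segment_warmup jobs as stale. The Job dataclass, its field names, and the "failed" terminal status are assumptions; the 1800-second timeout comes from the notes above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

SEGMENT_WARMUP_TIMEOUT_SECONDS = 1800  # 30 minutes


@dataclass
class Job:  # hypothetical stand-in for a recommendation_jobs row
    id: int
    job_type: str
    status: str
    started_at: datetime


def sweep_stale_segment_warmup_jobs(jobs, now, timeout_seconds=SEGMENT_WARMUP_TIMEOUT_SECONDS):
    """Release orphaned segment_warmup jobs so their dedup lock is freed."""
    cutoff = now - timedelta(seconds=timeout_seconds)
    swept = []
    for job in jobs:
        if job.job_type == "segment_warmup" and job.status == "running" and job.started_at < cutoff:
            job.status = "failed"  # assumption: any status outside queued/running frees the index
            swept.append(job.id)
    return swept


now = datetime(2026, 4, 29, 12, 0, tzinfo=timezone.utc)
jobs = [
    Job(1, "segment_warmup", "running", now - timedelta(hours=10)),    # orphan
    Job(2, "segment_warmup", "running", now - timedelta(minutes=20)),  # slow but legitimate
    Job(3, "deep_scan", "running", now - timedelta(hours=10)),         # handled by the other sweep
]
print(sweep_stale_segment_warmup_jobs(jobs, now))  # [1]
```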
Bonus — flaky test stabilised
test_null_last_freshness_check_checked_before_recent (added in v0.22) iterated sweep_stale_vacancies at time.sleep(0.5) per row across the whole vacancies table. On a populated dev/staging DB with ~2k rows that's 17+ minutes per test run, which was timing CI out at 11 minutes for the full suite. Patched time.sleep + capped the test limit; full backend suite is back to 70 seconds.
Tests
- 3 new tests for HH page-ceiling clamp (tests/test_vacancy_sources_page_ceiling.py).
- 2 new tests for segment_warmup orphan sweep (tests/test_segment_warmup.py::SweepStaleSegmentWarmupTest).
- Updated test_vacancy_pipeline_rotation.py for the new 11-cap.
- 676/677 pass; the single failure (test_applications_resume_badge) is pre-existing and unrelated to this change.
Settings touched
| Setting | Old | New |
|---|---|---|
| Rotation offset cap | 90 | 11 |
| _search_hh_public_api_vacancies start_page guard | none | >= 20 → return [] |
| segment_warmup_timeout_seconds (new) | n/a | 1800 |
v0.22.0 — freshness + ToS compliance (Phase 6 final)
Phase 6 closes with v0.22.0. Eliminates zombie vacancies and re-frames the product against hh.ru ToS §3.11.
F1 — on-read freshness check
Every instant search response runs an asyncio.gather over top-N (default 20) matches against https://api.hh.ru/vacancies/{id} via httpx.AsyncClient. An archived: true flag or a 404 triggers a soft-delete (status='archived', archived_at=now()) and removes the row from the response in-flight. Wrapped in try/except: if hh.ru blips, the response still ships.
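A rough sketch of the fail-open fan-out described above. The checker is injected so the example runs without network access; in the release it would be the httpx call against api.hh.ru. The function name filter_alive, the hh_id key, and treating any exception as "keep the row" are assumptions.

```python
import asyncio
from typing import Awaitable, Callable


async def filter_alive(
    matches: list,
    check_alive: Callable[[str], Awaitable[bool]],
    top_n: int = 20,
):
    """Check the top-N matches concurrently; drop only rows confirmed archived."""
    head, tail = matches[:top_n], matches[top_n:]
    # return_exceptions=True keeps one failing check from sinking the whole response.
    results = await asyncio.gather(
        *(check_alive(m["hh_id"]) for m in head), return_exceptions=True
    )
    alive, archived = [], []
    for match, result in zip(head, results):
        if result is False:
            archived.append(match["hh_id"])  # caller soft-deletes these rows
        else:
            alive.append(match)  # True, or an error: fail open and still ship it
    return alive + tail, archived
```

Rows beyond top_n are passed through unchecked, matching the "top-N (default 20)" wording; the caller would apply the soft-delete to the returned archived ids.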
F2 — nightly sweep
vacancy_warmup_worker._run_freshness_sweep_if_due re-checks up to 500 oldest-checked rows per 24h. Order: last_freshness_check ASC NULLS FIRST, shown_count DESC, i.e. never-checked rows first, then the least recently checked, with the most-shown rows breaking ties. 0.5s polite delay between HH calls; ~4 minutes per sweep cycle (500 × 0.5 s ≈ 250 s).
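The ordering can be mirrored in plain Python to make the NULLS FIRST behaviour concrete. This is only an illustration over dict rows (the release does the ordering in SQL); the helper name sweep_order_key is hypothetical.

```python
from datetime import datetime, timezone

_EPOCH_MIN = datetime.min.replace(tzinfo=timezone.utc)


def sweep_order_key(row):
    """Sort key equivalent to: last_freshness_check ASC NULLS FIRST, shown_count DESC."""
    checked = row["last_freshness_check"]
    # False sorts before True, so never-checked rows come first.
    return (checked is not None, checked or _EPOCH_MIN, -row["shown_count"])


rows = [
    {"id": "a", "last_freshness_check": None, "shown_count": 1},
    {"id": "b", "last_freshness_check": None, "shown_count": 5},
    {"id": "c", "last_freshness_check": datetime(2026, 4, 1, tzinfo=timezone.utc), "shown_count": 9},
    {"id": "d", "last_freshness_check": datetime(2026, 4, 20, tzinfo=timezone.utc), "shown_count": 0},
]
batch = sorted(rows, key=sweep_order_key)[:500]
print([r["id"] for r in batch])  # ['b', 'a', 'c', 'd']
```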
F3 — framing as a search layer
- Every VacancyCard now renders a visible source-host button (hh.ru ↗) with bordered + accent-filled styling, replacing the previous subtle "Источник →" ("Source →") link.
- README.md, README.ru.md, PRIVACY.md rewritten as "AI-assisted search layer over hh.ru" instead of "vacancy database".
- New PRIVACY.md section explicitly states: we do not republish vacancy descriptions, we link back to canonical postings, we honour archive status.
Migration
0038_vacancy_freshness adds:
- vacancies.last_freshness_check TIMESTAMPTZ NULL
- vacancies.archived_at TIMESTAMPTZ NULL
- vacancies.shown_count INTEGER NOT NULL DEFAULT 0
Acceptance
- Archived <2% in served results.
- Every match card has a visible hh.ru link.
- Zero hh.ru complaints over 30 days (observational).
Tests
20 new tests in tests/test_vacancy_freshness.py:
- _extract_hh_id happy + invalid paths
- check_vacancy_alive: 200 + alive / 200 + archived / 404 / 5xx / network error
- sweep_stale_vacancies ordering (NULL first, shown_count tiebreak)
- Instant endpoint excludes archived from response
- shown_count bumps after instant
- Sweep-due window (_run_freshness_sweep_if_due respects 24h gate)
Full backend: 666 pass, 1 pre-existing unrelated failure.
Phase 6 closing summary
| Release | What |
|---|---|
| v0.19.0 | Persist instant matches as completed recommendation_jobs row |
| v0.20.0 | Strip Stage 2 deep-scan from the search button |
| v0.21.0 | Lazy segment populate on cold pool |
| v0.22.0 | Freshness check + nightly sweep + ToS framing |
p95 search latency dropped from ~60s to <1s. OpenAI cost dropped ~95%. The local pool is now self-cleaning. Decision-gate for Phase 7 (Telegram bot) is now open.
v0.21.0 — lazy segment populate (Phase 6 step 3)
Phase 6, step 3 — close the cold-segment hole.
Before v0.21, a user with a brand-new role/seniority/domain combination would hit "Подбор" ("Match") and see an empty list with a bland skeleton. v0.20 made search fast and cheap, but cold segments were still a dead-end. This release fixes that.
Architecture
search button (instant, cached)
↓
matcher (Qdrant only, ≤500ms)
↓
matches found? → return them
↓ no
└─→ derive segment_key
enqueue segment_warmup job (idempotent on segment_key)
return {prefetch_empty: true, segment_warming: true}
vacancy_warmup_worker (background, every cycle):
drain pending segment_warmup jobs from DB
for each: HH crawl + LLM analyze inside system_budget_scope
with per-query 429 backoff
honoring segment_warmup_daily_cap
Segment key
sha256(role_family + "|" + seniority_band + "|" + sorted_top3_domains)[:16]. Three dimensions → ~200–400 real segments. Stack-specific terms are deliberately excluded — that's the downstream reranker's job.
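One possible reading of the formula in Python. The lowercase normalisation, the comma used to join domains, and taking the first three domains as "top 3" are assumptions; the notes only fix the three dimensions, the "|" separator, the sorted domains, and the 16-character prefix.

```python
import hashlib


def derive_segment_key(role_family: str, seniority_band: str, domains: list) -> str:
    """Deterministic 16-hex-char key for a (role, seniority, top-3 domains) segment."""
    # Assumption: the caller passes domains ranked by relevance, so [:3] is the top 3.
    top3 = sorted(d.strip().lower() for d in domains[:3])
    raw = "|".join([role_family.strip().lower(), seniority_band.strip().lower(), ",".join(top3)])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


a = derive_segment_key("Backend", "Senior", ["FinTech", "E-commerce", "AdTech"])
b = derive_segment_key("backend", "senior", ["adtech", "fintech", "e-commerce"])
print(a == b, len(a))  # True 16
```

Sorting the domains before hashing is what makes two resumes that list the same domains in a different order land in the same segment.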
Dedup, cost cap, recovery
- Dedup: unique partial index WHERE status IN ('queued', 'running') AND segment_key IS NOT NULL. Two users hitting the same segment within a minute → one job.
- Daily cap: segment_warmup_daily_cap = 100 segments/day, ~$0.003 each → $0.45/day worst case.
- Recovery: the worker queries pending jobs from the DB each cycle, so a restart loses nothing (no in-memory queue).
- Budget isolation: new system_budget_scope decouples segment-warmup spend from user daily budgets.
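The dedup behaviour of a unique partial index can be demonstrated standalone. SQLite is used here only so the sketch runs without a server (the migrations above suggest the real database is PostgreSQL, where the index definition is analogous); the index name and the enqueue helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE recommendation_jobs (
           id INTEGER PRIMARY KEY,
           job_type TEXT NOT NULL,
           segment_key TEXT,
           status TEXT NOT NULL
       )"""
)
# Only active rows take part in the uniqueness check.
conn.execute(
    """CREATE UNIQUE INDEX uq_active_segment
       ON recommendation_jobs (segment_key)
       WHERE status IN ('queued', 'running') AND segment_key IS NOT NULL"""
)


def enqueue_segment_warmup(conn, segment_key: str) -> bool:
    """Return True if a job was created, False if one is already active."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO recommendation_jobs (job_type, segment_key, status) "
                "VALUES ('segment_warmup', ?, 'queued')",
                (segment_key,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(enqueue_segment_warmup(conn, "abc"))  # True
print(enqueue_segment_warmup(conn, "abc"))  # False, same segment already queued
```

Once the first job leaves the queued/running states, the index no longer covers it and the segment can be enqueued again; this is also why an orphan stuck in running blocks re-enqueue indefinitely.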
Frontend
prefetch_empty=True && segment_warming=True triggers an honest "Прогрев индекса под твою роль — 5–10 минут" ("Warming up the index for your role, 5–10 minutes") banner with a soft pulse animation, replacing the bland skeleton. Cold users now know exactly what's happening.
Migration
0037_segment_warmup_jobs adds job_type, segment_key, notify_user_id to recommendation_jobs plus the unique partial index.
Tests
13 new tests in tests/test_segment_warmup.py covering:
- derive_segment_key purity (case-insensitive sort, deterministic)
- Dedup of concurrent enqueue calls
- prefetch_empty instant response carries segment_warming=true
- Daily cap respected by worker
- Recovery: worker drains queued jobs left over from previous run
- system_budget_scope doesn't ding user daily budget
Full backend: 644 pass, 1 pre-existing unrelated failure (test_applications_resume_badge from the 1-resume cap, out of scope).
Acceptance
- Cold user sees warm-in-progress banner instantly.
- After 5–10 minutes the pool is filled; next click serves <1 s matches.
- Daily OpenAI cost stays ≤ $1 even with cold-segment traffic.
- Two users in the same segment in one minute = one job.
v0.20.0 — decouple deep-scan from search button (Phase 6 step 2)
Phase 6, step 2 — kill the live HH crawl on the search button.
What changed
- Frontend refreshVacancyIndex no longer calls POST /vacancies/recommend/start. The Stage 2 polling loop, merge-by-id, and live OpenAI cost ticker are gone from the hot path.
- Refresh = single POST /vacancies/recommend/instant/{resume_id} against the local prefetched index. No HH crawl, no LLM analyze in the request thread.
- POST /recommend/start stays in the API for admin/debug, just not on the user's hot path.
Why
Live ingestion inside the request thread was the root cause of:
- $0.50 per search OpenAI cost (~$2.50/DAU/day, unsustainable above 50 DAU)
- 60-second latency on the "Подбор" ("Match") button
- Race conditions in the matcher loop (multi-call user_vacancy_seen stamping exhausted the recall pool, fixed in 1d830fa but the architecture remained wrong)
- The persistence asymmetry where Stage 2 silently regressed a high-quality Stage 1 result (fixed in v0.19.0 but only as a band-aid)
Industry pattern (LinkedIn Galene, Indeed, hh.ru): ingestion async, query = local index only. Background warmup worker handles ingestion; live freshness in v0.22.
Acceptance
- p95 latency for "Подбор" ("Match") ≤1s
- OpenAI daily cost drops ~95%
- No regression for warm segments (v0.19 snapshot persistence guarantees refresh parity)
Risk + mitigation
Cold segments (no prefetched matches) currently see the existing prefetch_empty skeleton + "индекс пополняется в фоне" ("the index is being replenished in the background") message. The full degrade-UX (relaxed filters, lazy segment_warmup_job enqueue) ships in v0.21. The current behaviour is honest but not yet helpful for net-new role families.
Frontend QA
- tsc --noEmit clean, eslint . clean (5 pre-existing warnings, 0 new)
- docker compose up -d --build frontend healthy, GET / returns 200
- Manual browser smoke not run on this commit: user should sanity-check before relying on prod
v0.19.0 — persist instant snapshot (Phase 6 step 1)
Phase 6, step 1 — persistence fix for instant search results.
Problem
Stage 1 instant search returned matches in <5s but they were ephemeral. After a page refresh restoreRecommendationState called /recommend/latest which only saw the older deep_scan job, so a high-quality instant result silently regressed back to the worse Stage 2 deep_scan one.
Fix
The instant endpoint now persists a completed-status row in recommendation_jobs after every successful response, via a new record_instant_recommendation_snapshot helper. Stage 1 and Stage 2 can race — last-completed wins. No migration; uses the existing status path.
Setup for v0.20
This unblocks the full decouple: once /recommend/latest returns the same matches the user just saw, we can stop running deep_scan from the search button at all.
Tests
- test_latest_returns_same_matches_as_instant_response: refresh round-trip parity.
- test_instant_persists_even_when_matches_are_empty: cold-index path also persists.
- 7/7 instant-endpoint tests pass; 631/632 backend pass (one unrelated pre-existing failure in test_applications_resume_badge.py from the 1-resume cap).
v0.17.0 — Auto-pin rollback + multi-facet discovery + feedback loop
Context
v0.16.0 (pills + auto-pin) broke matching for the very first real user: auto-pin wrote LLM-extracted junk (a standalone 'product', a long phrase containing /) into User.preferred_titles[], the query drifted off target, and the pre-filter dropped 205/208 → 0 matches. An independent product session confirmed that the fix belongs on the backend.
Tier 1 — emergency fix
- Auto-pin killed on the frontend. handleSaveAndRecommend no longer does union(localRoles, autoDetected). User.preferred_* is written only when the user has explicitly edited the pills.
- Server-side noise validator. _validate_titles rejects items shorter than 4 characters and standalone blocklisted generic stems (product, manager, lead, head, director, specialist, engineer, developer, analyst). Applied in both UserPreferencesUpdate and PreferenceOverrides.
- preferred_titles[:2] fix. Only the first 2 pills were used, so the 3rd and 4th were lost. Now all of them are used, and the word cap _short_query_from_tokens(max_words=7) trims at the end.
- Logging of already-affected users. WARNING discovery_query_noisy_pref is emitted when a noisy list is received, with no destructive migration.
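A minimal sketch of the two noise rules, assuming the validator raises on the first bad list (as a Pydantic field validator typically would). The helper names and the error message are hypothetical; the length threshold and the blocklist come from the notes above.

```python
GENERIC_STEMS = frozenset({
    "product", "manager", "lead", "head", "director",
    "specialist", "engineer", "developer", "analyst",
})
MIN_TITLE_LENGTH = 4


def is_noisy_title(title: str) -> bool:
    """True for items that are too short or are a bare generic stem."""
    cleaned = title.strip().lower()
    return len(cleaned) < MIN_TITLE_LENGTH or cleaned in GENERIC_STEMS


def validate_titles(titles: list) -> list:
    """Reject the whole list if any item is noise (assumed behaviour)."""
    noisy = [t for t in titles if is_noisy_title(t)]
    if noisy:
        raise ValueError(f"noisy preferred titles: {noisy}")
    return [t.strip() for t in titles]


print(validate_titles(["Product Manager", "Data Engineer"]))
# ['Product Manager', 'Data Engineer']
```

Note that "lead" and "head" are exactly 4 characters, so the length rule alone would let them through; the blocklist is what catches them. A multi-word title such as "Product Manager" passes because only standalone stems are blocked.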
Tier 1.5 — UX
- Pills demoted. Collapsed into <details> "Дополнительные фильтры (необязательно)" ("Additional filters (optional)") with a hint that the AI already matches based on the resume.
- "Подбираем для тебя" ("Matching for you") summary. A read-only block above the matches: role · seniority · top-3 domains. Hidden when a preferred_titles override is active. Link: "не так? → /audit" ("not right? → /audit").
Tier 2 — multi-facet discovery
- _build_deep_scan_queries extended with 3 independent facets:
  - role_family on its own (e.g. "software engineering Russia")
  - top-3 hard_skills on their own (e.g. "Python Kubernetes Docker Russia")
  - role_family + top skill combo
- Case-insensitive dedup, cap MAX_DEEP_SCAN_QUERIES=6 (interactive).
- Pre-filter audit (B5): _looks_unlikely_stack looks at vacancy.title, _has_sufficient_skill_overlap looks at the candidate's skills. The skill-only facet is not over-filtered.
- Counter multi_facet_queries_generated in admin telemetry.
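The facet generation, dedup, and cap can be sketched as follows. The function name, the region parameter, and the exact wording of the combo facet are assumptions inferred from the two example queries above.

```python
MAX_DEEP_SCAN_QUERIES = 6


def build_facet_queries(base_queries, role_family, hard_skills, region="Russia"):
    """Append the three facets, dedupe case-insensitively, then apply the cap."""
    candidates = list(base_queries)
    top_skills = list(hard_skills[:3])
    if role_family:
        candidates.append(f"{role_family} {region}")
    if top_skills:
        candidates.append(" ".join(top_skills) + f" {region}")
    if role_family and top_skills:
        # Assumed shape of the "role_family + top skill" combo facet.
        candidates.append(f"{role_family} {top_skills[0]} {region}")

    seen, unique = set(), []
    for query in candidates:
        key = query.strip().casefold()
        if key and key not in seen:
            seen.add(key)
            unique.append(query.strip())
    return unique[:MAX_DEEP_SCAN_QUERIES]


print(build_facet_queries(
    ["Backend Engineer python"],
    "software engineering",
    ["Python", "Kubernetes", "Docker", "Go"],
))
```

Because the cap is applied after the base queries, a long base list can crowd out the facets; whether the release orders them this way is not stated in the notes.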
Tier 3 — feedback loop
- New feedback_signal_extractor.get_negative_term_set():
  - Up to 30 dislikes from the last 30 days → tokens from vacancy_profile.must_have_skills + nice_to_have_skills → subtract resume.hard_skills → top-N with freq≥2.
  - Cached for 5 min per (user_id, resume_id).
- ScoringStage: −0.02 per overlapping token, capped at −0.06. Never drops a vacancy.
- Counter negative_term_penalty_applied in admin telemetry.
- Gated by settings.preference_decay_enabled: off by default, the operator enables it in .env.local.
- Magnitudes are calibrated: below +0.03 DOMAIN_PREFERENCE_BOOST and +0.05 TITLE_BOOST_PARTIAL, so the penalty re-ranks close competitors without burying strong semantic matches.
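A rough sketch of the extraction and the capped penalty. It assumes the caller has already limited the dislikes to the last 30 days, treats each profile as a plain dict, and uses a hypothetical top_n default since the notes do not give N; caching and the feature flag are omitted.

```python
from collections import Counter

PENALTY_PER_TERM = 0.02
PENALTY_CAP = 0.06


def negative_term_set(disliked_profiles, resume_skills, min_freq=2, top_n=10):
    """Skills that recur across disliked vacancies and are not in the resume."""
    own = {s.casefold() for s in resume_skills}
    counts = Counter()
    for profile in disliked_profiles[:30]:  # up to 30 dislikes
        skills = profile.get("must_have_skills", []) + profile.get("nice_to_have_skills", [])
        # Count each term once per vacancy, excluding the candidate's own skills.
        counts.update({s.casefold() for s in skills} - own)
    return {term for term, freq in counts.most_common(top_n) if freq >= min_freq}


def apply_negative_penalty(score, vacancy_skills, negative_terms):
    """Subtract 0.02 per overlapping term, never more than 0.06 in total."""
    overlap = {s.casefold() for s in vacancy_skills} & negative_terms
    return score - min(PENALTY_PER_TERM * len(overlap), PENALTY_CAP)


terms = negative_term_set(
    [
        {"must_have_skills": ["PHP", "SQL"], "nice_to_have_skills": ["Bitrix"]},
        {"must_have_skills": ["php", "Bitrix"]},
        {"must_have_skills": ["1C"]},
    ],
    resume_skills=["SQL", "Python"],
)
print(sorted(terms))  # ['bitrix', 'php']
```

The penalty only lowers a score and never filters, which matches the "never drops" rule: a vacancy with many disliked terms still appears, just further down.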
Tests
711 passed, 1 xfailed. The existing v0.16.0 tests (test_user_preferences.py, test_discovery_query_prefs.py, test_matcher_domain_boost.py) are still green.
Action for already-affected users
If odd pills (short tokens, duplicates, long phrases with /) are sitting in your "Дополнительные фильтры" ("Additional filters") block, open the block, click × on each one, then click "Сохранить и обновить подбор" ("Save and refresh matches"). After saving, the system will go by the resume (the new validators will not let junk back into the DB).
v0.16.0 — Editable role + domain pills
What's new
The sidebar on / now has two groups of editable pills for steering the search.
UX
- Roles (up to 5) and Domains (up to 3) are separate groups. Pills auto-detected from the resume are shown in grey, manually added ones in the accent colour.
- Inline typeahead on each field: suggestions come from a frequency index over vacancy_profiles (5-min cache).
- The "Сохранить и обновить подбор" ("Save and refresh matches") button sends a PATCH to the profile and immediately starts an instant-first refresh with the new filters.
- "Сбросить" ("Reset") appears only when there are unsaved changes.
- pill-in / pill-out animations honour prefers-reduced-motion.
Backend
- Migration 0035: column users.preferred_domains: text[].
- PATCH /api/users/me/preferences accepts preferred_domains (≤3, ≤64 characters; [] = clear).
- New endpoint GET /api/users/preferences/suggestions?type=role|domain&q=&limit=: typeahead over vacancy_profiles.
- _build_discovery_query respects preferred_titles and preferred_domains as an override over what is inferred from the resume.
- In the matcher: soft boost of +0.03 to vector_score if vacancy.domains ∩ preferred_domains ≠ ∅. Never rejects a vacancy.
- New counter domain_preference_boost_applied in admin telemetry.
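The soft boost amounts to a set-intersection test. A minimal sketch, assuming a case-insensitive comparison of domain names (the notes do not say how domains are normalised) and a hypothetical function name:

```python
DOMAIN_PREFERENCE_BOOST = 0.03


def boosted_score(vector_score: float, vacancy_domains, preferred_domains) -> float:
    """Add a flat +0.03 when the vacancy shares at least one preferred domain."""
    vacancy = {d.casefold() for d in vacancy_domains}
    preferred = {d.casefold() for d in preferred_domains}
    if vacancy & preferred:
        return vector_score + DOMAIN_PREFERENCE_BOOST
    return vector_score  # no overlap: score untouched, vacancy still returned
```

The boost is flat rather than per-domain, so a vacancy matching three preferred domains gets the same +0.03 as one matching a single domain.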
Designer tokens
16 semantic tokens: pill-auto-*, pill-pinned-*, pill-add-*, pill-remove-icon*, combobox-*, unsaved-indicator-fg. Animations live in globals.css. Style guide and brand preview updated.
Eval
18 new tests: PATCH (cap/clear/persist), suggestions (sort/prefix/auth/cache), discovery query (overrides + fallback), matcher A/B (only FinTech gets +0.03, HealthTech is untouched, nothing is dropped).
Hardening (post-review)
- useId() for listbox/option IDs: two PillsEditor instances no longer conflict.
- Full ARIA 1.2 combobox pattern (role, aria-haspopup, aria-expanded).
- maxLength={64} on the input: the user will not get a raw 422.
- Migration 0036: functional GIN index on vacancy_profiles.profile::jsonb (CONCURRENTLY).
- Domains now come before skills in the discovery query, so they are not lost under the 7-word cap.
v0.15.0 — Instant-first matching
What's new
Gets rid of the "spinner stuck at 10% for 7 minutes" in vacancy matching.
UX
- Instant-first flow. The "Подбор" ("Match") button now returns results from the warmed index in <5 seconds (new sync endpoint POST /api/vacancies/recommend/instant/{resume_id}). The heavy deep-scan moves to the background: a thin indicator above the list shows "ищем ещё" ("searching for more"), and results are topped up on completion without a blocking spinner.
- Partial instead of error. If the deep-scan does not fit into the internal 150 s budget, the job finishes as completed with the flag metrics.partial=true, and the frontend renders a banner "это часть результатов, обновите через 1–2 минуты" ("these are partial results, refresh in 1–2 minutes") instead of a timeout error.
- prefetch_empty skeleton. When the index is cold, a skeleton plus honest text about the warm-up is shown instead of "ничего не найдено" ("nothing found").
Performance / defaults
- Frontend default payload: use_prefetched_index=true, discover_count=40 (was false/100, which was the cause of the 7-minute tails).
- Server recommendation_job_timeout_seconds: 420 → 180 s.
Reliability
- New janitor sweep_stale_running_jobs() in vacancy_warmup_worker sweeps zombie jobs (status=running older than the timeout) once per cycle; the filter runs in SQL, not in a Python loop.
Contract
- VacancyRecommendResponse gains the field prefetch_empty: bool.
- RecommendationJobStatusResponse.metrics gains the key partial: bool.
Tests
11 new integration tests: instant happy/cold/404/no-HH-call, partial flag round-trip, sweeper. Full suite: 692 passed.
Hardening (post-review)
Post-release reviewer pass + follow-up commit "fix(recommendation): post-review hardening": a mutex against double-trigger from restoreRecommendationState, push-down of the DB filter in the sweeper, comments on edge cases.
v0.14.0 — Phase 5.2 — Per-vacancy strategy + cover letter + recommendation corrections
Released: 2026-04-25
Phase plan: .claude/skills/product-roadmap/phase-5.2-per-vacancy-strategy.md
The "AI consultant" thesis lands its second slice. Track segmentation (v0.13.0) told the user which vacancies to consider; this release tells them what to do with each one. Per-vacancy /strategy/{resume_id}/{vacancy_id} page renders three editorial blocks — match_highlights, gap_mitigations, cover_letter_draft — and the user can correct any block to feed the deterministic ranker.
What's new
/strategy/{resume_id}/{vacancy_id} page
- Three blocks with editorial 3 px left-rule treatment (match-highlight = neutral, gap-mitigation = amber).
- Each match_highlight and gap_mitigation card has a "Это не я / Это неправда" ("That's not me / That's not true") inline correction button. A correction POSTs to /api/recommendation-corrections, the card greys out optimistically, and a strategy_match_highlight_corrected / strategy_gap_mitigation_corrected event fires.
- cover_letter_draft rendered in an editable textarea backed by --color-strategy-editor-surface. "Скопировать" ("Copy") button + edit detection emit cover_letter_copied / cover_letter_edited.
- Skeleton state, 401 / 404 / 429 / 503 / generic-error states all rendered inline (no toasts).
- Стратегия ("Strategy") button added to every vacancy card on / (between "Откликнуться" ("Apply") and "Интересно" ("Interested"), hidden when no resume is selected). Apply-after-strategy detection via sessionStorage["strategy_seen:{rid}:{vid}"]=1 → fires apply_after_strategy_view if user applies within the same session.
Backend: vacancy strategy service
backend/app/services/vacancy_strategy.py produces the three blocks for a (resume_id, vacancy_id) pair through two paths:
- LLM path: single responses.create call against OPENAI_MATCHING_MODEL, response_format={"type":"json_object"}, PII-scrubbed input (scrub_pii on canonical_text), regex output sanitizer that strips emails / phones / +7|8-digit-cluster patterns from cover_letter_draft and truncates to 1200 chars at sentence boundary.
- Template path: pure Python. _skill_overlap ranks experience entries by overlap with vacancy_must_have_skills, picks top-3 highlights and top-2 gaps, fills a 3-paragraph skeleton. No LLM, no cost.
Path selection:
- If feature_vacancy_strategy_enabled is False → 503.
- If feature_vacancy_strategy_template_mode_enabled is True → template.
- If daily_user_llm_cost_usd(user_id, today) >= vacancy_strategy_cost_cap_usd_per_day (default $0.05) → template.
- Otherwise → LLM, with template fallback on any LLM error.
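The selection order above can be written as a small pure function. The enum and function names are hypothetical; the order of the checks and the $0.05 default follow the list. The template fallback on LLM error happens later, at call time, and is not modelled here.

```python
from enum import Enum


class StrategyPath(Enum):
    DISABLED = "disabled"  # the endpoint answers 503
    TEMPLATE = "template"
    LLM = "llm"


def select_strategy_path(
    *,
    enabled: bool,
    template_mode: bool,
    spent_today_usd: float,
    cost_cap_usd: float = 0.05,
) -> StrategyPath:
    """Apply the checks in the documented order; the first match wins."""
    if not enabled:
        return StrategyPath.DISABLED
    if template_mode:
        return StrategyPath.TEMPLATE
    if spent_today_usd >= cost_cap_usd:
        return StrategyPath.TEMPLATE  # daily budget exhausted: free path only
    return StrategyPath.LLM


print(select_strategy_path(enabled=True, template_mode=False, spent_today_usd=0.05))
# StrategyPath.TEMPLATE
```

The comparison is >=, so a user who has spent exactly the cap is already routed to the template path.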
Backend: cost-cap helper shared across audit + strategy
backend/app/services/llm_cost_accounting.py::daily_user_llm_cost_usd(db, user_id, today) sums today's cost_usd across resume_audits (Phase 5.0) and vacancy_strategies (this phase) for the user. Both endpoints share the same $0.05/DAU/day budget — the cap is per user, not per surface.
Backend: cache + rate limit
- vacancy_strategies table caches the JSON output by (resume_id, vacancy_id, prompt_version), TTL = vacancy_strategy_cache_ttl_days (default 30 days).
- Endpoint enforces a soft rate limit of 2 strategy computations per hour per user, evaluated server-side by counting fresh vacancy_strategies rows over the last hour. The cap only fires when actually recomputing, so cache hits don't count.
- ?force=true bypasses both cache and rate limit (for admin/debug; not surfaced in UI).
Backend: recommendation corrections feedback loop
POST /api/recommendation-corrections records user-issued corrections of the strategy:
{
  "resume_id": 12,
  "vacancy_id": 4321,
  "correction_type": "match_highlight",
  "subject_index": 0,
  "subject_text": "не работал в банках"
}
Pydantic validates correction_type as Literal["match_highlight", "gap_mitigation", "cover_letter"] and subject_index as int in [0, 10]. Corrections feed correction_signal in future iterations of the ranker (Phase 5.3 hook); for now they're write-only and surfaced in admin samples.
Telemetry — 6 new events
All routed through the generic sink POST /api/telemetry/event (rate-limited 120/min, fire-and-forget) shipped in v0.13.0:
| Event | Payload |
|---|---|
| strategy_view | { resume_id, vacancy_id, template_mode } |
| strategy_match_highlight_corrected | { resume_id, vacancy_id, subject_index } |
| strategy_gap_mitigation_corrected | { resume_id, vacancy_id, subject_index } |
| cover_letter_copied | { resume_id, vacancy_id } |
| cover_letter_edited | { resume_id, vacancy_id, char_delta } |
| apply_after_strategy_view | { resume_id, vacancy_id } |
apply_after_strategy_view is the activation gate metric for the phase: it tells us whether reading the strategy actually moves people to apply.
Design tokens (5 new)
In frontend/app/globals.css:
- --color-strategy-match-rule (neutral)
- --color-strategy-gap-rule (amber, 4.5:1 contrast against canvas)
- --color-strategy-gap-surface
- --color-strategy-gap-label
- --color-strategy-editor-surface
Match-highlight uses neutral border like the match track in v0.13.0 — calm by default. Gap uses the same amber family as stretch track to keep editorial color logic coherent.
DB migrations
- 0033_vacancy_strategies.py: vacancy_strategies(id PK, resume_id FK CASCADE, vacancy_id FK CASCADE, prompt_version VARCHAR(32), strategy_json JSON NULL, cost_usd NUMERIC(10,6) NULL, template_mode BOOL NOT NULL DEFAULT false, computed_at TIMESTAMPTZ DEFAULT NOW()). Unique on (resume_id, vacancy_id, prompt_version). Index ix_vs_resume_computed (resume_id, computed_at).
- 0034_recommendation_corrections.py: recommendation_corrections(id PK, user_id FK CASCADE, resume_id FK CASCADE, vacancy_id FK CASCADE, correction_type VARCHAR(32), subject_index INT, subject_text TEXT NULL, created_at TIMESTAMPTZ DEFAULT NOW()). Two indexes: (resume_id, vacancy_id) and (user_id, created_at).
Feature flags + config
- feature_vacancy_strategy_enabled: default True.
- feature_vacancy_strategy_template_mode_enabled: default False. Flip to True to force template path (debug / cost emergencies).
- vacancy_strategy_cost_cap_usd_per_day: default $0.05. Shared budget with /audit via daily_user_llm_cost_usd.
- vacancy_strategy_cache_ttl_days: default 30.
Files changed
New backend:
app/services/vacancy_strategy.py, app/services/llm_cost_accounting.py, app/api/routes/vacancy_strategy.py, app/api/routes/recommendation_corrections.py, app/models/vacancy_strategy.py, app/models/recommendation_correction.py, app/schemas/vacancy_strategy.py, app/schemas/recommendation_correction.py, alembic/versions/{0033,0034}_*.py.
New frontend:
app/strategy/page.tsx, app/strategy/StrategySkeleton.tsx, components/strategy/StrategyView.tsx (designer anchor, now production), types/strategy.ts.
New tests (54 cases):
backend/tests/test_vacancy_strategy_template_mode.py (7), test_vacancy_strategy_endpoint.py (6), test_recommendation_corrections.py (7), test_llm_cost_accounting.py (4), test_strategy_pii_hard_guard.py (30 parametrized).
Modified:
app/main.py (router wiring), app/models/__init__.py (new model exports), app/api/routes/telemetry.py (6 new events whitelisted), app/core/config.py (4 new settings), frontend/app/page.tsx (Стратегия button + apply-after-strategy detection), frontend/app/globals.css (5 new tokens), frontend/docs/style-guide.md.
Decision-gate
The original phase plan put feature_vacancy_strategy_enabled behind a 14-day flag with engagement / quality / cost metrics. With N=1 dogfood, the decision-gate is invalidated — flag is on by default, fix-forward instead. The cost-cap, PII guard, and rate limit are still load-bearing and remain in production. Metrics framework is intact for when WAU > 1.
Quality gate
- ✅ python -m ruff format --check backend/app/
- ✅ python -m ruff check backend/app/
- ✅ tsc --noEmit && eslint . (5 pre-existing warnings on app/page.tsx)
- ✅ pytest -q tests/ --ignore=tests/eval/test_audit_regression.py: 638 passed (584 prior + 54 new)
- ✅ Frontend container rebuilt + healthy on localhost:3000
- ✅ PII hard guard: 30 parametrized cases (emails / phones / Cyrillic name pairs), zero leaks in template mode
Out of scope (deferred to v1.0.0 / Phase 5.3)
- Domain expansion (PM / Design / Analytics): gated on PMF, separate roadmap.
- Ranker reading from recommendation_corrections: the data is being collected; the ranker hook lands in Phase 5.3.
- Multi-step interview prep / mock interviews: not scoped.