OpenHub-Store · rainxchzed · May 5, 2026 · May 5, 2026 · May 5, 2026 · coderabbitai
diff --git a/.claude/scheduled_tasks.lock b/.claude/scheduled_tasks.lock
@@ -0,0 +1 @@
+{"sessionId":"7555a767-0d96-4490-86d6-a13b5c13148b","pid":40413,"procStart":"Sun May  3 16:45:03 2026","acquiredAt":1777917963204}
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -111,7 +111,7 @@ RepoRefreshWorker (hourly)  — re-fetches passthrough repos by oldest indexed_a
 - **Auth is a stateless proxy, not a session.** `/v1/auth/device/*` forwards to `github.com/login/*` with the backend's `GITHUB_OAUTH_CLIENT_ID` injected. The backend must **never** log, cache, or persist the access token returned by a successful poll — it passes through the suspending handler and out to the HTTP response, nothing else. No database table, no in-memory map, no breadcrumb. The client is the only place the token lives. Client is backend-first on these two calls and falls back to direct-to-github.com on 5xx / network errors (only — not on valid-but-negative responses like `authorization_pending` or `access_denied`, which are GitHub's real answer and `github.com` direct would say the same thing).
 - **Unified ranking via `SearchScore.compute()`** (`ranking/SearchScore.kt`). Formula: `0.40·log₁₀(stars+1)/6 + 0.30·ctr + 0.20·install_success_rate + 0.10·exp(-days_since_release/90)`. Two callers: `SignalAggregationWorker` (hourly, with real signals) and `GitHubSearchClient` at ingest time (cold-start, signals = 0 — still gives passthrough repos a non-null score so they sort). Weights live in the object only; never inline the formula elsewhere.
 - **Meilisearch partial-update gotcha — PUT, never POST.** `MeilisearchClient.addDocuments()` is POST, which on Meili *replaces* the document with whatever fields you send (everything else becomes null). `MeilisearchClient.updateScores()` is PUT, which merges. Pushing just `{id, search_score}` with POST will wipe every other field on 3000+ docs. If you add a new "partial update" path, verify the HTTP verb before deploying.
-- **Dynamic category/topic ordering.** `RepoRepository.findByCategory()` / `findByTopicBucket()` sort by `searchScore DESC NULLS LAST, rank ASC`. The Python fetcher's static `rank` is only a tie-breaker now; behavioral signals dominate.
+- **Dynamic category/topic ordering.** `RepoRepository.findByCategory()` picks a category-specific primary sort column (`trending_score` for trending, `popularity_score` for most-popular, `latest_release_date` for new-releases), falls back to global `searchScore`, then static `rank` as final tie-breaker. Without category-specific primary, both trending and most-popular collapse onto the same global score — the bug fix in PR #12. `findByTopicBucket()` keeps the simpler `searchScore DESC NULLS LAST, rank ASC` order because topics are flat lists, not flavour-segmented like the categories.
 - **Exposed `Repos` table uses `array<String>("topics", TextColumnType())`** for the Postgres `TEXT[]` column. The Python fetcher writes these via psycopg2's automatic list-to-array conversion.
 - **Cache headers are set per endpoint**, not globally. Announcements: 600s/3600s. Categories/topics: 60s/600s. Repo detail: 30s/300s. Search: 15s/30s. Readme proxy: 3600s/21600s. User proxy: 86400s/604800s. Badges (fresh): 3600s/3600s with `stale-while-revalidate=86400`; (degraded) 300s/300s. Edge respects `s-maxage`; the larger `s-maxage` lets Gcore's shield/tiered cache topology absorb origin load while browsers stay fresher via the smaller `max-age`. `/internal/metrics` is uncached.
 - **HEAD routes to GET** via the `AutoHeadResponse` plugin (`Plugins.kt`). Without it, Ktor 3 returns 404 for HEAD on every GET handler — confusing for `curl -I`, monitoring, and CDN origin probes.
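
Review note on the `SearchScore.compute()` formula quoted in the hunk above: the sketch below restates the four weighted terms as a standalone Kotlin function so the cold-start behaviour is easy to check by hand. It is not the contents of `ranking/SearchScore.kt`. The function name, parameter names, and the treatment of a missing release date as a zero freshness term are all assumptions made for illustration.

```kotlin
import kotlin.math.exp
import kotlin.math.log10

// Hypothetical stand-in for SearchScore.compute(); names and defaults are assumed,
// only the weights and term shapes come from the formula in CLAUDE.md.
fun searchScoreSketch(
    stars: Int,
    ctr: Double = 0.0,                 // click-through rate in [0, 1]
    installSuccessRate: Double = 0.0,  // install success rate in [0, 1]
    daysSinceRelease: Double? = null,  // null = no known release (assumption)
): Double {
    // 0.40 * log10(stars + 1) / 6: reaches 0.40 at roughly one million stars.
    val starTerm = 0.40 * log10(stars + 1.0) / 6.0
    val ctrTerm = 0.30 * ctr
    val installTerm = 0.20 * installSuccessRate
    // 0.10 * exp(-days / 90): 0.10 on release day, decaying with a 90-day scale.
    // Assumption: an unknown release date contributes nothing.
    val freshnessTerm =
        if (daysSinceRelease == null) 0.0 else 0.10 * exp(-daysSinceRelease / 90.0)
    return starTerm + ctrTerm + installTerm + freshnessTerm
}
```

With the behavioural signals at zero, as in the ingest-time caller, only the star and freshness terms contribute. For example, `searchScoreSketch(999_999)` is approximately 0.40 and `searchScoreSketch(0, daysSinceRelease = 0.0)` is approximately 0.10, so a cold-start repo still gets a non-null, sortable score.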

diff --git a/src/main/kotlin/zed/rainxch/githubstore/db/RepoRepository.kt b/src/main/kotlin/zed/rainxch/githubstore/db/RepoRepository.kt
@@ -19,18 +19,32 @@ class RepoRepository {
     }
 
     suspend fun findByCategory(category: String, platform: String, limit: Int = 50): List<RepoResponse> = newSuspendedTransaction(Dispatchers.IO) {
-        // Primary: dynamic behavioral search_score (updated hourly by
-        // SignalAggregationWorker from clicks / installs / stars / freshness).
-        // Tie-breaker: the static rank the Python fetcher writes once a day,
-        // which preserves the category's semantic flavor (trending stays
-        // velocity-flavored, new-releases stays recency-flavored, etc.) when
-        // two repos have similar behavioral scores.
+        // Primary sort is category-specific: trending velocity for the
+        // trending list, absolute popularity for the popular list, release
+        // recency for new-releases. Without category-specific primary, both
+        // trending and most-popular collapse onto the same global
+        // search_score and return ~99% identical top-N results -- the bug
+        // this query previously had.
+        //
+        // Each category falls back to the global behavioral search_score
+        // when its category-specific column is NULL, then to the static
+        // rank the Python fetcher writes once a day. The fetcher populates
+        // the category-specific scores for repos in that category, so the
+        // fallback is mostly a no-op except for newly-ingested rows that
+        // haven't been reranked yet.
+        val primary: org.jetbrains.exposed.sql.Expression<*> = when (category) {
+            "trending" -> Repos.trendingScore
+            "most-popular" -> Repos.popularityScore
+            "new-releases" -> Repos.latestReleaseDate
+            else -> Repos.searchScore
+        }
         Repos.innerJoin(RepoCategories, { id }, { repoId })
             .selectAll()
             .where {
                 (RepoCategories.category eq category) and (RepoCategories.platform eq platform)
             }
             .orderBy(
+                primary to SortOrder.DESC_NULLS_LAST,
                 Repos.searchScore to SortOrder.DESC_NULLS_LAST,
                 RepoCategories.rank to SortOrder.ASC,
             )
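
Review note on the new `orderBy` chain: the sketch below reproduces the three-level ordering (category-specific column `DESC NULLS LAST`, then `searchScore DESC NULLS LAST`, then `rank ASC`) on an in-memory list, so the effect of NULL primary values can be checked without a database. `RepoRow`, `sortForCategory`, and the use of an ISO-8601 string for the release date are hypothetical and only mirror the column names in the diff. The Exposed query itself is not exercised here.

```kotlin
// Hypothetical in-memory row; fields mirror the columns referenced in the diff.
data class RepoRow(
    val name: String,
    val trendingScore: Double?,
    val popularityScore: Double?,
    val latestReleaseDate: String?, // assumed ISO-8601, so string order is chronological
    val searchScore: Double?,
    val rank: Int,
)

// DESC NULLS LAST for nullable values: reverse natural order, nulls at the end.
val doubleDescNullsLast: Comparator<Double?> = nullsLast(reverseOrder<Double>())
val stringDescNullsLast: Comparator<String?> = nullsLast(reverseOrder<String>())

fun sortForCategory(rows: List<RepoRow>, category: String): List<RepoRow> {
    // Primary key depends on the category, as in the `when (category)` of the diff.
    val primary: Comparator<RepoRow> = when (category) {
        "trending" -> compareBy<RepoRow, Double?>(doubleDescNullsLast) { it.trendingScore }
        "most-popular" -> compareBy<RepoRow, Double?>(doubleDescNullsLast) { it.popularityScore }
        "new-releases" -> compareBy<RepoRow, String?>(stringDescNullsLast) { it.latestReleaseDate }
        else -> compareBy<RepoRow, Double?>(doubleDescNullsLast) { it.searchScore }
    }
    // Secondary: global search score. Final tie-breaker: static rank, ascending.
    val ordering = primary
        .thenBy(doubleDescNullsLast) { it.searchScore }
        .thenBy { it.rank }
    return rows.sortedWith(ordering)
}
```

One thing the sketch makes visible: a multi-key sort is not a `COALESCE`. A row whose category-specific score is NULL sorts after every row that has one, regardless of how high its `searchScore` is. `searchScore` only orders rows that tie on the primary key, including the group of NULL rows. If newly ingested rows are expected to compete on `searchScore` until they are reranked, that may need a different expression. Whether that matters depends on how quickly the fetcher fills the category columns.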