Skip to content

Conversation

@aysahoo
Copy link
Collaborator

@aysahoo aysahoo commented Aug 23, 2025

  • Now it will create a default value for repo_id
  • for previous projects I have put a button in admin page to update their repo ids as it will be needed once (we can set cron job for later if needed)
  • added some skeletons (don't look right will change afterwards , I needed them so added)
  • now if a new project comes then it fetched the repo id and automatically updates
  • added the fetch of repo_id to seed but will change it to normal hardcoded data after this is solved

the button -
Screenshot 2025-08-24 at 12 30 41 AM

Summary by CodeRabbit

  • New Features
    • Admin dashboard: “Update Repo IDs” quick action with progress state and toast notifications.
    • Projects now auto-derive and persist repository IDs on create; admin mutation to batch-update repo IDs across hosts.
  • Bug Fixes
    • More resilient metric fetching with safe fallbacks; project cards show loading skeletons and fallbacks for missing repo data.
    • Submission payloads include a pending repoId when appropriate.
  • Chores
    • DB migration adds repo_id column and index; seed script resolves repo IDs with throttled provider calls.

@vercel
Copy link

vercel bot commented Aug 23, 2025

@aysahoo is attempting to deploy a commit to the ossdotnow Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai
Copy link

coderabbitai bot commented Aug 23, 2025

Walkthrough

Adds repoId handling across the stack: DB migrations and index, seeding and seed-time resolution, driver methods to batch-update repo IDs, router admin mutation + safer metric fetching, admin UI button to trigger updates, frontend project card loading fallbacks, and submission payload/form omission of repoId.

Changes

Cohort / File(s) Summary
Admin dashboard wiring
apps/web/app/(admin)/admin/page.tsx
Adds React Query mutation wired to trpc.projects.updateRepoIds, success/error toasts, query invalidation, and a Quick Action button (Database icon) to trigger updates for all hosts.
Public project card loading states
apps/web/app/(public)/(projects)/projects/project-card.tsx
Uses query isLoading to render skeletons for avatar, stars, forks, and created date; removes early-return on missing repo and falls back to project-provided values.
Submission payload & form schema
apps/web/components/submissions/submission-form.tsx, apps/web/forms/index.ts
Submission payload now includes repoId: 'pending'; form input schemas omit repoId (client no longer supplies it).
Git drivers: repoId updater
packages/api/src/driver/github.ts, packages/api/src/driver/gitlab.ts, packages/api/src/driver/types.ts
Adds updateRepoIds(ctx) to GitHub/GitLab managers and GitManager interface: batch-fetch provider repo IDs, update DB, return counts (updated/failed/skipped); includes per-item logging, 1s throttling, and error handling.
Projects router: derive and update repoId; safer metrics
packages/api/src/routers/projects.ts
Adds admin mutation updateRepoIds({ gitHost }) returning per-host and totals; addProject/createProjectInput no longer accepts repoId and derives it server-side; metric fetching wrapped in try/catch and optionally writes refreshed metrics back to DB.
DB migrations and metadata
packages/db/drizzle/0024_free_khan.sql, packages/db/drizzle/0025_add_repo_id_index.sql, packages/db/drizzle/meta/0024_snapshot.json, packages/db/drizzle/meta/_journal.json
Adds project.repo_id column, backfills values, applies NOT NULL, and creates project_repo_id_idx; updates drizzle snapshot and journal entries.
Seed: resolve repoId at insert time
packages/db/src/seed/fixtures/projects.ts
Seed script now fetches provider repo IDs per-record (GitHub/GitLab), throttles requests, falls back to gitRepoUrl on failure, and inserts enriched records; logs progress/errors.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Admin
  participant Web as Admin UI
  participant API as tRPC Projects Router
  participant GH as GithubManager
  participant GL as GitlabManager
  participant DB as Database
  participant ExtGH as GitHub API
  participant ExtGL as GitLab API

  Admin->>Web: Click "Update Repo IDs"
  Web->>API: projects.updateRepoIds({ gitHost: 'all' })
  par GitHub path
    API->>GH: updateRepoIds(ctx)
    loop for each github project
      GH->>ExtGH: GET /repos/{owner}/{repo}
      ExtGH-->>GH: repo.id or error
      GH->>DB: UPDATE project.repo_id (when found)
    end
  and GitLab path
    API->>GL: updateRepoIds(ctx)
    loop for each gitlab project
      GL->>ExtGL: GET /projects/:encoded_path
      ExtGL-->>GL: repo.id or error
      GL->>DB: UPDATE project.repo_id (when found)
    end
  end
  API-->>Web: { results per host, totals }
  Web-->>Admin: show toast, invalidate admin.dashboard
Loading
sequenceDiagram
  autonumber
  actor User
  participant Web as Submit UI
  participant API as tRPC Projects Router
  participant Driver as Provider Driver
  participant Ext as Provider API
  participant DB as Database

  User->>Web: Submit project (no repoId)
  Web->>API: addProject(input without repoId)
  API->>Driver: derive owner/repo and fetch repoId
  Driver->>Ext: GET repo metadata
  Ext-->>Driver: repo.id or not found
  Driver-->>API: repoId or fallback (gitRepoUrl/'pending')
  API->>DB: INSERT project with derived repo_id
  DB-->>API: inserted project
  API-->>Web: success
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • fixes #199 — Wires project.repoId into UI/types; overlaps with backend and UI repoId integration in this PR.
  • feat: gitlab apis #25 — Related earlier work on provider drivers and router surfaces that this PR extends with updateRepoIds.
  • admin pages #20 — Prior admin dashboard changes; this PR adds the repo-id update action and hook integration to that UI area.

Poem

Thump-thump, I hop through migration rows,
Fetching IDs where the git-host wind blows.
Seeds enriched, indexes climb,
Admin clicks — counts in time.
Carrots for commits, a rabbit’s small cheer. 🥕

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 9ce768c and bdfeb82.

📒 Files selected for processing (2)
  • apps/web/app/(public)/(projects)/projects/project-card.tsx (3 hunks)
  • packages/api/src/driver/gitlab.ts (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • apps/web/app/(public)/(projects)/projects/project-card.tsx
  • packages/api/src/driver/gitlab.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Vade Review
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@@ -0,0 +1,2 @@
-- Add an index for better repo_id query performance
CREATE INDEX IF NOT EXISTS "project_repo_id_idx" ON "project" ("repo_id");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Migration 0025 creates the same index that was already created in migration 0024, making it redundant.

View Details
📝 Patch Details
diff --git a/packages/db/drizzle/0025_add_repo_id_index.sql b/packages/db/drizzle/0025_add_repo_id_index.sql
deleted file mode 100644
index df53c9b..0000000
--- a/packages/db/drizzle/0025_add_repo_id_index.sql
+++ /dev/null
@@ -1,2 +0,0 @@
--- Add an index for better repo_id query performance
-CREATE INDEX IF NOT EXISTS "project_repo_id_idx" ON "project" ("repo_id");
diff --git a/packages/db/drizzle/meta/_journal.json b/packages/db/drizzle/meta/_journal.json
index 90a467a..c68ce29 100644
--- a/packages/db/drizzle/meta/_journal.json
+++ b/packages/db/drizzle/meta/_journal.json
@@ -176,13 +176,6 @@
       "when": 1755967252017,
       "tag": "0024_free_khan",
       "breakpoints": true
-    },
-    {
-      "idx": 25,
-      "version": "7",
-      "when": 1755968025000,
-      "tag": "0025_add_repo_id_index",
-      "breakpoints": true
     }
   ]
 }

Analysis

Migration 0024 (line 17) already creates the index project_repo_id_idx on the repo_id column:

CREATE INDEX IF NOT EXISTS "project_repo_id_idx" ON "project" ("repo_id");

Migration 0025 attempts to create the exact same index again. While the IF NOT EXISTS clause makes this safe and won't cause errors, having a dedicated migration that only creates an already-existing index is redundant and adds unnecessary confusion to the migration history.

This could indicate either a mistake in the migration planning or a leftover migration file that should have been removed.


Recommendation

Remove the entire migration file packages/db/drizzle/0025_add_repo_id_index.sql and update the journal file to remove the corresponding entry, since the index is already created in migration 0024.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it again cause it was not working for me in the first time

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
apps/web/app/(public)/(projects)/projects/project-card.tsx (2)

1-1: Add "use client" directive — this component uses hooks and must be a Client Component in Next.js app router.

Without the "use client" directive, using useQuery/useTRPC will cause a build/runtime error in Next.js (app directory). Place it at the very top of the file before any imports.

+"use client";
 import { projectProviderEnum, project as projectSchema } from '@workspace/db/schema';

40-72: Avatar URL handling: avoid hardcoding gitlab.com and add project.logoUrl fallback.

  • Bug risk: repo?.namespace?.avatar_url may be absolute (works) or relative; hardcoding https://gitlab.com breaks self-hosted GitLab and other hosts.
  • UX: When repo is absent or errors, you always show a skeleton even if project.logoUrl exists. Prefer falling back to project.logoUrl.
  • Simplify by computing a single avatarUrl that prefers repo owner/group avatar, then project.logoUrl, else skeleton.

Apply this minimal render diff (uses an avatarUrl variable defined just above the return):

-          {isLoading || isError || !repo ? (
-            <div className="h-[78px] w-[78px] animate-pulse bg-neutral-900" />
-          ) : (repo?.owner?.avatar_url || repo?.namespace?.avatar_url) ? (
-            project.ownerId ? (
-              <Link
-                href={`/profile/${project.ownerId}`}
-                onClick={(e) => e.stopPropagation()}
-                className="z-10 shrink-0 rounded-none"
-              >
-                <Image
-                  src={
-                    repo?.owner?.avatar_url || `https://gitlab.com${repo?.namespace?.avatar_url}`
-                  }
-                  alt={project.name ?? 'Project Logo'}
-                  width={256}
-                  height={256}
-                  className="h-[78px] w-[78px] rounded-none transition-opacity hover:opacity-80"
-                  loading="lazy"
-                />
-              </Link>
-            ) : (
-              <Image
-                src={repo?.owner?.avatar_url || `https://gitlab.com${repo?.namespace?.avatar_url}`}
-                alt={project.name ?? 'Project Logo'}
-                width={256}
-                height={256}
-                className="h-[78px] w-[78px] rounded-none"
-                loading="lazy"
-              />
-            )
-          ) : (
-            <div className="h-[78px] w-[78px] animate-pulse bg-neutral-900" />
-          )}
+          {isLoading ? (
+            <div className="h-[78px] w-[78px] animate-pulse bg-neutral-900" />
+          ) : avatarUrl ? (
+            project.ownerId ? (
+              <Link
+                href={`/profile/${project.ownerId}`}
+                onClick={(e) => e.stopPropagation()}
+                className="z-10 shrink-0 rounded-none"
+              >
+                <Image
+                  src={avatarUrl}
+                  alt={project.name ?? 'Project Logo'}
+                  width={256}
+                  height={256}
+                  className="h-[78px] w-[78px] rounded-none transition-opacity hover:opacity-80"
+                  loading="lazy"
+                />
+              </Link>
+            ) : (
+              <Image
+                src={avatarUrl}
+                alt={project.name ?? 'Project Logo'}
+                width={256}
+                height={256}
+                className="h-[78px] w-[78px] rounded-none"
+                loading="lazy"
+              />
+            )
+          ) : (
+            <div className="h-[78px] w-[78px] animate-pulse bg-neutral-900" />
+          )}

Add this helper near the top of the component (outside JSX) to resolve the avatar URL robustly across GitHub/GitLab (including self-hosted) and to fall back to project.logoUrl:

// Place inside the component, above `return`
const avatarUrl = (() => {
  const gh = repo?.owner?.avatar_url as string | undefined;
  const gl = repo?.namespace?.avatar_url as string | undefined;
  if (gh) return gh;
  if (gl) {
    try {
      // Use repo host as base for relative GitLab avatar paths
      const base = new URL(project.gitRepoUrl).origin;
      return new URL(gl, base).toString();
    } catch {
      // If new URL fails, return as-is when it already looks absolute
      if (gl.startsWith('http://') || gl.startsWith('https://')) return gl;
    }
  }
  return project.logoUrl || undefined;
})();
apps/web/components/submissions/submission-form.tsx (2)

257-265: Repository format check rejects valid owners with dots; align with form + parser

formatRegex prohibits . in the owner segment, while:

  • the form schema allows it,
  • parseRepositoryUrl patterns allow it,
  • GitLab group paths commonly include dots.

This yields false negatives in validation.

-      const formatRegex = /^[a-zA-Z0-9_-]+\/[a-zA-Z0-9._-]+$/;
+      const formatRegex = /^[a-zA-Z0-9._-]+\/[a-zA-Z0-9._-]+$/;

Optional: Extract a shared constant so the UI parser, UI validator, and API validator use the same source of truth.


418-425: Remove client-supplied repoId – the backend derives it
The repoId: 'pending' you’re spreading into mutate(...) isn’t consumed by any TRPC input schema (it’s stripped by Zod’s default behavior) and is instead looked up server-side via the GitHub/GitLab driver. Passing it from the client adds noise and a potential footgun if the schema ever becomes strict or someone mistakenly persists it.

Please update:

   mutate({
     ...formData,
     status: formData.status || '',
     type: formData.type || '',
-    repoId: 'pending',
   });
🧹 Nitpick comments (22)
apps/web/app/(public)/(projects)/projects/project-card.tsx (2)

136-142: Simplify date fallback and avoid double branching.

Use a single nullish-coalescing fallback to project date. This keeps zero-cost semantics and reduces churn.

-              {isLoading ? (
-                <span className="inline-block h-3 w-16 animate-pulse rounded bg-neutral-700"></span>
-              ) : isError || !repo ? (
-                formatDate(new Date(project.createdAt))
-              ) : (
-                repo?.created_at ? formatDate(new Date(repo.created_at)) : formatDate(new Date(project.createdAt))
-              )}
+              {isLoading ? (
+                <span className="inline-block h-3 w-16 animate-pulse rounded bg-neutral-700"></span>
+              ) : isError || !repo ? (
+                formatDate(new Date(project.createdAt))
+              ) : (
+                formatDate(new Date(repo?.created_at ?? project.createdAt))
+              )}

40-72: Optional: Precompute display values to shrink JSX branching.

Consider computing avatarUrl, stars, forks, and created above the return and reference them in JSX. This improves readability and avoids recomputation on each render path.

Example:

const stars = isError || !repo ? (project.starsCount ?? 0) : (repo?.stargazers_count ?? repo?.star_count ?? project.starsCount ?? 0);
const forks = isError || !repo ? (project.forksCount ?? 0) : (repo?.forks_count ?? project.forksCount ?? 0);
const created = formatDate(new Date((repo?.created_at ?? project.createdAt)));

Also applies to: 112-143

apps/web/forms/index.ts (1)

28-33: Unify validation copy: GitHub-only wording while GitLab is supported

Message says “Invalid GitHub repository format” but the app supports GitHub and GitLab. Recommend neutral copy.

-        message: 'Invalid GitHub repository format. Use: username/repository',
+        message: 'Invalid repository format. Use: owner/repository',
packages/api/src/driver/types.ts (2)

257-259: Broaden updateRepoIds ergonomics and narrow types

Consider evolving the signature for long-running admin jobs and type safety:

  • Replace ctx: { db: any } with a typed context (e.g., DbClient from drizzle).
  • Add optional controls: limit?: number, dryRun?: boolean, continueOnError?: boolean, since?: string (ISO) to support incremental/admin usage and safer rollouts.
-export interface GitManager {
+export interface GitManager {
@@
-  updateRepoIds(ctx: { db: any }): Promise<{ updated: number; failed: number; skipped: number }>;
+  updateRepoIds(
+    ctx: { db: DbClient; logger?: { info: Function; warn: Function; error: Function } },
+    opts?: { limit?: number; dryRun?: boolean; continueOnError?: boolean; since?: string },
+  ): Promise<{ updated: number; failed: number; skipped: number }>;

116-128: Naming consistency: UnSubmittedRepoUnsubmittedRepo

Prefer consistent casing (“unsubmitted” as a single word) to avoid divergent type names across modules.

-export interface UnSubmittedRepo {
+export interface UnsubmittedRepo {
   name: string;
   repoUrl: string;
   stars: number;
   forks: number;
   isOwner: boolean;
   description?: string | null;
   gitHost: 'github' | 'gitlab';
   owner: {
     avatar_url: string;
   };
   created_at: string;
 }

And update the usage in getUnsubmittedRepos: Promise<UnsubmittedRepo[]>.

packages/db/drizzle/0025_add_repo_id_index.sql (1)

1-2: Redundant index creation; decide single source of truth

This migration recreates project_repo_id_idx, which is already created in 0024. It’s harmless due to IF NOT EXISTS but introduces duplication. Consider removing one or making 0025 solely responsible for index management to reduce noise in future diffs.

If you intend to enforce uniqueness on repo_id (recommended if each repo maps to one project), we’ll need a drop-and-recreate path guarded by pre-checks. I can draft that once we confirm no duplicates exist.

packages/db/drizzle/0024_free_khan.sql (1)

16-17: Consider uniqueness + normalization enforcement (optional) and avoid duplicate index definitions

  • You create the same index here and again in 0025. Keep it in one migration to avoid drift.
  • If business rules require a 1:1 between repo and project, enforce uniqueness (after cleaning duplicates) and/or store repo_id lowercase consistently.

Options:

  • Enforce app-level normalization (drivers/admin updater writes lower(host:owner/repo)).
  • Add a unique index on repo_id once data is clean:
-- After verifying no duplicates:
DROP INDEX IF EXISTS "project_repo_id_idx";
CREATE UNIQUE INDEX IF NOT EXISTS "project_repo_id_uq_idx" ON "project" ("repo_id");

Operational note: For large tables in production, prefer CREATE INDEX CONCURRENTLY in a separate, non-transactional migration to reduce locking.

packages/api/src/driver/gitlab.ts (3)

36-91: Add rate-limit/backoff handling and remove the unused variable.

  • encodedPath is computed (Line 64) but unused. Either use it or drop it to avoid confusion.
  • Consider handling 429/5xx with exponential backoff to make the batch update more resilient and faster overall.

Apply:

-      const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));
+      const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));
+      const withBackoff = async <T>(fn: () => Promise<T>) => {
+        let delay = 1000;
+        for (let attempt = 0; attempt < 5; attempt++) {
+          try {
+            return await fn();
+          } catch (e: any) {
+            const status = e?.response?.status ?? e?.cause?.response?.status;
+            if (status === 429 || (status >= 500 && status < 600)) {
+              await sleep(delay);
+              delay = Math.min(delay * 2, 16000);
+              continue;
+            }
+            throw e;
+          }
+        }
+        throw new Error('Max retries exceeded');
+      };

And:

-          const encodedPath = encodeURIComponent(p.gitRepoUrl);
-          const response = await this.gitlab.Projects.show(p.gitRepoUrl);
+          // Gitbeaker accepts path-with-namespace and performs encoding internally.
+          const response = await withBackoff(() => this.gitlab.Projects.show(p.gitRepoUrl));

57-63: Skip condition can be simplified for readability.

Since a numeric repoId will never equal a path like owner/repo, the final comparison is redundant.

-          if (p.repoId && !isNaN(Number(p.repoId)) && p.repoId !== p.gitRepoUrl) {
+          if (p.repoId && !isNaN(Number(p.repoId))) {
             result.skipped++;
             continue;
           }

934-970: Avoid hardcoding gitlab.com in URL parsing; derive owner safely.

Parsing projectPath via a gitlab.com regex (Lines 966-967) will break for self-hosted or alternate domains and may fail if web_url changes. You already have the full project path before "/-/merge_requests/"; parse it generically.

-            const urlParts = mr.web_url.split('/-/merge_requests/');
-            const projectUrl = urlParts[0] || '';
-            const pathMatch = projectUrl.match(/gitlab\.com\/(.+)$/);
-            const projectPath = pathMatch ? pathMatch[1] : '';
-
-            const [ownerLogin] = projectPath.split('/');
+            const [projectUrl] = mr.web_url.split('/-/merge_requests/');
+            // Extract path after the domain, regardless of host
+            const projectPath = projectUrl.replace(/^https?:\/\/[^/]+\//, '');
+            const [ownerLogin] = projectPath.split('/');
packages/db/src/seed/fixtures/projects.ts (5)

4-5: Prefer built-in fetch (Node 18+) or type-safe import to remove ts-ignore.

Relying on @ts-ignore masks real issues. If your runtime is Node 18+, use global fetch; otherwise, import node-fetch with proper typing or switch to undici.

-// @ts-ignore - Ignoring type issues for node-fetch in seed script
-import fetch from 'node-fetch';
+// Use built-in fetch (Node 18+) to avoid ts-ignore and extra dep
+const fetchFn: typeof fetch = globalThis.fetch as any;

And replace usages of fetch below with fetchFn.


8-31: Allow authenticated GitHub requests to avoid rate limiting during seeding.

Unauthenticated GitHub API calls are limited to ~60/hour. Consider using GITHUB_TOKEN if available.

-async function fetchGitHubRepoId(repoPath: string): Promise<string | null> {
+async function fetchGitHubRepoId(repoPath: string): Promise<string | null> {
   try {
-    const response = await fetch(`https://api.github.com/repos/${repoPath}`, {
+    const githubToken = process.env.GITHUB_TOKEN;
+    const response = await fetchFn(`https://api.github.com/repos/${repoPath}`, {
       headers: {
         Accept: 'application/vnd.github+json',
         'User-Agent': 'OSSLaunch-Seeding',
+        ...(githubToken ? { Authorization: `Bearer ${githubToken}` } : {}),
       },
     });

34-57: Same for GitLab: support token auth to reduce failures and noise.

GitLab API allows PRIVATE-TOKEN or Authorization: Bearer. Use env if provided.

-async function fetchGitLabRepoId(repoPath: string): Promise<string | null> {
+async function fetchGitLabRepoId(repoPath: string): Promise<string | null> {
   try {
     const encodedPath = encodeURIComponent(repoPath);
-    const response = await fetch(`https://gitlab.com/api/v4/projects/${encodedPath}`, {
+    const gitlabToken = process.env.GITLAB_TOKEN;
+    const response = await fetchFn(`https://gitlab.com/api/v4/projects/${encodedPath}`, {
       headers: {
         'User-Agent': 'OSSLaunch-Seeding',
+        ...(gitlabToken
+          ? { 'PRIVATE-TOKEN': gitlabToken } // or { Authorization: `Bearer ${gitlabToken}` }
+          : {}),
       },
     });

905-931: Type the accumulator to catch shape mismatches at compile time.

Make projectsWithRepoId a typed array of project inserts.

-      const projectsWithRepoId = [];
+      const projectsWithRepoId: typeof project.$inferInsert[] = [];

933-934: Consider on-conflict protection for parallel runs.

If multiple seed runs happen in parallel, a unique git_repo_url could cause a constraint error. Using Drizzle’s onConflictDoNothing helps.

-      await database.insert(project).values(projectsWithRepoId);
+      await database
+        .insert(project)
+        .values(projectsWithRepoId)
+        .onConflictDoNothing({ target: project.gitRepoUrl });
apps/web/app/(admin)/admin/page.tsx (3)

27-41: Use TRPC-provided query keys/utilities for invalidation to avoid stale data.

Hardcoding ['admin.dashboard'] may not match the actual query key, depending on your trpc client factory. Prefer the TRPC utils/helpers.

-        queryClient.invalidateQueries({ queryKey: ['admin.dashboard'] });
+        // Prefer TRPC utilities if available, e.g.:
+        // const utils = trpc.useUtils();
+        // await utils.admin.dashboard.invalidate();
+        // If not, use the query key from queryOptions:
+        queryClient.invalidateQueries(trpc.admin.dashboard.queryOptions().queryKey);

27-41: Surface server error details in the toast for faster debugging.

Expose the message if present.

-      onError: () => {
-        toast.error('Error', {
-          description: 'Failed to update repository IDs',
-        });
-      },
+      onError: (err) => {
+        toast.error('Error', {
+          description: err?.message ?? 'Failed to update repository IDs',
+        });
+      },

147-155: Add basic a11y affordances to the action button.

Announce loading state via aria attributes.

-            <Button
+            <Button
               className="w-full justify-start"
               variant="outline"
               onClick={() => updateRepoIds.mutate({ gitHost: 'all' })}
               disabled={updateRepoIds.isPending}
+              aria-busy={updateRepoIds.isPending}
+              aria-live="polite"
             >
packages/api/src/driver/github.ts (1)

33-89: Batch update logic is sound; add basic backoff and simplify skip condition.

  • The throttle is good; adding exponential backoff for 429/5xx will make the job more robust.
  • The p.repoId !== p.gitRepoUrl part is redundant for numeric repoIds.
-      const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));
+      const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));
+      const withBackoff = async <T>(fn: () => Promise<T>) => {
+        let delay = 1000;
+        for (let attempt = 0; attempt < 5; attempt++) {
+          try {
+            return await fn();
+          } catch (e: any) {
+            const status = e?.status ?? e?.response?.status ?? e?.cause?.response?.status;
+            if (status === 429 || (status >= 500 && status < 600)) {
+              await sleep(delay);
+              delay = Math.min(delay * 2, 16000);
+              continue;
+            }
+            throw e;
+          }
+        }
+        throw new Error('Max retries exceeded');
+      };

And:

-          if (p.repoId && !isNaN(Number(p.repoId)) && p.repoId !== p.gitRepoUrl) {
+          if (p.repoId && !isNaN(Number(p.repoId))) {
             result.skipped++;
             continue;
           }

And:

-          const { data } = await this.octokit.rest.repos.get({ owner, repo });
+          const { data } = await withBackoff(() => this.octokit.rest.repos.get({ owner, repo }));
packages/api/src/routers/projects.ts (2)

65-107: Optional: rate-limit metric fetches to avoid API throttling

Even with a 3-minute cache, parallel Promise.allSettled can hammer provider APIs on popular pages. Consider a small concurrency limiter (e.g., p-limit 5–10) and exponential backoff on 429s. This is a nice-to-have given the admin backfill and short TTL.

I can wire a tiny in-memory queue keyed by repoId+metric to deduplicate bursts for the same project.


110-150: Isolate provider failures in updateRepoIds to allow partial success

The updateRepoIds methods on both drivers already match the expected signature, so you can safely wrap each provider call in its own try/catch. This will prevent one provider’s failure from aborting the entire mutation and still return aggregated results.

• File: packages/api/src/routers/projects.ts
Lines: ~110–150

Proposed optional refactor:

-      try {
-        if (input.gitHost === 'github' || input.gitHost === 'all') {
-          const githubDriver = await getActiveDriver('github', ctx as Context);
-          results.github = await githubDriver.updateRepoIds({ db: ctx.db });
-        }
-
-        if (input.gitHost === 'gitlab' || input.gitHost === 'all') {
-          const gitlabDriver = await getActiveDriver('gitlab', ctx as Context);
-          results.gitlab = await gitlabDriver.updateRepoIds({ db: ctx.db });
-        }
-
-        return {
-          success: true,
-          results,
-          totals: {
-            updated: results.github.updated + results.gitlab.updated,
-            failed: results.github.failed + results.gitlab.failed,
-            skipped: results.github.skipped + results.gitlab.skipped,
-          },
-        };
-      } catch (error) {
-        console.error('Error updating repo IDs:', error);
-        throw new TRPCError({
-          code: 'INTERNAL_SERVER_ERROR',
-          message: `Failed to update repo IDs: ${error instanceof Error ? error.message : 'Unknown error'}`,
-        });
-      }
+      if (input.gitHost === 'github' || input.gitHost === 'all') {
+        try {
+          const githubDriver = await getActiveDriver('github', ctx as Context);
+          results.github = await githubDriver.updateRepoIds({ db: ctx.db });
+        } catch (e) {
+          console.error('Error updating GitHub repo IDs:', e);
+        }
+      }
+      if (input.gitHost === 'gitlab' || input.gitHost === 'all') {
+        try {
+          const gitlabDriver = await getActiveDriver('gitlab', ctx as Context);
+          results.gitlab = await gitlabDriver.updateRepoIds({ db: ctx.db });
+        } catch (e) {
+          console.error('Error updating GitLab repo IDs:', e);
+        }
+      }
+      return {
+        success: true,
+        results,
+        totals: {
+          updated: results.github.updated + results.gitlab.updated,
+          failed: results.github.failed + results.gitlab.failed,
+          skipped: results.github.skipped + results.gitlab.skipped,
+        },
+      };

This change will:

  • Ensure each provider runs independently.
  • Log provider-specific failures without aborting the mutation.
  • Still return consolidated counts in totals.
packages/db/drizzle/meta/0024_snapshot.json (1)

763-824: Consider indexing repo_id in snapshot (or ensure 0025 does); partial index optional

The runtime will frequently look up by repo_id. If 0025 adds it, great. Optionally make it a partial index excluding 'pending' to keep it small.

I can draft the partial index as:

create index concurrently if not exists project_repo_id_idx
  on public.project (repo_id)
  where repo_id <> 'pending';
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 9af790e and 9ce768c.

📒 Files selected for processing (13)
  • apps/web/app/(admin)/admin/page.tsx (2 hunks)
  • apps/web/app/(public)/(projects)/projects/project-card.tsx (3 hunks)
  • apps/web/components/submissions/submission-form.tsx (2 hunks)
  • apps/web/forms/index.ts (2 hunks)
  • packages/api/src/driver/github.ts (4 hunks)
  • packages/api/src/driver/gitlab.ts (3 hunks)
  • packages/api/src/driver/types.ts (2 hunks)
  • packages/api/src/routers/projects.ts (6 hunks)
  • packages/db/drizzle/0024_free_khan.sql (1 hunks)
  • packages/db/drizzle/0025_add_repo_id_index.sql (1 hunks)
  • packages/db/drizzle/meta/0024_snapshot.json (1 hunks)
  • packages/db/drizzle/meta/_journal.json (1 hunks)
  • packages/db/src/seed/fixtures/projects.ts (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (6)
packages/api/src/driver/gitlab.ts (1)
packages/db/src/schema/projects.ts (1)
  • project (16-71)
packages/db/src/seed/fixtures/projects.ts (2)
packages/db/src/seed/create-fixture.ts (1)
  • data (13-27)
packages/db/src/schema/projects.ts (1)
  • project (16-71)
apps/web/app/(public)/(projects)/projects/project-card.tsx (2)
packages/db/src/schema/projects.ts (1)
  • project (16-71)
apps/web/lib/utils.ts (1)
  • formatDate (1-1)
packages/api/src/driver/github.ts (3)
packages/db/src/schema/projects.ts (1)
  • project (16-71)
packages/api/src/driver/utils.ts (1)
  • Context (7-7)
packages/api/src/driver/types.ts (1)
  • UnSubmittedRepo (116-128)
apps/web/app/(admin)/admin/page.tsx (2)
packages/api/src/driver/github.ts (1)
  • updateRepoIds (33-89)
packages/api/src/driver/gitlab.ts (1)
  • updateRepoIds (36-91)
packages/api/src/routers/projects.ts (3)
packages/api/src/driver/utils.ts (2)
  • getActiveDriver (10-33)
  • Context (7-7)
packages/api/src/trpc.ts (1)
  • adminProcedure (54-73)
packages/api/src/utils/constants.ts (1)
  • PROVIDER_URL_PATTERNS (1-4)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Vade Review
🔇 Additional comments (17)
apps/web/app/(public)/(projects)/projects/project-card.tsx (3)

20-27: LGTM on the repo query wiring and cache policy.

The query wiring via trpc.repository.getRepo.queryOptions, host/provider gating, and 24h staleTime align well with the PR’s objectives around background repo_id resolution and UI fallbacks.


20-27: The project is on v5.83.0 of @tanstack/react-query and currently only uses isLoading (no isPending found). The original suggestions still apply:

  • Retry Policy: Consider adding a conservative retry option to the useQuery call (e.g. retry: false or retry: 1) to avoid repeated failures on hard errors (private/missing repos).
  • Version Alignment (isPending): Since you’re on React Query v5 (v5.83.0), you can optionally adopt the new isPending flag for initial load semantics, but continuing with isLoading remains valid.

No further changes are required unless you want to adopt isPending or tweak retry behavior.


40-72: I've requested extraction of the remotePatterns block from your Next.js config and a scan for relevant avatar hostnames in the code. Once you provide the script output, I'll confirm whether any hosts are missing from your next.config.mjs and update the review accordingly.

apps/web/forms/index.ts (3)

23-24: LGTM: repoId intentionally excluded from form input

Excluding repoId from earlySubmissionForm keeps client input minimal and pushes derivation to the backend, which aligns with the PR objective.


56-57: LGTM: repoId intentionally excluded from submission form input

Same rationale as above for submisionForm. The server should own repoId derivation.


47-47: Rename Verified and Complete

All instances of submisionForm have been successfully renamed to submissionForm and no leftover references remain in the codebase. Imports and usage in both apps/web/forms/index.ts and consuming components have been updated accordingly.

apps/web/components/submissions/submission-form.tsx (1)

133-134: No-op formatting change

Trailing comma in props destructuring is fine and has no behavioral impact.

packages/db/drizzle/meta/_journal.json (2)

173-186: Migrations journal entries look consistent and correctly appended.

Idx values are sequential, version stays "7", and timestamps/tags look sane. No JSON formatting issues spotted.


173-186: Migrations and snapshot consistency verified

All checks confirm that the journal, SQL migrations, and schema snapshot are in sync:

  • packages/db/drizzle/0024_free_khan.sql
    • Adds repo_id column (text)
    • Populates existing rows via UPDATE … SET repo_id = CASE … END
    • Alters repo_id to SET NOT NULL
  • packages/db/drizzle/0025_add_repo_id_index.sql
    • Creates project_repo_id_idx on project(repo_id)
  • packages/db/drizzle/meta/0024_snapshot.json
    • Defines repo_id with "notNull": true

No drift detected—no further changes needed.

packages/api/src/driver/github.ts (1)

1056-1056: Strict equality for owner check is correct.

The change to repo.owner.login === username is the right comparison for determining ownership.

packages/api/src/routers/projects.ts (5)

1001-1019: Edge case: GitLab subgroups again

This path extraction uses PROVIDER_URL_PATTERNS and builds owner/repo. With GitLab subgroups, owner may contain slashes. Verify drivers accept the full path. If not, we should update the regex and pass the full subgroup path into getRepo.

I can update the regex in packages/api/src/utils/constants.ts to capture group/subgroup paths and adjust call sites if you confirm drivers handle it.


1032-1038: Persisting repoId on insert is correct

Explicitly writing repoId on create ensures NOT NULL is satisfied and avoids a later patch write.


1643-1655: Minor API ergonomics: input shape looks good; LGTM

Switching to a structured input for getUnSubmitted keeps it consistent with the rest of the router and future-proofs it.


65-71: The constants file will be inspected for your review; I’ll follow up with a refined comment once we confirm the patterns.


998-1027: Please verify existing repo_id values in the database before rolling out this change

Repo IDs should always be the native provider identifier (e.g. GitHub’s numeric ID) and never a URL. Here’s the minimal change to enforce that invariant in packages/api/src/routers/projects.ts:

     let repoId = '';

@@
-    if (!repoId) {
-      repoId = input.gitRepoUrl || 'pending';
-    }
+    if (!repoId) {
+      // Keep repoId semantics consistent — do not store URLs in repoId.
+      repoId = 'pending';
+    }

Locations to update:

  • packages/api/src/routers/projects.ts (around lines 998–1027)

Next steps:

  • Run a quick query against your production (or staging) database to see if any rows currently have a URL in repo_id:

    SELECT COUNT(*) AS url_count
      FROM "Project"
     WHERE repo_id LIKE 'http%' OR repo_id ~ '^https?://';
  • If url_count > 0, normalize them before applying the code change:

    UPDATE "Project"
       SET repo_id = 'pending'
     WHERE repo_id LIKE 'http%' OR repo_id ~ '^https?://';
  • Finally, trigger the existing updateRepoIds admin procedure to backfill all numeric IDs.

Let me know once you’ve verified or if you’d like a one-off migration script to automate the normalization step.

packages/db/drizzle/meta/0024_snapshot.json (2)

881-888: Uniqueness stays on git_repo_url — OK

Keeping uniqueness on git_repo_url avoids accidental cross-host collisions, while repo_id remains indexable but not unique. This aligns with the backfill/admin update plan.


617-623: repo_id migration backfill and indexing verified

The 0024_free_khan.sql migration:

  • Adds the repo_id column without NOT NULL
  • Backfills all existing rows by setting repo_id via a CASE on git_host and git_repo_url
  • Applies the NOT NULL constraint
  • Creates project_repo_id_idx on project(repo_id)

The 0025_add_repo_id_index.sql migration:

  • Repeats CREATE INDEX IF NOT EXISTS project_repo_id_idx (safe no-op)

The meta journal confirms both entries (0024_free_khan and 0025_add_repo_id_index).

No further changes required.

Comment on lines +1009 to +1013
async getUnsubmittedRepos(
ctx: Context,
username: string,
userId: string,
): Promise<UnSubmittedRepo[]> {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Signature change: ensure all call sites are updated.

getUnsubmittedRepos now takes ctx, username, userId. Verify all usages were changed accordingly.

Run:


🏁 Script executed:

#!/bin/bash
# Find usages of getUnsubmittedRepos and show context
rg -nP --type=ts -C3 '\bgetUnsubmittedRepos\s*\('

Length of output: 2072


Critical Signature Mismatch in GitLab Driver

The getUnsubmittedRepos method signature has been updated to accept three parameters (ctx, username, userId) in both the interface (packages/api/src/driver/types.ts) and the GitHub driver, but the GitLab driver still only declares a single ctx parameter. This will lead to type‐checking and runtime errors whenever the GitLab implementation is invoked with three arguments.

• File: packages/api/src/driver/gitlab.ts (around line 1011)
– Current signature:
ts async getUnsubmittedRepos(ctx: Context): Promise<UnSubmittedRepo[]> {
– Should be:
diff - async getUnsubmittedRepos(ctx: Context): Promise<UnSubmittedRepo[]> { + async getUnsubmittedRepos(ctx: Context, username: string, userId: string): Promise<UnSubmittedRepo[]> {

• Ensure the implementation uses username and userId to fetch/filter the correct set of repos, mirroring the logic in the GitHub driver.

• After updating, re-run:

rg -nP --type=ts '\bgetUnsubmittedRepos\s*\(' -C2

to confirm all drivers and call sites align with the new three-argument signature.

🤖 Prompt for AI Agents
In packages/api/src/driver/gitlab.ts around line 1011, the getUnsubmittedRepos
signature currently only accepts ctx; update it to async
getUnsubmittedRepos(ctx: Context, username: string, userId: string):
Promise<UnSubmittedRepo[]> and adjust the implementation to use the provided
username and userId when fetching/filtering repos so it mirrors the GitHub
driver logic (use username to scope queries and userId for ownership/filtering
where appropriate), handle missing values defensively, and ensure return types
match; after changes re-run rg -nP --type=ts '\bgetUnsubmittedRepos\s*\(' -C2 to
verify all drivers and call sites use the new three-argument signature.

Comment on lines +25 to 26
repoId: true,
})
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Good call: exclude repoId from client input

Preventing clients from supplying repoId removes a trust boundary issue and centralizes derivation on the server.

Run this quick check to ensure no UI payloads still include repoId:


🏁 Script executed:

#!/bin/bash
# Find any client-side usage of repoId in form payloads or tRPC inputs
rg -nP -C2 '(repoId\s*:|repoId\?)' --type ts --type tsx apps/ packages/ | sed -n '1,200p'

Length of output: 117


To account for the “unrecognized file type: tsx” error, let’s re-run the search using explicit glob patterns instead of --type:


🏁 Script executed:

#!/bin/bash
# Re-run search for any client-side usage of repoId in TS/TSX files
rg -nP -C2 '(repoId\s*:|repoId\?)' -g '*.ts' -g '*.tsx' apps/ packages/ | sed -n '1,200p'

Length of output: 6411


Remove repoId from client‐side payloads

The search shows several spots in the web app where repoId is still part of your client schemas and form data—these must be removed or omitted so clients no longer send it.

Please update the following locations:

  • apps/web/types/project.ts (line 56): remove the repoId: string; field
  • apps/web/forms/index.ts (lines 23 and 56): drop repoId: true from your form Zod schemas
  • apps/web/components/submissions/submission-form.tsx (around line 423): don’t set repoId: 'pending' in the submission payload
  • apps/web/components/admin/project-edit-form.tsx (lines 20–21 and 77): remove repoId from both the Zod schema and the initial form values

Once these references are removed, rerun your payload‐inspection script to confirm no further occurrences.

🤖 Prompt for AI Agents
In packages/api/src/routers/projects.ts around lines 25–26 and across the repo,
remove handling of client-sent repoId and update client code: delete the repoId
field from apps/web/types/project.ts (line 56), remove repoId: true from the Zod
schemas in apps/web/forms/index.ts (lines 23 and 56), stop setting repoId:
'pending' in apps/web/components/submissions/submission-form.tsx (around line
423) and remove repoId from the Zod schema and initial form values in
apps/web/components/admin/project-edit-form.tsx (lines 20–21 and 77); ensure
server accepts/derives repo association internally (or ignores repoId) and rerun
the payload-inspection script to confirm no remaining client-side repoId usages.

Comment on lines +5 to +11
UPDATE "project"
SET "repo_id" =
CASE
WHEN "git_host" IS NOT NULL THEN "git_host" || ':' || "git_repo_url"
ELSE 'unknown:' || "git_repo_url"
END
WHERE "repo_id" IS NULL;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

NULL concatenation will break NOT NULL; use COALESCE and canonicalize

In Postgres, 'unknown:' || "git_repo_url" evaluates to NULL when git_repo_url is NULL, leaving repo_id NULL and causing the subsequent SET NOT NULL to fail. Canonicalize and guard with COALESCE, and normalize case to avoid duplicates from case variance.

-UPDATE "project"
-SET "repo_id" =
-  CASE
-    WHEN "git_host" IS NOT NULL THEN "git_host" || ':' || "git_repo_url"
-    ELSE 'unknown:' || "git_repo_url"
-  END
-WHERE "repo_id" IS NULL;
+UPDATE "project"
+SET "repo_id" = lower(
+  coalesce("git_host", 'unknown') || ':' || coalesce("git_repo_url", 'unknown')
+)
+WHERE "repo_id" IS NULL;
🤖 Prompt for AI Agents
In packages/db/drizzle/0024_free_khan.sql around lines 5 to 11, the UPDATE uses
plain string concatenation which yields NULL if git_repo_url is NULL and doesn't
normalize case; change the expression to guard with COALESCE and canonicalize to
lower-case (e.g. use lower(COALESCE("git_host", 'unknown')) and
lower(COALESCE("git_repo_url", 'unknown')) joined with ':') so repo_id is never
left NULL and case-variants are normalized.

@ahmetskilinc ahmetskilinc merged commit 5e3dee3 into collabute:dev Aug 23, 2025
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants