Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 17 additions & 1 deletion services/apps/git_integration/src/crowdgit/database/crud.py
Original file line number Diff line number Diff line change
Expand Up @@ -125,14 +125,30 @@ async def acquire_repository(query: str, params: tuple = None) -> Repository | N
async def acquire_recurrent_repo() -> Repository | None:
"""Acquire a regular (non-onboarding) repository, that were not processed in the last x hours (REPOSITORY_UPDATE_INTERVAL_HOURS)"""
recurrent_repo_sql_query = f"""
WITH selected_repo AS (
-- Rate-limit guard: Gerrit (automotivelinux) aggressively rate-limits concurrent connections.
-- This CTE checks if any automotivelinux repo is already being processed,
-- so we skip picking another one from the same host until the current one finishes.
Comment on lines 126 to +130
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description/title suggests preventing concurrent processing for automotivelinux repos in general, but this safeguard only applies to the recurrent acquisition path. Automotivelinux repos can still be acquired concurrently via acquire_onboarding_repo (pending) and acquire_pending_reonboard_repo (pending_reonboard). If the intent is truly “only one at a time”, the same host-level guard should be applied across all acquisition queries (or centralized in acquire_repo_for_processing).

Copilot uses AI. Check for mistakes.
WITH automotivelinux_processing AS (
SELECT 1
FROM git."repositoryProcessing" rp2
JOIN public.repositories r2 ON r2.id = rp2."repositoryId"
WHERE rp2.state = 'processing'
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded state string bypasses parameterized query pattern

Low Severity

The automotivelinux_processing CTE hardcodes rp2.state = 'processing' as a string literal, while every other state comparison in this file uses parameterized values. Parameter $1 already contains RepositoryState.PROCESSING (value "processing") and is used in the same query's UPDATE clause. Using a hardcoded string breaks the established pattern and creates a maintenance risk — if the enum value ever changes, this CTE would silently check for the wrong state.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit db07fb6. Configure here.

AND rp2."lockedAt" IS NOT NULL
AND r2.url LIKE '%gerrit.automotivelinux.org%'
LIMIT 1
Comment on lines +135 to +138
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both checks use url LIKE '%gerrit.automotivelinux.org%' (leading wildcard), which prevents using the existing btree/unique index on repositories.url and can force scans as the table grows. If this runs frequently, consider matching in an index-friendly way (e.g., store/extract hostname in a separate column, use an anchored prefix pattern when possible, or add an appropriate trigram index).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stuck automotivelinux repo permanently blocks all others

High Severity

The automotivelinux_processing CTE has no staleness/timeout check on lockedAt. If a worker crashes (OOM, pod eviction) while processing an automotivelinux repo, that repo remains in state='processing' with lockedAt IS NOT NULL indefinitely. The CTE will then permanently match this stuck row, blocking all other automotivelinux repos from ever being acquired. The stuck repo itself is also excluded by states_to_exclude, so no worker can pick it up to resolve it — the block is permanent until manual DB intervention.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit db07fb6. Configure here.

),
selected_repo AS (
SELECT r.id
FROM public.repositories r
JOIN git."repositoryProcessing" rp ON rp."repositoryId" = r.id
WHERE NOT (rp.state = ANY($2))
AND rp."lockedAt" IS NULL
AND r."deletedAt" IS NULL
AND rp."lastProcessedAt" < NOW() - INTERVAL '1 hour' * $3
AND NOT (
r.url LIKE '%gerrit.automotivelinux.org%'
AND EXISTS (SELECT 1 FROM automotivelinux_processing)
Comment on lines +149 to +150
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hostname pattern '%gerrit.automotivelinux.org%' is duplicated in multiple places in this query; that increases the chance of future drift if it ever needs to change. Consider factoring it into a single SQL value (e.g., a CTE constant) or a Python-level constant used to build the query.

Copilot uses AI. Check for mistakes.
)
Comment on lines +148 to +151
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This guard is not concurrency-safe: two workers can execute this statement concurrently, both evaluate automotivelinux_processing as empty (statement snapshot), and each acquire a different automotivelinux repo row via FOR UPDATE ... SKIP LOCKED, resulting in concurrent processing anyway. To guarantee single-flight per host, use a DB-level mutex (e.g., pg_try_advisory_xact_lock keyed by the host, or a dedicated lock row/table locked FOR UPDATE) acquired atomically during selection.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Race condition allows concurrent automotivelinux repo acquisition

Medium Severity

The automotivelinux_processing CTE reads only committed data (PostgreSQL READ COMMITTED default), and doesn't acquire any locks. Two concurrent calls to acquire_recurrent_repo can both evaluate the CTE simultaneously, both find no automotivelinux repo in processing state, and both proceed to select and lock different automotivelinux repos via SKIP LOCKED. This defeats the stated goal of ensuring only one automotivelinux repo is processed at a time.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit db07fb6. Configure here.

ORDER BY rp.priority ASC, rp."lastProcessedAt" ASC
LIMIT 1
FOR UPDATE OF rp SKIP LOCKED
Expand Down
Loading