Enforce v2-only API and config, and SQLx macro checks by aurexav · Pull Request #32 · hack-ink/ELF

aurexav · 2026-02-07T14:20:00Z

Closes #31.

Changes:

Remove all v1 API routes and docs, leaving only /v2 endpoints.
Drop v1 config version toggles and default example config to v2 collection.
Convert SQLx usage to query!/query_as!/query_scalar! and commit offline metadata (.sqlx/).
Add scripts/sqlx-prepare.sh and set SQLX_OFFLINE=true in cargo make lint/test tasks.

Verification:

cargo make fmt-check
cargo make lint
cargo make test
cargo make test-integration (with ELF_PG_DSN, ELF_QDRANT_URL)
cargo make e2e (with ELF_PG_DSN, ELF_QDRANT_URL, ELF_QDRANT_HTTP_URL)

…v2-only API and config","intent":"Remove v1 endpoints and config variants and switch SQLx to macros with offline metadata","impact":"All builds use SQLX_OFFLINE with committed .sqlx and only /v2 routes are supported","breaking":true,"risk":"medium","refs":["gh:#31"]}

packages/elf-service/src/search.rs

 			top_k,
 		};
+
+		let mut items = Vec::with_capacity(results.len());


General approach: enforce a reasonable maximum on the number of search results that can be requested/handled, and ensure allocations are based on a bounded value rather than an unbounded user-controlled value. Additionally, avoid using with_capacity with a tainted size when it’s not necessary.

Best fix here without changing functionality: introduce a hard upper bound for top_k and candidate_k (ideally derived from configuration, otherwise a fixed constant), and clamp the values accordingly. Because results.len() is effectively bounded by top_k, once top_k is clamped we know results.len() is safe. For extra safety and simplicity, we can also replace Vec::with_capacity(results.len()) with Vec::new(), which removes the pre-allocation based on a tainted size at negligible performance cost.
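As an illustration of the allocation concern, here is a minimal sketch of a middle ground that the suggestion itself does not propose: keep the pre-allocation, but cap the capacity hint. The names `MAX_PREALLOC` and `collect_bounded` and the value 1_024 are assumptions made for this example and do not appear in the PR.

```rust
/// Hypothetical cap on how many slots to reserve up front (not from the PR).
const MAX_PREALLOC: usize = 1_024;

/// Converts every element of `results`, using the input length only as a
/// bounded capacity hint. The vector still grows past the cap when needed.
fn collect_bounded<T, U>(results: Vec<T>, mut convert: impl FnMut(usize, T) -> U) -> Vec<U> {
    // The hint never exceeds MAX_PREALLOC, regardless of the input size.
    let mut items = Vec::with_capacity(results.len().min(MAX_PREALLOC));
    for (idx, result) in results.into_iter().enumerate() {
        items.push(convert(idx, result));
    }
    items
}
```

Compared with Vec::new(), this keeps the common small-result case free of reallocations while still decoupling the initial allocation from the requested size.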

Concretely:

In packages/elf-service/src/search.rs, in ElfService::search_raw, after computing top_k and candidate_k, clamp them to fixed maximums. Config fields such as self.cfg.memory.max_top_k / max_candidate_k (or similar) would be the natural source, but since we can’t assume such fields exist, we instead define local constants in this function (or near its top) like const MAX_TOP_K: u32 = 10_000; and clamp top_k and candidate_k to these constants. This ensures results.len() is bounded.

In finish_search in the same file, change let mut items = Vec::with_capacity(results.len()); to let mut items = Vec::new();, which removes the flagged sink while keeping behavior the same.

In packages/elf-service/src/progressive_search.rs, where top_k and candidate_k are recomputed in ElfService::search, apply the same clamping pattern, to ensure the admin/raw search variants also share the bound.

No new methods are required; we just add simple clamping and adjust the one allocation call.
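The clamping steps above can be sketched as a standalone function. This is an illustration only: the PR inlines the logic with `if` checks rather than adding a helper, and `clamp_search_sizes` and its parameter names are hypothetical. The two limits mirror the constants used in the suggested patch.

```rust
/// Upper bounds mirroring the suggested patch; a real deployment may tune them.
const MAX_TOP_K: u32 = 10_000;
const MAX_CANDIDATE_K: u32 = 20_000;

// `clamp` panics if min > max, so check the relationship at compile time.
const _: () = assert!(MAX_TOP_K <= MAX_CANDIDATE_K);

/// Hypothetical helper: resolves request overrides against configured
/// defaults and clamps both values so downstream allocations stay bounded.
fn clamp_search_sizes(
    req_top_k: Option<u32>,
    req_candidate_k: Option<u32>,
    default_top_k: u32,
    default_candidate_k: u32,
) -> (u32, u32) {
    // At least one result, at most MAX_TOP_K.
    let top_k = req_top_k.unwrap_or(default_top_k).clamp(1, MAX_TOP_K);
    // Never fewer candidates than results, never more than MAX_CANDIDATE_K.
    let candidate_k = req_candidate_k
        .unwrap_or(default_candidate_k)
        .clamp(top_k, MAX_CANDIDATE_K);
    (top_k, candidate_k)
}
```

With these bounds, a request asking for `u32::MAX` results resolves to `(10_000, 20_000)`, and a request whose candidate_k is smaller than its top_k has candidate_k raised to top_k.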

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e8158a1be5

chatgpt-codex-connector · 2026-02-07T14:28:27Z

scripts/context-misranking-harness.sh

+if ! command -v taplo >/dev/null 2>&1; then
+  echo "Missing taplo." >&2
+  exit 1


Drop hard failure on missing taplo in e2e harness

The harness now exits immediately when taplo is not installed, which makes cargo make e2e fail in environments that previously worked with only runtime dependencies (curl, psql, jq/jaq). Because the generated TOML is already valid for execution, this formatting-only tool requirement can block experiment runs without improving correctness; making formatting optional (or warning-only) would avoid this regression.

…runtime SQLx queries in tests","intent":"Allow tests to use sqlx::query/query_as/query_scalar without compile-time macros","impact":"Test code no longer depends on SQLx macros while production code remains macro-checked","breaking":false,"risk":"low","refs":["pr:32"]}

github-advanced-security bot found potential problems Feb 7, 2026

chatgpt-codex-connector bot reviewed Feb 7, 2026

aurexav merged commit 08fa6af into main Feb 7, 2026
5 of 6 checks passed

aurexav deleted the feat/v2-only-sqlx branch February 7, 2026 14:36

aurexav restored the feat/v2-only-sqlx branch February 7, 2026 14:37

aurexav deleted the feat/v2-only-sqlx branch February 7, 2026 14:39

packages/elf-service/src/search.rs

@@ -384,8 +384,21 @@
 			return Err(ServiceError::NonEnglishInput { field: "$.query".to_string() });
 		}
-		let top_k = req.top_k.unwrap_or(self.cfg.memory.top_k).max(1);
-		let candidate_k = req.candidate_k.unwrap_or(self.cfg.memory.candidate_k).max(top_k);
+		// Clamp the number of requested results to prevent unbounded allocations.
+		const MAX_TOP_K: u32 = 10_000;
+		const MAX_CANDIDATE_K: u32 = 20_000;
+		let mut top_k = req.top_k.unwrap_or(self.cfg.memory.top_k).max(1);
+		if top_k > MAX_TOP_K {
+			top_k = MAX_TOP_K;
+		}
+		let mut candidate_k =
+			req.candidate_k.unwrap_or(self.cfg.memory.candidate_k).max(top_k);
+		if candidate_k > MAX_CANDIDATE_K {
+			candidate_k = MAX_CANDIDATE_K;
+		}
 		let query = req.query.clone();
 		let read_profile = req.read_profile.clone();
 		let record_hits_enabled = req.record_hits.unwrap_or(false);
@@ -1434,7 +1447,8 @@
 			top_k,
 		};
-		let mut items = Vec::with_capacity(results.len());
+		// Build the items vector without pre-allocating based on a tainted size.
+		let mut items = Vec::new();
 		let mut trace_builder = SearchTraceBuilder::new(trace_context, &self.cfg, now);
 		for (idx, scored_chunk) in results.into_iter().enumerate() {

packages/elf-service/src/progressive_search.rs

@@ -171,9 +171,21 @@
 impl ElfService {
 	pub async fn search(&self, req: SearchRequest) -> ServiceResult<SearchIndexResponse> {
-		let top_k = req.top_k.unwrap_or(self.cfg.memory.top_k).max(1);
-		let candidate_k = req.candidate_k.unwrap_or(self.cfg.memory.candidate_k).max(top_k);
+		// Clamp requested sizes to prevent unbounded allocations in downstream search.
+		const MAX_TOP_K: u32 = 10_000;
+		const MAX_CANDIDATE_K: u32 = 20_000;
+		let mut top_k = req.top_k.unwrap_or(self.cfg.memory.top_k).max(1);
+		if top_k > MAX_TOP_K {
+			top_k = MAX_TOP_K;
+		}
+		let mut candidate_k =
+			req.candidate_k.unwrap_or(self.cfg.memory.candidate_k).max(top_k);
+		if candidate_k > MAX_CANDIDATE_K {
+			candidate_k = MAX_CANDIDATE_K;
+		}
 		let mut raw_req = req.clone();
 		raw_req.top_k = Some(candidate_k);

aurexav merged 2 commits into main from feat/v2-only-sqlx
