feat: adds inventory matcher logic similar to Golang#3114
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior is configurable, and reviews can be managed with CodeRabbit's review commands.
📝 Walkthrough

Adds typed inventory/provider snapshot models, parsers for GPU and storage attributes, a bigint-backed ResourcePair model, mappers from snapshots and GroupSpec to inventory/request units, and a ClusterInventoryMatcher service with comprehensive unit tests for allocation, GPU resolution, and storage semantics.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Codecov Report

❌ Patch coverage is …

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #3114      +/-   ##
==========================================
- Coverage   61.76%   61.22%   -0.55%
==========================================
  Files        1029      995      -34
  Lines       24722    23930     -792
  Branches     6086     5996      -90
==========================================
- Hits        15270    14650     -620
+ Misses       8246     8087     -159
+ Partials     1206     1193      -13
```

*This pull request uses carry-forward flags.*
Force-pushed f26096f to 928e82b.
Actionable comments posted: 4
🧹 Nitpick comments (1)
apps/api/src/bid-screening/lib/inventory-mapper.spec.ts (1)
Lines 25-29: The GPU-sort requirement is not exercised here. This only asserts a one-element `gpu.info`, so the vendor/name/memorySize ordering can regress without failing the suite. Add an unsorted multi-GPU fixture and assert the normalized order, since wildcard matching depends on it.

🤖 Prompt for AI Agents: Verify this finding against the current code and only fix it if needed. In `apps/api/src/bid-screening/lib/inventory-mapper.spec.ts` around lines 25-29, the test "maps GPU info from snapshot node GPUs" does not exercise GPU sorting; update the spec to include an unsorted multi-GPU snapshot via the existing setup() fixture so `inventory.nodes[0].gpu.info` contains multiple entries in unsorted order, then assert that it is normalized (by vendor, name, memorySize, as the mapper expects) and that `gpu.quantity.toState().allocatable` reflects the combined quantity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/api/src/bid-screening/lib/inventory-mapper.ts`:
- Around line 20-29: The GPU memory sort is comparing memorySize
lexicographically; update the comparator used in the gpuInfo sort so it extracts
the numeric prefix from memorySize (e.g. via a regex like /^(\d+(\.\d+)?)/ and
parseFloat) and compares those numbers numerically (treat non-parsable values as
0), and only fall back to localeCompare on the original memorySize string if the
numeric values are equal; change the sort inside the gpuInfo construction (the
.sort callback that currently uses a.memorySize.localeCompare(b.memorySize)) so
the first GPU chosen by const first = node.gpu.info[0] will reflect true numeric
memory ordering.
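As a hedged sketch of the numeric-prefix comparator described above (the `GpuInfo` shape and helper names here are illustrative, not the PR's actual code):

```typescript
// Sketch of a numeric-first comparator for memorySize strings like "80Gi".
interface GpuInfo {
  vendor: string;
  name: string;
  memorySize: string;
}

// Extract the leading numeric prefix; non-parsable values compare as 0.
const memValue = (s: string): number => {
  const m = /^(\d+(?:\.\d+)?)/.exec(s);
  return m ? Number.parseFloat(m[1]) : 0;
};

const compareGpuInfo = (a: GpuInfo, b: GpuInfo): number =>
  a.vendor.localeCompare(b.vendor) ||
  a.name.localeCompare(b.name) ||
  memValue(a.memorySize) - memValue(b.memorySize) ||
  // Fall back to the raw string only when the numeric values tie.
  a.memorySize.localeCompare(b.memorySize);
```

With this comparator, `"80Gi"` sorts before `"100Gi"`, which a plain `localeCompare` gets backwards.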
In `@apps/api/src/bid-screening/lib/storage-attribute-parser.ts`:
- Around line 18-19: The parser currently assigns storageClass when attr.key ===
"class" even if attr.value is whitespace-only; update the logic that sets
storageClass (the branch checking attr.key === "class") to trim attr.value and
only assign storageClass if the trimmed value is non-empty, otherwise leave it
undefined/ignore the attribute; apply the same trimmed-nonempty check to the
similar branches handling other storage-related attributes in this file (lines
around 27-38) so whitespace-only values are rejected consistently.
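A minimal sketch of that trimmed-nonempty guard; the `Attribute` shape and function name are assumptions for illustration, not the parser's actual code:

```typescript
// Sketch: only accept a "class" attribute whose trimmed value is non-empty.
interface Attribute {
  key: string;
  value: string;
}

function parseStorageClass(attributes: Attribute[]): string | undefined {
  for (const attr of attributes) {
    if (attr.key === "class") {
      const trimmed = attr.value.trim();
      // Whitespace-only values are ignored rather than assigned.
      if (trimmed.length > 0) return trimmed;
    }
  }
  return undefined;
}
```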
In
`@apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts`:
- Around line 97-100: The code currently replaces group GPU attributes with
canonical.gpuSpecs (losing specific fields); instead, parse both
canonical.gpuSpecs and group.resources.gpu.attributes (using
parseGPUAttributes), compute an intersection/merge where for each GPU field you
keep the more specific (non-null) value and if both are specified and different
treat as a conflict (return failure / set resolvedGpuSpecs undefined or throw),
then pass the merged spec into this.#tryAdjustGPU (instead of
canonical.gpuSpecs) and assign resolvedGpuSpecs from that merged result; update
the logic around resolvedGpuSpecs, effectiveSpecs, and `#tryAdjustGPU` to use this
intersection and to fail on conflicting values.
- Around line 144-179: The `#tryAdjustGPU` method currently only decrements the
aggregate node.gpu.quantity and can match the same node.gpu.info entry multiple
times; change it to track per-device availability by consuming matched entries
from node.gpu.info (or building a local map of available devices indexed by
vendor/model/interface/ram) as you match each unit via matchesGPU, decrementing
or marking that specific entry so it cannot be reused, and only call
node.gpu.quantity.allocate(requestedUnits) after successfully reserving the
individual entries; update resolvedAttr/resolvedVendor/resolvedModel from the
actual consumed entry and return GPU_CHECK_FAIL if you cannot reserve distinct
device entries for each requested unit.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9c8b5c1c-64b9-4e91-8369-180046251be3
📒 Files selected for processing (12)
- apps/api/src/bid-screening/lib/gpu-attribute-parser.spec.ts
- apps/api/src/bid-screening/lib/gpu-attribute-parser.ts
- apps/api/src/bid-screening/lib/groupspec-mapper.spec.ts
- apps/api/src/bid-screening/lib/groupspec-mapper.ts
- apps/api/src/bid-screening/lib/inventory-mapper.spec.ts
- apps/api/src/bid-screening/lib/inventory-mapper.ts
- apps/api/src/bid-screening/lib/resource-pair.spec.ts
- apps/api/src/bid-screening/lib/resource-pair.ts
- apps/api/src/bid-screening/lib/storage-attribute-parser.spec.ts
- apps/api/src/bid-screening/lib/storage-attribute-parser.ts
- apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.spec.ts
- apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts
Force-pushed 69dc134 to eb6d1bd.
Actionable comments posted: 3
♻️ Duplicate comments (3)
apps/api/src/bid-screening/lib/inventory-mapper.ts (1)
Lines 20-28: ⚠️ Potential issue | 🟠 Major — Sort GPU memory numerically, not lexicographically.

`"100Gi"` currently sorts before `"80Gi"`. That matters because `ClusterInventoryMatcherService.#tryAdjustGPU()` falls back to `node.gpu.info[0]` when no GPU attrs are provided, so the wrong comparator changes which SKU becomes canonical.

Suggested fix:

```diff
 const gpuInfo: GpuInfo[] = (node.gpus || [])
   .map(gpu => ({ vendor: gpu.vendor, name: gpu.name, modelId: gpu.modelId, interface: gpu.interface, memorySize: gpu.memorySize }))
-  .sort((a, b) => a.vendor.localeCompare(b.vendor) || a.name.localeCompare(b.name) || a.memorySize.localeCompare(b.memorySize));
+  .sort((a, b) => {
+    const vendorCmp = a.vendor.localeCompare(b.vendor);
+    if (vendorCmp) return vendorCmp;
+
+    const nameCmp = a.name.localeCompare(b.name);
+    if (nameCmp) return nameCmp;
+
+    const aMem = Number.parseFloat(a.memorySize);
+    const bMem = Number.parseFloat(b.memorySize);
+
+    return (Number.isNaN(aMem) ? 0 : aMem) - (Number.isNaN(bMem) ? 0 : bMem) || a.memorySize.localeCompare(b.memorySize);
+  });
```

🤖 Prompt for AI Agents: Verify this finding against the current code and only fix it if needed. In `apps/api/src/bid-screening/lib/inventory-mapper.ts` around lines 20-28, the GPU memory is sorted lexicographically via `memorySize.localeCompare`, which misorders values like "100Gi" vs "80Gi"; update the `.sort(...)` comparator where `gpuInfo` is built so memorySize is compared numerically, leaving the vendor and name tie-breakers in place, to ensure correct canonical GPU selection by `ClusterInventoryMatcherService.#tryAdjustGPU()`.

apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts (2)
Lines 165-179: ⚠️ Potential issue | 🔴 Critical — Reserve concrete GPU entries across replicas.

`node.gpu.quantity` is decremented, but `node.gpu.info` never changes. The next replica can therefore match the same `GpuInfo` entry again, and mixed nodes with `gpuAllocated > 0` are still ambiguous because the matcher never knows which exact devices are already consumed.

🤖 Prompt for AI Agents: Verify this finding against the current code and only fix it if needed. In `cluster-inventory-matcher.service.ts` around lines 165-179, the matcher picks `GpuInfo` entries but never marks them consumed, so subsequent replicas can re-match the same GPU; each time an entry is selected via `matchesGPU`, reserve, remove, or flag that exact entry in `node.gpu.info` (e.g., splice it from the array or set a consumed flag that `matchesGPU` respects) before or immediately after `node.gpu.quantity.allocate(requestedUnits)`; keep using `resolvedVendor`/`resolvedModel`/`resolvedAttr` as before, and only return `GPU_CHECK_FAIL` if allocation fails or `remaining > 0` after processing.
Lines 97-100: ⚠️ Potential issue | 🟠 Major — Don't drop the current group's GPU constraints once canonicalization starts.

After the first allocation, line 99 ignores `group.resources.gpu.attributes` entirely. A later group that adds `ram/80Gi` or `interface/sxm` is then validated only against the earlier canonical spec, so stricter constraints disappear instead of being intersected.

🤖 Prompt for AI Agents: Verify this finding against the current code and only fix it if needed. In `cluster-inventory-matcher.service.ts` around lines 97-100, intersect `canonical.gpuSpecs` with `group.resources.gpu.attributes` instead of replacing them: compute an effective GPU spec by merging the canonical spec with `parseGPUAttributes(group.resources.gpu.attributes)`, pass the combined spec into `this.#tryAdjustGPU(nodeCopy, group.resources.gpu.units, effectiveSpecs)`, and assign `resolvedGpuSpecs` from the intersection result.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/api/src/bid-screening/lib/inventory-mapper.ts`:
- Line 1: The import in inventory-mapper.ts is pointing to a non-existent
repository path; replace the current import of ProviderWithSnapshot with the
correct aliased module by importing ProviderWithSnapshot from
"@src/bid-screening/types/provider" so TypeScript can resolve the type and
restore downstream type inference (update the import statement that currently
references ProviderWithSnapshot in this file).
In `@apps/api/src/bid-screening/lib/resource-pair.ts`:
- Around line 14-17: The available() method currently maps the UNLIMITED
sentinel to BigInt(Number.MAX_SAFE_INTEGER); change it to return the UNLIMITED
sentinel directly when this.allocatable === UNLIMITED. Keep the existing diff
compute for finite values (const diff = this.allocatable - this.allocated;
return diff > 0n ? diff : 0n;) but replace the BigInt(Number.MAX_SAFE_INTEGER)
branch with: if (this.allocatable === UNLIMITED) return UNLIMITED;, ensuring
callers and tests (resource-pair.spec.ts) continue to see -1n as the unlimited
marker.
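A self-contained sketch of the behavior this comment asks for, assuming the `-1n` sentinel named `UNLIMITED` it describes (the class shape is illustrative, not the PR's actual `ResourcePair`):

```typescript
// Sketch of a bigint-backed pair whose unlimited sentinel survives available().
const UNLIMITED = -1n;

class ResourcePair {
  constructor(
    public allocatable: bigint,
    public allocated: bigint = 0n
  ) {}

  available(): bigint {
    // Preserve the sentinel so callers can test for it directly.
    if (this.allocatable === UNLIMITED) return UNLIMITED;
    const diff = this.allocatable - this.allocated;
    return diff > 0n ? diff : 0n;
  }
}
```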
In
`@apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts`:
- Around line 160-188: The loop currently accepts any matching GPU info and then
records the last match, allowing mixed SKUs to satisfy a multi-GPU request;
change it to pin to the first concrete SKU found and require all subsequent
matches to be the same SKU. Concretely: when iterating node.gpu.info, on the
first attr match save canonicalInfo (vendor/name) and canonicalAttr
(ram/interface), decrement remaining; for subsequent matches only accept entries
that equal the canonical SKU (compare info.name, info.vendor.toLowerCase(), and
attr.ram and attr.interface) and decrement; if you encounter a matching spec
that does not equal the canonical SKU skip or cause failure so that remaining
won’t reach zero with mixed SKUs; after the loop if remaining>0 return
GPU_CHECK_FAIL, then call node.gpu.quantity.allocate(requestedUnits) and return
resolved fields using the saved canonicalInfo/canonicalAttr (resolvedVendor,
resolvedModel, resolvedAttr).
📒 Files selected for processing (13)
- apps/api/src/bid-screening/lib/gpu-attribute-parser.spec.ts
- apps/api/src/bid-screening/lib/gpu-attribute-parser.ts
- apps/api/src/bid-screening/lib/groupspec-mapper.spec.ts
- apps/api/src/bid-screening/lib/groupspec-mapper.ts
- apps/api/src/bid-screening/lib/inventory-mapper.spec.ts
- apps/api/src/bid-screening/lib/inventory-mapper.ts
- apps/api/src/bid-screening/lib/resource-pair.spec.ts
- apps/api/src/bid-screening/lib/resource-pair.ts
- apps/api/src/bid-screening/lib/storage-attribute-parser.spec.ts
- apps/api/src/bid-screening/lib/storage-attribute-parser.ts
- apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.spec.ts
- apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts
- apps/api/src/bid-screening/types/inventory.types.ts
✅ Files skipped from review due to trivial changes (3)
- apps/api/src/bid-screening/types/inventory.types.ts
- apps/api/src/bid-screening/lib/gpu-attribute-parser.spec.ts
- apps/api/src/bid-screening/lib/resource-pair.spec.ts
🚧 Files skipped from review as they are similar to previous changes (3)
- apps/api/src/bid-screening/lib/groupspec-mapper.ts
- apps/api/src/bid-screening/lib/storage-attribute-parser.ts
- apps/api/src/bid-screening/lib/storage-attribute-parser.spec.ts
Actionable comments posted: 4
♻️ Duplicate comments (4)
apps/api/src/bid-screening/lib/inventory-mapper.ts (1)
Lines 20-29: ⚠️ Potential issue | 🟠 Major — Sort GPU RAM numerically, not lexicographically.

`memorySize.localeCompare(...)` puts values like `"100Gi"` before `"80Gi"`, so the default GPU later chosen from `node.gpu.info[0]` can be the wrong SKU when the request omits GPU details.

Suggested fix:

```diff
+const parseGpuRam = (value: string) => Number.parseFloat(value.match(/^(\d+(?:\.\d+)?)/)?.[1] ?? "0");
+
 const gpuInfo: GpuInfo[] = (node.gpus || [])
   .map(gpu => ({ vendor: gpu.vendor, name: gpu.name, modelId: gpu.modelId, interface: gpu.interface, memorySize: gpu.memorySize }))
-  .sort((a, b) => a.vendor.localeCompare(b.vendor) || a.name.localeCompare(b.name) || a.memorySize.localeCompare(b.memorySize));
+  .sort(
+    (a, b) =>
+      a.vendor.localeCompare(b.vendor) ||
+      a.name.localeCompare(b.name) ||
+      parseGpuRam(a.memorySize) - parseGpuRam(b.memorySize) ||
+      a.memorySize.localeCompare(b.memorySize)
+  );
```

🤖 Prompt for AI Agents: Verify this finding against the current code and only fix it if needed. In `inventory-mapper.ts` around lines 20-29, the comparator for `gpuInfo` uses `memorySize.localeCompare`, which sorts lexicographically (so "100Gi" < "80Gi"); parse the numeric portion of `memorySize` (normalizing units like Ki/Mi/Gi if needed) and compare numerically, keeping the existing vendor and name comparisons and using the numeric memory comparison as the tie-breaker.

apps/api/src/bid-screening/lib/resource-pair.ts (1)
Lines 14-17: ⚠️ Potential issue | 🟠 Major — Keep the unlimited sentinel intact.

`available()` rewrites `-1n` to `Number.MAX_SAFE_INTEGER`, but the rest of this type treats `-1n` as the unlimited marker. Returning a finite value breaks direct sentinel checks and makes "unlimited" look exhaustible.

Suggested fix:

```diff
 available(): bigint {
-  if (this.allocatable === UNLIMITED) return BigInt(Number.MAX_SAFE_INTEGER);
+  if (this.allocatable === UNLIMITED) return UNLIMITED;
   const diff = this.allocatable - this.allocated;
   return diff > 0n ? diff : 0n;
 }
```

🤖 Prompt for AI Agents: Verify this finding against the current code and only fix it if needed. In `resource-pair.ts` around lines 14-17, `available()` converts the UNLIMITED sentinel to a finite `Number.MAX_SAFE_INTEGER`, which breaks code that checks for the `-1n` sentinel; return `UNLIMITED` itself when `this.allocatable === UNLIMITED`, and otherwise return the bigint diff (`this.allocatable - this.allocated`, floored at `0n`) so callers can still perform direct sentinel checks.

apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts (2)
Lines 97-100: ⚠️ Potential issue | 🟠 Major — Merge canonical GPU specs with the current request; don't replace it.

Once `canonical.gpuSpecs` is set, the current group's GPU attrs are ignored. A broad `vendor/nvidia` match followed by `vendor/nvidia/model/a100/ram/80Gi` is validated against the broad spec, so the stricter model/RAM requirement disappears.

🤖 Prompt for AI Agents: Verify this finding against the current code and only fix it if needed. In `cluster-inventory-matcher.service.ts` around lines 97-100, build `effectiveSpecs` by parsing `group.resources.gpu.attributes` and combining the result with `canonical.gpuSpecs` (rather than replacing the parsed specs) so both canonical and per-request constraints are considered when calling `#tryAdjustGPU`.
Lines 144-188: ⚠️ Potential issue | 🔴 Critical — Reserve concrete GPU entries, and pin all units to one concrete SKU.

This loop only decrements aggregate `quantity`; it never consumes the matched `node.gpu.info` entries, and it can satisfy one request with mixed concrete devices while returning a single canonical result. That lets already-counted devices be reused and allows mixed variants like `{A100 40Gi, A100 80Gi}` to satisfy a multi-GPU wildcard request.

🤖 Prompt for AI Agents: Verify this finding against the current code and only fix it if needed. In `cluster-inventory-matcher.service.ts` around lines 144-188, change `#tryAdjustGPU` to group `node.gpu.info` by a canonical key (vendor+name+ram+interface), find one group whose count >= `Number(requestedUnits)`, set `resolvedVendor`/`resolvedModel`/`resolvedAttr` from that SKU, and only then call `node.gpu.quantity.allocate(requestedUnits)` (or otherwise mark those specific info entries consumed) so the exact units are reserved; never mix different info entries to satisfy one request.
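A grouping-based sketch of that approach (types and names here are illustrative, not the service's actual code):

```typescript
// Sketch: pin a multi-GPU request to one concrete SKU by grouping devices
// on a canonical key and requiring a single group to cover the whole request.
interface GpuDevice {
  vendor: string;
  name: string;
  memorySize: string;
  interface: string;
}

function reserveGpus(devices: GpuDevice[], requestedUnits: number): GpuDevice[] | undefined {
  const groups = new Map<string, GpuDevice[]>();
  for (const d of devices) {
    const key = `${d.vendor}/${d.name}/${d.memorySize}/${d.interface}`;
    const group = groups.get(key) ?? [];
    group.push(d);
    groups.set(key, group);
  }
  // Pick the first SKU with enough identical devices; mixed SKUs never qualify.
  for (const group of groups.values()) {
    if (group.length >= requestedUnits) return group.slice(0, requestedUnits);
  }
  return undefined;
}
```

Under this sketch, two A100 variants with different RAM cannot jointly satisfy a two-GPU request, which matches the review's intent.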
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/api/src/bid-screening/lib/gpu-attribute-parser.ts`:
- Around line 25-57: The parser currently silently accepts malformed GPU
attribute paths (odd segment counts or unknown keys) and widens them to
wildcards; update the parsing in gpu-attribute-parser.ts so that if parts.length
is odd you throw an Error indicating the attribute key is malformed, and add a
default branch in the switch over key (or an explicit validation) to throw an
Error when an unexpected segment name is encountered (referencing variables
parts, key, value and the switch block that sets vendor/model/ram/interface);
also validate that recognized keys have a non-empty value before assigning to
vendor/model/ram/iface so malformed pairs like "vendor/" are rejected rather
than producing wildcard model.
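The strict parsing described above can be sketched as follows; the function name, the `iface` field, and the error messages are illustrative assumptions, not the PR's actual `parseGPUAttributes`:

```typescript
// Sketch: reject odd segment counts, unknown keys, and empty values
// instead of silently widening malformed paths to wildcards.
interface ParsedGpu {
  vendor?: string;
  model?: string;
  ram?: string;
  iface?: string;
}

function parseGpuAttributePath(path: string): ParsedGpu {
  const parts = path.split("/");
  if (parts.length % 2 !== 0) throw new Error(`malformed GPU attribute: ${path}`);
  const spec: ParsedGpu = {};
  for (let i = 0; i < parts.length; i += 2) {
    const key = parts[i];
    const value = parts[i + 1];
    if (!value) throw new Error(`empty value for "${key}" in: ${path}`);
    switch (key) {
      case "vendor": spec.vendor = value; break;
      case "model": spec.model = value; break;
      case "ram": spec.ram = value; break;
      case "interface": spec.iface = value; break;
      default: throw new Error(`unknown GPU attribute key "${key}" in: ${path}`);
    }
  }
  return spec;
}
```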
In `@apps/api/src/bid-screening/lib/groupspec-mapper.ts`:
- Around line 13-40: In mapGroupSpecToResourceUnits validate unit.count on each
request.resources element: ensure Number.isInteger(unit.count) && unit.count >=
0 and if not, throw or return a descriptive error (e.g., throw new Error) so
invalid negative or fractional counts are rejected before being returned as the
RequestedResourceUnit.count; perform this check inside the map iteration
(referencing unit and unit.count) and do not silently coerce values.
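The guard is small; as a sketch (the function name is an assumption, and the real check would live inside the map over `request.resources`):

```typescript
// Sketch: reject negative or fractional unit counts instead of coercing them.
function validateUnitCount(count: number): number {
  if (!Number.isInteger(count) || count < 0) {
    throw new Error(`invalid resource unit count: ${count}`);
  }
  return count;
}
```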
In `@apps/api/src/bid-screening/lib/storage-attribute-parser.spec.ts`:
- Line 3: Update the test import to use the backend path alias: replace the
relative import of storage-attribute-parser with the `@src` path form so
parseStorageAttributes is imported via the project's backend alias; specifically
change the import that currently references "./storage-attribute-parser" to use
the corresponding "@src/..." module that exports parseStorageAttributes to keep
imports consistent with backend apps and test path-alias rules.
In
`@apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts`:
- Around line 79-85: The code only checks aggregate CPU capacity
(node.cpu.canAllocate) but never verifies requested CPU attributes against
individual sockets, so requests with vendor/model constraints can match
impossible nodes; update the matching logic to: inspect group.resources.cpu
attribute fields (e.g. vendor/model or attrs) and search node.cpus for at least
one CPU entry whose attributes match and which can allocate the requested units,
fail if none match, then when reserving use that specific cpu on the copied node
(use nodeCopy.cpus[i].allocate(...)) instead of only nodeCopy.cpu.allocate; keep
existing memory/GPU allocation checks as-is.
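A per-socket check along those lines might look like this sketch (all shapes and names are assumptions; the real service uses its own node/CPU types):

```typescript
// Sketch: find one CPU entry whose attributes satisfy the request and which
// still has capacity, rather than testing only the node-level aggregate.
interface CpuEntry {
  vendor?: string;
  model?: string;
  available: bigint;
}

interface CpuRequest {
  vendor?: string;
  model?: string;
  units: bigint;
}

function findMatchingCpu(cpus: CpuEntry[], req: CpuRequest): CpuEntry | undefined {
  return cpus.find(
    cpu =>
      (req.vendor === undefined || cpu.vendor === req.vendor) &&
      (req.model === undefined || cpu.model === req.model) &&
      cpu.available >= req.units
  );
}
```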
📒 Files selected for processing (14)
- apps/api/src/bid-screening/lib/gpu-attribute-parser.spec.ts
- apps/api/src/bid-screening/lib/gpu-attribute-parser.ts
- apps/api/src/bid-screening/lib/groupspec-mapper.spec.ts
- apps/api/src/bid-screening/lib/groupspec-mapper.ts
- apps/api/src/bid-screening/lib/inventory-mapper.spec.ts
- apps/api/src/bid-screening/lib/inventory-mapper.ts
- apps/api/src/bid-screening/lib/resource-pair.spec.ts
- apps/api/src/bid-screening/lib/resource-pair.ts
- apps/api/src/bid-screening/lib/storage-attribute-parser.spec.ts
- apps/api/src/bid-screening/lib/storage-attribute-parser.ts
- apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.spec.ts
- apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts
- apps/api/src/bid-screening/types/inventory.types.ts
- apps/api/src/bid-screening/types/provider.ts
✅ Files skipped from review due to trivial changes (4)
- apps/api/src/bid-screening/types/provider.ts
- apps/api/src/bid-screening/lib/inventory-mapper.spec.ts
- apps/api/src/bid-screening/lib/storage-attribute-parser.ts
- apps/api/src/bid-screening/lib/groupspec-mapper.spec.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- apps/api/src/bid-screening/lib/resource-pair.spec.ts
- apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.spec.ts
Force-pushed eb6d1bd to 8cd5771.
Actionable comments posted: 1
♻️ Duplicate comments (4)
apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts (3)
97-100:⚠️ Potential issue | 🟠 MajorDon't replace the current group's GPU request with the canonical one.
At Line 99, once
canonical.gpuSpecsis set, the current group's GPU attrs are discarded. A broad first match can therefore erase later requirements like RAM or interface. Intersect/merge the canonical spec withgroup.resources.gpu.attributesand fail on conflicts instead of ignoring the current request.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts` around lines 97 - 100, The code currently discards group.resources.gpu.attributes when canonical.gpuSpecs exists; instead compute an intersected/merged GPU spec and only pass that to `#tryAdjustGPU`: produce effectiveSpecs by intersecting canonical.gpuSpecs with group.resources.gpu.attributes (detecting conflicting fields like RAM, interface, vendor, etc.), set resolvedGpuSpecs to the merged result, and if any conflict is found throw/return a failure rather than silently using canonical.gpuSpecs; update the call site that currently uses [canonical.gpuSpecs] to use the merged/validated spec and keep nodeCopy usage and `#tryAdjustGPU` unchanged.
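One way to implement the intersect-and-fail-on-conflict behavior is a field-by-field merge. The sketch below uses hypothetical spec shapes, not the service's actual types:

```typescript
// Hypothetical GPU spec shape for the sketch; the real types differ.
interface GpuSpec {
  vendor?: string;
  model?: string;
  ram?: string;
  interface?: string;
}

// Merge the canonical spec with the current group's requested attributes.
// A field present on both sides must agree; otherwise the merge fails and
// the caller should reject the placement instead of dropping the request.
function mergeGpuSpecs(canonical: GpuSpec, requested: GpuSpec): GpuSpec | null {
  const merged: GpuSpec = { ...canonical };
  for (const key of ["vendor", "model", "ram", "interface"] as const) {
    const fromCanonical = canonical[key];
    const fromRequest = requested[key];
    if (fromCanonical !== undefined && fromRequest !== undefined && fromCanonical !== fromRequest) {
      return null; // conflicting requirement, e.g. ram 40Gi vs 80Gi
    }
    if (fromRequest !== undefined) {
      merged[key] = fromRequest; // the request narrows the canonical spec
    }
  }
  return merged;
}
```

A `null` result would then map to the same failure path as any other unsatisfiable GPU request, rather than silently using the broader canonical spec.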
160-179: ⚠️ Potential issue | 🔴 Critical
Reserve concrete GPU entries, not just aggregate quantity.
This loop matches against `node.gpu.info` but never consumes the selected entries, and it also allows a multi-GPU request to mix different SKUs on the same node. On `{A100, V100}`, two successive `1x A100` placements can both reuse the `A100` entry, and a broad `2x nvidia` request can succeed with `{A100, V100}`. Track the exact devices you reserve, remove/mark them unavailable, and pin all units in the reservation to the same concrete SKU.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts` around lines 160 - 179, The loop currently decrements remaining and sets resolvedVendor/resolvedModel/resolvedAttr but never removes or marks specific entries in node.gpu.info and still calls node.gpu.quantity.allocate(requestedUnits), allowing reuse/mixing of SKUs; change the logic to select and reserve concrete GPU entries: iterate node.gpu.info, use matchesGPU(spec, info) to collect and mark (or remove) the exact info objects into a reserved list until you have requestedUnits, ensure all reserved entries match the same resolvedAttr (fail if a different SKU would be required), and then perform allocation against those concrete entries (or update node.gpu.quantity after marking entries reserved) instead of blindly calling node.gpu.quantity.allocate(requestedUnits); update resolvedVendor/resolvedModel/resolvedAttr from the first reserved entry and return GPU_CHECK_FAIL when you cannot reserve the exact devices.
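A minimal sketch of the reserve-and-consume approach (entry shapes and the helper name are hypothetical, not the service's real inventory types):

```typescript
// Hypothetical GPU entry shape; the real node.gpu.info entries may differ.
interface GpuInfo { vendor: string; name: string; memorySize: string }
type GpuMatcher = (info: GpuInfo) => boolean;

// Reserve `count` identical GPUs that satisfy `matches`, consuming them
// from `available`. Returns the reserved entries, or null if the node
// cannot supply `count` units of a single matching SKU.
function reserveGpus(available: GpuInfo[], matches: GpuMatcher, count: number): GpuInfo[] | null {
  // Group matching entries by concrete SKU so a request never mixes models.
  const bySku = new Map<string, GpuInfo[]>();
  for (const info of available) {
    if (!matches(info)) continue;
    const sku = `${info.vendor}/${info.name}/${info.memorySize}`;
    const bucket = bySku.get(sku) ?? [];
    bucket.push(info);
    bySku.set(sku, bucket);
  }
  for (const bucket of bySku.values()) {
    if (bucket.length >= count) {
      const reserved = bucket.slice(0, count);
      // Consume the reserved entries so later placements cannot reuse them.
      for (const entry of reserved) available.splice(available.indexOf(entry), 1);
      return reserved;
    }
  }
  return null;
}
```

With this shape, the `{A100, V100}` examples above fail and succeed as intended: a `2x nvidia` request finds no single SKU with two units, and a second `1x A100` placement no longer sees the already-consumed entry.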
79-85: ⚠️ Potential issue | 🟠 Major
CPU attribute constraints are still ignored.
Lines 79-85 only check aggregate CPU units. A request with CPU vendor/model attrs can still land on any node with enough capacity because `node.cpus` is never consulted, so the matcher can accept impossible placements.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts` around lines 79 - 85, The current check only uses node.cpu.canAllocate and then node.cpu.allocate, which ignores per-core attributes in node.cpus and allows placements that don't meet CPU vendor/model constraints; update the matching logic in cluster-inventory-matcher.service so that before calling node.cpu.allocate you (1) inspect group.resources.cpu.attrs (vendor/model) and verify node.cpus contains the required number/type of CPU entries that satisfy those attrs, (2) select/reserve the specific CPU entries (e.g., build a list of matching CPU cores from node.cpus), and (3) fail the node if there aren't enough matching CPU cores; replace the generic node.cpu.allocate call with allocation of those selected cpu entries (or call a new helper like allocateCpusOnNode that marks the specific node.cpus as used) so allocations honor per-CPU attributes rather than only aggregate units.
apps/api/src/bid-screening/lib/groupspec-mapper.ts (1)
40-40: ⚠️ Potential issue | 🟠 Major
Validate `unit.count` before returning it.
`count` becomes the decrementing loop counter in the matcher. Negative values skip placement entirely, and fractional values over-allocate because `remaining--` runs on non-integers. Guard with `Number.isInteger(unit.count) && unit.count >= 0` here instead of passing the raw value through.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/api/src/bid-screening/lib/groupspec-mapper.ts` at line 40, The code currently returns unit.count directly (count: unit.count) which can be negative or non-integer; change it to validate and sanitize before returning by checking Number.isInteger(unit.count) && unit.count >= 0 and using the validated value (or defaulting to 0) for the count field so the matcher never receives negative or fractional counts; update the mapping in groupspec-mapper.ts where unit.count is assigned to count to perform this guard.
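A guard along the lines the comment suggests could look like this (the helper name is hypothetical; throwing vs. defaulting to 0 is a design choice for the mapper):

```typescript
// Reject counts that would misbehave as a decrementing loop counter:
// negative values skip placement entirely, and fractional values make
// the `remaining--` loop run an extra iteration before going negative.
function validateUnitCount(count: number): number {
  if (!Number.isInteger(count) || count < 0) {
    throw new Error(`invalid resource unit count: ${count}`);
  }
  return count;
}
```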
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.spec.ts`:
- Around line 422-450: The test "places mixed CPU+GPU groups on appropriate
nodes" currently only passes cpuGroup to ClusterInventoryMatcherService.match(),
so it never exercises GPU placement; update the test to include a GPU-bearing
RequestedResourceUnit (e.g., create a gpuGroup with gpu.units > 0 and matching
memory/storage) and call service.match(provider, [cpuGroup, gpuGroup]) so
ClusterInventoryMatcherService.match() evaluates mixed placement, and add
meaningful assertions verifying GPU-specific placement (not just
result.matched), and refactor the spec to use a setup() helper instead of shared
mutable state to comply with spec conventions.
---
Duplicate comments:
In `@apps/api/src/bid-screening/lib/groupspec-mapper.ts`:
- Line 40: The code currently returns unit.count directly (count: unit.count)
which can be negative or non-integer; change it to validate and sanitize before
returning by checking Number.isInteger(unit.count) && unit.count >= 0 and using
the validated value (or defaulting to 0) for the count field so the matcher
never receives negative or fractional counts; update the mapping in
groupspec-mapper.ts where unit.count is assigned to count to perform this guard.
In
`@apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts`:
- Around line 97-100: The code currently discards group.resources.gpu.attributes
when canonical.gpuSpecs exists; instead compute an intersected/merged GPU spec
and only pass that to `#tryAdjustGPU`: produce effectiveSpecs by intersecting
canonical.gpuSpecs with group.resources.gpu.attributes (detecting conflicting
fields like RAM, interface, vendor, etc.), set resolvedGpuSpecs to the merged
result, and if any conflict is found throw/return a failure rather than silently
using canonical.gpuSpecs; update the call site that currently uses
[canonical.gpuSpecs] to use the merged/validated spec and keep nodeCopy usage
and `#tryAdjustGPU` unchanged.
- Around line 160-179: The loop currently decrements remaining and sets
resolvedVendor/resolvedModel/resolvedAttr but never removes or marks specific
entries in node.gpu.info and still calls
node.gpu.quantity.allocate(requestedUnits), allowing reuse/mixing of SKUs;
change the logic to select and reserve concrete GPU entries: iterate
node.gpu.info, use matchesGPU(spec, info) to collect and mark (or remove) the
exact info objects into a reserved list until you have requestedUnits, ensure
all reserved entries match the same resolvedAttr (fail if a different SKU would
be required), and then perform allocation against those concrete entries (or
update node.gpu.quantity after marking entries reserved) instead of blindly
calling node.gpu.quantity.allocate(requestedUnits); update
resolvedVendor/resolvedModel/resolvedAttr from the first reserved entry and
return GPU_CHECK_FAIL when you cannot reserve the exact devices.
- Around line 79-85: The current check only uses node.cpu.canAllocate and then
node.cpu.allocate, which ignores per-core attributes in node.cpus and allows
placements that don't meet CPU vendor/model constraints; update the matching
logic in cluster-inventory-matcher.service so that before calling
node.cpu.allocate you (1) inspect group.resources.cpu.attrs (vendor/model) and
verify node.cpus contains the required number/type of CPU entries that satisfy
those attrs, (2) select/reserve the specific CPU entries (e.g., build a list of
matching CPU cores from node.cpus), and (3) fail the node if there aren’t enough
matching CPU cores; replace the generic node.cpu.allocate call with allocation
of those selected cpu entries (or call a new helper like allocateCpusOnNode that
marks the specific node.cpus as used) so allocations honor per-CPU attributes
rather than only aggregate units.
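The per-CPU attribute check described in that last prompt can be illustrated with a small sketch (entry shapes are hypothetical, not the real `node.cpus` type):

```typescript
// Hypothetical per-CPU entry and requested attribute shapes.
interface CpuEntry { vendor?: string; model?: string }
interface CpuAttrs { vendor?: string; model?: string }

// A node satisfies the CPU constraints only if enough of its individual
// CPU entries match the requested vendor/model, not just aggregate units.
function countMatchingCpus(cpus: CpuEntry[], attrs: CpuAttrs): number {
  return cpus.filter(
    cpu =>
      (attrs.vendor === undefined || cpu.vendor === attrs.vendor) &&
      (attrs.model === undefined || cpu.model === attrs.model)
  ).length;
}
```

The matcher would fail a node when `countMatchingCpus` returns fewer entries than the request needs, before any aggregate allocation happens.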
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 65809fed-4fe7-4539-9156-f5419c6e57c2
📒 Files selected for processing (14)
- apps/api/src/bid-screening/lib/gpu-attribute-parser.spec.ts
- apps/api/src/bid-screening/lib/gpu-attribute-parser.ts
- apps/api/src/bid-screening/lib/groupspec-mapper.spec.ts
- apps/api/src/bid-screening/lib/groupspec-mapper.ts
- apps/api/src/bid-screening/lib/inventory-mapper.spec.ts
- apps/api/src/bid-screening/lib/inventory-mapper.ts
- apps/api/src/bid-screening/lib/resource-pair.spec.ts
- apps/api/src/bid-screening/lib/resource-pair.ts
- apps/api/src/bid-screening/lib/storage-attribute-parser.spec.ts
- apps/api/src/bid-screening/lib/storage-attribute-parser.ts
- apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.spec.ts
- apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts
- apps/api/src/bid-screening/types/inventory.types.ts
- apps/api/src/bid-screening/types/provider.ts
✅ Files skipped from review due to trivial changes (5)
- apps/api/src/bid-screening/lib/resource-pair.spec.ts
- apps/api/src/bid-screening/types/provider.ts
- apps/api/src/bid-screening/lib/groupspec-mapper.spec.ts
- apps/api/src/bid-screening/lib/inventory-mapper.spec.ts
- apps/api/src/bid-screening/lib/storage-attribute-parser.ts
🚧 Files skipped from review as they are similar to previous changes (4)
- apps/api/src/bid-screening/lib/storage-attribute-parser.spec.ts
- apps/api/src/bid-screening/lib/gpu-attribute-parser.ts
- apps/api/src/bid-screening/lib/resource-pair.ts
- apps/api/src/bid-screening/types/inventory.types.ts
8cd5771 to affd870 Compare
🧹 Nitpick comments (1)
apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts (1)
4-4: Use `import type` for type-only imports.
`ProviderWithSnapshot` is only used as a type annotation. Use `import type` for consistency with line 7 and to allow the compiler to elide the import.
♻️ Suggested fix
-import { ProviderWithSnapshot } from "@src/bid-screening/types/provider";
+import type { ProviderWithSnapshot } from "@src/bid-screening/types/provider";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts` at line 4, The import of ProviderWithSnapshot in cluster-inventory-matcher.service.ts is type-only and should use a type-only import to allow TS to elide it; update the import statement that currently imports ProviderWithSnapshot from "@src/bid-screening/types/provider" to use "import type" (matching the style used on line 7), ensuring ProviderWithSnapshot remains only a type reference in functions or method signatures within the ClusterInventoryMatcherService.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts`:
- Line 4: The import of ProviderWithSnapshot in
cluster-inventory-matcher.service.ts is type-only and should use a type-only
import to allow TS to elide it; update the import statement that currently
imports ProviderWithSnapshot from "@src/bid-screening/types/provider" to use
"import type" (matching the style used on line 7), ensuring ProviderWithSnapshot
remains only a type reference in functions or method signatures within the
ClusterInventoryMatcherService.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c0bfa879-b5a4-43dc-ac09-a97ac25e1054
📒 Files selected for processing (14)
- apps/api/src/bid-screening/lib/gpu-attribute-parser.spec.ts
- apps/api/src/bid-screening/lib/gpu-attribute-parser.ts
- apps/api/src/bid-screening/lib/groupspec-mapper.spec.ts
- apps/api/src/bid-screening/lib/groupspec-mapper.ts
- apps/api/src/bid-screening/lib/inventory-mapper.spec.ts
- apps/api/src/bid-screening/lib/inventory-mapper.ts
- apps/api/src/bid-screening/lib/resource-pair.spec.ts
- apps/api/src/bid-screening/lib/resource-pair.ts
- apps/api/src/bid-screening/lib/storage-attribute-parser.spec.ts
- apps/api/src/bid-screening/lib/storage-attribute-parser.ts
- apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.spec.ts
- apps/api/src/bid-screening/services/cluster-inventory-matcher/cluster-inventory-matcher.service.ts
- apps/api/src/bid-screening/types/inventory.types.ts
- apps/api/src/bid-screening/types/provider.ts
✅ Files skipped from review due to trivial changes (1)
- apps/api/src/bid-screening/types/provider.ts
🚧 Files skipped from review as they are similar to previous changes (6)
- apps/api/src/bid-screening/lib/resource-pair.spec.ts
- apps/api/src/bid-screening/lib/storage-attribute-parser.spec.ts
- apps/api/src/bid-screening/lib/groupspec-mapper.ts
- apps/api/src/bid-screening/lib/storage-attribute-parser.ts
- apps/api/src/bid-screening/lib/resource-pair.ts
- apps/api/src/bid-screening/types/inventory.types.ts
affd870 to f689d76 Compare
ec3cf7c to 4be54de Compare
4be54de to b76c5d5 Compare
b76c5d5 to 8824f1c Compare
Why
This is the first part of the greedy packing algorithm implementation. It is executed as the last gate on top of what the database returns; unfortunately, there is no way to implement such logic on the SQL side alone.
In theory, database pre-filtering will eliminate most of the providers and keep roughly 50-100 of them. `ClusterInventoryMatcherService` then does the rest. At the current provider count (~70), the event loop is not expected to be blocked.
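The greedy gate can be sketched as a first-fit pass over a copied node snapshot. Types here are simplified and hypothetical; the real `ClusterInventoryMatcherService` also handles GPU, storage, and attribute constraints:

```typescript
// Hypothetical simplified shapes; the real snapshot/request types carry more fields.
interface NodeCapacity { cpu: number; memory: number }
interface RequestedUnit { cpu: number; memory: number; count: number }

// Greedy first-fit: place each requested unit on the first node that still
// has enough remaining capacity, working on a copy of the snapshot so the
// caller's inventory is untouched and successive placements see reduced capacity.
function greedyMatch(nodes: NodeCapacity[], units: RequestedUnit[]): boolean {
  const pool = nodes.map(node => ({ ...node }));
  for (const unit of units) {
    for (let remaining = unit.count; remaining > 0; remaining--) {
      const node = pool.find(n => n.cpu >= unit.cpu && n.memory >= unit.memory);
      if (!node) return false; // no node can host this unit
      node.cpu -= unit.cpu;
      node.memory -= unit.memory;
    }
  }
  return true;
}
```

Since the pool is at most ~100 providers, this O(nodes × units) pass is cheap enough to run synchronously without blocking the event loop.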
What
The logic follows the specification.
Additionally:
Summary by CodeRabbit
New Features
Tests