threading for instrument records export #1261

Conversation

Commits:
- …convert groupid to string
- Add threading
> Caution: Review failed. The pull request is closed.

Walkthrough

Adds a worker-based, multi-threaded export pipeline for instrument records, refactors exportRecords to use workers and a raw-query helper, introduces inter-worker message types, adds unit tests for InstrumentRecordsService, and tightens clinical subject input validation.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant Service as InstrumentRecordsService
    participant Worker as Worker Thread
    participant DB as Database
    participant Instruments as Instruments Service
    Service->>DB: queryRecordsRaw(appAbility, groupId?)
    DB-->>Service: records[]
    Service->>Instruments: fetch instruments by ids (bulk)
    Instruments-->>Service: instrumentsMap
    Service->>Worker: INIT (availableInstrumentArray)
    Worker-->>Service: InitialMessage { success: true }
    Service->>Worker: BEGIN_CHUNK_PROCESSING (records chunk)
    Worker->>Worker: expand array measures, map to export rows
    alt expansion error
        Worker-->>Service: WorkerMessage { success: false, error }
    else success
        Worker-->>Service: WorkerMessage { success: true, data: exportRows[] }
    end
    Service->>Service: aggregate chunk results
    Service-->>Client: final exportRows[]
```
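The chunk fan-out implied by the diagram can be sketched as follows (a hypothetical helper named `chunkRecords`; the real service derives its chunk size from the worker count):

```typescript
// Split records into roughly equal chunks, one per worker.
// Returns an empty array for empty input so no workers are spawned.
function chunkRecords<T>(records: T[], numWorkers: number): T[][] {
  if (records.length === 0) {
    return [];
  }
  const chunkSize = Math.ceil(records.length / numWorkers);
  const chunks: T[][] = [];
  for (let i = 0; i < records.length; i += chunkSize) {
    chunks.push(records.slice(i, i + chunkSize));
  }
  return chunks;
}
```

The early return for empty input also sidesteps a division-by-zero-style edge case (chunk size of `Infinity`) that the review flags later.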
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In
`@apps/api/src/instrument-records/__tests__/instrument-records.service.spec.ts`:
- Around line 81-112: The mockRecords in instrument-records.service.spec.ts uses
Date objects for fields like createdAt, date, session.date, subject.dateOfBirth,
and updatedAt while queryRecordsRaw returns serialized date strings; change
those Date instances to ISO date strings (e.g., "2023-01-01T00:00:00.000Z") for
createdAt, date, session.date, subject.dateOfBirth, updatedAt (and any other
date fields) so the mock matches the export shape expected by
expect.any(String); update the mockRecords declaration (and its type assertion
if needed) to accept strings (cast to the same (InstrumentRecord & {...})[] or
to unknown then to that type) so TypeScript still compiles.
In `@apps/api/src/instrument-records/instrument-records.service.ts`:
- Around line 183-193: The INIT message listener (worker.on('message', (message:
InitialMessage) => { ... })) is also matching chunk responses because both
include success, causing duplicate CHUNK_COMPLETE posts; change the INIT handler
to either use worker.once('message', ...) or explicitly gate on message.type ===
'INIT' (or similar discriminant) before posting CHUNK_COMPLETE, and keep the
chunk response handling separate (the inner worker.on('message', (message:
WorkerMessage) => { ... })) so resolve/reject and worker.terminate() only run
for chunk responses.
- Around line 454-460: The $expr currently gets an empty object when groupId is
undefined which is invalid; update the aggregation so the conditional is applied
at the $match level instead of inside $expr: in the method that builds the
pipeline (referencing groupId and the $match stage in
instrument-records.service.ts), make $match either include a filter that
compares groupId to {$toObjectId: groupId} (when groupId is present) or be
omitted/empty (when groupId is absent) rather than passing {} into $expr; ensure
you remove the conditional from $expr and construct $match accordingly so the
pipeline receives a valid expression only when groupId is provided.
- Around line 167-169: The current use of Iterator.prototype.toArray() on
instruments.values() (creating availableInstrumentArray of type InitData) will
crash on Node.js 20 LTS; replace the toArray() call with Array.from(...) to
materialize the iterator (e.g., build availableInstrumentArray =
Array.from(instruments.values()) while preserving the InitData typing) so it
works on Node.js 20.
```ts
const availableInstrumentArray: InitData = instruments
  .values()
  .toArray()
```
🧩 Analysis chain

🏁 Script executed:

```shell
# Check which Node.js version the project targets
cat /etc/os-release 2>/dev/null || echo "OS info unavailable"
find . -type f \( -name ".nvmrc" -o -name ".node-version" -o -name "package.json" \) | head -5
cat .nvmrc
cat apps/api/package.json | grep -A 5 '"engines"'
```

🏁 Script executed:

```shell
# Inspect the code in context and locate the toArray() call
head -180 apps/api/src/instrument-records/instrument-records.service.ts | tail -20
cat -n apps/api/src/instrument-records/instrument-records.service.ts | sed -n '160,180p'
rg -n "toArray" apps/api/src/instrument-records/instrument-records.service.ts
rg -n "availableInstrumentArray" apps/api/src/instrument-records/instrument-records.service.ts -A 3
```

🌐 Web query: Iterator.prototype.toArray() Node.js support, LTS availability, 2024-2025

💡 Result: Iterator.prototype.toArray() is available in Node.js starting with v22 (V8 12.4), and therefore in the Node.js 22.x LTS line (which entered LTS in October 2024). Sources: MDN and the Node.js v22 release notes.

Details / dates:
- Node.js 22.0.0 (which added V8 12.4, including the iterator helpers and toArray) was released April 24, 2024.
- Node.js 22 was promoted to LTS in October 2024, so toArray is available on a Node.js LTS line starting then (and in all subsequent 22.x releases).

References:
[1] MDN — Iterator.prototype.toArray (compatibility table shows Node.js 22.0.0+).
[2] Node.js v22.0.0 release notes (April 24, 2024) — V8 12.4 / iterator helpers.
[3] Node.js documentation noting that Node.js 22 entered LTS in October 2024.

Conclusion: Node.js 20 LTS (released April 2023, LTS since October 2023) does not support Iterator.prototype.toArray(), which was only added in Node.js 22.0.0 (April 2024).
Avoid Iterator.prototype.toArray() to prevent runtime crashes on Node.js 20 LTS.
The project targets Node.js 20 LTS (lts/jod), which does not support Iterator helpers. The toArray() method was added in Node.js 22.0.0 (April 2024). Use Array.from() instead.
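For illustration, a minimal sketch with a stand-in Map (the instrument entries here are made up): `Array.from` materializes any iterable on every supported Node.js version, while the `.values().toArray()` chain needs Node 22+.

```typescript
// Stand-in for the service's instruments map (hypothetical entries).
const instruments = new Map<string, { edition: number; id: string; name: string }>([
  ['a', { edition: 1, id: 'a', name: 'Example Instrument A' }],
  ['b', { edition: 2, id: 'b', name: 'Example Instrument B' }]
]);

// Portable: Array.from works on Node.js 18/20/22 alike.
const viaArrayFrom = Array.from(instruments.values()).map((item) => item.name);

// Spread syntax is an equally portable alternative.
const viaSpread = [...instruments.values()].map((item) => item.id);
```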
Lines 147-149:

♻️ Proposed fix

```diff
-const availableInstrumentArray: InitData = instruments
-  .values()
-  .toArray()
-  .map((item) => {
+const availableInstrumentArray: InitData = Array.from(instruments.values()).map((item) => {
   return {
     edition: item.internal.edition,
     id: item.id!,
     name: item.internal.name
   };
 });
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```ts
const availableInstrumentArray: InitData = Array.from(instruments.values()).map((item) => {
  return {
    edition: item.internal.edition,
    id: item.id!,
    name: item.internal.name
  };
});
```
🤖 Prompt for AI Agents
In `@apps/api/src/instrument-records/instrument-records.service.ts` around lines
167 - 169, The current use of Iterator.prototype.toArray() on
instruments.values() (creating availableInstrumentArray of type InitData) will
crash on Node.js 20 LTS; replace the toArray() call with Array.from(...) to
materialize the iterator (e.g., build availableInstrumentArray =
Array.from(instruments.values()) while preserving the InitData typing) so it
works on Node.js 20.
```ts
worker.on('message', (message: InitialMessage) => {
  if (message.success) {
    worker.postMessage({ data: chunk, type: 'CHUNK_COMPLETE' });
    worker.on('message', (message: WorkerMessage) => {
      if (message.success) {
        resolve(message.data);
      } else {
        reject(new Error(message.error));
      }
      void worker.terminate();
    });
  }
});
```
INIT listener also handles chunk responses, causing extra CHUNK_COMPLETE posts.
Both responses include success, so the INIT handler will fire on the chunk response and resend work. Use once or differentiate message shapes.
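The stacking hazard is plain EventEmitter behavior, not anything worker-specific; a minimal sketch with a bare emitter (hypothetical, not the project's code):

```typescript
import { EventEmitter } from 'node:events';

const emitter = new EventEmitter();
let onCount = 0;
let onceCount = 0;

// `on` keeps the listener registered for every future 'message' event.
emitter.on('message', () => {
  onCount += 1;
});

// `once` removes the listener after its first invocation.
emitter.once('message', () => {
  onceCount += 1;
});

emitter.emit('message'); // both listeners fire
emitter.emit('message'); // only the `on` listener fires again
```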
🛠️ Proposed fix

```diff
-worker.on('message', (message: InitialMessage) => {
-  if (message.success) {
-    worker.postMessage({ data: chunk, type: 'CHUNK_COMPLETE' });
-    worker.on('message', (message: WorkerMessage) => {
-      if (message.success) {
-        resolve(message.data);
-      } else {
-        reject(new Error(message.error));
-      }
-      void worker.terminate();
-    });
-  }
-});
+worker.once('message', (message: InitialMessage) => {
+  if (!message.success) {
+    reject(new Error('Worker init failed'));
+    void worker.terminate();
+    return;
+  }
+  worker.once('message', (message: WorkerMessage) => {
+    if (message.success) resolve(message.data);
+    else reject(new Error(message.error));
+    void worker.terminate();
+  });
+  worker.postMessage({ data: chunk, type: 'CHUNK_COMPLETE' });
+});
```

🤖 Prompt for AI Agents
In `@apps/api/src/instrument-records/instrument-records.service.ts` around lines
183 - 193, The INIT message listener (worker.on('message', (message:
InitialMessage) => { ... })) is also matching chunk responses because both
include success, causing duplicate CHUNK_COMPLETE posts; change the INIT handler
to either use worker.once('message', ...) or explicitly gate on message.type ===
'INIT' (or similar discriminant) before posting CHUNK_COMPLETE, and keep the
chunk response handling separate (the inner worker.on('message', (message:
WorkerMessage) => { ... })) so resolve/reject and worker.terminate() only run
for chunk responses.
```ts
$match: {
  $expr: groupId
    ? {
        $eq: ['$groupId', { $toObjectId: groupId }]
      }
    : {}
}
```
$expr receives {} when groupId is undefined.
This is likely an invalid expression and can break exports without a group filter. Move the conditional to $match instead.
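An empty `$match` stage matches every document, whereas `{}` is not a valid operand for `$expr`; one way to keep the conditional at the stage level is a small builder like this (hypothetical helper, not the service's actual code):

```typescript
// Build the $match stage so $expr only appears when a groupId filter
// is actually needed; an empty $match matches all documents.
function buildGroupMatchStage(groupId?: string): { $match: Record<string, unknown> } {
  return {
    $match: groupId ? { $expr: { $eq: ['$groupId', { $toObjectId: groupId }] } } : {}
  };
}
```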
🛠️ Proposed fix

```diff
-{
-  $match: {
-    $expr: groupId
-      ? {
-          $eq: ['$groupId', { $toObjectId: groupId }]
-        }
-      : {}
-  }
-},
+{
+  $match: groupId
+    ? {
+        $expr: { $eq: ['$groupId', { $toObjectId: groupId }] }
+      }
+    : {}
+},
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```ts
{
  $match: groupId
    ? {
        $expr: { $eq: ['$groupId', { $toObjectId: groupId }] }
      }
    : {}
},
```
🤖 Prompt for AI Agents
In `@apps/api/src/instrument-records/instrument-records.service.ts` around lines
454 - 460, The $expr currently gets an empty object when groupId is undefined
which is invalid; update the aggregation so the conditional is applied at the
$match level instead of inside $expr: in the method that builds the pipeline
(referencing groupId and the $match stage in instrument-records.service.ts),
make $match either include a filter that compares groupId to {$toObjectId:
groupId} (when groupId is present) or be omitted/empty (when groupId is absent)
rather than passing {} into $expr; ensure you remove the conditional from $expr
and construct $match accordingly so the pipeline receives a valid expression
only when groupId is provided.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In
`@apps/api/src/instrument-records/__tests__/instrument-records.service.spec.ts`:
- Around line 81-107: The mockRecords entry uses Date objects for record.date
and session.date but RecordType and the export worker expect ISO strings; update
mockRecords (the object in the test) so that the properties "date" and
"session.date" are ISO strings (e.g., via toISOString or literal ISO strings) to
match the RecordType contract and ensure the export worker produces a string
"timestamp" as asserted in the test.
In `@apps/api/src/instrument-records/instrument-records.service.ts`:
- Around line 160-174: The Worker message handlers currently use
worker.on('message') causing duplicate listeners; change both the outer handler
that listens for the INIT response (the callback receiving InitialMessage) and
the inner handler that listens for CHUNK_COMPLETE (the callback receiving
WorkerMessage) to use once('message') so each handler runs only once, and ensure
the outer INIT handler checks message.success === false and immediately rejects
with message.error and terminates the Worker (calling worker.terminate()) to
avoid cascading failures; keep the existing worker.postMessage({ data:
availableInstrumentArray, type: 'INIT' }) and worker.postMessage({ data: chunk,
type: 'CHUNK_COMPLETE' }) calls but only send CHUNK_COMPLETE from the
INIT-success path once.
- Around line 445-449: The export projection is using the database creation
timestamp field '$createdAt' instead of the measurement timestamp field '$date',
so change the $dateToString entry in the export pipeline/projection to use
'date: { $dateToString: { date: '$date', format: '%Y-%m-%d' } }' (i.e., replace
'$createdAt' with '$date') so exported rows reflect the measurement timestamp;
keep the output field name 'date' unchanged.
In `@packages/subject-utils/src/index.ts`:
- Around line 8-31: The runtime checks in parseClinicalSubjectIdentificationData
are too strict and out of sync with the Zod schema; update the function to
delegate validation to the schema (use the schema's parse/safeParse so
z.coerce.date() can handle strings/numbers, the .min(1) constraints on
firstName/lastName are enforced, and invalid Date instances are rejected by the
schema), remove the manual typeof/instanceof checks, and return the parsed value
from the schema (or throw the schema error) so callers get consistent,
canonicalized data.
```ts
const worker = new Worker(join(import.meta.dirname, 'export-worker.ts'));
worker.postMessage({ data: availableInstrumentArray, type: 'INIT' });

worker.on('message', (message: InitialMessage) => {
  if (message.success) {
    worker.postMessage({ data: chunk, type: 'CHUNK_COMPLETE' });
    worker.on('message', (message: WorkerMessage) => {
      if (message.success) {
        resolve(message.data);
      } else {
        reject(new Error(message.error));
      }
      void worker.terminate();
    });
  }
```
🧩 Analysis chain

🏁 Script executed:

```shell
wc -l apps/api/src/instrument-records/instrument-records.service.ts
sed -n '150,190p' apps/api/src/instrument-records/instrument-records.service.ts
sed -n '1,50p' apps/api/src/instrument-records/instrument-records.service.ts
find . -name "thread-types.ts" -o -name "thread-types.d.ts"
cat apps/api/src/instrument-records/thread-types.ts
```
Use once for Worker message handlers to prevent duplicate listener stacking.
The outer on('message') handler remains registered after the INIT response arrives. When the CHUNK_COMPLETE response is received later, both the original outer handler and the newly registered inner handler fire. The outer handler then incorrectly re-sends CHUNK_COMPLETE and registers another listener, causing duplicate work.
Replace on with once for both message handlers. Additionally, add error handling for failed INIT messages to avoid cascading failures.
Proposed fix

```diff
-worker.on('message', (message: InitialMessage) => {
-  if (message.success) {
-    worker.postMessage({ data: chunk, type: 'CHUNK_COMPLETE' });
-    worker.on('message', (message: WorkerMessage) => {
-      if (message.success) {
-        resolve(message.data);
-      } else {
-        reject(new Error(message.error));
-      }
-      void worker.terminate();
-    });
-  }
-});
+worker.once('message', (message: InitialMessage) => {
+  if (!message.success) {
+    reject(new Error('Worker init failed'));
+    void worker.terminate();
+    return;
+  }
+  worker.postMessage({ data: chunk, type: 'CHUNK_COMPLETE' });
+  worker.once('message', (message: WorkerMessage) => {
+    if (message.success) {
+      resolve(message.data);
+    } else {
+      reject(new Error(message.error));
+    }
+    void worker.terminate();
+  });
+});
```

📝 Committable suggestion
‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```ts
const worker = new Worker(join(import.meta.dirname, 'export-worker.ts'));
worker.postMessage({ data: availableInstrumentArray, type: 'INIT' });
worker.once('message', (message: InitialMessage) => {
  if (!message.success) {
    reject(new Error('Worker init failed'));
    void worker.terminate();
    return;
  }
  worker.postMessage({ data: chunk, type: 'CHUNK_COMPLETE' });
  worker.once('message', (message: WorkerMessage) => {
    if (message.success) {
      resolve(message.data);
    } else {
      reject(new Error(message.error));
    }
    void worker.terminate();
  });
});
```
🤖 Prompt for AI Agents
In `@apps/api/src/instrument-records/instrument-records.service.ts` around lines
160 - 174, The Worker message handlers currently use worker.on('message')
causing duplicate listeners; change both the outer handler that listens for the
INIT response (the callback receiving InitialMessage) and the inner handler that
listens for CHUNK_COMPLETE (the callback receiving WorkerMessage) to use
once('message') so each handler runs only once, and ensure the outer INIT
handler checks message.success === false and immediately rejects with
message.error and terminates the Worker (calling worker.terminate()) to avoid
cascading failures; keep the existing worker.postMessage({ data:
availableInstrumentArray, type: 'INIT' }) and worker.postMessage({ data: chunk,
type: 'CHUNK_COMPLETE' }) calls but only send CHUNK_COMPLETE from the
INIT-success path once.
```ts
date: {
  $dateToString: {
    date: '$createdAt',
    format: '%Y-%m-%d'
  }
}
```
🧩 Analysis chain

🏁 Script executed:

```shell
fd "instrument-records.service.ts" --type f
wc -l apps/api/src/instrument-records/instrument-records.service.ts
sed -n '430,465p' apps/api/src/instrument-records/instrument-records.service.ts
fd "instrument-record" --type f -e ts -e schema -e model | grep -E "(schema|model|interface|entity|type)"
cat apps/api/src/instrument-records/instrument-records.service.ts | head -100
cat packages/schemas/src/instrument-records/instrument-records.ts
cat packages/schemas/src/core/core.ts
sed -n '400,450p' apps/api/src/instrument-records/instrument-records.service.ts
sed -n '440,480p' apps/api/src/instrument-records/instrument-records.service.ts
sed -n '60,120p' apps/api/src/instrument-records/instrument-records.service.ts
rg "queryRecordsRaw" apps/api/src/instrument-records/instrument-records.service.ts -A 5 -B 5
sed -n '330,380p' apps/api/src/instrument-records/instrument-records.service.ts
```
Use the date field instead of createdAt in the export projection.
The date field stores the actual measurement timestamp provided when creating records, while createdAt is the database creation timestamp. Using createdAt causes exports to reflect when records were entered into the system rather than when measurements were taken—problematic for retrospective data and downstream analyses.
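To make the difference concrete: a retrospectively entered record carries two very different timestamps (hypothetical values; `toISOString().slice(0, 10)` mimics the `%Y-%m-%d` formatting for UTC dates).

```typescript
// Hypothetical record: measured in January, entered into the system in June.
const record = {
  createdAt: new Date('2023-06-15T10:00:00.000Z'), // database insertion time
  date: new Date('2023-01-05T09:30:00.000Z') // measurement time
};

// Equivalent of $dateToString with format '%Y-%m-%d' on each field:
const exportedFromDate = record.date.toISOString().slice(0, 10);
const exportedFromCreatedAt = record.createdAt.toISOString().slice(0, 10);
```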
Proposed fix

```diff
-date: {
-  $dateToString: {
-    date: '$createdAt',
-    format: '%Y-%m-%d'
-  }
-},
+date: {
+  $dateToString: {
+    date: '$date',
+    format: '%Y-%m-%d'
+  }
+},
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```ts
date: {
  $dateToString: {
    date: '$date',
    format: '%Y-%m-%d'
  }
},
```
🤖 Prompt for AI Agents
In `@apps/api/src/instrument-records/instrument-records.service.ts` around lines
445 - 449, The export projection is using the database creation timestamp field
'$createdAt' instead of the measurement timestamp field '$date', so change the
$dateToString entry in the export pipeline/projection to use 'date: {
$dateToString: { date: '$date', format: '%Y-%m-%d' } }' (i.e., replace
'$createdAt' with '$date') so exported rows reflect the measurement timestamp;
keep the output field name 'date' unchanged.
```ts
function parseClinicalSubjectIdentificationData({
  dateOfBirth,
  firstName,
  lastName,
  sex
}: {
  [key: string]: unknown;
}): ClinicalSubjectIdentificationData {
  if (!(dateOfBirth instanceof Date)) {
    throw new Error('Invalid or missing dateOfBirth: Expected a Date object.');
  } else if (typeof firstName !== 'string') {
    throw new Error('Invalid or missing firstName: Expected a string.');
  } else if (typeof lastName !== 'string') {
    throw new Error('Invalid or missing lastName: Expected a string.');
  } else if (!(sex === 'MALE' || sex === 'FEMALE')) {
    throw new Error('Invalid or missing sex: Expected "MALE" or "FEMALE".');
  }
  return {
    dateOfBirth,
    firstName,
    lastName,
    sex
  };
}
```
🧩 Analysis chain

🏁 Script executed:

```shell
# Locate relevant files in the subject-utils package
find packages/subject-utils -type f -name "*.ts" -o -name "*.js" | head -20
# Search for the ClinicalSubjectIdentificationData type and any schema usage
rg "ClinicalSubjectIdentificationData" packages/subject-utils/ -t typescript
rg "schema\|zod\|yup" packages/subject-utils/ -t typescript -i
# Read the full file and find generateSubjectHash
cat packages/subject-utils/src/index.ts
rg "generateSubjectHash" packages/subject-utils/ -t typescript -A 10
# The type is imported from '@opendatacapture/schemas/subject'; find that schema
find . -type f -path "*schemas*subject*" | head -20
find . -type f -name "*.ts" -path "*schemas*" | head -30
rg "ClinicalSubjectIdentificationData" . -A 5 -B 2 2>/dev/null | head -50
ls -la
cat packages/schemas/src/subject/subject.ts
```

🏁 Script executed:

```shell
# Check how the function is used across the repo, and whether callers
# validate with the schema before calling generateSubjectHash
cat packages/subject-utils/src/test.ts
rg "parseClinicalSubjectIdentificationData|generateSubjectHash" . --no-heading -B 2 -A 2 2>/dev/null | head -100
rg "generateSubjectHash\(" . --no-heading -B 5 -A 2 2>/dev/null | head -80
cat apps/web/src/components/IdentificationForm/IdentificationForm.tsx | head -150
cat apps/web/src/components/StartSessionForm/StartSessionForm.tsx | head -150
cat apps/api/src/demo/demo.service.ts | grep -A 20 -B 5 "generateSubjectHash"
```
Align runtime validation with the schema definition.
The schema allows dateOfBirth: z.coerce.date() (accepts strings, numbers, or Date), but the implementation requires instanceof Date, which will reject coerced values. Additionally, firstName and lastName pass the typeof string check but do not verify the .min(1) constraint, allowing empty strings. Invalid Date instances (NaN timestamps) will cause toISOString() to throw at runtime.
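The delegation pattern can be sketched without Zod itself; the hand-rolled `safeParseSubject` below is a stand-in for the real schema's `safeParse` (in the project, the Zod schema would coerce date strings via `z.coerce.date()` and enforce `.min(1)` on the names):

```typescript
type ParseResult<T> = { data: T; success: true } | { error: string; success: false };

type SubjectIdData = {
  dateOfBirth: Date;
  firstName: string;
  lastName: string;
  sex: 'FEMALE' | 'MALE';
};

// Stand-in for the Zod schema's safeParse: coerces dates, rejects invalid
// Date instances (NaN timestamps), and enforces non-empty names.
function safeParseSubject(input: { [key: string]: unknown }): ParseResult<SubjectIdData> {
  const dateOfBirth = new Date(input.dateOfBirth as Date | number | string);
  if (Number.isNaN(dateOfBirth.getTime())) {
    return { error: 'invalid dateOfBirth', success: false };
  }
  const { firstName, lastName, sex } = input;
  if (typeof firstName !== 'string' || firstName.length < 1) {
    return { error: 'invalid firstName', success: false };
  }
  if (typeof lastName !== 'string' || lastName.length < 1) {
    return { error: 'invalid lastName', success: false };
  }
  if (sex !== 'MALE' && sex !== 'FEMALE') {
    return { error: 'invalid sex', success: false };
  }
  return { data: { dateOfBirth, firstName, lastName, sex }, success: true };
}
```

Note how a date string is accepted and an empty first name is rejected, which is exactly where the manual `instanceof`/`typeof` checks diverge from the schema.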
Delegate validation to the schema:
Proposed fix

```diff
-function parseClinicalSubjectIdentificationData({
-  dateOfBirth,
-  firstName,
-  lastName,
-  sex
-}: {
-  [key: string]: unknown;
-}): ClinicalSubjectIdentificationData {
-  if (!(dateOfBirth instanceof Date)) {
-    throw new Error('Invalid or missing dateOfBirth: Expected a Date object.');
-  } else if (typeof firstName !== 'string') {
-    throw new Error('Invalid or missing firstName: Expected a string.');
-  } else if (typeof lastName !== 'string') {
-    throw new Error('Invalid or missing lastName: Expected a string.');
-  } else if (!(sex === 'MALE' || sex === 'FEMALE')) {
-    throw new Error('Invalid or missing sex: Expected "MALE" or "FEMALE".');
-  }
-  return {
-    dateOfBirth,
-    firstName,
-    lastName,
-    sex
-  };
-}
+function parseClinicalSubjectIdentificationData(data: unknown): ClinicalSubjectIdentificationData {
+  const result = $ClinicalSubjectIdentificationData.safeParse(data);
+  if (!result.success) {
+    throw new Error('Invalid clinical subject identification data.');
+  }
+  return result.data;
+}
```

🤖 Prompt for AI Agents
In `@packages/subject-utils/src/index.ts` around lines 8 - 31, The runtime checks
in parseClinicalSubjectIdentificationData are too strict and out of sync with
the Zod schema; update the function to delegate validation to the schema (use
the schema's parse/safeParse so z.coerce.date() can handle strings/numbers, the
.min(1) constraints on firstName/lastName are enforced, and invalid Date
instances are rejected by the schema), remove the manual typeof/instanceof
checks, and return the parsed value from the schema (or throw the schema error)
so callers get consistent, canonicalized data.
Actionable comments posted: 6
🤖 Fix all issues with AI agents
In `@apps/api/src/instrument-records/export-worker.ts`:
- Around line 40-47: The Map declaration for initData uses a key type of `string
| undefined`, which allows accidental lookups like
`instrumentsMap.get(undefined)`; change the key type to `string` for `initData:
Map<string, { edition: number; id: string; name: string }>` and then audit
usages of `initData`/`instrumentsMap` to ensure callers never pass
`undefined`—add guards or filter out undefined ids before calling `set`/`get`
where necessary (e.g., where ids are derived from possibly-absent values) so all
keys are guaranteed to be strings.
- Around line 126-135: The code assumes parentPort is always present and will
throw when imported on the main thread; add a null-check guard before attaching
the message handler so the file is safe to import outside a worker. Surround the
parentPort!.on(...) logic with a conditional that verifies parentPort !== null
(or return early if null) and only calls parentPort.on('message', ...) to
dispatch to handleChunkComplete and handleInit for ParentMessage types; keep the
existing default error for unexpected message types but ensure it only runs
inside the worker context.
- Around line 61-66: The function processRecord currently uses a non-null
assertion on instrumentsMap.get(record.instrumentId) which can throw if the key
is missing; update processRecord to guard by retrieving const instrument =
instrumentsMap.get(record.instrumentId); if (!instrument) return []; (or handle
missing instrument appropriately) before using it, remove the dead commented
line ("//const instrument = instrumentsMap.get(record.instrumentId)!;"), and
ensure downstream logic references the guarded instrument variable instead of
assuming it exists.
In `@apps/api/src/instrument-records/instrument-records.service.ts`:
- Around line 140-142: instrumentIds.values() returns an iterator, so calling
.map() on it fails; convert the iterator to an array before mapping. Replace the
expression that builds instrumentsArray to first create an array from
instrumentIds (e.g., Array.from(instrumentIds) or
Array.from(instrumentIds.values()) or [...instrumentIds]) then map over that
array calling this.instrumentsService.findById(id) (returning
Promise<ScalarInstrument>) and wrap with Promise.all to produce
instrumentsArray. Ensure you reference instrumentIds, instrumentsArray, and
instrumentsService.findById in the updated code.
- Around line 146-148: The code computes numWorkers and chunkSize from
records.length and will produce numWorkers = 0 and chunkSize = Infinity when
records.length === 0; add an early return at the start of the processing
function (before computing numWorkers/chunkSize/chunks) that returns an empty
array or appropriate empty result when records.length === 0, referencing the
existing variables records, numWorkers, and chunkSize so you avoid division by
zero and unnecessary work.
- Around line 418-509: queryRecordsRaw is applying authorization after fetching
data (using appAbility.can), which can load unauthorized records; fix by pushing
the ability constraints into the DB aggregation instead of post-filtering:
derive the same access-filter used by accessibleQuery (or build equivalent
filter from AppAbility) and merge it into the aggregation pipeline's $match
(before projection), or switch to a Prisma query that uses accessibleQuery and
then perform any necessary $lookup/$project only on the already-filtered ids;
target symbols: queryRecordsRaw, accessibleQuery,
instrumentRecordModel.aggregateRaw, appAbility.can and ensure the $match added
to the pipeline encodes the user's allowed conditions so unauthorized documents
are never returned from aggregateRaw.
♻️ Duplicate comments (4)
apps/api/src/instrument-records/instrument-records.service.ts (4)
154-156: `Iterator.prototype.toArray()` unavailable on Node.js 20 LTS.

Project targets `lts/jod` (Node.js 20), but `toArray()` was added in Node.js 22. This will crash at runtime.

Proposed fix

```diff
-    const availableInstrumentArray: InitData = instruments
-      .values()
-      .toArray()
-      .map((item) => {
+    const availableInstrumentArray: InitData = Array.from(instruments.values()).map((item) => {
       return { edition: item.internal.edition, id: item.id!, name: item.internal.name };
     });
```
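For illustration, a minimal Node-20-safe sketch of the same conversion (the `instruments` map contents here are invented for the example, not taken from the PR):

```typescript
// Array.from materializes the iterator first, so .map() works on every supported Node version.
type InitEntry = { edition: number; id: string; name: string };

const instruments = new Map<string, { id: string; internal: { edition: number; name: string } }>([
  ['a', { id: 'a', internal: { edition: 1, name: 'happiness-questionnaire' } }],
  ['b', { id: 'b', internal: { edition: 2, name: 'memory-task' } }]
]);

// instruments.values() is an iterator; calling .map() on it directly requires Node >= 22
const availableInstrumentArray: InitEntry[] = Array.from(instruments.values()).map((item) => ({
  edition: item.internal.edition,
  id: item.id,
  name: item.internal.name
}));
```

Since `Map` preserves insertion order, the resulting array keeps the order in which instruments were added.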
170-182: Use `once` to prevent listener stacking and duplicate `CHUNK_COMPLETE` posts.

Using `on` instead of `once` means the outer handler remains registered and fires again on the chunk response, causing duplicate work and potential race conditions.

Proposed fix

```diff
-      worker.on('message', (message: InitialMessage) => {
-        if (message.success) {
-          worker.postMessage({ data: chunk, type: 'CHUNK_COMPLETE' } satisfies ChunkCompleteMessage);
-          worker.on('message', (message: WorkerMessage) => {
-            if (message.success) {
-              resolve(message.data);
-            } else {
-              reject(new Error(message.error));
-            }
-            void worker.terminate();
-          });
-        }
+      worker.once('message', (message: InitialMessage) => {
+        if (!message.success) {
+          reject(new Error('Worker init failed'));
+          void worker.terminate();
+          return;
+        }
+        worker.postMessage({ data: chunk, type: 'CHUNK_COMPLETE' } satisfies ChunkCompleteMessage);
+        worker.once('message', (message: WorkerMessage) => {
+          if (message.success) {
+            resolve(message.data);
+          } else {
+            reject(new Error(message.error));
+          }
+          void worker.terminate();
+        });
       });
```
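The `on` versus `once` distinction can be demonstrated with a plain `EventEmitter`, used here as a stand-in for the worker's message channel:

```typescript
import { EventEmitter } from 'node:events';

// A worker's message port has EventEmitter semantics, so listener behavior is the same.
const channel = new EventEmitter();
const seenByOn: string[] = [];
const seenByOnce: string[] = [];

channel.on('message', (m: string) => seenByOn.push(m)); // stays registered for every emit
channel.once('message', (m: string) => seenByOnce.push(m)); // detaches after the first emit

channel.emit('message', 'init-ack');
channel.emit('message', 'chunk-result');

// seenByOn now holds both messages, while seenByOnce holds only 'init-ack'.
```

This is why a stacked `on` handler in the export pipeline would re-fire on the chunk response after already handling the init acknowledgement.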
440-448: `$expr: {}` is invalid when `groupId` is undefined.

An empty object in `$expr` can cause unexpected behavior or errors. Move the conditional to the `$match` level.

Proposed fix

```diff
     {
-      $match: {
-        $expr: groupId
-          ? {
-              $eq: ['$groupId', { $toObjectId: groupId }]
-            }
-          : {}
-      }
+      $match: groupId
+        ? {
+            $expr: { $eq: ['$groupId', { $toObjectId: groupId }] }
+          }
+        : {}
     },
```
452-457: Using `$createdAt` instead of `$date` for the timestamp.

The `date` field stores the actual measurement timestamp, while `createdAt` is the database insertion time. For retrospective data, these differ significantly.

Proposed fix

```diff
         date: {
           $dateToString: {
-            date: '$createdAt',
+            date: '$date',
             format: '%Y-%m-%d'
           }
         },
```
🧹 Nitpick comments (1)
apps/api/src/instrument-records/export-worker.ts (1)
21-38: Redundant `Object.values()` on array and unsafe typing.

`listEntry` is already an array, so `Object.values(listEntry)` is redundant and confusing; just iterate directly. Also, `any[]` loses type safety.

Proposed fix

```diff
-function expandData(listEntry: any[]): ExpandDataType[] {
+function expandData(listEntry: Array<{ [key: string]: FormTypes.RecordArrayFieldValue }>): ExpandDataType[] {
   const validRecordArrayList: ExpandDataType[] = [];
   if (listEntry.length < 1) {
     throw new Error('Record Array is Empty');
   }
-  for (const objectEntry of Object.values(listEntry)) {
+  for (const objectEntry of listEntry) {
     for (const [dataKey, dataValue] of Object.entries(
       objectEntry as { [key: string]: FormTypes.RecordArrayFieldValue }
     )) {
```
```ts
let initData: Map<
  string | undefined,
  {
    edition: number;
    id: string;
    name: string;
  }
>;
```
Map key type should not include `undefined`.

`string | undefined` as the key type is suspicious; `instrumentsMap.get(undefined)` would be a bug. This should be `string`.
Proposed fix

```diff
 let initData: Map<
-  string | undefined,
+  string,
   {
     edition: number;
     id: string;
     name: string;
   }
 >;
```
🤖 Prompt for AI Agents
In `@apps/api/src/instrument-records/export-worker.ts` around lines 40 - 47, The
Map declaration for initData uses a key type of `string | undefined`, which
allows accidental lookups like `instrumentsMap.get(undefined)`; change the key
type to `string` for `initData: Map<string, { edition: number; id: string; name:
string }>` and then audit usages of `initData`/`instrumentsMap` to ensure
callers never pass `undefined`—add guards or filter out undefined ids before
calling `set`/`get` where necessary (e.g., where ids are derived from
possibly-absent values) so all keys are guaranteed to be strings.
```ts
const processRecord = (record: RecordType): InstrumentRecordsExport => {
  const instrument = instrumentsMap.get(record.instrumentId)!;

  if (!record.computedMeasures) return [];

  //const instrument = instrumentsMap.get(record.instrumentId)!;
```
Non-null assertion risks runtime crash; remove dead code.

If `record.instrumentId` isn't in the map, line 62 throws. Add a guard. Also, line 66 is commented-out code; remove it.
Proposed fix

```diff
 const processRecord = (record: RecordType): InstrumentRecordsExport => {
-  const instrument = instrumentsMap.get(record.instrumentId)!;
+  const instrument = instrumentsMap.get(record.instrumentId);
+  if (!instrument) {
+    throw new Error(`Unknown instrument ID: ${record.instrumentId}`);
+  }
   if (!record.computedMeasures) return [];
-  //const instrument = instrumentsMap.get(record.instrumentId)!;
   const rows: InstrumentRecordsExport = [];
```
🤖 Prompt for AI Agents
In `@apps/api/src/instrument-records/export-worker.ts` around lines 61 - 66, The
function processRecord currently uses a non-null assertion on
instrumentsMap.get(record.instrumentId) which can throw if the key is missing;
update processRecord to guard by retrieving const instrument =
instrumentsMap.get(record.instrumentId); if (!instrument) return []; (or handle
missing instrument appropriately) before using it, remove the dead commented
line ("//const instrument = instrumentsMap.get(record.instrumentId)!;"), and
ensure downstream logic references the guarded instrument variable instead of
assuming it exists.
```ts
parentPort!.on('message', (message: ParentMessage) => {
  switch (message.type) {
    case 'CHUNK_COMPLETE':
      return handleChunkComplete(message.data);
    case 'INIT':
      return handleInit(message.data);
    default:
      throw new Error(`Unexpected message type: ${(message satisfies never as { [key: string]: any }).type}`);
  }
});
```
`parentPort!` assertion fails if the file is run outside a worker context.

If this file is accidentally imported in the main thread, `parentPort` is null and this throws. Add a guard for robustness.
Proposed fix

```diff
+if (!parentPort) {
+  throw new Error('This module must be run as a worker thread');
+}
+
 parentPort!.on('message', (message: ParentMessage) => {
```
🤖 Prompt for AI Agents
In `@apps/api/src/instrument-records/export-worker.ts` around lines 126 - 135, The
code assumes parentPort is always present and will throw when imported on the
main thread; add a null-check guard before attaching the message handler so the
file is safe to import outside a worker. Surround the parentPort!.on(...) logic
with a conditional that verifies parentPort !== null (or return early if null)
and only calls parentPort.on('message', ...) to dispatch to handleChunkComplete
and handleInit for ParentMessage types; keep the existing default error for
unexpected message types but ensure it only runs inside the worker context.
```ts
const instrumentsArray = await Promise.all(
  instrumentIds.values().map((id) => this.instrumentsService.findById(id) as Promise<ScalarInstrument>)
);
```
`Set.values().map()` doesn't exist; iterators lack `.map()` in Node.js 20.

`instrumentIds.values()` returns an iterator, not an array. Use `Array.from()` to convert first.
Proposed fix

```diff
-    const instrumentsArray = await Promise.all(
-      instrumentIds.values().map((id) => this.instrumentsService.findById(id) as Promise<ScalarInstrument>)
-    );
+    const instrumentsArray = await Promise.all(
+      Array.from(instrumentIds).map((id) => this.instrumentsService.findById(id) as Promise<ScalarInstrument>)
+    );
```

🤖 Prompt for AI Agents
In `@apps/api/src/instrument-records/instrument-records.service.ts` around lines
140 - 142, instrumentIds.values() returns an iterator, so calling .map() on it
fails; convert the iterator to an array before mapping. Replace the expression
that builds instrumentsArray to first create an array from instrumentIds (e.g.,
Array.from(instrumentIds) or Array.from(instrumentIds.values()) or
[...instrumentIds]) then map over that array calling
this.instrumentsService.findById(id) (returning Promise<ScalarInstrument>) and
wrap with Promise.all to produce instrumentsArray. Ensure you reference
instrumentIds, instrumentsArray, and instrumentsService.findById in the updated
code.
```ts
const numWorkers = Math.min(cpus().length, Math.ceil(records.length / 100)); // Use up to CPU count, chunk size 100
const chunkSize = Math.ceil(records.length / numWorkers);
const chunks = [];
```
Edge case: zero records results in division by zero chunks.
When records.length is 0, numWorkers becomes 0, and chunkSize becomes Infinity. The loop won't execute, but returning early is cleaner.
Proposed fix

```diff
+    if (records.length === 0) {
+      return [];
+    }
+
     const numWorkers = Math.min(cpus().length, Math.ceil(records.length / 100));
```
🤖 Prompt for AI Agents
In `@apps/api/src/instrument-records/instrument-records.service.ts` around lines
146 - 148, The code computes numWorkers and chunkSize from records.length and
will produce numWorkers = 0 and chunkSize = Infinity when records.length === 0;
add an early return at the start of the processing function (before computing
numWorkers/chunkSize/chunks) that returns an empty array or appropriate empty
result when records.length === 0, referencing the existing variables records,
numWorkers, and chunkSize so you avoid division by zero and unnecessary work.
```ts
private async queryRecordsRaw(appAbility: AppAbility, groupId?: string) {
  const pipeline = [
    {
      // Join with Session collection
      $lookup: {
        as: 'session',
        foreignField: '_id',
        from: 'SessionModel',
        localField: 'sessionId' // Ensure this matches your @map or field name in Prisma
      }
    },
    { $unwind: { path: '$session', preserveNullAndEmptyArrays: true } },
    {
      // Join with Subject collection
      $lookup: {
        as: 'subject',
        foreignField: '_id',
        from: 'SubjectModel',
        localField: 'subjectId'
      }
    },
    { $unwind: { path: '$subject', preserveNullAndEmptyArrays: true } },
    {
      $match: {
        $expr: groupId
          ? {
              $eq: ['$groupId', { $toObjectId: groupId }]
            }
          : {}
      }
    },
    {
      $project: {
        computedMeasures: 1,
        date: {
          $dateToString: {
            date: '$createdAt',
            format: '%Y-%m-%d'
          }
        },
        groupId: {
          $toString: '$groupId'
        },
        id: {
          $toString: '$_id'
        },
        instrumentId: 1,
        session: {
          date: {
            $dateToString: {
              date: '$session.date',
              format: '%Y-%m-%d'
            }
          },
          id: {
            $toString: '$session._id'
          },
          type: '$session.type',
          user: { username: '$session.user.username' } // TBD test this works
        },
        // sessionId: 1,
        subject: {
          age: {
            $dateDiff: {
              endDate: '$$NOW',
              startDate: '$subject.dateOfBirth',
              unit: 'year'
            }
          },
          dateOfBirth: '$subject.dateOfBirth',
          groupIds: '$subject.groupIds', // TBD make sure groupIds is string array
          id: {
            $toString: '$subject._id'
          },
          sex: '$subject.sex'
        }
      }
    }
  ];

  const records = (await this.instrumentRecordModel.aggregateRaw({ pipeline })) as unknown as unknown[];

  /**
   * We need to create a shallow copy of all records here, as the aggregateRaw method returns objects
   * with the Symbol(nodejs.util.inspect.custom) property defined, which is not serializable.
   */

  // eslint-disable-next-line @typescript-eslint/no-unsafe-argument
  return (records.filter((record) => appAbility.can('read', record as any)) as RecordType[]).map((record) => ({
    ...record
  }));
}
```
🧩 Analysis chain
🏁 Script executed:
```shell
rg -n "accessibleQuery" apps/api/src/instrument-records/instrument-records.service.ts -B2 -A2
```

Repository: DouglasNeuroInformatics/OpenDataCapture
Length of output: 1550
🏁 Script executed:
```shell
# Also check for other methods using appAbility in this file
rg -n "appAbility" apps/api/src/instrument-records/instrument-records.service.ts -B1 -A3
```

Repository: DouglasNeuroInformatics/OpenDataCapture
Length of output: 486
🏁 Script executed:
```shell
# Check the overall structure of the service
head -100 apps/api/src/instrument-records/instrument-records.service.ts
```

Repository: DouglasNeuroInformatics/OpenDataCapture
Length of output: 3899
Post-query filtering bypasses database-level access control.
Unlike other methods in this service (e.g., count(), findById(), findMany()), which use accessibleQuery() to enforce access control at the Prisma query level, queryRecordsRaw() fetches records matching only the groupId filter and applies ability checks afterward via appAbility.can(). This fetches potentially unauthorized records into memory before filtering. Consider pushing ability checks into the aggregation pipeline or documenting if this approach is intentional.
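One way to sketch pushing the authorization filter into the pipeline (the name `abilityFilter` and the helper below are hypothetical; the real filter would be derived from `accessibleQuery` or the CASL ability):

```typescript
type Filter = { [key: string]: unknown };

// Merge the ability-derived filter into the $match stage so unauthorized
// documents are excluded inside the database rather than filtered in memory.
function buildMatchStage(abilityFilter: Filter, groupId?: string): { $match: Filter } {
  if (!groupId) {
    return { $match: abilityFilter };
  }
  return {
    $match: {
      $and: [abilityFilter, { $expr: { $eq: ['$groupId', { $toObjectId: groupId }] } }]
    }
  };
}
```

Placing this stage before the `$lookup` and `$project` stages would also let the database skip joins for documents the caller cannot read.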
🤖 Prompt for AI Agents
In `@apps/api/src/instrument-records/instrument-records.service.ts` around lines
418 - 509, queryRecordsRaw is applying authorization after fetching data (using
appAbility.can), which can load unauthorized records; fix by pushing the ability
constraints into the DB aggregation instead of post-filtering: derive the same
access-filter used by accessibleQuery (or build equivalent filter from
AppAbility) and merge it into the aggregation pipeline's $match (before
projection), or switch to a Prisma query that uses accessibleQuery and then
perform any necessary $lookup/$project only on the already-filtered ids; target
symbols: queryRecordsRaw, accessibleQuery, instrumentRecordModel.aggregateRaw,
appAbility.can and ensure the $match added to the pipeline encodes the user's
allowed conditions so unauthorized documents are never returned from
aggregateRaw.
Summary by CodeRabbit
Improvements
Refactor
Tests