Skip to content

feat(aggregations): cross-schema joins with literal-filter#1479

Merged
1 commit merged into
developmentfrom
feat/scholiq-deps/cross-schema-aggregations
May 15, 2026
Merged

feat(aggregations): cross-schema joins with literal-filter#1479
1 commit merged into
developmentfrom
feat/scholiq-deps/cross-schema-aggregations

Conversation

@rubenvdlinde
Copy link
Copy Markdown
Contributor

Summary

  • Adds from field support to x-openregister-aggregations enabling cross-schema aggregations (e.g. a Regulation schema counting rows in an Enrolment schema).
  • @self.<field> references in where clauses resolve against the parent object row, so each parent object gets its own scoped count.
  • Supports eq, ne, gt, gte, lt, lte, in operators in cross-schema where clauses via the existing filter engine.
  • Postgres-native fast path (SQL) is used for the target schema table; PHP-fallback applies when Postgres is unavailable. Both paths are capped at 10k rows (configurable via limit).
  • RBAC is double-gated: the parent schema's list permission runs in run(), and the target schema's list permission runs inside runCrossSchema() — callers cannot leak counts from schemas they cannot list.
  • Adds select/where aliases for metric/filter on intra-schema specs (backward-compatible).
  • AggregationAnnotationValidator now accepts cross-schema specs with lighter validation (field-existence checks are skipped for the target schema which is not available at annotation-save time).

Closes scholiq deps #7.

Test plan

  • AggregationAnnotationValidatorTest: 7 new cross-schema validator cases (empty from, bad metric, non-map where, field-existence skipped, select/where aliases)
  • CrossSchemaAggregationRunnerTest: 6 new runner unit tests (@self ref resolution, missing-field fail-closed, in operator, schema-not-found, register-not-found, RBAC gate on target schema)
  • All existing AggregationRunnerIntegrationTest intra-schema tests remain green (no behaviour change to existing path)
  • PHPCS clean on lib/ (0 errors, 0 warnings treated as errors)
  • PHPMD clean on changed files
  • Psalm — no errors on changed files

…holiq deps #7)

Add `from` field support to `x-openregister-aggregations` enabling cross-schema
aggregations. When a spec declares `from: <schema-slug>`, the engine queries the
named schema's table rather than the parent schema's table.

Key changes:
- AggregationRunner: delegate to runCrossSchema() when `from` is present; supports
  @self.<field> parent-reference resolution in `where` clauses; applies eq/ne/gt/gte/
  lt/lte/in operators via the existing applyFilter path; uses Postgres-native fast path
  for the target schema table; double-gates RBAC on both parent and target schemas;
  caches independently per parent-row field values.
- AggregationRunner: adds `select`/`where` aliases for `metric`/`filter` (new DSL) on
  intra-schema specs too; adds optional `parentRow` param to run() for @self resolution.
- AggregationAnnotationValidator: accepts cross-schema specs (`from` key) with lighter
  validation (field-existence skipped for target schema); supports `select`/`where`
  aliases on intra-schema specs.
- Tests: 7 validator unit tests for cross-schema DSL + 6 runner unit tests covering
  @self ref resolution, in-operator filtering, schema-not-found, register-not-found,
  malformed-where, and RBAC gate on target schema.
@github-actions
Copy link
Copy Markdown
Contributor

Quality Report — ConductionNL/openregister @ e34e2fe

Check PHP Vue Security License Tests
lint
phpcs
phpmd
psalm
phpstan
phpmetrics
eslint
stylelint
composer ✅ 153/153
npm ✅ 598/598
PHPUnit
Newman ⏭️
Playwright ⏭️

Quality workflow — 2026-05-11 20:14 UTC

Download the full PDF report from the workflow artifacts.

object: null
) === false
) {
throw new RuntimeException(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BLOCKER] Cross-schema RBAC denial throws RuntimeException → HTTP 404 instead of 403

The cross-schema target-schema RBAC gate throws RuntimeException, which AggregationController catches and returns as HTTP 404. A caller with list permission on the parent schema but not the target receives a 404 — identical to a missing resource — instead of a 403. This is an information-disclosure vector: an attacker can enumerate which cross-schema targets exist by observing 404 vs. other errors. Fix: replace throw new RuntimeException(…'Forbidden…') with throw new NotAuthorizedException(…) to match the parent-schema gate behaviour.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BLOCKER] SchemaMapper mock willReturnMap won't match — likely cause of CI PHPUnit failures on PHP 8.3 + 8.4

AggregationRunner::loadSchema() calls $this->schemaMapper->find($schemaRef) with one argument; PHPUnit's willReturnMap matches against explicitly-passed arguments only. Every test using willReturnMap([['regulation', [], null, true, true, $parentSchema], …]) will miss the match (the mock receives ['regulation'], not ['regulation', [], null, true, true]) and return null, causing every positive test to fail with an unexpected RuntimeException. Fix: switch to ->willReturnCallback(fn($ref) => match($ref) {…}) or use single-element map entries.

$registers = $this->registerMapper->findAll();
foreach ($registers as $register) {
$schemaIds = $register->getSchemas();
if (is_array($schemaIds) === true && in_array($schemaId, $schemaIds, false) === true) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BLOCKER] Loose in_array comparison in findRegisterForSchema — type mismatch silently misses schema

in_array($schemaId, $schemaIds, false) uses loose comparison. 0 == 'any-non-numeric-string' is true in PHP < 8.0 (still supported by NC stable32). Use in_array($schemaId, $schemaIds, true) and normalise the schemas list to int in setSchemas() to eliminate the ambiguity.

*
* @param Schema $schema The target schema to find a register for.
*
* @return Register|null The first matching register, or null.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] findRegisterForSchema fetches all registers on every cross-schema call — unguarded N+1

findRegisterForSchema() calls $this->registerMapper->findAll() without any limit or caching. On a tenant with hundreds of registers this is a full table scan per cross-schema aggregation request. Add a private array $registerCache = [] keyed on schema ID, or pass the already-loaded $parentRegister as a hint when the target schema is in the same register.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] No test exercises the Postgres-native path for cross-schema aggregation

All seven cross-schema runner tests force the PHP-fallback path. The tryNativeAggregation() branch (which builds a parameterised SQL query) is entirely untested at the unit level. Add at least one test that stubs getDatabasePlatform to return a PostgreSQL-named platform and verifies the backend: postgres key in the result.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] testMalformedWhereClauseCastToArrayDoesNotCrash assertion is too weak

The test passes a non-map where string and only asserts assertArrayHasKey('value', $result) — it would pass even if the code returned 1 (bug). Assert $this->assertSame(0, $result['value']) to capture the fail-closed behaviour.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] No test for empty result set from cross-schema aggregation

None of the tests verify behaviour when the target schema's table returns zero rows. For count the expected value is 0; for avg/min/max the expected value is null. Add a test where findAllInRegisterSchemaTable returns [].

): array {
$fromRef = (string) ($spec['from'] ?? '');
if ($fromRef === '') {
throw new RuntimeException('Cross-schema aggregation spec is missing a non-empty `from` key.');
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] runCrossSchema swallows Throwable from native path without logging

tryNativeAggregation() catches \Throwable and returns null to signal fallback. A misconfigured target table will silently degrade to the PHP-fallback path with no observable signal. Add a $this->logger?->debug(…) when the native path falls through for cross-schema calls.

*
* @return array<int, array{code: string, message: string}> Error list (empty = valid).
*/
private function validateCrossSchemaSpec(string $name, array $spec): array
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] validateCrossSchemaSpec does not validate @self reference syntax

validateCrossSchemaSpec accepts any string value in where, including malformed @self references like @self. or bare @self. A typo like @self.slugg passes validation silently and produces a zero-count result that looks correct. Add a pattern check: any string matching /^@self\./ should have a non-empty field name after the dot.

Copy link
Copy Markdown
Contributor

@WilcoLouwerse WilcoLouwerse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

🔴 Blockers (3)

🟡 Concerns (6)

🟢 Minor (1)

  • Three new PHPMD SuppressWarnings added — symptom of growing class complexity (lib/Service/Aggregation/AggregationRunner.php:145)
    @SuppressWarnings(PHPMD.ExcessiveClassLength) at class level plus ExcessiveMethodLength, CyclomaticComplexity, NPathComplexity on runCrossSchema(). runCrossSchema() is 160+ lines with three distinct paths. Consider extracting a CrossSchemaAggregationPipeline class rather than suppressing the warnings.

Reviewed by WilcoLouwerse via automated batch review.

object: null
) === false
) {
throw new RuntimeException(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BLOCKER] Cross-schema RBAC denial throws RuntimeException → HTTP 404 instead of 403

The cross-schema target-schema RBAC gate throws RuntimeException, which AggregationController catches and returns as HTTP 404. A caller with list permission on the parent schema but not the target receives a 404 — identical to a missing resource — instead of a 403. This is an information-disclosure vector: an attacker can enumerate which cross-schema targets exist by observing 404 vs. other errors. Fix: replace throw new RuntimeException(…'Forbidden…') with throw new NotAuthorizedException(…) to match the parent-schema gate behaviour.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BLOCKER] SchemaMapper mock willReturnMap won't match — likely cause of CI PHPUnit failures on PHP 8.3 + 8.4

AggregationRunner::loadSchema() calls $this->schemaMapper->find($schemaRef) with one argument; PHPUnit's willReturnMap matches against explicitly-passed arguments only. Every test using willReturnMap([['regulation', [], null, true, true, $parentSchema], …]) will miss the match (the mock receives ['regulation'], not ['regulation', [], null, true, true]) and return null, causing every positive test to fail with an unexpected RuntimeException. Fix: switch to ->willReturnCallback(fn($ref) => match($ref) {…}) or use single-element map entries.

$registers = $this->registerMapper->findAll();
foreach ($registers as $register) {
$schemaIds = $register->getSchemas();
if (is_array($schemaIds) === true && in_array($schemaId, $schemaIds, false) === true) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BLOCKER] Loose in_array comparison in findRegisterForSchema — type mismatch silently misses schema

in_array($schemaId, $schemaIds, false) uses loose comparison. 0 == 'any-non-numeric-string' is true in PHP < 8.0 (still supported by NC stable32). Use in_array($schemaId, $schemaIds, true) and normalise the schemas list to int in setSchemas() to eliminate the ambiguity.

*
* @param Schema $schema The target schema to find a register for.
*
* @return Register|null The first matching register, or null.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] findRegisterForSchema fetches all registers on every cross-schema call — unguarded N+1

findRegisterForSchema() calls $this->registerMapper->findAll() without any limit or caching. On a tenant with hundreds of registers this is a full table scan per cross-schema aggregation request. Add a private array $registerCache = [] keyed on schema ID, or pass the already-loaded $parentRegister as a hint when the target schema is in the same register.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] No test exercises the Postgres-native path for cross-schema aggregation

All seven cross-schema runner tests force the PHP-fallback path. The tryNativeAggregation() branch (which builds a parameterised SQL query) is entirely untested at the unit level. Add at least one test that stubs getDatabasePlatform to return a PostgreSQL-named platform and verifies the backend: postgres key in the result.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] testMalformedWhereClauseCastToArrayDoesNotCrash assertion is too weak

The test passes a non-map where string and only asserts assertArrayHasKey('value', $result) — it would pass even if the code returned 1 (bug). Assert $this->assertSame(0, $result['value']) to capture the fail-closed behaviour.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] No test for empty result set from cross-schema aggregation

None of the tests verify behaviour when the target schema's table returns zero rows. For count the expected value is 0; for avg/min/max the expected value is null. Add a test where findAllInRegisterSchemaTable returns [].

): array {
$fromRef = (string) ($spec['from'] ?? '');
if ($fromRef === '') {
throw new RuntimeException('Cross-schema aggregation spec is missing a non-empty `from` key.');
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] runCrossSchema swallows Throwable from native path without logging

tryNativeAggregation() catches \Throwable and returns null to signal fallback. A misconfigured target table will silently degrade to the PHP-fallback path with no observable signal. Add a $this->logger?->debug(…) when the native path falls through for cross-schema calls.

*
* @return array<int, array{code: string, message: string}> Error list (empty = valid).
*/
private function validateCrossSchemaSpec(string $name, array $spec): array
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] validateCrossSchemaSpec does not validate @self reference syntax

validateCrossSchemaSpec accepts any string value in where, including malformed @self references like @self. or bare @self. A typo like @self.slugg passes validation silently and produces a zero-count result that looks correct. Add a pattern check: any string matching /^@self\./ should have a non-empty field name after the dot.

Copy link
Copy Markdown
Contributor

@WilcoLouwerse WilcoLouwerse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

🔴 Blockers (3)

🟡 Concerns (6)

🟢 Minor (1)

  • Three new PHPMD SuppressWarnings added — symptom of growing class complexity (lib/Service/Aggregation/AggregationRunner.php:145)
    @SuppressWarnings(PHPMD.ExcessiveClassLength) at class level plus ExcessiveMethodLength, CyclomaticComplexity, NPathComplexity on runCrossSchema(). runCrossSchema() is 160+ lines with three distinct paths. Consider extracting a CrossSchemaAggregationPipeline class rather than suppressing the warnings.

Reviewed by WilcoLouwerse via automated batch review.

Copy link
Copy Markdown
Contributor

@WilcoLouwerse WilcoLouwerse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review verdict

3 blockers: RBAC returns 404 instead of 403, PHPUnit mock mismatch, loose in_array. See inline comments above.

Reviewed by WilcoLouwerse — inline comments posted above.

@WilcoLouwerse WilcoLouwerse closed this pull request by merging all changes into development in f86e83f May 15, 2026
@WilcoLouwerse WilcoLouwerse deleted the feat/scholiq-deps/cross-schema-aggregations branch May 15, 2026 09:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants