Skip to content

feat(pdok): PDOK Locatieserver adapter (closes #751)#752

Open
rubenvdlinde wants to merge 1 commit into
developmentfrom
feature/add-pdok-adapter
Open

feat(pdok): PDOK Locatieserver adapter (closes #751)#752
rubenvdlinde wants to merge 1 commit into
developmentfrom
feature/add-pdok-adapter

Conversation

@rubenvdlinde
Copy link
Copy Markdown
Contributor

Implements add-pdok-adapter — the openconnector subset of the hydra umbrella shared-pdok-via-openconnector.

What

  • PdokConnector (lib/Connectors/PdokConnector.php) — server-side proxy for PDOK Locatieserver v3.1 with suggest/lookup/free/reverse, APCu caching, 429 exponential backoff (3 retries, cap 5000ms, ±10% jitter), 5-failure circuit breaker (30s open window), write-through to OR's addresses register, and structured observability logging (one entry per upstream call).
  • PdokController — four REST endpoints, NoAdminRequired + NoCSRFRequired, parameter validation returning 400 on missing q/id/coords, 503 with message_key: pdok.unavailable when degraded.
  • Routes /api/pdok/{suggest|lookup|free|reverse} in appinfo/routes.php.
  • i18n keys in l10n/{en,nl}.{js,json}.
  • PHPUnit tests covering normalisation, cache hit, circuit-open short-circuit, parameter validation.
  • Three raw PDOK fixtures (Conduction HQ, Stadhuisplein Tilburg, woonplaats Tilburg).
  • Strict-validated openspec change with all OC tasks marked done.

Dependencies

Depends on OR's addresses register from openregister#1483 being available at runtime (write-through gracefully no-ops if OR isn't installed).

Test plan

  • Quality gate (PHPCS/Psalm/PHPStan) green in CI
  • Unit tests run green once a proper PHPUnit bootstrap lands
  • Manual smoke: curl -u admin:admin http://localhost:8080/index.php/apps/openconnector/api/pdok/suggest?q=Lauriergra

Spec

openspec/changes/add-pdok-adapter/ — strict-validates clean.

Closes #751.

Implements openspec change add-pdok-adapter — the openconnector subset of
the Hydra umbrella shared-pdok-via-openconnector. Adds:

- lib/Connectors/PdokConnector.php — server-side PDOK Locatieserver v3.1
  proxy with suggest/lookup/free/reverse, APCu caching, 429 backoff,
  5-failure circuit breaker, write-through to OR's addresses register,
  and structured observability logging.
- lib/Connectors/PdokUpstreamException.php — carries upstream HTTP status.
- lib/Controller/PdokController.php — REST endpoints with NoAdminRequired
  + NoCSRFRequired, parameter validation returning 400 on missing q/id/coords.
- appinfo/routes.php — four GET routes at /api/pdok/{suggest|lookup|free|reverse}.
- l10n/{en,nl}.{js,json} — four i18n keys (pdok.unavailable,
  pdok.error.missing_query, pdok.error.not_found, pdok.error.missing_coordinates).
- tests/Unit/Connectors/PdokConnectorTest.php — normalisation, cache hit,
  circuit-open short-circuit, empty-query short-circuit, successful free search.
- tests/Unit/Controller/PdokControllerTest.php — parameter validation, 400 envelopes,
  delegation to connector.
- tests/fixtures/pdok/ — three raw PDOK fixtures (Lauriergracht HQ,
  Stadhuisplein Tilburg, woonplaats Tilburg with no postcode/huisnummer).
- openspec/changes/add-pdok-adapter/ — strict-validated spec + tasks all marked done.

Per ADR-022: apps consume the PDOK adapter via openconnector; OR addresses
register is the canonical read path. Per ADR-005: requireLogin on every endpoint.
Per ADR-007: Dutch + English i18n. Per ADR-008: PHPUnit coverage for connector +
controller paths.

Implements #751
@github-actions
Copy link
Copy Markdown
Contributor

Quality Report — ConductionNL/openconnector @ ffcc948

Check PHP Vue Security License Tests
lint
phpcs
phpmd
psalm
phpstan
phpmetrics
eslint
stylelint
composer ✅ 148/148
npm ❌ 1/573 denied
PHPUnit ⏭️
Newman ⏭️
Playwright ⏭️

❌ Denied npm licenses

Package Version License
@fortawesome/free-solid-svg-icons 6.7.2 (CC-BY-4.0 AND MIT)

Quality workflow — 2026-05-11 21:40 UTC

Download the full PDF report from the workflow artifacts.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BLOCKER] NoCSRFRequired on all 4 actions without justification — policy violation

All four action methods carry #[NoCSRFRequired]. Because these are authenticated GET endpoints that proxy user-supplied parameters, CSRF protection is low-cost and omitting it is a policy violation for Nextcloud apps. GET requests that trigger write-through side-effects (writing to OR's addresses register) MUST NOT skip CSRF checks. Remove #[NoCSRFRequired] from all four actions and let the Nextcloud framework enforce the token header. If the endpoints must be callable from a non-browser context (e.g. API token auth), use #[CORS] and the BruteForce attributes instead.

if ($this->cache === null) {
return;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BLOCKER] strtotime() on untrusted OR data can return false — always treats OR record as stale

strtotime(($docs[0]['fetchedAt'] ?? '1970-01-01T00:00:00Z')) will return false when fetchedAt is an unrecognised string from the OR store. time() - false evaluates to time() (a very large int), so the OR record is always treated as stale and upstream PDOK is always called — silently breaking the OR-hit cache path. Fix: $fetchedAt = strtotime($docs[0]['fetchedAt'] ?? ''); if ($fetchedAt === false || (time() - $fetchedAt) > self::TTL_RESOLVED) { return null; }

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BLOCKER] 503/404 status logic is inverted — zero results with stale=true returns 503, without returns 404

The two numFound === 0 branches check empty($payload['stale']) first and return 503 when stale is NOT set (the plain upstream-empty case), then return 404 when stale IS set. That is backwards: an upstream empty result should be 404; a circuit-open/rate-limit result (stale=true) should be 503. Fix: swap the condition — if ($numFound === 0 && !empty($payload['stale'])) { return 503; } if ($numFound === 0) { return 404; }. The same bug exists in reverseAction.



/**
* Map a single PDOK document to the canonical PostalAddress shape.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BLOCKER] 429 retry loop does not call circuitOnFailure() per retry — breaker won't open on repeated 429s

On 429 responses, circuitOnFailure() is only called at exhaustion (after all retries), not per-retry. Five distinct requests each exhausting their 3 retries on 429s will accumulate only 5 circuitOnFailure() calls total — but the exhaustion-per-request logic should accumulate them, not the per-retry loop. As a result the circuit breaker may not open after repeated rate-limit exhaustions from different requests as the spec requires.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] NoAdminRequired but no requireLogin() call — spec requires requireLogin() guard

#[NoAdminRequired] only removes the admin requirement but does not enforce that the user is logged in at all. The spec, proposal, design, and tasks all explicitly state all four routes MUST use requireLogin(). Add $this->requireLogin() at the start of each action method, or verify that the absence of #[PublicPage] alone is sufficient for the targeted Nextcloud versions.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] usleep() up to 5000ms blocks the PHP-FPM worker — cascading request failure risk

usleep($delay * 1000) where $delay can be up to 5000ms will block the entire PHP-FPM worker thread for up to 5 seconds per retry, multiplied by up to 3 retries = 15 seconds. On a busy server this exhausts the worker pool and causes cascading request failures. Reduce BACKOFF_CAP_MS significantly (e.g., 1000ms) or document this as a known resource constraint.

if ($this->cache === null) {
return;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] writeThrough always upserts — should skip write when OR record is still fresh

writeThrough() unconditionally calls $or->saveObject(...) for every successful upstream fetch. The spec requires: 'if present and fetchedAt older than 1 hour, update'. Add a pre-check: call orLookup() first and skip the write if a fresh record exists.

throw $e;
} catch (Throwable $e) {
$this->circuitOnFailure();
throw new PdokUpstreamException(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] OR hit logged with cacheHit=true instead of orHit=true — wrong metrics

Line 322: logCall($endpoint, true, true, ...) on the OR-hit path passes cacheHit=true and orHit=true. The first boolean should be false (no APCu cache hit). Fix: $this->logCall($endpoint, false, true, null, null, $this->circuitState(), false) on the OR-hit path.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] No test for 429-exhaustion→503, write-through, or stale-OR-record paths

Missing coverage for: three 429s → exception (task OC-5.1), write-through create/update (OC-4.1), stale header (OC-7.1), and logging assertions (OC-9.1). The 429 retry loop and write-through logic have no automated verification.

*/
private function orLookup(string $endpoint, array $params): ?array
{
$or = $this->getOpenRegisterObjectService();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] WKT regex character class [-\d.]+ matches malformed inputs like --4.8 or 4..8

[-\d.]+ in the POINT regex is a character class, not a structured float pattern. It matches strings like --4.8 or 4..8. Use a stricter pattern: /POINT\((-?\d+\.\d+)\s+(-?\d+\.\d+)\)/i.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONCERN] loadFixture() does not check for file_get_contents failure — silently errors on missing fixture

If the fixture file does not exist, file_get_contents returns false, (string) false is '', and json_decode('') returns null. The subsequent $fixture['response'] access throws a PHP fatal. Add: $this->assertFileExists($path, "Fixture $name not found"); return json_decode(file_get_contents($path), true, 512, JSON_THROW_ON_ERROR);

Copy link
Copy Markdown
Contributor

@WilcoLouwerse WilcoLouwerse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

🔴 Blockers (4)

🟡 Concerns (8)

🟢 Minor (4)

  • bagBuildingId maps pandid array directly — PDOK returns pandid as a JSON array (lib/Connectors/PdokConnector.php:492)
    In the fixture, pandid is ["0363100012170432"] (a JSON array). The normalize method does $pdokDoc['pandid'] ?? null which stores the entire array as bagBuildingId instead of the first element. Fix: $pandid = $pdokDoc['pandid'] ?? null; 'bagBuildingId' => is_array($pandid) ? ($pandid[0] ?? null) : $pandid.
  • BASE_URL hardcodes v3_1 path — should be a named constant (lib/Connectors/PdokConnector.php:134)
    PDOK Locatieserver versioning is done via URL path (v3_1). When PDOK releases v4, all deployments require a code change. Extract the version segment to a named constant API_VERSION = 'v3_1' and construct BASE_URL from it, or document this as a known limitation with a follow-up issue reference.
  • $rows and $start not validated — caller can request unlimited rows from PDOK (lib/Controller/PdokController.php:1021)
    freeAction(string $q = '', int $rows = 10, int $start = 0) passes $rows and $start directly from the HTTP request to PDOK. A caller could send rows=10000 causing a large upstream response and memory pressure. Add: $rows = max(1, min($rows, 100)); $start = max(0, $start); in the controller before delegation.
  • reverse() does not validate coordinate ranges before forwarding to PDOK (lib/Connectors/PdokConnector.php:287)
    Latitude must be in [-90, 90] and longitude in [-180, 180]. Invalid values are forwarded to PDOK which may return a confusing error. Add range validation: if ($lat < -90 || $lat > 90 || $lng < -180 || $lng > 180) { return ['docs' => [], 'numFound' => 0, 'error' => 'invalid_coordinates']; }.

Reviewed by WilcoLouwerse via automated batch review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants