Skip to content

remove the hardcoded dataset name and use the api to get the name#1081

Merged
pooleycodes merged 24 commits into
mainfrom
update-dataset-name-reference
Oct 29, 2025
Merged

remove the hardcoded dataset name and use the api to get the name#1081
pooleycodes merged 24 commits into
mainfrom
update-dataset-name-reference

Conversation

@feba-rajan
Copy link
Copy Markdown
Contributor

@feba-rajan feba-rajan commented Aug 6, 2025

Description

PR updates the providers service to use the planning data api to get dataset names instead of hardcoded text.

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update

Related Tickets & Documents

QA Instructions, Screenshots, Recordings

Before

Before screenshot here

After

After screenshot here

Added/updated tests?

We encourage you to keep the code coverage percentage at 80% and above.

  • Yes
  • No, and this is why: Please replace this line with details on why tests have not been included
  • I need help with writing tests

QA sign off

  • Code has been checked and approved
  • Design has been checked and approved
  • Product and business logic has been checked and proved

[optional] Are there any post-deployment tasks we need to perform?

[optional] Are there any dependencies on other PRs or Work?

Summary by CodeRabbit

  • New Features

    • Dataset names now load dynamically from the live service and are displayed across forms, deep‑links and navigation.
  • Improvements

    • Server‑side Redis caching added with graceful fallback to live fetch for faster, more resilient name lookups.
    • Form flows and deep‑link validation use dynamic dataset metadata; routing to geometry selection now relies on a session flag.
    • Status page now determines completion client‑side.
  • Tests

    • Unit and acceptance tests expanded to cover dynamic name resolution and updated flows.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Aug 6, 2025

Coverage Report

Status Category Percentage Covered / Total
🔵 Lines 66.1% 6198 / 9376
🔵 Statements 66.1% 6198 / 9376
🔵 Functions 65.01% 262 / 403
🔵 Branches 80.21% 823 / 1026
File Coverage
File Stmts Branches Functions Lines Uncovered Lines
Changed Files
src/assets/js/statusPage.js 91.42% 100% 100% 91.42% 36-37, 77-80
src/controllers/CheckAnswersController.js 90% 84.21% 100% 90% 64-67, 71-74, 86-92
src/controllers/checkDeepLinkController.js 84.37% 66.66% 100% 84.37% 24-35, 54-56
src/controllers/datasetController.js 87.03% 66.66% 100% 87.03% 38-40, 56-59
src/controllers/endpointSubmissionFormDeepLinkController.js 80% 71.42% 100% 80% 14-25
src/middleware/common.middleware.js 66.3% 93.33% 40.74% 66.3% 29-38, 42, 58-59, 61-62, 74-76, 80, 96-103, 162-175, 181-203, 336, 345-347, 377-379, 427-468, 508-512, 517-520, 532-536, 564-578, 585-610, 616-624, 628-644, 713-715, 788-795, 809-864, 895-896, 899-902, 905-918, 943-948, 958-961
src/middleware/middleware.builders.js 76.65% 85.71% 69.56% 76.65% 49-51, 54, 123-134, 155-179, 196-216, 229-242, 255-258, 308-309, 390-393, 414-422
src/routes/form-wizard/check/steps.js 0% 0% 0% 0% 1-138
src/utils/datasetLoader.js 47.94% 57.89% 100% 47.94% 11-26, 28-34, 52-53, 70-79, 85-90
src/utils/utils.js 100% 90.9% 100% 100%
Generated in workflow #1247 for commit d74b059 by the Vitest Coverage Report Action

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Aug 6, 2025

Walkthrough

Adds a Redis-backed dataset name cache and dynamic dataset name resolution; converts static dataset exports into async builders; updates controllers, middleware, routes, tests, and page objects to use async dataset retrieval and StartPage-driven acceptance navigation.

Changes

Cohort / File(s) Summary
Dataset caching layer
src/utils/datasetLoader.js
New module: getRedisClient() (lazy Redis client with lifecycle handling), fetchDatasetNames() (fetch /dataset.json and validate response), and getDatasetNameMap() (Redis-backed cache with 300s TTL and live-fetch fallback).
Utils module refactoring
src/utils/utils.js
Replaced static dataSubjects/datasets exports with async builders: buildDataSubjects(), getDataSubjects(), getDatasets(); imports getDatasetNameMap to populate dataset texts dynamically.
Controllers async conversion
src/controllers/CheckAnswersController.js, src/controllers/checkDeepLinkController.js, src/controllers/datasetController.js, src/controllers/endpointSubmissionFormDeepLinkController.js
Replaced static imports with getDataSubjects()/getDatasets() and awaited results; made relevant handlers/methods async; dataset metadata resolved from awaited maps; geometry-requirement checks precomputed and stored in session.
Middleware async updates
src/middleware/common.middleware.js, src/middleware/middleware.builders.js
Converted available-dataset computation to async: added getAvailableDatasets() and changed setAvailableDatasets / CONSTANTS.availableDatasets to async flows using getDataSubjects().
Form wizard routing
src/routes/form-wizard/check/steps.js
Replaced function-based guards with session boolean requiresGeometryTypeSelection === true for dataset/link steps; adjusted imports accordingly.
Client status script
src/assets/js/statusPage.js
Removed server dependency for finished statuses and declared local finishedProcessingStatuses = ['COMPLETE','FAILED'].
Unit tests for caching
test/unit/datasetLoader.test.js
New Vitest suite for getDatasetNameMap() validating URL construction, headers, JSON parsing and error handling.
Unit tests async updates & mocks
test/unit/util.test.js, test/unit/datasetController.test.js, test/unit/checkDeepLinkController.test.js, test/unit/endpointSubmissionFormLinkController.test.js, test/unit/statusPage.test.js
Updated mocks to async-aware wrappers, awaited controller methods in tests, and enhanced fetch mock for /dataset.json.
Acceptance test restructuring
test/acceptance/request_check.test.js, test/acceptance/dataset_overview.test.js
Migrated flows to StartPage entry and verifyAndReturnPage() navigation; added acceptance test asserting displayed dataset names match dynamically provided name map.
Page object models
test/PageObjectModels/statusPage.js, test/PageObjectModels/startPage.js
Added isCheckStatusButtonVisible() to StatusPage; added StartPage default export with verifyAndReturnPage(pageClass) for navigation.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Controller as Controller
    participant Utils as utils.js
    participant Loader as datasetLoader.js
    participant Redis
    participant API as remote /dataset.json

    Controller->>Utils: await getDatasets()
    Utils->>Utils: await getDataSubjects()
    Utils->>Loader: getDatasetNameMap(datasetKeys)

    Loader->>Redis: GET dataset:<sorted_keys>
    alt Cache hit
        Redis-->>Loader: cached JSON map
    else Cache miss / Redis unavailable
        Loader->>API: GET /dataset.json?dataset=...&field=dataset&field=name&include_typologies=false
        API-->>Loader: { datasets: [...] }
        alt Redis available
            Loader->>Redis: SET dataset:<sorted_keys> (TTL 300s)
        end
    end

    Loader-->>Utils: nameMap
    Utils-->>Controller: datasets Map (with dataset texts)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Areas needing extra attention:

  • Redis lifecycle, connection error handling and reset logic in src/utils/datasetLoader.js.
  • Async propagation and await usage across controllers/middleware to avoid runtime ordering or deadlock issues.
  • Updated test mocks and async expectations in test/unit/* (ensure mocks mirror new async APIs).
  • Acceptance flow changes: StartPage / verifyAndReturnPage and updated page objects.

Possibly related PRs

Suggested labels

enhancement

Suggested reviewers

  • DilwoarH

Poem

🐇 I nibble keys and hop through cache so spry,

Redis hums a secret name nearby.
Async carrots line the coder's way,
Tests and StartPage lead the playful fray,
A rabbit cheers for datasets on display!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "remove the hardcoded dataset name and use the api to get the name" is directly aligned with the main objective of this changeset. The PR's core purpose, as confirmed by the PR objectives and raw summary, is to replace hardcoded dataset names with dynamically fetched names from the planning data API endpoint. The title accurately captures this intent in a clear, single-sentence format without vague terminology. While the changeset includes additional improvements like Redis caching for performance and updates to multiple controllers and middleware files, the title appropriately focuses on the primary change without requiring coverage of every implementation detail.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch update-dataset-name-reference

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7eba92c and 766b2a9.

📒 Files selected for processing (5)
  • src/utils/datasetLoader.js (1 hunks)
  • src/utils/utils.js (1 hunks)
  • test/unit/datasetLoader.test.js (1 hunks)
  • test/unit/statusPage.test.js (1 hunks)
  • test/unit/util.test.js (1 hunks)
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#510
File: test/unit/middleware/datasetOverview.middleware.test.js:1-3
Timestamp: 2024-10-21T08:57:17.338Z
Learning: Tests for `pullOutDatasetSpecification` have been moved to `common.middleware.test`.
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#593
File: index.js:20-20
Timestamp: 2024-10-29T14:58:22.843Z
Learning: The function `initDatasetSlugToReadableNameFilter()` in `src/utils/datasetSlugToReadableName.js` already includes appropriate error handling and logging.
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#593
File: test/unit/lpaDetailsController.test.js:20-20
Timestamp: 2024-10-29T15:06:42.432Z
Learning: The function `initDatasetSlugToReadableNameFilter()` includes its own error handling, so additional try-catch blocks around it in test files are unnecessary.
📚 Learning: tests for `pulloutdatasetspecification` have been moved to `common.middleware.test`....
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#510
File: test/unit/middleware/datasetOverview.middleware.test.js:1-3
Timestamp: 2024-10-21T08:57:17.338Z
Learning: Tests for `pullOutDatasetSpecification` have been moved to `common.middleware.test`.

Applied to files:

  • test/unit/statusPage.test.js
  • test/unit/datasetLoader.test.js
  • test/unit/util.test.js
📚 Learning: in `test/unit/noerrorspage.test.js`, avoid using `prettifycolumnname` for column headers because the...
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#609
File: test/unit/noErrorsPage.test.js:68-68
Timestamp: 2024-11-12T10:54:09.485Z
Learning: In `test/unit/noErrorsPage.test.js`, avoid using `prettifyColumnName` for column headers because the table can contain spec fields, as requested by Alex.

Applied to files:

  • test/unit/statusPage.test.js
  • test/unit/datasetLoader.test.js
  • test/unit/util.test.js
📚 Learning: in `test/unit/util.test.js`, avoid using iso 8601 date strings (e.g., `'2024-05-15t00:00:00.000z'`) ...
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#644
File: test/unit/util.test.js:72-74
Timestamp: 2024-11-13T14:25:37.041Z
Learning: In `test/unit/util.test.js`, avoid using ISO 8601 date strings (e.g., `'2024-05-15T00:00:00.000Z'`) in tests, as they cause issues with daylight saving. Instead, use date strings like `'Tue, 15 May 2024 00:00:00'`.

Applied to files:

  • test/unit/statusPage.test.js
  • test/unit/datasetLoader.test.js
  • test/unit/util.test.js
📚 Learning: the function `initdatasetslugtoreadablenamefilter()` includes its own error handling, so additional ...
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#593
File: test/unit/lpaDetailsController.test.js:20-20
Timestamp: 2024-10-29T15:06:42.432Z
Learning: The function `initDatasetSlugToReadableNameFilter()` includes its own error handling, so additional try-catch blocks around it in test files are unnecessary.

Applied to files:

  • test/unit/statusPage.test.js
  • test/unit/datasetLoader.test.js
  • test/unit/util.test.js
  • src/utils/datasetLoader.js
📚 Learning: in `test/unit/middleware/issuedetails.middleware.test.js`, template params are verified with a schem...
Learnt from: rosado
PR: digital-land/submit#657
File: test/unit/middleware/issueDetails.middleware.test.js:43-43
Timestamp: 2024-11-14T16:38:49.883Z
Learning: In `test/unit/middleware/issueDetails.middleware.test.js`, template params are verified with a schema, so it's acceptable for the test expectations to use primitive values while the test input uses an object for `issueEntitiesCount`.

Applied to files:

  • test/unit/statusPage.test.js
  • test/unit/util.test.js
📚 Learning: the function `initdatasetslugtoreadablenamefilter()` in `src/utils/datasetslugtoreadablename.js` alr...
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#593
File: index.js:20-20
Timestamp: 2024-10-29T14:58:22.843Z
Learning: The function `initDatasetSlugToReadableNameFilter()` in `src/utils/datasetSlugToReadableName.js` already includes appropriate error handling and logging.

Applied to files:

  • test/unit/datasetLoader.test.js
  • src/utils/datasetLoader.js
  • src/utils/utils.js
🧬 Code Graph Analysis (4)
test/unit/datasetLoader.test.js (2)
src/utils/datasetLoader.js (1)
  • getDatasetNameMap (4-12)
config/index.js (1)
  • config (13-13)
test/unit/util.test.js (1)
src/utils/datasetLoader.js (1)
  • getDatasetNameMap (4-12)
src/utils/datasetLoader.js (1)
config/index.js (1)
  • config (13-13)
src/utils/utils.js (2)
src/utils/datasetLoader.js (1)
  • getDatasetNameMap (4-12)
test/unit/util.test.js (1)
  • dataSubjects (15-15)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: run-tests / test
🔇 Additional comments (3)
test/unit/util.test.js (1)

5-29: LGTM!

The test setup correctly mocks the new async dataset loading functionality and maintains backwards compatibility with existing tests.

test/unit/datasetLoader.test.js (1)

5-40: LGTM!

Well-structured tests with good coverage of both success and error scenarios. Proper mock setup and cleanup.

src/utils/utils.js (1)

101-101: Top-level await is fully supported

  • package.json specifies engines.node: ">=22.5.1", well above the Node.js 14.8+ requirement for top-level await.
  • No Babel, Webpack, Rollup or TypeScript transpilation configurations were found that would interfere with ESM or top-level await.

This usage is safe under the project’s runtime and build setup.

Comment thread src/utils/datasetLoader.js Outdated
Comment thread src/utils/datasetLoader.js Outdated
Comment thread src/utils/utils.js
Comment thread src/utils/utils.js
Comment thread test/unit/statusPage.test.js
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/acceptance/dataset_overview.test.js (1)

80-102: Improve test completeness and robustness.

The test correctly validates the integration between dynamic dataset loading and UI display, but consider these improvements:

  1. Test coverage: The test only validates 2 specific datasets whilst line 26 indicates 8 ODP mandatory datasets should be displayed. Consider testing all displayed datasets or at least a larger representative sample.

  2. Avoid hardcoded dataset list: The expectedDatasets array could become outdated. Consider extracting all dataset keys from the name map or making the test more dynamic.

Here's a more comprehensive approach:

  test('Dataset names shown dynamically match names fetched from datasetLoader', async ({ page }) => {
    const organisationId = 'local-authority:LBH'
    const nameMap = await getDatasetNameMap()
    const organisationOverviewPage = new OrganisationOverviewPage(page, { organisationId })
    await organisationOverviewPage.navigateHere()

    expect(await page.locator('h1').innerText()).toEqual('London Borough of Lambeth overview')
    const datasetElements = await page.locator('[data-testid=datasetsOdpMandatory] .govuk-task-list li').allInnerTexts()
-   const expectedDatasets = [
-     'article-4-direction',
-     'article-4-direction-area'
-   ]
-   const expectedNames = expectedDatasets
-     .map(key => nameMap[key])

    const displayedNames = datasetElements
      .map(text => text.split('\n')[0].trim())
-     .sort()

-   for (const expectedName of expectedNames) {
-     expect(displayedNames).toContain(expectedName)
-   }
+   // Verify that all displayed names exist in the name map
+   for (const displayedName of displayedNames) {
+     const matchingKey = Object.keys(nameMap).find(key => nameMap[key] === displayedName)
+     expect(matchingKey).toBeDefined()
+   }
+   
+   // Verify we have the expected number of datasets
+   expect(displayedNames.length).toEqual(8)
  })
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f063b3f and 484be1a.

📒 Files selected for processing (1)
  • test/acceptance/dataset_overview.test.js (2 hunks)
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#510
File: test/unit/middleware/datasetOverview.middleware.test.js:1-3
Timestamp: 2024-10-21T08:57:17.338Z
Learning: Tests for `pullOutDatasetSpecification` have been moved to `common.middleware.test`.
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#593
File: index.js:20-20
Timestamp: 2024-10-29T14:58:22.843Z
Learning: The function `initDatasetSlugToReadableNameFilter()` in `src/utils/datasetSlugToReadableName.js` already includes appropriate error handling and logging.
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#593
File: test/unit/lpaDetailsController.test.js:20-20
Timestamp: 2024-10-29T15:06:42.432Z
Learning: The function `initDatasetSlugToReadableNameFilter()` includes its own error handling, so additional try-catch blocks around it in test files are unnecessary.
📚 Learning: tests for `pulloutdatasetspecification` have been moved to `common.middleware.test`....
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#510
File: test/unit/middleware/datasetOverview.middleware.test.js:1-3
Timestamp: 2024-10-21T08:57:17.338Z
Learning: Tests for `pullOutDatasetSpecification` have been moved to `common.middleware.test`.

Applied to files:

  • test/acceptance/dataset_overview.test.js
📚 Learning: the function `initdatasetslugtoreadablenamefilter()` includes its own error handling, so additional ...
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#593
File: test/unit/lpaDetailsController.test.js:20-20
Timestamp: 2024-10-29T15:06:42.432Z
Learning: The function `initDatasetSlugToReadableNameFilter()` includes its own error handling, so additional try-catch blocks around it in test files are unnecessary.

Applied to files:

  • test/acceptance/dataset_overview.test.js
📚 Learning: in `test/unit/noerrorspage.test.js`, avoid using `prettifycolumnname` for column headers because the...
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#609
File: test/unit/noErrorsPage.test.js:68-68
Timestamp: 2024-11-12T10:54:09.485Z
Learning: In `test/unit/noErrorsPage.test.js`, avoid using `prettifyColumnName` for column headers because the table can contain spec fields, as requested by Alex.

Applied to files:

  • test/acceptance/dataset_overview.test.js
📚 Learning: the function `initdatasetslugtoreadablenamefilter()` in `src/utils/datasetslugtoreadablename.js` alr...
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#593
File: index.js:20-20
Timestamp: 2024-10-29T14:58:22.843Z
Learning: The function `initDatasetSlugToReadableNameFilter()` in `src/utils/datasetSlugToReadableName.js` already includes appropriate error handling and logging.

Applied to files:

  • test/acceptance/dataset_overview.test.js
📚 Learning: in `test/unit/middleware/issuedetails.middleware.test.js`, template params are verified with a schem...
Learnt from: rosado
PR: digital-land/submit#657
File: test/unit/middleware/issueDetails.middleware.test.js:43-43
Timestamp: 2024-11-14T16:38:49.883Z
Learning: In `test/unit/middleware/issueDetails.middleware.test.js`, template params are verified with a schema, so it's acceptable for the test expectations to use primitive values while the test input uses an object for `issueEntitiesCount`.

Applied to files:

  • test/acceptance/dataset_overview.test.js
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: run-tests / test
🔇 Additional comments (1)
test/acceptance/dataset_overview.test.js (1)

4-4: LGTM!

Clean import addition that follows the existing pattern and correctly imports the required utility function.

@eveleighoj
Copy link
Copy Markdown
Contributor

So I've been looking though this and battling with the fact that my laptop is slowly starting to let me down i terms of power :( but the error is specifically happening when we are running the new request and it's making the test time out. The unit test is good as it shows that the functionality works and the acceptance test is useful because it highlights that the performance is the problem. Speccificallly we manually set a Timout for each test of 300000ms which is 5 minutes.

I'm currently running some tests in github (gave up on my local machine) to see if expanding the timeout can make the test pass. This implies it's a time issue and not a different performance issue.

In terms of solving it I think there are two things to implement ragardless and should hopefully reduce the timeout:

  • I don't think this is being cached in redis but I can imagine this list is used frequently across the application and each time it's accessed it's probably redownloading eveything. Can we look at using caching to stop the need to constantly retrieve the data?
  • The request is pulling a huge amount of data if you look at this I think we can make improvements to the dataset API endpoint to allow filtering of returned fields and filtering of dataset. I believe this is already done on the entity api. This would massively reduce the volume of data being retrieved. Shall we create a separate ticket to do this piece of work first?

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)
src/controllers/CheckAnswersController.js (1)

67-71: Fail-safe when dataset metadata is unavailable

If the API/cache fails or the dataset key is missing, collection becomes undefined and may break postUrlRequest. Add a guarded fallback and log at warn level.

-    const datasets = await getDatasets()
-    const dataset = req.sessionModel.get('dataset')
-    const datasetMeta = datasets.get(dataset) || {}
+    const dataset = req.sessionModel.get('dataset')
+    let datasetMeta = {}
+    try {
+      const datasets = await getDatasets()
+      datasetMeta = datasets.get(dataset) || {}
+    } catch (err) {
+      logger.warn('CheckAnswersController.createJiraServiceRequest(): dataset metadata unavailable; proceeding with fallback', {
+        errorMessage: err.message, type: types.External
+      })
+    }
@@
-      collection: datasetMeta.dataSubject,
+      collection: datasetMeta.dataSubject || dataset,

Also applies to: 84-89

src/controllers/checkDeepLinkController.js (1)

45-67: Make async Express handler safe: wrap in try/catch and return the parent call

Express 4 does not capture promise rejections from async handlers. If getDatasets() throws, the error will be unhandled. Also, returning super.post(...) preserves control flow.

-  async get (req, res, next) {
-    // if the query params don't contain what we need, redirect to the "get started" page,
-    // this way the user can still proceed (but need to fill the dataset+orgName themselves)
-    const { dataset, orgName, orgId } = req.query
-    const datasets = await getDatasets()
-    const validationResult = v.safeParse(QueryParams, req.query)
-    if (!(validationResult.success && datasets.has(dataset))) {
-      logger.info('DeepLinkController.get(): invalid params for deep link, redirecting to landing page',
-        { type: types.App, query: req.query })
-      return res.redirect('/')
-    }
-    req.sessionModel.set('dataset', dataset)
-    const datasetInfo = datasets.get(dataset) ?? { dataSubject: '', requiresGeometryTypeSelection: false }
-    req.sessionModel.set('data-subject', datasetInfo.dataSubject)
-    const sessionData = { 'data-subject': datasetInfo.dataSubject, orgName, orgId, dataset, datasetName: datasetInfo.text }
-    maybeSetReferrer(req, sessionData)
-    req.sessionModel.set(this.sessionKey, sessionData)
-    this.#addHistoryStep(req, '/check/dataset')
-    super.post(req, res, next)
-  }
+  async get (req, res, next) {
+    try {
+      // if the query params don't contain what we need, redirect to the "get started" page,
+      // this way the user can still proceed (but need to fill the dataset+orgName themselves)
+      const { dataset, orgName, orgId } = req.query
+      const datasets = await getDatasets()
+      const validationResult = v.safeParse(QueryParams, req.query)
+      if (!(validationResult.success && datasets.has(dataset))) {
+        logger.info('CheckDeepLinkController.get(): invalid params for deep link, redirecting to landing page',
+          { type: types.App, query: req.query })
+        return res.redirect('/')
+      }
+      req.sessionModel.set('dataset', dataset)
+      const datasetInfo = datasets.get(dataset) ?? { dataSubject: '', requiresGeometryTypeSelection: false }
+      req.sessionModel.set('data-subject', datasetInfo.dataSubject)
+      const sessionData = { 'data-subject': datasetInfo.dataSubject, orgName, orgId, dataset, datasetName: datasetInfo.text ?? dataset }
+      maybeSetReferrer(req, sessionData)
+      req.sessionModel.set(this.sessionKey, sessionData)
+      this.#addHistoryStep(req, '/check/dataset')
+      return super.post(req, res, next)
+    } catch (err) {
+      req.handlerName = 'CheckDeepLinkController.get'
+      return next(err)
+    }
+  }
src/middleware/middleware.builders.js (2)

154-172: Use cached dataset list instead of top-level value

Avoid referencing a module-scoped list that never refreshes.

-    const query = this.query({ req, params: req.params })
-    const promises = availableDatasets.map((dataset) => {
+    const query = this.query({ req, params: req.params })
+    const datasetSlugs = await ensureAvailableDatasets()
+    const promises = datasetSlugs.map((dataset) => {
       return datasette.runQuery(query, dataset).catch(error => {
         logger.error('Query failed for dataset', { dataset, errorMessage: error.message, errorStack: error.stack, type: types.DataFetch })
         throw error
       })
     })
     const result = await Promise.all(promises)
     req[this.result] = Object.fromEntries(
-      result.reduce((acc, { formattedData }, i) => {
+      result.reduce((acc, { formattedData }, i) => {
         if (formattedData.length > 0) {
-          acc.push([availableDatasets[i], formattedData[0]])
+          acc.push([datasetSlugs[i], formattedData[0]])
         }
         return acc
       }, [])
     )

195-208: Same change for fetchMany variant

Mirror the cached list usage.

-    const query = this.query({ req, params: req.params })
-    const promises = availableDatasets.map((dataset) => {
+    const query = this.query({ req, params: req.params })
+    const datasetSlugs = await ensureAvailableDatasets()
+    const promises = datasetSlugs.map((dataset) => {
       return datasette.runQuery(query, dataset).catch(error => {
         logger.error('Query failed for dataset', { dataset, errorMessage: error.message, errorStack: error.stack, type: types.DataFetch })
         throw error
       })
     })
     const result = await Promise.all(promises)
     req[this.result] = Object.fromEntries(
       result.filter(({ formattedData }) => formattedData.length > 0)
-        .map(({ formattedData }, i) => [availableDatasets[i], formattedData])
+        .map(({ formattedData }, i) => [datasetSlugs[i], formattedData])
     )
src/controllers/endpointSubmissionFormDeepLinkController.js (1)

36-57: Make async Express handler safe and validate orgId

Add try/catch and include orgId in validation to avoid undefined session state.

-  async get (req, res, next) {
-    // if the query params don't contain what we need, redirect to the "get started" page,
-    // this way the user can still proceed (but need to fill the dataset+orgName themselves)
-    const { dataset, orgName, orgId } = req.query
-    const datasets = await getDatasets()
-
-    if (!dataset || !orgName || !datasets.has(dataset)) {
-      logger.info('EndpointSubmissionFormDeepLinkController.get(): invalid params for deep link, redirecting to landing page',
-        { type: types.App, query: req.query })
-      return res.redirect('/')
-    }
-
-    const datasetInfo = datasets.get(dataset) ?? { dataSubject: '', requiresGeometryTypeSelection: false }
-    req.sessionModel.set('dataset', dataset)
-    req.sessionModel.set('lpa', orgName)
-    req.sessionModel.set('orgId', orgId)
-    const sessionData = { lpa: orgName, dataset, orgId, datasetName: datasetInfo.text }
-    maybeSetReferrer(req, sessionData)
-    req.sessionModel.set(this.sessionKey, sessionData)
-
-    super.post(req, res, next)
-  }
+  async get (req, res, next) {
+    try {
+      // if the query params don't contain what we need, redirect to the "get started" page,
+      // this way the user can still proceed (but need to fill the dataset+orgName themselves)
+      const { dataset, orgName, orgId } = req.query
+      const datasets = await getDatasets()
+      if (!dataset || !orgName || !orgId || !datasets.has(dataset)) {
+        logger.info('EndpointSubmissionFormDeepLinkController.get(): invalid params for deep link, redirecting to landing page',
+          { type: types.App, query: req.query })
+        return res.redirect('/')
+      }
+      const datasetInfo = datasets.get(dataset) ?? { dataSubject: '', requiresGeometryTypeSelection: false }
+      req.sessionModel.set('dataset', dataset)
+      req.sessionModel.set('lpa', orgName)
+      req.sessionModel.set('orgId', orgId)
+      const sessionData = { lpa: orgName, dataset, orgId, datasetName: datasetInfo.text ?? dataset }
+      maybeSetReferrer(req, sessionData)
+      req.sessionModel.set(this.sessionKey, sessionData)
+      return super.post(req, res, next)
+    } catch (err) {
+      req.handlerName = 'EndpointSubmissionFormDeepLinkController.get'
+      return next(err)
+    }
+  }
src/controllers/datasetController.js (1)

39-45: Wrap async post with try/catch and return the parent call

Same unhandled rejection risk; also return for consistency.

-  async post (req, res, next) {
-    const datasets = await getDatasets()
-    const dataset = req.body.dataset
-    const { dataSubject } = datasets.get(dataset) || { dataSubject: '' }
-    req.body['data-subject'] = dataSubject
-    super.post(req, res, next)
-  }
+  async post (req, res, next) {
+    try {
+      const datasets = await getDatasets()
+      const dataset = req.body.dataset
+      const { dataSubject } = datasets.get(dataset) || { dataSubject: '' }
+      req.body['data-subject'] = dataSubject
+      return super.post(req, res, next)
+    } catch (err) {
+      req.handlerName = 'DatasetController.post'
+      return next(err)
+    }
+  }
src/utils/utils.js (1)

173-178: Make sorting resilient when names are missing

Prevent .localeCompare on undefined if name-map is empty.

-  availableDatasets.sort((a, b) => a.text.localeCompare(b.text))
+  availableDatasets.sort((a, b) => (a.text ?? a.value).localeCompare(b.text ?? b.value))
test/unit/datasetController.test.js (1)

16-37: Hoist the utils mock to module scope before importing the controller
Mocking inside beforeEach after the controller import means the real utils.js is loaded at module initialization, risking live API calls and flaky tests. Move your mock to the top of the file using the string-literal API:

- import DatasetController, {
-   requiresGeometryTypeToBeSelected,
-   requiresGeometryTypeToBeSelectedViaDeepLink
- } from '../../src/controllers/datasetController.js'
+ vi.mock('../../src/utils/utils.js', async () => {
+   const actual = await vi.importActual('../../src/utils/utils.js')
+   const dataSubjects = {
+     subject1: {
+       available: true,
+       dataSets: [
+         { available: true, text: 'B', value: 'B', requiresGeometryTypeSelection: true },
+         { available: false, text: 'D', value: 'D', requiresGeometryTypeSelection: false }
+       ]
+     },
+     subject2: { available: false, dataSets: [{ available: true, text: 'C', value: 'C', requiresGeometryTypeSelection: false }] },
+     subject3: { available: true, dataSets: [{ available: true, text: 'A', value: 'A', requiresGeometryTypeSelection: true }] }
+   }
+   return {
+     ...actual,
+     getDataSubjects: vi.fn(async () => dataSubjects),
+     getDatasets:    vi.fn(async () => actual.makeDatasetsLookup(dataSubjects)),
+     availableDatasets: vi.fn(() => actual.availableDatasets(dataSubjects))
+   }
+ })
+
+ import DatasetController, {
+   requiresGeometryTypeToBeSelected,
+   requiresGeometryTypeToBeSelectedViaDeepLink
+ } from '../../src/controllers/datasetController.js'

Optionally add vi.clearAllMocks() in beforeEach to reset mock state.

♻️ Duplicate comments (2)
src/utils/utils.js (2)

96-103: Fix inconsistent property access and add fallback for 'tree'

Use bracket notation consistently and add fallback text.

-        {
-          value: 'tree',
-          text: nameMap.tree,
-          available: true,
-          requiresGeometryTypeSelection: true
-        },
+        {
+          value: 'tree',
+          text: fallbackName('tree', 'Tree'),
+          available: true,
+          requiresGeometryTypeSelection: true
+        },

25-26: Handle name-map API failures with safe fallback

If getDatasetNameMap fails, current code leaves text undefined and later sorting can throw. Add a try/catch and use fallbacks.

-  const nameMap = await getDatasetNameMap(datasetKeys)
+  let nameMap = {}
+  try {
+    nameMap = await getDatasetNameMap(datasetKeys)
+  } catch (error) {
+    // Non-fatal: fall back to empty map; per-dataset defaults provided below.
+    // Avoids hard failure on cold start or API outage.
+    // eslint-disable-next-line no-console
+    console.warn('buildDataSubjects: failed to fetch dataset names, using fallbacks:', error?.message)
+    nameMap = {}
+  }
+
+  const fallbackName = (slug, human) => (nameMap[slug] ?? human)
🧹 Nitpick comments (8)
src/utils/datasetLoader.js (3)

37-39: De-duplicate and normalise input keys before building the query

Avoids redundant query params and reduces payload size.

-  datasetKeys.forEach(d => params.append('dataset', d))
+  Array.from(new Set(datasetKeys.map(String))).forEach(d => params.append('dataset', d))

33-33: Make cache TTL configurable

Hard-coding 300s limits ops flexibility. Read from config with a sensible default.

-const CACHE_TTL = 300 // 5min
+const CACHE_TTL = Number(config?.planningDataApi?.cacheTtlSeconds ?? 300) // default 5 min

42-42: Correct user-agent string

Typo: “Planning Data Provide”. Use a stable product token.

-    { headers: { 'user-agent': 'Planning Data Provide' } }
+    { headers: { 'user-agent': 'Planning Data Submit (digital-land/submit)' } }
src/controllers/checkDeepLinkController.js (2)

28-33: Fix log prefix for clarity

Use the controller’s actual name in logs.

-      logger.info('DeepLinkController.get(): invalid referrer URL, skipping', {
+      logger.info('CheckDeepLinkController.get(): invalid referrer URL, skipping', {
         type: types.App,
         referrer: req.headers.referer,
         errorMessage: err.message
       })

58-61: Defensive fallback for datasetName

If name-mapping is temporarily empty, avoid undefined in session by falling back to the slug.

-    const sessionData = { 'data-subject': datasetInfo.dataSubject, orgName, orgId, dataset, datasetName: datasetInfo.text }
+    const sessionData = { 'data-subject': datasetInfo.dataSubject, orgName, orgId, dataset, datasetName: datasetInfo.text ?? dataset }
src/middleware/middleware.builders.js (2)

49-51: Use consistent log field name: 'type' not 'types'

Align with other logs in this file.

-      logger.warn('no "dataset" in request params',
-        { types: types.App, endpoint: req.originalUrl, params: req.params })
+      logger.warn('no "dataset" in request params',
+        { type: types.App, endpoint: req.originalUrl, params: req.params })

154-172: Doc vs behaviour mismatch: decide whether to tolerate per-dataset errors

Docs say “catches any errors”, but rethrowing in .catch() causes the whole call to fail. Consider Promise.allSettled and skip failed datasets if that’s acceptable.

Also applies to: 195-208

src/controllers/endpointSubmissionFormDeepLinkController.js (1)

19-23: Fix log prefix in maybeSetReferrer

Correct the hard-coded controller name.

-      logger.info('DeepLinkController.get(): invalid referrer URL, skipping', {
+      logger.info('EndpointSubmissionFormDeepLinkController.get(): invalid referrer URL, skipping', {
         type: types.App,
         referrer: req.headers.referer,
         errorMessage: err.message
       })
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 9d54181 and 10c6d01.

📒 Files selected for processing (13)
  • src/controllers/CheckAnswersController.js (2 hunks)
  • src/controllers/checkDeepLinkController.js (2 hunks)
  • src/controllers/datasetController.js (2 hunks)
  • src/controllers/endpointSubmissionFormDeepLinkController.js (2 hunks)
  • src/middleware/common.middleware.js (3 hunks)
  • src/middleware/middleware.builders.js (1 hunks)
  • src/utils/datasetLoader.js (1 hunks)
  • src/utils/utils.js (3 hunks)
  • test/unit/checkDeepLinkController.test.js (2 hunks)
  • test/unit/datasetController.test.js (5 hunks)
  • test/unit/datasetLoader.test.js (1 hunks)
  • test/unit/endpointSubmissionFormLinkController.test.js (2 hunks)
  • test/unit/statusPage.test.js (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/unit/statusPage.test.js
  • test/unit/datasetLoader.test.js
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2024-10-29T14:58:22.843Z
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#593
File: index.js:20-20
Timestamp: 2024-10-29T14:58:22.843Z
Learning: The function `initDatasetSlugToReadableNameFilter()` in `src/utils/datasetSlugToReadableName.js` already includes appropriate error handling and logging.

Applied to files:

  • src/utils/datasetLoader.js
  • src/utils/utils.js
📚 Learning: 2024-10-29T15:06:42.432Z
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#593
File: test/unit/lpaDetailsController.test.js:20-20
Timestamp: 2024-10-29T15:06:42.432Z
Learning: The function `initDatasetSlugToReadableNameFilter()` includes its own error handling, so additional try-catch blocks around it in test files are unnecessary.

Applied to files:

  • src/utils/datasetLoader.js
🧬 Code graph analysis (9)
src/controllers/checkDeepLinkController.js (2)
src/controllers/datasetController.js (3)
  • req (26-26)
  • datasets (14-14)
  • datasets (25-25)
src/utils/utils.js (2)
  • datasets (162-162)
  • getDatasets (160-164)
src/middleware/middleware.builders.js (1)
src/utils/utils.js (5)
  • dataSubjects (132-132)
  • dataSubjects (161-161)
  • getDataSubjects (131-134)
  • availableDatasets (173-179)
  • availableDatasets (176-176)
src/controllers/datasetController.js (1)
src/utils/utils.js (7)
  • datasets (162-162)
  • getDatasets (160-164)
  • dataSubjects (132-132)
  • dataSubjects (161-161)
  • getDataSubjects (131-134)
  • availableDatasets (173-179)
  • availableDatasets (176-176)
src/middleware/common.middleware.js (2)
src/middleware/middleware.builders.js (1)
  • dataSubjects (23-23)
src/utils/utils.js (3)
  • dataSubjects (132-132)
  • dataSubjects (161-161)
  • getDataSubjects (131-134)
src/utils/datasetLoader.js (3)
config/index.js (1)
  • config (13-13)
src/utils/logger.js (1)
  • logger (21-28)
src/utils/utils.js (3)
  • datasetKeys (11-24)
  • datasets (162-162)
  • nameMap (25-25)
src/controllers/endpointSubmissionFormDeepLinkController.js (2)
src/controllers/datasetController.js (3)
  • req (26-26)
  • datasets (14-14)
  • datasets (25-25)
src/utils/utils.js (2)
  • datasets (162-162)
  • getDatasets (160-164)
src/controllers/CheckAnswersController.js (2)
src/controllers/datasetController.js (2)
  • datasets (14-14)
  • datasets (25-25)
src/utils/utils.js (2)
  • datasets (162-162)
  • getDatasets (160-164)
src/utils/utils.js (4)
src/utils/datasetLoader.js (2)
  • nameMap (60-60)
  • getDatasetNameMap (59-90)
src/middleware/common.middleware.js (1)
  • dataSubjects (873-873)
src/middleware/middleware.builders.js (1)
  • dataSubjects (23-23)
src/controllers/datasetController.js (2)
  • datasets (14-14)
  • datasets (25-25)
test/unit/datasetController.test.js (2)
src/utils/utils.js (2)
  • dataSubjects (132-132)
  • dataSubjects (161-161)
src/controllers/datasetController.js (3)
  • req (26-26)
  • requiresGeometryTypeToBeSelected (13-18)
  • requiresGeometryTypeToBeSelectedViaDeepLink (24-29)
🔇 Additional comments (9)
src/utils/datasetLoader.js (1)

46-56: Good defensive parsing of API payload

Validates datasets is an array and filters bad entries before Object.fromEntries. This prevents runtime errors on malformed responses.

src/controllers/CheckAnswersController.js (1)

9-9: Import change looks good

Switching to getDatasets aligns this controller with the new async dataset loading.

src/middleware/common.middleware.js (1)

872-879: Async helper is fine; keep it pure

Using await getDataSubjects() inside the helper is appropriate. No change requested.

src/controllers/datasetController.js (1)

13-18: No action needed: all async helpers are properly awaited
Searches confirm the only call sites for both functions are in tests, and each invocation uses await. No un-awaited usages found.

test/unit/datasetController.test.js (5)

30-36: Mock surface looks correct and delegates to real helpers

Good approach keeping parity with the real module via ...actual and layering minimal overrides for getDataSubjects/getDatasets/availableDatasets.


52-52: Awaiting async locals is correct

This aligns with the controller’s async behaviour and prevents unhandled promise rejections.


68-68: Awaiting async post is correct

Matches the refactor to async dataset utilities.


77-89: Geometry-type checks via body: good coverage

Covers requires/does-not-require and missing dataset cases. Reads cleanly.


91-103: Geometry-type checks via deep link: good mirror of body path

Symmetric coverage; assertions are precise.

Comment on lines +32 to 36
async locals (req, res, next) {
const dataSubjects = await getDataSubjects()
req.form.options.datasetItems = availableDatasets(dataSubjects)
super.locals(req, res, next)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

⚠️ Potential issue

Wrap async locals with try/catch

Avoid unhandled rejections in Express 4.

-  async locals (req, res, next) {
-    const dataSubjects = await getDataSubjects()
-    req.form.options.datasetItems = availableDatasets(dataSubjects)
-    super.locals(req, res, next)
-  }
+  async locals (req, res, next) {
+    try {
+      const dataSubjects = await getDataSubjects()
+      req.form.options.datasetItems = availableDatasets(dataSubjects)
+      return super.locals(req, res, next)
+    } catch (err) {
+      req.handlerName = 'DatasetController.locals'
+      return next(err)
+    }
+  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
async locals (req, res, next) {
const dataSubjects = await getDataSubjects()
req.form.options.datasetItems = availableDatasets(dataSubjects)
super.locals(req, res, next)
}
async locals (req, res, next) {
try {
const dataSubjects = await getDataSubjects()
req.form.options.datasetItems = availableDatasets(dataSubjects)
return super.locals(req, res, next)
} catch (err) {
req.handlerName = 'DatasetController.locals'
return next(err)
}
}
🤖 Prompt for AI Agents
In src/controllers/datasetController.js around lines 32 to 36, the async locals
method is not wrapped in a try/catch which can lead to unhandled promise
rejections in Express 4; wrap the body of the async locals method in try { ... }
catch (err) { return next(err); } so any errors from getDataSubjects(),
availableDatasets(), or super.locals are caught and forwarded to Express by
calling next(err); if super.locals returns a promise, await it inside the try
block before calling next().

Comment on lines +887 to 891
export const setAvailableDatasets = async (req, res, next) => {
// Motivation: stop relying on global variables all over the place
req.availableDatasets = CONSTANTS.availableDatasets
req.availableDatasets = await CONSTANTS.availableDatasets()
next()
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Wrap async middleware in try/catch; Express 4 does not auto-handle rejected promises

If getDataSubjects() rejects, this will throw and may crash the process or hang the request. Propagate via next(err).

-export const setAvailableDatasets = async (req, res, next) => {
-  // Motivation: stop relying on global variables all over the place
-  req.availableDatasets = await CONSTANTS.availableDatasets()
-  next()
-}
+export const setAvailableDatasets = async (req, res, next) => {
+  try {
+    // Motivation: stop relying on global variables all over the place
+    req.availableDatasets = await CONSTANTS.availableDatasets()
+    next()
+  } catch (err) {
+    logger.error('setAvailableDatasets failed', { type: types.App, errorMessage: err.message, errorStack: err.stack })
+    next(err)
+  }
+}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export const setAvailableDatasets = async (req, res, next) => {
// Motivation: stop relying on global variables all over the place
req.availableDatasets = CONSTANTS.availableDatasets
req.availableDatasets = await CONSTANTS.availableDatasets()
next()
}
export const setAvailableDatasets = async (req, res, next) => {
try {
// Motivation: stop relying on global variables all over the place
req.availableDatasets = await CONSTANTS.availableDatasets()
next()
} catch (err) {
logger.error('setAvailableDatasets failed', { type: types.App, errorMessage: err.message, errorStack: err.stack })
next(err)
}
}
🤖 Prompt for AI Agents
In src/middleware/common.middleware.js around lines 887 to 891, the async
middleware setAvailableDatasets awaits CONSTANTS.availableDatasets() without
error handling; wrap the await in a try/catch and on success set
req.availableDatasets and call next(), but if an error is thrown call next(err)
so Express can handle the failure (avoid throwing or leaving the promise
unhandled).

Comment on lines +22 to 31
export async function getAvailableDatasets () {
const dataSubjects = await getDataSubjects()
return Object.values(dataSubjects).flatMap((dataSubject) =>
(dataSubject.dataSets || [])
.filter((dataset) => dataset.available)
.map((dataset) => dataset.value)
)
}
const availableDatasets = await getAvailableDatasets()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

⚠️ Potential issue

Avoid top-level await; memoise with TTL to prevent startup/test timeouts

Top-level await blocks module initialisation and can trigger long network calls during import, contributing to the reported test timeouts. Use a cached, lazy initialiser with TTL.

-import { getDataSubjects } from '../utils/utils.js'
-
-export async function getAvailableDatasets () {
-  const dataSubjects = await getDataSubjects()
-  return Object.values(dataSubjects).flatMap((dataSubject) =>
-    (dataSubject.dataSets || [])
-      .filter((dataset) => dataset.available)
-      .map((dataset) => dataset.value)
-  )
-}
-const availableDatasets = await getAvailableDatasets()
+import { getDataSubjects } from '../utils/utils.js'
+
+export async function getAvailableDatasets () {
+  const dataSubjects = await getDataSubjects()
+  return Object.values(dataSubjects).flatMap((dataSubject) =>
+    (dataSubject.dataSets || [])
+      .filter((dataset) => dataset.available)
+      .map((dataset) => dataset.value)
+  )
+}
+
+// Lazy, memoised cache with TTL (default 5m; configurable for tests via env)
+let availableDatasetsCache = null
+let availableDatasetsCachedAt = 0
+const AVAILABLE_DATASETS_TTL_MS = Number(process.env.AVAILABLE_DATASETS_TTL_MS ?? 5 * 60 * 1000)
+let inflight = null
+async function ensureAvailableDatasets () {
+  const now = Date.now()
+  if (availableDatasetsCache && (now - availableDatasetsCachedAt) < AVAILABLE_DATASETS_TTL_MS) {
+    return availableDatasetsCache
+  }
+  if (inflight) return inflight
+  inflight = (async () => {
+    try {
+      const slugs = await getAvailableDatasets()
+      availableDatasetsCache = slugs
+      availableDatasetsCachedAt = Date.now()
+      return slugs
+    } finally {
+      inflight = null
+    }
+  })()
+  return inflight
+}

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +8 to +30
export async function getRedisClient () {
if (!config.redis) return null

if (!redisClient) {
const urlPrefix = `redis${config.redis.secure ? 's' : ''}`
redisClient = createClient({
url: `${urlPrefix}://${config.redis.host}:${config.redis.port}`
})

redisClient.on('error', (err) => {
logger.warn(`datasetLoader/redis error: ${err.message}`)
})

try {
await redisClient.connect()
} catch (err) {
logger.warn(`datasetLoader/failed to connect to redis: ${err.message}`)
redisClient = null
return null
}
}

return redisClient
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Support Redis auth/TLS URLs and provide a clean shutdown hook

Many Redis instances require auth and TLS. Also expose a way to close the client in tests to prevent handles from keeping the process alive.

 export async function getRedisClient () {
   if (!config.redis) return null
 
   if (!redisClient) {
-    const urlPrefix = `redis${config.redis.secure ? 's' : ''}`
-    redisClient = createClient({
-      url: `${urlPrefix}://${config.redis.host}:${config.redis.port}`
-    })
+    // Prefer a full URL if provided; else construct from parts (with optional auth)
+    const urlPrefix = `redis${config.redis.secure ? 's' : ''}`
+    const creds = (config.redis.username || config.redis.password)
+      ? `${encodeURIComponent(config.redis.username || '')}:${encodeURIComponent(config.redis.password)}@`
+      : ''
+    const url = config.redis.url || `${urlPrefix}://${creds}${config.redis.host}:${config.redis.port}`
+    redisClient = createClient({ url })
 
     redisClient.on('error', (err) => {
       logger.warn(`datasetLoader/redis error: ${err.message}`)
     })
 
     try {
       await redisClient.connect()
     } catch (err) {
       logger.warn(`datasetLoader/failed to connect to redis: ${err.message}`)
       redisClient = null
       return null
     }
   }
 
   return redisClient
 }
+
+// Optional: export for tests/process shutdown
+export async function closeRedisClient () {
+  if (redisClient) {
+    try { await redisClient.quit() } catch {}
+    redisClient = null
+  }
+}

If you want, I can wire closeRedisClient() into your test teardown.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export async function getRedisClient () {
if (!config.redis) return null
if (!redisClient) {
const urlPrefix = `redis${config.redis.secure ? 's' : ''}`
redisClient = createClient({
url: `${urlPrefix}://${config.redis.host}:${config.redis.port}`
})
redisClient.on('error', (err) => {
logger.warn(`datasetLoader/redis error: ${err.message}`)
})
try {
await redisClient.connect()
} catch (err) {
logger.warn(`datasetLoader/failed to connect to redis: ${err.message}`)
redisClient = null
return null
}
}
return redisClient
export async function getRedisClient () {
if (!config.redis) return null
if (!redisClient) {
// Prefer a full URL if provided; else construct from parts (with optional auth)
const urlPrefix = `redis${config.redis.secure ? 's' : ''}`
const creds = (config.redis.username || config.redis.password)
? `${encodeURIComponent(config.redis.username || '')}:${encodeURIComponent(config.redis.password)}@`
: ''
const url = config.redis.url || `${urlPrefix}://${creds}${config.redis.host}:${config.redis.port}`
redisClient = createClient({ url })
redisClient.on('error', (err) => {
logger.warn(`datasetLoader/redis error: ${err.message}`)
})
try {
await redisClient.connect()
} catch (err) {
logger.warn(`datasetLoader/failed to connect to redis: ${err.message}`)
redisClient = null
return null
}
}
return redisClient
}
// Optional: export for tests/process shutdown
export async function closeRedisClient () {
if (redisClient) {
try {
await redisClient.quit()
} catch {}
redisClient = null
}
}

Comment on lines +40 to +45
const res = await fetch(
`${config.mainWebsiteUrl}/dataset.json?${params.toString()}&field=dataset&field=name&include_typologies=false`,
{ headers: { 'user-agent': 'Planning Data Provide' } }
)
if (!res.ok) throw new Error(`Failed to fetch datasets from API: ${res.statusText}`)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add HTTP timeout (and optional retry) to prevent test hangs/timeouts

fetch has no default timeout; slow network will stall unit/acceptance tests. Add an AbortController-based timeout (configurable), and include the request URL in the error for faster triage.
Apply:

-  const res = await fetch(
-    `${config.mainWebsiteUrl}/dataset.json?${params.toString()}&field=dataset&field=name&include_typologies=false`,
-    { headers: { 'user-agent': 'Planning Data Provide' } }
-  )
+  const timeoutMs = Number(config?.planningDataApi?.timeoutMs ?? 10000)
+  const controller = new AbortController()
+  const id = setTimeout(() => controller.abort(), timeoutMs)
+  let res
+  try {
+    res = await fetch(
+      `${config.mainWebsiteUrl}/dataset.json?${params.toString()}&field=dataset&field=name&include_typologies=false`,
+      {
+        headers: {
+          'user-agent': 'Planning Data Submit',
+          'accept': 'application/json'
+        },
+        signal: controller.signal
+      }
+    )
+  } finally {
+    clearTimeout(id)
+  }
-  if (!res.ok) throw new Error(`Failed to fetch datasets from API: ${res.statusText}`)
+  if (!res.ok) {
+    throw new Error(`Failed to fetch datasets from API: ${res.status} ${res.statusText} (${res.url})`)
+  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const res = await fetch(
`${config.mainWebsiteUrl}/dataset.json?${params.toString()}&field=dataset&field=name&include_typologies=false`,
{ headers: { 'user-agent': 'Planning Data Provide' } }
)
if (!res.ok) throw new Error(`Failed to fetch datasets from API: ${res.statusText}`)
const timeoutMs = Number(config?.planningDataApi?.timeoutMs ?? 10000)
const controller = new AbortController()
const id = setTimeout(() => controller.abort(), timeoutMs)
let res
try {
res = await fetch(
`${config.mainWebsiteUrl}/dataset.json?${params.toString()}&field=dataset&field=name&include_typologies=false`,
{
headers: {
'user-agent': 'Planning Data Submit',
'accept': 'application/json'
},
signal: controller.signal
}
)
} finally {
clearTimeout(id)
}
if (!res.ok) {
throw new Error(
`Failed to fetch datasets from API: ${res.status} ${res.statusText} (${res.url})`
)
}
🤖 Prompt for AI Agents
In src/utils/datasetLoader.js around lines 40 to 45, the fetch call lacks a
timeout (and optional retry), which can hang tests; wrap the fetch with an
AbortController that is cancelled after a configurable timeout (e.g. read from
config.datasetFetchTimeoutMs), pass controller.signal into fetch, clear the
timer after completion, and on abort throw an Error that includes the request
URL; optionally implement a small retry loop with exponential/backoff and a max
attempts config to retry transient failures before throwing an error that
includes the final attempt URL and status/text.

Comment thread src/utils/utils.js
Comment on lines +61 to +69
value: 'conservation-area',
text: nameMap['conservation-area'],
available: true
},
{
value: 'conservation-area-document',
text: nameMap['conservation-area-document'],
available: true
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Apply fallbackName for conservation-area set

-        {
-          value: 'conservation-area',
-          text: nameMap['conservation-area'],
-          available: true
-        },
+        {
+          value: 'conservation-area',
+          text: fallbackName('conservation-area', 'Conservation area'),
+          available: true
+        },
-        {
-          value: 'conservation-area-document',
-          text: nameMap['conservation-area-document'],
-          available: true
-        }
+        {
+          value: 'conservation-area-document',
+          text: fallbackName('conservation-area-document', 'Conservation area document'),
+          available: true
+        }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
value: 'conservation-area',
text: nameMap['conservation-area'],
available: true
},
{
value: 'conservation-area-document',
text: nameMap['conservation-area-document'],
available: true
}
{
value: 'conservation-area',
text: fallbackName('conservation-area', 'Conservation area'),
available: true
},
{
value: 'conservation-area-document',
text: fallbackName('conservation-area-document', 'Conservation area document'),
available: true
}
🤖 Prompt for AI Agents
In src/utils/utils.js around lines 61 to 69, the entries for value
'conservation-area' and 'conservation-area-document' are using nameMap[...]
directly; update the 'text' field for 'conservation-area' to use the
fallbackName(nameMap, 'conservation-area') helper (and optionally do the same
for 'conservation-area-document' if intended) so that a fallback label is used
when nameMap lacks that key; replace the direct nameMap lookup with a call to
fallbackName and keep available: true unchanged.

Comment thread src/utils/utils.js
Comment on lines +76 to +89
value: 'listed-building',
text: nameMap['listed-building'],
available: false
},
{
value: 'listed-building-grade',
text: nameMap['listed-building-grade'],
available: false
},
{
value: 'listed-building-outline',
text: nameMap['listed-building-outline'],
available: true
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Apply fallbackName for listed-building set

-        {
-          value: 'listed-building',
-          text: nameMap['listed-building'],
-          available: false
-        },
+        {
+          value: 'listed-building',
+          text: fallbackName('listed-building', 'Listed building'),
+          available: false
+        },
-        {
-          value: 'listed-building-grade',
-          text: nameMap['listed-building-grade'],
-          available: false
-        },
+        {
+          value: 'listed-building-grade',
+          text: fallbackName('listed-building-grade', 'Listed building grade'),
+          available: false
+        },
-        {
-          value: 'listed-building-outline',
-          text: nameMap['listed-building-outline'],
-          available: true
-        }
+        {
+          value: 'listed-building-outline',
+          text: fallbackName('listed-building-outline', 'Listed building outline'),
+          available: true
+        }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
value: 'listed-building',
text: nameMap['listed-building'],
available: false
},
{
value: 'listed-building-grade',
text: nameMap['listed-building-grade'],
available: false
},
{
value: 'listed-building-outline',
text: nameMap['listed-building-outline'],
available: true
}
{
value: 'listed-building',
text: fallbackName('listed-building', 'Listed building'),
available: false
},
{
value: 'listed-building-grade',
text: fallbackName('listed-building-grade', 'Listed building grade'),
available: false
},
{
value: 'listed-building-outline',
text: fallbackName('listed-building-outline', 'Listed building outline'),
available: true
}
🤖 Prompt for AI Agents
In src/utils/utils.js around lines 76 to 89, the text fields for the
listed-building items currently use nameMap['...'] directly which can be
undefined; update those entries to use the fallbackName helper when resolving
the display text (e.g., call fallbackName(nameMap, 'listed-building') for the
'listed-building' item and similarly for 'listed-building-grade' and
'listed-building-outline') so the UI gets a safe fallback string while keeping
the same value and available flags.

Comment thread src/utils/utils.js
Comment on lines +102 to +110
value: 'tree-preservation-order',
text: nameMap['tree-preservation-order'],
available: true
},
{
value: 'tree-preservation-zone',
text: nameMap['tree-preservation-zone'],
available: true
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Apply fallbackName for TPO set

-        {
-          value: 'tree-preservation-order',
-          text: nameMap['tree-preservation-order'],
-          available: true
-        },
+        {
+          value: 'tree-preservation-order',
+          text: fallbackName('tree-preservation-order', 'Tree preservation order'),
+          available: true
+        },
-        {
-          value: 'tree-preservation-zone',
-          text: nameMap['tree-preservation-zone'],
-          available: true
-        }
+        {
+          value: 'tree-preservation-zone',
+          text: fallbackName('tree-preservation-zone', 'Tree preservation zone'),
+          available: true
+        }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
value: 'tree-preservation-order',
text: nameMap['tree-preservation-order'],
available: true
},
{
value: 'tree-preservation-zone',
text: nameMap['tree-preservation-zone'],
available: true
}
{
value: 'tree-preservation-order',
text: fallbackName('tree-preservation-order', 'Tree preservation order'),
available: true
},
{
value: 'tree-preservation-zone',
text: fallbackName('tree-preservation-zone', 'Tree preservation zone'),
available: true
}
🤖 Prompt for AI Agents
In src/utils/utils.js around lines 102 to 110, the entries for
'tree-preservation-order' and 'tree-preservation-zone' use nameMap[...]
directly; if those keys are missing the displayed text will be undefined. Change
the text fields to use the fallbackName (e.g., text:
nameMap['tree-preservation-order'] || fallbackName) so each option falls back to
the provided fallbackName when nameMap lacks the key.

Comment thread test/unit/checkDeepLinkController.test.js
Comment on lines +30 to 31
await endpointSubmissionFormDeepLinkController.get(req, res, next)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Mock getDatasets to avoid network and deflake tests

This test now awaits an async controller that calls getDatasets(). Without a mock, it may hit the live API and time out. Stub the module locally.

+vi.mock('../../src/utils/utils.js', () => {
+  const map = new Map([
+    ['conservation-area', { text: 'Conservation area', dataSubject: 'conservation-area', requiresGeometryTypeSelection: false }]
+  ])
+  return { getDatasets: async () => map }
+})

Place the mock near the top of the file (before imports that execute the controller).

Also applies to: 39-40

🤖 Prompt for AI Agents
In test/unit/endpointSubmissionFormLinkController.test.js around lines 30-31
(and similarly 39-40), the test invokes the async controller which calls
getDatasets() and may hit the live API causing flakiness; add a local Jest
module mock for the module that exports getDatasets placed at the very top of
the file (before any imports that execute the controller) and have the mock
return a resolved value (e.g., an empty array or a small fixture) so the
controller uses the stubbed response and the test does not perform network
calls.

@DilwoarH DilwoarH force-pushed the update-dataset-name-reference branch from 10c6d01 to e86fa1e Compare September 5, 2025 11:00
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/controllers/CheckAnswersController.js (1)

67-89: Guard against missing dataset metadata; don’t submit undefined collection.

If getDatasets() fails or the dataset key is unknown, bail with a user-facing error.

-    const datasets = await getDatasets()
-    const dataset = req.sessionModel.get('dataset')
-    const datasetMeta = datasets.get(dataset) || {}
+    let datasets
+    try {
+      datasets = await getDatasets()
+    } catch (err) {
+      logger.error('CheckAnswersController.createJiraServiceRequest(): failed to load datasets', { type: types.App, errorMessage: err.message, errorStack: err.stack })
+      req.sessionModel.set('errors', [{ text: 'We could not load datasets. Please try again.' }])
+      return res.redirect('/submit/check-answers')
+    }
+    const dataset = req.sessionModel.get('dataset')
+    const datasetMeta = datasets.get(dataset)
+    if (!datasetMeta?.dataSubject) {
+      logger.warn('Unknown dataset key or missing metadata', { type: types.App, dataset })
+      req.sessionModel.set('errors', [{ text: 'Please choose a valid dataset.' }])
+      return res.redirect('/submit/check-answers')
+    }
@@
-      collection: datasetMeta.dataSubject,
+      collection: datasetMeta.dataSubject,
♻️ Duplicate comments (2)
src/middleware/common.middleware.js (1)

957-961: Wrap async middleware in try/catch; propagate errors to Express.

Uncaught rejection here can hang the request or crash the process.

-export const setAvailableDatasets = async (req, res, next) => {
-  // Motivation: stop relying on global variables all over the place
-  req.availableDatasets = await CONSTANTS.availableDatasets()
-  next()
-}
+export const setAvailableDatasets = async (req, res, next) => {
+  try {
+    // Motivation: stop relying on global variables all over the place
+    req.availableDatasets = await CONSTANTS.availableDatasets()
+    next()
+  } catch (err) {
+    logger.error('setAvailableDatasets failed', { type: types.App, errorMessage: err.message, errorStack: err.stack })
+    next(err)
+  }
+}
src/utils/utils.js (1)

10-26: Harden API call with fallback to keep UI stable.

If getDatasetNameMap fails, buildDataSubjects currently throws and can break pages/routes.

 export async function buildDataSubjects () {
   const datasetKeys = [
@@
   ]
-  const nameMap = await getDatasetNameMap(datasetKeys)
+  let nameMap = {}
+  try {
+    nameMap = await getDatasetNameMap(datasetKeys)
+  } catch (err) {
+    console.error('buildDataSubjects(): failed to fetch dataset names; using defaults', err)
+    nameMap = {}
+  }
🧹 Nitpick comments (9)
test/unit/util.test.js (1)

17-29: Initialise test data in beforeAll, but freeze to avoid accidental mutation.

Defensive: freeze dataSubjects after construction.

Apply:

 beforeAll(async () => {
   const datasetName = await getDatasetNameMap()
   dataSubjects = {
@@
   }
+  Object.freeze(dataSubjects)
+  Object.values(dataSubjects).forEach(s => Object.freeze(s))
+  Object.values(dataSubjects).forEach(s => s.dataSets.forEach(ds => Object.freeze(ds)))
 })
src/middleware/common.middleware.js (1)

941-949: Return a de-duplicated, stable-ordered list of dataset keys.

Avoid duplicates and make output deterministic for caching/comparisons.

 export const CONSTANTS = {
   async availableDatasets () {
     const dataSubjects = await getDataSubjects()
-    return Object.values(dataSubjects).flatMap((dataSubject) => dataSubject.dataSets
-      .filter((dataset) => dataset.available)
-      .map((dataset) => dataset.value)
-    )
+    const values = Object.values(dataSubjects).flatMap(ds =>
+      ds.dataSets.filter(d => d.available).map(d => d.value)
+    )
+    return Array.from(new Set(values)).sort()
   }
 }
src/utils/utils.js (1)

160-164: Avoid repeated rebuilds; memoise within process.

Reduce repeated API calls when Redis is unavailable (e.g. local/CI). Cache getDatasets() for a short TTL.

-export async function getDatasets () {
-  const dataSubjects = await getDataSubjects()
-  const datasets = makeDatasetsLookup(dataSubjects)
-  return datasets
-}
+let __datasetsCache = { value: null, expiresAt: 0 }
+export async function getDatasets () {
+  const now = Date.now()
+  if (__datasetsCache.value && __datasetsCache.expiresAt > now) return __datasetsCache.value
+  const dataSubjects = await getDataSubjects()
+  const datasets = makeDatasetsLookup(dataSubjects)
+  __datasetsCache = { value: datasets, expiresAt: now + 5 * 60 * 1000 } // 5 min
+  return datasets
+}
test/unit/datasetLoader.test.js (6)

6-13: Stub and clean up global.fetch to avoid test leakage

Assigning global.fetch = vi.fn() isn’t restored by vi.restoreAllMocks(). Prefer vi.stubGlobal and vi.unstubAllGlobals() to prevent cross-test contamination.

-  beforeEach(() => {
-    vi.clearAllMocks()
-    global.fetch = vi.fn()
-  })
+  beforeEach(() => {
+    vi.clearAllMocks()
+    vi.stubGlobal('fetch', vi.fn())
+  })

-  afterEach(() => {
-    vi.restoreAllMocks()
-  })
+  afterEach(() => {
+    vi.unstubAllGlobals()
+    vi.restoreAllMocks()
+  })

29-36: Make URL and header assertions robust (order-agnostic, trailing-slash safe, header-case tolerant)

The exact-string URL assertion is brittle (param order, trailing slash). Assert structurally; also tolerate User-Agent vs user-agent.

-    expect(fetch).toHaveBeenCalledWith(
-      `${config.mainWebsiteUrl}/dataset.json?dataset=ancient-woodland&dataset=article-4-direction-area&field=dataset&field=name&include_typologies=false`,
-      expect.objectContaining({
-        headers: expect.objectContaining({
-          'user-agent': 'Planning Data Provide'
-        })
-      })
-    )
+    expect(fetch).toHaveBeenCalledTimes(1)
+    const [calledUrl, calledOptions] = fetch.mock.calls[0]
+    const url = new URL(calledUrl)
+    // base path
+    expect(url.origin + url.pathname)
+      .toBe(new URL('dataset.json', config.mainWebsiteUrl).toString().replace(/\?$/, ''))
+    // query params (order-agnostic)
+    expect(url.searchParams.getAll('dataset').sort()).toEqual(datasetKeys.slice().sort())
+    expect(url.searchParams.getAll('field').sort()).toEqual(['dataset', 'name'])
+    expect(url.searchParams.get('include_typologies')).toBe('false')
+    // header (case-insensitive key)
+    const headers = calledOptions?.headers ?? {}
+    const userAgent = headers['user-agent'] ?? headers['User-Agent']
+    expect(userAgent).toBe('Planning Data Provide')

42-48: Relax brittle error assertion

Error text may vary (status vs statusText). Use a regex to assert intent without over-coupling to message formatting.

-    await expect(getDatasetNameMap(['ancient-woodland'])).rejects.toThrow('Failed to fetch datasets from API: Not Found')
+    await expect(getDatasetNameMap(['ancient-woodland'])).rejects.toThrow(/Failed to fetch datasets from API/i)

14-15: Add a short-circuit test for empty input

Verifies no network call when datasetKeys is empty.

   })
 
+  it('returns {} and does not call fetch when datasetKeys is empty', async () => {
+    const result = await getDatasetNameMap([])
+    expect(result).toEqual({})
+    expect(fetch).not.toHaveBeenCalled()
+  })
+
   it('should fetch datasets and return a mapping of dataset -> name', async () => {

15-41: Consider adding a cache-hit test path

Add a test that returns a cached value (mock getRedisClient() to return an object with get resolving to JSON) and asserts fetch is not called. If needed, load the module under test after vi.mock via vi.resetModules() and dynamic import() to hoist the mock correctly. I can draft this if helpful.


1-49: Mitigating CI timeouts (context from PR comments)

For acceptance tests, consider stubbing fetch at process level (e.g., MSW Node, Nock, or undici’s MockAgent) for dataset.json to avoid real network and large payloads, while keeping one integration test behind a feature flag to hit the live API in a nightly job.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 10c6d01 and e86fa1e.

📒 Files selected for processing (15)
  • src/controllers/CheckAnswersController.js (2 hunks)
  • src/controllers/checkDeepLinkController.js (2 hunks)
  • src/controllers/datasetController.js (2 hunks)
  • src/controllers/endpointSubmissionFormDeepLinkController.js (2 hunks)
  • src/middleware/common.middleware.js (3 hunks)
  • src/middleware/middleware.builders.js (1 hunks)
  • src/utils/datasetLoader.js (1 hunks)
  • src/utils/utils.js (3 hunks)
  • test/acceptance/dataset_overview.test.js (2 hunks)
  • test/unit/checkDeepLinkController.test.js (2 hunks)
  • test/unit/datasetController.test.js (5 hunks)
  • test/unit/datasetLoader.test.js (1 hunks)
  • test/unit/endpointSubmissionFormLinkController.test.js (2 hunks)
  • test/unit/statusPage.test.js (1 hunks)
  • test/unit/util.test.js (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
  • test/unit/endpointSubmissionFormLinkController.test.js
  • test/unit/statusPage.test.js
  • test/unit/datasetController.test.js
  • src/utils/datasetLoader.js
  • src/middleware/middleware.builders.js
  • src/controllers/endpointSubmissionFormDeepLinkController.js
  • src/controllers/datasetController.js
  • test/acceptance/dataset_overview.test.js
  • test/unit/checkDeepLinkController.test.js
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-01-30T10:13:37.085Z
Learnt from: rosado
PR: digital-land/submit#831
File: src/controllers/resultsController.js:67-75
Timestamp: 2025-01-30T10:13:37.085Z
Learning: Express.js async middleware functions require explicit error handling using try/catch blocks to properly handle asynchronous errors and prevent unhandled promise rejections.

Applied to files:

  • src/middleware/common.middleware.js
📚 Learning: 2024-10-29T14:58:22.843Z
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#593
File: index.js:20-20
Timestamp: 2024-10-29T14:58:22.843Z
Learning: The function `initDatasetSlugToReadableNameFilter()` in `src/utils/datasetSlugToReadableName.js` already includes appropriate error handling and logging.

Applied to files:

  • src/utils/utils.js
🧬 Code graph analysis (6)
test/unit/util.test.js (2)
src/utils/utils.js (2)
  • dataSubjects (132-132)
  • dataSubjects (161-161)
src/utils/datasetLoader.js (1)
  • getDatasetNameMap (59-90)
src/middleware/common.middleware.js (2)
src/middleware/middleware.builders.js (1)
  • dataSubjects (23-23)
src/utils/utils.js (3)
  • dataSubjects (132-132)
  • dataSubjects (161-161)
  • getDataSubjects (131-134)
test/unit/datasetLoader.test.js (2)
src/utils/utils.js (1)
  • datasetKeys (11-24)
src/utils/datasetLoader.js (1)
  • getDatasetNameMap (59-90)
src/controllers/checkDeepLinkController.js (2)
src/controllers/datasetController.js (3)
  • req (26-26)
  • datasets (14-14)
  • datasets (25-25)
src/utils/utils.js (2)
  • datasets (162-162)
  • getDatasets (160-164)
src/controllers/CheckAnswersController.js (2)
src/controllers/datasetController.js (2)
  • datasets (14-14)
  • datasets (25-25)
src/utils/utils.js (2)
  • datasets (162-162)
  • getDatasets (160-164)
src/utils/utils.js (4)
src/utils/datasetLoader.js (2)
  • nameMap (60-60)
  • getDatasetNameMap (59-90)
src/middleware/common.middleware.js (1)
  • dataSubjects (943-943)
src/middleware/middleware.builders.js (1)
  • dataSubjects (23-23)
src/controllers/datasetController.js (2)
  • datasets (14-14)
  • datasets (25-25)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: run-tests / test
🔇 Additional comments (4)
test/unit/util.test.js (1)

7-14: Mock looks good; stable test surface.

Using vi.mock for getDatasetNameMap with a resolved map keeps tests hermetic and fast.

src/middleware/common.middleware.js (1)

8-8: Import change is fine.

Entry issue groups now come from utils; aligns with usage below.

test/unit/datasetLoader.test.js (2)

33-34: Confirm intended User-Agent value

Is the intended UA string “Planning Data Provide” or “Planning Data Provider”? Please align the test and implementation with the agreed value (ideally via a shared config constant).


37-40: LGTM: happy-path mapping assertion

Good verification of the dataset→name map returned by the loader.

Comment on lines +45 to 50
async get (req, res, next) {
// if the query params don't contain what we need, redirect to the "get started" page,
// this way the user can still proceed (but need to fill the dataset+orgName themselves)
const { dataset, orgName, orgId } = req.query
const datasets = await getDatasets()
const validationResult = v.safeParse(QueryParams, req.query)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Avoid unnecessary fetch and handle failures.

Validate query first; only then call getDatasets(). Also catch loader errors to avoid 500s.

-  async get (req, res, next) {
+  async get (req, res, next) {
     // if the query params don't contain what we need, redirect to the "get started" page,
     // this way the user can still proceed (but need to fill the dataset+orgName themselves)
     const { dataset, orgName, orgId } = req.query
-    const datasets = await getDatasets()
-    const validationResult = v.safeParse(QueryParams, req.query)
+    const validationResult = v.safeParse(QueryParams, req.query)
     if (!(validationResult.success && datasets.has(dataset))) {
       logger.info('DeepLinkController.get(): invalid params for deep link, redirecting to landing page',
         { type: types.App, query: req.query })
       return res.redirect('/')
     }
+    let datasets
+    try {
+      datasets = await getDatasets()
+    } catch (err) {
+      logger.error('DeepLinkController.get(): failed to load datasets', { type: types.App, errorMessage: err.message, errorStack: err.stack })
+      return res.redirect('/')
+    }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/controllers/checkDeepLinkController.js around lines 45 to 50, the code
calls getDatasets() before validating req.query and doesn't handle loader
errors; change flow to validate queryParams first using v.safeParse(QueryParams,
req.query) and only call await getDatasets() if validation passes, and wrap the
getDatasets() call in a try/catch to handle failures (call next(error) or return
an appropriate response) so loader errors don't produce unhandled 500s.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
test/integration/back_buttons.playwright.test.js (1)

160-160: Prefer stable selector over label text for dataset selection

Same concern as above: target a dataset id/value or test hook rather than the visible label to avoid coupling to API-provided names.

🧹 Nitpick comments (8)
test/integration/back_buttons.playwright.test.js (8)

17-21: Pre-warm dataset cache to prevent timeouts and cut repetition

Good update reflecting new entry at geometry-type. To reduce flaky timeouts from the dataset API and speed this suite up:

  • Pre-warm Redis (or in-memory) dataset cache once before tests.
  • Run this describe serially to ensure the first hit warms the cache for the rest.

Add near the top of this file (outside the changed lines):

test.describe.configure({ mode: 'serial' })

test.beforeAll(async ({ browser }) => {
  const page = await browser.newPage()
  const start = new StartPage(page)
  await start.navigateHere()
  const dataset = await start.verifyAndReturnPage(DatasetPage)
  await dataset.waitForPage() // triggers dataset load => cache warm
  await page.close()
})

If you can point PLANNING_DATA_API_* env vars at a lightweight stub in CI, that’ll be even faster.


33-37: DRY: factor service-start → geometry-type boilerplate into a helper

This sequence repeats across the file. Extract to a helper for readability and less test-time overhead.

Apply within this hunk:

-    const serviceStartGeometryTypePage = await startPage.verifyAndReturnPage(GeometryTypePage)
-    await serviceStartGeometryTypePage.waitForPage()
-    await serviceStartGeometryTypePage.selectGeometryType(geometryTypes.polygon)
-    await serviceStartGeometryTypePage.clickContinue()
+    const geometryTypePage = await startAtGeometryType(startPage, geometryTypes.polygon)
+    await geometryTypePage.clickContinue()

Add once (outside hunks):

async function startAtGeometryType (startPage, type) {
  const page = await startPage.verifyAndReturnPage(GeometryTypePage)
  await page.waitForPage()
  if (type) await page.selectGeometryType(type)
  return page
}

59-63: Apply the same helper here

Use the startAtGeometryType helper to cut duplication.

-    const serviceStartGeometryTypePage = await startPage.verifyAndReturnPage(GeometryTypePage)
-    await serviceStartGeometryTypePage.waitForPage()
-    await serviceStartGeometryTypePage.selectGeometryType(geometryTypes.polygon)
-    await serviceStartGeometryTypePage.clickContinue()
+    const geometryTypePage = await startAtGeometryType(startPage, geometryTypes.polygon)
+    await geometryTypePage.clickContinue()

79-83: Apply the same helper here

Same refactor for consistency.

-    const serviceStartGeometryTypePage = await startPage.verifyAndReturnPage(GeometryTypePage)
-    await serviceStartGeometryTypePage.waitForPage()
-    await serviceStartGeometryTypePage.selectGeometryType(geometryTypes.polygon)
-    await serviceStartGeometryTypePage.clickContinue()
+    const geometryTypePage = await startAtGeometryType(startPage, geometryTypes.polygon)
+    await geometryTypePage.clickContinue()

111-115: Apply the same helper here

Reduce repetition with startAtGeometryType.

-    const serviceStartGeometryTypePage = await startPage.verifyAndReturnPage(GeometryTypePage)
-    await serviceStartGeometryTypePage.waitForPage()
-    await serviceStartGeometryTypePage.selectGeometryType(geometryTypes.polygon)
-    await serviceStartGeometryTypePage.clickContinue()
+    const geometryTypePage = await startAtGeometryType(startPage, geometryTypes.polygon)
+    await geometryTypePage.clickContinue()

122-129: Stabilise dataset-driven step and reduce duplication

Selecting by visible label risks brittleness if upstream names change. Prefer targeting a stable dataset id/value (or add a data-* test hook). Also, this block repeats; consider a helper that navigates via dataset → geometry-type.

Example helper usage within this hunk:

-    await datasetPage.selectDataset(datasets.Tree)
-
-    const geometryTypePage = await datasetPage.clickContinue()
-    await geometryTypePage.waitForPage()
-    await geometryTypePage.selectGeometryType(geometryTypes.point)
-
-    const uploadMethodPage = await geometryTypePage.clickContinue()
+    const geometryTypePage = await proceedViaDatasetAndGeometry(startPage, datasets.Tree, geometryTypes.point)
+    const uploadMethodPage = await geometryTypePage.clickContinue()

Add once (outside hunks):

async function proceedViaDatasetAndGeometry (startPage, dataset, geoType) {
  const datasetPage = await startPage.verifyAndReturnPage(DatasetPage)
  await datasetPage.waitForPage()
  await datasetPage.selectDataset(dataset) // Prefer selector by value/id if available
  const geometryTypePage = await datasetPage.clickContinue()
  await geometryTypePage.waitForPage()
  await geometryTypePage.selectGeometryType(geoType)
  return geometryTypePage
}

If selectors by id/value exist (e.g., input[value="tree"]), switch the POM to use them and keep datasets.Tree mapping stable even if display text changes.


149-153: Apply the same startAtGeometryType helper here

Keeps tests concise and uniform.

-    const serviceStartGeometryTypePage = await startPage.verifyAndReturnPage(GeometryTypePage)
-    await serviceStartGeometryTypePage.waitForPage()
-    await serviceStartGeometryTypePage.selectGeometryType(geometryTypes.polygon)
-    await serviceStartGeometryTypePage.clickContinue()
+    const geometryTypePage = await startAtGeometryType(startPage, geometryTypes.polygon)
+    await geometryTypePage.clickContinue()

162-167: Collapse dataset → geometry-type navigation with helper

Reduce repetition and centralise page waits.

-    const geometryTypePage = await datasetPage.clickContinue()
-    await geometryTypePage.waitForPage()
-    await geometryTypePage.selectGeometryType(geometryTypes.point)
-
-    const uploadMethodPage = await geometryTypePage.clickContinue()
+    const geometryTypePage = await proceedViaDatasetAndGeometry(startPage, datasets.Tree, geometryTypes.point)
+    const uploadMethodPage = await geometryTypePage.clickContinue()
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0a637dc and fecedd9.

📒 Files selected for processing (1)
  • test/integration/back_buttons.playwright.test.js (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
test/integration/back_buttons.playwright.test.js (3)
test/PageObjectModels/geometryTypePage.js (1)
  • GeometryTypePage (9-24)
test/PageObjectModels/uploadMethodPage.js (1)
  • UploadMethodPage (10-29)
test/PageObjectModels/datasetPage.js (1)
  • DatasetPage (13-32)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: run-tests / test
🔇 Additional comments (1)
test/integration/back_buttons.playwright.test.js (1)

6-6: No changes required: geometryTypes is correctly exported
The named export for geometryTypes exists in test/PageObjectModels/geometryTypePage.js.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
test/unit/datasetController.test.js (1)

16-38: Mock not applied: vi.mock is inside beforeEach and uses a non-hoisted signature

  • Putting vi.mock inside beforeEach prevents hoisting; the real module is imported before the mock is registered.
  • Passing import('../../src/utils/utils.js') as the first arg disables Vitest’s hoisting entirely. This risks real network calls and timeouts.

Apply this diff to remove the non-hoisted mock here:

-    vi.mock(import('../../src/utils/utils.js'), async (importOriginal) => {
-      // const { availableDatasets, makeDatasetsLookup } = await importOriginal()
-      const actual = await importOriginal()
-      const dataSubjects = {
-        subject1: {
-          available: true,
-          dataSets:
-            [{ available: true, text: 'B', value: 'B', requiresGeometryTypeSelection: true },
-              { available: false, text: 'D', value: 'D', requiresGeometryTypeSelection: false }]
-        },
-        subject2: { available: false, dataSets: [{ available: true, text: 'C', value: 'C', requiresGeometryTypeSelection: false }] },
-        subject3: { available: true, dataSets: [{ available: true, text: 'A', value: 'A', requiresGeometryTypeSelection: true }] }
-      }
-      return {
-        // availableDatasets,
-        // dataSubjects,
-        // datasets: makeDatasetsLookup(dataSubjects)
-        ...actual,
-        getDataSubjects: vi.fn(async () => dataSubjects),
-        getDatasets: vi.fn(async () => actual.makeDatasetsLookup(dataSubjects)),
-        availableDatasets: vi.fn(() => actual.availableDatasets(dataSubjects))
-      }
-    })

Then add a hoisted, top-level mock (outside of any hook) so it’s applied before imports:

// place at top-level (Vitest will hoist this above imports)
vi.mock('../../src/utils/utils.js', async (importOriginal) => {
  const actual = await importOriginal()
  const dataSubjects = {
    subject1: {
      available: true,
      dataSets: [
        { available: true, text: 'B', value: 'B', requiresGeometryTypeSelection: true },
        { available: false, text: 'D', value: 'D', requiresGeometryTypeSelection: false }
      ]
    },
    subject2: { available: false, dataSets: [{ available: true, text: 'C', value: 'C', requiresGeometryTypeSelection: false }] },
    subject3: { available: true, dataSets: [{ available: true, text: 'A', value: 'A', requiresGeometryTypeSelection: true }] }
  }

  return {
    ...actual,
    getDataSubjects: vi.fn(async () => dataSubjects),
    getDatasets: vi.fn(async () => actual.makeDatasetsLookup(dataSubjects)),
    availableDatasets: vi.fn(() => actual.availableDatasets(dataSubjects))
  }
})

To avoid any API traffic from name-mapping during tests, also mock the dataset loader:

// also at top-level to prevent network calls
vi.mock('../../src/utils/datasetLoader.js', () => ({
  getDatasetNameMap: vi.fn(async () => new Map([
    ['A', 'A'],
    ['B', 'B'],
    ['C', 'C'],
    ['D', 'D']
  ]))
}))

Optional: move the initialiser to beforeAll to avoid re-initialising for every test:

beforeAll(async () => {
  await initDatasetSlugToReadableNameFilter()
})
test/integration/pages_load_ok.playwright.test.js (1)

20-24: Assertion never fails as written; assert visibility instead

getByText() always returns a Locator, so checking for !== undefined will always pass. Assert on visibility.

Apply:

-  const sessionExpiredText = await page.getByText('Session expired')
-  await expect(sessionExpiredText !== undefined).toBeTruthy()
+  await expect(page.getByText('Session expired')).toBeVisible()
♻️ Duplicate comments (4)
src/utils/utils.js (2)

97-97: Use bracket notation for consistency: nameMap['tree']

Keep access style consistent with the rest of the file.


10-25: Guard API failures and provide local fallbacks to keep UI/tests stable

A failed API call will currently propagate and can yield undefined texts later. Wrap the fetch in try/catch and define a small helper for defaults.

Apply this diff:

 export async function buildDataSubjects () {
   const datasetKeys = [
@@
-  const nameMap = await getDatasetNameMap(datasetKeys)
+  let nameMap = {}
+  try {
+    nameMap = await getDatasetNameMap(datasetKeys)
+  } catch (err) {
+    // Avoid breaking the module if the API is slow/unavailable (keeps tests fast too)
+    // eslint-disable-next-line no-console
+    console.warn('buildDataSubjects: using fallback dataset names:', err?.message ?? err)
+    nameMap = {}
+  }
+  const nn = (key, fallback) => nameMap?.[key] ?? fallback
test/unit/checkDeepLinkController.test.js (2)

37-42: Same here: ensure getDatasets is mocked for the happy-path test

Keeps the assertion on datasetName stable and fast.


28-31: Mock getDatasets to avoid network and flakiness in unit tests

Without a stub, this test can hit the network and slow/flake. Add a vi.mock for ../../src/utils/utils.js before importing the controller.

Apply this (placed at the very top of the file, before other imports):

vi.mock('../../src/utils/utils.js', () => {
  const datasets = new Map([
    ['conservation-area', { value: 'conservation-area', text: 'Conservation area', available: true, dataSubject: 'conservation-area' }]
  ])
  return { getDatasets: vi.fn().mockResolvedValue(datasets) }
})
🧹 Nitpick comments (7)
src/utils/utils.js (1)

165-166: Remove commented debug lines

Tidy up before merge.

Apply this diff:

-// export const datasets = makeDatasetsLookup(dataSubjects)
-// console.log('datasets:', datasets)
src/middleware/middleware.builders.js (1)

154-179: Optionally tolerate partial failures across datasets

Current Promise.all with rethrow means a single failing dataset nukes the whole response. If acceptable, switch to Promise.allSettled and keep successes; log failures.

Example approach:

const slugs = await ensureAvailableDatasets()
const results = await Promise.allSettled(slugs.map((d) => datasette.runQuery(query, d)))
const ok = results
  .map((r, i) => [slugs[i], r])
  .filter(([, r]) => r.status === 'fulfilled')
  .map(([slug, r]) => [slug, r.value.formattedData])
// for fetchOneFromAllDatasetsFn: pick [0] where length > 0

Also applies to: 195-216

test/integration/pages_load_ok.playwright.test.js (2)

61-65: Factor out repeated “service start → geometry-type → continue” steps

This block repeats across tests and can be a helper to reduce flakiness and maintenance.

Example helper:

async function startViaServiceGeometry (page, type = geometryTypes.polygon) {
  const startPage = new StartPage(page)
  await startPage.navigateHere()
  const g = await startPage.verifyAndReturnPage(GeometryTypePage)
  await g.waitForPage()
  await g.selectGeometryType(type)
  await g.clickContinue()
  return startPage
}

Then replace blocks with:

const startPage = await startViaServiceGeometry(page, geometryTypes.polygon)

Also applies to: 81-85, 105-109, 133-137


1-201: Speed up acceptance tests by stubbing the large dataset.json

To address timeouts noted in PR comments, stub the planning.data.gov.uk dataset.json in tests to a small fixture. This keeps E2E logic intact while avoiding multi‑MB downloads and flaky CI.

Example:

test.beforeEach(async ({ page }) => {
  await page.route('https://www.planning.data.gov.uk/dataset.json', async route => {
    await route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({
        datasets: [
          { slug: 'tree', name: 'Tree' },
          { slug: 'article-4-direction-area', name: 'Article 4 direction area' }
        ]
      })
    })
  })
})

You can lift this into a shared test setup if used across multiple suites.

test/integration/back_buttons.playwright.test.js (3)

59-63: DRY up repeated geometry-type selection

Same geometry-type selection appears multiple times; extract a helper to reduce duplication and ease future changes.

See the helper suggested in pages_load_ok.playwright.test.js and reuse it here.

Also applies to: 79-83, 111-115, 124-129, 149-153


55-55: Remove unused baseURL parameter

baseURL isn’t used in these tests.

Apply:

-  test('upload method page', async ({ page, baseURL }) => {
+  test('upload method page', async ({ page }) => {

Repeat for the other occurrences.

Also applies to: 75-75, 107-107, 145-145


1-183: Stub dataset.json in Playwright to reduce CI timeouts

Same advice as the other suite; route the dataset.json request to a small fixture to avoid repeated large downloads.

(Use the test.beforeEach route stub example shared in the other comment.)

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fecedd9 and 98907f5.

📒 Files selected for processing (17)
  • src/controllers/CheckAnswersController.js (2 hunks)
  • src/controllers/checkDeepLinkController.js (2 hunks)
  • src/controllers/datasetController.js (2 hunks)
  • src/controllers/endpointSubmissionFormDeepLinkController.js (2 hunks)
  • src/middleware/common.middleware.js (3 hunks)
  • src/middleware/middleware.builders.js (1 hunks)
  • src/utils/datasetLoader.js (1 hunks)
  • src/utils/utils.js (3 hunks)
  • test/acceptance/dataset_overview.test.js (2 hunks)
  • test/integration/back_buttons.playwright.test.js (7 hunks)
  • test/integration/pages_load_ok.playwright.test.js (5 hunks)
  • test/unit/checkDeepLinkController.test.js (2 hunks)
  • test/unit/datasetController.test.js (5 hunks)
  • test/unit/datasetLoader.test.js (1 hunks)
  • test/unit/endpointSubmissionFormLinkController.test.js (2 hunks)
  • test/unit/statusPage.test.js (1 hunks)
  • test/unit/util.test.js (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
  • test/unit/endpointSubmissionFormLinkController.test.js
  • test/acceptance/dataset_overview.test.js
  • test/unit/statusPage.test.js
  • src/controllers/datasetController.js
  • test/unit/datasetLoader.test.js
  • test/unit/util.test.js
  • src/controllers/endpointSubmissionFormDeepLinkController.js
  • src/utils/datasetLoader.js
  • src/controllers/CheckAnswersController.js
  • src/middleware/common.middleware.js
  • src/controllers/checkDeepLinkController.js
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-29T14:58:22.843Z
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#593
File: index.js:20-20
Timestamp: 2024-10-29T14:58:22.843Z
Learning: The function `initDatasetSlugToReadableNameFilter()` in `src/utils/datasetSlugToReadableName.js` already includes appropriate error handling and logging.

Applied to files:

  • src/utils/utils.js
🧬 Code graph analysis (5)
src/middleware/middleware.builders.js (2)
src/middleware/common.middleware.js (1)
  • dataSubjects (943-943)
src/utils/utils.js (5)
  • dataSubjects (132-132)
  • dataSubjects (161-161)
  • getDataSubjects (131-134)
  • availableDatasets (173-179)
  • availableDatasets (176-176)
test/integration/pages_load_ok.playwright.test.js (2)
test/PageObjectModels/geometryTypePage.js (3)
  • GeometryTypePage (9-24)
  • geometryTypes (4-7)
  • geometryTypes (4-7)
test/PageObjectModels/startPage.js (1)
  • StartPage (3-7)
src/utils/utils.js (5)
src/utils/datasetLoader.js (2)
  • nameMap (60-60)
  • getDatasetNameMap (59-90)
src/middleware/common.middleware.js (1)
  • dataSubjects (943-943)
src/middleware/middleware.builders.js (1)
  • dataSubjects (23-23)
test/unit/util.test.js (1)
  • dataSubjects (15-15)
src/controllers/datasetController.js (2)
  • datasets (14-14)
  • datasets (25-25)
test/unit/datasetController.test.js (2)
src/utils/utils.js (2)
  • dataSubjects (132-132)
  • dataSubjects (161-161)
src/controllers/datasetController.js (3)
  • req (26-26)
  • requiresGeometryTypeToBeSelected (13-18)
  • requiresGeometryTypeToBeSelectedViaDeepLink (24-29)
test/integration/back_buttons.playwright.test.js (3)
test/PageObjectModels/geometryTypePage.js (1)
  • GeometryTypePage (9-24)
test/PageObjectModels/uploadMethodPage.js (1)
  • UploadMethodPage (10-29)
test/PageObjectModels/datasetPage.js (1)
  • DatasetPage (13-32)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: run-tests / test
🔇 Additional comments (8)
src/utils/utils.js (3)

1-1: Import looks good

No issues with the new import of getDatasetNameMap.


131-134: Async wrapper LGTM

getDataSubjects pass-through is fine for call sites expecting an async API.


160-165: getDatasets LGTM

Good separation: retrieval + lookup construction.

test/unit/datasetController.test.js (1)

52-52: Async flow in tests is correct

Awaiting locals/post and the geometry-type checks matches the async utilities and controller methods. Looks good.

Also applies to: 68-68, 77-89, 91-103

test/integration/pages_load_ok.playwright.test.js (2)

8-9: Geometry-type step integration looks consistent

The added GeometryTypePage import and route-through steps align with the new flow. Good coverage across entry points.

Also applies to: 52-55, 61-65, 81-85, 105-109, 133-137


71-72: Stabilise dataset selection: avoid relying on visible labels from API

If labels are now API-driven, they may change. Prefer selecting by a stable value attribute or a test id.

Option A (no app change): use the radio’s value (example slug “tree”):

await page.locator('input[name="dataset"][value="tree"]').check()

Option B (preferred long-term): add data-testid to radios and target via getByTestId().

Also applies to: 91-93, 115-121, 143-151

test/integration/back_buttons.playwright.test.js (2)

6-6: Geometry-type page inserted correctly into back‑button flows

The added steps are coherent and exercise both forward and backward navigation. Good.

Also applies to: 17-21, 33-37, 59-63, 79-83, 111-115, 124-129, 149-153


43-46: Prefer selecting dataset by value or test id instead of label

Relying on UI text can break if API-provided names change.

If slug is stable, add in DatasetPage a method to select by value:

// test/PageObjectModels/datasetPage.js
async selectDatasetByValue (slug) {
  this.currentDataset = slug
  return await this.page.locator(`input[name="dataset"][value="${slug}"]`).check()
}

Then update tests:

await datasetPage.selectDatasetByValue('tree')

Also applies to: 92-97, 124-129, 162-167

Comment on lines +20 to 31
import { getDataSubjects } from '../utils/utils.js'

const availableDatasets = Object.values(dataSubjects).flatMap((dataSubject) =>
(dataSubject.dataSets || [])
.filter((dataset) => dataset.available)
.map((dataset) => dataset.value)
)
export async function getAvailableDatasets () {
const dataSubjects = await getDataSubjects()
return Object.values(dataSubjects).flatMap((dataSubject) =>
(dataSubject.dataSets || [])
.filter((dataset) => dataset.available)
.map((dataset) => dataset.value)
)
}
const availableDatasets = await getAvailableDatasets()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Avoid top‑level await; memoise available datasets with TTL to fix timeouts

Top‑level await blocks module initialisation and contributes to the reported timeouts. Lazy, TTL‑cached loading avoids repeated expensive calls during app start and tests.

Apply this diff:

 import { getDataSubjects } from '../utils/utils.js'

 export async function getAvailableDatasets () {
   const dataSubjects = await getDataSubjects()
   return Object.values(dataSubjects).flatMap((dataSubject) =>
     (dataSubject.dataSets || [])
       .filter((dataset) => dataset.available)
       .map((dataset) => dataset.value)
   )
 }
-const availableDatasets = await getAvailableDatasets()
+// Lazy, memoised cache with TTL; configurable for tests via env
+let availableDatasetsCache = null
+let availableDatasetsCachedAt = 0
+const AVAILABLE_DATASETS_TTL_MS = Number(process.env.AVAILABLE_DATASETS_TTL_MS ?? 5 * 60 * 1000)
+let inflight = null
+export async function ensureAvailableDatasets () {
+  const now = Date.now()
+  if (availableDatasetsCache && (now - availableDatasetsCachedAt) < AVAILABLE_DATASETS_TTL_MS) {
+    return availableDatasetsCache
+  }
+  if (inflight) return inflight
+  inflight = (async () => {
+    try {
+      const slugs = await getAvailableDatasets()
+      availableDatasetsCache = slugs
+      availableDatasetsCachedAt = Date.now()
+      return slugs
+    } finally {
+      inflight = null
+    }
+  })()
+  return inflight
+}

And update the call sites to use the lazy getter (outside this hunk):

// in fetchOneFromAllDatasetsFn
const slugs = await ensureAvailableDatasets()
const promises = slugs.map((dataset) => /* ... */)
req[this.result] = Object.fromEntries(/* use slugs[i] */)

// in fetchManyFromAllDatasetsFn
const slugs = await ensureAvailableDatasets()
const promises = slugs.map((dataset) => /* ... */)
req[this.result] = Object.fromEntries(/* map with slugs[i] */)
🤖 Prompt for AI Agents
In src/middleware/middleware.builders.js around lines 20 to 31, there is a
top-level await calling getAvailableDatasets which blocks module initialization;
replace it with a lazy, memoised async getter (e.g., ensureAvailableDatasets)
that caches the dataset slugs for a short TTL and refreshes after expiry to
avoid repeated expensive calls and startup timeouts. Implement
ensureAvailableDatasets to call getDataSubjects the first time (or after TTL),
compute the array of available dataset slugs, store them with a timestamp, and
return the cached value for subsequent calls until TTL elapses. Remove the
top-level await and export/keep the ensureAvailableDatasets function; update
call sites (fetchOneFromAllDatasetsFn and fetchManyFromAllDatasetsFn) to await
ensureAvailableDatasets(), map over the returned slugs for the promises, and
build req[this.result] using the corresponding slugs by index. Ensure TTL is
configurable and handle concurrent callers so only one refresh runs at a time.

Comment thread src/utils/utils.js
Comment on lines +27 to +112
'article-4-direction': {
available: true,
dataSets: [
{
value: 'article-4-direction',
text: nameMap['article-4-direction'],
available: true
},
{
value: 'article-4-direction-area',
text: nameMap['article-4-direction-area'],
available: true
}
]
},
'brownfield-land': {
available: true,
dataSets: [
{
value: 'brownfield-land',
text: nameMap['brownfield-land'],
available: true
},
{
value: 'brownfield-site',
text: nameMap['brownfield-site'],
available: false
}
]
},
'conservation-area': {
available: true,
dataSets: [
{
value: 'conservation-area',
text: nameMap['conservation-area'],
available: true
},
{
value: 'conservation-area-document',
text: nameMap['conservation-area-document'],
available: true
}
]
},
'listed-building': {
available: true,
dataSets: [
{
value: 'listed-building',
text: nameMap['listed-building'],
available: false
},
{
value: 'listed-building-grade',
text: nameMap['listed-building-grade'],
available: false
},
{
value: 'listed-building-outline',
text: nameMap['listed-building-outline'],
available: true
}
]
},
'tree-preservation-order': {
available: true,
dataSets: [
{
value: 'tree',
text: nameMap.tree,
available: true,
requiresGeometryTypeSelection: true
},
{
value: 'tree-preservation-order',
text: nameMap['tree-preservation-order'],
available: true
},
{
value: 'tree-preservation-zone',
text: nameMap['tree-preservation-zone'],
available: true
}
]
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Use safe fallbacks for all dataset display names (and fix ‘tree’ dot access)

Direct lookups can be undefined; this breaks sorting and renders blank labels. Use the helper to guarantee strings.

Apply this diff:

         {
           value: 'article-4-direction',
-          text: nameMap['article-4-direction'],
+          text: nn('article-4-direction', 'Article 4 direction'),
           available: true
         },
         {
           value: 'article-4-direction-area',
-          text: nameMap['article-4-direction-area'],
+          text: nn('article-4-direction-area', 'Article 4 direction area'),
           available: true
         }
@@
         {
           value: 'brownfield-land',
-          text: nameMap['brownfield-land'],
+          text: nn('brownfield-land', 'Brownfield land'),
           available: true
         },
         {
           value: 'brownfield-site',
-          text: nameMap['brownfield-site'],
+          text: nn('brownfield-site', 'Brownfield site'),
           available: false
         }
@@
         {
           value: 'conservation-area',
-          text: nameMap['conservation-area'],
+          text: nn('conservation-area', 'Conservation area'),
           available: true
         },
         {
           value: 'conservation-area-document',
-          text: nameMap['conservation-area-document'],
+          text: nn('conservation-area-document', 'Conservation area document'),
           available: true
         }
@@
         {
           value: 'listed-building',
-          text: nameMap['listed-building'],
+          text: nn('listed-building', 'Listed building'),
           available: false
         },
         {
           value: 'listed-building-grade',
-          text: nameMap['listed-building-grade'],
+          text: nn('listed-building-grade', 'Listed building grade'),
           available: false
         },
         {
           value: 'listed-building-outline',
-          text: nameMap['listed-building-outline'],
+          text: nn('listed-building-outline', 'Listed building outline'),
           available: true
         }
@@
         {
           value: 'tree',
-          text: nameMap.tree,
+          text: nn('tree', 'Tree'),
           available: true,
           requiresGeometryTypeSelection: true
         },
         {
           value: 'tree-preservation-order',
-          text: nameMap['tree-preservation-order'],
+          text: nn('tree-preservation-order', 'Tree preservation order'),
           available: true
         },
         {
           value: 'tree-preservation-zone',
-          text: nameMap['tree-preservation-zone'],
+          text: nn('tree-preservation-zone', 'Tree preservation zone'),
           available: true
         }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/utils/utils.js around lines 27-112, all dataset text fields currently
access nameMap directly (and one uses nameMap.tree via dot access) which can
yield undefined and break sorting/UI; update each dataset.text to call the
project’s safe helper (e.g. getDisplayName(nameMap, 'key')) instead of direct
lookups and change nameMap.tree to use the helper (or bracket access) so every
text value always returns a string fallback.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
test/integration/validation_errors.playwright.test.js (2)

78-78: Live dataset dependency note applies here as well

Same concern as at Line 47 regarding stubbing dataset.json.


112-112: Live dataset dependency note applies here too

Same stubbing advice as Line 47.

🧹 Nitpick comments (8)
test/integration/validation_errors.playwright.test.js (8)

23-23: Add an explicit wait before interacting

Insert a wait to avoid racing the UI after navigation.

-  await geometryTypePage.clickContinue(true)
+  await geometryTypePage.waitForPage()
+  await geometryTypePage.clickContinue(true)

38-42: Drop the initial geometry-type detour to speed up tests

This pre-navigation (select + continue) is redundant given you later navigate to dataset → geometry-type again. Removing it reduces runtime and state leakage risk.

-  const serviceStartGeometryTypePage = await startPage.verifyAndReturnPage(GeometryTypePage)
-  await serviceStartGeometryTypePage.waitForPage()
-  await serviceStartGeometryTypePage.selectGeometryType(geometryTypes.polygon)
-  await serviceStartGeometryTypePage.clickContinue()
+  // Start page already redirects; go straight to the dataset selection you actually want to test

47-47: Avoid live API dependency for dataset names in integration tests

Selecting datasets.Tree may still cause a live fetch of dataset mappings, risking timeouts/flakiness. Stub the dataset endpoint or provide a fixture-backed mapping in tests.

Example Playwright route stub (adapt path/fixture as needed):

test.beforeEach(async ({ page }) => {
  await page.route('https://www.planning.data.gov.uk/dataset.json*', route =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify(require('../fixtures/dataset.json'))
    })
  )
})

49-53: Wait for UploadMethod page after continue to eliminate race

After continue, ensure the next page is fully loaded before interacting/asserting.

-  const uploadMethodPage2 = await geometryTypePage.clickContinue()
+  const uploadMethodPage2 = await geometryTypePage.clickContinue()
+  await uploadMethodPage2.waitForPage()

69-73: Same redundant pre-navigation — remove for speed and determinism

This mirrors Lines 38–42 and can be safely dropped.

-  const serviceStartGeometryTypePage = await startPage.verifyAndReturnPage(GeometryTypePage)
-  await serviceStartGeometryTypePage.waitForPage()
-  await serviceStartGeometryTypePage.selectGeometryType(geometryTypes.polygon)
-  await serviceStartGeometryTypePage.clickContinue()
+  // Avoid duplicate journey; proceed directly to dataset flow

84-84: Add explicit wait for UploadMethod page load

Mirror the earlier suggestion to prevent flakiness.

-  const uploadMethodPage2 = await geometryTypePage.clickContinue()
+  const uploadMethodPage2 = await geometryTypePage.clickContinue()
+  await uploadMethodPage2.waitForPage()

103-107: Same redundant pre-navigation — remove

Repeats the initial detour; trim for speed and to avoid state carry-over.

-  const serviceStartGeometryTypePage = await startPage.verifyAndReturnPage(GeometryTypePage)
-  await serviceStartGeometryTypePage.waitForPage()
-  await serviceStartGeometryTypePage.selectGeometryType(geometryTypes.polygon)
-  await serviceStartGeometryTypePage.clickContinue()
+  // Go straight to dataset → geometry-type path under test

118-118: Waits after navigation: add for both UploadMethod and URL pages

Add explicit waits after continuing to UploadMethod and then to URL pages to avoid timing issues.

-  const uploadMethodPage2 = await geometryTypePage.clickContinue()
+  const uploadMethodPage2 = await geometryTypePage.clickContinue()
+  await uploadMethodPage2.waitForPage()

And just below (outside this hunk), consider:

-  const uploadURLPage = await uploadMethodPage2.clickContinue()
+  const uploadURLPage = await uploadMethodPage2.clickContinue()
+  await uploadURLPage.waitForPage()
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 98907f5 and 86cb284.

📒 Files selected for processing (1)
  • test/integration/validation_errors.playwright.test.js (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
test/integration/validation_errors.playwright.test.js (4)
test/PageObjectModels/startPage.js (1)
  • StartPage (3-7)
test/PageObjectModels/geometryTypePage.js (1)
  • GeometryTypePage (9-24)
test/PageObjectModels/uploadMethodPage.js (1)
  • UploadMethodPage (10-29)
test/PageObjectModels/datasetPage.js (1)
  • DatasetPage (13-32)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: run-tests / test
🔇 Additional comments (3)
test/integration/validation_errors.playwright.test.js (3)

5-5: Import looks correct

Importing the GeometryType page object and enum is consistent with the new flow.


80-83: LGTM: explicit wait and selection are appropriate

Waiting for the geometry-type page and selecting the intended type is correct.


114-116: LGTM: wait + selection sequencing is sound

This is consistent with the intended journey for the Tree dataset.

Comment on lines +11 to +15
const serviceStartGeometryTypePage = await startPage.verifyAndReturnPage(GeometryTypePage)
await serviceStartGeometryTypePage.waitForPage()
await serviceStartGeometryTypePage.selectGeometryType(geometryTypes.polygon)
await serviceStartGeometryTypePage.clickContinue()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

⚠️ Potential issue

Test pre-selects a geometry type before asserting “no selection” error — likely false negative/flakiness

You select a geometry type (polygon) before later asserting the “no selection” error on the geometry-type page. If the selection persists in session, the error will never surface. Simplify this test to exercise the error immediately on the first geometry-type page (no dataset round-trip), e.g. click Continue without selecting and assert the error on the same page.

Proposed rewrite for this test (illustrative):

test('when the user clicks continue on the geometry-type page without entering a geometry type, the page correctly indicates there\'s an error', async ({ page }) => {
  const startPage = new StartPage(page)
  await startPage.navigateHere()

  const geometryTypePage = await startPage.verifyAndReturnPage(GeometryTypePage)
  await geometryTypePage.waitForPage()
  await geometryTypePage.clickContinue(true)

  const expectedErrors = [
    {
      fieldName: 'input#geomType.govuk-radios__input',
      expectedErrorMessage: 'Select if your geometry data given as points or polygons'
    }
  ]
  await geometryTypePage.expectErrorMessages(expectedErrors)
})
🤖 Prompt for AI Agents
In test/integration/validation_errors.playwright.test.js around lines 11 to 15,
the test currently pre-selects a geometry type (polygon) before asserting the
“no selection” validation, which can mask the error and cause flakiness; update
the test to exercise the validation immediately by not selecting any geometry
type: instantiate and navigate to the start page, transition to the
GeometryTypePage, wait for the page, call clickContinue (without selecting
anything) so the page validates, and then assert the expected error message for
the geometry type input on the same GeometryTypePage instead of later after a
dataset round-trip.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
test/acceptance/dataset_overview.test.js (1)

79-79: Rename test to reflect reality (no loader mock).

Suggest: “Dataset names displayed in ODP mandatory list match expected mapping”.

-test('Dataset names shown dynamically match names fetched from datasetLoader', async ({ page }) => {
+test('Dataset names displayed in ODP mandatory list match expected mapping', async ({ page }) => {
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 86cb284 and f1d971c.

📒 Files selected for processing (1)
  • test/acceptance/dataset_overview.test.js (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
test/acceptance/dataset_overview.test.js (2)
src/utils/utils.js (1)
  • nameMap (25-25)
test/PageObjectModels/organisationOverviewPage.js (1)
  • OrganisationOverviewPage (4-16)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: run-tests / test

Comment on lines +82 to +94
// Mock the expected nameMap to avoid Redis calls
const nameMap = {
'brownfield-land': 'Brownfield land',
'article-4-direction-area': 'Article 4 direction area',
'article-4-direction': 'Article 4 direction',
'conservation-area': 'Conservation area',
'conservation-area-document': 'Conservation area document',
tree: 'Tree',
'tree-preservation-zone': 'Tree preservation zone',
'listed-building': 'Listed building',
'listed-building-outline': 'Listed building outline',
'tree-preservation-order': 'Tree preservation order'
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

This doesn’t mock anything; the app will still hit Redis/API.

The inline nameMap is only used for assertions. It does not stub datasetLoader or prevent network/Redis access, so this test can still time out or be flaky.

  • Either actually stub server-side calls, or adjust the comment/title to avoid claiming a mock.

Options to make the test hermetic:

  • Start a tiny local HTTP server in test setup to serve a minimal /dataset.json, make datasetLoader read its base URL from an env var, and point it to the local server for acceptance runs.
  • If the app starts in-process with Playwright, use nock to intercept https://www.planning.data.gov.uk/dataset.json.
  • Pre-seed Redis with the expected name map in a test fixture so the API isn’t called.

Would you like me to draft a minimal change to datasetLoader to take a PLANNING_DATA_API_BASE_URL env var and a Playwright fixture that spins up a stub server?


🏁 Script executed:

#!/bin/bash
set -e
sed -n '1,200p' test/acceptance/dataset_overview.test.js

Length of output: 5342


Stub server-side calls or remove the “mock” claim in the dataset overview test
The inline nameMap at test/acceptance/dataset_overview.test.js (lines 82–94) is only used for assertions and does not stop real Redis/API requests, making the test prone to flakiness.

  • Replace this with a true stub of datasetLoader (for example, use nock to intercept https://www.planning.data.gov.uk/dataset.json)
  • Or pre-seed Redis with the expected nameMap in your test fixture
  • Alternatively, update the comment/title to no longer call it a mock
    Let me know if you’d like a draft change to add a PLANNING_DATA_API_BASE_URL env var to datasetLoader and a Playwright fixture that spins up a stub server.
🤖 Prompt for AI Agents
In test/acceptance/dataset_overview.test.js around lines 82–94 the inline
nameMap is only used for assertions and does not prevent real Redis/API calls;
replace this fragile “mock” by either stubbing the datasetLoader HTTP call or
pre-seeding Redis: option A — use nock (or similar) to intercept GET
https://www.planning.data.gov.uk/dataset.json and respond with a dataset payload
that yields the expected nameMap (add the nock setup in the test before the app
starts and cleanup after); option B — write fixture code to set the Redis key(s)
used by datasetLoader with the expected nameMap before starting the app and
clear them after; or option C — if you don’t want to stub/seed, change the
comment to no longer call it a mock; implement one of these fixes and ensure the
test starts after the stub/seed is applied so no external HTTP/Redis requests
occur.

Comment on lines +100 to +109
const datasetElements = await page.locator('[data-testid=datasetsOdpMandatory] .govuk-task-list li').allInnerTexts()
const displayedNames = datasetElements
.map(text => text.split('\n')[0].trim())

for (const displayedName of displayedNames) {
const matchingKey = Object.keys(nameMap).find(key => nameMap[key] === displayedName)
expect(matchingKey).toBeDefined()
}
expect(displayedNames.length).toEqual(8)
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Assert by dataset key via link href; avoid brittle newline split and magic 8.

Current approach splits on newlines and only checks that names exist in a set, not that each name matches its dataset key. Extract the datasetId from the task-list link and assert nameMap[datasetId] === displayedName. Also drop the hard-coded 8 (this will drift) or compute against a canonical expected list.

-    const datasetElements = await page.locator('[data-testid=datasetsOdpMandatory] .govuk-task-list li').allInnerTexts()
-    const displayedNames = datasetElements
-      .map(text => text.split('\n')[0].trim())
-
-    for (const displayedName of displayedNames) {
-      const matchingKey = Object.keys(nameMap).find(key => nameMap[key] === displayedName)
-      expect(matchingKey).toBeDefined()
-    }
-    expect(displayedNames.length).toEqual(8)
+    const linkLocator = page.locator('[data-testid=datasetsOdpMandatory] .govuk-task-list li a[href*="/overview"]')
+    const items = await linkLocator.evaluateAll(els =>
+      els.map(a => {
+        const href = a.getAttribute('href') || ''
+        const parts = href.split('/')
+        const datasetId = parts[parts.length - 2] // /organisations/:org/:datasetId/overview
+        const displayedName = (a.textContent || '').trim()
+        return { datasetId, displayedName }
+      })
+    )
+
+    for (const { datasetId, displayedName } of items) {
+      expect(nameMap[datasetId]).toBe(displayedName)
+    }
+    // Optional: if you have a canonical list for ODP mandatory datasets, assert exact count against it instead of a magic number.
+    // expect(items.length).toBe(expectedOdpDatasetIds.length)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const datasetElements = await page.locator('[data-testid=datasetsOdpMandatory] .govuk-task-list li').allInnerTexts()
const displayedNames = datasetElements
.map(text => text.split('\n')[0].trim())
for (const displayedName of displayedNames) {
const matchingKey = Object.keys(nameMap).find(key => nameMap[key] === displayedName)
expect(matchingKey).toBeDefined()
}
expect(displayedNames.length).toEqual(8)
})
const linkLocator = page.locator('[data-testid=datasetsOdpMandatory] .govuk-task-list li a[href*="/overview"]')
const items = await linkLocator.evaluateAll(els =>
els.map(a => {
const href = a.getAttribute('href') || ''
const parts = href.split('/')
const datasetId = parts[parts.length - 2] // /organisations/:org/:datasetId/overview
const displayedName = (a.textContent || '').trim()
return { datasetId, displayedName }
})
)
for (const { datasetId, displayedName } of items) {
expect(nameMap[datasetId]).toBe(displayedName)
}
// Optional: if you have a canonical list for ODP mandatory datasets, assert exact count against it instead of a magic number.
// expect(items.length).toBe(expectedOdpDatasetIds.length)
🤖 Prompt for AI Agents
In test/acceptance/dataset_overview.test.js around lines 100 to 109, the test
currently relies on splitting link text by newline and asserts a hard-coded
count of 8; change it to iterate task-list items, for each item find the anchor
href, extract the datasetId from that href, and assert that nameMap[datasetId]
=== displayedName (where displayedName is the visible text from that item), and
replace the hard-coded length assertion with a comparison against
Object.keys(nameMap).length (or another canonical expected list) so the test
asserts by dataset key and avoids brittle newline parsing and magic numbers.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (5)
src/controllers/checkDeepLinkController.js (1)

45-55: Reorder validation and add error handling to prevent unnecessary API calls and unhandled rejections.

The previous review comment highlighting this issue remains unaddressed. getDatasets() is still invoked before validating query parameters, causing unnecessary API calls when parameters are invalid. Additionally, the lack of error handling can result in unhandled promise rejections and 500 errors.

Given the PR context mentions test timeouts and large API payloads, avoiding unnecessary calls is critical for performance.

Apply this diff to validate first and wrap the API call in error handling:

   async get (req, res, next) {
     // if the query params don't contain what we need, redirect to the "get started" page,
     // this way the user can still proceed (but need to fill the dataset+orgName themselves)
     const { dataset, orgName, orgId } = req.query
-    const datasets = await getDatasets()
     const validationResult = v.safeParse(QueryParams, req.query)
-    if (!(validationResult.success && datasets.has(dataset))) {
+    if (!validationResult.success) {
       logger.info('DeepLinkController.get(): invalid params for deep link, redirecting to landing page',
         { type: types.App, query: req.query })
       return res.redirect('/')
     }
+
+    let datasets
+    try {
+      datasets = await getDatasets()
+    } catch (err) {
+      logger.error('DeepLinkController.get(): failed to load datasets', { 
+        type: types.App, 
+        errorMessage: err.message, 
+        errorStack: err.stack 
+      })
+      return res.redirect('/')
+    }
+
+    if (!datasets.has(dataset)) {
+      logger.info('DeepLinkController.get(): dataset not found, redirecting to landing page',
+        { type: types.App, dataset })
+      return res.redirect('/')
+    }
src/utils/datasetLoader.js (4)

45-45: Fix typo in user-agent header.

The user-agent header says "Planning Data Provide" but should be "Planning Data Submit" to match the project name.

Apply this diff:

-    { headers: { 'user-agent': 'Planning Data Provide' } }
+    { headers: { 'user-agent': 'Planning Data Submit' } }

43-47: CRITICAL: Add timeout to fetch to prevent test hangs.

This is the root cause of the test timeouts described in the PR comments. The fetch call has no timeout, so when the planning data API returns a large payload or responds slowly, tests hang for 5+ minutes until the test framework times out. This matches the exact symptoms: "acceptance test timeouts", "large payload from https://www.planning.data.gov.uk/dataset.json", and "tests pass locally but timeout on GitHub Actions".

Apply this diff to add a configurable timeout using AbortController:

+  const timeoutMs = Number(config.planningDataApi?.timeoutMs ?? 10000)
+  const controller = new AbortController()
+  const timeoutId = setTimeout(() => controller.abort(), timeoutMs)
+
-  const res = await fetch(
-    `${config.mainWebsiteUrl}/dataset.json?${params.toString()}&field=dataset&field=name&include_typologies=false`,
-    { headers: { 'user-agent': 'Planning Data Provide' } }
-  )
+  let res
+  try {
+    res = await fetch(
+      `${config.mainWebsiteUrl}/dataset.json?${params.toString()}&field=dataset&field=name&include_typologies=false`,
+      {
+        headers: { 'user-agent': 'Planning Data Submit' },
+        signal: controller.signal
+      }
+    )
+  } catch (err) {
+    clearTimeout(timeoutId)
+    if (err.name === 'AbortError') {
+      throw new Error(`Failed to fetch datasets from API: request timeout after ${timeoutMs}ms`)
+    }
+    throw err
+  } finally {
+    clearTimeout(timeoutId)
+  }
-  if (!res.ok) throw new Error(`Failed to fetch datasets from API: ${res.statusText}`)
+  if (!res.ok) {
+    throw new Error(`Failed to fetch datasets from API: ${res.status} ${res.statusText} (${res.url})`)
+  }

62-93: Add in-memory fallback cache for test and development environments.

This is a major contributor to the test timeout issue. When Redis is unavailable (common in test/dev environments), every single request to getDatasetNameMap hits the slow external API with the large payload. An in-memory cache would ensure that after the first (slow) request, subsequent calls are instant—dramatically improving test performance and reliability.

The PR comments specifically mention this: eveleighoj suggested "Implement caching (e.g., Redis) so the dataset list is not re-downloaded on each access" and noted that tests timeout because of repeated API calls.

Apply this diff to add a lightweight in-memory fallback:

 let redisClient
 
+// Lightweight in-memory cache: key → { value, expiresAt }
+const memoryCache = new Map()
+
+function memGet (key) {
+  const entry = memoryCache.get(key)
+  if (!entry) return null
+  if (entry.expiresAt < Date.now()) {
+    memoryCache.delete(key)
+    return null
+  }
+  return entry.value
+}
+
+function memSet (key, value, ttlSeconds) {
+  memoryCache.set(key, {
+    value,
+    expiresAt: Date.now() + ttlSeconds * 1000
+  })
+}
+
 export async function getDatasetNameMap (datasetKeys) {
   let nameMap = {}
   if (!Array.isArray(datasetKeys) || !datasetKeys.length) return {}
 
   // normalize order → consistent cache key
   const cacheKey = `dataset:${datasetKeys.slice().sort().join(',')}`
+  
+  // 1) Check in-memory cache first
+  const memHit = memGet(cacheKey)
+  if (memHit) return memHit
+  
   const client = await getRedisClient()
 
   if (client) {
     try {
       const cached = await client.get(cacheKey)
       if (cached) {
         nameMap = JSON.parse(cached)
+        // Also populate memory cache
+        memSet(cacheKey, nameMap, CACHE_TTL)
         return nameMap
       }
     } catch (err) {
       logger.warn(`datasetLoader/redis get error: ${err.message}`)
     }
   }
 
   // fallback → fetch fresh
   nameMap = await fetchDatasetNames(datasetKeys)
 
   if (client) {
     try {
       await client.setEx(cacheKey, CACHE_TTL, JSON.stringify(nameMap))
     } catch (err) {
       logger.warn(`datasetLoader/redis set error: ${err.message}`)
     }
   }
+  
+  // Always populate memory cache
+  memSet(cacheKey, nameMap, CACHE_TTL)
+  
   return nameMap
 }

8-34: CRITICAL: Missing error event listener on Redis client.

The Redis client is created without an error listener attached. According to best practices for the node-redis library, you must attach at least one 'error' listener to avoid unhandled exceptions from the EventEmitter, which can crash the process.

Based on learnings.

Apply this diff to attach the error listener immediately after client creation:

   if (!redisClient) {
     const urlPrefix = `redis${config.redis.secure ? 's' : ''}`
     redisClient = createClient({
       url: `${urlPrefix}://${config.redis.host}:${config.redis.port}`
     })
+
+    redisClient.on('error', (err) => {
+      logger.warn(`datasetLoader/redis error: ${err.message}`)
+    })

     try {
       await redisClient.connect()
🧹 Nitpick comments (5)
src/controllers/checkDeepLinkController.js (2)

67-68: Clarify or track the form wizard timing issue.

The comment references a "bug with Form wizard" that causes a timing issue with async operations, but doesn't provide details or a tracking reference. Consider either elaborating on the issue in the comment or creating a ticket to track it, ensuring future maintainers understand why this workaround exists.


76-76: Remove commented-out code.

The commented super.post(req, res, next) call should be removed if it's no longer needed, as it adds clutter and can confuse future maintainers.

Apply this diff to remove the commented line:

       return res.redirect('/check/upload-method')
     }
-    // super.post(req, res, next)
   }
src/utils/datasetLoader.js (3)

12-15: Add support for Redis auth and TLS URLs.

Many production Redis instances require authentication and TLS. The current implementation only supports basic host/port/secure configuration. Consider supporting full Redis URLs (with embedded credentials) or separate username/password config options.

Apply this diff to support auth and full URLs:

   if (!redisClient) {
-    const urlPrefix = `redis${config.redis.secure ? 's' : ''}`
-    redisClient = createClient({
-      url: `${urlPrefix}://${config.redis.host}:${config.redis.port}`
-    })
+    // Prefer a full URL if provided; else construct from parts
+    let url = config.redis.url
+    if (!url) {
+      const urlPrefix = `redis${config.redis.secure ? 's' : ''}`
+      const creds = (config.redis.username || config.redis.password)
+        ? `${encodeURIComponent(config.redis.username || '')}:${encodeURIComponent(config.redis.password || '')}@`
+        : ''
+      url = `${urlPrefix}://${creds}${config.redis.host}:${config.redis.port}`
+    }
+    redisClient = createClient({ url })

8-34: Export a cleanup function for graceful shutdown and test teardown.

The module maintains a singleton Redis client but doesn't expose a way to close it. This can cause test processes to hang due to open handles and prevents graceful shutdown in production.

Based on learnings.

Add this export after the getRedisClient function:

/**
 * Close the Redis client for graceful shutdown or test cleanup.
 * Safe to call multiple times.
 */
export async function closeRedisClient () {
  if (redisClient) {
    try {
      await redisClient.quit()
    } catch (err) {
      logger.warn(`datasetLoader/failed to close redis client: ${err.message}`)
    }
    redisClient = null
  }
}

62-93: Consider adding protection against cache stampede.

If multiple requests arrive simultaneously before the cache is populated, they will all hit the slow external API concurrently. Consider using a simple promise-memoization pattern to ensure only one in-flight request per cache key.

Add this helper at module scope:

// Promise memoization to prevent cache stampede
const inflightRequests = new Map()

async function memoizeFetch (cacheKey, fetchFn) {
  if (inflightRequests.has(cacheKey)) {
    return inflightRequests.get(cacheKey)
  }
  
  const promise = fetchFn().finally(() => {
    inflightRequests.delete(cacheKey)
  })
  
  inflightRequests.set(cacheKey, promise)
  return promise
}

Then wrap the fetch call:

// Inside getDatasetNameMap, after cache miss:
nameMap = await memoizeFetch(cacheKey, () => fetchDatasetNames(datasetKeys))
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eab17ea and 726fe62.

📒 Files selected for processing (2)
  • src/controllers/checkDeepLinkController.js (3 hunks)
  • src/utils/datasetLoader.js (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-29T14:58:22.843Z
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#593
File: index.js:20-20
Timestamp: 2024-10-29T14:58:22.843Z
Learning: The function `initDatasetSlugToReadableNameFilter()` in `src/utils/datasetSlugToReadableName.js` already includes appropriate error handling and logging.

Applied to files:

  • src/utils/datasetLoader.js
🧬 Code graph analysis (2)
src/controllers/checkDeepLinkController.js (2)
src/controllers/datasetController.js (4)
  • req (26-26)
  • datasets (14-14)
  • datasets (25-25)
  • requiresGeometryTypeToBeSelectedViaDeepLink (24-29)
src/utils/utils.js (2)
  • datasets (162-162)
  • getDatasets (160-164)
src/utils/datasetLoader.js (3)
config/index.js (1)
  • config (13-13)
src/utils/logger.js (1)
  • logger (21-28)
src/utils/utils.js (3)
  • datasetKeys (11-24)
  • datasets (162-162)
  • nameMap (25-25)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: run-tests / test

Comment thread src/controllers/checkDeepLinkController.js Outdated
Comment thread src/utils/datasetLoader.js
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (3)
src/controllers/checkDeepLinkController.js (2)

46-56: Validate query parameters before fetching datasets.

The getDatasets() call at line 50 occurs before query parameter validation at line 51. This is inefficient because dataset fetching (which may involve API calls or cache lookups) happens even when the query parameters are invalid.

Validate req.query first using v.safeParse(QueryParams, req.query), and only call await getDatasets() if validation passes.

Apply this diff to reorder the operations:

   async get (req, res, next) {
     // if the query params don't contain what we need, redirect to the "get started" page,
     // this way the user can still proceed (but need to fill the dataset+orgName themselves)
     const { dataset, orgName, orgId } = req.query
-    const datasets = await getDatasets()
     const validationResult = v.safeParse(QueryParams, req.query)
+    if (!validationResult.success) {
+      logger.info('DeepLinkController.get(): invalid params for deep link, redirecting to landing page',
+        { type: types.App, query: req.query })
+      return res.redirect('/')
+    }
+    const datasets = await getDatasets()
     if (!(validationResult.success && datasets.has(dataset))) {
       logger.info('DeepLinkController.get(): invalid params for deep link, redirecting to landing page',
         { type: types.App, query: req.query })
       return res.redirect('/')
     }

46-72: Add error handling for async operations.

The async calls to getDatasets() (line 50) and requiresGeometryTypeToBeSelectedViaDeepLink() (line 68) lack error handling. In Express 4, unhandled promise rejections in async middleware can cause the application to hang or crash.

Wrap the async operations in try/catch blocks to handle failures gracefully.

Apply this diff to add error handling:

   async get (req, res, next) {
-    // if the query params don't contain what we need, redirect to the "get started" page,
-    // this way the user can still proceed (but need to fill the dataset+orgName themselves)
-    const { dataset, orgName, orgId } = req.query
-    const datasets = await getDatasets()
-    const validationResult = v.safeParse(QueryParams, req.query)
-    if (!(validationResult.success && datasets.has(dataset))) {
-      logger.info('DeepLinkController.get(): invalid params for deep link, redirecting to landing page',
-        { type: types.App, query: req.query })
-      return res.redirect('/')
-    }
-
-    req.sessionModel.set('dataset', dataset)
-    const datasetInfo = datasets.get(dataset) ?? { dataSubject: '', requiresGeometryTypeSelection: false }
-    req.sessionModel.set('data-subject', datasetInfo.dataSubject)
-    const sessionData = { 'data-subject': datasetInfo.dataSubject, orgName, orgId, dataset, datasetName: datasetInfo.text }
-    maybeSetReferrer(req, sessionData)
-    req.sessionModel.set(this.sessionKey, sessionData)
-
-    this.#addHistoryStep(req, '/check/dataset')
-
-    // Pre-calculate geometry requirement to avoid async timing issues in form wizard async conditional routing
-    const requiresGeometry = await requiresGeometryTypeToBeSelectedViaDeepLink(req)
-    req.sessionModel.set('requiresGeometryTypeSelection', requiresGeometry)
-
-    super.post(req, res, next)
+    try {
+      // if the query params don't contain what we need, redirect to the "get started" page,
+      // this way the user can still proceed (but need to fill the dataset+orgName themselves)
+      const { dataset, orgName, orgId } = req.query
+      const validationResult = v.safeParse(QueryParams, req.query)
+      if (!validationResult.success) {
+        logger.info('DeepLinkController.get(): invalid params for deep link, redirecting to landing page',
+          { type: types.App, query: req.query })
+        return res.redirect('/')
+      }
+
+      const datasets = await getDatasets()
+      if (!datasets.has(dataset)) {
+        logger.info('DeepLinkController.get(): unknown dataset in deep link, redirecting to landing page',
+          { type: types.App, dataset })
+        return res.redirect('/')
+      }
+
+      req.sessionModel.set('dataset', dataset)
+      const datasetInfo = datasets.get(dataset) ?? { dataSubject: '', requiresGeometryTypeSelection: false }
+      req.sessionModel.set('data-subject', datasetInfo.dataSubject)
+      const sessionData = { 'data-subject': datasetInfo.dataSubject, orgName, orgId, dataset, datasetName: datasetInfo.text }
+      maybeSetReferrer(req, sessionData)
+      req.sessionModel.set(this.sessionKey, sessionData)
+
+      this.#addHistoryStep(req, '/check/dataset')
+
+      // Pre-calculate geometry requirement to avoid async timing issues in form wizard async conditional routing
+      const requiresGeometry = await requiresGeometryTypeToBeSelectedViaDeepLink(req)
+      req.sessionModel.set('requiresGeometryTypeSelection', requiresGeometry)
+
+      super.post(req, res, next)
+    } catch (err) {
+      logger.error('DeepLinkController.get(): error processing deep link', {
+        type: types.App,
+        errorMessage: err.message,
+        errorStack: err.stack
+      })
+      return res.redirect('/')
+    }
   }
src/controllers/datasetController.js (1)

32-36: Add error handling to async locals method.

The async locals method lacks error handling. In Express 4, unhandled promise rejections in async middleware can cause application failures. If getDataSubjects() or super.locals() fails, the error will not be caught.

Wrap the method body in try/catch and forward errors to Express using next(err).

Apply this diff to add error handling:

   async locals (req, res, next) {
-    const dataSubjects = await getDataSubjects()
-    req.form.options.datasetItems = availableDatasets(dataSubjects)
-    super.locals(req, res, next)
+    try {
+      const dataSubjects = await getDataSubjects()
+      req.form.options.datasetItems = availableDatasets(dataSubjects)
+      return super.locals(req, res, next)
+    } catch (err) {
+      req.handlerName = 'DatasetController.locals'
+      return next(err)
+    }
   }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 50067d5 and 11a39e3.

📒 Files selected for processing (4)
  • src/controllers/checkDeepLinkController.js (3 hunks)
  • src/controllers/datasetController.js (2 hunks)
  • src/routes/form-wizard/check/steps.js (3 hunks)
  • test/unit/datasetController.test.js (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/unit/datasetController.test.js
🧰 Additional context used
🧬 Code graph analysis (2)
src/controllers/datasetController.js (1)
src/utils/utils.js (7)
  • datasets (162-162)
  • getDatasets (160-164)
  • dataSubjects (132-132)
  • dataSubjects (161-161)
  • getDataSubjects (131-134)
  • availableDatasets (173-179)
  • availableDatasets (176-176)
src/controllers/checkDeepLinkController.js (2)
src/controllers/datasetController.js (4)
  • req (26-26)
  • datasets (14-14)
  • datasets (25-25)
  • requiresGeometryTypeToBeSelectedViaDeepLink (24-29)
src/utils/utils.js (2)
  • datasets (162-162)
  • getDatasets (160-164)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: run-tests / test
🔇 Additional comments (2)
src/routes/form-wizard/check/steps.js (2)

3-3: LGTM! Import simplification aligns with session-based routing.

The removal of named imports is consistent with the refactoring to session-based routing flags, eliminating the need for function-based routing checks in this configuration file.


33-38: LGTM! Session-based routing avoids async timing issues.

The refactoring from function-based routing checks to a session boolean flag (requiresGeometryTypeSelection) is a cleaner approach that avoids async timing issues in the form wizard's conditional routing. This relies on controllers (datasetController and checkDeepLinkController) correctly setting the flag before routing, which they do.

Also applies to: 124-129

Comment thread src/controllers/datasetController.js
Comment thread src/controllers/datasetController.js Outdated
@pooleycodes
Copy link
Copy Markdown
Contributor

Copilot summary of slack messages for update:

Summary of Changes

  • Previous code Introduced an async function for conditional routing in hmpo-form-wizard.

  • Original implementation caused routing issues because the wizard does not handle async lookups well.

  • Fixed an accidental bug where all datasets were requesting geometry type by default.

Updated tests:

  • Previous integration tests were modified to accept the incorrect default behavior, which masked the issue.
    Acceptance tests remained unchanged, leading to GitHub failures.

Reason for Changes

  • Conditional routing was failing because the async Redis lookup for dataset names was too slow, causing the wizard to default to the geometry page.
  • Workaround implemented: set the dataset variable in the session within the controller for faster routing.
    Ensured tests now reflect the correct expected behavior rather than the accidental default.

Additional Notes

  • Redis caching for dataset names still works for return trips.
  • Conditional routing logic: if dataset-id is tree, it should route to the optional page. Previously, this condition was incorrectly triggering for all datasets.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
src/utils/datasetLoader.js (2)

11-19: Support Redis URL/auth/TLS and add error listener

Current client config ignores TLS/auth and lacks an error handler. Improves portability and observability.

Apply:

-    const urlPrefix = 'redis'
-    redisClient = createClient({
-      url: `${urlPrefix}://${config.redis.host}:${config.redis.port}`
-    })
+    const urlPrefix = `redis${config.redis.secure ? 's' : ''}`
+    const creds = (config.redis.username || config.redis.password)
+      ? `${encodeURIComponent(config.redis.username || '')}:${encodeURIComponent(config.redis.password)}@`
+      : ''
+    const url = config.redis.url || `${urlPrefix}://${creds}${config.redis.host}:${config.redis.port}`
+    redisClient = createClient({ url })
+    redisClient.on('error', (err) => {
+      logger.warn(`datasetLoader/redis error: ${err.message}`)
+    })

Optionally export closeRedisClient() for tests to avoid open handles.


43-47: Add HTTP timeout and clearer errors to stop CI hangs

fetch has no timeout; acceptance runs can stall. Add AbortController timeout (configurable), proper Accept/UA headers, and include status+URL in errors.

Apply:

-  const res = await fetch(
-    `${config.mainWebsiteUrl}/dataset.json?${params.toString()}&field=dataset&field=name&include_typologies=false`,
-    { headers: { 'user-agent': 'Planning Data Provide' } }
-  )
-  if (!res.ok) throw new Error(`Failed to fetch datasets from API: ${res.statusText}`)
+  const timeoutMs = Number(config?.planningDataApi?.timeoutMs ?? 10000)
+  const controller = new AbortController()
+  const timer = setTimeout(() => controller.abort(), timeoutMs)
+  let res
+  try {
+    res = await fetch(
+      `${config.mainWebsiteUrl}/dataset.json?${params.toString()}&field=dataset&field=name&include_typologies=false`,
+      {
+        headers: {
+          'user-agent': 'Planning Data Submit',
+          'accept': 'application/json'
+        },
+        signal: controller.signal
+      }
+    )
+  } catch (err) {
+    if (err?.name === 'AbortError') {
+      throw new Error(`Dataset fetch timed out after ${timeoutMs}ms (${config.mainWebsiteUrl}/dataset.json)`)
+    }
+    throw err
+  } finally {
+    clearTimeout(timer)
+  }
+  if (!res.ok) {
+    throw new Error(`Failed to fetch datasets from API: ${res.status} ${res.statusText} (${res.url})`)
+  }
🧹 Nitpick comments (3)
src/assets/js/statusPage.js (1)

2-7: Constants synchronised but code duplication present across multiple files.

The client-side definition matches the server-side constant exactly (['COMPLETE', 'FAILED']), confirming current synchronisation. However, verification reveals this constant is now defined in three places: src/utils/utils.js, src/models/requestData.js, and src/assets/js/statusPage.js. Whilst the client-side separation is architecturally sound, the duplication pattern—already present server-side—creates maintenance risk if status values change.

Consider whether this warrants consolidating to a single source of truth for all three definitions.

src/utils/datasetLoader.js (2)

39-42: Harden input and de‑duplicate keys before building query

Prevent accidental strings/iterables and duplicate dataset params.

Apply:

-  if (!datasetKeys?.length) return {}
-  const params = new URLSearchParams()
-  datasetKeys.forEach(d => params.append('dataset', d))
+  if (!Array.isArray(datasetKeys) || datasetKeys.length === 0) return {}
+  const params = new URLSearchParams()
+  const keys = Array.from(new Set(datasetKeys.filter(Boolean))).sort()
+  keys.forEach(d => params.append('dataset', d))

4-4: Normalise local import path

Since logger.js is in the same directory, prefer a relative path without .. for clarity.

Apply:

-import logger from '../utils/logger.js'
+import logger from './logger.js'
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 11a39e3 and f86b1c5.

📒 Files selected for processing (2)
  • src/assets/js/statusPage.js (1 hunks)
  • src/utils/datasetLoader.js (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-29T14:58:22.843Z
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#593
File: index.js:20-20
Timestamp: 2024-10-29T14:58:22.843Z
Learning: The function `initDatasetSlugToReadableNameFilter()` in `src/utils/datasetSlugToReadableName.js` already includes appropriate error handling and logging.

Applied to files:

  • src/utils/datasetLoader.js
🧬 Code graph analysis (1)
src/utils/datasetLoader.js (3)
config/index.js (1)
  • config (13-13)
src/utils/logger.js (1)
  • logger (21-28)
src/utils/utils.js (3)
  • datasetKeys (11-24)
  • datasets (162-162)
  • nameMap (25-25)

Comment thread src/utils/datasetLoader.js Outdated
Comment thread src/utils/datasetLoader.js Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (6)
test/acceptance/request_check.test.js (4)

29-43: Add error handling and timeout feedback to polling helper.

The polling loop completes 60 iterations but doesn't throw an error if processing never finishes, making test failures silent and difficult to debug.

Apply this diff to add explicit timeout handling:

 async function waitForStatusPageToBeProcessing (statusPage) {
   // wait up to 60 seconds for processing to finish
   for (let i = 0; i < 60; i++) {
     if (!(await statusPage.page.url().includes('/check/status'))) {
-      break
+      return
     }
 
     await statusPage.page.waitForTimeout(1000)
 
     if (await statusPage.isCheckStatusButtonVisible()) {
       await statusPage.clickCheckStatusButton()
-      break
+      return
     }
   }
+  throw new Error('Status page processing did not complete within 60 seconds')
 }

56-58: Inconsistent page navigation pattern.

Line 56 calls clickContinue() without capturing the return, whilst line 58 obtains the StatusPage via startPage.verifyAndReturnPage(StatusPage). This differs from line 103 in checkErrorDataFile, where statusPage is captured directly from uploadFilePage.clickContinue().

Apply this diff to align with the pattern used at line 103:

-  await uploadFilePage.clickContinue()
-
-  const statusPage = await startPage.verifyAndReturnPage(StatusPage)
+  const statusPage = await uploadFilePage.clickContinue()

74-76: Remove redundant, unawaited assertion.

Line 75 calls resultsPage.expectPageHasTabs(jsEnabled) without awaiting, and line 82 correctly awaits the same assertion. The unawaited call at line 75 is redundant and may execute out of order.

Apply this diff:

   if (jsEnabled) {
     resultsPage = await statusPage.verifyAndReturnPage(ResultsPage)
-    resultsPage.expectPageHasTabs(jsEnabled)
   } else {

246-246: JS-specific polling used in JavaScript-disabled test.

Line 246 calls waitForStatusPageToBeProcessing, which uses isCheckStatusButtonVisible() and clickCheckStatusButton(), within a JavaScript-disabled test context. The datafile tests (lines 64–66) correctly use the explicit non-JS sequence.

Apply this diff to match the non-JS flow pattern:

   await statusPage.waitForPage()
-  await waitForStatusPageToBeProcessing(statusPage)
+  await statusPage.expectCheckStatusButtonToBeVisible()
+  await page.waitForTimeout(5000)
+  await statusPage.clickCheckStatusButton()

Apply the same fix at line 274.

src/controllers/checkDeepLinkController.js (1)

50-56: Validate first, then fetch; improve error logging.

Avoid an expensive getDatasets() before basic query validation; also log errors with structured meta. This reduces network load and failure blast radius.

-    let datasets = null
-    try {
-      datasets = await getDatasets()
-    } catch (error) {
-      logger.error('getDatasets Error', error)
-      return res.redirect('/')
-    }
-
-    const validationResult = v.safeParse(QueryParams, req.query)
-    if (!(validationResult.success && datasets.has(dataset))) {
+    const validationResult = v.safeParse(QueryParams, req.query)
+    if (!validationResult.success) {
+      logger.info('DeepLinkController.get(): invalid params for deep link, redirecting to landing page', { type: types.App, query: req.query })
+      return res.redirect('/')
+    }
+    let datasets
+    try {
+      datasets = await getDatasets()
+    } catch (err) {
+      logger.error('DeepLinkController.get(): failed to load datasets', { type: types.App, errorMessage: err.message, errorStack: err.stack })
+      return res.redirect('/')
+    }
+    if (!datasets.has(dataset)) {
       logger.info('DeepLinkController.get(): invalid params for deep link, redirecting to landing page',
         { type: types.App, query: req.query })
       return res.redirect('/')
     }

Also applies to: 58-63

src/utils/datasetLoader.js (1)

43-48: Add HTTP timeout and proper headers to stop test hangs/timeouts.

fetch has no timeout; slow API causes 5‑minute acceptance test stalls. Add AbortController timeout, Accept header, correct UA, and richer error messages.

-  const res = await fetch(
-    `${config.mainWebsiteUrl}/dataset.json?${params.toString()}&field=dataset&field=name&include_typologies=false`,
-    { headers: { 'user-agent': 'Planning Data Provide' } }
-  )
-  if (!res.ok) throw new Error(`Failed to fetch datasets from API: ${res.statusText}`)
+  const timeoutMs = Number(config?.planningDataApi?.timeoutMs ?? 10000)
+  const controller = new AbortController()
+  const timer = setTimeout(() => controller.abort(), timeoutMs)
+  let res
+  try {
+    res = await fetch(
+      `${config.mainWebsiteUrl}/dataset.json?${params.toString()}&field=dataset&field=name&include_typologies=false`,
+      {
+        headers: {
+          'user-agent': 'Planning Data Submit',
+          'accept': 'application/json'
+        },
+        signal: controller.signal
+      }
+    )
+  } catch (err) {
+    if (err?.name === 'AbortError') {
+      throw new Error(`Timed out fetching datasets after ${timeoutMs}ms (${config.mainWebsiteUrl})`)
+    }
+    throw err
+  } finally {
+    clearTimeout(timer)
+  }
+  if (!res.ok) {
+    throw new Error(`Failed to fetch datasets from API: ${res.status} ${res.statusText} (${res.url})`)
+  }
🧹 Nitpick comments (6)
src/controllers/checkDeepLinkController.js (2)

8-8: Reuse datasetInfo to avoid a second async lookup.

You already have datasetInfo; don’t call back into another controller (and re-fetch). It’s faster and avoids routing race conditions.

-import { requiresGeometryTypeToBeSelectedViaDeepLink } from './datasetController.js'
+// no longer needed when reusing datasetInfo; remove import above
-    // Pre-calculate geometry requirement to avoid async timing issues in form wizard async conditional routing
-    const requiresGeometry = await requiresGeometryTypeToBeSelectedViaDeepLink(req)
-    req.sessionModel.set('requiresGeometryTypeSelection', requiresGeometry)
+    // Pre-calculate geometry requirement using datasetInfo already loaded
+    const requiresGeometry = Boolean(datasetInfo?.requiresGeometryTypeSelection)
+    req.sessionModel.set('requiresGeometryTypeSelection', requiresGeometry)

Also applies to: 74-77


78-78: Return the result of super.post.

Prevents any accidental double-send and matches typical Express flow.

-    super.post(req, res, next)
+    return super.post(req, res, next)
src/utils/datasetLoader.js (2)

6-7: Add in‑memory fallback cache and configurable TTL to reduce API hits.

Improves local/dev and first‑hit latency when Redis is absent; cuts CI flakiness.

 let redisClient
+
+// lightweight in‑memory cache
+const memoryCache = new Map()
+function memGet (key) {
+  const entry = memoryCache.get(key)
+  if (!entry) return null
+  if (entry.expiresAt < Date.now()) { memoryCache.delete(key); return null }
+  return entry.value
+}
+function memSet (key, value, ttlSeconds) {
+  memoryCache.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 })
+}
-const CACHE_TTL = 300 // 5min
+const CACHE_TTL = Number(config?.planningDataApi?.cacheTtlSeconds ?? 300) // default 5min
 export async function getDatasetNameMap (datasetKeys) {
   let nameMap = {}
   if (!Array.isArray(datasetKeys) || !datasetKeys.length) return {}
 
-  const cacheKey = `dataset:${datasetKeys.slice().sort().join(',')}`
+  const normalised = datasetKeys.slice().filter(Boolean).sort()
+  const cacheKey = `dataset:${normalised.join(',')}`
+  // 1) memory cache first
+  const memHit = memGet(cacheKey)
+  if (memHit) return memHit
   const client = await getRedisClient()
 
   if (client) {
     try {
       const cached = await client.get(cacheKey)
       if (cached) {
         nameMap = JSON.parse(cached)
-        return nameMap
+        // hydrate memory cache too
+        memSet(cacheKey, nameMap, CACHE_TTL)
+        return nameMap
       }
     } catch (err) {
       logger.warn(`datasetLoader/redis get error: ${err.message}`)
     }
   }
 
   // fallback → fetch fresh
   nameMap = await fetchDatasetNames(datasetKeys)
 
   if (client) {
     try {
       await client.setEx(cacheKey, CACHE_TTL, JSON.stringify(nameMap))
     } catch (err) {
       logger.warn(`datasetLoader/redis set error: ${err.message}`)
     }
   }
-  return nameMap
+  // always set memory cache
+  memSet(cacheKey, nameMap, CACHE_TTL)
+  return nameMap
 }

Also applies to: 36-36, 62-92


12-15: Support Redis auth/TLS URLs and expose a close hook for tests.

Broader environment support and clean shutdown to avoid open handles.

-    const urlPrefix = 'redis'
-    redisClient = createClient({
-      url: `${urlPrefix}://${config.redis.host}:${config.redis.port}`
-    })
+    const urlPrefix = `redis${config.redis?.secure ? 's' : ''}`
+    const creds = (config.redis?.username || config.redis?.password)
+      ? `${encodeURIComponent(config.redis.username || '')}:${encodeURIComponent(config.redis.password || '')}@`
+      : ''
+    const url = config.redis?.url || `${urlPrefix}://${creds}${config.redis.host}:${config.redis.port}`
+    redisClient = createClient({ url })
+    redisClient.on('error', (err) => {
+      logger.warn(`datasetLoader/redis error: ${err.message}`)
+    })
-  return redisClient
+  return redisClient
 }
+
+// Optional: allow tests/process shutdown to close Redis cleanly
+export async function closeRedisClient () {
+  if (redisClient) {
+    try { await redisClient.quit() } catch {}
+    redisClient = null
+  }
+}

Also applies to: 33-34

src/controllers/datasetController.js (2)

33-41: Return from locals and keep error path consistent.

Return the result of super.locals in both paths; keeps middleware chain explicit and avoids accidental fall‑through.

-  async locals (req, res, next) {
-    try {
-      const dataSubjects = await getDataSubjects()
-      req.form.options.datasetItems = availableDatasets(dataSubjects)
-    } catch (error) {
-      logger.error('Failed to load dataset subjects:', error)
-      req.form.options.datasetItems = []
-    }
-    super.locals(req, res, next)
-  }
+  async locals (req, res, next) {
+    try {
+      const dataSubjects = await getDataSubjects()
+      req.form.options.datasetItems = availableDatasets(dataSubjects)
+      return super.locals(req, res, next)
+    } catch (error) {
+      logger.error('Failed to load dataset subjects:', error)
+      req.form.options.datasetItems = []
+      return super.locals(req, res, next)
+    }
+  }

45-60: Return from post; keep structured logging.

Return super.post for clarity. Consider structured error meta for easier triage.

   async post (req, res, next) {
     try {
       const datasets = await getDatasets()
       const dataset = req.body.dataset
       const dataSet = datasets.get(dataset) || { dataSubject: '' }
       const { dataSubject } = dataSet
       req.body['data-subject'] = dataSubject
       // Set in session to avoid async next routing issues
       const requiresGeometry = dataSet?.requiresGeometryTypeSelection || false
       req.sessionModel.set('requiresGeometryTypeSelection', requiresGeometry)
     } catch (error) {
-      logger.error('DatasetController.post: failed to set data-subject:', error)
+      logger.error('DatasetController.post: failed to set data-subject', { errorMessage: error?.message, errorStack: error?.stack })
       req.body['data-subject'] = ''
       req.sessionModel.set('requiresGeometryTypeSelection', false)
     }
-    super.post(req, res, next)
+    return super.post(req, res, next)
   }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a9058b7 and 0695a1b.

📒 Files selected for processing (4)
  • src/controllers/checkDeepLinkController.js (3 hunks)
  • src/controllers/datasetController.js (2 hunks)
  • src/utils/datasetLoader.js (1 hunks)
  • test/acceptance/request_check.test.js (13 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-29T14:58:22.843Z
Learnt from: GeorgeGoodall-GovUk
PR: digital-land/submit#593
File: index.js:20-20
Timestamp: 2024-10-29T14:58:22.843Z
Learning: The function `initDatasetSlugToReadableNameFilter()` in `src/utils/datasetSlugToReadableName.js` already includes appropriate error handling and logging.

Applied to files:

  • src/utils/datasetLoader.js
🧬 Code graph analysis (4)
src/utils/datasetLoader.js (2)
src/utils/logger.js (1)
  • logger (21-28)
src/utils/utils.js (2)
  • datasetKeys (11-24)
  • nameMap (25-25)
test/acceptance/request_check.test.js (3)
test/PageObjectModels/uploadMethodPage.js (1)
  • UploadMethodPage (10-29)
test/PageObjectModels/statusPage.js (1)
  • StatusPage (4-54)
test/PageObjectModels/resultsPage.js (1)
  • ResultsPage (9-80)
src/controllers/datasetController.js (2)
src/utils/utils.js (7)
  • datasets (162-162)
  • getDatasets (160-164)
  • dataSubjects (132-132)
  • dataSubjects (161-161)
  • getDataSubjects (131-134)
  • availableDatasets (173-179)
  • availableDatasets (176-176)
src/utils/logger.js (1)
  • logger (21-28)
src/controllers/checkDeepLinkController.js (2)
src/controllers/datasetController.js (4)
  • req (27-27)
  • datasets (15-15)
  • datasets (26-26)
  • requiresGeometryTypeToBeSelectedViaDeepLink (25-30)
src/utils/utils.js (2)
  • datasets (162-162)
  • getDatasets (160-164)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: run-tests / test
🔇 Additional comments (3)
test/acceptance/request_check.test.js (3)

11-13: LGTM: Import changes support the refactoring.

The imports correctly bring in StartPage, StatusPage, and ResultsPage for the new navigation pattern.


170-171: LGTM: Consistent StartPage navigation pattern.

The tests correctly use startPage.verifyAndReturnPage(UploadMethodPage) to obtain page objects, aligning with the new navigation architecture.

Also applies to: 198-199, 235-236, 263-264, 288-289


181-182: LGTM: Appropriate use of helpers in JS-enabled tests.

The JS-enabled URL tests correctly use waitForStatusPageToBeProcessing for polling and verifyAndReturnPage(ResultsPage) for navigation.

Also applies to: 186-186, 209-210, 214-214

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
test/acceptance/request_check.test.js (3)

56-58: Confusing navigation flow — remove redundant clickContinue call.

Line 56 calls await uploadFilePage.clickContinue() but ignores the returned page object. Line 58 then obtains the StatusPage via a different mechanism (startPage.verifyAndReturnPage). Based on the uploadFilePage implementation, clickContinue() should return the next page object.

This creates a confusing flow with potentially redundant navigation or verification.

Apply the fix suggested in the previous review:

-  await uploadFilePage.clickContinue()
-
-  const statusPage = await startPage.verifyAndReturnPage(StatusPage)
+  const statusPage = await uploadFilePage.clickContinue()

246-246: JS-disabled tests incorrectly use JS-specific polling helper.

Lines 246 and 274 call waitForStatusPageToBeProcessing(statusPage) within JavaScript-disabled test scenarios. This helper relies on isCheckStatusButtonVisible() and clickCheckStatusButton(), which expect JavaScript functionality.

The datafile tests correctly handle JS-disabled scenarios with an explicit flow (lines 64-66, 111-112): calling expectCheckStatusButtonToBeVisible(), waiting, then clicking manually.

Apply the fix suggested in the previous review — replace the polling calls with the explicit non-JS sequence:

Line 246:

   await statusPage.waitForPage()
-  await waitForStatusPageToBeProcessing(statusPage)
+  await statusPage.expectCheckStatusButtonToBeVisible()
+  await page.waitForTimeout(5000)
+  const resultsPage = await statusPage.clickCheckStatusButton()
-
-  const id = await statusPage.getIdFromUrl()
+  const id = await statusPage.getIdFromUrl()

Line 274:

   await statusPage.waitForPage()
-  await waitForStatusPageToBeProcessing(statusPage)
+  await statusPage.expectCheckStatusButtonToBeVisible()
+  await page.waitForTimeout(5000)
+  const resultsPage = await statusPage.clickCheckStatusButton()
-
-  const id = await statusPage.getIdFromUrl()
+  const id = await statusPage.getIdFromUrl()

Note: You'll also need to remove the later resultsPage = await statusPage.verifyAndReturnPage(ResultsPage) calls (lines 251, 279) since the resultsPage will already be obtained from clickCheckStatusButton().

Also applies to: 274-274


74-75: Remove unawaited and redundant expectPageHasTabs call.

Line 75 calls resultsPage.expectPageHasTabs(jsEnabled) without await and immediately after obtaining the results page. This assertion runs before completing, and it's redundant since line 82 performs the same check correctly (with await) after the conditional block.

Although a past review comment indicated this was addressed, the unawaited call remains.

Apply this diff:

   if (jsEnabled) {
     resultsPage = await statusPage.verifyAndReturnPage(ResultsPage)
-    resultsPage.expectPageHasTabs(jsEnabled)
   } else {
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0695a1b and d74b059.

📒 Files selected for processing (1)
  • test/acceptance/request_check.test.js (14 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
test/acceptance/request_check.test.js (3)
test/PageObjectModels/uploadMethodPage.js (1)
  • UploadMethodPage (10-29)
test/PageObjectModels/statusPage.js (1)
  • StatusPage (4-54)
test/PageObjectModels/resultsPage.js (1)
  • ResultsPage (9-80)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: run-tests / test
🔇 Additional comments (2)
test/acceptance/request_check.test.js (2)

11-13: LGTM! Navigation pattern correctly updated.

The new imports and StartPage-driven navigation align well with the refactored flow described in the PR objectives.

Also applies to: 17-20


311-332: LGTM! Robust error handling for variable server error messages.

The enhanced error checking logic appropriately handles the variability in server error responses by checking for the #bad-upload element and then looping through multiple possible error messages. The try-catch approach with a final assertion provides good test resilience while maintaining clear failure messages.

Comment on lines +29 to +43
async function waitForStatusPageToBeProcessing (statusPage) {
// wait up to 60 seconds for processing to finish
for (let i = 0; i < 60; i++) {
if (!(await statusPage.page.url().includes('/check/status'))) {
break
}

await statusPage.page.waitForTimeout(1000)

if (await statusPage.isCheckStatusButtonVisible()) {
await statusPage.clickCheckStatusButton()
break
}
} catch (err) {
hasMap = false
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix incorrect await usage and improve polling robustness.

Line 32 incorrectly uses await on statusPage.page.url(), which is a synchronous method in Playwright. This should be called without await.

Additionally, the past review comment about improving this polling mechanism remains relevant — the helper lacks timeout error handling and uses hard-coded intervals.

Apply this diff to fix the await issue:

-    if (!(await statusPage.page.url().includes('/check/status'))) {
+    if (!statusPage.page.url().includes('/check/status')) {

Consider also implementing the more robust polling mechanism suggested in the previous review to add proper timeout handling and error reporting.

@pooleycodes pooleycodes removed their request for review October 29, 2025 12:03
@pooleycodes pooleycodes merged commit f8a0110 into main Oct 29, 2025
9 checks passed
@pooleycodes pooleycodes deleted the update-dataset-name-reference branch October 29, 2025 12:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update provide so that dataset names are used

4 participants