[DAPS-1830] - core foxx refactored json schema integration 1 #1892

Merged
JoshuaSBrown merged 15 commits into devel from 1830-DAPS-core-foxx-schema-integration_1
Mar 24, 2026

JoshuaSBrown (Collaborator) commented Mar 17, 2026

Ticket

Description

How Has This Been Tested?

Artifacts (if appropriate):

Tasks

  • A description of the PR has been provided, and a diagram included if it is a new feature.
  • Formatter has been run
  • CHANGELOG comment has been added
  • Labels have been assigned to the PR
  • A reviewer has been added
  • A user has been assigned to work on the PR
  • If this is a new feature, a unit test has been added

Summary by Sourcery

Introduce a pluggable schema validation and storage architecture, refactor schema handling out of ClientWorker into dedicated components, and add comprehensive unit and integration tests for the new schema services.

Bug Fixes:

  • Fix mock ClientWorker version response to use the correct version namespace in tests.

Enhancements:

  • Refactor schema creation, revision, update, and metadata validation logic from ClientWorker into a dedicated SchemaHandler class using a DatabaseAPI dependency.
  • Add a generic ISchemaValidator interface with concrete implementations for local JSON Schema validation, external API–backed validation, and a no-op validator for legacy/native schemas.
  • Introduce a SchemaServiceFactory to route schema storage and validation requests to engine-specific implementations with configurable defaults.
  • Add an ISchemaStorage abstraction with Arango-backed and external-API-backed storage implementations, plus a SchemaAPIClient and configuration struct for REST-based schema operations.
  • Extend core/server build configuration to compile new schema validator, storage, and client handler components.

Build:

  • Update core/server and test CMake configurations to include new schema validator, storage, client handler, and integration test sources, and to generate mock core version headers as part of the test build.

Tests:

  • Add extensive unit tests for JsonSchemaValidator, SchemaHandler, and SchemaServiceFactory covering validation behavior, caching, and error handling.
  • Add an integration test suite for SchemaServiceFactory exercising realistic DataFed-style configurations and multi-engine behavior.
  • Update tests and mock core setup, including version handling and CMake changes, to support the new schema services in the test environment.

sourcery-ai bot (Contributor) commented Mar 17, 2026

Reviewer's Guide

Refactors schema validation and storage in the core server by extracting responsibilities into dedicated SchemaHandler, validator, storage, and factory components, updates ClientWorker to use SchemaHandler, and adds comprehensive unit and integration tests plus build wiring.

Sequence diagram for schema create request handling via SchemaHandler

sequenceDiagram
    actor Client
    participant ClientWorker
    participant SchemaHandler
    participant DatabaseAPI
    participant JsonLib as NlohmannJSON
    participant JsonValidator as JsonSchemaValidatorLib

    Client->>ClientWorker: send SchemaCreateRequest
    ClientWorker->>ClientWorker: PROC_MSG_BEGIN macro expands
    ClientWorker->>SchemaHandler: handleCreate(a_uid, request, reply, log_context)

    SchemaHandler->>DatabaseAPI: setClient(a_uid)
    SchemaHandler->>SchemaHandler: validateSchemaDefinition(request.def, log_context)
    activate SchemaHandler
    SchemaHandler->>JsonLib: parse(def) -> schema
    SchemaHandler->>SchemaHandler: enforceRequiredProperties(schema)
    SchemaHandler->>JsonValidator: create json_validator with schemaLoader callback
    SchemaHandler->>JsonValidator: set_root_schema(schema)
    deactivate SchemaHandler

    alt validation ok
        SchemaHandler->>DatabaseAPI: schemaCreate(request, log_context)
        DatabaseAPI-->>SchemaHandler: ack
        SchemaHandler-->>ClientWorker: return
        ClientWorker->>ClientWorker: PROC_MSG_END macro expands
        ClientWorker-->>Client: AckReply
    else validation error
        SchemaHandler-->>ClientWorker: throw TraceException("Invalid metadata schema")
        ClientWorker-->>Client: error propagated
    end
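
The create flow in the diagram above can be sketched in a few lines of C++. This is a minimal illustration under stated assumptions, not the code in this PR: `FakeDatabase` and `SchemaHandlerSketch` are hypothetical stand-ins, the validator is injected as a callback, the log context is omitted, and `std::runtime_error` stands in for `TraceException`.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for DatabaseAPI: only the two calls shown in the diagram.
struct FakeDatabase {
  std::string client;
  int created = 0;
  void setClient(const std::string &a_uid) { client = a_uid; }
  void schemaCreate(const std::string & /*a_def*/) { ++created; }
};

// Hypothetical stand-in for SchemaHandler: validate first, write only on success.
class SchemaHandlerSketch {
public:
  // The validator is expected to throw when the definition is invalid.
  using Validator = std::function<void(const std::string &)>;

  SchemaHandlerSketch(FakeDatabase &a_db, Validator a_validate)
      : m_db(a_db), m_validate(std::move(a_validate)) {}

  void handleCreate(const std::string &a_uid, const std::string &a_def) {
    m_db.setClient(a_uid);
    try {
      m_validate(a_def);
    } catch (const std::exception &e) {
      // Assumption: the real handler raises TraceException here.
      throw std::runtime_error(std::string("Invalid metadata schema: ") +
                               e.what());
    }
    m_db.schemaCreate(a_def); // reached only when validation passed
  }

private:
  FakeDatabase &m_db;
  Validator m_validate;
};

// Usage: a rejected definition never reaches the database.
inline void handleCreateExample() {
  FakeDatabase db;
  SchemaHandlerSketch handler(db, [](const std::string &def) {
    if (def.empty())
      throw std::invalid_argument("empty definition");
  });
  handler.handleCreate("u/alice", "{}");
  assert(db.created == 1);
}
```

The point of the ordering is that `schemaCreate` is unreachable on the error branch, which matches the `alt` block in the diagram.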

Class diagram for new schema validation and storage components

classDiagram
    direction LR

    class ClientWorker {
      -DatabaseAPI m_db_client
      -SchemaHandler* m_schema_handler
      +procSchemaCreateRequest(a_uid, msg_request, log_context) : unique_ptr~IMessage~
      +procSchemaReviseRequest(a_uid, msg_request, log_context) : unique_ptr~IMessage~
      +procSchemaUpdateRequest(a_uid, msg_request, log_context) : unique_ptr~IMessage~
      +procMetadataValidateRequest(a_uid, msg_request, log_context) : unique_ptr~IMessage~
    }

    class SchemaHandler {
      -DatabaseAPI& m_db_client
      +SchemaHandler(a_db_client)
      +handleCreate(a_uid, a_request, a_reply, log_context) : void
      +handleRevise(a_uid, a_request, a_reply, log_context) : void
      +handleUpdate(a_uid, a_request, a_reply, log_context) : void
      +handleMetadataValidate(a_uid, a_request, a_reply, log_context) : void
      +schemaLoader(a_uri, a_value, log_context) : void
      +enforceRequiredProperties(a_schema) static : void
      -validateSchemaDefinition(a_def, log_context) : void
    }

    class ISchemaValidator {
      <<interface>>
      +validateDefinition(a_schema_format, a_content, log_context) : ValidationResult
      +validateMetadata(a_schema_id, a_metadata_format, a_metadata_content, log_context) : ValidationResult
      +hasValidationCapability() : bool
      +cacheSchema(a_schema_id, a_content, a_format, log_context) : bool
      +evictSchema(a_schema_id) : void
    }

    class JsonSchemaValidator {
      -SchemaLoaderCallback m_loader
      -mutex m_loader_mutex
      -LogContext m_current_log_context
      -shared_mutex m_cache_mutex
      -unordered_map~string, shared_ptr~json_validator~~ m_schema_cache
      +JsonSchemaValidator(a_loader)
      +setSchemaLoader(a_loader) : void
      +validateDefinition(a_schema_format, a_content, log_context) : ValidationResult
      +validateMetadata(a_schema_id, a_metadata_format, a_metadata_content, log_context) : ValidationResult
      +hasValidationCapability() : bool
      +cacheSchema(a_schema_id, a_content, a_format, log_context) : bool
      +evictSchema(a_schema_id) : void
      +clearCache() : void
      +isCached(a_schema_id) : bool
      -enforceDataFedRequirements(a_schema) : void
      -schemaLoaderAdapter(a_uri, a_value) : void
      -compileSchema(a_schema, log_context) : shared_ptr~json_validator~
    }

    class ExternalSchemaValidator {
      -unique_ptr~SchemaAPIClient~ m_client
      -string m_engine
      +ExternalSchemaValidator(a_client, a_engine)
      +validateDefinition(a_schema_format, a_content, log_context) : ValidationResult
      +validateMetadata(a_schema_id, a_metadata_format, a_metadata_content, log_context) : ValidationResult
      +hasValidationCapability() : bool
    }

    class NullSchemaValidator {
      +NullSchemaValidator()
      +validateDefinition(a_schema_format, a_content, log_context) : ValidationResult
      +validateMetadata(a_schema_id, a_metadata_format, a_metadata_content, log_context) : ValidationResult
      +hasValidationCapability() : bool
    }

    class ISchemaStorage {
      <<interface>>
      +storeContent(a_id, a_content, a_desc, log_context) : string
      +retrieveContent(a_id, a_arango_def, log_context) : StorageRetrieveResult
      +updateContent(a_id, a_content, a_desc, log_context) : string
      +deleteContent(a_id, log_context) : void
    }

    class ArangoSchemaStorage {
      +ArangoSchemaStorage()
      +storeContent(a_id, a_content, a_desc, log_context) : string
      +retrieveContent(a_id, a_arango_def, log_context) : StorageRetrieveResult
      +updateContent(a_id, a_content, a_desc, log_context) : string
      +deleteContent(a_id, log_context) : void
    }

    class ExternalSchemaStorage {
      -unique_ptr~SchemaAPIClient~ m_client
      +ExternalSchemaStorage(a_client)
      +storeContent(a_id, a_content, a_desc, log_context) : string
      +retrieveContent(a_id, a_arango_def, log_context) : StorageRetrieveResult
      +updateContent(a_id, a_content, a_desc, log_context) : string
      +deleteContent(a_id, log_context) : void
    }

    class SchemaAPIClient {
      -SchemaAPIConfig m_config
      -CURL* m_curl
      +SchemaAPIClient(a_config)
      +~SchemaAPIClient()
      +isConfigured() : bool
      +putSchema(a_id, a_name, a_description, a_content, log_context) : void
      +patchSchema(a_id, a_name, a_description, a_content, log_context) : void
      +getSchema(a_id, log_context) : json
      +deleteSchema(a_id, log_context) : void
      +validateSchema(a_schema_format, a_engine, a_content, a_errors, log_context) : bool
      +validateMetadata(a_schema_id, a_metadata_format, a_engine, a_metadata_content, a_errors, a_warnings, log_context) : bool
      -httpGet(a_path, log_context) : json
      -httpPost(a_path, a_body, a_http_code, log_context) : json
      -httpPut(a_path, a_body, log_context) : json
      -httpPatch(a_path, a_body, log_context) : json
      -httpDelete(a_path, log_context) : void
      -curlPerform(a_method, a_url, a_body, a_http_code, log_context) : json
    }

    class SchemaAPIConfig {
      +string base_url
      +string bearer_token
      +bool verify_ssl
      +string ca_cert_path
      +string client_cert_path
      +string client_key_path
      +long connect_timeout_sec
      +long request_timeout_sec
      +isConfigured() : bool
      +hasAuth() : bool
    }

    class SchemaServiceFactory {
      -shared_ptr~ISchemaStorage~ m_default_storage
      -shared_ptr~ISchemaValidator~ m_default_validator
      -unordered_map~string, shared_ptr~ISchemaStorage~~ m_storage
      -unordered_map~string, shared_ptr~ISchemaValidator~~ m_validators
      +setDefaultStorage(a_storage) : void
      +registerStorage(a_engine, a_storage) : void
      +getStorage(a_engine) : ISchemaStorage
      +setDefaultValidator(a_validator) : void
      +registerValidator(a_engine, a_validator) : void
      +getValidator(a_engine) : ISchemaValidator
      +hasCustomStorage(a_engine) : bool
      +hasCustomValidator(a_engine) : bool
      -resolveStorage(a_engine) : ISchemaStorage
      -resolveValidator(a_engine) : ISchemaValidator
    }

    class ValidationResult {
      +bool valid
      +string errors
      +string warnings
      +Ok(a_warnings) static : ValidationResult
      +Fail(a_errors) static : ValidationResult
    }

    class StorageRetrieveResult {
      +bool success
      +string content
      +string error
      +Ok(a_content) static : StorageRetrieveResult
      +Fail(a_error) static : StorageRetrieveResult
    }

    %% Relationships
    ClientWorker --> SchemaHandler : uses
    SchemaHandler --> DatabaseAPI : uses

    ISchemaValidator <|.. JsonSchemaValidator
    ISchemaValidator <|.. ExternalSchemaValidator
    ISchemaValidator <|.. NullSchemaValidator

    ISchemaStorage <|.. ArangoSchemaStorage
    ISchemaStorage <|.. ExternalSchemaStorage

    ExternalSchemaStorage --> SchemaAPIClient : uses
    ExternalSchemaValidator --> SchemaAPIClient : uses

    SchemaAPIClient --> SchemaAPIConfig : has

    SchemaServiceFactory --> ISchemaStorage : returns
    SchemaServiceFactory --> ISchemaValidator : returns
    SchemaServiceFactory o--> ISchemaStorage : holds
    SchemaServiceFactory o--> ISchemaValidator : holds
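
To make the validator side of the class diagram concrete, here is a hedged C++ sketch of `ValidationResult`, `ISchemaValidator`, and `NullSchemaValidator`. Member and method names follow the diagram; everything else is an assumption: parameter types are guessed as strings, the `log_context` parameter is dropped, `cacheSchema`/`evictSchema` are given no-op defaults, and the null validator is assumed to report no validation capability.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct ValidationResult {
  bool valid = false;
  std::string errors;
  std::string warnings;

  static ValidationResult Ok(std::string a_warnings = "") {
    ValidationResult r;
    r.valid = true;
    r.warnings = std::move(a_warnings);
    return r;
  }
  static ValidationResult Fail(std::string a_errors) {
    ValidationResult r;
    r.valid = false;
    r.errors = std::move(a_errors);
    return r;
  }
};

class ISchemaValidator {
public:
  virtual ~ISchemaValidator() = default;
  virtual ValidationResult
  validateDefinition(const std::string &a_schema_format,
                     const std::string &a_content) = 0;
  virtual ValidationResult
  validateMetadata(const std::string &a_schema_id,
                   const std::string &a_metadata_format,
                   const std::string &a_metadata_content) = 0;
  virtual bool hasValidationCapability() const = 0;
  // Assumed defaults: validators without a cache simply ignore these calls.
  virtual bool cacheSchema(const std::string &, const std::string &,
                           const std::string &) {
    return false;
  }
  virtual void evictSchema(const std::string &) {}
};

// No-op validator for legacy/native schemas: accepts everything.
class NullSchemaValidator : public ISchemaValidator {
public:
  ValidationResult validateDefinition(const std::string &,
                                      const std::string &) override {
    return ValidationResult::Ok();
  }
  ValidationResult validateMetadata(const std::string &, const std::string &,
                                    const std::string &) override {
    return ValidationResult::Ok();
  }
  bool hasValidationCapability() const override { return false; }
};

// Usage: callers can check capability before trusting an "ok" result.
inline void nullValidatorExample() {
  NullSchemaValidator validator;
  assert(validator.validateDefinition("json-schema", "{").valid);
  assert(!validator.hasValidationCapability());
}
```

Exposing `hasValidationCapability()` lets a caller tell "validated and passed" apart from "not validated at all", which a bare `valid` flag cannot express.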

File-Level Changes

Change Details Files
Move schema-specific logic out of ClientWorker into a dedicated SchemaHandler and adjust ClientWorker usage.
  • Instantiate a SchemaHandler in ClientWorker and store it as a unique_ptr member.
  • Replace inline implementations of schema create/revise/update/metadata-validate in ClientWorker with calls to corresponding SchemaHandler methods.
  • Remove ClientWorker-specific schema helper methods such as schemaEnforceRequiredProperties and schemaLoader, updating record validation code to use SchemaHandler::schemaLoader instead.
core/server/ClientWorker.cpp
core/server/ClientWorker.hpp
Introduce a SchemaHandler component that encapsulates schema validation and database interactions.
  • Create SchemaHandler class to handle schema create, revise, update, and metadata validation using DatabaseAPI.
  • Implement static enforceRequiredProperties and a schemaLoader helper to centralize DataFed schema requirements and $ref loading.
  • Ensure SchemaHandler uses stack-local error handling for json-schema validation to avoid shared state across threads.
core/server/client_handlers/SchemaHandler.hpp
core/server/client_handlers/SchemaHandler.cpp
Add a pluggable schema validation and storage abstraction layer, including local, external, and null implementations.
  • Introduce ISchemaValidator and ValidationResult to abstract validation behavior from concrete engines.
  • Implement JsonSchemaValidator with local nlohmann-based validation, caching, and $ref resolution via a configurable loader.
  • Implement ExternalSchemaValidator that delegates definition and metadata validation to an external SchemaAPI via REST.
  • Introduce ISchemaStorage plus ArangoSchemaStorage and ExternalSchemaStorage to abstract schema content storage location.
  • Add SchemaAPIConfig and SchemaAPIClient to configure and call the external Schema Management API, including TLS and timeouts.
  • Introduce SchemaServiceFactory to route storage and validator instances based on engine name, with support for defaults and custom registrations.
core/server/ISchemaValidator.hpp
core/server/schema_validators/JsonSchemaValidator.hpp
core/server/schema_validators/JsonSchemaValidator.cpp
core/server/schema_validators/ExternalSchemaValidator.hpp
core/server/schema_validators/ExternalSchemaValidator.cpp
core/server/schema_validators/NullSchemaValidator.hpp
core/server/ISchemaStorage.hpp
core/server/schema_storage/ArangoSchemaStorage.hpp
core/server/schema_storage/ExternalSchemaStorage.hpp
core/server/schema_storage/ExternalSchemaStorage.cpp
core/server/SchemaAPIConfig.hpp
core/server/SchemaAPIClient.hpp
core/server/SchemaAPIClient.cpp
core/server/SchemaServiceFactory.hpp
core/server/SchemaServiceFactory.cpp
Add extensive unit and integration tests for the new schema components and wire them into the build.
  • Create unit tests for JsonSchemaValidator covering schema definition validation, caching behavior, metadata validation, $ref resolution, complex schema scenarios, and edge cases.
  • Create unit tests for SchemaHandler focusing on enforceRequiredProperties and end-to-end create/revise/update flows against a test DatabaseAPI when available.
  • Create unit and integration tests for SchemaServiceFactory to validate registration, default resolution, engine routing, and multi-validator behavior.
  • Update CMakeLists to include new source directories and test binaries, and configure a generated Version.hpp for mock_core tests.
  • Fix mock_core ClientWorker version response to use the correct version namespace in tests.
core/server/tests/unit/test_JsonSchemaValidator.cpp
core/server/tests/unit/test_SchemaHandler.cpp
core/server/tests/unit/test_SchemaServiceFactory.cpp
core/server/tests/integration/test_SchemaServiceFactory.cpp
core/server/tests/unit/CMakeLists.txt
core/server/tests/integration/CMakeLists.txt
core/server/CMakeLists.txt
tests/mock_core/ClientWorker.cpp
tests/CMakeLists.txt
tests/mock_core/Version.hpp.in
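
The engine routing described for `SchemaServiceFactory` above (per-engine registrations with a configurable default) might look roughly like the following. This is a sketch under assumptions, not the PR's code: `ValidatorSketch` is a hypothetical one-method interface, only the validator half of the factory is shown, and throwing when neither a registration nor a default exists is a guess at the error behavior.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical minimal validator interface for illustration only.
struct ValidatorSketch {
  virtual ~ValidatorSketch() = default;
  virtual bool hasValidationCapability() const = 0;
};
struct LocalValidatorSketch : ValidatorSketch {
  bool hasValidationCapability() const override { return true; }
};
struct NullValidatorSketch : ValidatorSketch {
  bool hasValidationCapability() const override { return false; }
};

class FactorySketch {
public:
  void setDefaultValidator(std::shared_ptr<ValidatorSketch> a_validator) {
    m_default_validator = std::move(a_validator);
  }
  void registerValidator(const std::string &a_engine,
                         std::shared_ptr<ValidatorSketch> a_validator) {
    m_validators[a_engine] = std::move(a_validator);
  }
  bool hasCustomValidator(const std::string &a_engine) const {
    auto it = m_validators.find(a_engine);
    return it != m_validators.end() && it->second != nullptr;
  }
  // Engine-specific registration wins; otherwise fall back to the default.
  ValidatorSketch &getValidator(const std::string &a_engine) const {
    auto it = m_validators.find(a_engine);
    if (it != m_validators.end() && it->second)
      return *it->second;
    if (m_default_validator)
      return *m_default_validator;
    // Assumption: the real factory may report this case differently.
    throw std::runtime_error("No validator available for engine: " + a_engine);
  }

private:
  std::shared_ptr<ValidatorSketch> m_default_validator;
  std::unordered_map<std::string, std::shared_ptr<ValidatorSketch>>
      m_validators;
};

// Usage: "JSONSchema" is routed to its own validator, anything else to the default.
inline void factoryRoutingExample() {
  FactorySketch factory;
  factory.setDefaultValidator(std::make_shared<NullValidatorSketch>());
  factory.registerValidator("JSONSchema",
                            std::make_shared<LocalValidatorSketch>());
  assert(factory.getValidator("JSONSchema").hasValidationCapability());
  assert(!factory.getValidator("native").hasValidationCapability());
}
```

Storage routing would follow the same lookup-then-default pattern with `ISchemaStorage` in place of the validator type.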

@sourcery-ai sourcery-ai bot left a comment

Hey - I've found 5 issues and left some high-level feedback:

  • SchemaAPIClient currently ignores several fields in SchemaAPIConfig (notably bearer_token, client_cert_path, and client_key_path), so if these are expected to control authentication/TLS you should wire them into curlPerform (e.g., add an Authorization header and client certificate options) or explicitly document that they are unused.
  • The DataFed-specific schema checks are implemented twice (JsonSchemaValidator::enforceDataFedRequirements and SchemaHandler::enforceRequiredProperties); consider centralizing this logic in a single helper to avoid future divergence in behavior between paths.
  • ExternalSchemaStorage and ExternalSchemaValidator are documented as not thread-safe but there is no guard at the factory or call sites; consider enforcing one-instance-per-thread more explicitly (e.g., via construction patterns or comments where they are instantiated) to prevent accidental sharing across threads.
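
One possible way to enforce the one-instance-per-thread idea from the last point is a `thread_local` accessor. The sketch below is illustrative only: `ExternalClientSketch` and `perThreadInstance` are hypothetical names, and the PR may prefer a different construction pattern.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for a client wrapper that is not thread-safe.
struct ExternalClientSketch {
  int calls = 0;
  void request() { ++calls; }
};

// Returns the calling thread's private instance of T, creating it on first use.
// Note: there is one slot per type T per thread, so the maker passed on later
// calls from the same thread is ignored.
template <typename T>
T &perThreadInstance(const std::function<std::unique_ptr<T>()> &a_make) {
  thread_local std::unique_ptr<T> instance;
  if (!instance)
    instance = a_make();
  return *instance;
}

// Usage: repeated calls on one thread reuse the same object.
inline void perThreadExample() {
  std::function<std::unique_ptr<ExternalClientSketch>()> make = [] {
    return std::unique_ptr<ExternalClientSketch>(new ExternalClientSketch());
  };
  ExternalClientSketch &first = perThreadInstance<ExternalClientSketch>(make);
  ExternalClientSketch &second = perThreadInstance<ExternalClientSketch>(make);
  assert(&first == &second);
}
```

Because each thread gets its own object, no lock is needed around the client; the trade-off is one connection handle per worker thread.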

## Individual Comments

### Comment 1
<location path="core/server/schema_validators/JsonSchemaValidator.cpp" line_range="52-61" />
<code_context>
+  m_loader = std::move(a_loader);
+}
+
+void JsonSchemaValidator::schemaLoaderAdapter(const nlohmann::json_uri &a_uri,
+                                              nlohmann::json &a_value) {
+  // Extract schema ID from URI path (skip leading "/")
+  // This matches the existing ClientWorker::schemaLoader behavior
+  std::string id = a_uri.path();
+  if (!id.empty() && id[0] == '/') {
+    id = id.substr(1);
+  }
+
+  // m_loader_mutex must be held by caller (compileSchema)
+  DL_DEBUG(m_current_log_context,
+           "JsonSchemaValidator loading schema ref: " << id
+               << " (scheme=" << a_uri.scheme() << ", path=" << a_uri.path()
+               << ")");
+
+  if (!m_loader) {
+    throw std::runtime_error("Schema $ref resolution failed: no loader "
+                             "configured for schema ID: " + id);
+  }
+
+  a_value = m_loader(id, m_current_log_context);
+
+  DL_TRACE(m_current_log_context, "Loaded referenced schema: " << a_value);
</code_context>
<issue_to_address>
**issue (bug_risk):** Schema loader adapter accesses shared state without locking, breaking the stated thread-safety guarantees

`schemaLoaderAdapter` assumes `m_loader_mutex` is held, but `compileSchema` only holds it during `set_root_schema`. When the returned validator later resolves `$ref`s (potentially from multiple threads), `schemaLoaderAdapter` is called without the mutex, so `m_loader` and `m_current_log_context` are accessed (and `m_loader` potentially modified via `setSchemaLoader`) without synchronization, leading to data races and UB.

To keep the validator thread-safe, either:
- Take `m_loader_mutex` inside `schemaLoaderAdapter` around all uses of `m_loader` and `m_current_log_context`, or
- Preferably, capture the loader and log context into the validator at construction time (e.g., via a lambda with copied state) so each compiled validator has an immutable, thread-safe loader callback.
</issue_to_address>
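
A rough sketch of the second option (capturing the loader and log context by value) is shown below. It is not the code in this PR: `LoaderRegistrySketch` and `bindLoader` are hypothetical names, and the log context is reduced to a plain string so the example needs only the standard library.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical simplified signatures; the real callback takes a LogContext.
using SchemaLoader = std::function<std::string(const std::string &a_id,
                                               const std::string &a_log_ctx)>;
using BoundLoader = std::function<std::string(const std::string &a_id)>;

class LoaderRegistrySketch {
public:
  void setSchemaLoader(SchemaLoader a_loader) {
    std::lock_guard<std::mutex> lock(m_loader_mutex);
    m_loader = std::move(a_loader);
  }

  // Copies the current loader under the lock and returns a callback that owns
  // its own copies, so later $ref resolution never reads the members.
  BoundLoader bindLoader(const std::string &a_log_ctx) const {
    SchemaLoader snapshot;
    {
      std::lock_guard<std::mutex> lock(m_loader_mutex);
      snapshot = m_loader;
    }
    if (!snapshot)
      throw std::runtime_error("Schema $ref resolution failed: no loader");
    return [snapshot, a_log_ctx](const std::string &a_id) {
      return snapshot(a_id, a_log_ctx);
    };
  }

private:
  mutable std::mutex m_loader_mutex;
  SchemaLoader m_loader;
};

// Usage: replacing the loader later does not affect an already-bound callback.
inline void loaderSnapshotExample() {
  LoaderRegistrySketch registry;
  registry.setSchemaLoader(
      [](const std::string &id, const std::string &ctx) {
        return "v1:" + id + "@" + ctx;
      });
  BoundLoader bound = registry.bindLoader("req-1");
  registry.setSchemaLoader(
      [](const std::string &id, const std::string &ctx) {
        return "v2:" + id + "@" + ctx;
      });
  assert(bound("schema_a") == "v1:schema_a@req-1");
}
```

The bound callback is only as thread-safe as the loader it wraps, so the loader itself would still need to tolerate concurrent calls.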

### Comment 2
<location path="core/server/SchemaAPIClient.cpp" line_range="64-73" />
<code_context>
+    EXCEPT(INTERNAL_ERROR,
+           "SchemaAPIClient: not configured (no base URL provided)");
+
+  std::string res_buf;
+  char error[CURL_ERROR_SIZE] = {};
+
+  curl_easy_setopt(m_curl, CURLOPT_URL, a_url.c_str());
+  curl_easy_setopt(m_curl, CURLOPT_WRITEDATA, &res_buf);
+  curl_easy_setopt(m_curl, CURLOPT_ERRORBUFFER, error);
+  curl_easy_setopt(m_curl, CURLOPT_CUSTOMREQUEST, a_method.c_str());
+
+  struct curl_slist *headers = nullptr;
+  headers = curl_slist_append(headers, "Content-Type: application/json");
+  headers = curl_slist_append(headers, "Accept: application/json");
+
+  std::string corr_header = "x-correlation-id: " + log_context.correlation_id;
+  headers = curl_slist_append(headers, corr_header.c_str());
+
+  curl_easy_setopt(m_curl, CURLOPT_HTTPHEADER, headers);
+
+  if (a_body && !a_body->empty()) {
</code_context>
<issue_to_address>
**issue (bug_risk):** Authorization/bearer token from SchemaAPIConfig is not used in HTTP requests

`SchemaAPIConfig` exposes a `bearer_token`/`hasAuth()`, but `curlPerform` never sets an `Authorization` header. If the schema API requires auth, requests will fail even when the token is configured. Please wire this through, e.g.

```cpp
if (m_config.hasAuth()) {
  std::string auth = "Authorization: Bearer " + m_config.bearer_token;
  headers = curl_slist_append(headers, auth.c_str());
}
```

If `SchemaAPIConfig` also carries mTLS settings (client cert/key), consider applying those curl options here as well.
</issue_to_address>

### Comment 3
<location path="core/server/client_handlers/SchemaHandler.cpp" line_range="43-52" />
<code_context>
+
+// ── Private ─────────────────────────────────────────────────────────────────
+
+void SchemaHandler::validateSchemaDefinition(const std::string &a_def,
+                                             LogContext log_context) {
+  nlohmann::json schema = nlohmann::json::parse(a_def);
+
+  enforceRequiredProperties(schema);
+
+  nlohmann::json_schema::json_validator validator(
+      bind(&SchemaHandler::schemaLoader, this, placeholders::_1,
+           placeholders::_2, log_context));
+
+  validator.set_root_schema(schema);
+}
+
</code_context>
<issue_to_address>
**suggestion:** Schema validation logic is duplicated between SchemaHandler and JsonSchemaValidator

This sequence (parse JSON → enforce DataFed requirements → construct `json_validator` with `schemaLoader` → `set_root_schema`) essentially duplicates `JsonSchemaValidator::validateDefinition` / `compileSchema`. Duplicating this logic risks the two flows diverging over time (e.g., new rules/formats added in one place only). Consider having `SchemaHandler::validateSchemaDefinition` delegate to `JsonSchemaValidator` or a shared helper so validation behavior and error mapping are defined in a single location.
</issue_to_address>

### Comment 4
<location path="core/server/client_handlers/SchemaHandler.cpp" line_range="182-191" />
<code_context>
+
+  // Stack-local error handler — no shared mutable state.
+  // Replaces the old m_validator_err member on ClientWorker.
+  struct LocalErrorHandler
+      : nlohmann::json_schema::basic_error_handler {
+    std::string errors;
+    void error(const nlohmann::json::json_pointer &ptr,
+               const nlohmann::json &instance,
+               const std::string &message) override {
+      if (!errors.empty())
+        errors += "\n";
+      errors += "At " + ptr.to_string() + ": " + message;
+      basic_error_handler::error(ptr, instance, message);
+    }
+  } handler;
+
+  try {
</code_context>
<issue_to_address>
**suggestion:** LocalErrorHandler implementation differs from the reusable one in JsonSchemaValidator

There are now two similar but slightly different `LocalErrorHandler` implementations (here and in `JsonSchemaValidator.cpp`), which can cause inconsistent error messages and duplicated logic. Please extract a single shared implementation (e.g., a small header) and use it in both places, or route metadata validation through the existing `JsonSchemaValidator` abstraction instead of duplicating the pattern.

Suggested implementation:

```cpp
  nlohmann::json_schema::json_validator validator(
      bind(&SchemaHandler::schemaLoader, this, placeholders::_1,
           placeholders::_2, log_context));

  // Stack-local error handler — no shared mutable state.
  // Reuses the shared implementation from JsonSchemaValidator to keep
  // error formatting consistent across the server.
  JsonSchema::LocalErrorHandler handler;

  validateSchemaDefinition(a_request.def(), log_context);

```

To fully implement the “single shared LocalErrorHandler” suggestion, you’ll also need to:

1. **Extract the existing LocalErrorHandler from JsonSchemaValidator.cpp into a header**:
   - Create a small header, for example `core/server/validation/JsonSchemaErrorHandler.h`, that contains the `LocalErrorHandler` definition used by `JsonSchemaValidator.cpp`.
   - Place it in an appropriate namespace (e.g. `namespace JsonSchema { ... }`) so it can be reused here and elsewhere.

2. **Update JsonSchemaValidator.cpp**:
   - Remove its local/anonymous-namespace `LocalErrorHandler` definition.
   - Include the new header (`#include "JsonSchemaErrorHandler.h"`).
   - Use the shared `JsonSchema::LocalErrorHandler` type.

3. **Include the new header in SchemaHandler.cpp and adjust namespace if needed**:
   - Add `#include "JsonSchemaErrorHandler.h"` (or whatever name/path you choose) near the other includes.
   - If the chosen namespace is different from `JsonSchema::LocalErrorHandler`, update the replacement above accordingly (e.g. `using JsonSchema::LocalErrorHandler;` or fully qualify the type used here).

4. **Wire the handler into the validator (if not already done elsewhere in the function)**:
   - Ensure that after constructing `validator` and `handler`, you call `validator.set_error_handler(handler);` before validation, and that validation failures are surfaced consistently (e.g. via `handler.errors` or by rethrowing as in `JsonSchemaValidator`).
</issue_to_address>

### Comment 5
<location path="core/server/tests/integration/test_SchemaServiceFactory.cpp" line_range="60-69" />
<code_context>
+BOOST_AUTO_TEST_SUITE(FactoryWithJsonSchemaValidator)
</code_context>
<issue_to_address>
**suggestion (testing):** Integration tests could also cover interactions with external validators/storage when those are wired in

Given the new `ExternalSchemaValidator` and `ExternalSchemaStorage`, please extend this suite (or add a sibling one) configuring the factory with those components backed by a mock `SchemaAPIClient`. That would let you assert that:
- external validators are invoked via the factory,
- errors and warnings from the external API are reflected in `ValidationResult`, and
- fallback to default validators still works when external engines are misconfigured.
This will provide end-to-end coverage of the external-schema integration at the factory level.

Suggested implementation:

```cpp
BOOST_AUTO_TEST_CASE(register_json_schema_validator) {
  SchemaServiceFactory factory;
  auto json_validator = std::make_shared<JsonSchemaValidator>();

  factory.registerValidator("JSONSchema", json_validator);

  ISchemaValidator &retrieved = factory.getValidator("JSONSchema");
  BOOST_TEST(retrieved.hasValidationCapability() == true);
}

BOOST_AUTO_TEST_CASE(external_validator_is_invoked_via_factory) {
  // Arrange
  SchemaServiceFactory factory;

  auto mock_client = std::make_shared<MockSchemaAPIClient>();
  auto external_storage = std::make_shared<ExternalSchemaStorage>(mock_client);
  auto external_validator = std::make_shared<ExternalSchemaValidator>(mock_client);

  factory.setSchemaStorage(external_storage);
  factory.registerValidator("ExternalJSONSchema", external_validator);

  const std::string schema_id{"user-schema"};
  const std::string payload{R"({"name": "Alice", "age": 42})"};

  // The mock is expected to be called when validation is triggered via the factory
  EXPECT_CALL(*mock_client, validateSchema(schema_id, payload, testing::_))
      .Times(1)
      .WillOnce([](const std::string &, const std::string &, ExternalValidationResponse &out) {
        out.errors.clear();
        out.warnings.clear();
        return true;
      });

  // Act
  auto &validator = factory.getValidator("ExternalJSONSchema");
  ValidationResult result = validator.validate(schema_id, payload);

  // Assert
  BOOST_TEST(result.isValid());
  BOOST_TEST(result.errors().empty());
  BOOST_TEST(result.warnings().empty());
}

BOOST_AUTO_TEST_CASE(external_validator_propagates_errors_and_warnings) {
  // Arrange
  SchemaServiceFactory factory;

  auto mock_client = std::make_shared<MockSchemaAPIClient>();
  auto external_storage = std::make_shared<ExternalSchemaStorage>(mock_client);
  auto external_validator = std::make_shared<ExternalSchemaValidator>(mock_client);

  factory.setSchemaStorage(external_storage);
  factory.registerValidator("ExternalJSONSchema", external_validator);

  const std::string schema_id{"user-schema"};
  const std::string payload{kInvalidUserSchema}; // uses the invalid JSON sample from anonymous namespace

  ExternalValidationResponse response;
  response.errors = {
      ExternalValidationError{"type", "expected integer, got string"}};
  response.warnings = {
      ExternalValidationWarning{"age", "value is out of recommended range"}};

  EXPECT_CALL(*mock_client, validateSchema(schema_id, payload, testing::_))
      .Times(1)
      .WillOnce([&response](const std::string &, const std::string &, ExternalValidationResponse &out) {
        out = response;
        return true;
      });

  // Act
  auto &validator = factory.getValidator("ExternalJSONSchema");
  ValidationResult result = validator.validate(schema_id, payload);

  // Assert
  BOOST_TEST(!result.isValid());
  BOOST_TEST(result.errors().size() == response.errors.size());
  BOOST_TEST(result.warnings().size() == response.warnings.size());
  BOOST_TEST(result.errors()[0].path == response.errors[0].path);
  BOOST_TEST(result.errors()[0].message == response.errors[0].message);
  BOOST_TEST(result.warnings()[0].path == response.warnings[0].path);
  BOOST_TEST(result.warnings()[0].message == response.warnings[0].message);
}

BOOST_AUTO_TEST_CASE(fallback_to_default_validator_when_external_misconfigured) {
  // Arrange
  SchemaServiceFactory factory;

  // Default JSON schema validator still registered
  auto json_validator = std::make_shared<JsonSchemaValidator>();
  factory.registerValidator("JSONSchema", json_validator);

  // External pieces are present but misconfigured (e.g. API client throws / fails)
  auto failing_client = std::make_shared<MockSchemaAPIClient>();
  auto external_storage = std::make_shared<ExternalSchemaStorage>(failing_client);
  auto external_validator = std::make_shared<ExternalSchemaValidator>(failing_client);

  factory.setSchemaStorage(external_storage);
  factory.registerValidator("ExternalJSONSchema", external_validator);

  const std::string schema_id{"user-schema"};
  const std::string payload{R"({"name": "Alice", "age": 42})"};

  EXPECT_CALL(*failing_client, validateSchema(schema_id, payload, testing::_))
      .Times(1)
      .WillOnce([](const std::string &, const std::string &, ExternalValidationResponse &) {
        throw std::runtime_error("external validation engine misconfigured");
      });

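  // Act: call the external validator so the Times(1) expectation above is satisfied.
  // Assumption: ExternalSchemaValidator lets the client's exception propagate; if it
  // converts failures into an invalid ValidationResult, assert on that instead.
  ISchemaValidator &external = factory.getValidator("ExternalJSONSchema");
  BOOST_CHECK_THROW(external.validate(schema_id, payload), std::runtime_error);

  // The default JSONSchema validator must remain usable after the external failure
  ISchemaValidator &validator = factory.getValidator("JSONSchema");
  ValidationResult result = validator.validate(schema_id, payload);

  // Assert: despite external failure, default validator still succeeds
  BOOST_TEST(result.isValid());
  BOOST_TEST(result.errors().empty());
}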

```

The above tests assume several APIs and types that may differ from your actual implementation. To integrate them you will likely need to:
1. Adjust the mock type:
   - If you already have a `SchemaAPIClient` interface, create `MockSchemaAPIClient` using your mocking framework (e.g., gmock) with a method compatible with how `ExternalSchemaValidator`/`ExternalSchemaStorage` call it (I used `validateSchema(schema_id, payload, ExternalValidationResponse&)` as a placeholder).
   - Replace `EXPECT_CALL`/`testing::_` with your actual mocking constructs if you are not using gmock.
2. Align constructor and wiring:
   - Update the construction of `ExternalSchemaValidator` and `ExternalSchemaStorage` to match their real constructors (they might take the client by reference, pointer, or additional configuration).
   - Replace `factory.setSchemaStorage(...)` and `factory.registerValidator("ExternalJSONSchema", external_validator);` with the actual configuration methods on `SchemaServiceFactory` that wire external storage/validators.
3. Match `ValidationResult` API:
   - Update `result.isValid()`, `result.errors()`, `result.warnings()` accessors to whatever your `ValidationResult` exposes (e.g., `ok()`, `getErrors()`, `getWarnings()`).
   - Adjust field access for errors/warnings (`path`, `message`) to the fields your error/warning types actually have.
4. Use the correct invalid JSON payload:
   - I referenced `kInvalidUserSchema` from the anonymous namespace; ensure this symbol exists or replace it with your actual invalid JSON sample defined at the top of the file.
5. If your design triggers validation only via a higher-level factory method (e.g. `factory.validate(schema_id, payload)`):
   - Replace the direct `validator.validate(schema_id, payload)` calls with the appropriate factory-level entry point so the tests truly exercise “end-to-end” behavior through `SchemaServiceFactory`.
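
For item 1, if gmock is not available, a hand-written fake can stand in for `MockSchemaAPIClient`. The sketch below is only an illustration: `ISchemaAPIClient`, `FakeSchemaAPIClient`, `ExternalValidationIssue`, and the shape of `ExternalValidationResponse` are assumptions and not taken from the PR, and the `validateSchema` signature mirrors the placeholder used in the tests above.

```cpp
// Hypothetical stand-ins: none of these definitions come from the PR.
#include <cassert>
#include <string>
#include <vector>

// One error or warning reported by the (assumed) external API.
struct ExternalValidationIssue {
  std::string path;
  std::string message;
};

// Assumed response shape, matching the fields the tests read.
struct ExternalValidationResponse {
  std::vector<ExternalValidationIssue> errors;
  std::vector<ExternalValidationIssue> warnings;
};

// Assumed client interface; the real SchemaAPIClient may look different.
class ISchemaAPIClient {
public:
  virtual ~ISchemaAPIClient() = default;
  virtual bool validateSchema(const std::string &schema_id,
                              const std::string &payload,
                              ExternalValidationResponse &out) = 0;
};

// Hand-written fake: returns a canned response and records how it was called.
class FakeSchemaAPIClient : public ISchemaAPIClient {
public:
  ExternalValidationResponse canned; // response copied into `out` on every call
  int call_count = 0;                // number of validateSchema invocations
  std::string last_schema_id;        // arguments of the most recent call
  std::string last_payload;

  bool validateSchema(const std::string &schema_id, const std::string &payload,
                      ExternalValidationResponse &out) override {
    ++call_count;
    last_schema_id = schema_id;
    last_payload = payload;
    out = canned;
    return true;
  }
};
```

A test would then fill `canned` before the call and check `call_count` and the recorded arguments afterwards, instead of relying on `EXPECT_CALL(...).Times(1)`.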
</issue_to_address>


Comment on lines +60 to +69
BOOST_AUTO_TEST_SUITE(FactoryWithJsonSchemaValidator)

BOOST_AUTO_TEST_CASE(register_json_schema_validator) {
SchemaServiceFactory factory;
auto json_validator = std::make_shared<JsonSchemaValidator>();

factory.registerValidator("JSONSchema", json_validator);

ISchemaValidator &retrieved = factory.getValidator("JSONSchema");
BOOST_TEST(retrieved.hasValidationCapability() == true);

suggestion (testing): Integration tests could also cover interactions with external validators/storage when those are wired in

Given the new ExternalSchemaValidator and ExternalSchemaStorage, please extend this suite (or add a sibling one) configuring the factory with those components backed by a mock SchemaAPIClient. That would let you assert that:

  • external validators are invoked via the factory,
  • errors and warnings from the external API are reflected in ValidationResult, and
  • fallback to default validators still works when external engines are misconfigured.
    This will provide end-to-end coverage of the external-schema integration at the factory level.
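
As a rough illustration of the fallback behavior in the last bullet, the sketch below tries a preferred validator and uses a default one if the first throws. It is not the PR's `SchemaServiceFactory`: `validateWithFallback`, the `Validator` alias, and the reduced `ValidationResult` are hypothetical names chosen for this example, and whether the real factory falls back on exceptions is an assumption.

```cpp
// Hypothetical sketch; names and behavior are assumptions, not the PR's code.
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Reduced result type carrying only what the sketch needs.
struct ValidationResult {
  std::vector<std::string> errors;
  bool isValid() const { return errors.empty(); }
};

// A validator is modelled as a plain callable to keep the example short.
using Validator = std::function<ValidationResult(const std::string &schema_id,
                                                 const std::string &payload)>;

// Try the preferred validator first; if it throws, use the fallback instead.
ValidationResult validateWithFallback(const Validator &preferred,
                                      const Validator &fallback,
                                      const std::string &schema_id,
                                      const std::string &payload) {
  try {
    return preferred(schema_id, payload);
  } catch (const std::exception &) {
    // The preferred engine is unusable (e.g. misconfigured); fall back.
    return fallback(schema_id, payload);
  }
}
```

A test for this shape of fallback would pass a validator that throws as `preferred` and assert that the `fallback` callable ran exactly once.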

Suggested implementation:

BOOST_AUTO_TEST_CASE(register_json_schema_validator) {
  SchemaServiceFactory factory;
  auto json_validator = std::make_shared<JsonSchemaValidator>();

  factory.registerValidator("JSONSchema", json_validator);

  ISchemaValidator &retrieved = factory.getValidator("JSONSchema");
  BOOST_TEST(retrieved.hasValidationCapability() == true);
}

BOOST_AUTO_TEST_CASE(external_validator_is_invoked_via_factory) {
  // Arrange
  SchemaServiceFactory factory;

  auto mock_client = std::make_shared<MockSchemaAPIClient>();
  auto external_storage = std::make_shared<ExternalSchemaStorage>(mock_client);
  auto external_validator = std::make_shared<ExternalSchemaValidator>(mock_client);

  factory.setSchemaStorage(external_storage);
  factory.registerValidator("ExternalJSONSchema", external_validator);

  const std::string schema_id{"user-schema"};
  const std::string payload{R"({"name": "Alice", "age": 42})"};

  // The mock is expected to be called when validation is triggered via the factory
  EXPECT_CALL(*mock_client, validateSchema(schema_id, payload, testing::_))
      .Times(1)
      .WillOnce([](const std::string &, const std::string &, ExternalValidationResponse &out) {
        out.errors.clear();
        out.warnings.clear();
        return true;
      });

  // Act
  auto &validator = factory.getValidator("ExternalJSONSchema");
  ValidationResult result = validator.validate(schema_id, payload);

  // Assert
  BOOST_TEST(result.isValid());
  BOOST_TEST(result.errors().empty());
  BOOST_TEST(result.warnings().empty());
}


@JoshuaSBrown JoshuaSBrown self-assigned this Mar 18, 2026
@JoshuaSBrown JoshuaSBrown requested a review from nedvedba March 23, 2026 12:37
@JoshuaSBrown JoshuaSBrown added the Component: Core and Type: Test labels Mar 23, 2026
@JoshuaSBrown JoshuaSBrown added the Priority: Medium label Mar 23, 2026
@JoshuaSBrown JoshuaSBrown changed the title from "1830 daps core foxx schema integration 1" to "[DAPS-1830] - core foxx schema integration 1" Mar 23, 2026

@nedvedba nedvedba left a comment

LGTM! A lot cleaner and better documented than what was there originally.

@JoshuaSBrown JoshuaSBrown changed the title from "[DAPS-1830] - core foxx schema integration 1" to "[DAPS-1830] - core foxx refactored json schema integration 1" Mar 24, 2026
@JoshuaSBrown

> LGTM! A lot cleaner and better documented than what was there originally.

Thanks Blake!

@JoshuaSBrown JoshuaSBrown merged commit 1dae067 into devel Mar 24, 2026
13 of 14 checks passed
