[DAPS-1522] Metrics Router Logging Improvements #1797

Merged: megatnt1122 merged 13 commits into devel from refactor-DAPS-1522-Metrics-Router-Logging-Improvements on Dec 4, 2025

megatnt1122 (Collaborator) commented Dec 1, 2025

Ticket

#1522

Description

Logging Improvements to metric_router

Tasks

  • A description of the PR has been provided, and a diagram included if it is a new feature.
  • Formatter has been run
  • CHANGELOG comment has been added
  • Labels have been assigned to the PR
  • A reviewer has been added
  • A user has been assigned to work on the PR
  • If new feature, a unit test has been added

Summary by Sourcery

Improve observability and test coverage for the metrics Foxx router.

Enhancements:

  • Add structured request logging (start, success, failure) to metrics router endpoints, including client and correlation identifiers.

Tests:

  • Introduce a metrics router Foxx test suite covering /users/active, /msg_count, and /purge behaviour.
  • Register the new metrics router test in the CMake test configuration with appropriate Foxx fixtures.

@megatnt1122 megatnt1122 self-assigned this Dec 1, 2025
@megatnt1122 megatnt1122 added labels Dec 1, 2025: Component: Database (relates to database microservice / data model); Type: Refactor (implementation change, same functionality); Priority: Low (lower priority work)
sourcery-ai bot (Contributor) commented Dec 1, 2025

Reviewer's Guide

Adds structured request logging to the metrics Foxx router, introduces unit tests for its endpoints, wires the metrics router tests into CMake, and applies minor JS formatting cleanups in a third‑party worker file.

Sequence diagram for metrics_router msg_count update with structured logging

sequenceDiagram
    actor Client
    participant MetricsRouter
    participant g_lib
    participant Logger
    participant DB

    Client->>MetricsRouter: POST /metrics/msg_count/update?client=clientId
    MetricsRouter->>g_lib: getUserFromClientID(clientId)
    g_lib-->>MetricsRouter: client
    MetricsRouter->>Logger: logRequestStarted(clientId, correlationId, POST, metrics/msg_count/update)

    alt success
        loop for each timestamp bucket
            MetricsRouter->>DB: metrics.save(obj)
            DB-->>MetricsRouter: ack
        end
        MetricsRouter->>Logger: logRequestSuccess(clientId, correlationId, POST, metrics/msg_count/update, obj)
        MetricsRouter-->>Client: 200 OK
    else failure
        MetricsRouter->>Logger: logRequestFailure(clientId, correlationId, POST, metrics/msg_count/update, obj, error)
        MetricsRouter->>g_lib: handleException(error, res)
        g_lib-->>Client: error response
    end

Sequence diagram for metrics_router GET endpoints with structured logging

sequenceDiagram
    actor Client
    participant MetricsRouter
    participant g_lib
    participant Logger
    participant DB

    Client->>MetricsRouter: GET /metrics/msg_count?client=clientId
    MetricsRouter->>g_lib: getUserFromClientID(clientId)
    g_lib-->>MetricsRouter: client
    MetricsRouter->>Logger: logRequestStarted(clientId, correlationId, GET, metrics/msg_count)

    alt success
        MetricsRouter->>DB: _query(AQL for msg_count)
        DB-->>MetricsRouter: result
        MetricsRouter->>Logger: logRequestSuccess(clientId, correlationId, GET, metrics/msg_count, result)
        MetricsRouter-->>Client: 200 OK with result
    else failure
        MetricsRouter->>Logger: logRequestFailure(clientId, correlationId, GET, metrics/msg_count, result, error)
        MetricsRouter->>g_lib: handleException(error, res)
        g_lib-->>Client: error response
    end

    Client->>MetricsRouter: GET /metrics/users/active?client=clientId
    MetricsRouter->>g_lib: getUserFromClientID(clientId)
    g_lib-->>MetricsRouter: client or null
    MetricsRouter->>Logger: logRequestStarted(clientId, correlationId, GET, metrics/users/active)

    alt success
        MetricsRouter->>DB: _query(AQL for active users)
        DB-->>MetricsRouter: cursor
        MetricsRouter->>MetricsRouter: aggregate cursor into cnt
        MetricsRouter->>Logger: logRequestSuccess(clientId, correlationId, GET, metrics/users/active, cnt)
        MetricsRouter-->>Client: 200 OK with cnt
    else failure
        MetricsRouter->>Logger: logRequestFailure(clientId, correlationId, GET, metrics/users/active, cnt, error)
        MetricsRouter->>g_lib: handleException(error, res)
        g_lib-->>Client: error response
    end

    Client->>MetricsRouter: POST /metrics/purge
    MetricsRouter->>Logger: logRequestStarted(undefined, correlationId, POST, metrics/purge)

    alt success
        MetricsRouter->>DB: metrics.save(purge audit document)
        DB-->>MetricsRouter: ack
        MetricsRouter->>DB: _query(remove old metrics)
        DB-->>MetricsRouter: ack
        MetricsRouter->>Logger: logRequestSuccess(undefined, correlationId, POST, metrics/purge, undefined)
        MetricsRouter-->>Client: 200 OK
    else failure
        MetricsRouter->>Logger: logRequestFailure(undefined, correlationId, POST, metrics/purge, undefined, error)
        MetricsRouter->>g_lib: handleException(error, res)
        g_lib-->>Client: error response
    end

File-Level Changes

Change: Add structured request lifecycle logging to all metrics router endpoints.
Details:
  • Import shared logger utility and define a common basePath for the metrics router.
  • For POST /msg_count/update, resolve client from query, log Started/Success/Failure events with correlationId, routePath, status, description, and extra payload/result.
  • For GET /msg_count, resolve client, add Started/Success/Failure logging with appropriate metadata and response body as extra on success.
  • For GET /users/active, optionally resolve client, introduce separate cnt accumulator variable, and add Started/Success/Failure logging with cnt in extra.
  • For POST /purge, add Started/Success/Failure request logging with purge description and placeholder client/extra values.
Files:
  • core/database/foxx/api/metrics_router.js

Change: Introduce unit tests for the metrics router and hook them into the build/test pipeline.
Details:
  • Create metrics_router.test.js with tests covering /users/active default window, since filtering, empty result, msg_count/update write behavior, msg_count filtering by time/type/uid, and purge behavior.
  • Ensure tests manage metrics and user collections setup/teardown to keep DB state isolated.
  • Register foxx_metrics_router test in CMake and require Foxx fixtures so it runs with other Foxx router tests.
Files:
  • core/database/foxx/tests/metrics_router.test.js
  • core/database/CMakeLists.txt

Change: Apply minor formatting fixes to third-party JS assets.
Details:
  • Remove stray blank lines and adjust minor spacing and wrapping artifacts in the Ace CoffeeScript worker bundle.
  • Make a tiny alignment change in a minified html5shiv-printshiv asset.
Files:
  • web/static/ace/worker-coffee.js
  • docs/_static/js/html5shiv-printshiv.min.js
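
The Started/Success/Failure lifecycle listed for metrics_router.js can be sketched as a small wrapper. This is only an illustration, not the code merged in this PR: the `withRequestLogging` helper and the in-memory `logger` stub are hypothetical, and the `error` field name is an assumption. The `logRequestStarted`/`logRequestSuccess`/`logRequestFailure` names and the other field names are taken from the diagrams and the snippet quoted in the review.

```javascript
// Hypothetical in-memory stand-in for the shared logger utility (not shown on this page).
const entries = [];
const logger = {
    logRequestStarted: (e) => entries.push({ ...e, status: "Started" }),
    logRequestSuccess: (e) => entries.push({ ...e, status: "Success" }),
    logRequestFailure: (e) => entries.push({ ...e, status: "Failure" }),
};

const basePath = "metrics";

// Hypothetical helper: runs `work` and logs Started, then Success or Failure.
function withRequestLogging(meta, req, work) {
    const common = {
        client: meta.client,
        correlationId: req.headers["x-correlation-id"],
        httpVerb: meta.httpVerb,
        routePath: basePath + meta.route,
        description: meta.description,
    };
    let result = null;
    try {
        logger.logRequestStarted(common);
        result = work();
        logger.logRequestSuccess({ ...common, extra: result });
        return result;
    } catch (e) {
        logger.logRequestFailure({ ...common, extra: result, error: e });
        throw e; // per the diagrams, the router hands the error to g_lib.handleException instead
    }
}

// Usage with a fake request object and a stubbed query result.
const req = { headers: { "x-correlation-id": "abc-123" } };
const out = withRequestLogging(
    { client: "u/alice", httpVerb: "GET", route: "/msg_count", description: "Get message metrics" },
    req,
    () => [{ total: 2 }],
);
```

Keeping the shared fields in one `common` object means the three log calls for a route cannot drift apart in `routePath` or `description`.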

sourcery-ai bot (Contributor) left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • Several log entries reuse the description "Update message metrics" for GET /msg_count and other non-update routes; consider using route-specific descriptions (e.g., "Get message metrics") so log messages more accurately reflect the operation being performed.
  • In the /purge route logging, client and extra are hard-coded to the literal string "undefined"; using null or omitting these fields entirely would avoid conflating the string with an actual undefined/missing value in downstream log consumers.
  • The POST /purge test assumes db.metrics.toArray()[0] is the non-purged document, but collection iteration order is not guaranteed; it would be more robust to locate the remaining document by its total or timestamp rather than relying on array position.
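
To illustrate the second point, here is a small sketch of how the three choices differ once an entry is serialized. It assumes a downstream consumer receives log entries as JSON, which is an assumption made for illustration; the logger's actual output format is not shown in this review.

```javascript
// Assumption: log entries reach the consumer as JSON.
const asString = JSON.stringify({ client: "undefined" }); // the literal string survives
const asUndefined = JSON.stringify({ client: undefined }); // the key is dropped
const asNull = JSON.stringify({ client: null }); // explicit "no value"

console.log(asString); // {"client":"undefined"}
console.log(asUndefined); // {}
console.log(asNull); // {"client":null}

// A consumer can detect a missing client only in the null/omitted cases.
const hasClient = (json) => JSON.parse(json).client != null;
console.log(hasClient(asString), hasClient(asUndefined), hasClient(asNull)); // true false false
```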
Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `core/database/foxx/api/metrics_router.js:223-232` </location>
<code_context>
     .post("/purge", function (req, res) {
         try {
+            logger.logRequestStarted({
+                client: "undefined",
+                correlationId: req.headers["x-correlation-id"],
+                httpVerb: "POST",
</code_context>

<issue_to_address>
**suggestion:** Using the string literal "undefined" in purge route logs may cause confusion versus actual `undefined`/`null` values.

In the `/purge` handler, `client` (and later `extra`) are always logged as the string "undefined", which is indistinguishable from a bug where the value was actually `undefined`. Consider using `null`, omitting the field, or a more explicit value like "no_client" to keep logs unambiguous for analysis and debugging.

Suggested implementation:

```javascript
    .post("/purge", function (req, res) {
        try {
            logger.logRequestStarted({
                client: null,
                correlationId: req.headers["x-correlation-id"],
                httpVerb: "POST",
                routePath: basePath + "/msg_count/update",
                status: "Started",
                description: "Update message metrics",
            });
```

1. If there is a corresponding `extra: "undefined"` field later in the same `/purge` handler (or related logging for this route), it should be updated analogously to use `null` or a more explicit sentinel value (e.g. `"no_extra"`) instead of the string `"undefined"`.
2. If a real client identifier is available on `req` (e.g. `req.user`, `req.headers["x-client-id"]`, or similar), consider passing that actual value instead of `null` to further improve log usefulness.
</issue_to_address>

### Comment 2
<location> `core/database/foxx/tests/metrics_router.test.js:212-215` </location>
<code_context>
+        expect(arr[0].uid).to.equal("u2");
+    });
+
+    it("POST /purge should remove metrics older than timestamp", () => {
+        const now = Math.floor(Date.now() / 1000);
+
+        db.metrics.save([
+            { timestamp: now - 1000, type: "msgcnt_total", total: 1 }, // should be removed
+            { timestamp: now, type: "msgcnt_total", total: 2 }, // should stay
+        ]);
+
+        const ts = now - 500;
+        const res = request.post(`${metrics_base_url}/purge?timestamp=${ts}`);
+
+        expect(res.status).to.equal(204);
+
+        const docs = db.metrics.toArray();
+        //Equals 2 due to writing the purge doc
+        expect(docs.length).to.equal(2);
+        expect(docs[0].total).to.equal(2);
+    });
+});
</code_context>

<issue_to_address>
**suggestion (testing):** Avoid relying on implicit document order in the `/purge` test assertions.

In this test, the assertion `expect(docs[0].total).to.equal(2);` assumes `toArray()` returns the surviving metric first, but collection iteration order isn’t guaranteed and may change across backends or ArangoDB versions, making the test flaky.

Instead of indexing into `docs[0]`, assert based on content, e.g.:

```js
const docs = db.metrics.toArray();
expect(docs.length).to.equal(2);
const remaining = docs.find((d) => d.total === 2);
expect(remaining).to.exist;
```

You could also explicitly assert there is exactly one non-`purge` document and one `type === "purge"` document to better reflect the route’s behavior.

```suggestion
        const docs = db.metrics.toArray();
        // Equals 2 due to writing the purge doc
        expect(docs.length).to.equal(2);

        // Assert based on content instead of relying on implicit document order
        const remainingMetricDocs = docs.filter((d) => d.type !== "purge");
        const purgeDocs = docs.filter((d) => d.type === "purge");

        expect(remainingMetricDocs.length).to.equal(1);
        expect(remainingMetricDocs[0].total).to.equal(2);
        expect(purgeDocs.length).to.equal(1);
```
</issue_to_address>

### Comment 3
<location> `web/static/ace/worker-coffee.js:1646-1651` </location>
<code_context>

</code_context>

<issue_to_address>
**issue (code-quality):** Use `const` or `let` instead of `var`. ([`avoid-using-var`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/JavaScript/Default-Rules/avoid-using-var))

<details><summary>Explanation</summary>`const` is preferred as it ensures you cannot reassign references (which can lead to buggy and confusing code).
`let` may be used if you need to reassign references - it's preferred to `var` because it is block- rather than
function-scoped.

From the [Airbnb JavaScript Style Guide](https://airbnb.io/javascript/#references--prefer-const)
</details>
</issue_to_address>
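
The scoping difference behind this rule can be shown with a minimal example. The function names are made up for illustration and are unrelated to `worker-coffee.js`, which the Reviewer's Guide describes as a third-party bundle.

```javascript
function withVar() {
    for (var i = 0; i < 3; i++) {}
    return typeof i; // "number": var is function-scoped, so i is still visible here
}

function withLet() {
    for (let j = 0; j < 3; j++) {}
    return typeof j; // "undefined": let is block-scoped, so j does not exist here
}

console.log(withVar(), withLet()); // number undefined
```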

### Comment 4
<location> `web/static/ace/worker-coffee.js:26752-26757` </location>
<code_context>

</code_context>

<issue_to_address>
**issue (code-quality):** Use `const` or `let` instead of `var`. ([`avoid-using-var`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/JavaScript/Default-Rules/avoid-using-var))

<details><summary>Explanation</summary>`const` is preferred as it ensures you cannot reassign references (which can lead to buggy and confusing code).
`let` may be used if you need to reassign references - it's preferred to `var` because it is block- rather than
function-scoped.

From the [Airbnb JavaScript Style Guide](https://airbnb.io/javascript/#references--prefer-const)
</details>
</issue_to_address>

### Comment 5
<location> `web/static/ace/worker-coffee.js:32947-32954` </location>
<code_context>

</code_context>

<issue_to_address>
**issue (code-quality):** Use `const` or `let` instead of `var`. ([`avoid-using-var`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/JavaScript/Default-Rules/avoid-using-var))

<details><summary>Explanation</summary>`const` is preferred as it ensures you cannot reassign references (which can lead to buggy and confusing code).
`let` may be used if you need to reassign references - it's preferred to `var` because it is block- rather than
function-scoped.

From the [Airbnb JavaScript Style Guide](https://airbnb.io/javascript/#references--prefer-const)
</details>
</issue_to_address>

### Comment 6
<location> `web/static/ace/worker-coffee.js:34737-34744` </location>
<code_context>

</code_context>

<issue_to_address>
**issue (code-quality):** Use `const` or `let` instead of `var`. ([`avoid-using-var`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/JavaScript/Default-Rules/avoid-using-var))

<details><summary>Explanation</summary>`const` is preferred as it ensures you cannot reassign references (which can lead to buggy and confusing code).
`let` may be used if you need to reassign references - it's preferred to `var` because it is block- rather than
function-scoped.

From the [Airbnb JavaScript Style Guide](https://airbnb.io/javascript/#references--prefer-const)
</details>
</issue_to_address>

### Comment 7
<location> `web/static/ace/worker-coffee.js:35654-35661` </location>
<code_context>

</code_context>

<issue_to_address>
**issue (code-quality):** Use `const` or `let` instead of `var`. ([`avoid-using-var`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/JavaScript/Default-Rules/avoid-using-var))

<details><summary>Explanation</summary>`const` is preferred as it ensures you cannot reassign references (which can lead to buggy and confusing code).
`let` may be used if you need to reassign references - it's preferred to `var` because it is block- rather than
function-scoped.

From the [Airbnb JavaScript Style Guide](https://airbnb.io/javascript/#references--prefer-const)
</details>
</issue_to_address>

### Comment 8
<location> `web/static/ace/worker-coffee.js:36613-36620` </location>
<code_context>

</code_context>

<issue_to_address>
**issue (code-quality):** Use `const` or `let` instead of `var`. ([`avoid-using-var`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/JavaScript/Default-Rules/avoid-using-var))

<details><summary>Explanation</summary>`const` is preferred as it ensures you cannot reassign references (which can lead to buggy and confusing code).
`let` may be used if you need to reassign references - it's preferred to `var` because it is block- rather than
function-scoped.

From the [Airbnb JavaScript Style Guide](https://airbnb.io/javascript/#references--prefer-const)
</details>
</issue_to_address>

JoshuaSBrown (Collaborator) commented

Looks like there might be an error in one of your tests.

Dec 01 13:17:20 ci-datafed-arangodb arangod[1118]: 2025-12-01T18:17:20.579241Z [1118-6] ERROR [24213] {general} Client: unknown      | Correlation_ID: unknown | HTTP: POST           | Route: metrics/msg_count/update | Status: Failure      | Desc: Update message metrics | Extra: unknown       | Error: illegal document identifier | Stack: ArangoError: illegal document identifier\n    at module.exports.obj.getUserFromClientID (/tmp/arangodb-app.root/_db/sdms/api/1/APP/api/support.js:660:27)\n    at Route._handler (/tmp/arangodb-app.root/_db/sdms/api/1/APP/api/metrics_router.js:17:28)\n    at next (/opt/arangodb3/arangodb3-linux-3.12.2_x86_64/usr/share/arangodb3/js/server/modules/@arangodb/foxx/router/tree.js:420:15)\n    at next (/opt/arangodb3/arangodb3-linux-3.12.2_x86_64/usr/share/arangodb3/js/server/modules/@arangodb/foxx/router/tree.js:418:7)\n    at next (/opt/arangodb3/arangodb3-linux-3.12.2_x86_64/usr/share/arangodb3/js/server/modules/@arangodb/foxx/router/tree.js:418:7)\n    at next (/opt/arangodb3/arangodb3-linux-3.12.2_x86_64/usr/share/arangodb3/js/server/modules/@arangodb/foxx/router/tree.js:418:7)\n    at dispatch (/opt/arangodb3/arangodb3-linux-3.12.2_x86_64/usr/share/arangodb3/js/server/modules/@arangodb/foxx/router/tree.js:434:3)\n    at Tree.dispatch (/opt/arangodb3/arangodb3-linux-3.12.2_x86_64/usr/share/arangodb3/js/server/modules/@arangodb/foxx/router/tree.js:136:7)\n    at callback (/opt/arangodb3/arangodb3-linux-3.12.2_x86_64/usr/share/arangodb3/js/server/modules/@arangodb/foxx/service.js:356:35)\n    at execute (/opt/arangodb3/arangodb3-linux-3.12.2_x86_64/usr/share/arangodb3/js/server/modules/@arangodb/actions.js:1235:7)
Dec 01 13:17:20 ci-datafed-arangodb arangod[1118]: 2025-12-01T18:17:20.579345Z [1118-6] INFO [99d80] {general} Service exception: ArangoError 1205: illegal document identifier
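
For anyone triaging such failures, the structured part of the log line above can be split into fields. The sketch below is a hypothetical helper, not part of the PR; it assumes fields are separated by `|` and that each field has the form `Key: value`, as in the line shown. A value that itself contains `|` would break this simple approach.

```javascript
// Abbreviated copy of the structured portion of the log line above (stack trace omitted).
const line =
    "Client: unknown      | Correlation_ID: unknown | HTTP: POST           | " +
    "Route: metrics/msg_count/update | Status: Failure      | " +
    "Desc: Update message metrics | Extra: unknown       | Error: illegal document identifier";

// Hypothetical helper: split on "|", then on the first ":" in each field, trimming padding.
function parseLogLine(text) {
    const fields = {};
    for (const part of text.split("|")) {
        const idx = part.indexOf(":");
        if (idx === -1) continue; // skip fragments without a key
        fields[part.slice(0, idx).trim()] = part.slice(idx + 1).trim();
    }
    return fields;
}

const parsed = parseLogLine(line);
console.log(parsed.Route, parsed.Status, parsed.Error);
```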

JoshuaSBrown (Collaborator) left a comment

There is an error in one of your tests. I'm also seeing that the formatted files are being included again.

@JoshuaSBrown JoshuaSBrown removed their assignment Dec 2, 2025
@megatnt1122 megatnt1122 merged commit 4e3f0fc into devel Dec 4, 2025
14 checks passed