[DAPS-1522] Data Router Logging Improvements#1789
Merged
JoshuaSBrown merged 12 commits intodevelfrom Feb 4, 2026
Merged
Conversation
Contributor
Reviewer's GuideRefactors data_router to use centralized permission helpers and introduces structured request lifecycle logging (start/success/failure) across all data routes for better observability and correlation. Sequence diagram for POST data/get with task initialization and loggingsequenceDiagram
actor User
participant ClientApp
participant DataRouter
participant GLib
participant Logger
participant ArangoDB
participant GTasks
User->>ClientApp: Request data download
ClientApp->>DataRouter: POST /data/get?client=clientId
DataRouter->>GLib: getUserFromClientID(clientId)
GLib-->>DataRouter: client
DataRouter->>Logger: logRequestStarted(client, correlationId, POST, data/get, Started)
DataRouter->>ArangoDB: _executeTransaction(read metadata, validate ids)
activate ArangoDB
ArangoDB->>GLib: resolveDataID for each id
GLib-->>ArangoDB: data ids
ArangoDB-->>DataRouter: validated ids
deactivate ArangoDB
DataRouter->>GTasks: taskInitDataGet(client, path, encrypt, ids)
GTasks-->>DataRouter: task descriptor result
DataRouter-->>ClientApp: HTTP 200 result (task info)
DataRouter->>Logger: logRequestSuccess(client, correlationId, POST, data/get, Success, result)
alt Failure during transaction or task init
DataRouter->>Logger: logRequestFailure(client, correlationId, POST, data/get, Failure, result, error)
DataRouter->>GLib: handleException(error, res)
GLib-->>ClientApp: Error response
end
Class diagram for updated data_router modules and logging helpersclassDiagram
class DataRouter {
+post_create(req, res)
+post_create_batch(req, res)
+post_update(req, res)
+post_update_batch(req, res)
+post_update_md_err_msg(req, res)
+post_update_size(req, res)
+get_view(req, res)
+post_export(req, res)
+get_dep_graph_get(req, res)
+get_lock(req, res)
+get_path(req, res)
+get_list_by_alloc(req, res)
+post_get(req, res)
+post_put(req, res)
+post_alloc_chg(req, res)
+post_owner_chg(req, res)
+post_delete(req, res)
-recordCreate(client, record, result)
-recordUpdate(client, record, result)
}
class GLib {
+getUserFromClientID(clientId)
+getUserFromClientID_noexcept(clientId)
+hasManagerPermProj(client, ownerId)
+hasPermissions(client, resource, perms)
+hasAdminPermObject(client, objectId)
+getPermissions(client, resource, perms)
+ensureAdminPermUser(client, userId)
+ensureManagerPermProj(client, projId)
+resolveDataID(id, client)
+resolveDataCollID(id, client)
+getObject(id, client)
+hasPublicRead(dataId)
+computeDataPath(loc, localOnly)
+saveRecentGlobusPath(client, path, transferType)
+handleException(error, res)
<<constants>> PERM_CREATE
<<constants>> PERM_WR_META
<<constants>> PERM_WR_REC
<<constants>> PERM_RD_REC
<<constants>> PERM_RD_META
<<constants>> PERM_RD_DATA
<<constants>> TT_DATA_EXPORT
<<constants>> TT_DATA_GET
<<constants>> TT_DATA_PUT
}
class Logger {
+logRequestStarted(client, correlationId, httpVerb, routePath, status, description)
+logRequestSuccess(client, correlationId, httpVerb, routePath, status, description, extra)
+logRequestFailure(client, correlationId, httpVerb, routePath, status, description, extra, error)
}
class ArangoDB {
+_executeTransaction(config)
+c_document(id)
+d_document(id)
+owner_firstExample(query)
+loc_firstExample(query)
+_update(id, patch)
+_remove(id)
+_query(query, bindVars)
}
class GTasks {
+taskInitDataGet(client, path, encrypt, ids)
+taskInitDataPut(client, path, encrypt, ids, check, srcRepoId)
+taskInitRecAllocChg(client, projId, ids, opts)
+taskInitRecOwnerChg(client, ids, collId, opts)
+taskInitRecCollDelete(client, ids)
}
DataRouter --> GLib : uses
DataRouter --> Logger : logs via
DataRouter --> ArangoDB : executes transactions
DataRouter --> GTasks : initializes tasks
GLib --> ArangoDB : helper DB access
GLib <.. DataRouter : centralized permissions and utilities
Logger <.. DataRouter : request lifecycle logging
Flow diagram for generic data_router request lifecycle loggingflowchart TD
A[Receive HTTP request
client, headers, body] --> B[Resolve client
getUserFromClientID or getUserFromClientID_noexcept]
B --> C[logRequestStarted
client, correlationId,
httpVerb, routePath,
status Started,
description]
C --> D{Handler uses
ArangoDB and helpers}
D --> E[Build result data
or side effects only]
E --> F[Send success response
res.send result or empty]
F --> G[logRequestSuccess
client, correlationId,
httpVerb, routePath,
status Success,
description, extra]
D --> H[Exception thrown
validation, permissions,
DB error, task error]
H --> I[logRequestFailure
client, correlationId,
httpVerb, routePath,
status Failure,
description, extra,
error]
I --> J[handleException
via g_lib.handleException]
J --> K[Error response
status code and body]
subgraph RetryLoop[Optional retry loop
for write conflict 1200]
D --> L{Arango errorNum 1200
and retries left?}
L -- yes --> D
L -- no --> H
end
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Contributor
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- In the /update/batch handler you now set
client = req.queryParams.clientinstead of usingg_lib.getUserFromClientID, which changesclientfrom a user object to a raw id string and will impact downstream permission checks and task initialization that expect a full client object. - In the /export transaction you replaced
results = []withresult = []but still appear to useresultslater (for pushes,res.send(results), and logging), which will either reference an undefined variable or leave the logged/sent value empty; align the variable name consistently (resultvsresults). - Some of the new logging calls have inconsistencies or likely mistakes (e.g., /dep/graph/get logs
httpVerb: "POST"for a GET route, and /update/size logs beforeclientis set and usesextra: "undefined"), so it would be good to standardize these fields and ensureclientandextracarry meaningful values across all handlers.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In the /update/batch handler you now set `client = req.queryParams.client` instead of using `g_lib.getUserFromClientID`, which changes `client` from a user object to a raw id string and will impact downstream permission checks and task initialization that expect a full client object.
- In the /export transaction you replaced `results = []` with `result = []` but still appear to use `results` later (for pushes, `res.send(results)`, and logging), which will either reference an undefined variable or leave the logged/sent value empty; align the variable name consistently (`result` vs `results`).
- Some of the new logging calls have inconsistencies or likely mistakes (e.g., /dep/graph/get logs `httpVerb: "POST"` for a GET route, and /update/size logs before `client` is set and uses `extra: "undefined"`), so it would be good to standardize these fields and ensure `client` and `extra` carry meaningful values across all handlers.
## Individual Comments
### Comment 1
<location> `core/database/foxx/api/data_router.js:1279-1288` </location>
<code_context>
router
.get("/view", function (req, res) {
+ let results = null;
+ let client = null;
try {
</code_context>
<issue_to_address>
**issue (bug_risk):** The /view handler logs `results`, but that variable is never populated and the actual response uses `data`.
In this route, `results` is never updated, so the success log will always show `extra: null` instead of the actual payload. To log the returned data, either assign `results = [data];` before `res.send`, or update the log call to use the real payload, e.g. `extra: { results: [data] }` or `extra: data` directly.
</issue_to_address>
### Comment 2
<location> `core/database/foxx/api/data_router.js:264-273` </location>
<code_context>
try {
- var result = {
+ client = g_lib.getUserFromClientID(req.queryParams.client);
+ logger.logRequestStarted({
+ client: client,
+ correlationId: req.headers["x-correlation-id"],
</code_context>
<issue_to_address>
**suggestion:** Logging for /dep/graph/get uses HTTP verb POST even though this is a GET route.
Please update the logger calls in this handler to use `httpVerb: "GET"` so the logs match the route definition (`router.get("/dep/graph/get", ...)`) and remain accurate for method-based filtering and monitoring.
Suggested implementation:
```javascript
logger.logRequestStarted({
client: client,
correlationId: req.headers["x-correlation-id"],
httpVerb: "GET",
routePath: basePath + "/dep/graph/get",
status: "Started",
description: "Get dependency graph",
});
```
```javascript
logger.logRequestCompleted({
client: client,
correlationId: req.headers["x-correlation-id"],
httpVerb: "GET",
routePath: basePath + "/dep/graph/get",
status: "Completed",
description: "Get dependency graph",
});
```
```javascript
logger.logRequestFailed({
client: client,
correlationId: req.headers["x-correlation-id"],
httpVerb: "GET",
routePath: basePath + "/dep/graph/get",
status: "Failed",
description: "Get dependency graph",
error: err && err.stack ? err.stack : err,
});
```
- Ensure these logger calls are indeed within the `router.get("/dep/graph/get", ...)` handler.
- Do NOT change `httpVerb` for other routes, such as the `/create` POST route you showed; those should remain `"POST"`.
- If there are any other logger calls (e.g., `logRequestStarted`, `logRequestCompleted`, `logRequestFailed`) associated with `/dep/graph/get` that are not shown here, update their `httpVerb` field to `"GET"` in the same way.
</issue_to_address>
### Comment 3
<location> `core/database/foxx/api/data_router.js:30-36` </location>
<code_context>
if (owner_id != client._id) {
if (!g_lib.hasManagerPermProj(client, owner_id)) {
var parent_coll = g_db.c.document(parent_id);
if (!g_lib.hasPermissions(client, parent_coll, g_lib.PERM_CREATE)) {
throw error.ERR_PERM_DENIED;
}
}
}
</code_context>
<issue_to_address>
**suggestion (code-quality):** Merge nested if conditions ([`merge-nested-ifs`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/JavaScript/Default-Rules/merge-nested-ifs))
```suggestion
if (owner_id != client._id && !g_lib.hasManagerPermProj(client, owner_id)) {
var parent_coll = g_db.c.document(parent_id);
if (!g_lib.hasPermissions(client, parent_coll, g_lib.PERM_CREATE)) {
throw error.ERR_PERM_DENIED;
}
}
```
<br/><details><summary>Explanation</summary>Reading deeply nested conditional code is confusing, since you have to keep track of which
conditions relate to which levels. We therefore strive to reduce nesting where
possible, and the situation where two `if` conditions can be combined using
`and` is an easy win.
</details>
</issue_to_address>
### Comment 4
<location> `core/database/foxx/api/data_router.js:1304-1308` </location>
<code_context>
</code_context>
<issue_to_address>
**issue (code-quality):** Use `const` or `let` instead of `var`. ([`avoid-using-var`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/JavaScript/Default-Rules/avoid-using-var))
<details><summary>Explanation</summary>`const` is preferred as it ensures you cannot reassign references (which can lead to buggy and confusing code).
`let` may be used if you need to reassign references - it's preferred to `var` because it is block- rather than
function-scoped.
From the [Airbnb JavaScript Style Guide](https://airbnb.io/javascript/#references--prefer-const)
</details>
</issue_to_address>
### Comment 5
<location> `core/database/foxx/api/data_router.js:1802` </location>
<code_context>
</code_context>
<issue_to_address>
**issue (code-quality):** Use `const` or `let` instead of `var`. ([`avoid-using-var`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/JavaScript/Default-Rules/avoid-using-var))
<details><summary>Explanation</summary>`const` is preferred as it ensures you cannot reassign references (which can lead to buggy and confusing code).
`let` may be used if you need to reassign references - it's preferred to `var` because it is block- rather than
function-scoped.
From the [Airbnb JavaScript Style Guide](https://airbnb.io/javascript/#references--prefer-const)
</details>
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
481f0ce to
db78057
Compare
JoshuaSBrown
requested changes
Jan 27, 2026
Collaborator
JoshuaSBrown
left a comment
There was a problem hiding this comment.
A few minor fixes still needed, nice job.
… github.com:ORNL/DataFed into refactor-DAPS-1522-Data-Router-Logging-Improvements
Collaborator
|
CI is failing I believe because of a dangling edge in the test. Try adding these to the before and after cleanup hooks. const collections = ["u", "d", "c", "repo", "alloc", "loc", "owner", "alias", "item", "dep"]; The error we are seeing is likely due to a dangling edge. |
JoshuaSBrown
approved these changes
Feb 4, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Ticket
#1522
Description
Logging improvements to data_router
Tasks
Summary by Sourcery
Improve observability and permission handling in the data router API endpoints.
New Features:
Enhancements: