fix(service): race condition in force_reload() by Keshoid · Pull Request #98 · RSquad/ton-rust-node

Keshoid · 2026-04-16T22:53:30Z

Summary

Fix a race condition in force_reload(): the config snapshot was captured before calling reload(), so if a concurrent REST API request mutated the config while reload() was rebuilding runtime state (vault, RPC, wallets, pools), the runtime state ended up built from a stale config
Split reload into two clear methods: reload_state() (reads current config from store, used by force_reload) and reload(config) (accepts an external config, used for file-change reloads)
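
To make the split concrete, here is a minimal synchronous sketch of the stale-snapshot problem and the two-method layout. It is not the code from this PR: `Config`, `RuntimeState`, the `store`/`state` fields, `set()` and `built_from()` are hypothetical stand-ins, the real methods are async, and the assumption that rebuilds are serialized by a lock is mine.

```rust
use std::sync::{Arc, Mutex, RwLock};

/// Hypothetical, heavily simplified config (the real type is not shown in this PR).
#[derive(Clone, Debug, PartialEq)]
struct Config {
    elections_enabled: bool,
}

/// Stand-in for the runtime state (vault, RPC, wallets, pools).
/// It only records which config it was built from.
#[derive(Clone, Debug, PartialEq)]
struct RuntimeState {
    built_from: Config,
}

struct RuntimeConfig {
    store: RwLock<Arc<Config>>,         // current config, mutated by REST handlers
    state: Mutex<Option<RuntimeState>>, // assumption: rebuilds are serialized by this lock
}

impl RuntimeConfig {
    fn new(config: Config) -> Self {
        Self {
            store: RwLock::new(Arc::new(config)),
            state: Mutex::new(None),
        }
    }

    fn get(&self) -> Arc<Config> {
        Arc::clone(&self.store.read().unwrap())
    }

    /// Hypothetical mutation entry point, standing in for a REST handler.
    fn set(&self, config: Config) {
        *self.store.write().unwrap() = Arc::new(config);
    }

    /// Rebuild from an externally supplied config (file-change reloads).
    fn reload(&self, config: Config) {
        let mut state = self.state.lock().unwrap();
        *state = Some(RuntimeState { built_from: config });
    }

    /// Rebuild from whatever the store holds at rebuild time.
    fn reload_state(&self) {
        let mut state = self.state.lock().unwrap();
        // The config is read only after the rebuild lock is held.
        let config = (*self.get()).clone();
        *state = Some(RuntimeState { built_from: config });
    }

    fn force_reload(&self) {
        self.reload_state();
    }

    fn built_from(&self) -> Option<Config> {
        self.state.lock().unwrap().as_ref().map(|s| s.built_from.clone())
    }
}

/// Replays the race deterministically: returns (stale result, fresh result).
fn demo() -> (bool, bool) {
    let rc = RuntimeConfig::new(Config { elections_enabled: false });
    // Old pattern: snapshot first, then a concurrent mutation lands.
    let snapshot = (*rc.get()).clone();
    rc.set(Config { elections_enabled: true });
    rc.reload(snapshot);
    let stale = rc.built_from().unwrap().elections_enabled;
    // New pattern: force_reload() reads the store itself.
    rc.force_reload();
    let fresh = rc.built_from().unwrap().elections_enabled;
    (stale, fresh)
}
```

In this sketch `demo()` builds the state from the old value when the snapshot is passed in, and from the mutated value when `force_reload()` reads the store itself.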

Copilot

Pull request overview

This PR aims to ensure runtime state/task behavior updates correctly when elections-related configuration changes, and improves CI artifact collection for nodectl e2e runs.

Changes:

Add an info log when a forced runtime cache reload is triggered.
Change elections include/exclude REST handlers to signal config_changed instead of directly restarting the elections task.
Update the nodectl e2e GitHub Actions workflow to always upload logs and set artifact retention/behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `src/node-control/service/src/runtime_config.rs` | Adds logging around `force_reload()` to improve observability of cache reload triggers. |
| `src/node-control/service/src/http/http_server_task.rs` | Routes elections include/exclude mutations through the central `config_changed` notification mechanism. |
| `.github/workflows/nodectl-e2e.yml` | Adjusts e2e artifact upload to run always and adds retention/handling options. |

Comments suppressed due to low confidence (1)

.github/workflows/nodectl-e2e.yml:78

The e2e runner (run_singlehost_nodectl.py) writes the bootstrap/script log to src/node/tests/test_run_net_py/singlehost-bootstrap.log by default (not under tmp/). This workflow removed the bootstrap log from the uploaded artifact, so failures/successes may be harder to diagnose. Consider adding src/node/tests/test_run_net_py/singlehost-bootstrap.log to the path list (or otherwise ensure the script log is included).

          if-no-files-found: warn
          path: |
            src/node/tests/test_run_net_py/tmp/nodectl-config.json
            src/node/tests/test_run_net_py/tmp/nodectl-service.log

Copilot · 2026-04-16T22:57:55Z

    /// Use after REST mutations that change structural config (entities, endpoints).
    pub async fn force_reload(&self) -> anyhow::Result<()> {
        let config = (*self.get()).clone();
+        tracing::info!("force reloading is triggered");


The log message reads a bit ungrammatically ("force reloading is triggered") and is likely to become a frequently-emitted info log. Consider rewording to something clearer/consistent with other logs (e.g., "force reload triggered") and/or lowering to debug if this is primarily for troubleshooting.

Suggested change

-        tracing::info!("force reloading is triggered");
+        tracing::info!("force reload triggered");

log removed

Copilot · 2026-04-16T22:57:55Z

            }
        })
        .map_err(|e| AppError::internal(e.to_string()))?;



Switching from directly restarting elections_task to config_changed.notify_one() means applying include/exclude now depends on the service loop successfully completing runtime_cfg.force_reload() before tasks are restarted. Since force_reload() reopens the vault and reconnects RPC, a transient vault/RPC failure can prevent elections include/exclude from taking effect even though the config update succeeded. Consider keeping a direct elections-task restart for this endpoint (or using a lighter signal that restarts tasks without a full cache rebuild) so enable/disable can still apply when reload fails for unrelated reasons.

Suggested change

+    if let Err(e) = state.elections_task.restart().await {
+        tracing::warn!(
+            "failed to directly restart elections task after exclude update: {e}"
+        );
+    }

Copilot · 2026-04-16T22:57:55Z


-    let task = state.elections_task.clone();
-    tokio::spawn(async move {
-        let _ = task.restart().await;
-    });
+    state.config_changed.notify_one();



Same concern as in the exclude handler: using config_changed.notify_one() makes the include operation’s effect contingent on a successful runtime_cfg.force_reload() cycle (vault/RPC reconnect). If reload fails, tasks won’t restart and the new bindings enablement may not take effect until a later reload succeeds. Consider restarting the elections task directly for this change (or separating “restart tasks” from “rebuild caches”).
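
The idea of separating "restart tasks" from "rebuild caches", raised in both review comments, could look roughly like the sketch below. Everything in it is hypothetical: the `Signal` enum, `handle()` and the closure-based reload are illustrative stand-ins, not types from this repository, and the real service loop is async and driven by `config_changed`.

```rust
/// Hypothetical signal kinds a handler could send to the service loop.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Signal {
    /// Structural change: rebuild caches, then restart tasks.
    ConfigChanged,
    /// Lightweight change (e.g. elections include/exclude): restart tasks only.
    RestartTasks,
}

/// One iteration of a simplified service loop.
/// `reload` stands in for the full cache rebuild; `restarts` counts task restarts.
fn handle(
    signal: Signal,
    reload: &mut dyn FnMut() -> Result<(), String>,
    restarts: &mut u32,
) -> Result<(), String> {
    if signal == Signal::ConfigChanged {
        // A failure here returns early, so the restart below is skipped.
        reload()?;
    }
    *restarts += 1;
    Ok(())
}

/// Simulates a reload that always fails (e.g. the vault is unreachable).
fn demo_signals() -> (bool, u32) {
    let mut failing_reload = || Err::<(), String>("vault unavailable".to_string());
    let mut restarts = 0;
    let full = handle(Signal::ConfigChanged, &mut failing_reload, &mut restarts);
    let light = handle(Signal::RestartTasks, &mut failing_reload, &mut restarts);
    (full.is_err() && light.is_ok(), restarts)
}
```

With a failing reload, the `ConfigChanged` path restarts nothing, while the `RestartTasks` path still restarts tasks once. The trade-off is a second signal path to maintain.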

linear · 2026-04-17T11:20:40Z

SMA-72 Fix TOCTOU race in config mutation handlers

Keshoid added 2 commits April 17, 2026 01:52

fix(service): runtime state reload on elections include/exclude

d5c7e20

chore: always upload artifacts for nodectl e2e job

c4a28be

Copilot AI review requested due to automatic review settings April 16, 2026 22:53

Keshoid added the run-e2e label Apr 16, 2026

Copilot started reviewing on behalf of Keshoid April 16, 2026 22:53

Copilot AI reviewed Apr 16, 2026

debug: add debug printing

d8a47b8

Keshoid added run-e2e and removed run-e2e labels Apr 17, 2026

debug: add more logs

3b68ea8

Keshoid removed the run-e2e label Apr 17, 2026

Merge branch 'release/nodectl/v0.4.0' into fix/task-reload

c936101

Keshoid added the run-e2e label Apr 17, 2026

fix(service): race condition on reload

4306df6

Keshoid added run-e2e and removed run-e2e labels Apr 17, 2026

refactor(service): remove debug logs

103f1d1

Keshoid added the run-e2e label Apr 17, 2026

Keshoid changed the title ~~Fix runtime state reload on elections settings change~~ fix(service): race condition in force_reload() Apr 17, 2026

Keshoid added run-e2e and removed run-e2e labels Apr 17, 2026

mrnkslv approved these changes Apr 17, 2026

Keshoid merged commit 6bfa77d into release/nodectl/v0.4.0 Apr 17, 2026
11 of 12 checks passed

Keshoid deleted the fix/task-reload branch April 17, 2026 12:39

Keshoid merged 7 commits into release/nodectl/v0.4.0 from fix/task-reload

3 participants
