Skip to content

Route string/bool values to Pluto config, handle unmapped types#120

Open
asaiacai wants to merge 5 commits into
mainfrom
claude/confident-lamport-2yeUY
Open

Route string/bool values to Pluto config, handle unmapped types#120
asaiacai wants to merge 5 commits into
mainfrom
claude/confident-lamport-2yeUY

Conversation

@asaiacai

@asaiacai asaiacai commented Jun 5, 2026

Copy link
Copy Markdown

Description

This PR improves how the wandb compatibility shim forwards logged values to Pluto by implementing proper value routing and handling for unmapped types.

Key Changes

  1. Value Routing for Pluto:

    • Numeric metrics: int/float and any scalar exposing .item() (numpy/torch/etc.) → Pluto metrics (time-series)
    • Strings and bools: → Pluto config (latest-wins, queryable via get_run().config). This enables the /resume-crashed-run use case where checkpoint paths and other string metadata need to be readable from Pluto
    • wandb media and lists: → converted Pluto media
    • JSON-serializable values (dicts, None, nested structures): → Pluto config as fallback
    • Unmapped types: Reported to Sentry telemetry once per key (maintainer-coverage signal, not user-facing)
  2. Improved Scalar Detection:

    • Replaced _is_torch_tensor_scalar() with _as_scalar_number() that mirrors Pluto core's own op._process_log_item_sync
    • Now accepts any scalar exposing a callable .item() method (numpy scalars, 0-d arrays, custom tensor-like objects)
    • Properly excludes bool and str from scalar handling
  3. Unforwardable Value Handling:

    • Added _handle_unforwardable() method to gracefully handle values with no metric/media/config mapping
    • Attempts JSON serialization as last resort (preserves dicts, None, nested primitives)
    • Reports non-serializable types to Sentry once per key for maintainer awareness
    • No user-facing warnings (not a user-actionable problem)
  4. Config Synchronization:

    • Added pluto_config dict to collect string/bool/JSON-serializable values
    • Calls pluto_run.update_config() after logging metrics
    • Includes error handling for config sync failures

Testing

Added comprehensive test coverage in test_wandb_compat.py:

  • test_log_routes_strings_to_config_not_metrics: Verifies strings land in config, not metrics
  • test_log_forwards_numpy_scalars_as_metrics: Ensures numpy scalars reach Pluto
  • test_log_forwards_any_item_scalar_like_pluto_core: Guards against being stricter than Pluto core
  • test_log_does_not_treat_failing_item_as_scalar: Handles non-scalars with .item() gracefully
  • test_unforwardable_value_alerts_sentry_once_not_user: Verifies Sentry alerting and deduplication
  • test_json_serializable_unmapped_value_falls_back_to_config: Confirms dict/None fallback to config

Tested (run the relevant ones):

  • Code formatting: bash format.sh
  • New unit tests for value routing and type handling

https://claude.ai/code/session_01JzBDybRDXt1TLnsQAxhKqz


Note

Medium Risk
Changes dual-logging routing in a hot path (wandb.log); mis-routing could affect resume/metadata in Pluto while wandb stays correct, but failures are contained in try/except and covered by new unit tests.

Overview
The wandb→Pluto shim WandbRunWrapper.log now splits each wandb.log payload before forwarding: numbers (including any scalar with a working .item(), not only torch) go to pluto_run.log, while str, bool, and JSON-serializable leftovers go to pluto_run.update_config (latest-wins), so checkpoint paths and similar metadata are readable from Pluto config (e.g. resume flows).

Values that are not metrics, media, or JSON config are handled by new _handle_unforwardable: serializable dicts/None/primitive lists are kept as config; otherwise the Pluto copy is skipped, debug is logged once per key, and Sentry gets a one-time maintainer alert—wandb behavior is unchanged.

Reviewed by Cursor Bugbot for commit 3d52d9c. Configure here.

claude added 4 commits June 5, 2026 00:30
WandbRunWrapper.log() pre-filtered logged values and only forwarded
Python int/float, torch scalar tensors, and wandb media to Pluto.
String values had no branch (_convert_wandb_to_pluto returns None for
str), so they were silently dropped.

Concretely, a checkpoint-metadata logger that does:

    wandb.log({'checkpoint/step': step,        # int  -> reached Pluto
               'checkpoint/epoch': epoch,       # int  -> reached Pluto
               'checkpoint/r2_path': r2_path,    # str  -> DROPPED
               'checkpoint/local_path': path},   # str  -> DROPPED
              step=step)

lost the two string paths while the numeric step/epoch came through
fine. Those string paths are exactly what a resume flow needs to know
which object to stage.

Now str/bool values route to update_config() (latest-wins, queryable
via get_run().config), mirroring where wandb places loose strings (run
summary/overview). A resume skill can read the most recent
checkpoint/r2_path off the run config instead of scanning storage.

Also forward numpy scalars (np.generic, excluding np.bool_) as metrics
via .item(). This is defensive hardening, not part of the above bug:
the shim was stricter than Pluto's own log(), which already accepts
anything exposing .item(). Frameworks that log np.int64/np.float32
were dropped here even though Pluto core would have kept them.
Generalize the shim's numeric forwarding to accept any value exposing a
callable .item() returning a number, replacing the narrower torch-tensor +
numpy-generic checks. This mirrors Pluto's own log()
(op._process_log_item_sync), which already forwards anything with .item().

Motivation: a checkpoint logger annotated as log_checkpoint(step: int,
epoch: int, ...) can still pass a non-plain-int at runtime (np.int64, a 0-d
tensor, a framework scalar). The old shim dropped those even though Pluto
core would have kept them, so a logged 'epoch' could silently never reach
Pluto. A failing .item() (multi-element array) is treated as not-a-scalar
and ignored, same as Pluto would fail it. bool/str remain routed to config.
The shim only forwarded recognized types (numbers, wandb media, str/bool)
and silently dropped everything else — dicts, None, raw/multi-element
tensors, numpy arrays, unconvertible wandb media (Html/Object3D/...),
custom objects. Silent drops are what made missing data (e.g. a whole
checkpoint metadata call) so hard to diagnose: no error, no warning, just
absent.

Add a 'preserve-what-we-can, otherwise fail loud' fallback in
_handle_unforwardable():
- JSON-serializable leftovers (nested dicts/lists of primitives, None,
  etc.) are preserved as Pluto config, mirroring how wandb keeps loose
  values in the run summary.
- Anything else warns ONCE per key (WARNING, naming key + type) instead of
  vanishing. The value still reaches W&B; only the Pluto copy is dropped,
  and now visibly.

Never raises — wandb behavior is unaffected.
The previous change warned users (WARNING log) when a value had no Pluto
mapping. But that's a gap in OUR type coverage, not a user error — people
migrating away from wandb shouldn't be nagged about types only the package
maintainers can fix.

Reroute the signal: genuinely unforwardable values (non-JSON-serializable,
no metric/media mapping) now emit a Sentry telemetry alert via the SDK's
isolated client (pluto/sentry.py), once per key, grouped by type name so
we can see which unhandled types show up in the wild and add coverage. The
local log drops to debug. JSON-serializable leftovers still fall back to
config silently. Sentry honors PLUTO_DISABLE_TELEMETRY and swallows all
errors, so this never affects the user or wandb.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes using default effort and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 3d52d9c. Configure here.

Comment thread pluto/compat/wandb.py
logger.debug(
f'pluto.compat.wandb: Failed to sync string/bool '
f'values to Pluto config: {e}'
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pluto log failure skips config

Medium Severity

update_config runs in the same outer try as pluto_run.log. If metric/media logging raises, execution jumps to the broad handler and string or bool config updates from the same wandb.log call are never sent to Pluto.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 3d52d9c. Configure here.

Comment thread pluto/compat/wandb.py Outdated
try:
json.dumps(value)
return True
except (TypeError, ValueError):

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Config gate rejects OmegaConf values

Medium Severity

_is_json_serializable uses plain json.dumps, while update_config normalizes via to_native_config (including OmegaConf). Logged DictConfig or nested OmegaConf nodes fail the JSON check and are treated as unforwardable instead of being stored as config.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 3d52d9c. Configure here.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances the W&B compatibility shim by implementing robust value routing to Pluto. Specifically, it routes string and boolean values to the run configuration, supports numpy scalars as metrics, and provides fallback handling for unforwardable values (either saving them as config if JSON-serializable or reporting them to Sentry). The reviewer's feedback identifies a critical performance bottleneck where logging string or boolean values at every step triggers redundant, synchronous configuration updates. The reviewer suggests caching the last logged configuration values to only sync actual changes, and provides a unit test to verify this optimization.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread pluto/compat/wandb.py
Comment on lines +95 to +97
# Keys we've already warned about being unforwardable to Pluto, so a
# value logged every step warns once rather than spamming the logs.
self._unforwardable_warned: set = set()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Performance Bottleneck: Redundant Config Updates

Logging string or boolean values (such as training phase, status, or checkpoint paths) at every step is a very common pattern in machine learning training loops. Currently, every call to log() containing a string or boolean will trigger a call to self._pluto_run.update_config(pluto_config).

If the sync process is disabled or not yet initialized, update_config performs a synchronous, blocking HTTP POST request to the server. Even when the sync process is enabled, it triggers a synchronous write to the local SQLite database. Doing this at every single step will severely degrade training performance due to network or disk I/O bottlenecks.

To prevent this, we should cache the last logged config values and only send updates when a value actually changes.

Suggested change
# Keys we've already warned about being unforwardable to Pluto, so a
# value logged every step warns once rather than spamming the logs.
self._unforwardable_warned: set = set()
# Keys we've already warned about being unforwardable to Pluto, so a
# value logged every step warns once rather than spamming the logs.
self._unforwardable_warned: set = set()
# Cache of the last logged config values to avoid redundant updates.
self._last_logged_config: Dict[str, Any] = {}

Comment thread pluto/compat/wandb.py Outdated
Comment on lines +216 to +226
for key, value in data.items():
if isinstance(value, (int, float)):
if isinstance(value, bool):
# bool is a subclass of int, but Pluto drops bool
# metrics — surface it as config so it isn't lost.
pluto_config[key] = value
elif isinstance(value, (int, float)):
pluto_data[key] = value
elif _is_torch_tensor_scalar(value):
pluto_data[key] = value.item()
elif (num := _as_scalar_number(value)) is not None:
pluto_data[key] = num
elif isinstance(value, str):
pluto_config[key] = value

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Only add string and boolean values to pluto_config if they have actually changed from their previously logged values.

Suggested change
for key, value in data.items():
if isinstance(value, (int, float)):
if isinstance(value, bool):
# bool is a subclass of int, but Pluto drops bool
# metrics — surface it as config so it isn't lost.
pluto_config[key] = value
elif isinstance(value, (int, float)):
pluto_data[key] = value
elif _is_torch_tensor_scalar(value):
pluto_data[key] = value.item()
elif (num := _as_scalar_number(value)) is not None:
pluto_data[key] = num
elif isinstance(value, str):
pluto_config[key] = value
for key, value in data.items():
if isinstance(value, bool):
# bool is a subclass of int, but Pluto drops bool
# metrics — surface it as config so it isn't lost.
if self._last_logged_config.get(key) != value:
pluto_config[key] = value
elif isinstance(value, (int, float)):
pluto_data[key] = value
elif (num := _as_scalar_number(value)) is not None:
pluto_data[key] = num
elif isinstance(value, str):
if self._last_logged_config.get(key) != value:
pluto_config[key] = value

Comment thread pluto/compat/wandb.py
Comment on lines +256 to +263
if pluto_config:
try:
self._pluto_run.update_config(pluto_config)
except Exception as e:
logger.debug(
f'pluto.compat.wandb: Failed to sync string/bool '
f'values to Pluto config: {e}'
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Update the local config cache self._last_logged_config once the config has been successfully synced.

Suggested change
if pluto_config:
try:
self._pluto_run.update_config(pluto_config)
except Exception as e:
logger.debug(
f'pluto.compat.wandb: Failed to sync string/bool '
f'values to Pluto config: {e}'
)
if pluto_config:
try:
self._pluto_run.update_config(pluto_config)
self._last_logged_config.update(pluto_config)
except Exception as e:
logger.debug(
f'pluto.compat.wandb: Failed to sync string/bool '
f'values to Pluto config: {e}'
)

Comment thread pluto/compat/wandb.py Outdated
Comment on lines +290 to +292
if _is_json_serializable(value):
pluto_config[key] = value
return

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Only add fallback JSON-serializable values to pluto_config if they have changed from their previously logged values.

Suggested change
if _is_json_serializable(value):
pluto_config[key] = value
return
if _is_json_serializable(value):
if self._last_logged_config.get(key) != value:
pluto_config[key] = value
return

Comment on lines +368 to +377
def test_json_serializable_unmapped_value_falls_back_to_config():
"""A dict/None with no metric mapping is preserved as config, not dropped."""
wrapper, pluto_run = _make_wrapper()

wrapper.log({'meta/info': {'kind': 'resume', 'attempt': 3}, 'note': None})

cfg = pluto_run.update_config.call_args.args[0]
assert cfg['meta/info'] == {'kind': 'resume', 'attempt': 3}
assert cfg['note'] is None
assert not pluto_run.log.called # no numeric metrics in this call

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Add a unit test to verify that redundant config updates are skipped to avoid performance bottlenecks.

def test_json_serializable_unmapped_value_falls_back_to_config():
    """A dict/None with no metric mapping is preserved as config, not dropped."""
    wrapper, pluto_run = _make_wrapper()

    wrapper.log({'meta/info': {'kind': 'resume', 'attempt': 3}, 'note': None})

    cfg = pluto_run.update_config.call_args.args[0]
    assert cfg['meta/info'] == {'kind': 'resume', 'attempt': 3}
    assert cfg['note'] is None
    assert not pluto_run.log.called  # no numeric metrics in this call


def test_log_skips_redundant_config_updates():
    """Verify that redundant config updates are skipped to avoid performance bottlenecks."""
    wrapper, pluto_run = _make_wrapper()

    # First log: config is updated
    wrapper.log({'phase': 'train', 'loss': 0.5})
    assert pluto_run.update_config.call_count == 1

    # Second log with same config value: update_config should NOT be called again
    pluto_run.update_config.reset_mock()
    wrapper.log({'phase': 'train', 'loss': 0.4})
    assert pluto_run.update_config.call_count == 0

    # Third log with changed config value: update_config should be called
    wrapper.log({'phase': 'val', 'loss': 0.3})
    assert pluto_run.update_config.call_count == 1
    assert pluto_run.update_config.call_args.args[0] == {'phase': 'val'}

asaiacai pushed a commit that referenced this pull request Jun 6, 2026
Minimal end-to-end script mirroring linum's WandbLogger.log_checkpoint:
numeric step/epoch (as np.int64) plus string r2_path/local_path logged in
one wandb.log call, then read back from Pluto. Asserts step/epoch land as
metrics and the paths land in config. Fails on pre-fix builds, passes on
this branch. Uses wandb mode=disabled so no W&B account is needed.
@asaiacai asaiacai force-pushed the claude/confident-lamport-2yeUY branch from 8864c4c to 3d52d9c Compare June 6, 2026 00:09
Three fixes from PR #120 review (Cursor Bugbot + Gemini):

1. Config no longer skipped on metric-log failure. update_config() shared
   the outer try with pluto_run.log(); a metric/media logging exception
   jumped to the handler and dropped str/bool config from the same call.
   Metrics and config now send in independent try blocks.

2. OmegaConf values are storable as config. The fallback gate used plain
   json.dumps, but update_config normalizes via to_native_config (which
   deep-converts DictConfig/ListConfig). A logged OmegaConf node was wrongly
   treated as unforwardable (dropped + Sentry-alerted). _config_storable_value
   now mirrors update_config's normalization, so OmegaConf is kept while
   tensors/ndarrays/custom objects still fall through to Sentry.

3. Skip redundant config writes. Logging an unchanged str/bool/config value
   every step re-triggered update_config (a SQLite write) each step. Dedup
   against the last synced value via self._last_logged_config, updated only
   on a successful update_config. Uses a _MISSING sentinel so a first-time
   None is still sent (None != missing).

Tests: dedup skip/change behavior, OmegaConf-node fallback.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants