Skip to content

refactor(hop): replace direct .merge() with safe_merge() for defensive engine consistency #1166

@lmeyerov

Description

@lmeyerov

Summary

graphistry/compute/hop.py has four direct .merge() calls in the hop pipeline (lines ~659, ~675, ~734, ~740) that bypass the engine-aware safe_merge() helper used everywhere else in the same function.

Affected lines

# line ~659 (min_hops path)
valid_endpoint_edges_with_nodes = valid_endpoint_edges.merge(...)

# line ~675 (min_hops path)
edge_records_with_endpoints = edge_hop_records.merge(...)

# line ~734 (track_edge_hops=True path)
final_edges = edges_indexed.merge(edge_labels_source, on=EDGE_ID, how='inner')

# line ~740 (default path)
final_edges = edges_indexed.merge(matches_edges, on=EDGE_ID, how='inner')

Why it matters

Today not a silent GPU→CPU downgrade (all operands come from the same engine-coerced source). However, safe_merge(engine=engine_concrete) provides two defensive guarantees raw .merge() does not:

  1. Mixed-type protection — if a future code path introduces mixed-engine operands, safe_merge normalises instead of raising a TypeError mid-pipeline.
  2. Explicit engine contract — signals intent and guards against engine drift as code evolves.

The rest of hop.py (lines ~866, ~872, ~878, ~907, ~926) already uses safe_merge; these four are inconsistent.

Suggested fix

Replace each with safe_merge(..., on=EDGE_ID, how='inner', engine=engine_concrete).

Context

Found during GPU audit on branch refactor/coerce-unification (issue #1133 Polars support). No regression observed today; filing for hygiene.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions