Add .explain() DSL predicate for node-level explanations by Copilot · Pull Request #968 · SkBlaz/py3plex

Copilot · 2025-12-30T21:09:09Z

Adds .explain() to the DSL query API for attaching explanations (community membership, top neighbors, layer footprint) to result nodes. Explanations are computed post-filtering for efficiency and expand via to_pandas(expand_explanations=True).

Changes

DSL API

QueryBuilder.explain(): Dual-mode method
- No args: returns execution plan (backward compatible)
- With args: attaches explanations to results (new feature)
ExplainSpec dataclass in AST for configuration
Supports include/exclude lists and per-block configuration
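The dual-mode dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the py3plex implementation: `QueryBuilderSketch`, `ExplainSpecSketch`, and `PlanSketch` are hypothetical stand-ins, and the rule "no arguments means return the plan" is an assumption about how the two modes are distinguished.

```python
from dataclasses import dataclass

# Hypothetical stand-ins; the real py3plex classes may look different.
_DEFAULT_BLOCKS = ("community", "top_neighbors", "layer_footprint")


@dataclass(frozen=True)
class ExplainSpecSketch:
    include: tuple
    neighbors_top: int = 10


@dataclass(frozen=True)
class PlanSketch:
    steps: tuple


class QueryBuilderSketch:
    def __init__(self, steps=(), explain_spec=None):
        self.steps = tuple(steps)
        self.explain_spec = explain_spec

    def limit(self, n):
        # Builders are treated as immutable: every call returns a new object.
        return QueryBuilderSketch(self.steps + (("limit", n),), self.explain_spec)

    def explain(self, neighbors_top=None, include=None, exclude=None):
        # Mode 1 (assumed): no arguments at all -> return the execution plan.
        if neighbors_top is None and include is None and exclude is None:
            return PlanSketch(self.steps)
        # Mode 2: any argument -> attach an explanation spec and keep chaining.
        excluded = set(exclude or ())
        blocks = tuple(b for b in (include or _DEFAULT_BLOCKS) if b not in excluded)
        top = 10 if neighbors_top is None else neighbors_top
        return QueryBuilderSketch(self.steps, ExplainSpecSketch(blocks, top))
```

One consequence of this design is that the return type of `explain()` depends on its arguments, so callers that relied on the plan-returning form keep working only as long as they pass no arguments.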

Explanation Engine (`py3plex/dsl/explain.py`)

Community block: community_id, community_size from network partition
Top neighbors block: Ranked by weight or degree, layer-aware filtering
Layer footprint block: layers_present, n_layers_present for multilayer nodes
Configurable neighbor selection: metric (weight/degree), scope (layer/global), direction
Built-in caching for neighbor lookups
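To make the neighbor and layer-footprint blocks concrete, here is a small sketch over a toy edge list. It does not use py3plex at all: the data, the function names (`build_adjacency`, `top_neighbors`, `layer_footprint`), the output field names inside each neighbor entry, and the cache key are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Toy multilayer edge list: (source, target, layer, weight). Illustrative data only.
EDGES = [
    ("a", "b", "social", 3.0),
    ("a", "c", "social", 1.0),
    ("a", "d", "work", 5.0),
    ("b", "c", "social", 2.0),
]


def build_adjacency(edges):
    """Map (node, layer) -> {neighbor: weight}, treating edges as undirected."""
    adj = defaultdict(dict)
    for u, v, layer, w in edges:
        adj[(u, layer)][v] = w
        adj[(v, layer)][u] = w
    return adj


def top_neighbors(adj, node, layer=None, k=10, metric="weight", cache=None):
    """Rank neighbors by edge weight or by the neighbor's degree.

    layer=None means global scope (all layers); otherwise only that layer.
    """
    key = (node, layer, k, metric)
    if cache is not None and key in cache:
        return cache[key]
    candidates = []
    for (n, lyr), nbrs in adj.items():
        if n != node or (layer is not None and lyr != layer):
            continue
        for nbr, w in nbrs.items():
            # "degree" here means the neighbor's degree within the same layer.
            score = w if metric == "weight" else len(adj.get((nbr, lyr), {}))
            candidates.append({"neighbor": nbr, "layer": lyr, "score": score})
    # Highest score first; ties broken by neighbor name for determinism.
    ranked = sorted(candidates, key=lambda c: (-c["score"], c["neighbor"]))[:k]
    if cache is not None:
        cache[key] = ranked
    return ranked


def layer_footprint(adj, node):
    """List the layers in which the node has at least one edge."""
    layers = sorted({lyr for (n, lyr) in adj if n == node})
    return {"layers_present": layers, "n_layers_present": len(layers)}
```

Passing a plain dict as `cache` lets repeated lookups for the same node, layer, and settings reuse the earlier ranking instead of rescanning the adjacency map.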

Result Integration

Explanations stored as regular attributes in QueryResult
to_pandas(expand_explanations=True) expands complex structures to JSON strings
Executor runs explanation step after LIMIT/SORT (only on final rows)
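The expansion of complex structures to JSON strings might look something like the sketch below. It works on plain row dicts rather than on a QueryResult, and `expand_row` is a hypothetical helper, not a py3plex function; the row contents are made up for the example.

```python
import json


def expand_row(row):
    """Return a flat copy of the row in which list/dict values are JSON strings."""
    return {
        key: json.dumps(value, sort_keys=True) if isinstance(value, (list, dict)) else value
        for key, value in row.items()
    }


# Illustrative row with explanation attributes stored next to regular ones.
row = {
    "id": "a",
    "community_id": 0,
    "top_neighbors": [{"neighbor": "b", "score": 3.0}],
    "layers_present": ["social", "work"],
}
flat = expand_row(row)
print(flat["top_neighbors"])   # a JSON string, safe to store in a single table cell
```

A list of such flat dicts can be handed directly to `pandas.DataFrame(...)`, and `json.loads` recovers the original structure from any expanded cell.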

Usage

from py3plex.dsl import Q, L

result = (
    Q.nodes()
    .from_layers(L["social"])
    .compute("degree", "betweenness")
    .limit(20)
    .explain(
        neighbors_top=10,
        include=["community", "top_neighbors", "layer_footprint"]
    )
    .execute(network)
)

df = result.to_pandas(expand_explanations=True)
# df columns: id, layer, degree, betweenness, community_id, 
#             community_size, top_neighbors, layers_present, n_layers_present

Testing

26 new tests covering all explanation blocks, per-layer grouping, and configuration options
All 67 DSL v2 tests pass (backward compatibility maintained)
Example demonstrating 5 usage patterns

Original prompt

This section details the original issue you should resolve

<issue_title>explain()</issue_title>
<issue_description>
Goal

Add a new DSL predicate/operator: .explain(...) that can be chained before .execute(network)
It attaches “explanations” to each resulting row/entity (typically nodes), enabling:
result.to_pandas(expand_explanations=True) # or expand_explanations="columns"
Must support the flagship usage:

Q.nodes()...limit(20).explain(
neighbors_top=10,
include=["community", "top_neighbors", "layer_footprint"]
).execute(network)

and then:

df = result.to_pandas(expand_uncertainty=True, expand_explanations=True)
df[["id","layer",...,"top_neighbors"]]

High-level behavior

.explain() does NOT change which rows are returned; it adds metadata per row.
Explanations should be computed efficiently post-selection (only for returned rows).
Works with per-layer grouping output: if the result rows contain “layer”, neighbors should be computed within that layer when possible.

Deliverables

DSL API: Query.explain(...) method
Execution plan support for an “ExplainStep” (like ComputeStep/MutateStep)
Explanation engine functions for nodes (phase 1) with extensibility for edges/communities later
Result object storage + to_pandas(expand_explanations=True) support
Tests + docs + one example snippet in README/gallery

TODO 0 — Locate architecture touchpoints

TODO: Identify the DSL query class (e.g., py3plex/dsl/query.py) that implements chainable ops like:
- .where(), .compute(), .mutate(), .sort(), .limit(), .execute()
TODO: Identify how a query is represented internally (list of “steps”? AST nodes? pipeline ops?)
TODO: Identify the result wrapper returned from .execute(network):
- likely has .to_pandas(expand_uncertainty=...) already
TODO: Identify how node ids + layer info are stored in results (columns? internal schema?).

TODO 1 — Define .explain() public API
Implement a method on the Query object:

def explain(
    self,
    neighbors_top: int = 10,
    include: list[str] | None = None,
    exclude: list[str] | None = None,
    neighbors: dict | None = None,
    community: dict | None = None,
    layer_footprint: dict | None = None,
    cache: bool = True,
    as_columns: bool = True,
    prefix: str = "",
) -> "Query":

Semantics

include: which explanation blocks to compute.
Default: ["community", "top_neighbors", "layer_footprint"]
exclude: remove any from include
neighbors_top: max neighbors returned in top_neighbors explanation
neighbors: optional config dict (metric, weight handling, direction, layer behavior)
as_columns: if True, store structured objects in a dedicated explanations field, but also
expose them as top-level columns when expand_explanations=True.
prefix: optionally prefix explanation columns (e.g., "explain__")

Validation rules

TODO: validate include keys against supported set: {"community","top_neighbors","layer_footprint"}.
TODO: if both include and exclude provided, apply exclude after include resolution.
TODO: neighbors_top must be >= 1
TODO: ensure explain() can be called only once OR allow multiple explain() steps to merge config:
- choose one:
  A) multiple calls merge includes and override config
  B) raise if explain already present
  Prefer A for ergonomics, but implement carefully.
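The validation rules and merge option A can be sketched as below. The helper names `resolve_blocks` and `merge_specs` and the dict-based spec are assumptions made for the example; only the supported block names and the rules themselves come from the text above.

```python
SUPPORTED_BLOCKS = ("community", "top_neighbors", "layer_footprint")


def resolve_blocks(include=None, exclude=None, neighbors_top=10):
    """Validate arguments and return the ordered list of blocks to compute."""
    if neighbors_top < 1:
        raise ValueError("neighbors_top must be >= 1")
    requested = list(SUPPORTED_BLOCKS) if include is None else list(include)
    unknown = [b for b in requested if b not in SUPPORTED_BLOCKS]
    if unknown:
        raise ValueError(f"unsupported explanation blocks: {unknown}")
    # exclude is applied after include has been resolved.
    excluded = set(exclude or [])
    resolved = []
    for block in requested:
        if block not in excluded and block not in resolved:
            resolved.append(block)
    return resolved


def merge_specs(first, second):
    """Option A: union the include lists (order preserved); later options win."""
    blocks = list(first["include"])
    blocks += [b for b in second["include"] if b not in blocks]
    merged = {**first, **second}
    merged["include"] = blocks
    return merged
```

Under option A, a second `.explain()` call can only add blocks and override options; removing a block would still require `exclude`.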

Return value

Returns a new Query instance (immutable) OR mutates self depending on your DSL style.
TODO: follow existing patterns for other predicates (e.g., .mutate returns new query?).

TODO 2 — Add pipeline “ExplainStep” to DSL execution plan

TODO: Introduce a new step type/class similar to ComputeStep/MutateStep:
- name: ExplainStep
- stores: resolved include list + options (neighbors_top, configs, prefix, cache, etc.)
TODO: Ensure query serialization / repr includes explain step for debugging.
TODO: Ensure .execute(network) recognizes ExplainStep in the step pipeline:
- Recommended: run ExplainStep after LIMIT/SORT and after all row-reducing steps
  so only final rows are explained.

Execution ordering

TODO: Decide where ExplainStep runs relative to .per_layer grouping:
- After end_grouping() results exist (rows with layer field)
- Must be able to compute explanations per row, potentially using row["layer"].
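The ordering requirement (explain only the rows that survive SORT and LIMIT, using the row's layer when present) can be illustrated with a toy executor. `execute_sketch` and `explain_fn` are hypothetical names; the real executor works on its own step and result types.

```python
def execute_sketch(rows, sort_key=None, limit=None, explain_fn=None):
    """Run row-reducing steps first, then explain only the surviving rows."""
    if sort_key is not None:
        rows = sorted(rows, key=lambda r: r[sort_key], reverse=True)
    if limit is not None:
        rows = rows[:limit]
    if explain_fn is not None:
        # explain_fn receives the node id and the row's layer (None if absent)
        # and returns a dict of extra attributes; the set of rows is unchanged.
        rows = [{**row, **explain_fn(row["id"], row.get("layer"))} for row in rows]
    return rows


# Usage: count how often the (fake) explanation function is called.
calls = []


def fake_explain(node_id, layer):
    calls.append((node_id, layer))
    return {"n_layers_present": 1}


rows = [{"id": i, "layer": "social", "degree": i} for i in range(100)]
out = execute_sketch(rows, sort_key="degree", limit=3, explain_fn=fake_explain)
print(len(calls))  # one call per returned row, not per candidate row
```

Running the explanation step last keeps its cost proportional to the number of returned rows rather than to the size of the network.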

TODO 3 — Explanation engine (Phase 1: node explanations)
Implement in a new module, e.g. py3plex/dsl/explain.py or py3plex/analysis/explain.py.

Public entrypoint
def explain_rows(
network,
rows: list[dict] | pandas.DataFrame,
*...

Fixes explain() #967


- Added ExplainSpec dataclass to AST for explanation configuration
- Implemented explain() method in QueryBuilder with validation
- Created py3plex/dsl/explain.py with explanation engine for nodes
  - Community explanation (community_id, community_size)
  - Top neighbors explanation (ranked by weight or degree)
  - Layer footprint explanation (layers_present, n_layers_present)
- Integrated explanation step into executor (runs after LIMIT)
- Added expand_explanations parameter to QueryResult.to_pandas()
- Explanations support per-layer grouping context

Co-authored-by: SkBlaz <10035780+SkBlaz@users.noreply.github.com>

- Updated explain() to handle two modes:
  1. No args: returns ExplainQuery (execution plan, backward compat)
  2. With args: attaches explanations to results (new feature)
- Removed duplicate explain() method
- Added comprehensive tests covering all features
- All 26 explain tests pass
- Backward compatibility maintained (67 DSL v2 tests pass)

Co-authored-by: SkBlaz <10035780+SkBlaz@users.noreply.github.com>

- Created example_dsl_explain.py demonstrating all features
- Shows basic usage, custom explanation blocks, flagship pattern
- Includes neighbor ranking options and exclusion patterns
- All examples run successfully with correct output

Co-authored-by: SkBlaz <10035780+SkBlaz@users.noreply.github.com>

Initial plan

5eb8073

Copilot AI assigned Copilot and SkBlaz Dec 30, 2025

Copilot started work on behalf of SkBlaz December 30, 2025 21:09 View session

Copilot AI and others added 3 commits December 30, 2025 21:19

Copilot AI changed the title ~~[WIP] Add new DSL predicate operator .explain(...)~~ Add .explain() DSL predicate for node-level explanations Dec 30, 2025

Copilot AI requested a review from SkBlaz December 30, 2025 21:29

Copilot finished work on behalf of SkBlaz December 30, 2025 21:30

SkBlaz marked this pull request as ready for review December 31, 2025 08:45

SkBlaz merged commit 6c904a4 into master Dec 31, 2025
31 checks passed

SkBlaz merged 4 commits into master from copilot/add-explain-dsl-predicate
