
Implement probabilistic community detection with uncertainty quantification for DSL v2#1000

Merged
SkBlaz merged 7 commits into master from copilot/upgrade-dsl-v2-community-results
Jan 6, 2026

Conversation

Contributor

Copilot AI commented Jan 6, 2026

Probabilistic Community Detection for DSL v2 - COMPLETE ✅

All Phases Complete ✅

Phases 0-9: All implementation, documentation, examples, and testing complete

Recent Fix (CI Issue)

  • Fixed AttributeError in example_dsl_probabilistic_communities.py
    • Issue: network._probabilistic_community_result can be either dict or object
    • Root cause: Executor stores result directly first time, but as dict with 'latest' key on subsequent calls
    • Fix: Added type check to handle both dict (with 'latest' key) and direct object storage
    • Added null check to avoid AttributeError when prob_result is None
    • Proper indentation for nested if blocks
    • Verified: Example now runs successfully in CI environment

Example Output (Verified Working)

Example 6: Advanced - Full Probabilistic Result Object
================================================================================

Probabilistic community result:
  Number of nodes: 7
  Number of partitions: 25
  Is deterministic: False

Community stability metrics:
  Community 0:
    Persistence: 1.000
    Size (mean ± std): 3.4 ± 0.6
    Coefficient of variation: 0.186

Ready for Review ✅

All CI issues resolved. Example runs successfully without AttributeError.

Original prompt

This section details the original issue you should resolve

<issue_title>probCom</issue_title>
<issue_description>You are an expert Python systems researcher and network scientist working inside the py3plex repository.

Constraint checklist (must obey):

NO new .md files.

You MUST update AGENTS.md (existing file only), relevant .rst docs, examples, and property-based tests.

Preserve backward compatibility.

DSL v2 already has a community operator — use/extend it; do not invent a new DSL entrypoint.


GOAL

Upgrade DSL v2 community results from “hard labels + maybe repeated runs” into a probabilistic, uncertainty-native community abstraction that is:

queryable inside DSL v2 (via the existing community operator),

provenance-complete and reproducible,

serializable/exportable (pandas / dict / R interop),

tested with strong invariants (incl. property-based tests),

backward compatible.

This must go beyond “run multiple seeds and summarize modularity”:

provide per-node membership distributions (soft memberships or calibrated posteriors),

node-level uncertainty (entropy, CI-like summaries),

community-level stability (split/merge likelihood, persistence),

partition-space variability metrics (VI/ARI/NMI distributions).
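As a toy illustration of one partition-space variability metric, variation of information (VI) can be computed pairwise over an ensemble as below; ARI/NMI distributions would be gathered analogously (e.g. via `sklearn.metrics`). This is a self-contained sketch, not py3plex code.

```python
from collections import Counter
from math import log

def variation_of_information(a, b):
    """VI between two partitions given as label lists over the same nodes."""
    n = len(a)
    pa, pb = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    h_a = -sum(c / n * log(c / n) for c in pa.values())
    h_b = -sum(c / n * log(c / n) for c in pb.values())
    mi = sum(c / n * log((c / n) / ((pa[x] / n) * (pb[y] / n)))
             for (x, y), c in joint.items())
    return h_a + h_b - 2 * mi

# Pairwise VI distribution across a toy ensemble of three runs
ensemble = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
vis = [variation_of_information(p, q)
       for i, p in enumerate(ensemble)
       for q in ensemble[i + 1:]]
```

Identical partitions give VI = 0, so the spread of `vis` directly summarizes how much the ensemble disagrees.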


PLAN (EXECUTION ORDER)

Phase 0 — Repo Recon & Ground Truth

  1. Locate the DSL v2 community operator implementation:

Search for community(, .community(), Q.communities(), community_operator, CommunityQuery, CommunityBuilder, or equivalent.

  2. Identify:

What the operator currently returns (QueryResult? partition vector? community ids? per-layer?).

Where community algorithms live (e.g. py3plex/algorithms/community_detection/...).

How UQ is currently wired (.uq(...), resampling strategies, provenance hooks).

  3. Write a short internal design note in code comments (NOT new md) capturing current behavior + desired deltas.

Deliverable: clear mapping of “what exists” → “what to extend”.


CORE DESIGN (REQUIRED)

  1. Define a Probabilistic Community Result Type (Non-breaking)

Create a new internal result container that can represent both:

a hard partition (node → label),

a distribution over partitions / soft memberships.

Requirements:

Backward compatible view:

labels or partition behaves like before (node → most likely label).

Probabilistic view:

membership_probs[node][community_id] = p

node_entropy[node]

stability metrics per community and/or per node

partition_variability (distribution summaries)

Implementation guidance:

Prefer placing this under existing uncertainty/community stats modules (e.g. py3plex/uncertainty/ or a community stats module) rather than inventing a new top-level package.

Keep it pickleable and JSON-serializable.

Minimum API surface (suggested):

.labels (hard labels, deterministic)

.probs (mapping node → {community→prob})

.entropy (mapping node → float)

.community_stability (mapping community → float or structured stats)

.similarity summaries across partitions (VI/ARI/NMI distribution stats)

.to_dict() preserving uncertainty

.to_pandas(expand_uncertainty=True) including entropy, top-k probs, etc.
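A minimal sketch of such a container, assuming the names suggested above (this is not the actual py3plex implementation; `to_pandas` and similarity summaries are omitted for brevity):

```python
from dataclasses import dataclass, field
from math import log
from typing import Any, Dict

@dataclass
class ProbabilisticCommunityResult:
    """Sketch of the suggested API surface; field names follow the list
    above, not any existing py3plex class."""
    probs: Dict[Any, Dict[int, float]]  # node -> {community: p}
    community_stability: Dict[int, float] = field(default_factory=dict)

    @property
    def labels(self):
        """Backward-compatible hard view: node -> most likely community."""
        return {node: max(dist, key=dist.get)
                for node, dist in self.probs.items()}

    @property
    def entropy(self):
        """Per-node Shannon entropy of the membership distribution."""
        return {node: -sum(p * log(p) for p in dist.values() if p > 0)
                for node, dist in self.probs.items()}

    def to_dict(self):
        """Plain-dict export preserving the uncertainty information."""
        return {"labels": self.labels, "probs": self.probs,
                "entropy": self.entropy,
                "community_stability": self.community_stability}

result = ProbabilisticCommunityResult(
    probs={"a": {0: 0.8, 1: 0.2}, "b": {0: 1.0}})
```

Since everything is built from plain dicts, the container stays pickleable and JSON-serializable, and `labels` gives the backward-compatible hard partition.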


ALGORITHMS (COMMUNITY UQ ENGINE)

  1. Build Partition Ensemble Generation (Uses Existing UQ Mechanism)

Add a mechanism to obtain an ensemble of partitions using existing UQ strategies:

SEED: multiple random seeds

PERTURBATION: perturb edges/nodes (respect current perturbation semantics)

BOOTSTRAP/JACKKNIFE: if supported, define precisely what “resampling” means for graphs in py3plex

Key requirements:

Deterministic reproducibility from seed(s) (use existing deterministic parallel seed spawning if present).

Store full partition ensemble (or compressed if huge; see below).

Ensure ensembles can be generated per-layer and multilayer consistently.

Compression strategy (needed for scalability):

For large runs, don’t store all partitions verbatim by default.

Store:

co-assignment matrix estimates (node pair probability same community),

membership probabilities derived from consensus clustering,

limited sample of raw partitions (configurable).
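As a toy illustration of the co-assignment estimate (pure Python here; a real implementation would likely use a NumPy matrix over the full node set):

```python
def coassignment_matrix(partitions):
    """Estimate P(node i and node j in the same community) over a
    partition ensemble; each partition is a label list over the same
    ordered node set."""
    n_runs, n_nodes = len(partitions), len(partitions[0])
    counts = [[sum(p[i] == p[j] for p in partitions) for j in range(n_nodes)]
              for i in range(n_nodes)]
    return [[c / n_runs for c in row] for row in counts]

# Three toy runs over four nodes
ensemble = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 1, 0, 0]]
C = coassignment_matrix(ensemble)
```

The matrix is label-switching-invariant by construction (it only asks "same community or not?"), which is why it is a convenient compressed summary of a large ensemble.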


PROBABILITY MODEL (TURN ENSEMBLE → SOFT MEMBERSHIPS)

  1. Derive Membership Probabilities from Ensemble

Implement at least one robust method to convert a partition ensemble into per-node membership distributions.

Minimum required method:

Consensus-based membership:

Use co-assignment probabilities + clustering / label alignment to produce consistent community ids.

Handle label switching (critical).

Options (pick one primary + possibly one fallback):

Co-assignment matrix → consensus clustering → membership distribution

Label alignment via Hungarian matching across runs → aligned labels → counts → probabilities

If community counts vary, handle “community birth/death” across runs:

allow an “other/new” bucket or align by overlap best-match.
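A toy sketch of the alignment-then-count route: brute-force permutation matching (exact, and equivalent to Hungarian matching, for small equal community counts; `scipy.optimize.linear_sum_assignment` would scale better, and birth/death handling is omitted here):

```python
from collections import Counter
from itertools import permutations

def align_labels(reference, candidate):
    """Relabel `candidate` to maximize node-wise agreement with `reference`.
    Assumes both partitions use the same number of communities."""
    ref_ids = sorted(set(reference))
    cand_ids = sorted(set(candidate))
    best_map, best_overlap = None, -1
    for perm in permutations(ref_ids):
        mapping = dict(zip(cand_ids, perm))
        overlap = sum(mapping[c] == r for c, r in zip(candidate, reference))
        if overlap > best_overlap:
            best_map, best_overlap = mapping, overlap
    return [best_map[c] for c in candidate]

def membership_probs(aligned_partitions):
    """Per-node label frequencies across aligned runs; each row sums to 1."""
    n_runs = len(aligned_partitions)
    n_nodes = len(aligned_partitions[0])
    return [{lab: c / n_runs
             for lab, c in Counter(p[i] for p in aligned_partitions).items()}
            for i in range(n_nodes)]

reference = [0, 0, 1, 1]
runs = [reference, [1, 1, 0, 0], [0, 1, 1, 1]]  # second run is label-switched
aligned = [align_labels(reference, p) for p in runs]
probs = membership_probs(aligned)
```

The label-switched second run is mapped back onto the reference ids before counting, which is exactly why alignment must happen before frequencies are interpreted as membership probabilities.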

Deliverables:

stable community id space across runs,

membership probabilities per node sum to 1,

entr...



Copilot AI self-assigned this Jan 6, 2026
Copilot AI and others added 5 commits January 6, 2026 02:03
Co-authored-by: SkBlaz <10035780+SkBlaz@users.noreply.github.com>
Copilot AI changed the title [WIP] Update DSL v2 community results for probabilistic abstraction Implement probabilistic community detection with uncertainty quantification for DSL v2 Jan 6, 2026
Copilot AI requested a review from SkBlaz January 6, 2026 02:15
Owner

SkBlaz commented Jan 6, 2026

@copilot The failing job is caused by the line:

print(f"  Number of nodes: {prob_result.n_nodes}")

with the error:

AttributeError: 'dict' object has no attribute 'n_nodes'

This means prob_result is a dictionary, not an object with an n_nodes attribute. You should change the code to access the correct key in the dictionary, likely:

print(f"  Number of nodes: {prob_result['n_nodes']}")

If 'n_nodes' is not a key, use print(prob_result.keys()) to inspect available keys and update the code accordingly.

Additionally, the script network_analysis/example_networkx_node_similarity.py contains plt.show() calls, which will also cause failures in non-interactive CI runs. You can avoid this by wrapping those calls as follows:

import os
if not os.environ.get('CI'):
    plt.show()

or remove/skip plt.show() in CI.

Make these adjustments to resolve the job failure.

Contributor Author

Copilot AI commented Jan 6, 2026

@copilot The failing job is caused by the line:

print(f"  Number of nodes: {prob_result.n_nodes}")

...

Fixed in commit e1227cf. The issue was that network._probabilistic_community_result can be stored as either a dict (with key 'latest') or directly as the object, depending on whether the attribute already existed. Added type checking to handle both cases and null check to prevent AttributeError.

@SkBlaz SkBlaz marked this pull request as ready for review January 6, 2026 02:45
@SkBlaz SkBlaz merged commit e3fcd34 into master Jan 6, 2026
31 checks passed


Development

Successfully merging this pull request may close these issues.

probCom
