Conversation
Co-authored-by: SkBlaz <10035780+SkBlaz@users.noreply.github.com>
@copilot The failing job is caused by the line `print(f" Number of nodes: {prob_result.n_nodes}")`, which raises an AttributeError: `prob_result` is a dictionary, not an object with an `n_nodes` attribute. Change the code to access the correct key in the dictionary, likely `print(f" Number of nodes: {prob_result['n_nodes']}")`. If `'n_nodes'` is not a key, use `print(prob_result.keys())` to inspect the available keys and update the code accordingly. Additionally, the script network_analysis/example_networkx_node_similarity.py contains `plt.show()` calls, which will also fail in non-interactive CI runs. You can avoid this by guarding those calls:

```python
import os

if not os.environ.get('CI'):
    plt.show()
```

or by removing/skipping `plt.show()` in CI. Make these adjustments to resolve the job failure.
... Fixed in commit e1227cf. The issue was that …
Probabilistic Community Detection for DSL v2 - COMPLETE ✅
All Phases Complete ✅
Phases 0-9: All implementation, documentation, examples, and testing complete
Recent Fix (CI Issue)
example_dsl_probabilistic_communities.py: network._probabilistic_community_result can be either a dict or an object
Example Output (Verified Working)
Ready for Review ✅
All CI issues resolved. Example runs successfully without AttributeError.
Original prompt
This section details on the original issue you should resolve
<issue_title>probCom</issue_title>
<issue_description>You are an expert Python systems researcher and network scientist working inside the py3plex repository.
Constraint checklist (must obey):
NO new .md files.
You MUST update AGENTS.md (existing file only), relevant .rst docs, examples, and property-based tests.
Preserve backward compatibility.
DSL v2 already has a community operator — use/extend it; do not invent a new DSL entrypoint.
GOAL
Upgrade DSL v2 community results from “hard labels + maybe repeated runs” into a probabilistic, uncertainty-native community abstraction that is:
queryable inside DSL v2 (via the existing community operator),
provenance-complete and reproducible,
serializable/exportable (pandas / dict / R interop),
tested with strong invariants (incl. property-based tests),
backward compatible.
This must go beyond “run multiple seeds and summarize modularity”:
provide per-node membership distributions (soft memberships or calibrated posteriors),
node-level uncertainty (entropy, CI-like summaries),
community-level stability (split/merge likelihood, persistence),
partition-space variability metrics (VI/ARI/NMI distributions).
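For the partition-space variability metrics, ARI and NMI distributions can come from scikit-learn (`adjusted_rand_score`, `normalized_mutual_info_score`); variation of information (VI) has no stdlib helper, so a dependency-free sketch is shown below. This is illustrative, not the repository's implementation.

```python
from collections import Counter
from itertools import combinations
from math import log


def variation_of_information(a, b):
    """VI between two partitions given as equal-length label sequences."""
    n = len(a)
    pa, pb = Counter(a), Counter(b)
    vi = 0.0
    for (x, y), nxy in Counter(zip(a, b)).items():
        pxy = nxy / n
        # VI = H(A|B) + H(B|A); each joint cell contributes both conditional terms.
        vi -= pxy * (log(pxy * n / pa[x]) + log(pxy * n / pb[y]))
    return vi


# Distribution of pairwise VI across a toy partition ensemble.
ensemble = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
vi_values = [variation_of_information(p, q) for p, q in combinations(ensemble, 2)]
assert vi_values[0] == 0.0  # identical partitions have VI 0
```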
PLAN (EXECUTION ORDER)
Phase 0 — Repo Recon & Ground Truth
Search for community(, .community(), Q.communities(), community_operator, CommunityQuery, CommunityBuilder, or equivalent.
What the operator currently returns (QueryResult? partition vector? community ids? per-layer?).
Where community algorithms live (e.g. py3plex/algorithms/community_detection/...).
How UQ is currently wired (.uq(...), resampling strategies, provenance hooks).
Deliverable: clear mapping of “what exists” → “what to extend”.
CORE DESIGN (REQUIRED)
Create a new internal result container that can represent both:
a hard partition (node → label),
a distribution over partitions / soft memberships.
Requirements:
Backward compatible view:
labels or partition behaves like before (node → most likely label).
Probabilistic view:
membership_probs[node][community_id] = p
node_entropy[node]
stability metrics per community and/or per node
partition_variability (distribution summaries)
Implementation guidance:
Prefer placing this under existing uncertainty/community stats modules (e.g. py3plex/uncertainty/ or a community stats module) rather than inventing a new top-level package.
Keep it pickleable and JSON-serializable.
Minimum API surface (suggested):
.labels (hard labels, deterministic)
.probs (mapping node → {community→prob})
.entropy (mapping node → float)
.community_stability (mapping community → float or structured stats)
.similarity summaries across partitions (VI/ARI/NMI distribution stats)
.to_dict() preserving uncertainty
.to_pandas(expand_uncertainty=True) including entropy, top-k probs, etc.
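The suggested API surface could be sketched as a small container like the one below. The class name, field layout, and derived views are assumptions mirroring the bullet list, not the actual py3plex class.

```python
from dataclasses import dataclass
from math import log
from typing import Dict, Hashable


@dataclass
class ProbabilisticCommunityResult:
    """Hypothetical container: node -> {community_id -> probability}."""
    probs: Dict[Hashable, Dict[int, float]]

    @property
    def labels(self) -> Dict[Hashable, int]:
        # Backward-compatible hard view: most likely community per node.
        return {n: max(d, key=d.get) for n, d in self.probs.items()}

    @property
    def entropy(self) -> Dict[Hashable, float]:
        # Shannon entropy (nats) of each node's membership distribution.
        return {n: -sum(p * log(p) for p in d.values() if p > 0)
                for n, d in self.probs.items()}

    def to_dict(self):
        # JSON-serializable export that preserves the uncertainty.
        return {"labels": self.labels, "probs": self.probs, "entropy": self.entropy}


res = ProbabilisticCommunityResult({"a": {0: 0.9, 1: 0.1}, "b": {1: 1.0}})
assert res.labels == {"a": 0, "b": 1}
```

A `.to_pandas()` view would then expand `probs` and `entropy` into columns per node.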
ALGORITHMS (COMMUNITY UQ ENGINE)
Add a mechanism to obtain an ensemble of partitions using existing UQ strategies:
SEED: multiple random seeds
PERTURBATION: perturb edges/nodes (respect current perturbation semantics)
BOOTSTRAP/JACKKNIFE: if supported, define precisely what “resampling” means for graphs in py3plex
Key requirements:
Deterministic reproducibility from seed(s) (use existing deterministic parallel seed spawning if present).
Store full partition ensemble (or compressed if huge; see below).
Ensure ensembles can be generated per-layer and multilayer consistently.
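Deterministic seed spawning for the ensemble runs might look like the sketch below (NumPy users would reach for `numpy.random.SeedSequence.spawn`; this stdlib version is an illustrative stand-in, and `spawn_seeds` is a hypothetical name).

```python
import random


def spawn_seeds(root_seed: int, n_runs: int):
    """Derive n_runs child seeds deterministically from one root seed.

    Every run of the community algorithm gets its own child seed, so the
    whole ensemble is reproducible from the single root seed.
    """
    rng = random.Random(root_seed)
    return [rng.getrandbits(64) for _ in range(n_runs)]


# Same root seed -> identical ensemble seeds, hence identical ensembles.
assert spawn_seeds(42, 4) == spawn_seeds(42, 4)
```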
Compression strategy (needed for scalability):
For large runs, don’t store all partitions verbatim by default.
Store:
co-assignment matrix estimates (node pair probability same community),
membership probabilities derived from consensus clustering,
limited sample of raw partitions (configurable).
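The co-assignment estimate above can be sketched in a few lines of NumPy; the function name and API are illustrative, not py3plex's.

```python
import numpy as np


def coassignment_matrix(partitions):
    """Estimate P(nodes i and j share a community) from a partition ensemble.

    partitions: list of label sequences, all over the same node ordering.
    Returns an n x n matrix of co-assignment frequencies.
    """
    n = len(partitions[0])
    acc = np.zeros((n, n))
    for labels in partitions:
        lab = np.asarray(labels)
        # Outer equality: 1 where two nodes got the same label in this run.
        acc += (lab[:, None] == lab[None, :])
    return acc / len(partitions)


ens = [[0, 0, 1], [0, 1, 1]]
C = coassignment_matrix(ens)
# Diagonal is always 1; off-diagonal entries are co-assignment frequencies.
```

Storing this n×n matrix (or a sparse version of it) instead of every raw partition is what makes the compression strategy scale.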
PROBABILITY MODEL (TURN ENSEMBLE → SOFT MEMBERSHIPS)
Implement at least one robust method to convert a partition ensemble into per-node membership distributions.
Minimum required method:
Consensus-based membership:
Use co-assignment probabilities + clustering / label alignment to produce consistent community ids.
Handle label switching (critical).
Options (pick one primary + possibly one fallback):
Co-assignment matrix → consensus clustering → membership distribution
Label alignment via Hungarian matching across runs → aligned labels → counts → probabilities
If community counts vary, handle “community birth/death” across runs:
allow an “other/new” bucket or align by overlap best-match.
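A real implementation of the alignment option would use Hungarian matching (`scipy.optimize.linear_sum_assignment`); the dependency-free sketch below brute-forces the permutation for small, equal-sized community sets and does not yet handle birth/death, so treat it as an illustration only.

```python
from itertools import permutations


def align_labels(reference, candidate):
    """Relabel `candidate` so its community ids best overlap `reference`.

    Brute-forces all id permutations -- only viable for a handful of
    communities of matching count; Hungarian matching plus an
    "other/new" bucket is needed for the general case.
    """
    ref_ids = sorted(set(reference))
    cand_ids = sorted(set(candidate))
    best_map, best_overlap = None, -1
    for perm in permutations(ref_ids, len(cand_ids)):
        mapping = dict(zip(cand_ids, perm))
        overlap = sum(mapping[c] == r for c, r in zip(candidate, reference))
        if overlap > best_overlap:
            best_map, best_overlap = mapping, overlap
    return [best_map[c] for c in candidate]


# A relabelled copy of the same partition aligns back onto the reference ids,
# after which per-node label counts across runs become membership probabilities.
assert align_labels([0, 0, 1, 1], [1, 1, 0, 0]) == [0, 0, 1, 1]
```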
Deliverables:
stable community id space across runs,
membership probabilities per node sum to 1,
entr...