Added trials parameter to leiden #790
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -209,6 +209,7 @@ def leiden( | |
is_weighted: Optional[bool] = None, | ||
weight_default: float = 1.0, | ||
check_directed: bool = True, | ||
trials: int = 1, | ||
) -> Dict[str, int]: | ||
""" | ||
Leiden is a global network partitioning algorithm. Given a graph, it will iterate | ||
|
@@ -241,7 +242,7 @@ def leiden( | |
for the node clustering and no nodes are moved to a new cluster in another | ||
iteration. As there is an element of randomness to the Leiden algorithm, it is | ||
sometimes useful to set ``extra_forced_iterations`` to a number larger than 0 | ||
where the entire process is forced to attempt further refinement. | ||
where the process is forced to attempt further refinement. | ||
resolution : float | ||
Default is ``1.0``. Higher resolution values lead to more communities and lower | ||
resolution values lead to fewer communities. Must be greater than 0. | ||
|
@@ -277,6 +278,13 @@ def leiden( | |
if it is found to be a directed graph. If you know it is undirected and wish to | ||
avoid this scan, you can set this value to ``False`` and only the lower triangle | ||
of the adjacency matrix will be used to generate the weighted edge list. | ||
trials : int | ||
Default is ``1``. Runs leiden ``trials`` times, keeping the best partitioning | ||
as judged by the quality maximization function (default: modularity, see | ||
``use_modularity`` parameter for details). This differs from | ||
``extra_forced_iterations`` by starting over from scratch for each trial, | ||
while ``extra_forced_iterations`` attempts to make microscopic adjustments from | ||
the "final" state. | ||
Returns | ||
------- | ||
|
@@ -317,18 +325,23 @@ def leiden( | |
weight_default, | ||
check_directed, | ||
) | ||
if not isinstance(trials, int): | ||
raise TypeError("trials must be a positive integer") | ||
if trials < 1: | ||
raise ValueError("trials must be a positive integer") | ||
Comment on lines +328 to +331
I know we've talked about input checking at various times - I've always thought it could be helpful to use things like https://scikit-learn.org/stable/modules/generated/sklearn.utils.check_scalar.html going forward, just to be more concise and consistent. I don't care for the purposes of this PR but thought I'd bring it up.
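For illustration only, the two checks in this hunk could be folded into one reusable helper. The sketch below is dependency-free and is not sklearn's `check_scalar`; the name `check_positive_int` is hypothetical and does not exist in this codebase. It also rejects `bool`, which the `isinstance(trials, int)` check above would accept because `bool` is a subclass of `int`; whether that is desired here is an assumption.

```python
import numbers


def check_positive_int(value, name):
    """Hypothetical helper: return ``value`` if it is an integer >= 1.

    Raises TypeError for non-integers (including bool) and ValueError
    for integers smaller than 1, mirroring the checks in the diff.
    """
    # bool is a subclass of int, so it has to be excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, numbers.Integral):
        raise TypeError(
            f"{name} must be a positive integer, got {type(value).__name__}"
        )
    if value < 1:
        raise ValueError(f"{name} must be a positive integer, got {value}")
    return int(value)


# Possible call site inside leiden(), assuming the helper above:
trials = check_positive_int(1, "trials")
```

A helper like this keeps the error messages consistent across parameters, which is the main benefit the comment is after.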
||
node_id_mapping, graph = _validate_and_build_edge_list( | ||
graph, is_weighted, weight_attribute, check_directed, weight_default | ||
) | ||
|
||
_improved, _modularity, partitions = gn.leiden( | ||
_modularity, partitions = gn.leiden( | ||
edges=graph, | ||
starting_communities=starting_communities, | ||
resolution=resolution, | ||
randomness=randomness, | ||
iterations=extra_forced_iterations + 1, | ||
use_modularity=use_modularity, | ||
seed=random_seed, | ||
trials=trials, | ||
) | ||
|
||
proper_partitions = { | ||
|
I also don't want to let this hold up the PR, but this seems like something we should add to the stack of #755 and consider how we name parameters like this in the future. This exact thing happens in `match` and `cluster` at a minimum, and there we call it `n_init`.
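As a rough illustration of the pattern being named here (`trials` in this PR, `n_init` elsewhere), the sketch below runs a randomized procedure from scratch several times and keeps the highest-scoring result. It is not the `gn.leiden` implementation, which performs its trials natively; `best_of_trials`, `run_once`, and `toy_run` are hypothetical names used only for this example.

```python
import random
from typing import Callable, Dict, Hashable, Optional, Tuple

Partition = Dict[Hashable, int]


def best_of_trials(
    run_once: Callable[[random.Random], Tuple[float, Partition]],
    trials: int = 1,
    seed: Optional[int] = None,
) -> Tuple[float, Partition]:
    """Run ``run_once`` ``trials`` times and keep the best-scoring result.

    ``run_once`` is a hypothetical stand-in for one full, from-scratch run
    of a randomized partitioning routine; it returns (quality, partition).
    """
    if trials < 1:
        raise ValueError("trials must be a positive integer")
    rng = random.Random(seed)  # one generator shared across all trials
    best: Optional[Tuple[float, Partition]] = None
    for _ in range(trials):
        quality, partition = run_once(rng)
        # Keep the first result, then replace it only on strict improvement.
        if best is None or quality > best[0]:
            best = (quality, partition)
    assert best is not None  # guaranteed because trials >= 1
    return best


def toy_run(rng: random.Random) -> Tuple[float, Partition]:
    """Toy stand-in: a random 'quality' and a two-node partition."""
    quality = rng.random()
    return quality, {"a": 0, "b": int(quality > 0.5)}


quality, partition = best_of_trials(toy_run, trials=5, seed=42)
```

Each trial is independent of the previous one, which is the distinction the docstring draws against `extra_forced_iterations`, where refinement continues from the current state.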