Added trials parameter to leiden #790

Merged 1 commit on May 19, 2021
17 changes: 15 additions & 2 deletions graspologic/partition/leiden.py
@@ -209,6 +209,7 @@ def leiden(
     is_weighted: Optional[bool] = None,
     weight_default: float = 1.0,
     check_directed: bool = True,
+    trials: int = 1,
 ) -> Dict[str, int]:
     """
     Leiden is a global network partitioning algorithm. Given a graph, it will iterate
@@ -241,7 +242,7 @@ def leiden(
         for the node clustering and no nodes are moved to a new cluster in another
         iteration. As there is an element of randomness to the Leiden algorithm, it is
         sometimes useful to set ``extra_forced_iterations`` to a number larger than 0
-        where the entire process is forced to attempt further refinement.
+        where the process is forced to attempt further refinement.
     resolution : float
         Default is ``1.0``. Higher resolution values lead to more communities and lower
         resolution values lead to fewer communities. Must be greater than 0.
@@ -277,6 +278,13 @@ def leiden(
         if it is found to be a directed graph. If you know it is undirected and wish to
         avoid this scan, you can set this value to ``False`` and only the lower triangle
         of the adjacency matrix will be used to generate the weighted edge list.
+    trials : int
Collaborator comment:
I also don't want to let this hold up the PR, but this seems like something we should add to the stack of #755 and consider how we name parameters like this in the future. This exact thing happens in match and cluster at a minimum and there we call it n_init.

+        Default is ``1``. Runs leiden ``trials`` times, keeping the best partitioning
+        as judged by the quality maximization function (default: modularity, see
+        ``use_modularity`` parameter for details). This differs from
+        ``extra_forced_iterations`` by starting over from scratch for each trial,
+        while ``extra_forced_iterations`` attempts to make microscopic adjustments from
+        the "final" state.

     Returns
     -------
@@ -317,18 +325,23 @@ def leiden(
         weight_default,
         check_directed,
     )
+    if not isinstance(trials, int):
+        raise TypeError("trials must be a positive integer")
+    if trials < 1:
+        raise ValueError("trials must be a positive integer")
Collaborator comment on lines +328 to +331:
I know we've talked about input checking at various times - I've always thought it could be helpful to use things like https://scikit-learn.org/stable/modules/generated/sklearn.utils.check_scalar.html going forward, just to be more concise and consistent. I don't care for the purposes of this PR but thought I'd bring it up.
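
For reference, the two inline checks discussed here can be folded into one reusable helper. This is a minimal, standard-library-only sketch and not the scikit-learn `check_scalar` utility linked above; the name `check_positive_int` is hypothetical and does not exist in graspologic. It mirrors the PR's behavior exactly, including the fact that `bool` values pass because `bool` is a subclass of `int`.

```python
def check_positive_int(value: object, name: str) -> int:
    """Hypothetical helper (not part of graspologic) mirroring the PR's checks."""
    # Same order as the PR: type check first, then range check.
    # Note: True/False pass the isinstance check because bool subclasses int.
    if not isinstance(value, int):
        raise TypeError(f"{name} must be a positive integer")
    if value < 1:
        raise ValueError(f"{name} must be a positive integer")
    return value


# Usage: validate once at the top of a function, then use the value.
trials = check_positive_int(3, "trials")
```

A helper like this keeps the error messages consistent across parameters, which is the conciseness argument made in the comment.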

     node_id_mapping, graph = _validate_and_build_edge_list(
         graph, is_weighted, weight_attribute, check_directed, weight_default
     )

-    _improved, _modularity, partitions = gn.leiden(
+    _modularity, partitions = gn.leiden(
         edges=graph,
         starting_communities=starting_communities,
         resolution=resolution,
         randomness=randomness,
         iterations=extra_forced_iterations + 1,
         use_modularity=use_modularity,
         seed=random_seed,
+        trials=trials,
     )

     proper_partitions = {
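
The new docstring distinguishes `trials` (restart from scratch, keep the best result) from `extra_forced_iterations` (keep refining the state already reached). The sketch below illustrates that difference with a toy optimizer. It is only an illustration under stated assumptions: `toy_leiden_run` and `best_of_trials` are hypothetical names, the "quality" is a random number rather than modularity, and nothing here reflects how `gn.leiden` is implemented in graspologic-native.

```python
import random


def toy_leiden_run(rng: random.Random, extra_forced_iterations: int) -> float:
    """Hypothetical stand-in for a single Leiden run; returns a quality score."""
    # Each run starts over from a fresh random state.
    quality = rng.random()
    # Forced iterations only nudge the state this run has already reached,
    # and a nudge is kept only if it does not lower the score.
    for _ in range(extra_forced_iterations):
        nudged = min(1.0, quality + rng.uniform(-0.01, 0.01))
        quality = max(quality, nudged)
    return quality


def best_of_trials(seed: int, trials: int, extra_forced_iterations: int = 0) -> float:
    """Run the toy optimizer `trials` times and keep the best score."""
    rng = random.Random(seed)
    return max(toy_leiden_run(rng, extra_forced_iterations) for _ in range(trials))


single = best_of_trials(seed=1234, trials=1, extra_forced_iterations=2)
best_of_five = best_of_trials(seed=1234, trials=5, extra_forced_iterations=2)
```

In this toy version, a fixed seed makes the first of the five trials replay the single-trial call, so `best_of_five` can never be lower than `single`. Whether the native implementation shares that property is not stated in this PR.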
2 changes: 1 addition & 1 deletion setup.cfg
@@ -27,7 +27,7 @@ include_package_data = True
 install_requires =
     anytree>=2.8.0
     gensim>=3.8.0,<=3.9.0 # methods signatures changed in the 4.0.0beta release
-    graspologic-native
+    graspologic-native>=1.0.0
     hyppo>=0.2.0
     joblib>=0.17.0 # Older versions of joblib cause issue #806. Transitive dependency of hyppo.
     matplotlib>=3.0.0,<=3.3.0
10 changes: 10 additions & 0 deletions tests/partition/test_leiden.py
@@ -123,6 +123,16 @@ def test_correct_types(self):
             args["use_modularity"] = 1234
             leiden(graph=graph, **args)

+        with self.assertRaises(TypeError):
+            args = good_args.copy()
+            args["trials"] = "hotdog"
+            leiden(graph=graph, **args)
+
+        with self.assertRaises(ValueError):
+            args = good_args.copy()
+            args["trials"] = 0
+            leiden(graph=graph, **args)
+
         args = good_args.copy()
         args["random_seed"] = 1234
         leiden(graph=graph, **args)