Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add something like n_init to Leiden #757

Closed
bdpedigo opened this issue Apr 12, 2021 · 7 comments
Closed

Add something like n_init to Leiden #757

bdpedigo opened this issue Apr 12, 2021 · 7 comments
Labels
enhancement New feature or request

Comments

@bdpedigo
Copy link
Collaborator

Is your feature request related to a problem? Please describe.

I often re-run Leiden multiple times and take the best modularity partition.
This is distinct from what the extra_forced_iterations parameter does, as far as I can tell.

Describe the solution you'd like

I have code that basically looks like:

def optimize_leiden(g, n_restarts=25, resolution=1, randomness=0.001):
    best_modularity = -np.inf
    best_partition = {}
    for i in range(n_restarts):
        partition = leiden(
            g,
            resolution=resolution,
            randomness=randomness,
            check_directed=False,
            extra_forced_iterations=10,
        )
        modularity_score = modularity(g, partitions=partition, resolution=resolution)
        if modularity_score > best_modularity:
            best_partition = partition
            best_modularity = modularity_score
    return best_partition, best_modularity

so basically wrapping the current code in a for loop and adding a parameter.

Describe alternatives you've considered

Might be faster to do this implementation-wise at the Rust level, but I don't have a desire to do that.

Additional context

Happy to open a PR if this is acceptable. cc @dwaynepryce but feel free to @ anyone on the MSR side who might have opinions.

@bdpedigo bdpedigo added the enhancement New feature or request label Apr 12, 2021
@loftusa
Copy link
Collaborator

loftusa commented Apr 14, 2021

if this issue isn't resolved by the summer, I'm potentially interested in doing the implementation in Rust, I've been interested in learning some Rust for awhile now

@bdpedigo
Copy link
Collaborator Author

cool, depends on @dwaynepryce 's thoughts I think tho.

Ideally, I'd like to have this sooner, esp. since the code would be so simple in python, and I'm using this for a few things in my work. And I don't mind making a PR that is basically the one above.

@loftusa could still do it in rust later though, as I think this would avoid repeatedly converting the data into whatever format rust is using, which I suspect would help performance significantly for large networks.

@daxpryce
Copy link
Contributor

I'm wondering at the utility of this over the iterations currently in the function?

As it is, it'll go until your q score doesn't get any better, then return. If you want it to try again, just set iterations==25 (for instance) and it'll use the latest best partitioning and keep trying another 24 times or so. If the randomization aspect can find a better mapping, it keeps going.

This isn't exactly the same as your code, that starts from scratch each time, so let me know if that's the specific behavior you want!

@bdpedigo
Copy link
Collaborator Author

This isn't exactly the same as your code, that starts from scratch each time, so let me know if that's the specific behavior you want!

I believe start from scratch is explicitly what I want - there's some variance in the modularity of the final partition even when using the extra_forced_iterations which I believe just comes from the original initialization. I have found in practice that running a bunch of restarts can 1. make the final partition more stable (would take me a bit to explain what I mean by this but I could if there's interest) and 2. make the final modularity slightly higher. It's not a huge effect but noticeable.

@bdpedigo
Copy link
Collaborator Author

But if you think I'm wrong about the above, would be happy to discuss more, I could show you a demo

@daxpryce
Copy link
Contributor

daxpryce commented Apr 14, 2021

Nope! You're absolutely right.

extra_forced_iterations ultimately only has a very minor impact, whereas doing the whole thing from scratch until you get a best score is a more expensive process. I just wanted to make sure you knew there was a attempt at maximizing your quality score by one mechanism already in it.

This should be in the graspologic-native tracker though. We absolutely would want to push this into rust itself just to minimize the native representation -> python representation processes, which aren't free

@bdpedigo
Copy link
Collaborator Author

closed by #790

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants