Seed and random results for embeddings #144

Open
sebkaz opened this issue Jul 26, 2023 · 1 comment

Comments

sebkaz commented Jul 26, 2023

Hi!

I'd like to ask about the seed parameter used by most of the node embedding models.
The documentation says seed=42 is the default, but when you run, for example, Node2Vec twice, you get different embedding vectors.

Do you plan to change this so that when the default seed is used, workers=1 is set as well?

best regards
S.

@tomlincr
Contributor

@sebkaz I've also noticed this issue (different embedding vectors on each run with the same algorithm/params/seed).

I think it's a hard one to solve at the karateclub level, across all algorithms, given the reliance on other packages under the hood.

For example, NetMF uses sklearn's TruncatedSVD, which defaults to a randomised solver, and its documentation seems to acknowledge this issue:

SVD suffers from a problem called “sign indeterminacy”, which means the sign of the components_ and the output from transform depend on the algorithm and random state. To work around this, fit instances of this class to data once, then keep the instance around to do transformations.
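
To make the quoted "sign indeterminacy" concrete, here is a minimal NumPy sketch (not karateclub or sklearn code). It shows that flipping the sign of a matching pair of singular vectors still reconstructs the same matrix, so two runs can legitimately return factors that differ in sign. `fix_signs` is a hypothetical helper introduced only for illustration; it applies one possible convention (largest-magnitude entry of each right singular vector is made positive). It only removes the sign ambiguity and does nothing about randomness in a randomised solver.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 4))

# Thin SVD: A = U @ diag(S) @ Vt
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# Flip the sign of the first left/right singular-vector pair.
U_flipped, Vt_flipped = U.copy(), Vt.copy()
U_flipped[:, 0] *= -1
Vt_flipped[0, :] *= -1

# Both factorisations reproduce A, yet the factors themselves differ.
same_product = np.allclose(U @ np.diag(S) @ Vt,
                           U_flipped @ np.diag(S) @ Vt_flipped)
same_factors = np.allclose(U, U_flipped)


def fix_signs(U, Vt):
    """Hypothetical helper: pick a canonical sign for each component.

    The largest-magnitude entry of every row of Vt is made positive,
    and the matching column of U is flipped accordingly.
    """
    idx = np.argmax(np.abs(Vt), axis=1)
    signs = np.sign(Vt[np.arange(Vt.shape[0]), idx])
    return U * signs, Vt * signs[:, None]


U1, Vt1 = fix_signs(U, Vt)
U2, Vt2 = fix_signs(U_flipped, Vt_flipped)
canonical_match = np.allclose(U1, U2) and np.allclose(Vt1, Vt2)

print(same_product, same_factors, canonical_match)
```

After applying the same sign convention to both factorisations, the factors agree again, which is the kind of post-processing a user could apply if only the sign differs between runs.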

It seems to me that any workaround (e.g. setting workers=1 or using other solvers) would increase compute time and, on balance, isn't worth it? If a user has a specific use case that needs reproducibility, they can address that on a case-by-case basis.
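
For the case-by-case route, a small check like the following could tell a user whether a given configuration is reproducible on their machine. This is only a sketch: `is_reproducible` and the two stand-in embedders are hypothetical names, and the stand-ins use NumPy random numbers instead of a real model so the snippet runs without karateclub. The commented-out part assumes Node2Vec accepts `workers` and `seed` and exposes `fit()` / `get_embedding()`.

```python
import numpy as np


def is_reproducible(make_embedding, runs=3, atol=1e-8):
    """Call make_embedding() several times and compare every result to the first."""
    first = np.asarray(make_embedding())
    for _ in range(runs - 1):
        other = np.asarray(make_embedding())
        if other.shape != first.shape or not np.allclose(first, other, atol=atol):
            return False
    return True


# Stand-ins for an embedding model (assumption: not real karateclub models).
def seeded_embedding():
    # Fresh generator with a fixed seed on every call: deterministic.
    return np.random.default_rng(42).normal(size=(5, 4))


def unseeded_embedding():
    # Fresh generator seeded from OS entropy: differs between calls.
    return np.random.default_rng().normal(size=(5, 4))


seeded_ok = is_reproducible(seeded_embedding)
unseeded_ok = is_reproducible(unseeded_embedding)
print(seeded_ok, unseeded_ok)

# Hypothetical use with karateclub (not run here), where `graph` is a
# networkx graph prepared by the user:
#
#   from karateclub import Node2Vec
#
#   def node2vec_embedding():
#       model = Node2Vec(workers=1, seed=42)  # single worker trades speed for determinism
#       model.fit(graph)
#       return model.get_embedding()
#
#   is_reproducible(node2vec_embedding)
```

If the check fails even with a single worker, the remaining variation is likely coming from a dependency, which is the situation described above.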
