Issue with gensim 4.0.0+ #37

Open
cthoyt opened this issue Apr 27, 2021 · 3 comments

@cthoyt
Contributor

cthoyt commented Apr 27, 2021

It appears that one of the argument names changed in the newly released gensim 4.0.0: as the traceback below shows, `Word2Vec` no longer accepts `size`. This has also caused some pain in other libraries that use gensim for their node2vec implementations (e.g., krishnanlab/PecanPy#16).

Traceback (most recent call last):
  File "embed_nodevectors.py", line 150, in <module>
    main()
  File "/Users/cthoyt/.virtualenvs/indra/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/cthoyt/.virtualenvs/indra/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/cthoyt/.virtualenvs/indra/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/cthoyt/.virtualenvs/indra/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "embed_nodevectors.py", line 137, in main
    model.fit(graph)
  File "/Users/cthoyt/.virtualenvs/indra/lib/python3.8/site-packages/nodevectors/node2vec.py", line 130, in fit
    self.model = gensim.models.Word2Vec(
TypeError: __init__() got an unexpected keyword argument 'size'
@VHRanger
Owner

VHRanger commented Apr 27, 2021

Thanks.

I can make a patch that checks the gensim version for now and routes the argument depending on the version.
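
A minimal sketch of what such version routing could look like. It relies on gensim 4.0.0 having renamed the `size` argument of `Word2Vec` to `vector_size`; the helper name `word2vec_dim_kwarg` is hypothetical and is not part of nodevectors or gensim.

```python
def word2vec_dim_kwarg(gensim_version: str, dim: int) -> dict:
    """Return the embedding-size keyword under the name this gensim version expects.

    Hypothetical helper for illustration only.
    """
    # Only the major version matters for this particular rename.
    major = int(gensim_version.split(".")[0])
    # gensim 4.0.0 renamed Word2Vec's `size` argument to `vector_size`.
    name = "vector_size" if major >= 4 else "size"
    return {name: dim}
```

The result could then be unpacked into the constructor, for example `gensim.models.Word2Vec(walks, **word2vec_dim_kwarg(gensim.__version__, 32))`, where `walks` stands for whatever walk sequences are being trained on.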

Long term the idea would be to remove the gensim dependency entirely. It's a heavy dependency that's a moving target and only used for this one part of Node2Vec.

It has a lot of overhead for Node2Vec. For one, we need to map the node names in the random walks back to a format gensim accepts.

We could just train a word2vec model directly on the nodeIDs (ints, so would be faster) and re-map the embedding dictionary keys from nodeID -> node name only once after everything is trained.

This could be achieved either by stripping the node2vec C code and integrating it in CSRGraphs or by using another C/C++ implementation, like this one:

https://github.com/xgfs/node2vec-c

(which works on CSR representation already, not too far from csrgraphs) or this one:

https://github.com/snap-stanford/snap/tree/master/examples/node2vec

and integrating it into CSRGraphs.
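
The "re-map only once after everything is trained" step mentioned above could be sketched as follows. This assumes the trained vectors are stored in a sequence indexed by integer node ID, with a parallel sequence of node names; the function name `remap_embeddings` and both parameters are hypothetical, not existing nodevectors or CSRGraphs API.

```python
from typing import Dict, Hashable, List, Sequence


def remap_embeddings(
    id_vectors: Sequence[List[float]],
    node_names: Sequence[Hashable],
) -> Dict[Hashable, List[float]]:
    """Map vectors indexed by integer node ID to a dict keyed by node name.

    Hypothetical helper: id_vectors[i] is assumed to be the embedding of
    the node whose name is node_names[i].
    """
    if len(id_vectors) != len(node_names):
        raise ValueError("need exactly one name per embedding vector")
    # Single pass after training; no per-walk name translation is needed.
    return {name: vec for name, vec in zip(node_names, id_vectors)}
```

With this layout the random walks and the training loop only ever see integers, and the name lookup cost is paid once per node rather than once per walk step.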

@hhu1

hhu1 commented Jul 7, 2021

The following bash command worked for me:

pip3 install -I gensim==3.8.0

@Wapiti08

> The following bash command worked for me:
>
> pip3 install -I gensim==3.8.0

That did not solve my problem. The error is still there.
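
When a pin like this appears to have no effect, one thing worth checking is which gensim version the failing interpreter actually sees, since `pip3` may install into a different environment than the one running the script. A small check, assuming Python 3.8 or later; the function name `installed_version` is hypothetical:

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package` as seen by this interpreter.

    Returns None if the package is not installed in this environment.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run this with the same interpreter that raises the TypeError.
    print(installed_version("gensim"))
```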
