Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Peer Review] alias_draw function is too hard to understand #7

Closed
cthoyt opened this issue Oct 2, 2020 · 1 comment
Closed

[Peer Review] alias_draw function is too hard to understand #7

cthoyt opened this issue Oct 2, 2020 · 1 comment

Comments

@cthoyt
Copy link
Contributor

cthoyt commented Oct 2, 2020

It's non-obvious how the following alias_draw() function works.

@jit(nopython=True, nogil=True)
def alias_draw(J, q):
"""
Draw sample from a non-uniform discrete distribution using alias sampling.
"""
K = J.size
kk = np.random.randint(K)
if np.random.rand() < q[kk]:
return kk
else:
return J[kk]

I've also copy-pasted a similar python implementation from StackOverflow, where nobody seemed to know how it worked either. However, because this piece of code is now part of an academic paper, it would do some good to make it instructive to a potential reader - code is, after all, the methods section and major result of your paper.

  1. Improve the docstring such that it explains what this function does and why
  2. Include a reference to where you found it and what you extra reading you did to understand it yourself
  3. Add type annotations for the arguments for the function
  4. Improve the names of all variables appearing in the function
  5. Write examples to illustrate when you put certain data in, you get certain data out. You'll have to seed the numpy random number generator (ref) to make sure the results are consistent.
  6. Unit tests (which just make it possible to automatically check that your examples really do what they say they should)

I'd be happy to assist in writing the unit tests and making them easy to run after you've prepared the rest if you want any help with that. I'd also help you set up Travis-CI to automatically run the tests every time you make changes to the package if you'd like

@RemyLau
Copy link
Contributor

RemyLau commented Oct 16, 2020

@cthoyt The original implementation of node2vec pointed to this (the url is slightly different since the website was updated) blog post, which I've also included in the docstring for alias_setup(). The explanations in the blog post about the alias method is pretty thorough. But the inline comments are omitted in the source code here for compactness, should I copy and paste all the comments from the blog post to the source code here to make it more clear?

@cthoyt cthoyt changed the title alias_draw function is too hard to understand [Peer Review] alias_draw function is too hard to understand Aug 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants