Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct signature size calculation #14

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

snie2012
Copy link

@snie2012 snie2012 commented Aug 7, 2017

According to the description in this file, signature size equals to R * b instead of R * s

According to the description in this file, signature size equals to R * b instead of R * n
@@ -66,7 +66,7 @@ public LSHMinHash(final int s, final int b, final int n) {
*/
public LSHMinHash(final int s, final int b, final int n, final long seed) {
super(s, b);
int signature_size = computeSignatureSize(s, n);
int signature_size = computeSignatureSize(s, b);
this.mh = new MinHash(signature_size, n, seed);
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter should be changed to b as well.

@tdebatty
Copy link
Owner

Hi,
I should check this very carefully, because I don't use the same notation as in the book:

  • the book calls bands (b) what I in the code call stages (s)
  • in the code b stands for buckets (per stage)

@snie2012
Copy link
Author

snie2012 commented Aug 12, 2017

Hi, I checked again based on your comment. It looks like my correction is wrong. Inside the computeSignatureSize function, r is the number of rows and s is the number of bands, so the signature size should be r * s. Please confirm! (The notations in the comments above the function are a little bit confusing, you might want to change that.)

By the way, the parameter n passed to computeSignatureSize is not used. Is this for consistency?

@tavish
Copy link

tavish commented Jan 14, 2019

I also had a question about the parameter n not being used in computeSignatureSize.

@tdebatty, could you comment on this please? Thank you, in advance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants