Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

faster/better tree building algorithm (see #64) #65

Merged
merged 1 commit into from
May 25, 2015
Merged

Conversation

erikbern
Copy link
Collaborator

I haven't validated this with any real dataset, need to do that

@erikbern
Copy link
Collaborator Author

I want to run this on a real dataset to see if it helps.

I'm also a bit concern what happens in degenerate cases such as i==j but I think it will work out fine

@erikbern
Copy link
Collaborator Author

Before

Ran 21 tests in 115.291s

After

Ran 21 tests in 23.230s

@erikbern
Copy link
Collaborator Author

See accuracy test that I just added: f5c1f58

The improvements are pretty spectacular:

Before

             angular   25 accuracy: 34.64%
             angular   50 accuracy: 20.44%
             angular  100 accuracy: 12.90%
             angular  200 accuracy:  7.99%

           euclidean   25 accuracy: 33.71%
           euclidean   50 accuracy: 21.04%
           euclidean  100 accuracy: 13.37%
           euclidean  200 accuracy:  9.31%

After

             angular   25 accuracy: 46.80%
             angular   50 accuracy: 31.00%
             angular  100 accuracy: 24.50%
             angular  200 accuracy: 15.18%

           euclidean   25 accuracy: 47.34%
           euclidean   50 accuracy: 33.04%
           euclidean  100 accuracy: 25.10%
           euclidean  200 accuracy: 17.99%

The runtime of the tests also dropped from about 6000s to 1500s meaning a speedup of 4x.

Pretty cool for a code review that's mostly red!

I haven't validated this with any real dataset, need to do that
erikbern added a commit that referenced this pull request May 25, 2015
faster/better tree building algorithm (see #64)
@erikbern erikbern merged commit a144f87 into master May 25, 2015
@erikbern erikbern deleted the erikbern/fix-64 branch May 25, 2015 22:01
}
// Same as Angular version, but no normalization and has to compute the offset too
int i = random->index(nodes.size());
int j = random->index(nodes.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if i == j?

@a1k0n
Copy link
Contributor

a1k0n commented May 26, 2015

That is super awesome, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants