Jaccard #1461

Merged
merged 15 commits into from Mar 5, 2021


@c5sire c5sire commented Mar 2, 2021

This PR adds a character-level Jaccard similarity function. For example, jaccard('ab', 'aaabbbb') = 1.0, because both strings contain the same set of characters.
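
For readers unfamiliar with the metric, here is a minimal sketch of a character-level Jaccard similarity: the number of distinct characters the two strings share, divided by the number of distinct characters in either. It is not the code from this PR. The name `JaccardChars`, the byte-wise treatment of characters, and the value returned for two empty strings are all assumptions made for illustration.

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// Hypothetical helper for illustration; not the function added in this PR.
// Treats each string as a set of distinct bytes and returns
// (size of intersection) / (size of union).
double JaccardChars(const std::string &a, const std::string &b) {
    std::bitset<256> set_a, set_b;
    for (unsigned char c : a) {
        set_a.set(c); // repeated characters collapse into one set member
    }
    for (unsigned char c : b) {
        set_b.set(c);
    }
    const std::size_t union_size = (set_a | set_b).count();
    if (union_size == 0) {
        // Assumption: two empty inputs count as identical. The ratio is
        // undefined here, and the PR may handle this case differently.
        return 1.0;
    }
    const std::size_t intersection_size = (set_a & set_b).count();
    return static_cast<double>(intersection_size) / static_cast<double>(union_size);
}
```

With this sketch, `JaccardChars("ab", "aaabbbb")` gives 1.0, matching the example in the description, and `JaccardChars("abc", "bcd")` gives 0.5 (two shared characters out of four distinct ones).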

@c5sire c5sire marked this pull request as draft March 2, 2021 10:21
@c5sire c5sire marked this pull request as ready for review March 2, 2021 12:09
Collaborator

@Mytherin Mytherin left a comment


Thanks for the PR! Looks good. One minor comment:

return map_of_chars;
}

static inline map<char, idx_t> TabulateCharacters(map<char, idx_t> str, map<char, idx_t> txt) {
Collaborator


You are making copies of both maps here. Is that intended?

This function also appears to only be used in one place. Perhaps better to inline it?
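
To illustrate the point about copies: passing the maps by `const` reference avoids copying both arguments on every call. The sketch below is one way that could look. Only the signature of `TabulateCharacters` appears in the diff, so the function body, the name `TabulateCharactersByRef`, and the `idx_t` alias are assumptions.

```cpp
#include <cstdint>
#include <map>

// Stand-in for DuckDB's idx_t so the sketch compiles on its own.
using idx_t = std::uint64_t;

// Hypothetical body: the diff excerpt shows only the signature.
// Both parameters are const references, so neither argument is copied
// when the function is called.
static inline std::map<char, idx_t> TabulateCharactersByRef(const std::map<char, idx_t> &str,
                                                            const std::map<char, idx_t> &txt) {
    std::map<char, idx_t> result = str; // one deliberate copy: the result is a new map
    for (const auto &entry : txt) {
        // Adds the count from txt; inserts the character if it was missing
        result[entry.first] += entry.second;
    }
    return result;
}
```

The by-value signature in the diff would copy both maps at each call site before the function body runs. With references, the only copy left is the one that builds the result.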

Author


You are right!

@c5sire c5sire marked this pull request as draft March 2, 2021 13:11
@c5sire c5sire marked this pull request as ready for review March 2, 2021 15:47
Collaborator

Mytherin commented Mar 5, 2021

Thanks, this looks good now.

@Mytherin Mytherin merged commit 33600d9 into duckdb:master Mar 5, 2021