Allow token weighting #10

Closed
3 tasks
lewinfox opened this issue May 11, 2023 · 0 comments · Fixed by #12

Comments

@lewinfox
Owner

Add the ability to supply pre-computed token weights to functions.

levitate::lev_token_set_ratio("ltd bill", "ltd b")
#> 0.625

levitate::lev_ratio("bill", "b")
#> 0.25

levitate::lev_ratio("ltd bill", "ltd b")
#> 0.625
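
For reference, the two lev_ratio() values above can be reproduced in base R. This is only a sketch: `ratio_sketch()` is a hypothetical stand-in, and it assumes the ratio is 1 minus the edit distance divided by the length of the longer string, which matches the outputs shown but is not taken from the package source.

```r
# Hypothetical stand-in for lev_ratio(), built on base R's utils::adist().
# Assumption: ratio = 1 - (edit distance / length of the longer string).
ratio_sketch <- function(a, b) {
  d <- drop(utils::adist(a, b))
  1 - d / max(nchar(a), nchar(b))
}

ratio_sketch("bill", "b")
#> 0.25

ratio_sketch("ltd bill", "ltd b")
#> 0.625
```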

Now suppose we had lev_ratio("ltd bill", "ltd b", weights = list(ltd = 0.1)). This would compute the ratio as (0.1*3 + 1)/(0.1*3 + 3) = 1.3/3.3 ≈ 0.39. Similarly, lev_ratio("ltd bill", "ltd b", weights = list(ltd = 0.01)) would give (0.01*3 + 1)/(0.01*3 + 3) = 1.03/3.03 ≈ 0.34.
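
The arithmetic can be checked with a few lines of R. `weighted_ratio_example()` is a hypothetical helper that evaluates the expression exactly as written above; the constants 3 and 1 are copied from that expression, not derived from the package.

```r
# Hypothetical helper: evaluates (w * 3 + 1) / (w * 3 + 3) as written in this
# issue, where w is the weight assigned to the "ltd" token.
weighted_ratio_example <- function(w) {
  (w * 3 + 1) / (w * 3 + 3)
}

round(weighted_ratio_example(0.1), 2)
#> 0.39

round(weighted_ratio_example(0.01), 2)
#> 0.34
```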

Will require:

  • Tokenising all input, not just lev_token_*() functions
  • Constructing a weight matrix to match the output from stringdist functions
  • Multiplying the weights by the scores (matmul weights * scores)
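
A rough sketch of how these three steps might fit together, using base R only. Everything here is an assumption for illustration: `weighted_token_scores()` is a hypothetical name, utils::adist() stands in for the stringdist functions, unlisted tokens default to a weight of 1, pair weights are formed as the product of the two token weights, and the final step uses an element-wise product rather than a true matrix multiplication.

```r
# Hypothetical helper; not part of levitate's API.
weighted_token_scores <- function(a, b, weights = list()) {
  # Step 1: tokenise both inputs on whitespace.
  tokens_a <- strsplit(a, "\\s+")[[1]]
  tokens_b <- strsplit(b, "\\s+")[[1]]

  # Look up a weight per token; tokens without an entry default to 1.
  lookup <- function(tokens) {
    vapply(
      tokens,
      function(tok) if (is.null(weights[[tok]])) 1 else weights[[tok]],
      numeric(1),
      USE.NAMES = FALSE
    )
  }

  # Step 2: pairwise token distances (adist() as a stand-in for stringdist),
  # plus a weight matrix of the same shape (assumed: product of token weights).
  scores <- utils::adist(tokens_a, tokens_b)
  dimnames(scores) <- list(tokens_a, tokens_b)
  weight_matrix <- outer(lookup(tokens_a), lookup(tokens_b))

  # Step 3: combine weights and scores element-wise.
  weight_matrix * scores
}

weighted_token_scores("ltd bill", "ltd b", weights = list(ltd = 0.1))
```

How the per-pair weights are combined (product, minimum, or something else) and how the weighted matrix is reduced to a single ratio are left open here, since the issue does not specify them.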
@lewinfox lewinfox added the enhancement New feature or request label May 11, 2023
@lewinfox lewinfox self-assigned this May 11, 2023