Extension endpoint, dealing with numbers #1078
Comments
Another example that ideally works like this:
I didn't yet have the time to look into this in more detail, but two thoughts upfront.
On Problem 1:
The workaround at the moment is to lowercase it yourself. I'm not sure if auto-lowercasing is a good idea. Let's discuss this when we have some time.
On Problem 2:
This is only a limitation of the
On Problem 1:
The contextionary itself (based on the training data) is all lowercase at the moment. That's why we currently require additions to be lowercased as well. If we magically lowercase all the user input, the user will never know that the meaning they added is lost. So for example, they might add
On Problem 2:
Makes sense, this is an easy fix, I'll prioritize this.
On Problem 1:
I'm a bit confused, because with the current behavior
Or am I missing something?
Correct, and I think that's good. If we don't error on (1) and instead lowercase the input, it would overwrite (0), but the user would be none the wiser.
As agreed on Hangouts, we're only focusing on Problem 2 for now. We're leaving Problem 1 as is - this means the user will have to add some
In the future we should look into what it would take to build a cased contextionary and accept both entries as individual extensions.
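For anyone landing here looking for the Problem 1 workaround: a minimal client-side sketch of "lowercase it yourself", assuming you control the code that builds the extension request. The helper name `normalizeConcept` and the sample input are made up for illustration and are not part of Weaviate. It also reports whether the casing changed, so the caller at least knows that information was dropped.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeConcept is a hypothetical client-side helper (not a Weaviate API).
// It lowercases the concept and reports whether the input had to be changed,
// so the caller can warn the user that the original casing is not preserved.
func normalizeConcept(concept string) (normalized string, changed bool) {
	normalized = strings.ToLower(concept)
	return normalized, normalized != concept
}

func main() {
	// Illustrative inputs only.
	for _, c := range []string{"Example42", "example42"} {
		n, changed := normalizeConcept(c)
		if changed {
			fmt.Printf("%q was lowercased to %q before sending\n", c, n)
		} else {
			fmt.Printf("%q can be sent unchanged\n", n)
		}
	}
}
```

Note that this does not solve the overwrite concern discussed above: two differently cased inputs still map to the same lowercase entry.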
closes #1078
Prior to this we used a very simple regex to validate. Now we instead accept any UTF-8 letter or digit. There were also no tests on this feature, so I've added a few simple test cases to cover the most common cases and make this a bit more robust.
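To illustrate the kind of check described here, a rough sketch of a letter-or-digit validator using Go's `unicode` package. This is not the code from the actual change: the function name `isValidConcept` and the choice to reject empty input are assumptions, and the real validator may allow other characters (for example separators in compound concepts) or enforce the lowercase requirement separately.

```go
package main

import (
	"fmt"
	"unicode"
)

// isValidConcept is a hypothetical validator, not Weaviate's implementation.
// It accepts a non-empty string in which every rune is a Unicode letter or digit.
func isValidConcept(concept string) bool {
	if concept == "" {
		return false // assumption: empty concepts are rejected
	}
	for _, r := range concept {
		// Invalid UTF-8 bytes decode to U+FFFD, which is neither a letter
		// nor a digit, so malformed input is rejected here as well.
		if !unicode.IsLetter(r) && !unicode.IsDigit(r) {
			return false
		}
	}
	return true
}

func main() {
	// Illustrative inputs only.
	for _, c := range []string{"example42", "héllo", "bad-concept", ""} {
		fmt.Printf("%q valid: %v\n", c, isValidConcept(c))
	}
}
```

Iterating over runes rather than matching an ASCII-only regex is what lets digits and non-ASCII letters through.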
…ndopint gh-1078 support numbers in c11y endpoint
Currently, we can't add extensions with numbers in them.
Current behavior: Add concept with capitals
results in:
Expected behavior: Weaviate handles capitalization automatically.
Current behavior: Add concept with lowercase letters
results in a
200 OK
Current behavior: Locate the new concept
results in:
Expected behavior: Correct representation for this word.
PS:
Source of the example: https://attack.mitre.org/groups/G0005/