Information-Gain & TF-IDF support in Contextual classification #1125
Comments
This was referenced Apr 20, 2020
etiennedi added a commit that referenced this issue on Apr 20, 2020
etiennedi added a commit that referenced this issue on Apr 21, 2020
A prepare run is a run done before the individual per-item classifications to provide additional context. As part of this prepare run we plan to fetch information that stays valid for the entire run, such as target vectors. This is also the place to calculate tf-idf scores over all documents.
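To illustrate why tf-idf belongs in a prepare run, here is a minimal sketch of scoring every term once over the whole corpus so that the per-item classifications can reuse the result. It is not the implementation referenced by this issue: the function name `tfidf_scores`, the token-list input, and the plain `tf * log(N / df)` weighting are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return one {term: score} dict per document (hypothetical helper).

    docs is a list of token lists. tf is the relative frequency of a term
    within its document; idf is log(N / df), where df is the number of
    documents that contain the term. This weighting is an assumption.
    """
    n_docs = len(docs)
    # Document frequency: count each term at most once per document.
    doc_freq = Counter(term for doc in docs for term in set(doc))

    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Computed once up front, then looked up during each per-item classification.
corpus = [
    ["fast", "car", "engine"],
    ["fast", "food", "burger"],
    ["car", "engine", "repair"],
]
prepared = tfidf_scores(corpus)
```

Because the document frequencies depend on all documents, they cannot be computed while looking at a single item, which is why a step like this would sit in the prepare run rather than in the per-item loop.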
etiennedi added a commit that referenced this issue on Apr 22, 2020
etiennedi added a commit that referenced this issue on Apr 22, 2020
etiennedi added a commit that referenced this issue on Apr 22, 2020
This should be an actual batch method in the c11y. The current approach is just a temporary shortcut to move a bit faster. It is very problematic, however, as we still have a lot of request overhead and no clean way to determine word-not-found errors at the moment. This will be addressed before completing #1125.
etiennedi added a commit that referenced this issue on Apr 22, 2020
etiennedi added a commit that referenced this issue on Apr 23, 2020
NOTE: The parameters are currently still hard-coded, as we haven't added validation or default settings for them yet.
etiennedi added a commit that referenced this issue on Apr 23, 2020
NOTE: The parameters are all still hard-coded atm.
etiennedi added a commit that referenced this issue on Apr 23, 2020
etiennedi added a commit that referenced this issue on Apr 25, 2020
etiennedi added a commit that referenced this issue on Apr 27, 2020
This is a rather elaborate fake for "only" a unit test, but it provides us with a lot of confidence that the elaborate implementation for the contextual classification is working on a unit test level as well and not just on an e2e level.
etiennedi added a commit that referenced this issue on Apr 29, 2020
etiennedi added a commit that referenced this issue on Apr 29, 2020
etiennedi added a commit that referenced this issue on Apr 29, 2020
This was referenced Apr 29, 2020
This is the implementation for the POC findings of #1118, originally proposed in #1115.
Todos
MultiVectorForWord in c11y client
MultiVectorForWord in c11y (Provide MutliVectorForWord method contextionary#29)