Merge all code from dev into the main branch. #9
This adds more definite types for our NLP tasks and tokenization configurations. It is the first step toward letting users more easily import their own transformer models from sources other than Hugging Face. Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
Adds support for `question_answering` NLP models within the pytorch model uploader. Related: elastic/elasticsearch#85958 Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
This improves the user-facing functions and classes for uploading PyTorch NLP models to Elasticsearch. Previously it was difficult to wrap your own module for upload. This commit splits some classes out, adds new ones, and adds tests showing how to wrap some simple modules. Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
…es (#465) This switches our sklearn.DecisionTreeClassifier serialization logic to account for multi-valued leaves in the tree. The key difference between our inference and DecisionTreeClassifier is that we run a softmax over the leaf values, whereas sklearn simply normalizes them. This means that the "probabilities" we return will differ from sklearn's. Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
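To illustrate why the returned probabilities differ, here is a minimal sketch that compares plain normalization with a softmax over the same leaf. The function names and the leaf values are hypothetical and chosen only for illustration; this is not the serialization code from the commit.

```python
import math


def normalize(leaf_values):
    """Divide each class value by the leaf total (plain normalization)."""
    total = sum(leaf_values)
    return [value / total for value in leaf_values]


def softmax(leaf_values):
    """Exponentiate, then normalize; the max is subtracted for numerical stability."""
    largest = max(leaf_values)
    exps = [math.exp(value - largest) for value in leaf_values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical multi-valued leaf holding one value per class.
leaf = [0.5, 0.3, 0.2]

normalized = normalize(leaf)  # what plain normalization would report
softmaxed = softmax(leaf)     # what a softmax over the same leaf would report
```

Both results sum to 1 and rank the classes in the same order, because softmax is monotonic. The predicted class is therefore unchanged, but the probability values themselves are not comparable between the two approaches.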
Co-authored-by: Seth Michael Larson <seth.larson@elastic.co> Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
… NLP task type from model config (#475) For many model types, we don't need to require that the task be specified. We can infer the task type from the model configuration and architecture. This commit makes the `task-type` parameter optional for the model upload script and adds logic for auto-detecting the task type based on the 🤗 model. Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
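One way such auto-detection could work is sketched below, assuming the model configuration is available as a dictionary with an `architectures` list, as found in a Hugging Face `config.json`. The function name, the suffix table, and every task name other than `question_answering` are hypothetical and are not taken from the commit.

```python
from typing import Optional

# Hypothetical table mapping an architecture-name suffix to a task type.
ARCHITECTURE_SUFFIX_TO_TASK = {
    "ForQuestionAnswering": "question_answering",
    "ForTokenClassification": "ner",
    "ForSequenceClassification": "text_classification",
    "ForMaskedLM": "fill_mask",
}


def infer_task_type(model_config: dict, requested: Optional[str] = None) -> str:
    """Return the requested task type, or infer one from the architecture names."""
    if requested is not None:
        # An explicit task type always takes precedence over detection.
        return requested
    for architecture in model_config.get("architectures", []):
        for suffix, task in ARCHITECTURE_SUFFIX_TO_TASK.items():
            if architecture.endswith(suffix):
                return task
    raise ValueError("Could not infer the task type; please pass it explicitly.")


# Usage: the task type is omitted and inferred from the configuration.
detected = infer_task_type({"architectures": ["BertForQuestionAnswering"]})
```

Raising an error when nothing matches keeps the optional parameter safe: a model with an unrecognized architecture still requires the user to name the task.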
…n elastic vs open search Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
…arning Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
Several files contain "elastic", like this one:
* Added support for XGBoost 1.6 (`#458`_)
* Added support for ``question_answering`` NLP tasks (`#457`_)

.. _#457: https://github.com/elastic/eland/pull/457
Should we keep this changelog file? Most of its content seems to be about eland.
@@ -57,3 +57,100 @@ If you discover a potential security issue in this project we ask that you notif
## Licensing

See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.
**Repository:** <https://github.com/elastic/eland>
Could this part be updated to follow https://github.com/opensearch-project/opensearch-py/blob/main/CONTRIBUTING.md?
- Support for Docker
- Support for continuous integration
- Regenerating Sphinx docs
- Creating tutorials for `opensearch-py-ml` in both notebook and video form
How about following the README template of https://github.com/opensearch-project/opensearch-py and adding "Code of Conduct", "License", and "Copyright" sections?
Yeah, I was planning to add those in the next PR.
@@ -0,0 +1,2319 @@
{
I'm not sure how a community user would use this demo notebook. Does it depend on a dataset? How should the environment be prepared to run it? Explaining this in the README or another doc would help.
@@ -0,0 +1,1213 @@
{
The file name is not intuitive: we have "demo.ipynb" and this "demo_notebook.ipynb". How about renaming it to reflect what it is meant to demonstrate?
I'm planning to remove both of those, in fact, and then add a detailed README to follow.
Approved now; you can address the comments in the next PR.
Description
Merge all code from dev into the main branch.
Issues Resolved
Merge all code from dev into the main branch.
--signoff::dhrubo@amazon.com
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.