JumpStart CuratedHub Launch #4748

Merged
merged 19 commits into from
Jun 21, 2024

Conversation

malav-shastri
Collaborator

@malav-shastri malav-shastri commented Jun 20, 2024

Issue #, if available:
CuratedHub launch
Description of changes:

Testing done:

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

  • I have read the CONTRIBUTING doc
  • I certify that the changes I am introducing will be backward compatible, and I have discussed concerns about this, if any, with the Python SDK team
  • I used the commit message format described in CONTRIBUTING
  • I have passed the region in to all S3 and STS clients that I've initialized as part of this change.
  • I have updated any necessary documentation, including READMEs and API docs (if appropriate)

Tests

  • I have added tests that prove my fix is effective or that my feature works (if appropriate)
  • I have added unit and/or integration tests as appropriate to ensure backward compatibility of the changes
  • I have checked that my tests are not configured for a specific region or account (if appropriate)
  • I have used unique_name_from_base to create resource names in integ tests (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

malav-shastri and others added 4 commits June 20, 2024 12:30
* Implement CuratedHub Admin APIs

* making some parameters optional in create_hub_content_reference as per the API design

* add describe_hub and list_hubs APIs

* implement delete_hub API

* Implement list_hub_contents API

* create CuratedHub class and supported utils

* implement list_models and address comments

* Add unit tests

* add describe_model function

* cache retrieval for describeHubContent changes

* fix curated hub class unit tests

* add utils needed for curatedHub

* Cache retrieval

* implement get_hub_model_reference()

* cleanup HUB type datatype

* cleanup constants

* rename list_public_models to list_jumpstart_service_hub_models

* implement describe_model_reference

* Rename CuratedHub to Hub

* address nit

* address nits and fix failing tests

---------

Co-authored-by: Malav Shastri <malavhs@amazon.com>
…umpStart public hub models (aws#1456)

* Implement CuratedHub Admin APIs

* making some parameters optional in create_hub_content_reference as per the API design

* add describe_hub and list_hubs APIs

* implement delete_hub API

* Implement list_hub_contents API

* create CuratedHub class and supported utils

* implement list_models and address comments

* Add unit tests

* add describe_model function

* cache retrieval for describeHubContent changes

* fix curated hub class unit tests

* add utils needed for curatedHub

* Cache retrieval

* implement get_hub_model_reference()

* cleanup HUB type datatype

* cleanup constants

* rename list_public_models to list_jumpstart_service_hub_models

* implement describe_model_reference

* Rename CuratedHub to Hub

* address nit

* address nits and fix failing tests

* implement list_jumpstart_service_hub_models function

---------

Co-authored-by: Malav Shastri <malavhs@amazon.com>
* get_model_spec() changes to support hub_arn and hub_content_type

* implement get_hub_model_reference()

* support hub_arn and hub_content_type for specs retrieval

* add support for hub_arn and hub_content_type for serializers, deserializers, estimators, models, predictors and various spec retrieval functionalities

* address nits and test failures

* remove hub_content_type support

---------

Co-authored-by: Malav Shastri <malavhs@amazon.com>
* implement HubContentDocument parser

* modify the parser to remove aliases for hubcontent documents

* bug fix

* update boto3

* Bug Fix in the parser

* Improve Hub Class and related functionalities

* Bug Fix and parser updates

* add missing hub_arn support

* Add model reference deployment support and other minor bug fixes

* fix: retrieve correct image_uri (parser update)

* fix: retrieve correct model URI and model data path from HubContentDocument (parser update)

* Add model reference deployment support

* Model accessor and cache retrival bug fixes

* fix: curated hub model training workflow

* fix: pass sagemaker sessions object to retrieve model specs from describe_hub_content call

* fix: fix payload retrieval for curated hub models

* modify constants, enums

* fix: update parser

* Address nits in the parser

* Add unit tests for parser

* implement pagination for list_models utility

* feat: support wildcard chars for model versions

* Address nits and comments

* Add Hub Content Arn Tag to training and hosting

* Add Hub Content Arn Tag to training and hosting

* fix: HubContentDocument schema version

* fix broken unit tests

* fix prepare_container_def unit tests to include ModelReferenceArn

* fix unit tests for test_session.py

* revert boto version changes

* Fix unit tests

* support wildcard model versions for training workflow

* Add test cases for get_model_versions

* Add/fix unit tests

---------

Co-authored-by: Malav Shastri <malavhs@amazon.com>
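The commit list above mentions pagination for the list_models utility without showing how it works. As a rough sketch only, assuming the utility wraps the SageMaker `list_hub_contents` call and follows its `NextToken` field, the loop could look like the following. `iter_hub_contents` and `FakeSageMakerClient` are hypothetical names introduced for illustration and are not taken from this PR.

```python
from typing import Any, Dict, Iterator, Optional


def iter_hub_contents(client: Any, hub_name: str, content_type: str) -> Iterator[Dict[str, Any]]:
    """Yield every hub content summary, following NextToken until it runs out."""
    next_token: Optional[str] = None
    while True:
        request = {"HubName": hub_name, "HubContentType": content_type}
        if next_token:
            request["NextToken"] = next_token
        response = client.list_hub_contents(**request)
        yield from response.get("HubContentSummaries", [])
        next_token = response.get("NextToken")
        if not next_token:
            break


class FakeSageMakerClient:
    """Hypothetical stand-in for a boto3 SageMaker client that serves two pages."""

    def __init__(self) -> None:
        self.calls = 0

    def list_hub_contents(self, **kwargs: Any) -> Dict[str, Any]:
        self.calls += 1
        if "NextToken" not in kwargs:
            # First page: one summary plus a token pointing at the next page.
            return {
                "HubContentSummaries": [{"HubContentName": "model-a"}],
                "NextToken": "page-2",
            }
        # Last page: no NextToken, so the caller stops.
        return {"HubContentSummaries": [{"HubContentName": "model-b"}]}
```

With a real client, `client` would be whatever `boto3.client("sagemaker")` returns; the fake is only there so the loop can be exercised without AWS credentials.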
@benieric
Contributor

For this one, check the failing unit and codestyle-doc tests. The failing integ tests are unrelated to this change (we are fixing them in our infra account).

Contributor

@AWS-pratab AWS-pratab left a comment


Let's address these nit comments in the next or a later release.

    )
    model_specs.set_hub_content_type(HubContentType.MODEL)
    return model_specs
except:  # noqa: E722
Contributor


nit: We can catch the specific exception here.

Contributor


nit: We can check model-ref first, assuming there are more chances Hub has more refs than models.
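The two nits above (catch a specific exception, and try the model reference first) can be sketched together. This is a minimal illustration under stated assumptions, not the SDK's code: `HubContentNotFound`, `resolve_specs`, and `fake_fetch` are hypothetical names, and the local `HubContentType` enum only mirrors the identifier seen in the diff.

```python
from enum import Enum
from typing import Callable, Dict, Optional, Sequence


class HubContentType(str, Enum):
    # Local stand-in for the enum referenced in the diff (values assumed).
    MODEL = "Model"
    MODEL_REFERENCE = "ModelReference"


class HubContentNotFound(Exception):
    """Hypothetical error raised when the hub has no content of the requested type."""


def resolve_specs(
    fetch: Callable[[str, HubContentType], Dict[str, str]],
    model_id: str,
    order: Sequence[HubContentType] = (HubContentType.MODEL_REFERENCE, HubContentType.MODEL),
) -> Dict[str, str]:
    """Try each content type in order; only a not-found error triggers the fallback."""
    last_error: Optional[Exception] = None
    for content_type in order:
        try:
            specs = fetch(model_id, content_type)
        except HubContentNotFound as error:
            # Narrow except: any other exception propagates to the caller.
            last_error = error
            continue
        return {**specs, "hub_content_type": content_type.value}
    raise HubContentNotFound(f"{model_id} not found in hub") from last_error


def fake_fetch(model_id: str, content_type: HubContentType) -> Dict[str, str]:
    """Hypothetical hub that stores full models only, never references."""
    if content_type is HubContentType.MODEL:
        return {"model_id": model_id}
    raise HubContentNotFound(model_id)
```

Because only `HubContentNotFound` is caught, an unrelated failure such as a permissions or throttling error would surface immediately instead of being silently retried as the other content type, which is the risk a bare `except:` carries.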

@@ -489,6 +549,9 @@ def _add_tags_to_kwargs(kwargs: JumpStartModelDeployKwargs) -> Dict[str, Any]:
        kwargs.tags, kwargs.model_id, full_model_version, kwargs.model_type
    )

    if kwargs.hub_arn:
        kwargs.tags = add_hub_content_arn_tags(kwargs.tags, kwargs.hub_arn)
Contributor


This appears to pass hub_arn rather than hub_content_arn to the method.

This can be fixed after the initial release, since revenue tracking only needs the key part.
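To make the distinction in this comment concrete, here is a small sketch of a guard that accepts only a hub content ARN. It assumes that a hub ARN's resource part starts with `hub/` while a hub content ARN's starts with `hub-content/`, and that the tag key is `sagemaker-sdk:hub-content-arn`; both are assumptions, and `is_hub_content_arn` and `tag_with_hub_content_arn` are hypothetical helpers, not the SDK's `add_hub_content_arn_tags`.

```python
from typing import Dict, List

# Assumed tag key; the review only says that the key is what revenue tracking needs.
HUB_CONTENT_ARN_TAG_KEY = "sagemaker-sdk:hub-content-arn"


def is_hub_content_arn(arn: str) -> bool:
    """Return True when the ARN's resource part starts with 'hub-content/'."""
    # ARN layout: arn:partition:service:region:account:resource
    parts = arn.split(":", 5)
    return len(parts) == 6 and parts[5].startswith("hub-content/")


def tag_with_hub_content_arn(tags: List[Dict[str, str]], arn: str) -> List[Dict[str, str]]:
    """Return a new tag list with the hub content ARN tag appended."""
    if not is_hub_content_arn(arn):
        raise ValueError(f"expected a hub content ARN, got: {arn}")
    return [*tags, {"Key": HUB_CONTENT_ARN_TAG_KEY, "Value": arn}]
```

A check like this would reject a plain hub ARN at tagging time, so the value recorded under the tag key always identifies a specific piece of hub content.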

@@ -10,6 +10,7 @@
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.
# pylint: skip-file
Contributor


Let's make sure we follow up on fixing these lint errors instead of relying on pylint: skip-file.
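As an illustration of the follow-up suggested here, a file-level `# pylint: skip-file` can usually be replaced by disabling one named message where it is actually needed. The function below is a hypothetical example, not code from this PR; `unused-argument` is a standard pylint message name.

```python
from typing import Any, Dict


def on_event(event: Dict[str, Any], context: Any) -> str:  # pylint: disable=unused-argument
    """Hypothetical callback whose signature is fixed by its caller.

    `context` is required by the caller but unused here, so only that one
    message is silenced instead of skipping every check in the file.
    """
    return str(event.get("name", "unknown"))
```

Scoped disables keep the rest of the module under lint, so new issues are still reported while the known exceptions stay documented next to the code they apply to.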

@benieric
Contributor

The failing integ tests are not an issue: I saw them failing earlier, we need to run the cleanup scripts, and they should pass during release. Everything else looks good.

@benieric benieric merged commit 3f9acac into aws:master Jun 21, 2024
10 of 11 checks passed
selvask-aws pushed a commit to selvask-aws/sagemaker-python-sdk that referenced this pull request Jun 21, 2024
* Implement CuratedHub APIs (aws#1449)

* Implement CuratedHub Admin APIs

* making some parameters optional in create_hub_content_reference as per the API design

* add describe_hub and list_hubs APIs

* implement delete_hub API

* Implement list_hub_contents API

* create CuratedHub class and supported utils

* implement list_models and address comments

* Add unit tests

* add describe_model function

* cache retrieval for describeHubContent changes

* fix curated hub class unit tests

* add utils needed for curatedHub

* Cache retrieval

* implement get_hub_model_reference()

* cleanup HUB type datatype

* cleanup constants

* rename list_public_models to list_jumpstart_service_hub_models

* implement describe_model_reference

* Rename CuratedHub to Hub

* address nit

* address nits and fix failing tests

---------

Co-authored-by: Malav Shastri <malavhs@amazon.com>

* feat: implement list_jumpstart_service_hub_models function to fetch JumpStart public hub models (aws#1456)

* Implement CuratedHub Admin APIs

* making some parameters optional in create_hub_content_reference as per the API design

* add describe_hub and list_hubs APIs

* implement delete_hub API

* Implement list_hub_contents API

* create CuratedHub class and supported utils

* implement list_models and address comments

* Add unit tests

* add describe_model function

* cache retrieval for describeHubContent changes

* fix curated hub class unit tests

* add utils needed for curatedHub

* Cache retrieval

* implement get_hub_model_reference()

* cleanup HUB type datatype

* cleanup constants

* rename list_public_models to list_jumpstart_service_hub_models

* implement describe_model_reference

* Rename CuratedHub to Hub

* address nit

* address nits and fix failing tests

* implement list_jumpstart_service_hub_models function

---------

Co-authored-by: Malav Shastri <malavhs@amazon.com>

* Feat/Curated Hub hub_arn and hub_content_type support (aws#1453)

* get_model_spec() changes to support hub_arn and hub_content_type

* implement get_hub_model_reference()

* support hub_arn and hub_content_type for specs retrieval

* add support for hub_arn and hub_content_type for serializers, deserializers, estimators, models, predictors and various spec retrieval functionalities

* address nits and test failures

* remove hub_content_type support

---------

Co-authored-by: Malav Shastri <malavhs@amazon.com>

* feat: implement curated hub parser and bug bash fixes (aws#1457)

* implement HubContentDocument parser

* modify the parser to remove aliases for hubcontent documents

* bug fix

* update boto3

* Bug Fix in the parser

* Improve Hub Class and related functionalities

* Bug Fix and parser updates

* add missing hub_arn support

* Add model reference deployment support and other minor bug fixes

* fix: retrieve correct image_uri (parser update)

* fix: retrieve correct model URI and model data path from HubContentDocument (parser update)

* Add model reference deployment support

* Model accessor and cache retrival bug fixes

* fix: curated hub model training workflow

* fix: pass sagemaker sessions object to retrieve model specs from describe_hub_content call

* fix: fix payload retrieval for curated hub models

* modify constants, enums

* fix: update parser

* Address nits in the parser

* Add unit tests for parser

* implement pagination for list_models utility

* feat: support wildcard chars for model versions

* Address nits and comments

* Add Hub Content Arn Tag to training and hosting

* Add Hub Content Arn Tag to training and hosting

* fix: HubContentDocument schema version

* fix broken unit tests

* fix prepare_container_def unit tests to include ModelReferenceArn

* fix unit tests for test_session.py

* revert boto version changes

* Fix unit tests

* support wildcard model versions for training workflow

* Add test cases for get_model_versions

* Add/fix unit tests

---------

Co-authored-by: Malav Shastri <malavhs@amazon.com>

* address unit tests failures in codebuild

* change list_jumpstart_service_hub_models to list_sagemaker_public_hub_models()

* fix: Changing list input output shapes

* fix: gated model training bug

* run black -l 100

* flake 8

* address formatting issues

* black -l

* DocStyle issues

* address flake8, pylint

* blake -l

* pass model type down

* disabling pylint for release

* disable pylint

---------

Co-authored-by: Malav Shastri <malavhs@amazon.com>
Co-authored-by: chrstfu <chrstfu@amazon.com>
Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>
jiapinw pushed a commit to jiapinw/sagemaker-python-sdk that referenced this pull request Jun 25, 2024

4 participants