'trainingJobName' failed to satisfy constraint #41

Closed
pm3310 opened this issue Jul 31, 2018 · 9 comments
Labels
bug Something isn't working

Comments

@pm3310
Contributor

pm3310 commented Jul 31, 2018

I get the following error with the latest version, but not with sagify version 0.10:

An error occurred (ValidationException) when calling the CreateTrainingJob operation: 1 validation error detected: Value 'XXXXXXXXXXXX.dkr.ecr.us-east-1.amazonaw-2018-07-31-16-42-36-397' at 'trainingJobName' failed to satisfy constraint: Member must satisfy regular expression pattern: ^[a-zA-Z0-9](-*[a-zA-Z0-9])*
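
For reference, the constraint in the error can be checked locally. This is a minimal sketch that assumes SageMaker applies the pattern to the whole name; `is_valid_job_name` is a made-up helper for illustration, not part of sagify or the SageMaker SDK:

```python
import re

# Pattern copied from the ValidationException message above.
JOB_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9])*")


def is_valid_job_name(name):
    """Return True if the whole name matches the pattern (full match is an assumption)."""
    return JOB_NAME_PATTERN.fullmatch(name) is not None


# The rejected name contains dots, which the pattern does not allow.
rejected = "XXXXXXXXXXXX.dkr.ecr.us-east-1.amazonaw-2018-07-31-16-42-36-397"
print(is_valid_job_name(rejected))                            # False
print(is_valid_job_name("my-image-2018-07-31-16-42-36-397"))  # True
```

Only letters, digits and hyphens pass, so a job name derived from a full ECR image URI fails on the dots alone.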
@andreas-grivas
Contributor

andreas-grivas commented Sep 21, 2018

I think this may be because, after the changes in 0.11.0, a tag is appended to image_name in two separate places.

The first time docker_tag is added is in sagify/api/cloud.py:

    image_name = config.image_name+':'+docker_tag

However, image_name is then passed as an argument to functions in sagify/sagemaker/sagemaker.py, which in turn call:

    def _construct_image_location(self, image_name):
        account = self.boto_session.client('sts').get_caller_identity()['Account']
        region = self.boto_session.region_name

        return '{account}.dkr.ecr.{region}.amazonaws.com/{image}:latest'.format(
            account=account,
            region=region,
            image=image_name
        )

Note that the return statement always appends ':latest', so I think you potentially end up with an image_name that has ':something:something' appended, and SageMaker doesn't like that.
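
To make the suspected double tag concrete, here is a standalone sketch using the same format string. The account ID and region are passed in as plain placeholder arguments instead of being looked up through boto, and `construct_image_location` is a free-function stand-in for the method quoted above, not the actual sagify code:

```python
def construct_image_location(account, region, image_name):
    # Same format string as _construct_image_location: ':latest' is always appended.
    return '{account}.dkr.ecr.{region}.amazonaws.com/{image}:latest'.format(
        account=account,
        region=region,
        image=image_name
    )


# By this point cloud.py has already appended ':' + docker_tag.
docker_tag = 'my-tag'  # hypothetical tag
image_name = 'name-img' + ':' + docker_tag

print(construct_image_location('123456789012', 'us-east-1', image_name))
# 123456789012.dkr.ecr.us-east-1.amazonaws.com/name-img:my-tag:latest
```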

I found these changes have solved this error for me for now:

https://github.com/andreas-grivas/sagify/commit/d23cccd19106c50f1f7a50e17cbf9e0aac636755

Until this is investigated further, a quick fix is to specify the name parameter in the call to sage.Model(), e.g. sage.Model(name='MyModel', restofparams), in the deploy function under sagify/sagemaker/sagemaker.py:

https://github.com/andreas-grivas/sagify/blob/d23cccd19106c50f1f7a50e17cbf9e0aac636755/sagify/sagemaker/sagemaker.py#L124

@pm3310
Contributor Author

pm3310 commented Sep 24, 2018

I think I found the issue. SageMaker doesn't like underscores (_) in the image names :-(
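
The pattern from the error message has no underscore in its character classes, so a name containing one would fail the same check. Below is a sketch of one possible workaround that replaces every disallowed character with a hyphen. `sanitize_name` is a hypothetical helper, not something sagify or the SageMaker SDK provides, and treating the pattern as a full match is an assumption:

```python
import re

# Pattern copied from the ValidationException message.
JOB_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9])*")


def sanitize_name(name):
    """Replace runs of characters outside [a-zA-Z0-9-] with a hyphen, then trim hyphens."""
    cleaned = re.sub(r"[^a-zA-Z0-9-]+", "-", name)
    return cleaned.strip("-")


name = "my_image_name"
print(JOB_NAME_PATTERN.fullmatch(name) is not None)   # False: underscores are rejected
fixed = sanitize_name(name)
print(fixed)                                          # my-image-name
print(JOB_NAME_PATTERN.fullmatch(fixed) is not None)  # True
```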

@andreas-grivas
Contributor

Does sagify add underscores anywhere though?

@ilazakis
Contributor

ilazakis commented Oct 6, 2018

@pm3310 It could be underscores too, but what @andreas-grivas suggests is definitely happening and needs fixing.

A colon is appended to the image name every time we call sagify cloud ..., regardless of whether a tag was provided or not.

SageMakerClient, however, also appends the latest tag every time.

In other words, if no tag is provided, we end up with an image name like this:

name-img::latest (note the double colon)

and when a tag e.g. 'my-tag' is indeed provided the image name looks like this:

name-img:my-tag:latest

The fix is straightforward I think. We need to check for the existence of a tag in the above linked lines and append the tag or latest respectively only if needed.
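
One way the check described above could look. This is only a sketch under stated assumptions, not the actual sagify code: `with_default_tag` is a hypothetical helper, the account and region are passed in as arguments rather than read from a boto session, and image_name is assumed to be a bare repository name with at most one ':tag' suffix:

```python
def with_default_tag(image_name, default_tag='latest'):
    """Append ':<default_tag>' only when image_name carries no tag of its own."""
    name, sep, tag = image_name.partition(':')
    if sep and tag:
        return image_name  # already tagged, leave untouched
    # No colon at all, or a trailing colon with an empty tag.
    return '{}:{}'.format(name, default_tag)


def construct_image_location(account, region, image_name):
    # The tag is resolved in exactly one place, so it cannot be appended twice.
    return '{account}.dkr.ecr.{region}.amazonaws.com/{image}'.format(
        account=account,
        region=region,
        image=with_default_tag(image_name)
    )


print(with_default_tag('name-img'))         # name-img:latest
print(with_default_tag('name-img:'))        # name-img:latest
print(with_default_tag('name-img:my-tag'))  # name-img:my-tag
```

Resolving the tag in a single helper means callers can pass either form of image_name without producing 'name-img::latest' or 'name-img:my-tag:latest'.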

The docker tag feature was implemented rather hastily. In addition to the above, no additional method doc entries were added where the docker_tag parameter was added; we can add them as part of fixing the main issue.

@ilazakis ilazakis mentioned this issue Nov 3, 2018
@ilazakis
Contributor

ilazakis commented Nov 5, 2018

With #44 merged, @pm3310 can you try to see if the errors you were getting persist?

@andreas-grivas You should not need to use your workaround anymore, let me know if that's not the case and I'll get to it 😄

@ilazakis ilazakis added the bug Something isn't working label Nov 5, 2018
@pm3310
Contributor Author

pm3310 commented Nov 5, 2018

New version deployed https://pypi.org/project/sagify/ :-D Version 0.12.1 is available via pypi

@pm3310
Contributor Author

pm3310 commented Nov 6, 2018

@andreas-grivas Would you like to become an active Sagify contributor? If so, I can give you contributor access permissions :-) That means you'll be able to deploy new versions and review PRs :-)

@pm3310 pm3310 closed this as completed Nov 8, 2018
@andreas-grivas
Contributor

andreas-grivas commented Nov 8, 2018

@pm3310 Thank you Pavlos, that would be super awesome but it is still uncertain if sagemaker fits our needs - so I don't know if I will be working with sagify a lot in the future. @ilazakis @pm3310 Thank you both for fixing this :)

@pm3310
Contributor Author

pm3310 commented Nov 8, 2018

Hey @andreas-grivas no worries :-) We're planning to launch an enterprise solution early next year that makes training/deploying ML models on SageMaker much easier. Of course, it's going to be based on this open-source project :-)

The benefits of the solution will be:

  • Hide engineering details of SageMaker and focus more on ML
  • Keep track of historical training jobs (who, what, where, when, which git commit)
  • Deploy ML models as REST endpoints in 1 click from UI

Please let me know if you'd like to be one of the early adopters.
