
SageMaker Core TrainingJob Incorrectly expects a resource Tag instead of shape Tag #243

Open
benieric opened this issue Feb 12, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@benieric
Collaborator

Describe the bug

When creating a TrainingJob and passing in a Tag shape, you get an error like:

ValidationError: 1 validation error for create
tags.0
  Input should be a valid dictionary or instance of Tag [type=model_type, input_value=Tag(key='key', value='value'), input_type=Tag]
    For further information visit https://errors.pydantic.dev/2.9/v/model_type

Example of failing code:

from sagemaker.modules import Session
from sagemaker_core.resources import TrainingJob
from sagemaker_core.shapes import AlgorithmSpecification, ResourceConfig, StoppingCondition, Tag, OutputDataConfig

import time

session = Session()
training_job_name = f"base-name-{int(time.time())}"
role = session.get_caller_identity_arn()
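output_bucket = session.default_bucket()
image = "<training-image-uri>"  # placeholder: not defined in the original report; substitute your own image URI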


training_job = TrainingJob.create(
    training_job_name=training_job_name,
    role_arn=role,
    algorithm_specification=AlgorithmSpecification(
        training_image=image,
        training_input_mode="File",
    ),
    resource_config=ResourceConfig(
        instance_count=1,
        instance_type="ml.m5.large",
        volume_size_in_gb=30,
    ),
    output_data_config=OutputDataConfig(
        s3_output_path=f"s3://{output_bucket}/output",
    ),
    stopping_condition=StoppingCondition(
        max_runtime_in_seconds=86400,
    ),
    tags=[Tag(key="key", value="value")],  # shape Tag; this is the argument that fails validation
)

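This happens because both the shapes module and the resources module define a class named Tag. TrainingJob.create validates tags against the Tag from the resources module, so the Tag imported from the shapes module is rejected even though it has the same name.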
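To illustrate why the message reads "instance of Tag ... input_type=Tag", here is a minimal standard-library sketch. It does not use sagemaker_core or pydantic: `make_tag_class`, `ShapeTag`, `ResourceTag`, and `validate_tag` are hypothetical names, and `validate_tag` is only a simplified stand-in for the dict-or-instance check described in the error.

```python
from dataclasses import dataclass


def make_tag_class(module_name: str) -> type:
    """Build a distinct class named 'Tag', as if it were defined in `module_name`."""

    @dataclass
    class Tag:
        key: str
        value: str

    Tag.__module__ = module_name
    Tag.__qualname__ = "Tag"
    return Tag


# Hypothetical stand-ins for the two same-named classes.
ShapeTag = make_tag_class("shapes")
ResourceTag = make_tag_class("resources")


def validate_tag(value, expected: type):
    """Simplified model-type check: accept a dict or an instance of `expected`."""
    if isinstance(value, expected):
        return value
    if isinstance(value, dict):
        return expected(**value)
    raise TypeError(
        f"Input should be a valid dictionary or instance of {expected.__name__} "
        f"[input_type={type(value).__name__}]"
    )


tag = ShapeTag(key="key", value="value")

# Both classes share the short name "Tag" but are different types.
print(ShapeTag.__name__ == ResourceTag.__name__)
print(isinstance(tag, ResourceTag))

try:
    validate_tag(tag, ResourceTag)
except TypeError as exc:
    # The message names "Tag" twice because only the short class name is shown.
    print(exc)
```

The point of the sketch is that the check compares class identity, not class name, so an object from one module never passes as the same-named class from another.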

To reproduce
A clear, step-by-step set of instructions to reproduce the bug.
The provided code needs to be complete and runnable; if additional data is needed, please include it in the issue.

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots or logs
If applicable, add screenshots or logs to help explain your problem.

Bug information
A description of your system. Please provide:

  • SageMaker Core version:
  • Python version:
  • Is the issue with autogen code or with generate code?:

Additional context
Add any other context about the problem here.

benieric added the bug label Feb 12, 2025
@MattFriedman

MattFriedman commented Mar 12, 2025

In ModelTrainer, pydantic demands one version of Tag, but when you call model_trainer.train() it expects a different version. So if you are using ModelTrainer and pass [Tag], it will break.

Maintainers need to pick a consistent version of Tag and stick with it.

There are at least 2 versions of Tag in the codebase.
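
Since the error message says a valid dictionary is also accepted, one possible workaround until the classes are unified might be to pass plain dicts instead of Tag instances. The sketch below is untested against sagemaker_core: `tags_as_dicts` is a hypothetical helper, `SimpleNamespace` stands in for a shape Tag, and the `key`/`value` field names are assumed from the repr shown in the error.

```python
from types import SimpleNamespace


def tags_as_dicts(tags):
    """Convert objects exposing `.key` and `.value` into plain dicts."""
    return [{"key": t.key, "value": t.value} for t in tags]


# SimpleNamespace stands in for a shape Tag instance in this sketch.
shape_tags = [SimpleNamespace(key="key", value="value")]
print(tags_as_dicts(shape_tags))
```

The result could then be tried as the `tags` argument in the failing call. Whether every code path, ModelTrainer included, accepts dicts is not confirmed here.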
