Impossible to configure shm_size when launching a CommandJob with AzureML SDK v2 #6571
Labels: Auto-Assign, bug, customer-reported, CXP Attention, extension/ml, question
Describe the bug
I get a validation error when I try to launch a CommandJob with a custom shm_size set under resources.
Related command
az ml job create --file file.yaml
Errors
Configured default 't-bs-mf-explore-iris2-phd' for arg resource_group_name
Configured default 'aml-dcy-int-iris2-phd' for arg workspace_name
Met error <class 'marshmallow.exceptions.ValidationError'>:Validation for CommandJobSchema failed:
{
  "resources": {
    "shm_size": [
      "Unknown field."
    ]
  }
}
If you are trying to configure a job that is not of type command, please specify the correct job type in the 'type' property.
For a more detailed breakdown of the CommandJob schema, please see: https://aka.ms/ml-cli-v2-job-command-yaml-reference.
The easiest way to author a specification file is using IntelliSense and auto-completion Azure ML VS code extension provides: https://code.visualstudio.com/docs/datascience/azure-machine-learning
To set up: https://docs.microsoft.com/azure/machine-learning/how-to-setup-vs-code
Please check log in debug mode for more details.
Command ran in 2.107 seconds (init: 0.441, invoke: 1.666)
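For readers unfamiliar with the shape of this error: marshmallow rejects keys that a schema does not declare by default, and reports them as a nested dictionary that mirrors the input. The sketch below imitates that behaviour with the standard library only. It is not marshmallow's implementation and not the real CommandJobSchema. The helper `find_unknown_fields`, the `ALLOWED_JOB_FIELDS` table (deliberately incomplete), and the job values such as `my-env` and `my-cluster` are all hypothetical, chosen only to show how an undeclared `resources.shm_size` key would produce the message above.

```python
from typing import Any, Dict, List, Optional, Union

UNKNOWN = "Unknown field."

# None marks a leaf field; a dict lists the fields allowed one level down.
FieldSpec = Optional[Dict[str, Any]]
Errors = Dict[str, Union[List[str], "Errors"]]


def find_unknown_fields(data: Dict[str, Any], allowed: Dict[str, FieldSpec]) -> Errors:
    """Return a nested error dict for every key in `data` missing from `allowed`."""
    errors: Errors = {}
    for key, value in data.items():
        if key not in allowed:
            errors[key] = [UNKNOWN]
            continue
        nested = allowed[key]
        # Recurse only when both the spec and the value are mappings.
        if isinstance(nested, dict) and isinstance(value, dict):
            child = find_unknown_fields(value, nested)
            if child:
                errors[key] = child
    return errors


# Hypothetical and incomplete field table: an assumption for illustration,
# NOT the field list of the real CommandJobSchema.
ALLOWED_JOB_FIELDS: Dict[str, FieldSpec] = {
    "command": None,
    "environment": None,
    "compute": None,
    "resources": {"instance_count": None, "instance_type": None},
}

# Hypothetical job specification with shm_size under resources.
job_spec = {
    "command": "python train.py",
    "environment": "azureml:my-env@latest",
    "compute": "azureml:my-cluster",
    "resources": {"instance_count": 1, "shm_size": "16g"},
}

errors = find_unknown_fields(job_spec, ALLOWED_JOB_FIELDS)
print(errors)
```

Under these assumptions the printed dictionary has the same structure as the one in the traceback, which suggests the failure comes from the schema bundled with the extension not declaring `shm_size`, rather than from the value supplied.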
Issue script & Debug output
Traceback (most recent call last):
File "/opt/az/extensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/entities/_util.py", line 143, in load_from_dict
return schema(context=context).load(data, **kwargs)
File "/anaconda/envs/torch12/lib/python3.8/site-packages/marshmallow/schema.py", line 722, in load
return self._do_load(
File "/anaconda/envs/torch12/lib/python3.8/site-packages/marshmallow/schema.py", line 909, in _do_load
raise exc
marshmallow.exceptions.ValidationError: {'resources': {'shm_size': ['Unknown field.']}}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/az/extensions/ml/azext_mlv2/manual/custom/job.py", line 60, in ml_job_create
job = load_job(path=file, params_override=params_override)
File "/opt/az/extensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/entities/_load_functions.py", line 74, in load_job
return load_common(Job, path, **kwargs)
File "/opt/az/extensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/entities/_load_functions.py", line 59, in load_common
return cls._load(data=yaml_dict, yaml_path=path, params_override=params_override, **kwargs)
File "/opt/az/extensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/entities/_job/job.py", line 235, in _load
return job_type._load_from_dict(
File "/opt/az/extensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/entities/_job/command_job.py", line 166, in _load_from_dict
loaded_data = load_from_dict(CommandJobSchema, data, context, additional_message, **kwargs)
File "/opt/az/extensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/entities/_util.py", line 146, in load_from_dict
raise ValidationError(decorate_validation_error(schema, pretty_error, additional_message))
marshmallow.exceptions.ValidationError: Validation for CommandJobSchema failed:
{
  "resources": {
    "shm_size": [
      "Unknown field."
    ]
  }
}
If you are trying to configure a job that is not of type command, please specify the correct job type in the 'type' property.
For a more detailed breakdown of the CommandJob schema, please see: https://aka.ms/ml-cli-v2-job-command-yaml-reference.
The easiest way to author a specification file is using IntelliSense and auto-completion Azure ML VS code extension provides: https://code.visualstudio.com/docs/datascience/azure-machine-learning
To set up: https://docs.microsoft.com/azure/machine-learning/how-to-setup-vs-code
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/anaconda/envs/torch12/lib/python3.8/site-packages/knack/cli.py", line 233, in invoke
cmd_result = self.invocation.execute(args)
File "/anaconda/envs/torch12/lib/python3.8/site-packages/azure/cli/core/commands/__init__.py", line 663, in execute
raise ex
File "/anaconda/envs/torch12/lib/python3.8/site-packages/azure/cli/core/commands/__init__.py", line 726, in _run_jobs_serially
results.append(self._run_job(expanded_arg, cmd_copy))
File "/anaconda/envs/torch12/lib/python3.8/site-packages/azure/cli/core/commands/__init__.py", line 697, in _run_job
result = cmd_copy(params)
File "/anaconda/envs/torch12/lib/python3.8/site-packages/azure/cli/core/commands/__init__.py", line 333, in __call__
return self.handler(*args, **kwargs)
File "/anaconda/envs/torch12/lib/python3.8/site-packages/azure/cli/core/commands/command_operation.py", line 121, in handler
return op(**command_args)
File "/opt/az/extensions/ml/azext_mlv2/manual/custom/job.py", line 77, in ml_job_create
log_and_raise_error(err, debug)
File "/opt/az/extensions/ml/azext_mlv2/manual/custom/raise_error.py", line 117, in log_and_raise_error
raise cli_error
knack.util.CLIError: Met error <class 'marshmallow.exceptions.ValidationError'>:Validation for CommandJobSchema failed:
{
  "resources": {
    "shm_size": [
      "Unknown field."
    ]
  }
}
If you are trying to configure a job that is not of type command, please specify the correct job type in the 'type' property.
For a more detailed breakdown of the CommandJob schema, please see: https://aka.ms/ml-cli-v2-job-command-yaml-reference.
The easiest way to author a specification file is using IntelliSense and auto-completion Azure ML VS code extension provides: https://code.visualstudio.com/docs/datascience/azure-machine-learning
To set up: https://docs.microsoft.com/azure/machine-learning/how-to-setup-vs-code
Please check log in debug mode for more details.
cli.knack.cli: Event: Cli.PostExecute [<function AzCliLogging.deinit_cmd_metadata_logging at 0x7fd1e33cb040>]
az_command_data_logger: exit code: 1
cli.main: Command ran in 1.322 seconds (init: 0.568, invoke: 0.753)
telemetry.main: Begin splitting cli events and extra events, total events: 1
telemetry.client: Accumulated 0 events. Flush the clients.
telemetry.main: Finish splitting cli events and extra events, cli events: 1
telemetry.save: Save telemetry record of length 4340 in cache
telemetry.check: Returns Positive.
telemetry.main: Begin creating telemetry upload process.
telemetry.process: Creating upload process: "/anaconda/envs/torch12/bin/python /anaconda/envs/torch12/lib/python3.8/site-packages/azure/cli/telemetry/__init__.py /home/azureuser/.azure"
telemetry.process: Return from creating process
telemetry.main: Finish creating telemetry upload process.
Expected behavior
The job should be created and launched with the requested shm_size.
Environment Summary
azure-cli 2.50.0
core 2.50.0
telemetry 1.0.8
Extensions:
ml 2.5.0
Dependencies:
msal 1.22.0
azure-mgmt-resource 23.1.0b2
Python location '/anaconda/envs/torch12/bin/python'
Extensions directory '/opt/az/extensions'
Python (Linux) 3.8.15 (default, Nov 24 2022, 15:19:38)
[GCC 11.2.0]
Legal docs and information: aka.ms/AzureCliLegal
Additional context
I have removed personal arguments.