We've built a sagemaker.mxnet.MXNet model and managed to train it, but running inference with it has been a struggle.
The batch transform job keeps failing. This is the code that could be causing the error:
batch_output = 's3://{}/{}/{}'.format(bucket, prefix, 'batch-inference')
print(batch_output)  # -> s3://sagemaker-soln-pdm-js-km5c0kuw-956591629228-eu-west-2/pred-maintenance-artifacts/batch-inference
transformer = m.transformer(instance_count=1, instance_type='ml.m5.4xlarge', output_path=batch_output,
                            model_name=config.model_name)
print(transformer)  # -> <sagemaker.transformer.Transformer object at 0x7f9c4ea91ad0>
s3_test_key = "{}/battery_data/test-0.csv".format(prefix)
s3_transform_input = "{}/batch-transform-input".format(prefix)
job_name, input_key = utils.get_transform_input(bucket, config.solution_prefix, s3_test_key, s3_transform_input)
transformer.transform(input_key, wait=True)
At that last line, I get this:
................................................*
UnexpectedStatusException Traceback (most recent call last)
in
----> 1 transformer.transform(input_key, wait=True)
~/.local/lib/python3.7/site-packages/sagemaker/transformer.py in transform(self, data, data_type, content_type, compression_type, split_type, job_name, input_filter, output_filter, join_source, experiment_config, model_client_config, wait, logs)
218
219 if wait:
--> 220 self.latest_transform_job.wait(logs=logs)
221
222 def delete_model(self):
~/.local/lib/python3.7/site-packages/sagemaker/transformer.py in wait(self, logs)
394 self.sagemaker_session.logs_for_transform_job(self.job_name, wait=True)
395 else:
--> 396 self.sagemaker_session.wait_for_transform_job(self.job_name)
397
398 def stop(self):
~/.local/lib/python3.7/site-packages/sagemaker/session.py in wait_for_transform_job(self, job, poll)
2621 """
2622 desc = _wait_until(lambda: _transform_job_status(self.sagemaker_client, job), poll)
-> 2623 self._check_job_status(job, desc, "TransformJobStatus")
2624 return desc
2625
~/.local/lib/python3.7/site-packages/sagemaker/session.py in _check_job_status(self, job, desc, status_key_name)
2669 ),
2670 allowed_statuses=["Completed", "Stopped"],
-> 2671 actual_status=status,
2672 )
2673
UnexpectedStatusException: Error for Transform job mxnet-inference-2021-06-11-10-33-31-500: Failed. Reason: AlgorithmError: See job logs for more information
If I set wait=False, the call returns without an error, but I just get an empty output array. I'm not sure what to do here; happy to provide more info if needed.
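Since the exception only says "See job logs for more information", here is a minimal sketch of how those logs could be pulled programmatically. The helper name fetch_transform_job_logs and the LOG_GROUP constant are hypothetical names made up for illustration, and the log group path is an assumption based on the usual SageMaker convention for batch transform jobs. The two clients are passed in as arguments and are expected to be boto3 "sagemaker" and "logs" clients.

```python
# Assumed CloudWatch log group for SageMaker batch transform jobs.
LOG_GROUP = "/aws/sagemaker/TransformJobs"


def fetch_transform_job_logs(sm_client, logs_client, job_name, limit=50):
    """Return (status, failure_reason, log_lines) for a transform job.

    sm_client   -- a boto3 "sagemaker" client
    logs_client -- a boto3 "logs" (CloudWatch Logs) client
    """
    # Job status and the short failure reason reported by SageMaker.
    desc = sm_client.describe_transform_job(TransformJobName=job_name)
    status = desc["TransformJobStatus"]
    reason = desc.get("FailureReason")

    # Log streams for a job are assumed to be prefixed with the job name.
    streams = logs_client.describe_log_streams(
        logGroupName=LOG_GROUP, logStreamNamePrefix=job_name
    )["logStreams"]

    lines = []
    for stream in streams:
        events = logs_client.get_log_events(
            logGroupName=LOG_GROUP,
            logStreamName=stream["logStreamName"],
            startFromHead=True,
            limit=limit,
        )["events"]
        lines.extend(event["message"] for event in events)
    return status, reason, lines


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# status, reason, lines = fetch_transform_job_logs(
#     boto3.client("sagemaker"),
#     boto3.client("logs"),
#     "mxnet-inference-2021-06-11-10-33-31-500",
# )
# print(status, reason)
# print("\n".join(lines))
```

The container's own stack trace, if there is one, should show up in the returned log lines, which is usually more specific than the generic AlgorithmError above.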