AttributeError: 'Model' object has no attribute '__framework_name__' #1201

cfregly · 2019-12-27T19:03:23Z

Please fill out the form below.

System Information

Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): TensorFlow
Framework Version: 2.0.0
Python Version: Python3
CPU or GPU: CPU
Python SDK Version: 1.49.0
StepFunctions SDK: 1.0.0.3
Are you using a custom image: No

Describe the problem

Describe the problem or feature request clearly here.

Here is the error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-42-23cc1e7f4991> in <module>()
      3     role=workflow_execution_role,
      4     inputs=training_data_uri,
----> 5     s3_bucket=model_output_path
      6 )

~/anaconda3/envs/python3/lib/python3.6/site-packages/stepfunctions/template/pipeline/train.py in __init__(self, estimator, role, inputs, s3_bucket, client, **kwargs)
     64             self.pipeline_name = 'training-pipeline-{date}'.format(date=self._generate_timestamp())
     65 
---> 66         self.definition = self.build_workflow_definition()
     67         self.input_template = self._extract_input_template(self.definition)
     68 

~/anaconda3/envs/python3/lib/python3.6/site-packages/stepfunctions/template/pipeline/train.py in build_workflow_definition(self)
     95             instance_type=train_instance_type,
     96             model=model,
---> 97             model_name=default_name
     98         )
     99 

~/anaconda3/envs/python3/lib/python3.6/site-packages/stepfunctions/steps/sagemaker.py in __init__(self, state_id, model, model_name, instance_type, **kwargs)
    171         """
    172         if isinstance(model, FrameworkModel):
--> 173             parameters = model_config(model=model, instance_type=instance_type, role=model.role, image=model.image)
    174             if model_name:
    175                 parameters['ModelName'] = model_name

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/workflow/airflow.py in model_config(instance_type, model, role, image)
    577 
    578     if isinstance(model, sagemaker.model.FrameworkModel):
--> 579         container_def = prepare_framework_container_def(model, instance_type, s3_operations)
    580     else:
    581         container_def = model.prepare_container_def(instance_type)

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/workflow/airflow.py in prepare_framework_container_def(model, instance_type, s3_operations)
    519         deploy_image = fw_utils.create_image_uri(
    520             region_name,
--> 521             model.__framework_name__,
    522             instance_type,
    523             model.framework_version,

AttributeError: 'Model' object has no attribute '__framework_name__'

We are doing this through the stepfunctions SDK with TrainingPipeline

PyTorch is OK.

pytorch/model.py

tensorflow/model.py

Minimal repro / logs

Please provide any logs and a bare minimum reproducible test case, as this will be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

Exact command to reproduce:

...
pipeline = TrainingPipeline(
    estimator=tensorflow_mnist_estimator,
    role=workflow_execution_role,
    inputs=training_data_uri,
    s3_bucket=model_output_path
)
...

Here is the link to the successful PyTorch version: https://github.com/awslabs/amazon-sagemaker-examples/blob/346811d27ba9e1cd6a3db94a90f584b951f7aa33/step-functions-data-science-sdk/training_pipeline_pytorch_mnist/training_pipeline_pytorch_mnist.ipynb

There is no equivalent example using TensorFlow.

The text was updated successfully, but these errors were encountered:

cfregly · 2019-12-27T19:16:58Z

The error is here:

sagemaker-python-sdk/src/sagemaker/workflow/airflow.py

Line 521 in 042c3a0

model.__framework_name__,

cfregly · 2019-12-27T19:23:44Z

Further debugging indicates that the model instance is not being created properly. Many attributes are None including entry_point, framework_version, etc

nadiaya · 2020-01-03T18:04:58Z

Could you provide a more detailed code sample you have been running?

PyTorch and TensorFlow support implementation differ (as TF has script mode and framework mode for older versions as well as TFS for inference). For TF 2.0 serving you need to use a different Model

sagemaker-python-sdk/src/sagemaker/tensorflow/serving.py

Line 122 in 042c3a0

class Model(sagemaker.model.FrameworkModel):

cfregly · 2020-01-04T17:19:25Z

Below is a more-complete code sample. Can you provide more details on how to use the different Model you mention above?

Note: I'm using the TrainingPipeline class from the step-functions SDK. Does the step-functions SDK implementation need to change to use this new model?

I'm following this example...

https://github.com/awslabs/amazon-sagemaker-examples/blob/346811d27ba9e1cd6a3db94a90f584b951f7aa33/step-functions-data-science-sdk/training_pipeline_pytorch_mnist/training_pipeline_pytorch_mnist.ipynb

... and adapting to TensorFlow as seen below.

from sagemaker.tensorflow import TensorFlow

mnist_estimator = TensorFlow(entry_point='mnist_keras_tf2.py',
                             source_dir='./src',
                             role=sagemaker_execution_role,
                             framework_version='2.0.0',
                             train_instance_count=1,
                             train_instance_type='ml.p3.2xlarge',
                             py_version='py3',
                             distributions={'parameter_server': {'enabled': True}})

pipeline = TrainingPipeline(
    estimator=tensorflow_mnist_estimator,
    role=workflow_execution_role,
    inputs=training_data_uri,
    s3_bucket=model_output_path
)

cfregly · 2020-01-07T18:00:55Z

any update?

laurenyu · 2020-01-08T22:23:18Z

Sorry for the delay in response. I've reached out to the team who originally authored the notebook to see if they have any insight. Thanks for your patience!

cfregly · 2020-01-08T22:47:03Z

Note: The original notebook is working OK because it’s using PyTorch. I’m trying to use TensorFlow.

shunjd · 2020-01-09T19:33:59Z

I think the issue here is that TensforFlow returns a different model object.

The TrainingPipeline is expecting to have model.TensorFlowModel, but instead, the actual model is serving.Model.

sagemaker-python-sdk/src/sagemaker/tensorflow/model.py

Line 51 in bf48fb1

class TensorFlowModel(FrameworkModel):

sagemaker-python-sdk/src/sagemaker/tensorflow/serving.py

Line 122 in bf48fb1

class Model(sagemaker.model.FrameworkModel):

With some code analysis, I figure out that script_mode is always on when python version is set to 'py3' which makes the create_model call always return serving.Model.

sagemaker-python-sdk/src/sagemaker/tensorflow/estimator.py

Lines 256 to 257 in bf48fb1

    
                       script_mode (bool): If set to True will the estimator will use the Script Mode 
        
                           containers (default: False). This will be ignored if py_version is set to 'py3'.

sagemaker-python-sdk/src/sagemaker/tensorflow/estimator.py

Line 572 in 80e017a

if endpoint_type == "tensorflow-serving" or self._script_mode_enabled():

@laurenyu Could you provide some insights on why does Tensorflow have two different models?

laurenyu · 2020-01-09T19:46:41Z

legacy reasons - there were two different families of pre-built images for supporting TFS. sagemaker.tensorflow.serving.Model is the one we recommend/actively support at this point.

shunjd · 2020-01-14T19:23:08Z

I dig into the code a little bit, i think we need a fix in the sagemaker.workflow.airflow package.

The following code can reproduce the error easily.

from sagemaker.tensorflow import TensorFlow
from sagemaker.workflow.airflow import model_config, training_config

sagemaker_execution_role = 'SM_EXECUTION_ROLE'

mnist_estimator = TensorFlow(entry_point='mnist_keras_tf2.py',
                             source_dir='./scripts',
                             role=sagemaker_execution_role,
                             framework_version='2.0.0',
                             train_instance_count=1,
                             train_instance_type='ml.p3.2xlarge',
                             py_version='py3',
                             distributions={'parameter_server': {'enabled': True}})

train = training_config(mnist_estimator))
model = mnist_estimator.create_model()

print(model_config('ml.p3.2xlarge', model)) # <-- Will throw the exception

The issue is related to the following code blocks.

In Airflow package, the method requires three mandatory attributes(model.__framework_name__, model.framework_version, model.py_version) which are missing in the serving.Model.

sagemaker-python-sdk/src/sagemaker/workflow/airflow.py

Lines 517 to 525 in bf48fb1

    
           if not deploy_image: 
        
               region_name = model.sagemaker_session.boto_session.region_name 
        
               deploy_image = fw_utils.create_image_uri( 
        
                   region_name, 
        
                   model.__framework_name__, 
        
                   instance_type, 
        
                   model.framework_version, 
        
                   model.py_version, 
        
               )

Instead, the serving.Model has its own logic to create the model image. I believe the behavior should be consistent across all the Models.

sagemaker-python-sdk/src/sagemaker/tensorflow/serving.py

Lines 261 to 277 in bf48fb1

    
               def _get_image_uri(self, instance_type, accelerator_type=None): 
        
                   """ 
        
                   Args: 
        
                       instance_type: 
        
                       accelerator_type: 
        
                   """ 
        
                   if self.image: 
        
                       return self.image 
        
                   region_name = self.sagemaker_session.boto_region_name 
        
                   return create_image_uri( 
        
                       region_name, 
        
                       Model.FRAMEWORK_NAME, 
        
                       instance_type, 
        
                       self._framework_version, 
        
                       accelerator_type=accelerator_type, 
        
                   )

cfregly · 2020-01-14T23:34:50Z

Any update on this? @shunjd

And should this issue should be labeled a "bug" not "question"? Because of this issue, TrainingPipeline is not working for TensorFlow. This seems like a bug.

Thanks!

laurenyu · 2020-01-14T23:39:16Z

thanks for looking into this, @shunjd!

yes, you're right that it's a bug. we'll work on getting a fix out.

ajaykarpur · 2020-01-18T01:06:45Z

Fixed in https://github.com/aws/sagemaker-python-sdk/releases/tag/v1.50.5.

shunjd · 2020-01-20T18:37:29Z

@laurenyu @ajaykarpur Thank you for the fix, but the problem still exists.

The problem is that when Tensorflow.create_model() is being called, the entry_point from the estimator is not being passed into the model. PyTorch does not have this problem if you compare the code below.

https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/pytorch/estimator.py#L164-L169

https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/estimator.py#L572-L579

ajaykarpur · 2020-01-20T21:28:47Z

@shunjd I've just merged another fix for the issue you identified (#1252). It will be released later today.

cfregly mentioned this issue Jan 4, 2020

TensorFlow TrainingPipeline Example not working properly aws/amazon-sagemaker-examples#970

Closed

laurenyu added the type: question label Jan 8, 2020

shunjd mentioned this issue Jan 10, 2020

Support Tensorflow with the new sagemaker.tensorflow.serving.Model aws/aws-step-functions-data-science-sdk-python#17

Closed

laurenyu added bug and removed type: question labels Jan 14, 2020

ajaykarpur mentioned this issue Jan 17, 2020

fix: Use serving_image_uri for Airflow #1245

Merged

7 tasks

ajaykarpur closed this as completed Jan 18, 2020

ajaykarpur mentioned this issue Jan 20, 2020

fix: do not use script for TFS when entry_point is not provided #1252

Merged

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AttributeError: 'Model' object has no attribute '__framework_name__' #1201

AttributeError: 'Model' object has no attribute '__framework_name__' #1201

cfregly commented Dec 27, 2019 •

edited

cfregly commented Dec 27, 2019

cfregly commented Dec 27, 2019

nadiaya commented Jan 3, 2020

cfregly commented Jan 4, 2020

cfregly commented Jan 7, 2020

laurenyu commented Jan 8, 2020

cfregly commented Jan 8, 2020

shunjd commented Jan 9, 2020

laurenyu commented Jan 9, 2020

shunjd commented Jan 14, 2020

cfregly commented Jan 14, 2020 •

edited

laurenyu commented Jan 14, 2020

ajaykarpur commented Jan 18, 2020

shunjd commented Jan 20, 2020

ajaykarpur commented Jan 20, 2020 •

edited

AttributeError: 'Model' object has no attribute '__framework_name__' #1201

AttributeError: 'Model' object has no attribute '__framework_name__' #1201

Comments

cfregly commented Dec 27, 2019 • edited

System Information

Describe the problem

Minimal repro / logs

cfregly commented Dec 27, 2019

cfregly commented Dec 27, 2019

nadiaya commented Jan 3, 2020

cfregly commented Jan 4, 2020

cfregly commented Jan 7, 2020

laurenyu commented Jan 8, 2020

cfregly commented Jan 8, 2020

shunjd commented Jan 9, 2020

laurenyu commented Jan 9, 2020

shunjd commented Jan 14, 2020

cfregly commented Jan 14, 2020 • edited

laurenyu commented Jan 14, 2020

ajaykarpur commented Jan 18, 2020

shunjd commented Jan 20, 2020

ajaykarpur commented Jan 20, 2020 • edited

cfregly commented Dec 27, 2019 •

edited

cfregly commented Jan 14, 2020 •

edited

ajaykarpur commented Jan 20, 2020 •

edited