IP Insights: "AttributeError: 'str' object has no attribute 'feature_dim'" #2915

@jonrau-lightspin

Describe the bug
When fitting input data to an IP Insights Estimator, fit() raises AttributeError: 'str' object has no attribute 'feature_dim', even though feature_dim does not appear in any IP Insights-related documentation.

To reproduce
The following package versions were installed as of 8 Feb 2022:

(.venv) user:~/environment/IpInsights $ pip3 list
Package             Version
------------------- ---------
attrs               21.4.0
boto3               1.20.50
botocore            1.23.50
dill                0.3.4
google-pasta        0.2.0
importlib-metadata  4.8.3
jmespath            0.10.0
multiprocess        0.70.12.2
numpy               1.19.5
packaging           21.3
pandas              1.1.5
pathos              0.2.8
pip                 21.3.1
pox                 0.3.0
ppft                1.6.6.4
protobuf            3.19.4
protobuf3-to-dict   0.1.5
pyarrow             6.0.1
pyparsing           3.0.7
python-dateutil     2.8.2
pytz                2021.3
s3transfer          0.5.1
sagemaker           2.75.1
setuptools          59.6.0
six                 1.16.0
smdebug-rulesconfig 1.0.1
typing_extensions   4.0.1
urllib3             1.26.8
wheel               0.37.1
zipp                3.6.0

The following code snippet reproduces the failure:

def ip_insights_endpoint_deployment():
    prepTraining = prepare_ip_insights_training_data()

    csvName = prepTraining[0]
    numEntityVectors = int(prepTraining[1])
    todaysDate = prepTraining[2]

    bucket = sagemaker.session.Session().default_bucket()
    inputs = sagemaker.s3.S3Uploader.upload(
        csvName,
        f's3://{bucket}/IPInsights/training-{todaysDate}'
    )

    jsonDeSerializer = sagemaker.deserializers.JSONDeserializer(accept='application/json')
    csvSerializer = sagemaker.serializers.CSVSerializer(content_type='text/csv')

    # Recommended vector_dim value is 128.
    ipi = sagemaker.IPInsights(
        role=sagemaker.get_execution_role(),
        instance_count=2,
        instance_type='ml.m5.xlarge',
        num_entity_vectors=numEntityVectors,
        vector_dim=128
    )
    # Fit the IP Insights Estimator with the input training data
    try:
        ipi.fit(inputs)
    except Exception as e:
        raise e
    # Deploy the Endpoint
    predictor = ipi.deploy(
        serializer=csvSerializer,
        deserializer=jsonDeSerializer
    )

Expected behavior
The Estimator should be fit and the training job should begin.

Screenshots or logs
Errors:

Traceback (most recent call last):
  File "tii.py", line 383, in <module>
    ip_insights_endpoint_deployment()
  File "tii.py", line 333, in ip_insights_endpoint_deployment
    raise e
  File "tii.py", line 330, in ip_insights_endpoint_deployment
    ipi.fit(inputs)
  File "/home/ubuntu/environment/IpInsights/.venv/lib/python3.6/site-packages/sagemaker/amazon/amazon_estimator.py", line 243, in fit
    self._prepare_for_training(records, job_name=job_name, mini_batch_size=mini_batch_size)
  File "/home/ubuntu/environment/IpInsights/.venv/lib/python3.6/site-packages/sagemaker/amazon/ipinsights.py", line 175, in _prepare_for_training
    records, mini_batch_size=mini_batch_size, job_name=job_name
  File "/home/ubuntu/environment/IpInsights/.venv/lib/python3.6/site-packages/sagemaker/amazon/amazon_estimator.py", line 190, in _prepare_for_training
    feature_dim = records.feature_dim
AttributeError: 'str' object has no attribute 'feature_dim'
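
The last frame of the traceback reads `records.feature_dim`, so the failure appears to come from the value passed to `fit()` being a plain string (the S3 URI returned by the upload) rather than an object that carries a `feature_dim` attribute. The sketch below reproduces only that mechanism without the SDK. `FakeRecordSet`, `prepare_for_training`, and `try_prepare` are hypothetical stand-ins written for illustration, not SDK classes or functions, and the bucket name and `feature_dim` value are placeholders.

```python
class FakeRecordSet:
    """Hypothetical stand-in for an input object that exposes feature_dim."""

    def __init__(self, s3_data, feature_dim):
        self.s3_data = s3_data
        self.feature_dim = feature_dim


def prepare_for_training(records):
    # Mirrors the failing line from the traceback:
    #     feature_dim = records.feature_dim
    return records.feature_dim


def try_prepare(records):
    """Return feature_dim, or the AttributeError message if the lookup fails."""
    try:
        return prepare_for_training(records)
    except AttributeError as err:
        return str(err)


# Placeholder S3 URI; a plain str has no feature_dim attribute.
uri = "s3://example-bucket/IPInsights/training"

str_result = try_prepare(uri)
obj_result = try_prepare(FakeRecordSet(uri, feature_dim=2))

print(str_result)
print(obj_result)
```

If this reading is right, the error message concerns the type of the `fit()` argument and not an IP Insights hyperparameter, which would explain why `feature_dim` is absent from the algorithm documentation.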

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.75.1
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): IP Insights
  • Framework version: Default (1?)
  • Python version: 3.6
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): N

Additional context
N/A
