
[bug] huggingface-pytorch-inference container cannot initialize on AWS SageMaker #3606

Open
garrett-mesalabs opened this issue Jan 9, 2024 · 0 comments
Labels
bug Something isn't working

Comments

garrett-mesalabs commented Jan 9, 2024

Concise Description:

The image 763104351884.dkr.ecr.us-west-1.amazonaws.com/huggingface-pytorch-inference:1.13.1-transformers4.26.0-cpu-py39-ubuntu20.04 fails to start on AWS SageMaker with the following error:

Traceback (most recent call last):
  File "/usr/local/bin/deep_learning_container.py", line 22, in <module>
    import botocore.session
  File "/opt/conda/lib/python3.10/site-packages/botocore/session.py", line 25, in <module>
    import botocore.configloader
  File "/opt/conda/lib/python3.10/site-packages/botocore/configloader.py", line 19, in <module>
    from botocore.compat import six
  File "/opt/conda/lib/python3.10/site-packages/botocore/compat.py", line 25, in <module>
    from botocore.exceptions import MD5UnavailableError
  File "/opt/conda/lib/python3.10/site-packages/botocore/exceptions.py", line 15, in <module>
    from botocore.vendored.requests.exceptions import ConnectionError
  File "/opt/conda/lib/python3.10/site-packages/botocore/vendored/requests/__init__.py", line 58, in <module>
    from . import utils
  File "/opt/conda/lib/python3.10/site-packages/botocore/vendored/requests/utils.py", line 26, in <module>
    from .compat import parse_http_list as _parse_list_header
  File "/opt/conda/lib/python3.10/site-packages/botocore/vendored/requests/compat.py", line 7, in <module>
    from .packages import chardet
  File "/opt/conda/lib/python3.10/site-packages/botocore/vendored/requests/packages/__init__.py", line 3, in <module>
    from . import urllib3
  File "/opt/conda/lib/python3.10/site-packages/botocore/vendored/requests/packages/urllib3/__init__.py", line 10, in <module>
    from .connectionpool import (
  File "/opt/conda/lib/python3.10/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 38, in <module>
    from .response import HTTPResponse
  File "/opt/conda/lib/python3.10/site-packages/botocore/vendored/requests/packages/urllib3/response.py", line 9, in <module>
    from ._collections import HTTPHeaderDict
  File "/opt/conda/lib/python3.10/site-packages/botocore/vendored/requests/packages/urllib3/_collections.py", line 1, in <module>
    from collections import Mapping, MutableMapping
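The traceback breaks off at the import of `Mapping` and `MutableMapping` from `collections`. The interpreter paths above show Python 3.10, where those abstract base classes exist only in `collections.abc` (they were removed from `collections` itself in 3.10), so the vendored urllib3 inside botocore fails with an `ImportError` at startup. A minimal sketch of the incompatibility, assuming a Python ≥ 3.10 interpreter:

```python
import sys

# On Python <= 3.9 the old import still works (with a DeprecationWarning);
# on Python >= 3.10 it raises ImportError, which is what kills the container.
try:
    from collections import Mapping  # removed in Python 3.10
    source = "collections"
except ImportError:
    from collections.abc import Mapping  # the modern location
    source = "collections.abc"

print(f"Python {sys.version_info[0]}.{sys.version_info[1]}: Mapping from {source}")
```

This would also explain the mismatch visible in the report itself: the image tag says `py39`, but the traceback paths are under `/opt/conda/lib/python3.10/`, suggesting the conda environment inside the image is running a newer Python than the tag advertises.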

DLC image/dockerfile:

763104351884.dkr.ecr.us-west-1.amazonaws.com/huggingface-pytorch-inference:1.13.1-transformers4.26.0-cpu-py39-ubuntu20.04

Current behavior:

The container crashes during startup, before the inference server is reachable.

Expected behavior:

The container starts successfully.

Additional context:

Here is the CloudFormation resource for the hosted model:

  LlamaSageMakerModel:
    Type: AWS::SageMaker::Model
    Properties:
      PrimaryContainer:
        Image: '763104351884.dkr.ecr.us-west-1.amazonaws.com/huggingface-pytorch-inference:1.13.1-transformers4.26.0-cpu-py39-ubuntu20.04'
        Mode: SingleModel
        ModelDataUrl: !Sub s3://mybucket-ml-models/nsql-llama-2-7B.tar.gz
        Environment:
          {
            "HF_TASK":"text-generation"
          }
      ExecutionRoleArn: !GetAtt LlamaExecutionRole.Arn
      ModelName: llama-7b-sql-model
sirutBuasai added the bug label Jan 11, 2024