
Getting a FileNotFoundError: [Errno 2] No such file or directory: 'inference.py' #4007

Open
Nuwan1654 opened this issue Jul 19, 2023 · 8 comments

Comments

@Nuwan1654

Nuwan1654 commented Jul 19, 2023

Describe the bug
This is probably not a bug, but when I try to deploy a SageMaker PyTorch model, I get a FileNotFoundError: [Errno 2] No such file or directory: 'inference.py' error.

To reproduce

  1. Create the 'my_model.tar.gz' file according to the instructions on this page.
  2. Upload it to S3.
    My folder structure is as follows:
.
├── best.pt
├── coco.yaml
└── code
    ├── common.py
    ├── experimental.py
    ├── general.py
    ├── inference.py
    ├── loss.py
    ├── requirements.txt
    ├── torch_utils.py
    └── yolo.py

  3. Run predictor = pytorch_model.deploy(instance_type='ml.c4.xlarge', initial_instance_count=1)
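
For context, the pytorch_model object used in step 3 is presumably constructed along the following lines. This is only an illustrative sketch; the S3 URI, entry point name, and framework/Python versions below are assumptions, not values taken from the report:

import sagemaker
from sagemaker.pytorch import PyTorchModel

role = sagemaker.get_execution_role()

# Hypothetical construction of pytorch_model; the model_data URI and
# versions are placeholders, not values from the original report.
pytorch_model = PyTorchModel(
    entry_point='inference.py',
    role=role,
    model_data='s3://<bucket>/my_model.tar.gz',
    framework_version='2.0.1',
    py_version='py310',
)

With a bare entry_point='inference.py' and no source_dir, the SDK's repack step tries to copy the script from the local working directory (the shutil.copy2 call in the traceback below), so the file may need to exist locally as well as inside the tarball.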

Expected behavior
successful deploy

Screenshots or logs

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Cell In[14], line 1
----> 1 predictor = pytorch_model.deploy(instance_type='ml.c4.xlarge', initial_instance_count=1)

File ~/.local/lib/python3.8/site-packages/sagemaker/model.py:1260, in Model.deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, async_inference_config, serverless_inference_config, volume_size, model_data_download_timeout, container_startup_health_check_timeout, inference_recommendation_id, explainer_config, **kwargs)
   1257     if self._base_name is not None:
   1258         self._base_name = "-".join((self._base_name, compiled_model_suffix))
-> 1260 self._create_sagemaker_model(
   1261     instance_type, accelerator_type, tags, serverless_inference_config
   1262 )
   1264 serverless_inference_config_dict = (
   1265     serverless_inference_config._to_request_dict() if is_serverless else None
   1266 )
   1267 production_variant = sagemaker.production_variant(
   1268     self.name,
   1269     instance_type,
   (...)
   1275     container_startup_health_check_timeout=container_startup_health_check_timeout,
   1276 )

File ~/.local/lib/python3.8/site-packages/sagemaker/model.py:693, in Model._create_sagemaker_model(self, instance_type, accelerator_type, tags, serverless_inference_config)
    671 def _create_sagemaker_model(
    672     self, instance_type=None, accelerator_type=None, tags=None, serverless_inference_config=None
    673 ):
    674     """Create a SageMaker Model Entity
    675 
    676     Args:
   (...)
    691             not provided in serverless inference. So this is used to find image URIs.
    692     """
--> 693     container_def = self.prepare_container_def(
    694         instance_type,
    695         accelerator_type=accelerator_type,
    696         serverless_inference_config=serverless_inference_config,
    697     )
    699     if not isinstance(self.sagemaker_session, PipelineSession):
    700         # _base_name, model_name are not needed under PipelineSession.
    701         # the model_data may be Pipeline variable
    702         # which may break the _base_name generation
    703         self._ensure_base_name_if_needed(
    704             image_uri=container_def["Image"],
    705             script_uri=self.source_dir,
    706             model_uri=self.model_data,
    707         )

File ~/.local/lib/python3.8/site-packages/sagemaker/pytorch/model.py:298, in PyTorchModel.prepare_container_def(self, instance_type, accelerator_type, serverless_inference_config)
    290     deploy_image = self.serving_image_uri(
    291         region_name,
    292         instance_type,
    293         accelerator_type=accelerator_type,
    294         serverless_inference_config=serverless_inference_config,
    295     )
    297 deploy_key_prefix = model_code_key_prefix(self.key_prefix, self.name, deploy_image)
--> 298 self._upload_code(deploy_key_prefix, repack=self._is_mms_version())
    299 deploy_env = dict(self.env)
    300 deploy_env.update(self._script_mode_env_vars())

File ~/.local/lib/python3.8/site-packages/sagemaker/model.py:626, in Model._upload_code(self, key_prefix, repack)
    611     self.uploaded_code = fw_utils.UploadedCode(
    612         s3_prefix=repacked_model_data, script_name=os.path.basename(self.entry_point)
    613     )
    615 LOGGER.info(
    616     "Repacking model artifact (%s), script artifact "
    617     "(%s), and dependencies (%s) "
   (...)
    623     repacked_model_data,
    624 )
--> 626 utils.repack_model(
    627     inference_script=self.entry_point,
    628     source_directory=self.source_dir,
    629     dependencies=self.dependencies,
    630     model_uri=self.model_data,
    631     repacked_model_uri=repacked_model_data,
    632     sagemaker_session=self.sagemaker_session,
    633     kms_key=self.model_kms_key,
    634 )
    636 self.repacked_model_data = repacked_model_data

File ~/.local/lib/python3.8/site-packages/sagemaker/utils.py:516, in repack_model(inference_script, source_directory, dependencies, model_uri, repacked_model_uri, sagemaker_session, kms_key)
    513 with _tmpdir(directory=local_download_dir) as tmp:
    514     model_dir = _extract_model(model_uri, sagemaker_session, tmp)
--> 516     _create_or_update_code_dir(
    517         model_dir,
    518         inference_script,
    519         source_directory,
    520         dependencies,
    521         sagemaker_session,
    522         tmp,
    523     )
    525     tmp_model_path = os.path.join(tmp, "temp-model.tar.gz")
    526     with tarfile.open(tmp_model_path, mode="w:gz") as t:

File ~/.local/lib/python3.8/site-packages/sagemaker/utils.py:577, in _create_or_update_code_dir(model_dir, inference_script, source_directory, dependencies, sagemaker_session, tmp)
    575     os.mkdir(code_dir)
    576 try:
--> 577     shutil.copy2(inference_script, code_dir)
    578 except FileNotFoundError:
    579     if os.path.exists(os.path.join(code_dir, inference_script)):

File /usr/lib/python3.8/shutil.py:435, in copy2(src, dst, follow_symlinks)
    433 if os.path.isdir(dst):
    434     dst = os.path.join(dst, os.path.basename(src))
--> 435 copyfile(src, dst, follow_symlinks=follow_symlinks)
    436 copystat(src, dst, follow_symlinks=follow_symlinks)
    437 return dst

File /usr/lib/python3.8/shutil.py:264, in copyfile(src, dst, follow_symlinks)
    262     os.symlink(os.readlink(src), dst)
    263 else:
--> 264     with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
    265         # macOS
    266         if _HAS_FCOPYFILE:
    267             try:

FileNotFoundError: [Errno 2] No such file or directory: 'inference.py'

**System information**
A description of your system. Please provide:
- **SageMaker Python SDK version**: 2.155.0
- **Framework name**: PyTorch
- **Framework version**: 2.0.1+cu117
- **Python version**: 3.8.10
- **CPU or GPU**: GPU
- **Custom Docker image (Y/N)**: Y

Any help on this would be highly appreciated.
Nuwan1654 added the bug label Jul 19, 2023
@hiyamgh

hiyamgh commented Aug 21, 2023

Facing the same issue!

@hooNpk

hooNpk commented Oct 6, 2023

Has anyone solved this problem? I have the same issue.

@kenny-chen

I suspect you are running SageMaker from a local environment. The cause of the issue is the missing inference.py.

When pytorch_model.deploy() is executed, it starts a Docker serving image. Since I am using TensorFlow, in my case it launches the serving image 763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.11-gpu.

However, this image lacks "inference.py", resulting in a FileNotFoundError: [Errno 2] No such file or directory: 'inference.py' error.

Solution:

First, create ./code/inference.py.
Refer to the following link for the code: https://github.com/aws/sagemaker-tensorflow-serving-container/blob/master/test/resources/examples/test1/inference.py

import json
from collections import namedtuple

# Minimal stand-in for the context object the serving container passes to the handlers.
Context = namedtuple('Context',
                     'model_name, model_version, method, rest_uri, grpc_uri, '
                     'custom_attributes, request_content_type, accept_header')


def input_handler(data, context):
    # Pre-process the incoming request before it is sent to TensorFlow Serving.
    if context.request_content_type == 'application/json':
        d = data.read().decode('utf-8')
        return d if len(d) else ''

    if context.request_content_type == 'text/csv':
        return json.dumps({
            'instances': [float(x) for x in data.read().decode('utf-8').split(',')]
        })

    raise ValueError('{{"error": "unsupported content type {}"}}'.format(
        context.request_content_type or "unknown"))


def output_handler(data, context):
    # Post-process the TensorFlow Serving response before returning it to the client.
    if data.status_code != 200:
        raise ValueError(data.content.decode('utf-8'))

    response_content_type = context.accept_header
    prediction = data.content
    return prediction, response_content_type

Next, deploy the model using this inference script. The code below shows how to deploy the model locally with TensorFlow:

import os
from sagemaker.tensorflow import TensorFlowModel

model = TensorFlowModel(
    entry_point='inference.py',
    source_dir='./code',
    role=os.environ['AWS_ROLE'],
    model_data=f'{output}/model.tar.gz',
    framework_version='2.11'
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type='local_gpu',
)
  1. The ./code/inference.py will be automatically loaded into the server image.
  2. {output}/model.tar.gz will be automatically loaded as the local model.
  3. The model.deploy() command will successfully initiate the server image.

@nehyaeeg3

What's the solution?

@kitty2121

I think I've figured it out; you MUST create the app.py file within this directory:

In the file browser pane, browse to ./lab1/packages/{account_id}-lab1_code-1.0/src/

Some tutorials say: "In the file browser pane, browse to ./lab1/packages/{account_id}-lab1_code-1.0/. Find descriptor.json." But the correct tutorial (https://catalog.workshops.aws/panorama-immersion-day/en-US/20-lab1-object-detection/21-lab1) states to create and save the app.py file in ./lab1/packages/{account_id}-lab1_code-1.0/src/, i.e. the src folder in Lab 1.

Wrong version - https://explore.skillbuilder.aws/learn/course/17780/play/93251/aws-panorama-building-edge-computer-vision-cv-applications
Right version - https://catalog.workshops.aws/panorama-immersion-day/en-US/20-lab1-object-detection/21-lab1

@evankozliner

Also facing this issue. I think the docs on https://sagemaker.readthedocs.io/en/stable/frameworks/pytorch/using_pytorch.html#bring-your-own-model may be faulty as well?

@Itto1992

I also encountered the same problem.
I tried it from a notebook instance.
Here is my implementation:

import sagemaker
from sagemaker.pytorch import PyTorchModel
from sagemaker.serverless import ServerlessInferenceConfig

sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()

model = PyTorchModel(
    entry_point='inference.py',
    role=role,
    model_data='s3://***/model.tar.gz',
    framework_version='2.1',
    py_version='py310',
)

serverless_config = ServerlessInferenceConfig(
    max_concurrency=1,
    memory_size_in_mb=3072,
)

deploy_params = {
    'instance_type': 'ml.t3.medium',
    'initial_instance_count': 1,
    'serverless_inference_config': serverless_config,
}

predictor = model.deploy(**deploy_params)

@Itto1992

I solved this problem by changing how I create the tar.gz file.
The key point is to run the tar command in the same directory as the model file, like this:

$ tar czvf ../model.tar.gz *
code/
code/requirements.txt
code/inference.py
model.pth

It failed when I ran the tar command in the parent directory of the model file:

$ tar czvf model.tar.gz model
model/
model/model.pth
model/code/
model/code/requirements.txt
model/code/inference.py

As you can see, the two archives have different internal layouts. In the working one, code/inference.py and model.pth sit at the root of the archive; in the failing one, everything is nested under an extra model/ directory, so the entry point is not where the SDK expects it.
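
For reference, here is a minimal Python sketch that builds the archive with the working layout, assuming model.pth and the code/ directory from the listing above are in the current working directory:

import tarfile

# Package the files so they sit at the root of the archive
# (the layout that worked above), with no extra model/ prefix.
members = ['model.pth', 'code/requirements.txt', 'code/inference.py']

with tarfile.open('model.tar.gz', mode='w:gz') as tar:
    for path in members:
        tar.add(path)  # arcname defaults to the relative path

Either way, the important point is that code/inference.py ends up at the top level of the archive rather than under a nested directory.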
