ResNet50 inference: fail to set device #148

mguo2021 · 2023-08-04T01:08:26Z

I am trying to run ResNet50 inference with PyTorch on our cluster. I launched a container on one compute node (it has 8 PVC cards) and ran quickstart/image_recognition/pytorch/resnet50v1_5/inference/gpu/inference_block_format.sh.

These are the parameters I set
export DATASET_DIR=/mnt/daos/datasets_shared/imagenet
export OUTPUT_DIR=/mnt/daos/scratch_guob/resnet50/inference/logs
export BATCH_SIZE=1024
export NUM_ITERATIONS=2
export Tile=1

And this is the error I got
resnet50 int8 inference block
oneccl_bindings_for_pytorch not available!
Use XPU: 0
=> using pre-trained model 'resnet50'
/nfs/home/guob/.local/lib/python3.9/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/nfs/home/guob/.local/lib/python3.9/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=ResNet50_Weights.IMAGENET1K_V1. You can also use weights=ResNet50_Weights.DEFAULT to get the most up-to-date weights.
  warnings.warn(msg)
Traceback (most recent call last):
  File "/applications.devops.montecristo.onboarding/example_resnet50_inference_pt/code/AImodels/models/image_recognition/pytorch/resnet50v1_5/inference/gpu/main.py", line 1116, in <module>
    main()
  File "/applications.devops.montecristo.onboarding/example_resnet50_inference_pt/code/AImodels/models/image_recognition/pytorch/resnet50v1_5/inference/gpu/main.py", line 276, in main
    main_worker(ngpus_per_node, args)
  File "/applications.devops.montecristo.onboarding/example_resnet50_inference_pt/code/AImodels/models/image_recognition/pytorch/resnet50v1_5/inference/gpu/main.py", line 416, in main_worker
    torch.xpu.set_device(args.xpu)
  File "/nfs/home/guob/.local/lib/python3.9/site-packages/intel_extension_for_pytorch/xpu/__init__.py", line 159, in set_device
    intel_extension_for_pytorch._C._setDevice(device)
AttributeError: module 'intel_extension_for_pytorch._C' has no attribute '_setDevice'
awk: cmd. line:1: fatal: division by zero attempted

Could someone tell me what's wrong and how to solve the problem? Thanks!


mguo2021 · 2023-08-04T19:03:44Z

I fixed it by using
python -m pip install intel_extension_for_pytorch==1.10.200+gpu -f https://developer.intel.com/ipex-whl-stable-xpu
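
The thread does not say which intel_extension_for_pytorch build was installed before the fix. For anyone hitting the same AttributeError, the sketch below is one way to see which wheels the Python environment actually resolves. It assumes the build variant is visible as the local version tag after the "+" (as in 1.10.200+gpu above); whether a non-GPU wheel caused the original failure is a guess and is not confirmed in this issue. The helper names local_build_tag and installed_version are hypothetical, introduced only for this example.

```python
from importlib import metadata
from typing import Optional


def local_build_tag(version: str) -> Optional[str]:
    """Return the local build tag after '+', or None if there is none.

    Example: '1.10.200+gpu' has the tag 'gpu'; '1.10.200' has no tag.
    """
    _, sep, tag = version.partition("+")
    return tag if sep else None


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report the versions pip installed, without importing the packages,
    # so the check still works when the extension itself fails to load.
    for name in ("torch", "intel_extension_for_pytorch"):
        version = installed_version(name)
        if version is None:
            print(f"{name}: not installed")
        else:
            print(f"{name}: {version} (build tag: {local_build_tag(version)})")
```

Run it with the same interpreter the quickstart script uses. Packages under ~/.local, as in the traceback above, can take precedence over those installed in the container image.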

mguo2021 closed this as completed Aug 4, 2023
