I am trying to run ResNet50 inference with PyTorch on our cluster. I launched a container on one compute node (which has 8 PVC cards) and ran quickstart/image_recognition/pytorch/resnet50v1_5/inference/gpu/inference_block_format.sh.
These are the parameters I set:
export DATASET_DIR=/mnt/daos/datasets_shared/imagenet
export OUTPUT_DIR=/mnt/daos/scratch_guob/resnet50/inference/logs
export BATCH_SIZE=1024
export NUM_ITERATIONS=2
export Tile=1
And this is the error I got:
resnet50 int8 inference block
oneccl_bindings_for_pytorch not available!
Use XPU: 0
=> using pre-trained model 'resnet50'
/nfs/home/guob/.local/lib/python3.9/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/nfs/home/guob/.local/lib/python3.9/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=ResNet50_Weights.IMAGENET1K_V1. You can also use weights=ResNet50_Weights.DEFAULT to get the most up-to-date weights.
  warnings.warn(msg)
Traceback (most recent call last):
  File "/applications.devops.montecristo.onboarding/example_resnet50_inference_pt/code/AImodels/models/image_recognition/pytorch/resnet50v1_5/inference/gpu/main.py", line 1116, in <module>
    main()
  File "/applications.devops.montecristo.onboarding/example_resnet50_inference_pt/code/AImodels/models/image_recognition/pytorch/resnet50v1_5/inference/gpu/main.py", line 276, in main
    main_worker(ngpus_per_node, args)
  File "/applications.devops.montecristo.onboarding/example_resnet50_inference_pt/code/AImodels/models/image_recognition/pytorch/resnet50v1_5/inference/gpu/main.py", line 416, in main_worker
    torch.xpu.set_device(args.xpu)
  File "/nfs/home/guob/.local/lib/python3.9/site-packages/intel_extension_for_pytorch/xpu/__init__.py", line 159, in set_device
    intel_extension_for_pytorch._C._setDevice(device)
AttributeError: module 'intel_extension_for_pytorch._C' has no attribute '_setDevice'
awk: cmd. line:1: fatal: division by zero attempted
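The traceback shows torchvision and intel_extension_for_pytorch being loaded from my user site directory (/nfs/home/guob/.local) rather than from inside the container. I have not confirmed that this is the cause, but one thing that may be worth checking is which copies and versions of these packages Python actually resolves. Below is a minimal diagnostic sketch for that; the helper names `describe` and `report` are made up for illustration, and I am assuming the installed distribution names match the import names.

```python
import importlib.metadata as md
import importlib.util
import site


def describe(dist_name, module_name):
    """Return (version, location) for a package without importing it.

    Either element is None when the package cannot be found.
    """
    try:
        version = md.version(dist_name)
    except md.PackageNotFoundError:
        version = None
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        spec = None
    location = spec.origin if spec is not None else None
    return version, location


def report(names=("torch", "torchvision", "intel_extension_for_pytorch")):
    """Print the version and file location Python resolves for each name."""
    # Assumption: distribution name and import name are identical for these.
    print("user site enabled:", site.ENABLE_USER_SITE)
    print("user site path:   ", site.getusersitepackages())
    for name in names:
        version, location = describe(name, name)
        print(f"{name}: version={version} location={location}")


if __name__ == "__main__":
    report()
```

If the locations point into ~/.local while the container ships its own builds, running the same check with the environment variable PYTHONNOUSERSITE=1 set should show whether the user-site copies are shadowing the container's packages.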
Could someone tell me what's wrong and how to solve the problem? Thanks!