This repository has been archived by the owner on Jul 17, 2023. It is now read-only.
I'm trying to run a Python file on DevCloud. The job script, job.sh, is as follows:
#!/bin/bash
source /opt/intel/inteloneapi/setvars.sh > /dev/null 2>&1
python master.py
I am submitting it with the following command from a Mac terminal:
qsub -l nodes=1:xeon:batch:ppn=2 -d . job.sh
The job ran for about 3 hours and produced two output files: job.sh.e934264 and job.sh.o934264.
The contents of job.sh.e934264 are as follows:
2021-07-26 03:49:45.014693: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /glob/development-tools/versions/oneapi/2021.3/inteloneapi/vpl/2021.4.0/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/tbb/2021.3.0/env/../lib/intel64/gcc4.8:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/rkcommon/1.6.1/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/ospray_studio/0.7.0/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/ospray/2.6.0/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/openvkl/0.13.0/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/oidn/1.4.0/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/mpi/2021.2.0//libfabric/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/mpi/2021.2.0//lib/release:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/mpi/2021.2.0//lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/mkl/2021.3.0/lib/intel64:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/itac/2021.3.0/slib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/ipp/2021.3.0/lib/intel64:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/ippcp/2021.3.0/lib/intel64:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/ipp/2021.3.0/lib/intel64:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/embree/3.13.0/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/dnnl/2021.3.0/cpu_dpcpp_gpu_dpcpp/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/debugger/10.1.2/gdb/intel64/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/debugger/10.1.2/libipt/intel64/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/debugger/10.1.2/dep/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/dal/2021.3.0/lib/intel64:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/compiler/2021.3.0/linux/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/compiler/2021.3.0/linux/lib/x64:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/compiler/2021.3.0/linux/lib/emu:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/compiler/2021.3.0/linux/lib/oclfpga/host/linux64/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/compiler/2021.3.0/linux/lib/oclfpga/linux64/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/compiler/2021.3.0/linux/compiler/lib/intel64_lin:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/ccl/2021.3.0/lib/cpu_gpu_dpcpp
2021-07-26 03:49:45.014777: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-07-26 03:49:50.062319: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /glob/development-tools/versions/oneapi/2021.3/inteloneapi/vpl/2021.4.0/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/tbb/2021.3.0/env/../lib/intel64/gcc4.8:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/rkcommon/1.6.1/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/ospray_studio/0.7.0/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/ospray/2.6.0/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/openvkl/0.13.0/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/oidn/1.4.0/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/mpi/2021.2.0//libfabric/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/mpi/2021.2.0//lib/release:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/mpi/2021.2.0//lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/mkl/2021.3.0/lib/intel64:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/itac/2021.3.0/slib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/ipp/2021.3.0/lib/intel64:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/ippcp/2021.3.0/lib/intel64:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/ipp/2021.3.0/lib/intel64:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/embree/3.13.0/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/dnnl/2021.3.0/cpu_dpcpp_gpu_dpcpp/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/debugger/10.1.2/gdb/intel64/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/debugger/10.1.2/libipt/intel64/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/debugger/10.1.2/dep/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/dal/2021.3.0/lib/intel64:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/compiler/2021.3.0/linux/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/compiler/2021.3.0/linux/lib/x64:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/compiler/2021.3.0/linux/lib/emu:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/compiler/2021.3.0/linux/lib/oclfpga/host/linux64/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/compiler/2021.3.0/linux/lib/oclfpga/linux64/lib:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/compiler/2021.3.0/linux/compiler/lib/intel64_lin:/glob/development-tools/versions/oneapi/2021.3/inteloneapi/ccl/2021.3.0/lib/cpu_gpu_dpcpp
2021-07-26 03:49:50.062403: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)
2021-07-26 03:49:50.062449: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (s001-n061): /proc/driver/nvidia/version does not exist
2021-07-26 03:49:50.062948: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-07-26 03:52:31.660446: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-07-26 03:52:31.679568: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 3400000000 Hz
/var/spool/torque/mom_priv/jobs/934264.v-qsvr-1.aidevcloud.SC: line 4: 110188 Killed python master.py
`
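Most of the log above is CUDA library warnings, which TensorFlow itself says to ignore on a machine without a GPU, so the one line that is not a TensorFlow message is easy to miss. Below is a minimal sketch of how that line could be pulled out, assuming the log format shown above. The `summarize` helper, the `TF_LINE` pattern, and the `SAMPLE` list are hypothetical names for illustration, and the sample lines are copied from the log.

```python
import re

# Assumed shape of a TensorFlow log line, based on the .e file above:
# "<date> <time>: <level> <source file>:<line>] <message>"
TF_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: ([IWEF]) (\S+)\] (.*)$"
)

def summarize(lines, max_len=100):
    """Split log lines into TensorFlow messages and everything else."""
    tf_messages, other = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = TF_LINE.match(line)
        if match:
            level, source, message = match.groups()
            # Truncate long messages such as the LD_LIBRARY_PATH dump.
            tf_messages.append((level, source, message[:max_len]))
        else:
            other.append(line)
    return tf_messages, other

# Three lines copied from job.sh.e934264 above.
SAMPLE = [
    "2021-07-26 03:49:50.062403: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)",
    "2021-07-26 03:52:31.679568: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 3400000000 Hz",
    "/var/spool/torque/mom_priv/jobs/934264.v-qsvr-1.aidevcloud.SC: line 4: 110188 Killed python master.py",
]

tf_messages, other = summarize(SAMPLE)
for level, source, message in tf_messages:
    print(level, source, message)
print("Non-TensorFlow lines:")
for line in other:
    print(" ", line)
```

To run it on the real file, the lines could be read with `open("job.sh.e934264")` and passed to `summarize`. Any log line that was wrapped when pasted would also end up in the second list, since its continuation no longer starts with a timestamp.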
The contents of job.sh.o934264 are:
`
########################################################################
Date: Mon 26 Jul 2021 03:49:38 AM PDT
Job ID: 934264.v-qsvr-1.aidevcloud
User: u65358
Resources: neednodes=1:xeon:batch:ppn=2,nodes=1:xeon:batch:ppn=2,walltime=06:00:00
########################################################################
########################################################################
End of output for job 934264.v-qsvr-1.aidevcloud
Date: Mon 26 Jul 2021 06:52:21 AM PDT
########################################################################
`
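For reference, the two dates and the walltime value in job.sh.o934264 can be compared directly. This is a small sketch, not part of the original job: the function names are made up for illustration, and the zone name is simply dropped because both stamps are in PDT.

```python
from datetime import datetime, timedelta

# Values copied from job.sh.o934264 above.
START = "Mon 26 Jul 2021 03:49:38 AM PDT"
END = "Mon 26 Jul 2021 06:52:21 AM PDT"
WALLTIME = "06:00:00"

def parse_stamp(stamp):
    """Parse a date line from the .o file, ignoring the trailing zone name."""
    text = stamp.rsplit(" ", 1)[0]
    # Assumes an English locale for the day and month abbreviations.
    return datetime.strptime(text, "%a %d %b %Y %I:%M:%S %p")

def parse_walltime(text):
    """Turn an HH:MM:SS walltime string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)

elapsed = parse_stamp(END) - parse_stamp(START)
limit = parse_walltime(WALLTIME)

print("elapsed:       ", elapsed)
print("walltime limit:", limit)
print("ended before the limit:", elapsed < limit)
```

With the values above, the elapsed time comes out to 3:02:43, which is well under the 06:00:00 walltime listed in the Resources line.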
The expected output was not produced, and the job ended with the error shown above. Can someone please help me with this? Thanks