
XLA: Could not open input file: Is a directory #8947

Closed
Earthson opened this issue Apr 4, 2017 · 7 comments
Labels
stat:awaiting response Status - Awaiting response from author

Comments

@Earthson
Contributor

Earthson commented Apr 4, 2017

XLA failed with Could not open input file: Is a directory

Environment info

Operating System: Ubuntu 16.04

Installed version of CUDA and cuDNN:
(please attach the output of ls -l /path/to/cuda/lib/libcud*):

.opt/anaconda/lib64/libcudadevrt.a
.opt/anaconda/lib64/libcudart.so
.opt/anaconda/lib64/libcudart.so.8.0
.opt/anaconda/lib64/libcudart.so.8.0.61
.opt/anaconda/lib64/libcudart_static.a
.opt/anaconda/lib64/libcudnn.so
.opt/anaconda/lib64/libcudnn.so.6
.opt/anaconda/lib64/libcudnn.so.6.0.20
.opt/anaconda/lib64/libcudnn_static.a

code setup

If installed from binary pip package, provide:

  1. http://q-phantom.com/conda/linux-64/tensorflow-1.1.0rc0-py36_3.tar.bz2
  2. 1.1.0-rc0

code init

import os
import tensorflow as tf
import tensorlayer as tl

os.environ["CUDA_VISIBLE_DEVICES"] = "0"

tf.reset_default_graph()
tl.layers.set_name_reuse(True)
# c_network, label_index, and feature_index are defined elsewhere in my code.
placehold_mapping, networks = c_network(None, label_indices=label_index, feature_indices=feature_index)
network = networks[0]
config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1
sess = tf.Session(config=config)
tl.layers.initialize_global_variables(sess)

Log

2017-04-04 16:26:48.275644: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0
2017-04-04 16:26:48.275648: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0:   Y
2017-04-04 16:26:48.275653: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:0a:00.0)
2017-04-04 16:26:48.479102: I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
2017-04-04 16:26:48.479122: I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 16 visible devices
2017-04-04 16:26:48.481008: I tensorflow/compiler/xla/service/service.cc:183] XLA service 0x5dd4360 executing computations on platform Host. Devices:
2017-04-04 16:26:48.481021: I tensorflow/compiler/xla/service/service.cc:191]   StreamExecutor device (0): <undefined>, <undefined>
2017-04-04 16:26:48.481138: I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
2017-04-04 16:26:48.481146: I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 16 visible devices
2017-04-04 16:26:48.482239: I tensorflow/compiler/xla/service/service.cc:183] XLA service 0x5f0f950 executing computations on platform CUDA. Devices:
2017-04-04 16:26:48.482248: I tensorflow/compiler/xla/service/service.cc:191]   StreamExecutor device (0): GeForce GTX 1080, Compute Capability 6.1
GEN DATASET: 0.00 seconds elapsed
ROUND:  0
2017-04-04 16:26:57.149563: F tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/utils.cc:31] -1:-1: Could not open input file: Is a directory
@gunan
Contributor

gunan commented Apr 4, 2017

This is a package we do not maintain.
I am not sure when it was built, or which commit it was synced to.
I would recommend reaching out to the maintainers of the package you installed.

Either they can escalate to us with information about how they built the package, or you can share with us the information you get from them.

@Earthson
Contributor Author

Earthson commented Apr 6, 2017

The package is built with Bazel, using the following script:

TF_ROOT_DIR=$HOME/git/tensorflow

mkdir -p $HOME/git

if [ -d $TF_ROOT_DIR ]; then
  cd $TF_ROOT_DIR
  git pull
else
  cd $HOME/git
  git clone https://github.com/tensorflow/tensorflow
  cd $TF_ROOT_DIR
fi

git checkout r1.1

echo  $PREFIX

bazel clean
echo $PYTHON_BIN_PATH
PYTHON_BIN_PATH=$(which python) \
PYTHON_LIB_PATH=$PREFIX/lib/python3.6/site-packages \
TF_NEED_MKL=1 \
MKL_INSTALL_PATH=$PREFIX \
CC_OPT_FLAGS="-march=native" \
TF_NEED_JEMALLOC=1 \
TF_NEED_GCP=0 \
TF_NEED_HDFS=0 \
TF_ENABLE_XLA=1 \
TF_NEED_OPENCL=0 \
TF_NEED_CUDA=1 \
GCC_HOST_COMPILER_PATH=$(which gcc) \
TF_CUDA_VERSION="8.0" \
CUDA_TOOLKIT_PATH=$PREFIX \
TF_CUDNN_VERSION=6 \
CUDNN_INSTALL_PATH=$PREFIX \
TF_CUDA_COMPUTE_CAPABILITIES=6.1 \
./configure

bazel build -c opt --copt=-march=native --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --config=cuda -k  //tensorflow/tools/pip_package:build_pip_package
rm -rf /tmp/tensorflow_pkg
mkdir -p /tmp/tensorflow_pkg
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install $(ls /tmp/tensorflow_pkg/tensorflow*)

@gunan
Contributor

gunan commented Apr 6, 2017

Thanks for the information.
@tatatodd @hawkinsp is this an issue we have seen before?

@av8ramit Looks like this is an issue in the release branch. We may think about a cherrypick based on the investigation.

@tatatodd
Contributor

tatatodd commented Apr 6, 2017

No, (at least) I have never seen this issue before.

Note that the code example is setting CUDA_VISIBLE_DEVICES=0, and then enabling session-level JIT. Enabling session-level JIT only supports GPU, as explained here (in the starred blue box):
https://www.tensorflow.org/performance/xla/jit#turning_on_jit_compilation

It would be nice not to return such a cryptic error, but at a high level, setting CUDA_VISIBLE_DEVICES=0 and then enabling session-level JIT is at best not going to turn XLA on anyway. I'd advise against doing this.

@tatatodd
Contributor

tatatodd commented Apr 6, 2017

Oops, sorry, brain freeze. I just realized CUDA_VISIBLE_DEVICES=0 is selecting the 0th device, and the logs show it is being detected.

So my response is back to - "no, I've never seen this, we should probably debug".
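To illustrate the point in the correction above: CUDA_VISIBLE_DEVICES=0 selects GPU index 0 rather than disabling GPUs; it is an empty value that hides all devices. A minimal sketch of the distinction, using echo as a stand-in for the Python process (the variable must be set before the process starts, since the CUDA runtime reads it at startup):

```shell
# "0" exposes only GPU index 0 to the child process; "" exposes no GPUs at all.
CUDA_VISIBLE_DEVICES=0 sh -c 'echo "visible GPUs: $CUDA_VISIBLE_DEVICES"'
CUDA_VISIBLE_DEVICES="" sh -c 'echo "visible GPUs: $CUDA_VISIBLE_DEVICES"'
```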

@asimshankar
Contributor

I suspect it's because TensorFlow cannot find the CUDA libraries, though I can't be certain, since I don't know what $PREFIX is in your snippet above. To confirm, try running the program with the TF_CPP_MIN_VLOG_LEVEL environment variable set to 1 (set it before starting Python).

In particular, I'm interested in the log messages from gpu_backend_lib.cc, which might help figure out which file it's trying to load (and failing on).
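A sketch of that debugging step (train.py is a hypothetical name standing in for the failing program; substitute your own script):

```shell
# Enable verbose TensorFlow C++ logging; per the suggestion above, the
# variable must be set before Python starts.
export TF_CPP_MIN_VLOG_LEVEL=1

# Re-run the failing program and keep a copy of the output.
python train.py 2>&1 | tee xla_debug.log

# Look for the gpu_backend_lib.cc messages of interest; grep returns
# nonzero when nothing matches, hence the || true.
grep gpu_backend_lib xla_debug.log || true
```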

@asimshankar asimshankar added the stat:awaiting response Status - Awaiting response from author label Apr 8, 2017
drpngx pushed a commit to drpngx/tensorflow that referenced this issue Apr 11, 2017
Will help with issues like tensorflow#8947
Change: 152733558
@asimshankar
Contributor

Closing due to inactivity. If you're still running into this, please feel free to file an updated issue (including any output from suggestions above). Thanks!
