Bazel bring up for ROCm #10703

Closed
aditya4d1 opened this issue Jun 14, 2017 · 24 comments
Labels
stat:awaiting tensorflower Status - Awaiting response from tensorflower type:build/install Build and install issues

Comments

@aditya4d1

Hi,
I am trying to add a new backend to TensorFlow. As a first step, I started changing the Bazel files (Commit here). When I enable XLA + ROCm during configure and run bazel build -s --config=opt --config=rocm //tensorflow/tools/pip_package:build_pip_package, I get the following error:

ERROR: no such package '@local_config_rocm//': error loading package 'external': The repository named 'local_config_rocm' could not be resolved.
INFO: Elapsed time: 0.227s

It would be great if someone could look over the commit mentioned above and suggest changes.
Thank you!

@aselle
Contributor

aselle commented Jun 14, 2017

Could you try to make the minimal change that exhibits your error? Unfortunately, we have many issues and your commit is over 1296 lines.

@aselle aselle added stat:awaiting response Status - Awaiting response from author type:build/install Build and install issues labels Jun 14, 2017
@aditya4d1
Author

Hi,
Thank you for the response. You can ignore the LICENSE and CROSSTOOL_HCC.bzl.
I guess the point of interest will be the rocm_configure.bzl file. https://github.com/ROCmSoftwarePlatform/tensorflow/blob/b75ea3f499a5f63f2580066ae132c93e2b03d0ad/third_party/gpus/rocm_configure.bzl
If you are familiar with bazel infrastructure, can you guide me to add AMDGPU support to bazel?

@aselle aselle removed the stat:awaiting response Status - Awaiting response from author label Jun 14, 2017
@aselle
Contributor

aselle commented Jun 15, 2017

I am unfortunately far from a bazel expert, but @jart may have some suggestions.

@aselle aselle added stat:community support Status - Community Support stat:awaiting tensorflower Status - Awaiting response from tensorflower and removed stat:community support Status - Community Support labels Jun 15, 2017
@jart
Contributor

jart commented Jun 15, 2017

I'm noticing you added this code:

# Macros for building CUDA code.
def if_rocm(if_true, if_false = []):
    return select({
        "@local_config_rocm//rocm:using_hcc": if_true,
        "//conditions:default": if_false
    })

What is @local_config_rocm? In order to have a label like that, you need to add something like the following to your tensorflow/workspace.bzl file:

native.new_http_archive(
    name = "local_config_rocm",
    # ...
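
For context, TensorFlow's existing CUDA support registers its repository with cuda_configure(name = "local_config_cuda") in tensorflow/workspace.bzl, backed by a repository rule in third_party/gpus/cuda_configure.bzl. A ROCm equivalent could follow the same pattern. The sketch below is a minimal illustration under stated assumptions, not the code from the commit: the rule name rocm_configure, the using_rocm define, and the generated file contents are all hypothetical.

```starlark
# third_party/gpus/rocm_configure.bzl
# Hypothetical sketch of a repository rule that creates @local_config_rocm.

_ROCM_BUILD = """
package(default_visibility = ["//visibility:public"])

# Matches when the build is invoked with --define=using_rocm=true
# (the define name is an assumption made for this sketch).
config_setting(
    name = "using_hcc",
    values = {"define": "using_rocm=true"},
)
"""

_ROCM_BUILD_DEFS = """
def if_rocm(if_true, if_false = []):
    return select({
        "@local_config_rocm//rocm:using_hcc": if_true,
        "//conditions:default": if_false,
    })
"""

def _rocm_autoconf_impl(repository_ctx):
    # An empty BUILD file makes the repository root a package.
    repository_ctx.file("BUILD", "")

    # The "rocm" package holds the config_setting that if_rocm() selects on.
    repository_ctx.file("rocm/BUILD", _ROCM_BUILD)

    # build_defs.bzl is what other packages load from
    # "@local_config_rocm//rocm:build_defs.bzl".
    repository_ctx.file("rocm/build_defs.bzl", _ROCM_BUILD_DEFS)

rocm_configure = repository_rule(
    implementation = _rocm_autoconf_impl,
)
```

Registering it would then look roughly like this, assuming the rule lives at the path shown above:

```starlark
# tensorflow/workspace.bzl (excerpt, hypothetical)
load("//third_party/gpus:rocm_configure.bzl", "rocm_configure")

def tf_workspace(path_prefix = "", tf_repo_name = ""):
    # ... existing repositories ...
    rocm_configure(name = "local_config_rocm")
```

With this sketch, --config=rocm would also need to map to the define, for example through a "build:rocm --define=using_rocm=true" line in the bazelrc written by configure; that mapping is likewise an assumption.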

@jart jart added stat:awaiting response Status - Awaiting response from author and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Jun 15, 2017
@aditya4d1
Author

Huh. Didn't know that. I'll add local_config_rocm to tensorflow/workspace.bzl and see how it goes.
Thank you for the response.

@jart
Contributor

jart commented Jun 15, 2017

Feel free to ping this bug again if you get stuck on Bazel. If you're writing a feature for TensorFlow, then I'm happy to support you.

@jart jart closed this as completed Jun 15, 2017
@aditya4d1
Author

aditya4d1 commented Jun 21, 2017

Hi,
I pushed my code. I need help stitching bazel around it so that the host compiler can touch them and link against rocm libraries.

@aditya4d1 aditya4d1 changed the title Bazel build error: Error loading package 'external' Bazel bring up for ROCm Jun 21, 2017
@jart
Contributor

jart commented Jun 23, 2017

Have you taken a look at cuda_configure.bzl? We've got a lot of code for configuring CUDA in our build process. I'm not familiar with ROCm, but if it's AMD's version of CUDA, we'd love to support it. I must warn you, though, that it would most likely be a highly nontrivial undertaking.

@aditya4d1
Author

aditya4d1 commented Jun 23, 2017

Hi @jart ,
We are focusing on supporting XLA first. We want to move away from using HCC (the device code compiler for ROCm) and use just LLVM, the libraries, and the runtime to run TF code. The file cuda_configure.bzl seems to have nvcc-specific functions, but for ROCm support we want just the host compiler (g++/clang++) build functions. Will this make the problem trivial?
PS: Can you re-open the issue?

@jart jart reopened this Jun 24, 2017
@aselle aselle removed the stat:awaiting response Status - Awaiting response from author label Jun 24, 2017
@aditya4d1
Author

aditya4d1 commented Jun 26, 2017

Hi @jart,
I added Bazel files for rocm+xla here. I am getting the following error:

bazel build --config=opt --config=rocm //tensorflow/tools/pip_package:build_pip_package 
ERROR: /home/aditya/rocm/tensorflow/tensorflow/core/platform/default/build_config/BUILD:31:1: error loading package 'tensorflow/stream_executor': Extension file not found. Unable to load package for '@local_config_rocm//rocm:build_defs.bzl': BUILD file not found on package path and referenced by '//tensorflow/core/platform/default/build_config:stream_executor'.
ERROR: Analysis of target '//tensorflow/tools/pip_package:build_pip_package' failed; build aborted.
INFO: Elapsed time: 1.676s

@cy89

cy89 commented Jul 5, 2017

@jart, this does seem like a Bazel issue; PTAL when you have a moment.

@cy89 cy89 added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Jul 5, 2017
@chanil1218

@adityaatluri
I found a suspicious line that may be where the "BUILD file not found" error occurs.
I think changing cuda:BUILD to rocm:BUILD could resolve the error.

I would like to help enable the ROCm backend for XLA.
It would be better to use a separate forked tensorflow repo instead of a dangling GitHub tree.
Could you apply your changes to your forked tensorflow repository?

@aditya4d1
Author

Hi @chanil1218
Thank you for the pointer.

You are most welcome to help enable the ROCm XLA backend. We are actually waiting for CLA approval from our legal team. We would love to have the Bazel setup figured out before implementing the actual code.

@aditya4d1
Author

aditya4d1 commented Sep 1, 2017

Hi,
I started fresh with the code, as the plan is not to use the AMD compiler (HCC). Here it is:

Code:
https://github.com/ROCmSoftwarePlatform/tensorflow/tree/rocm-v1

The build instructions are here:
https://github.com/ROCmSoftwarePlatform/tensorflow/blob/rocm-v1/ROCM.md

I am getting the following error:

$ bazel build --config=opt --config=rocm //tensorflow/tools/pip_package:build_pip_package
 ERROR: no such package '@local_config_rocm//': Traceback (most recent call last):
        File "/home/aditya/tensorflow/third_party/gpus/rocm_configure.bzl", line 555
                _create_local_rocm_repository(repository_ctx)
        File "/home/aditya/tensorflow/third_party/gpus/rocm_configure.bzl", line 506, in _create_local_rocm_repository
                _tpl(repository_ctx, "rocm:build_defs.b...", ...)})
        File "/home/aditya/tensorflow/third_party/gpus/rocm_configure.bzl", line 238, in _tpl
                repository_ctx.template(out, Label(("//third_party/gpus/%s...)), ...)
 Unable to load package for //third_party/gpus/rocm:build_defs.bzl.tpl: not found.
 INFO: Elapsed time: 0.301s
 FAILED: Build did NOT complete successfully (0 packages loaded)

build_defs.bzl.tpl is present here: https://github.com/ROCmSoftwarePlatform/tensorflow/blob/rocm-v1/third_party/gpus/rocm/build_defs.bzl.tpl
CC: @jart
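
One thing worth checking here, offered as an assumption since the thread never confirms the cause: Bazel resolves a label such as //third_party/gpus/rocm:build_defs.bzl.tpl by first locating the package third_party/gpus/rocm, and a directory only counts as a package if it contains a BUILD file. If that directory holds the template but no BUILD file, Bazel cannot load the package even though the file itself is on disk. A minimal sketch of such a BUILD file, with hypothetical contents:

```starlark
# third_party/gpus/rocm/BUILD
# Hypothetical sketch. The presence of this file is what turns
# third_party/gpus/rocm into a Bazel package, so that the label
# //third_party/gpus/rocm:build_defs.bzl.tpl can be resolved.

# Optional: exporting the template lets other packages reference it too.
exports_files(["build_defs.bzl.tpl"])
```

An empty BUILD file should be enough to make the directory a package; the exports_files line is only there to make the intent explicit.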

@aditya4d1
Author

@cy89 can you help me resolve this issue? Thanks!

@tensorflowbutler
Member

It has been 14 days with no activity and the awaiting tensorflower label was assigned. Please update the label and/or status accordingly.

@tensorflowbutler
Member

Nagging Awaiting TensorFlower: It has been 14 days with no activity and the awaiting tensorflower label was assigned. Please update the label and/or status accordingly.

@Mandrewoid

@adityaatluri although this issue appears stale, I thought I'd let you know the https://github.com/ROCmSoftwarePlatform/tensorflow links are dead

@vigchand2705

@Mandrewoid I am guessing this is the new link https://github.com/ROCmSoftwarePlatform/hiptensorflow

@tensorflowbutler
Member

Nagging Awaiting TensorFlower: It has been 14 days with no activity and the awaiting tensorflower label was assigned. Please update the label and/or status accordingly.

@tensorflowbutler
Member

Nagging Assignee @aselle: It has been 14 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.
