
[ROCm] Bazel build and continuous integration infrastructure#20277

Merged
tensorflow-copybara merged 1 commit into tensorflow:master from ROCm:upstream-staging
Sep 27, 2018

Conversation

@whchung
Contributor

@whchung whchung commented Jun 25, 2018

This pull request starts introducing support for the ROCm platform in TensorFlow. This initial pull request addresses two components:

  • bazel build system
  • continuous integration logic

Authors:

@whchung
Contributor Author

whchung commented Jul 6, 2018

ping?

yifeif previously approved these changes Jul 10, 2018

@yifeif (Contributor) left a comment

configure and build script change lgtm

caisq
caisq previously approved these changes Jul 10, 2018
apt-get clean && \
rm -rf /var/lib/apt/lists/*

# Workaround: use HIP PR#457 and then build from source
Contributor

Probably better to provide the full link here for posterity.

Contributor Author

HIP PR#457 ( ROCm/hip#457 ) has been merged into HIP mainline since this PR was created. I'll amend this PR to reflect that.

/cc @paralleo for awareness

Contributor Author

@caisq pushed a new commit to address this.

Comment thread configure.py Outdated
set_trisycl_include_dir(environ_cp)

set_action_env_var(environ_cp, 'TF_NEED_ROCM', 'ROCm', False)
if environ_cp.get('TF_NEED_ROCM') == '1':
Contributor

nit: These two if statements can be consolidated as one, because neither of them has an else branch.

Contributor Author

I'd like to keep them as is for now; there are additional ROCm-specific checks / env vars coming in future PRs. For example, just as the CUDA path can build TensorFlow with either nvcc or CUDA clang, we are working on a similar route to switch between the incumbent HIP/HCC toolchain and the upcoming HIP clang toolchain.

Contributor

I thought the eventual goal was to just have clang?

Contributor Author

@gunan for the time being HIP/HCC would be the incumbent toolchain on ROCm and we'll switch to HIP clang toolchain once its performance beats the incumbent solution.

I'm working on revising this PR to address comments from all reviewers now and will ping you once it's ready.

Comment thread configure.py
else:
set_trisycl_include_dir(environ_cp)

set_action_env_var(environ_cp, 'TF_NEED_ROCM', 'ROCm', False)
Contributor

TF_NEED_CUDA, TF_NEED_SYCL and TF_NEED_ROCM are all mutually exclusive, right? If so, we need a sanity check here that at most one of them is true.

@whchung (Contributor Author) commented Jul 10, 2018

From the TensorFlow perspective they are not mutually exclusive: it's possible to enable all three and still have TensorFlow build. Although I'd be surprised to see such a configuration in real life. :)

Contributor

Looking at some of the options below, enabling both at the same time would break other things. For example, the _gpu cc_test targets. Which GPU would they run on?
Let's add the check @caisq requested here.

Contributor Author

ok will do.

Contributor Author

Revised the PR to raise an error when more than one GPU platform (CUDA/SYCL/ROCm) is specified.
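For illustration, the requested sanity check could look roughly like this in configure.py style. This is a sketch, not the actual patch: the function name and error message are assumptions, and `environ_cp` is assumed to be the environment dict that configure.py already threads through its helpers.

```python
def validate_gpu_platforms(environ_cp):
    """Raise an error if more than one GPU platform is enabled.

    environ_cp: dict of environment variables, as configure.py passes
    around (assumed interface for this sketch).
    """
    flags = ['TF_NEED_CUDA', 'TF_NEED_SYCL', 'TF_NEED_ROCM']
    enabled = [f for f in flags if environ_cp.get(f) == '1']
    if len(enabled) > 1:
        raise ValueError('At most one GPU platform may be enabled, got: %s'
                         % ', '.join(enabled))
```

Enabling a single platform passes, while enabling, say, both TF_NEED_CUDA and TF_NEED_ROCM raises.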

@whchung whchung dismissed stale reviews from caisq and yifeif via 76be23a July 10, 2018 18:58
Comment thread tensorflow/tensorflow.bzl Outdated
hdrs=[],
**kwargs):
copts = copts + _cuda_copts() + if_cuda(cuda_copts) + tf_copts()
copts=copts + tf_copts() + _cuda_copts() + _rocm_copts() + if_cuda_is_configured(cuda_copts) + if_rocm_is_configured(cuda_copts)
Contributor

let's hide the additional complexities here by adding an arg to "_cuda_copts" and "_rocm_copts"

So this line becomes:
copts=copts + tf_copts() + _cuda_copts(opts=cuda_copts) + _rocm_copts(opts=cuda_copts)

Contributor Author

@gunan Thanks. I'll address this.

@@ -0,0 +1,97 @@
# This Dockerfile provides a starting point for a ROCm installation of
# MIOpen and tensorflow.
FROM ubuntu:xenial-20170619
Contributor

Why not a more generic image? "ubuntu:xenial"

Contributor Author

@parallelo Please weigh in. I believe the specific tag was specified to ensure the proper Linux kernel version is used?

Contributor

Correct. We typically use a 4.13-45 linux kernel.

Contributor

Do we have an upper limit on the kernel version?
If we depend on one very specific kernel version this will be very very brittle.
Is there a way to lift this restriction?

Contributor Author

We had an issue with 4.15+ kernels in the older version of the ROCm stack. The issue has been fixed in the upcoming ROCm release, so I'll remove this.

if [[ "${TF_NEED_ROCM}" -eq 1 ]]; then
# ROCm requires the video group in order to use the GPU for compute. If it
# exists on the host, add it to the container.
getent group video || addgroup video && adduser "${CI_BUILD_USER}" video
Contributor

@caisq I am quite unfamiliar with this script. Are the changes here OK?

@whchung (Contributor Author) commented Jul 10, 2018

@parallelo could weigh in.

This line is to fulfill permission requirements from ROCm stack specified at: https://github.com/RadeonOpenCompute/ROCm#next-set-your-permissions

Contributor

In the case of ROCm, we add the container's user to the video group (but only if the host user was a member of the video group). This group membership is currently a requirement for ROCm.

bazel test --config=rocm --test_tag_filters=-no_oss,-oss_serial,-no_gpu,-benchmark-test -k \
--test_lang_filters=cc --jobs=${N_JOBS} --test_timeout 300,450,1200,3600 \
--build_tests_only --test_output=errors --local_test_jobs=1 --config=opt \
--run_under=//tensorflow/tools/ci_build/gpu_build:parallel_gpu_execute -- \
Contributor

I recommend omitting this flag, and setting local_test_jobs=1.
This is highly specialized for VMs with 8 k80 GPUs attached.

Contributor Author

thanks. will do.

bazel test --config=rocm --test_tag_filters=-no_oss,-oss_serial,-no_gpu,-benchmark-test -k \
--test_lang_filters=py --jobs=${N_JOBS} --test_timeout 300,450,1200,3600 \
--build_tests_only --test_output=errors --local_test_jobs=1 --config=opt \
--run_under=//tensorflow/tools/ci_build/gpu_build:parallel_gpu_execute -- \
Contributor

ditto above. remove this option.

Contributor Author

thanks. will do.

bazel test --config=rocm --test_tag_filters=-no_gpu,-benchmark-test,-no_oss -k \
--jobs=${N_JOBS} --test_timeout 300,450,1200,3600 \
--build_tests_only --test_output=errors --local_test_jobs=1 \
--run_under=//tensorflow/tools/ci_build/gpu_build:parallel_gpu_execute \
Contributor

remove this option.

Contributor Author

thanks. will do.

@@ -0,0 +1,277 @@
major_version: "local"
Contributor

@meteorcloudy @mhlopko could you help review this file?

@@ -0,0 +1,239 @@
#!/usr/bin/env python
Contributor

@meteorcloudy @mhlopko could you review?

Comment thread third_party/gpus/rocm/BUILD.tpl Outdated
},
)

config_setting(
Contributor

I do not think we need these duplicated here. Is there a reason to not reuse them from //tensorflow/BUILD?

Contributor Author

@gunan configs here are implemented following tensorflow/third_party/gpus/cuda/BUILD.tpl where CUDA-specific configs are specified. These CUDA-specific configs aren't in //tensorflow/BUILD either.

Since the purpose of this PR is to introduce ROCm-specific build scripts without major refactoring of the existing TensorFlow build infrastructure, I'd like to propose keeping this file as is. Thoughts?

@@ -0,0 +1,32 @@
# Macros for building ROCm code.
Contributor

@meteorcloudy @mhlopko could you help review?

as is as a string to --compiler-options of hipcc. When "-x rocm" is not
present, this wrapper invokes gcc with the input arguments as is.

NOTES:
Member

I think we can remove this note, we don't have crosstool_wrapper_driver_rocm internally.

Contributor Author

ok will do

Comment thread third_party/gpus/rocm_configure.bzl Outdated

def _find_rocm_lib(lib, repository_ctx, cpu_value, basedir, version="",
static=False):
"""Finds the given CUDA or cuDNN library on the system.
Member

Fix comment, please

Contributor Author

thanks. will do.

Comment thread third_party/gpus/rocm_configure.bzl Outdated
lib: The name of the library, such as "rocmrt"
repository_ctx: The repository context.
cpu_value: The name of the host operating system.
basedir: The install directory of CUDA or cuDNN.
Member

ditto

Contributor Author

thanks. will do.

Comment thread third_party/gpus/rocm_configure.bzl Outdated
return struct(file_name=file_name, path=str(path.realpath))

elif cpu_value == "Windows":
path = repository_ctx.path("%s/lib/x64/%s" % (basedir, file_name))
Member

Is ROCm support actually available on Windows?

Contributor Author

No, it's not yet available on Windows. Some infrastructure work is underway, but it'll be some time before we can actually enable it. I'll remove these checks.

Comment thread third_party/gpus/rocm_configure.bzl Outdated
auto_configure_fail("Cannot find rocm library %s" % file_name)

def _find_libs(repository_ctx, rocm_config):
"""Returns the CUDA and cuDNN libraries on the system.
Member

ditto

Contributor Author

thanks. will fix it.

Comment thread third_party/gpus/rocm_configure.bzl Outdated

Args:
repository_ctx: The repository context.
rocm_config: The CUDA config as returned by _get_rocm_config
Member

ditto

Contributor Author

thanks. will fix it.

Comment thread third_party/gpus/rocm_configure.bzl Outdated
}

def _rocmrt_static_linkopt(cpu_value):
"""Returns additional platform-specific linkopts for rocmrt."""
Member

This is a copy of _cudart_static_linkopt, right? -lrt is needed when linking the cudart static library on Linux; I'm not sure the same option is needed for ROCm. Does the rocmrt static library even exist?

Contributor Author

Thanks. rocmrt_static is actually a static library in HIP, but it's not really needed for TensorFlow. The logic here is indeed a copy of the CUDA counterpart. I'll remove it.

Comment thread third_party/gpus/rocm_configure.bzl Outdated
_tpl(repository_ctx, "rocm:BUILD",
{
"%{rocmrt_static_lib}": rocm_libs["hip"].file_name,
"%{rocmrt_static_linkopt}": '',
@meteorcloudy (Member) commented Jul 11, 2018

If rocmrt_static_linkopt is not needed, we can remove them from both BUILD.tpl and rocm_configure.bzl
Does %{rocmrt_static_lib} library exist? Why is it the same as %{rocmrt_lib}?

Contributor Author

I'll remove it.

cc = find_cc(repository_ctx)
host_compiler_includes = _host_compiler_includes(repository_ctx, cc)
rocm_defines = {
"%{rocm_include_path}": _rocm_include_path(repository_ctx,
Member

Looks like you have hard-coded rocm include paths in CROSSTOOL, should we remove this field or not hard-code?

Contributor Author

I'll migrate hard-coded paths from CROSSTOOL to rocm_configure.bzl so it's easier to maintain. In CROSSTOOL we'll honor %{rocm_include_path}.

# linker_flag: "-Wl,--detect-odr-violations"

# Include directory for ROCm headers.
cxx_builtin_include_directory: "/opt/rocm/hsa/include"
Member

Should we replace them with %{rocm_include_path} so that they can be configured?

Contributor Author

thanks. let me see how to remove these cxx_builtin_include_directory and have them populated from rocm_configure.bzl.

Comment thread tensorflow/workspace.bzl Outdated
nccl_configure(name="local_config_nccl")
git_configure(name="local_config_git")
sycl_configure(name="local_config_sycl")
rocm_configure(name="local_config_rocm")
Member

You probably need to add an exclude to build_pip_package.sh for this too.
here:

for f in `find . ! -type d ! -name '*.py' ! -path '*local_config_cuda*' ! -path '*local_config_tensorrt*' ! -path '*org_tensorflow*'`; do


Comment thread configure.py
if environ_cp.get('TF_NEED_ROCM') == '1':
if 'LD_LIBRARY_PATH' in environ_cp and environ_cp.get(
'LD_LIBRARY_PATH') != '1':
write_action_env_to_bazelrc('LD_LIBRARY_PATH',
Contributor

Currently, if both CUDA and ROCm are enabled, we write the environment variable twice, one possibly overriding the other.

We should merge the logic that writes LD_LIBRARY_PATH to check both TF_NEED_CUDA and TF_NEED_ROCM and write it only once.

Contributor Author

TF_NEED_ROCM and TF_NEED_CUDA would be changed so they are mutually exclusive.
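The merged write-once logic the reviewer asked for could look roughly like this (a sketch in configure.py style; write_action_env_to_bazelrc is the real helper named in the diff above, but the wrapper function and its injected-callback shape are assumptions for this illustration):

```python
def maybe_write_ld_library_path(environ_cp, write_action_env_to_bazelrc):
    """Write LD_LIBRARY_PATH to .bazelrc at most once, whether CUDA or
    ROCm (or both) is enabled.

    The writer is passed in as a callable so this sketch stays
    self-contained; configure.py calls its module-level helper instead.
    """
    needs_gpu = (environ_cp.get('TF_NEED_CUDA') == '1' or
                 environ_cp.get('TF_NEED_ROCM') == '1')
    ld_path = environ_cp.get('LD_LIBRARY_PATH')
    # The `!= '1'` guard mirrors the existing check in the diff above.
    if needs_gpu and ld_path and ld_path != '1':
        write_action_env_to_bazelrc('LD_LIBRARY_PATH', ld_path)
```

With both flags set, the variable is still written exactly once, which is the behavior the reviewer requested.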

)
load("@local_config_cuda//cuda:build_defs.bzl", "if_cuda")
load("@local_config_cuda//cuda:build_defs.bzl", "if_cuda", "if_cuda_is_configured")
load("@local_config_rocm//rocm:build_defs.bzl", "if_rocm", "if_rocm_is_configured")
Contributor

@jlebar Internally we do not have equivalents of these.
Are we planning to create the internal equivalents of these macros, just like what we have for CUDA?

Contributor

Are we planning to create the internal equivalents of these macros, just like what we have for CUDA?

It's @Artem-B and my opinion that --config=cuda (and by corollary if_cuda) is a harmful hack. TF uses it as a switch to say "build for GPU or not", but it shouldn't: This should be decided by BUILD rules. (E.g. you depend on :tensorflow_cpu or :tensorflow_gpu.)

The reason we introduced --config=cuda is because at the time, Skylark was missing some features we needed in order to make the toolchain work properly. We think these features are there now.

Unfortunately the direction of these patches -- including the Eigen patch -- sort of doubles down on the notion of --config=cuda. If we can't build a TF which includes both cuda and rocm bits, then we'll effectively never be able to get rid of --config=cuda.

That's why in XLA we've insisted that we maintain the ability to build for both cuda and rocm, determined by BUILD dependencies.

That said, it's unclear to me what is the alternative to these macros for TF, given that the structure of the rocm Eigen patches does not (?) allow us to build for AMDGPU and NVGPU in the same binary. So TF may need its own versions of them internally; I don't see how else to do it.

Contributor Author

@jlebar Unfortunately given how GPU common runtime is designed I think it's hard to let TF be configured and built with both CUDA and ROCm at the same time. In gpu_device.cc, EigenCudaStreamDevice has a compile-time dependency to CUDA constructs. In my current implementation for ROCm, I renamed EigenCudaStreamDevice to EigenGpuStreamDevice and use TENSORFLOW_USE_ROCM macro to switch to ROCm-functional equivalents.

For XLA compiler, it's relatively easy to specify a new set of compiler backend and let it target AMDGPU. But for GPU common runtime, a bigger overhaul might be required if the ultimate goal is to get rid of --config=cuda.

That said, I believe such an effort to modularize the TF runtime should be deferred to future PRs, after we have better consensus on how it should be achieved.

Member

Let me try to clarify the way I'd like to see things working.
The goal is to be able to build any (and all) variants without specifying any extra bazel flags.
I.e. one should be able to say bazel build //my_project:app_cuda //my_project:app_rocm //my_project:app_cpu and get all three executables.

Let's suppose the app consists of main.cpp, kernel.cu (for CUDA and ROCm) and kernel_cpu.cpp (for CPU-only). All three app variants would use the same main.o; app_cuda would use kernel-cuda.o built from kernel.cu with CUDA-specific options/defines, app_rocm would use kernel-rocm.o built from kernel.cu with ROCm-specific options/defines, and app_cpu would be compiled from kernel_cpu.cpp. The user should be able to build any combination of them simultaneously.

When you change anything in the build system, there's only one build configuration to test, and if you've built an app there's no confusion about what it supports (or does not). I can't count the number of times someone attempted to run CPU-only TF and complained that it does not see the GPUs, or ran CUDA-enabled TF on a machine without GPUs and complained that it failed. A single build configuration also saves on overall build time, since objects that don't care about CUDA/ROCm are built only once instead of once per build config. Considering that CUDA files constitute a relatively small subset of TensorFlow, the difference is substantial.

--config=cuda was inherited from the internal Google build and exists for a number of reasons that are not relevant to open-source TensorFlow. I understand that it is convenient to continue adding extra dimensions to config parameters, but now that we're adding support for another accelerator is a good time to make sure we do it right. Maintaining and debugging multiple build configurations is a royal pain; having a single set of build rules for everything makes things somewhat easier to deal with.

Contributor Author

@Artem-B / @jlebar let me see if I understand this correctly so I can assess the effort. It seems you expect one bazel build to produce executables for all specified targets, and to ditch --config=XXX.

As a corollary, would the following command be good for you? One bazel build produces 3 pip packages:

# one bazel build produces 3 pip packages
bazel build //tensorflow/tools/pip_package:build_cpu_pip_package //tensorflow/tools/pip_package:build_cuda_pip_package //tensorflow/tools/pip_package:build_rocm_pip_package

Also, I'm wondering how we should deal with test targets in bazel test?

@whchung (Contributor Author) commented Jul 27, 2018

@gregestren sorry for inviting you to a new discussion thread without prior notification.

I, @whchung, am discussing with @jlebar / @Artem-B / @gunan how to improve the TensorFlow build system to make it support multiple CPU / GPU platforms. I've discovered your work on dynamic bazel configurations [1], skylark build configurations [2], and the 2018 Bazel Configurability / Multiplatform Roadmap [3], which eventually led me to the bazel roadmap pages [4] [5].

I'm very new to the implementation of bazel, let alone testing / adapting all these upcoming features in TensorFlow. Could you share some working example projects with targets tied to different toolchains, so I can learn from them? Thank you very much.

Member

Sorry, I'd completely forgotten that OSS TF still uses a custom crosstool to compile with nvcc. :-(
That's going to be a problem. Crosstool is not something that can be switched per build target, and NVCC does not support clang as the host compiler. That forces us to have multiple build configurations. We would still be able to do something like bazel --config=nvcc foo_cuda foo_cpu, or bazel foo_rocm foo_cpu (probably with something like --config=rocm), but we will not be able to do all three at once. This complicates things.

I'll need to think about it a bit.

@gregestren -- It sounds like bazel has grown a lot of features lately that I've been missing for CUDA compilation. I'm glad to see that my handwavy proposal for TF compilation is roughly in line with the general direction the bazel configurability roadmap is heading. I'll need to take a closer look at the recent changes to get a better idea of what we can do these days, but it appears I may have more tools at my disposal than I used to.

Contributor

Ok to merge then?

Member

Considering that at the moment we do not have a working alternative, that's probably the least bad option.
It does entangle the build with multiple crosstools, but it's a marginal bump in the amount of work we'll need to do in addition to what's needed to deal with --config=nvcc. Making sure the rocm build remains working will be a bit of a pain, but I expect the bulk of the issues to be figured out on the nvcc build, so overall it should not be a major issue.

I'm OK with the patch, but it's ultimately TF team's call.


Hi Jack, Artem,

Apologies for my delayed response - I was doing some personal traveling last week.

I'd love to discuss more both your goals and how Bazel's multiplatform changes could help them. We're at a weird state now where lots of new possibilities are opening up (like being able to really support per-target crosstools) but the public APIs are still all coming together. So it's not as simple as "just follow this pattern in the Bazel documentation" but that doesn't mean there aren't options for you.

Since I'm just coming in late into this conversation, I suggest we all get on the same basic page of what's desired. Then we can clarify what features can address your goals, and how well.

Would that work?

Comment thread tensorflow/core/kernels/BUILD Outdated
deps = [
"//tensorflow/core:framework",
"//tensorflow/core:lib",
] + if_cuda([
Contributor

I feel like this whole library should only be added if cuda is enabled. What is the need for a double check?

Contributor Author

Unfortunately the header file in cuda_solvers target is not CUDA-specific. There are some utility structures used by operators such as SegmentSumGPUOp which is supported on ROCm.

I'll submit additional PRs to refactor SegmentSumGPUOp. And I'll remove this check here in this PR.

deps = MATH_DEPS + if_cuda_is_configured([
":cuda_solvers",
]) + if_rocm_is_configured([
":cuda_solvers",
Contributor

This is quite surprising to me.
cuda_solvers is an empty library if cuda is not enabled. why even link it here?

Contributor Author

Unfortunately the header file in cuda_solvers target is not CUDA-specific. There are some utility structures used by operators such as SegmentSumGPUOp which is supported on ROCm.

I'll submit additional PRs to refactor SegmentSumGPUOp. And I'll remove cuda_solvers dependency here.

Comment thread tensorflow/tensorflow.bzl
"cuda_default_copts",
)
load(
"@local_config_rocm//rocm:build_defs.bzl",
Contributor

Ditto. @jlebar we need to decide on the internal versions of these before we can merge this.

Contributor

So do we approve, and start wrestling with this change internally?

Comment thread tensorflow/tensorflow.bzl Outdated
extra_copts=extra_copts,
linkopts=linkopts,
args=args)
if if_cuda_is_configured(True) or if_rocm_is_configured(True):
Contributor

Internally, we do not have a configure script, so we need this enabled unconditionally. Please revert, and just expand the switch statements.

Contributor Author

will do.
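For context on the objection: if_cuda_is_configured(x) passes x through when CUDA was configured and returns an empty list otherwise, so the snippet above leans on the truthiness of the macro's result. A small Python model of that assumed behavior:

```python
def if_cuda_is_configured(x, cuda_configured=False):
    # Model: x survives only if ./configure set up CUDA; otherwise the
    # empty list makes the whole expression falsy.
    return x if cuda_configured else []

# The questioned pattern: the macro's return value doubles as a boolean.
assert bool(if_cuda_is_configured(True, cuda_configured=True)) is True
assert bool(if_cuda_is_configured(True, cuda_configured=False)) is False
```

Builds that never run the configure step never populate these values, which is why expanding the branches explicitly was requested.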

Comment thread tensorflow/tensorflow.bzl Outdated
hdrs=[],
**kwargs):
copts = copts + _cuda_copts() + if_cuda(cuda_copts) + tf_copts()
copts=copts + tf_copts() + _cuda_copts(opts=cuda_copts) + _rocm_copts(opts=cuda_copts)
Contributor

Nit: please revert the spacing change around =.
It needs to be copts = copts + .... for our code linter checks to pass.
Note the difference between a local variable assignment here, vs. a function kwarg, which does not need the space around the =.

Contributor Author

thanks. will fix.

Comment thread configure.py
else:
set_trisycl_include_dir(environ_cp)

set_action_env_var(environ_cp, 'TF_NEED_ROCM', 'ROCm', False)
Contributor

Looking at some of the options below, enabling both at the same time would break other things. For example, the _gpu cc_test targets. Which GPU would they run on?
Let's add the check @caisq requested here.
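A minimal sketch of the mutual-exclusion check being requested here, assuming configure.py's environ_cp dictionary convention (the function name is hypothetical):

```python
def check_exclusive_gpu_backends(environ_cp):
    """Fail configuration if both CUDA and ROCm are requested, since a
    single _gpu test target can only run against one backend."""
    cuda_enabled = environ_cp.get('TF_NEED_CUDA', '0') == '1'
    rocm_enabled = environ_cp.get('TF_NEED_ROCM', '0') == '1'
    if cuda_enabled and rocm_enabled:
        raise ValueError('TF_NEED_CUDA and TF_NEED_ROCM are mutually '
                         'exclusive; enable at most one GPU backend.')

check_exclusive_gpu_backends({'TF_NEED_ROCM': '1'})  # ROCm alone: fine
```

Failing fast at configure time avoids ambiguity later over which GPU the _gpu cc_test targets should run on.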

@@ -0,0 +1,97 @@
# This Dockerfile provides a starting point for a ROCm installation of
# MIOpen and tensorflow.
FROM ubuntu:xenial-20170619
Contributor

Do we have an upper limit on the kernel version?
If we depend on one very specific kernel version this will be very very brittle.
Is there a way to lift this restriction?

@whchung whchung force-pushed the upstream-staging branch from 4885f5e to e2b5544 Compare July 17, 2018 19:16
@rmlarsen
Contributor

@gunan could you please take another look?

@whchung
Contributor Author

whchung commented Jul 17, 2018

@rmlarsen / @gunan please hold it for now. I haven't finished addressing all the comments from reviewers yet. Will ping you guys once I feel more comfortable with the PR. And I think I'll probably need to squash the commits to make the commit history look nicer.

@whchung whchung force-pushed the upstream-staging branch 2 times, most recently from acb4dc5 to 6d5ca1c Compare July 17, 2018 23:14
@whchung
Contributor Author

whchung commented Jul 17, 2018

@rmlarsen / @gunan I believe I've already addressed all the comments and updated the PR. Please help review it again. Thanks!

@whchung
Contributor Author

whchung commented Jul 20, 2018

a gentle ping?

@gunan
Contributor

gunan commented Aug 30, 2018

No, we do not have the hardware, toolchains, license reviews or anything in place for us to be able to build with ROCm. So they definitely won't become blocking presubmits for now.
This, of course, will be subject to review later.

In the meantime, we can work with you to setup community supported builds, as outlined here:
https://github.com/tensorflow/community/blob/master/sigs/build/community-builds.md

@whchung
Contributor Author

whchung commented Aug 31, 2018

@gunan Thanks for sharing the community build page with me. @parallelo will look into it and adapt our CI infrastructure to accommodate that.

Also, we would like to revive our other outstanding PRs, specifically those in StreamExecutor and the GPU common runtime which are blocked by this particular PR. We'll also start submitting PRs to enable operators on ROCm which are conditional on TENSORFLOW_USE_ROCM, introduced in this PR. Would it be possible to help expedite merging this PR? Thanks a lot.

With more developers working on the TensorFlow ROCm port now, we expect subsequent PRs to be revised and maintained in a timely fashion.

@yifeif
Contributor

yifeif commented Sep 5, 2018

This will need a manual pull for sure @gunan. @whchung do you mind resolving the latest conflicts? Reviewers, let me know if this is ready and I can give pulling a shot.

The commit contains the following components to support TensorFlow on the ROCm platform:

- bazel build system
- continuous integration logic

Authors:

- Jack Chung: jack.chung@amd.com
- Jeffrey Poznanovic: Jeffrey.Poznanovic@amd.com
- Peng Sun: Peng.Sun@amd.com
@whchung
Contributor Author

whchung commented Sep 6, 2018

@yifeif @gunan, my colleague @deven-amd has rebased and modified the PR.

@gunan gunan added the kokoro:force-run Tests on submitted change label Sep 11, 2018
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Sep 11, 2018
@aaroey aaroey self-requested a review September 12, 2018 20:19
@yifeif yifeif added ready to pull PR ready for merge process kokoro:force-run Tests on submitted change labels Sep 16, 2018
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Sep 17, 2018
@dagamayank

@yifeif any update to pull this PR in?

@drpngx
Contributor

drpngx commented Sep 21, 2018

@aaroey @meteorcloudy any additional comments?

@drpngx
Contributor

drpngx commented Sep 24, 2018

Oh, it looks like we have to import this manually.

@gunan
Contributor

gunan commented Sep 24, 2018

@yifeif is working on the manual import of this, she has been working on this for 2 weeks now.

@tensorflow-copybara tensorflow-copybara merged commit 69d3b8f into tensorflow:master Sep 27, 2018
tensorflow-copybara pushed a commit that referenced this pull request Sep 27, 2018
@gunan
Contributor

gunan commented Sep 27, 2018

We had to revert some "if_cuda_is_configured" uses to if_cuda to make all internal tests pass, but other than that, all of this change has been merged.

@yifeif
Contributor

yifeif commented Sep 27, 2018

We finally got this change merged! Thanks for the patience. We needed to change if_cuda_is_configured in tf_cuda_library back to if_cuda to get some internal targets to pass. Feel free to send another PR if this causes any issue and we can work out a patch.
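For readers tracking the distinction: if_cuda keys on a build-time setting (e.g. --config=cuda), while if_cuda_is_configured keys on what ./configure recorded, which is why builds without a configure step needed the former. A simplified Python model of the assumed semantics:

```python
def if_cuda(if_true, if_false=None, cuda_build_flag=False):
    # Model: resolved per build invocation via a flag such as --config=cuda.
    return if_true if cuda_build_flag else (if_false or [])

def if_cuda_is_configured(x, cuda_configured=False):
    # Model: resolved once, when ./configure sets up the CUDA repository.
    return x if cuda_configured else []

# Without a configure step the *_is_configured macros always come up empty,
# even when the build itself requests CUDA.
print(if_cuda_is_configured(["dep"]))          # []
print(if_cuda(["dep"], cuda_build_flag=True))  # ['dep']
```

This is only a sketch of the difference being described, not the actual Starlark implementations in build_defs.bzl.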

@drpngx
Contributor

drpngx commented Sep 27, 2018

Woohoo! Thank you @yifeif !

@whchung
Contributor Author

whchung commented Sep 27, 2018

Thank you @yifeif and @gunan . We’ll revise other pending PRs for ROCm , as well as submitting new ones :)


Labels

cla: yes ready to pull PR ready for merge process
