
Could you explain how you compiled r1.14? #13

Closed
yrwy opened this issue Jul 20, 2019 · 3 comments

Comments

@yrwy

yrwy commented Jul 20, 2019

I ran into several problems.

The first is that CUDA dylib detection fails: the library is clearly there, yet a "soname error" appears. Editing the .bzl file gets past it.
The second is a compile error in absl, apparently caused by a template using some keyword. Did you replace it with an older version?

@TomHeaven
Owner

v2.0.0 / v1.14 needs seven new patches. These are my raw modification notes; I haven't had time to clean them up, so make do with them:

v2.0.0-beta

CUDA compute capabilities used: 3.0, 3.5, 5.0, 5.2, 6.1, 7.0

  • Compile error 1: CUDA-related libraries not found. Related to third_party/gpus.

None of the libraries match their SONAME: /usr/local/cuda/lib64/libcudart.10.0.dylib

Workaround: simply comment out third_party/gpus/cuda_configure.bzl:554-560:

def find_lib(repository_ctx, paths, check_soname = True):
    """Finds a library among a list of potential paths.

    Args:
      paths: List of paths to inspect.

    Returns:
      Returns the first path in paths that exists.
    """
    objdump = repository_ctx.which("objdump")
    mismatches = []
    for path in [repository_ctx.path(path) for path in paths]:
        if not path.exists:
            continue
        #if check_soname and objdump != None and not _is_windows(repository_ctx):
        #    output = repository_ctx.execute([objdump, "-p", str(path)]).stdout
        #    output = [line for line in output.splitlines() if "SONAME" in line]
        #    sonames = [line.strip().split(" ")[-1] for line in output]
        #    if not any([soname == path.basename for soname in sonames]):
        #        mismatches.append(str(path))
        #        continue
        return path
    if mismatches:
        auto_configure_fail(
            "None of the libraries match their SONAME: " + ", ".join(mismatches),
        )
    auto_configure_fail("No library found under: " + ", ".join(paths))

Or modify it to:

output = repository_ctx.execute([objdump, "-p", str(path)]).stdout
output = [line for line in output.splitlines() if "name @rpath/" in line]
sonames = [line.strip().split("/")[-1] for line in output]
sonames = [sonames[0].strip().split(" ")[0] for line in output]

The cause of the error is that the information reported by the objdump command differs between macOS and Linux (the patch above matches macOS's "name @rpath/..." load-command lines instead of Linux's SONAME lines).

  • Compile error 2:
ERROR: /private/var/tmp/_bazel_tomheaven/561821a038e9c8d51ab53646fb4bd33f/external/local_config_cuda/cuda/BUILD:168:1: Couldn't build file external/local_config_cuda/cuda/cuda/include/builtin_types.h: Executing genrule @local_config_cuda//cuda:cuda-include failed (Exit 1)
cp: the -H, -L, and -P options may not be specified with the -r option.

Cause: macOS's cp command does not accept the -rLf argument combination. Change line 935 of third_party/gpus/cuda_configure.bzl to:

#cmd = \"""cp -rLf "%s/." "%s/" \""",
#)""" % (name, "\n".join(outs), src_dir, out_dir)

cmd = \"""cp -r -f "%s/." "%s/" \""",
)""" % (name, "\n".join(outs), src_dir, out_dir)
  • Compile error 3: libcuda not found. Cause: cuda_configure.bzl looks for libcuda.dylib under /usr/local/cuda/lib64/stubs, while libcuda.dylib actually lives in /usr/local/cuda/lib64/. Two fixes:

Option 1: modify the find_lib function at third_party/gpus/cuda_configure.bzl:605:

    #stub_dir = "" if _is_windows(repository_ctx) else "/stubs"
    stub_dir = "" if _is_windows(repository_ctx) else ""

Option 2: copy libcuda.dylib into place:

cd /usr/local/cuda/lib64/

sudo cp libcuda.dylib stubs/
  • Compile error 4: ./tensorflow/core/util/gpu_device_functions.h(144): error: identifier "__nvvm_read_ptx_sreg_laneid" is undefined

Change lines 142-147 to:

#if GOOGLE_CUDA
//#if __clang__
// return __nvvm_read_ptx_sreg_laneid();
//#else // __clang__
  asm("mov.u32 %0, %%laneid;" : "=r"(lane_id));
//#endif // __clang__
  • Compile error 5:

external/com_google_absl/absl/container/internal/compressed_tuple.h:170:53: error: use 'template' keyword to treat 'Storage' as a dependent template name
return (std::move(*this).internal_compressed_tuple::Storage< CompressedTuple, I> ::get()); 

Edit the source at bazel-tensorflow/external/com_google_absl/absl/container/internal/compressed_tuple.h:168-178 and comment out the two problematic functions:

/*template <int I>
  ElemT<I>&& get() && {
    return std::move(*this).internal_compressed_tuple::template Storage<CompressedTuple, I>::get();
  }
  template <int I>
  constexpr const ElemT<I>&& get() const&& {
    return absl::move(*this).internal_compressed_tuple::template Storage<CompressedTuple, I>::get();
  }*/

These two functions are devils: no matter how I changed them, compilation still failed, so commenting them out was the only way.

Reference: https://stackoverflow.com/questions/3786360/confusing-template-error

  • Compile error 6:
tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc(46): error: calling a __host__ function("std::__1::operator ==<float> ") from a __global__ function("tensorflow::SolveForSizeOneOrTwoKernel< ::std::__1::complex<float> > ") is not allowed

tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc(55): error: calling a __host__ function("std::__1::operator ==<float> ") from a __global__ function("tensorflow::SolveForSizeOneOrTwoKernel< ::std::__1::complex<float> > ") is not allowed

Change __global__ to __device__:

//__global__ void SolveForSizeOneOrTwoKernel(const int m, const Scalar* diags,

__device__ void SolveForSizeOneOrTwoKernel(const int m, const Scalar* diags,
  • Compile error 7:
tensorflow/core/kernels/conv_grad_filter_ops.cc:736:18: error: constexpr variable 'kComputeInNHWC' must be initialized by a constant expression
  constexpr auto kComputeInNHWC =

Edit several source files (conv_grad_filter_ops.cc, conv_grad_input_ops.cc, conv_ops.cc; the v1.14.0 release also needs this one) and remove the two constexpr qualifiers in each.

  • The earlier source patches still apply. Build as usual.

@yrwy
Author

yrwy commented Aug 1, 2019

After I replaced ABSL with the one from r1.13, it compiled fine.
XLA fails to build on r1.14: at the very end of the build it still goes looking for third_party/nccl/nccl.h even though --nonccl was already passed.
Building XLA with Xcode 9.4.1 requires changing many constexpr to const,
but with Xcode 10.1 many of those spots don't need changing... Xcode is a real headache.
macOS 10.13 still has a video-memory leak; on macOS 10.12 you can copy
sudo cp /Library/Frameworks/CUDA.framework/Versions/A/Libraries/libcuda_378.10.10.10_mercury.dylib /Library/Frameworks/CUDA.framework/Versions/A/Libraries/libcuda_378.05.05_mercury.dylib
to support CUDA 10.1 and CUDA 10.

@TomHeaven
Owner

Looks like the problem has been solved.
