About 2 minor bug fixes on CUDA macOSX 10.13.6 #46803

llv22 · 2020-10-24T04:33:25Z

🐛 Bug

While building PyTorch 1.7 on macOS 10.13.6 with CUDA 10.1 (Update 2) and Xcode 10.1 (clang 1000.11.45.5), I found the following two bugs and patched them locally. I'm reporting them here in case you want to fix them upstream as well.

To Reproduce

Build PyTorch on the 1.7 branch with the following command:

MAGMA_HOME="/usr/local/lib/magma-cu101" MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ USE_OPENMP=OFF USE_FBGEMM=OFF python setup.py bdist_wheel

Steps to reproduce the behavior:

Error when building NNPACK

ModuleNotFoundError: No module named 'peachpy.x86_64.avx'

Error when building ../torch/csrc/jit/codegen/cuda/kernel_cache.cpp

../torch/csrc/jit/codegen/cuda/kernel_cache.cpp:29:25: error: format specifies type 'long' but the argument has type 'int64_t' (aka 'long long') [-Werror,-Wformat]
        printf("%ld, ", shape_symbol.static_size());
                ~~~     ^~~~~~~~~~~~~~~~~~~~~~~~~~
                %lld
../torch/csrc/jit/codegen/cuda/kernel_cache.cpp:31:28: error: format specifies type 'long' but the argument has type 'int64_t' (aka 'long long') [-Werror,-Wformat]
        printf("s(%ld), ", *reinterpret_cast<const int64_t*>(&shape_symbol));
                  ~~~      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                  %lld
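The diagnostic above comes from a platform difference: both `long` and `long long` are 64 bits wide on 64-bit macOS and Linux, but `int64_t` is an alias for `long long` on macOS and typically for `long` on 64-bit Linux with glibc, so `%ld` only matches on the latter. The following is a minimal sketch for checking which alias a given toolchain uses; the function name `int64_underlying_name` is hypothetical and not part of PyTorch.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical helper (not part of PyTorch): reports which built-in type
// std::int64_t aliases on the current toolchain.
std::string int64_underlying_name() {
  if (std::is_same<std::int64_t, long>::value) {
    return "long";       // typical for 64-bit Linux with glibc
  }
  if (std::is_same<std::int64_t, long long>::value) {
    return "long long";  // what clang reports on macOS in the error above
  }
  return "other";
}
```

If this returns "long long", a `%ld` format specifier applied to an `int64_t` argument triggers `-Wformat`, which the build promotes to an error via `-Werror`.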

My code changes

For the NNPACK build issue: the PeachPy instruction modules (such as avx.py) have to be generated ahead of time.
At https://github.com/pytorch/pytorch/blob/master/cmake/External/nnpack.cmake#51, add the following code:

  # Orlando: check if avx bytecode has been generated or not
  if(NOT EXISTS "${CAFFE2_THIRD_PARTY_ROOT}/python-peachpy/peachpy/x86_64/avx.py")
    execute_process(COMMAND python setup.py develop
      WORKING_DIRECTORY "${CAFFE2_THIRD_PARTY_ROOT}/python-peachpy")
  endif()

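For the error when building ../torch/csrc/jit/codegen/cuda/kernel_cache.cpp:
https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/codegen/cuda/kernel_cache.cpp#L48
Here int64_t resolves to the definition in third_party/eigen/Eigen/src/Core/util/Meta.h, which is 'long long' on this platform, so the format specifier changes from %ld to %lld: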

      const auto& shape_symbol = sizes.value()[i];
      if (shape_symbol.is_static()) {
        printf("%lld, ", shape_symbol.static_size());
      } else {
        printf("s(%lld), ", *reinterpret_cast<const int64_t*>(&shape_symbol));
      }
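Note that `%lld` fixes the macOS build but can produce the mirror-image warning on platforms where `int64_t` is `long`. One portable alternative, shown here only as a sketch and not as the change that was merged upstream, is the standard `PRId64` macro from `<cinttypes>`. The helper `format_dim` and its parameters are hypothetical names introduced for illustration; they mimic the "N, " and "s(N), " output of the snippet above.

```cpp
#include <cassert>
#include <cinttypes>  // PRId64
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of PyTorch): formats one dimension as
// "N, " for a static size or "s(N), " for a symbolic one.
std::string format_dim(std::int64_t value, bool is_static) {
  char buf[40];  // an int64_t needs at most 20 characters plus decoration
  // PRId64 expands to the correct length modifier for int64_t on this
  // platform ("ld" or "lld"), so the same source compiles cleanly on both.
  const int written = is_static
      ? std::snprintf(buf, sizeof(buf), "%" PRId64 ", ", value)
      : std::snprintf(buf, sizeof(buf), "s(%" PRId64 "), ", value);
  assert(written > 0 && static_cast<std::size_t>(written) < sizeof(buf));
  return std::string(buf, static_cast<std::size_t>(written));
}
```

For example, `format_dim(42, true)` yields "42, " and `format_dim(-7, false)` yields "s(-7), ". Streaming the value through `std::ostream` would avoid format specifiers altogether.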

Environment

PyTorch Version (e.g., 1.0): 1.7
OS (e.g., Linux): macOS 10.13.6
How you installed PyTorch (conda, pip, source): source
Build command you used (if compiling from source): MAGMA_HOME="/usr/local/lib/magma-cu101" MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ USE_OPENMP=OFF USE_FBGEMM=OFF python setup.py bdist_wheel
Python version: 3.7
CUDA/cuDNN version: CUDA 10.1 Update 2, cuDNN 7.6.5
GPU models and configuration:
Any other relevant information:

cc @malfet @seemethere @walterddr @ngimel


llv22 · 2020-10-26T03:08:59Z

@osalpekar I tried to figure out how to submit a small merge request for these bugs. In the meantime, here is the patch:
orlando-for-patch-torch1.8-mac.patch.txt
The patch also applies to 1.5 through 1.7.

ezyang · 2020-10-26T17:12:19Z

@llv22 would you mind opening a pull request for your patch?

ezyang · 2020-10-26T17:29:18Z

By the way, CUDA on OS X is an unsupported configuration (we're happy to take your fixes, but we won't be testing these continuously)

llv22 · 2020-10-27T01:59:39Z

@ezyang I tried creating a new branch from master, but I can't push it to the pytorch repository to open a pull request. Is that because I'm missing permissions, or is there another process for getting a pull request ready?

ezyang · 2020-10-27T14:39:58Z

Make a fork of PyTorch, push your branch to that fork, and then open a PR from that branch.

llv22 · 2020-10-28T02:58:15Z

@ezyang OK, please refer to #46968.

llv22 changed the title ~~About 2 bug-fixes on macOSX 10.13.6~~ About 2 minor bug fixes on macOSX 10.13.6 Oct 24, 2020

llv22 mentioned this issue Oct 24, 2020

can't build pytorch on osx 10.13.6 for pytorch 1.5 and 1.6 TomHeaven/pytorch-osx-build#14

Closed

izdeby added the triage review label Oct 26, 2020

ezyang added the module: build (Build system issues), module: macos (Mac OS related issues), triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module), and module: nnpack (Related to our NNPack integration) labels Oct 26, 2020

malfet added the module: cuda (Related to torch.cuda, and CUDA support in general) label Oct 26, 2020

ezyang removed the triage review label Oct 26, 2020

malfet self-assigned this Oct 26, 2020

ezyang changed the title ~~About 2 minor bug fixes on macOSX 10.13.6~~ About 2 minor bug fixes on CUDA macOSX 10.13.6 Oct 26, 2020

llv22 mentioned this issue Oct 28, 2020

orlando - for fixing of int64_t and peachpy-python installation #46968

Closed

walterddr mentioned this issue Oct 29, 2020

Get rid of printf in cuda fuser debugPrint() #46994

Closed

malfet removed their assignment May 26, 2022
