minor fixes to build on Mac OS #62

Closed

Yangqing

This PR tries to make NCCL build on Mac. Tested on Mac OS 10.12 with clang 8.0.0 and CUDA 8.0.

  • I changed the Makefile to make sure that on Mac ("Darwin") we set the right lib and versions.
  • It seems that clang 8.0.0 has an issue with nullptr_t so we need to manually specify the type.
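The diff itself is not reproduced in this thread, so the following is only a generic C++ sketch of what "manually specifying the type" can look like when a bare nullptr does not carry enough type information. The helper isUnset and the function typedNullExamples are hypothetical names for illustration, not code from NCCL or from this PR.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, for illustration only (not from NCCL or this PR).
template <typename T>
bool isUnset(const T* ptr) {
  return ptr == nullptr;
}

// The call below would not compile, because T cannot be deduced from
// std::nullptr_t:
//   isUnset(nullptr);

// Two ways to state the pointer type explicitly instead.
inline bool typedNullExamples() {
  bool viaTemplateArg = isUnset<float>(nullptr);               // explicit template argument
  bool viaCast = isUnset(static_cast<float*>(nullptr));        // explicitly typed null pointer
  return viaTemplateArg && viaCast;
}
```

Either form removes the dependence on how the compiler treats nullptr_t at the call site.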

I haven't tested this on a Linux box, but hopefully I haven't touched anything for Linux.

@Yangqing
Author

friendly ping @sjeaugey

@codeAC29

Thank you @Yangqing !!! This fix was helpful.

@allenwq

allenwq commented Jul 1, 2018

Compiling src/all_gather.cu                   > /Users/allen/Documents/dev/nccl/build/obj/all_gather.o
src/common_kernel.h(237): error: class "__half" has no member "x"

src/common_kernel.h(237): error: class "__half" has no member "x"

src/common_kernel.h(250): error: class "__half" has no member "x"

src/common_kernel.h(250): error: class "__half" has no member "x"

src/copy_kernel.h(28): error: class "__half" has no member "x"

src/copy_kernel.h(28): error: class "__half" has no member "x"

6 errors detected in the compilation of "/var/folders/yh/hys1jw_n29z1rkk87r9l4z8r0000gn/T//tmpxft_0001400c_00000000-8_all_gather.compute_52.cpp1.ii".
make: *** [/Users/allen/Documents/dev/nccl/build/obj/all_gather.o] Error 1

I get these errors on Mac OS 10.12.6 with CUDA 9.2.

@Yangqing do you have any ideas?

@allenwq

allenwq commented Jul 1, 2018

For anyone facing the issue above, running git rebase master fixes it.

@CapoeiraShaolin1

CapoeiraShaolin1 commented Sep 22, 2019

(1) As allenwq commented on Jul 1, 2018, NVIDIA/NCCL/common_kernel.h has been revised for CUDA 9 and later. How about updating this branch like this:

line 217

#ifdef CUDA_HAS_HALF
#if CUDART_VERSION < 9000
template<> inline __device__
half vFetch<half>(const volatile half* ptr) {
  half r;
  r.x = ptr->x;
  return r;
}
#else
template<> inline __device__
half vFetch<half>(const volatile half* ptr) {
  half r;
  r = ((half*)ptr)[0];
  return r;
}

#endif  //  CUDART_VERSION 
#endif  //  CUDA_HAS_HALF 

template<typename T> inline __device__
void vStore(volatile T* ptr, const T val) {
  *ptr = val;
}
#ifdef CUDA_HAS_HALF
#if CUDART_VERSION < 9000
template<> inline __device__
void vStore<half>(volatile half* ptr, const half val) {
  ptr->x = val.x;
}
#else
template<> inline __device__
void vStore<half>(volatile half* ptr, const half val) {
  ((half*)ptr)[0] = val;
}
#endif  //  CUDART_VERSION 
#endif  //  CUDA_HAS_HALF 
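To illustrate why the CUDA 9 branch above copies the whole object through a non-volatile pointer, here is a host-only C++ sketch. OpaqueHalf is a hypothetical stand-in for a half type whose storage member is no longer publicly accessible; it is not the real __half from cuda_fp16.h, and the __device__ qualifiers are dropped so the sketch compiles without CUDA. A class with only the implicit copy operations cannot be copied from or assigned to through a volatile lvalue, so the specializations strip the qualifier first.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a half type with non-public storage
// (an assumption for illustration, not the real __half).
class OpaqueHalf {
 public:
  OpaqueHalf() : bits_(0) {}
  explicit OpaqueHalf(std::uint16_t bits) : bits_(bits) {}
  std::uint16_t bits() const { return bits_; }

 private:
  std::uint16_t bits_;  // no public ".x" member to read or write
};

// Generic versions: fine for built-in types such as int or float.
template <typename T>
inline T vFetch(const volatile T* ptr) {
  return *ptr;
}

template <typename T>
inline void vStore(volatile T* ptr, const T val) {
  *ptr = val;
}

// Specializations: the implicit copy constructor and copy assignment of
// OpaqueHalf do not accept volatile objects, so drop the qualifier and
// copy the whole object. This gives up volatile semantics for the access.
template <>
inline OpaqueHalf vFetch<OpaqueHalf>(const volatile OpaqueHalf* ptr) {
  return *const_cast<const OpaqueHalf*>(ptr);
}

template <>
inline void vStore<OpaqueHalf>(volatile OpaqueHalf* ptr, const OpaqueHalf val) {
  *const_cast<OpaqueHalf*>(ptr) = val;
}
```

In this sketch the pointed-to object should itself be declared non-volatile (only the pointer is volatile-qualified), since standard C++ leaves access to a volatile-defined object through a non-volatile lvalue undefined.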

(2) The argument order of ncclAllGather() looks inconsistent with NVIDIA/NCCL/nccl.h, specifically the position of recvbuff. Could you check nccl.h and all_gather.cu? It should be

ncclResult_t  ncclAllGather(const void* sendbuff, void* recvbuff, int count,
    ncclDataType_t datatype, ncclComm_t comm, cudaStream_t stream);
ncclResult_t pncclAllGather(const void* sendbuff, void* recvbuff, int count,
    ncclDataType_t datatype, ncclComm_t comm, cudaStream_t stream);
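For readers unsure how sendbuff, recvbuff and count relate in an all-gather, the following CPU-only sketch may help. It is not NCCL code: allGatherReference is a hypothetical helper that only models the data layout, assuming each rank contributes count elements and that rank r's contribution lands at offset r * count in the receive buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU-only reference (not part of NCCL).
// sendbuffs[r] holds the `count` elements contributed by rank r.
// The result models what every rank's recvbuff would contain afterwards:
// nranks * count elements, with rank r's data at offset r * count.
std::vector<float> allGatherReference(
    const std::vector<std::vector<float>>& sendbuffs, std::size_t count) {
  std::vector<float> recvbuff(sendbuffs.size() * count);
  for (std::size_t rank = 0; rank < sendbuffs.size(); ++rank) {
    assert(sendbuffs[rank].size() == count);
    for (std::size_t i = 0; i < count; ++i) {
      recvbuff[rank * count + i] = sendbuffs[rank][i];
    }
  }
  return recvbuff;
}
```

The practical consequence is that recvbuff must be sized for nranks * count elements while sendbuff holds only count, so swapping the two pointer arguments would overrun the smaller buffer.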

Thanks,

@CapoeiraShaolin1

libwrap.cu loads "libnvidia-ml.so" (NVML, the NVIDIA Management Library), which is not available on MacOS. Has anyone found a way to work around the missing libnvidia-ml.so on MacOS?

@Yangqing Yangqing closed this Apr 2, 2024