GHC-8 problems #39

Closed
tmcdonell opened this issue Jul 4, 2016 · 24 comments

@tmcdonell
Owner

commented Jul 4, 2016

Originally reported by David Duke.


While working with the Haskell CUDA library on OS X 10.11 I started getting a strange set of behaviours, and wondered if you had come across anything similar. I recently updated both my GHC installation (to 8.0.1) and my CUDA toolkit (to 7.5). I therefore wanted to update Accelerate etc., but noticed that your cuda package was only listed as supporting up to 7.0. As I don't believe there are substantial changes from 7.0 -> 7.5, I thought it should still work (and I need the later CUDA for work not involving Haskell).

However, I found that Haskell code that called the CUDA library was aborting, and tracked the failure down to the call to cuInit (made through "initialise" in your library) returning error code 2 (CUDA_ERROR_OUT_OF_MEMORY). It's not clear why this should be happening, and to explore further I:

  1. created my own simple C wrapper function around cuInit, which displays the arg and result.
  2. wrote a C driver to call the wrapper; when executed cuInit is called and returns error code 0.
  3. wrote a Haskell driver to call the simple wrapper directly via FFI: now when the wrapper is executed cuInit returns error code 2.

Given the simplicity of the two programs, I'm scratching my head for possible causes: when called from C, the wrapper is showing the correct arg and result; when called from Haskell it shows the correct arg but the wrong result! Here are the compiler invocations and runtime results (programs are attached):

~> gcc -c -I /usr/local/cuda/include  cuwrap.c
~> ghc callFromHs.hs  cuwrap.o -L /usr/local/cuda/lib/ -lcuda
~> gcc -o callFromC callFromC.c cuwrap.o -L /usr/local/cuda/lib/ -lcuda
~> ./callFromC
Running main.
cuInit called: arg 0, result 0
Main completed, result 0

~> ./callFromHs
Running Main
cuInit called: arg 0, result 2
Main completed, result 2
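
For readers without the gist to hand, the wrapper described in step 1 might look roughly like the sketch below. It is not the attached cuwrap.c: so that it compiles without the CUDA toolkit, the function being wrapped is passed in as a pointer with the same shape as cuInit (`CUresult cuInit(unsigned int Flags)`), and `traced_init`, `cu_result_name` and `stub_init_oom` are hypothetical names. The result names are the first few CUresult values from cuda.h.

```c
#include <stdio.h>

/* First few CUresult values from the CUDA driver API header (cuda.h). */
static const char *cu_result_name(int code)
{
    switch (code) {
    case 0:  return "CUDA_SUCCESS";
    case 1:  return "CUDA_ERROR_INVALID_VALUE";
    case 2:  return "CUDA_ERROR_OUT_OF_MEMORY";
    case 3:  return "CUDA_ERROR_NOT_INITIALIZED";
    default: return "(other CUresult)";
    }
}

/* Same shape as cuInit; a plain int stands in for CUresult here. */
typedef int (*init_fn)(unsigned int flags);

/* Hypothetical wrapper: call the initialiser, then print its argument
 * and result so the C and Haskell callers can be compared. */
int traced_init(init_fn init, unsigned int flags)
{
    int result = init(flags);
    printf("cuInit called: arg %u, result %d (%s)\n",
           flags, result, cu_result_name(result));
    return result;
}

/* Stand-in for cuInit (assumption: used only so the sketch can run
 * without a GPU). It always reports the failure seen from Haskell. */
static int stub_init_oom(unsigned int flags)
{
    (void)flags;
    return 2;
}
```

Calling `traced_init(stub_init_oom, 0)` returns 2 and prints a line in the same format as the logs above, with the result name appended. With the toolkit installed, the body of the wrapper would call cuInit directly instead of going through the function pointer.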

I haven't had a chance to regress to ghc-7.10.3, and was also planning to try the code on linux once Cuda is reinstalled next week. Wondered if you had come across anything similar - or could check what happens on a different configuration?

Attachments: https://gist.github.com/tmcdonell/ee7c5183633a3687dafd15023f15a914

@tmcdonell

Owner Author

commented Jul 4, 2016

Reproduced on my machine: Mac OS 10.11.5, CUDA 7.5.26, GHC-8.0.1. Seems to be fine with GHC-7.10.3 though. Hmm...

@tmcdonell

Owner Author

commented Jul 15, 2016

Worked fine for me on an Ubuntu 12.04 box with GHC 8.0.1, so possibly confined to OS X.

@tmcdonell tmcdonell added the macOS label Jul 15, 2016

@tmcdonell

Owner Author

commented Jul 15, 2016

@mchakravarty @robeverest if you have a different configuration could you try this on your machine?

@tmcdonell tmcdonell referenced this issue Jul 15, 2016

Closed

Preparation for GHC 8.0 #303

17 of 18 tasks complete
@tmcdonell

Owner Author

commented Jul 15, 2016

Here is an interesting ticket which discusses cuInit failing due to trying to mmap a specific region. GHC-8's new memory allocator may be interfering with this.

This is just a hypothesis however, which I'm not yet sure how to test.
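
A first step towards testing this could be to probe whether a given address range is still unmapped once the GHC runtime has started. The sketch below is an illustration under assumptions, not something taken from the linked ticket: `region_is_free` is a hypothetical helper, and the range the CUDA driver actually wants is not stated here, so the caller has to supply it. It relies on the usual Linux and macOS behaviour that mmap without MAP_FIXED treats the address as a hint and places the mapping elsewhere if the range is already taken.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON   /* older BSD/macOS spelling */
#endif

/* Hypothetical helper. Returns 1 if a probe mapping landed exactly at
 * `hint`, 0 if the kernel placed it elsewhere (range occupied, or hint
 * not honoured), and -1 if mmap failed outright. The probe mapping is
 * always removed again before returning. */
int region_is_free(void *hint, size_t len)
{
    void *p = mmap(hint, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int honoured = (p == hint);
    munmap(p, len);
    return honoured;
}
```

Calling this from a small C function invoked through the FFI, once under GHC-7.10.3 and once under GHC-8.0.1, would show whether the new allocator has already claimed the range in question. A result of 0 is only suggestive, since the kernel is free to ignore the hint for other reasons.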

@robeverest


commented Jul 15, 2016

So my OSX configuration is unfortunately the same as yours, but I can confirm I'm seeing the same bug. I did also try it out on Ubuntu 14.04 and it works as expected.

@niobium0


commented Aug 3, 2016

nvidia-device-query dies on CUDA.initialise under

(Ubuntu 16.04, GHC 8.0.1, cuda-7.5, nvidia-361) and
(Ubuntu 16.04, GHC 8.0.1, cuda-8, nvidia-367).

Programs written on top of accelerate worked fine under
(Ubuntu 16.04, GHC 7.10.3, cuda-7.5, nvidia-361).

@tmcdonell

Owner Author

commented Aug 3, 2016

@niobium0 Okay, thanks for the confirmation that this is not limited to macOS.

@tmcdonell tmcdonell added the linux label Aug 3, 2016

tmcdonell added a commit that referenced this issue Aug 5, 2016

Initial support for GHC-8
See the description in 'init.c' for details of the problem. This trick works for compiled programs, but we still have problems with running under ghci.

towards: #39

tmcdonell added a commit that referenced this issue Aug 9, 2016

Change approach to GHC-8 support
On initialisation just reserve the memory block that will be required by the CUDA driver, and release it only once the user calls 'cuInit'.

This still doesn't work with ghci, but feels like it is moving in the right direction. (Now, 'cuInit' crashes with 'SIGBUS' (macos) or 'SIGSEGV' (ubuntu), rather than giving the same "out of memory error" even if we had already called 'cuInit' by the previous method via LD_PRELOAD/DYLD_INSERT_LIBRARIES before the RTS initialised.)

towards: #39
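
The reserve-then-release idea in this commit message can be sketched as follows. This is not the code from 'init.c': `reserve_region` and `release_region` are hypothetical names, and the address and size the driver needs are left as parameters because they are not given here. In a real build the reservation would have to run before the GHC RTS starts (for example from a function marked `__attribute__((constructor))`, which is an assumption about how one might arrange that), and the release would happen immediately before calling 'cuInit'.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON   /* older BSD/macOS spelling */
#endif

/* Book-keeping for the single placeholder mapping. */
static void  *reserved_base = NULL;
static size_t reserved_len  = 0;

/* Hypothetical: claim `len` bytes of address space with an inaccessible
 * mapping so that nothing else is placed there. Pass hint == NULL to
 * accept any address. Returns 0 on success, -1 on failure. */
int reserve_region(void *hint, size_t len)
{
    if (reserved_base != NULL)
        return -1;                      /* already holding a region */

    void *p = mmap(hint, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    if (hint != NULL && p != hint) {    /* wanted range was not available */
        munmap(p, len);
        return -1;
    }

    reserved_base = p;
    reserved_len  = len;
    return 0;
}

/* Hypothetical: give the range back just before the driver needs it.
 * Safe to call when nothing is reserved. Returns munmap's result. */
int release_region(void)
{
    if (reserved_base == NULL)
        return 0;

    int rc = munmap(reserved_base, reserved_len);
    reserved_base = NULL;
    reserved_len  = 0;
    return rc;
}
```

There is an unavoidable window between `release_region` and the driver's own mapping in which another thread could claim the range, so this narrows the problem rather than eliminating it.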
@tmcdonell

Owner Author

commented Aug 9, 2016

I will note that these workarounds probably aren't going to work on Windows...

@niobium0


commented Aug 9, 2016

Trevor, thank you for the swift fix. Unfortunately I don't have a Windows machine at hand, but can verify that everything works as expected in my setup (Ubuntu 16.04, GHC 8.0.1, cuda-7.5, nvidia-367).

@djduke


commented Aug 24, 2016

Thanks for working on this, Trevor. Not sure if it's meant to be in a sufficiently stable state to build yet, so ignore this if premature, but when I tried building on OS X 10.11 with GHC 8.0.1 and gcc (Apple LLVM version 7.3.0, clang-703.0.31) I ran into problems, apparently due to dynamic linking:

ld: -rpath can only be used when creating a dynamic final linked image

for modules Foreign.CUDA.Analysis.Device and Foreign.CUDA.Types. It's unclear whether this is an issue with the modified Setup.hs, OS X generally, or just my particular toolset.

@tmcdonell

Owner Author

commented Sep 8, 2016

@alpmestan


commented Sep 30, 2016

Hello (again) Trevor :)

I ran into the same issue under archlinux, both against CUDA 7 and 8. It seems this fix hasn't been released yet, any reason for that? It's keeping me from using ghc 8 which isn't that big of a deal but still a bit annoying =)

@tmcdonell

Owner Author

commented Oct 6, 2016

@alpmestan sorry, just got back from conference travel and am catching up with things. The main problem is that I didn't yet get this to work under ghci. I guess having compiled programs working at least is a big plus, so I'll finalise and throw it up on hackage shortly.

@tmcdonell

Owner Author

commented Oct 7, 2016

@alpmestan


commented Oct 7, 2016

Thanks! It's indeed annoying that it doesn't work in ghci but is still OK. Does the patch going in ghc 8.0.2 fix the ghci issue or is that one not fixed at all?

@djduke


commented Oct 7, 2016

On 7 Oct 2016, at 09:33, Trevor L. McDonell notifications@github.com wrote:

https://hackage.haskell.org/package/cuda-0.7.5.0



Thanks Trevor.

I'm still getting a build problem on OSX, however:

[29 of 37] Compiling Foreign.CUDA.Runtime ( Foreign/CUDA/Runtime.hs, dist/build/Foreign/CUDA/Runtime.o )
[30 of 37] Compiling Foreign.CUDA.Runtime.Texture ( dist/build/Foreign/CUDA/Runtime/Texture.hs, dist/build/Foreign/CUDA/Runtime/Texture.o )
[31 of 37] Compiling Foreign.CUDA.Driver.Marshal ( dist/build/Foreign/CUDA/Driver/Marshal.hs, dist/build/Foreign/CUDA/Driver/Marshal.o )
[32 of 37] Compiling Foreign.CUDA.Driver.IPC.Marshal ( dist/build/Foreign/CUDA/Driver/IPC/Marshal.hs, dist/build/Foreign/CUDA/Driver/IPC/Marshal.o )
[33 of 37] Compiling Foreign.CUDA.Driver.Texture ( dist/build/Foreign/CUDA/Driver/Texture.hs, dist/build/Foreign/CUDA/Driver/Texture.o )
[34 of 37] Compiling Foreign.CUDA.Driver.Module.Query ( dist/build/Foreign/CUDA/Driver/Module/Query.hs, dist/build/Foreign/CUDA/Driver/Module/Query.o )
[35 of 37] Compiling Foreign.CUDA.Driver.Module ( Foreign/CUDA/Driver/Module.hs, dist/build/Foreign/CUDA/Driver/Module.o )
[36 of 37] Compiling Foreign.CUDA.Driver ( Foreign/CUDA/Driver.hs, dist/build/Foreign/CUDA/Driver.o )
[37 of 37] Compiling Foreign.CUDA ( Foreign/CUDA.hs, dist/build/Foreign/CUDA.o )
[ 1 of 37] Compiling Foreign.CUDA.Internal.C2HS ( Foreign/CUDA/Internal/C2HS.hs, dist/build/Foreign/CUDA/Internal/C2HS.p_o )

Foreign/CUDA/Internal/C2HS.hs:202:3: warning: [-Winline-rule-shadowing]
    Rule "cFloatConv/CFloat->Float" may never fire
      because ‘Foreign.C.Types.CFloat’ might inline first
    Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CFloat’

Foreign/CUDA/Internal/C2HS.hs:204:3: warning: [-Winline-rule-shadowing]
    Rule "cFloatConv/CDouble->Double" may never fire
      because ‘Foreign.C.Types.CDouble’ might inline first
    Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CDouble’
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)
`gcc' failed in phase `Linker'. (Exit code: 1)
cabal: Leaving directory '.'
cabal: Error: some packages failed to install:
cuda-0.7.5.0 failed during the building phase. The exception was:
ExitFailure 1

This was from a fresh clone of the cuda repo. Are you able to build under OSX, if so could you confirm compiler version, I'm using the following:

Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

Best regards,
David.



@tmcdonell

Owner Author

commented Oct 7, 2016

@alpmestan As far as I know a fix has been merged, so hopefully that will be in 8.0.2. If it doesn't make it to that release (or 8.0.2 doesn't come out for a while) I'll have another crack at trying to make it.

@tmcdonell

Owner Author

commented Oct 7, 2016

@djduke working for me on OS X on both 7.10.3 and 8.0.1. I haven't upgraded to Sierra yet, I am still on El Capitan (10.11.6).

> gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

> clang --version
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

> c2hs --version
C->Haskell Compiler, version 0.28.1 Switcheroo, 1 April 2016
  build platform is "x86_64-darwin" <1, True, True, 1>

What is in your cuda.buildinfo[.generated] file? I have -rpath options in there with no problem:

buildable: True
cc-options: -I/usr/local/cuda/include
ld-options: -L/usr/local/cuda/lib
frameworks: CUDA
extra-libraries:
    cudadevrt
    cudart_static
extra-ghci-libraries: cudart
extra-lib-dirs: /usr/local/cuda/lib
ghc-options: -optc-I/usr/local/cuda/include -optl-L/usr/local/cuda/lib -optl-Wl,-rpath,/usr/local/cuda/lib
x-extra-c2hs-options: --cppopts=-E --cppopts=-m64 --cppopts=-DUSE_EMPTY_CASE --cppopts=-U__BLOCKS__
@djduke


commented Oct 8, 2016

Hi Trevor,

As far as I can see, my tool configuration matches yours. My c2hs was older (2015); I updated it and tried again, but the problem persists. Here is a log of building cuda from a fresh clone of your repo, along with version info for the tools. I also ran cabal build with verbose=3, and looked at the output of the final set of commands (output at the end).

Regards,
David.

scsdjd:GitRepos> git clone https://github.com/tmcdonell/cuda
Cloning into 'cuda'...
remote: Counting objects: 3661, done.
remote: Compressing objects: 100% (50/50), done.
remote: Total 3661 (delta 22), reused 0 (delta 0), pack-reused 3611
Receiving objects: 100% (3661/3661), 1.68 MiB | 249.00 KiB/s, done.
Resolving deltas: 100% (1954/1954), done.
Checking connectivity... done.

scsdjd:GitRepos> ghc --version
The Glorious Glasgow Haskell Compilation System, version 8.0.1

scsdjd:GitRepos> gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

scsdjd:GitRepos> clang --version
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

scsdjd:GitRepos> c2hs --version
C->Haskell Compiler, version 0.28.1 Switcheroo, 1 April 2016
  build platform is "x86_64-darwin" <1, True, True, 1>

scsdjd:cuda> cabal configure
Resolving dependencies...
[1 of 1] Compiling Main             ( dist/setup/setup.hs, dist/setup/Main.o )
Linking ./dist/setup/setup ...
Configuring cuda-0.7.5.0...
Found CUDA toolkit at: /usr/local/cuda
Storing parameters to cuda.buildinfo.generated
Using build information from 'cuda.buildinfo.generated'.
Provide a 'cuda.buildinfo' file to override this behaviour.

scsdjd:cuda> cabal build
Using build information from 'cuda.buildinfo.generated'.
Provide a 'cuda.buildinfo' file to override this behaviour.
Building cuda-0.7.5.0...
Preprocessing library cuda-0.7.5.0...
[ 1 of 37] Compiling Foreign.CUDA.Internal.C2HS ( Foreign/CUDA/Internal/C2HS.hs, dist/build/Foreign/CUDA/Internal/C2HS.o )

Foreign/CUDA/Internal/C2HS.hs:202:3: warning: [-Winline-rule-shadowing]
    Rule "cFloatConv/CFloat->Float" may never fire
      because ‘Foreign.C.Types.CFloat’ might inline first
    Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CFloat’

Foreign/CUDA/Internal/C2HS.hs:204:3: warning: [-Winline-rule-shadowing]
    Rule "cFloatConv/CDouble->Double" may never fire
      because ‘Foreign.C.Types.CDouble’ might inline first
    Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CDouble’
[ 2 of 37] Compiling Foreign.CUDA.Driver.Error ( dist/build/Foreign/CUDA/Driver/Error.hs, dist/build/Foreign/CUDA/Driver/Error.o )
[ 3 of 37] Compiling Foreign.CUDA.Driver.Profiler ( dist/build/Foreign/CUDA/Driver/Profiler.hs, dist/build/Foreign/CUDA/Driver/Profiler.o )
[ 4 of 37] Compiling Foreign.CUDA.Driver.Utils ( dist/build/Foreign/CUDA/Driver/Utils.hs, dist/build/Foreign/CUDA/Driver/Utils.o )
[ 5 of 37] Compiling Foreign.CUDA.Runtime.Error ( dist/build/Foreign/CUDA/Runtime/Error.hs, dist/build/Foreign/CUDA/Runtime/Error.o )
[ 6 of 37] Compiling Foreign.CUDA.Runtime.Utils ( dist/build/Foreign/CUDA/Runtime/Utils.hs, dist/build/Foreign/CUDA/Runtime/Utils.o )
[ 7 of 37] Compiling Foreign.CUDA.Analysis.Device ( dist/build/Foreign/CUDA/Analysis/Device.hs, dist/build/Foreign/CUDA/Analysis/Device.o )
[ 8 of 37] Compiling Foreign.CUDA.Analysis.Occupancy ( Foreign/CUDA/Analysis/Occupancy.hs, dist/build/Foreign/CUDA/Analysis/Occupancy.o )
[ 9 of 37] Compiling Foreign.CUDA.Runtime.Device ( dist/build/Foreign/CUDA/Runtime/Device.hs, dist/build/Foreign/CUDA/Runtime/Device.o )
[10 of 37] Compiling Foreign.CUDA.Driver.Device ( dist/build/Foreign/CUDA/Driver/Device.hs, dist/build/Foreign/CUDA/Driver/Device.o )
[11 of 37] Compiling Foreign.CUDA.Driver.Context.Base ( dist/build/Foreign/CUDA/Driver/Context/Base.hs, dist/build/Foreign/CUDA/Driver/Context/Base.o )
[12 of 37] Compiling Foreign.CUDA.Driver.Context.Peer ( dist/build/Foreign/CUDA/Driver/Context/Peer.hs, dist/build/Foreign/CUDA/Driver/Context/Peer.o )
[13 of 37] Compiling Foreign.CUDA.Driver.Context.Primary ( dist/build/Foreign/CUDA/Driver/Context/Primary.hs, dist/build/Foreign/CUDA/Driver/Context/Primary.o )
[14 of 37] Compiling Foreign.CUDA.Driver.Module.Base ( dist/build/Foreign/CUDA/Driver/Module/Base.hs, dist/build/Foreign/CUDA/Driver/Module/Base.o )
[15 of 37] Compiling Foreign.CUDA.Driver.Module.Link ( dist/build/Foreign/CUDA/Driver/Module/Link.hs, dist/build/Foreign/CUDA/Driver/Module/Link.o )
[16 of 37] Compiling Foreign.CUDA.Analysis ( Foreign/CUDA/Analysis.hs, dist/build/Foreign/CUDA/Analysis.o )
[17 of 37] Compiling Foreign.CUDA.Types ( dist/build/Foreign/CUDA/Types.hs, dist/build/Foreign/CUDA/Types.o )
[18 of 37] Compiling Foreign.CUDA.Runtime.Event ( dist/build/Foreign/CUDA/Runtime/Event.hs, dist/build/Foreign/CUDA/Runtime/Event.o )
[19 of 37] Compiling Foreign.CUDA.Runtime.Stream ( dist/build/Foreign/CUDA/Runtime/Stream.hs, dist/build/Foreign/CUDA/Runtime/Stream.o )
[20 of 37] Compiling Foreign.CUDA.Runtime.Exec ( dist/build/Foreign/CUDA/Runtime/Exec.hs, dist/build/Foreign/CUDA/Runtime/Exec.o )
[21 of 37] Compiling Foreign.CUDA.Driver.Context.Config ( dist/build/Foreign/CUDA/Driver/Context/Config.hs, dist/build/Foreign/CUDA/Driver/Context/Config.o )
[22 of 37] Compiling Foreign.CUDA.Driver.Context ( Foreign/CUDA/Driver/Context.hs, dist/build/Foreign/CUDA/Driver/Context.o )
[23 of 37] Compiling Foreign.CUDA.Driver.Event ( dist/build/Foreign/CUDA/Driver/Event.hs, dist/build/Foreign/CUDA/Driver/Event.o )
[24 of 37] Compiling Foreign.CUDA.Driver.IPC.Event ( dist/build/Foreign/CUDA/Driver/IPC/Event.hs, dist/build/Foreign/CUDA/Driver/IPC/Event.o )
[25 of 37] Compiling Foreign.CUDA.Driver.Stream ( dist/build/Foreign/CUDA/Driver/Stream.hs, dist/build/Foreign/CUDA/Driver/Stream.o )
[26 of 37] Compiling Foreign.CUDA.Driver.Exec ( dist/build/Foreign/CUDA/Driver/Exec.hs, dist/build/Foreign/CUDA/Driver/Exec.o )

Foreign/CUDA/Driver/Exec.chs:373:1: warning: [-Wredundant-constraints]
    • Redundant constraint: Storable a
    • In the type signature for:
           cuParamSetv :: Storable a =>
                          Fun -> Int -> Ptr a -> Int -> IO Status
[27 of 37] Compiling Foreign.CUDA.Ptr ( Foreign/CUDA/Ptr.hs, dist/build/Foreign/CUDA/Ptr.o )
[28 of 37] Compiling Foreign.CUDA.Runtime.Marshal ( dist/build/Foreign/CUDA/Runtime/Marshal.hs, dist/build/Foreign/CUDA/Runtime/Marshal.o )
[29 of 37] Compiling Foreign.CUDA.Runtime ( Foreign/CUDA/Runtime.hs, dist/build/Foreign/CUDA/Runtime.o )
[30 of 37] Compiling Foreign.CUDA.Runtime.Texture ( dist/build/Foreign/CUDA/Runtime/Texture.hs, dist/build/Foreign/CUDA/Runtime/Texture.o )
[31 of 37] Compiling Foreign.CUDA.Driver.Marshal ( dist/build/Foreign/CUDA/Driver/Marshal.hs, dist/build/Foreign/CUDA/Driver/Marshal.o )
[32 of 37] Compiling Foreign.CUDA.Driver.IPC.Marshal ( dist/build/Foreign/CUDA/Driver/IPC/Marshal.hs, dist/build/Foreign/CUDA/Driver/IPC/Marshal.o )
[33 of 37] Compiling Foreign.CUDA.Driver.Texture ( dist/build/Foreign/CUDA/Driver/Texture.hs, dist/build/Foreign/CUDA/Driver/Texture.o )
[34 of 37] Compiling Foreign.CUDA.Driver.Module.Query ( dist/build/Foreign/CUDA/Driver/Module/Query.hs, dist/build/Foreign/CUDA/Driver/Module/Query.o )
[35 of 37] Compiling Foreign.CUDA.Driver.Module ( Foreign/CUDA/Driver/Module.hs, dist/build/Foreign/CUDA/Driver/Module.o )
[36 of 37] Compiling Foreign.CUDA.Driver ( Foreign/CUDA/Driver.hs, dist/build/Foreign/CUDA/Driver.o )
[37 of 37] Compiling Foreign.CUDA     ( Foreign/CUDA.hs, dist/build/Foreign/CUDA.o )
[ 1 of 37] Compiling Foreign.CUDA.Internal.C2HS ( Foreign/CUDA/Internal/C2HS.hs, dist/build/Foreign/CUDA/Internal/C2HS.p_o )

Foreign/CUDA/Internal/C2HS.hs:202:3: warning: [-Winline-rule-shadowing]
    Rule "cFloatConv/CFloat->Float" may never fire
      because ‘Foreign.C.Types.CFloat’ might inline first
    Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CFloat’

Foreign/CUDA/Internal/C2HS.hs:204:3: warning: [-Winline-rule-shadowing]
    Rule "cFloatConv/CDouble->Double" may never fire
      because ‘Foreign.C.Types.CDouble’ might inline first
    Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CDouble’
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)

<no location info>: error:
    `gcc' failed in phase `Linker'. (Exit code: 1)
[ 7 of 37] Compiling Foreign.CUDA.Analysis.Device ( dist/build/Foreign/CUDA/Analysis/Device.hs, dist/build/Foreign/CUDA/Analysis/Device.p_o )
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)

<no location info>: error:
    `gcc' failed in phase `Linker'. (Exit code: 1)
[17 of 37] Compiling Foreign.CUDA.Types ( dist/build/Foreign/CUDA/Types.hs, dist/build/Foreign/CUDA/Types.p_o )
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)

<no location info>: error:
    `gcc' failed in phase `Linker'. (Exit code: 1)

scsdjd:cuda> which ld
/usr/bin//ld

scsdjd:cuda> ld -v
@(#)PROGRAM:ld  PROJECT:ld64-264.3.102
configured to support archs: armv6 armv7 armv7s arm64 i386 x86_64 x86_64h armv6m armv7k armv7m armv7em (tvOS)
LTO support using: LLVM version 7.3.0

scsdjd:cuda> cat cuda.buildinfo.generated 
buildable: True
cc-options: -I/usr/local/cuda/include
ld-options: -L/usr/local/cuda/lib
frameworks: CUDA
extra-libraries:
    cudadevrt
    cudart_static
extra-ghci-libraries: cudart
extra-lib-dirs: /usr/local/cuda/lib
ghc-options: -optc-I/usr/local/cuda/include -optl-L/usr/local/cuda/lib -optl-Wl,-rpath,/usr/local/cuda/lib
x-extra-c2hs-options: --cppopts=-E --cppopts=-m64 --cppopts=-DUSE_EMPTY_CASE --cppopts=-U__BLOCKS__
scsdjd:cuda> 
scsdjd:cuda> 

Here is the most obviously relevant chunk of the output from cabal --verbose=3:

*** C Compiler:
/usr/bin//gcc -m64 -fno-stack-protector -DTABLES_NEXT_TO_CODE -I/usr/local/cuda/include -DPROFILING -x c /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_49.c -o /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_54.s -fno-common -U__PIC__ -D__PIC__ -Wimplicit -S -O2 '-D__GLASGOW_HASKELL__=800' -include /usr/local/lib/ghc-8.0.1/include/ghcversion.h -Idist/build/Foreign/CUDA/Analysis -Idist/build -Idist/build -Idist/build/autogen -Idist/build -I. -I/usr/local/lib/ghc-8.0.1/bytestring-0.10.8.1/include -I/opt/local/include/ -I/usr/local/lib/ghc-8.0.1/base-4.9.0.0/include -I/usr/local/lib/ghc-8.0.1/integer-gmp-1.0.0.1/include -I/usr/local/lib/ghc-8.0.1/include
*** Assembler:
/usr/bin//gcc -m64 -fno-stack-protector -DTABLES_NEXT_TO_CODE -Idist/build/Foreign/CUDA/Analysis -Idist/build -Idist/build -Idist/build/autogen -Idist/build -I. -fno-common -U__PIC__ -D__PIC__ -Qunused-arguments -x assembler -c /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_54.s -o /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_56.p_o
*** Assembler:
/usr/bin//gcc -m64 -fno-stack-protector -DTABLES_NEXT_TO_CODE -Idist/build/Foreign/CUDA/Analysis -Idist/build -Idist/build -Idist/build/autogen -Idist/build -I. -fno-common -U__PIC__ -D__PIC__ -Qunused-arguments -x assembler -c /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_48.s -o /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_57.p_o
*** Linker:
/usr/bin//gcc -m64 -fno-stack-protector -DTABLES_NEXT_TO_CODE -m64 -L/usr/local/cuda/lib -Wl,-rpath,/usr/local/cuda/lib -nostdlib -Wl,-r -o dist/build/Foreign/CUDA/Analysis/Device.p_o -Wl,-filelist -Wl,/var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_58.filelist
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)

<no location info>: error:
    `gcc' failed in phase `Linker'. (Exit code: 1)


@djduke


commented Oct 8, 2016

Following up on my previous mail: I wonder if the problem is related to this issue with Cabal:

haskell/cabal#2766

where dynamic linking was incorrectly turned on when executable profiling was selected. The issue was closed and the change was committed, but is it possibly masking a deeper inconsistency? I'm using Cabal-1.24.0.0, and suspect that, as you've been using ghc-8.0.1, you will be on the same version?

David.

[31 of 37] Compiling Foreign.CUDA.Driver.Marshal ( dist/build/Foreign/CUDA/Driver/Marshal.hs, dist/build/Foreign/CUDA/Driver/Marshal.o )
[32 of 37] Compiling Foreign.CUDA.Driver.IPC.Marshal ( dist/build/Foreign/CUDA/Driver/IPC/Marshal.hs, dist/build/Foreign/CUDA/Driver/IPC/Marshal.o )
[33 of 37] Compiling Foreign.CUDA.Driver.Texture ( dist/build/Foreign/CUDA/Driver/Texture.hs, dist/build/Foreign/CUDA/Driver/Texture.o )
[34 of 37] Compiling Foreign.CUDA.Driver.Module.Query ( dist/build/Foreign/CUDA/Driver/Module/Query.hs, dist/build/Foreign/CUDA/Driver/Module/Query.o )
[35 of 37] Compiling Foreign.CUDA.Driver.Module ( Foreign/CUDA/Driver/Module.hs, dist/build/Foreign/CUDA/Driver/Module.o )
[36 of 37] Compiling Foreign.CUDA.Driver ( Foreign/CUDA/Driver.hs, dist/build/Foreign/CUDA/Driver.o )
[37 of 37] Compiling Foreign.CUDA ( Foreign/CUDA.hs, dist/build/Foreign/CUDA.o )
[ 1 of 37] Compiling Foreign.CUDA.Internal.C2HS ( Foreign/CUDA/Internal/C2HS.hs, dist/build/Foreign/CUDA/Internal/C2HS.p_o )

Foreign/CUDA/Internal/C2HS.hs:202:3: warning: [-Winline-rule-shadowing]
Rule "cFloatConv/CFloat->Float" may never fire
because ‘Foreign.C.Types.CFloat’ might inline first
Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CFloat’

Foreign/CUDA/Internal/C2HS.hs:204:3: warning: [-Winline-rule-shadowing]
Rule "cFloatConv/CDouble->Double" may never fire
because ‘Foreign.C.Types.CDouble’ might inline first
Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CDouble’
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)

<no location info>: error:
`gcc' failed in phase `Linker'. (Exit code: 1)
[ 7 of 37] Compiling Foreign.CUDA.Analysis.Device ( dist/build/Foreign/CUDA/Analysis/Device.hs, dist/build/Foreign/CUDA/Analysis/Device.p_o )
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)

<no location info>: error:
`gcc' failed in phase `Linker'. (Exit code: 1)
[17 of 37] Compiling Foreign.CUDA.Types ( dist/build/Foreign/CUDA/Types.hs, dist/build/Foreign/CUDA/Types.p_o )
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)

<no location info>: error:
`gcc' failed in phase `Linker'. (Exit code: 1)

scsdjd:cuda> which ld
/usr/bin//ld

scsdjd:cuda> ld -v
@(#)PROGRAM:ld PROJECT:ld64-264.3.102
configured to support archs: armv6 armv7 armv7s arm64 i386 x86_64 x86_64h armv6m armv7k armv7m armv7em (tvOS)
LTO support using: LLVM version 7.3.0

scsdjd:cuda> cat cuda.buildinfo.generated
buildable: True
cc-options: -I/usr/local/cuda/include
ld-options: -L/usr/local/cuda/lib
frameworks: CUDA
extra-libraries:
cudadevrt
cudart_static
extra-ghci-libraries: cudart
extra-lib-dirs: /usr/local/cuda/lib
ghc-options: -optc-I/usr/local/cuda/include -optl-L/usr/local/cuda/lib -optl-Wl,-rpath,/usr/local/cuda/lib
x-extra-c2hs-options: --cppopts=-E --cppopts=-m64 --cppopts=-DUSE_EMPTY_CASE --cppopts=-U__BLOCKS__
scsdjd:cuda>
scsdjd:cuda>

Here is the most obviously relevant chunk of the output from cabal --verbose=3:

*** C Compiler:
/usr/bin//gcc -m64 -fno-stack-protector -DTABLES_NEXT_TO_CODE -I/usr/local/cuda/include -DPROFILING -x c /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_49.c -o /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_54.s -fno-common -U__PIC__ -D__PIC__ -Wimplicit -S -O2 '-D__GLASGOW_HASKELL__=800' -include /usr/local/lib/ghc-8.0.1/include/ghcversion.h -Idist/build/Foreign/CUDA/Analysis -Idist/build -Idist/build -Idist/build/autogen -Idist/build -I. -I/usr/local/lib/ghc-8.0.1/bytestring-0.10.8.1/include -I/opt/local/include/ -I/usr/local/lib/ghc-8.0.1/base-4.9.0.0/include -I/usr/local/lib/ghc-8.0.1/integer-gmp-1.0.0.1/include -I/usr/local/lib/ghc-8.0.1/include
*** Assembler:
/usr/bin//gcc -m64 -fno-stack-protector -DTABLES_NEXT_TO_CODE -Idist/build/Foreign/CUDA/Analysis -Idist/build -Idist/build -Idist/build/autogen -Idist/build -I. -fno-common -U__PIC__ -D__PIC__ -Qunused-arguments -x assembler -c /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_54.s -o /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_56.p_o
*** Assembler:
/usr/bin//gcc -m64 -fno-stack-protector -DTABLES_NEXT_TO_CODE -Idist/build/Foreign/CUDA/Analysis -Idist/build -Idist/build -Idist/build/autogen -Idist/build -I. -fno-common -U__PIC__ -D__PIC__ -Qunused-arguments -x assembler -c /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_48.s -o /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_57.p_o
*** Linker:
/usr/bin//gcc -m64 -fno-stack-protector -DTABLES_NEXT_TO_CODE -m64 -L/usr/local/cuda/lib -Wl,-rpath,/usr/local/cuda/lib -nostdlib -Wl,-r -o dist/build/Foreign/CUDA/Analysis/Device.p_o -Wl,-filelist -Wl,/var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_58.filelist
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)

<no location info>: error:
`gcc' failed in phase `Linker'. (Exit code: 1)

On 8 Oct 2016, at 00:33, Trevor L. McDonell notifications@github.com wrote:

@djduke working for me on OS X on both 7.10.3 and 8.0.1. I haven't upgraded to Sierra yet, I am still on El Capitan (10.11.6).

gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

clang --version
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

c2hs --version
C->Haskell Compiler, version 0.28.1 Switcheroo, 1 April 2016
build platform is "x86_64-darwin" <1, True, True, 1>

What is in your cuda.buildinfo[.generated] file? I have -rpath options in mine with no problem:

buildable: True
cc-options: -I/usr/local/cuda/include
ld-options: -L/usr/local/cuda/lib
frameworks: CUDA
extra-libraries:
cudadevrt
cudart_static
extra-ghci-libraries: cudart
extra-lib-dirs: /usr/local/cuda/lib
ghc-options: -optc-I/usr/local/cuda/include -optl-L/usr/local/cuda/lib -optl-Wl,-rpath,/usr/local/cuda/lib
x-extra-c2hs-options: --cppopts=-E --cppopts=-m64 --cppopts=-DUSE_EMPTY_CASE --cppopts=-U__BLOCKS__




David Duke T: +44 113 3436800
Professor of Computer Science E: D.J.Duke@leeds.ac.uk
Head, School of Computing W: www.comp.leeds.ac.uk/scsdjd/
PA: Gaynor Butterwick, E: G.Butterwick@leeds.ac.uk T: +44 113 3435434



@tmcdonell

Owner Author

commented Oct 9, 2016

@djduke migrating to #43

@tmcdonell

Owner Author

commented Oct 12, 2016

I just tried this with GHC HEAD and everything appears to work as expected in ghci.

The RTS now automatically avoids the region needed by CUDA, so there is no need to specify it through RTS flags or otherwise. (I'm not sure how large a region it avoids, though; if your total GPU + system RAM is very high, you may still need to specify the offset manually.)

$ ghci
GHCi, version 8.1.20161011: http://www.haskell.org/ghc/  :? for help

> import Foreign.CUDA.Driver
> initialise []
> props =<< device 0
DeviceProperties {deviceName = "GeForce GT 650M", computeCapability = 3.0, ...
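
The same check can also be run as a compiled program rather than from ghci. A minimal sketch, assuming the cuda package is installed and that Foreign.CUDA.Driver exports initialise, device and props exactly as used in the session above (the file name Check.hs is hypothetical):

```haskell
-- Check.hs (hypothetical file name): compiled equivalent of the ghci session above.
import Foreign.CUDA.Driver

main :: IO ()
main = do
  initialise []     -- wraps cuInit, the call that was failing earlier in this thread
  dev <- device 0   -- handle for the first CUDA device
  p   <- props dev  -- query that device's properties
  print p           -- DeviceProperties has a Show instance, as the ghci output shows
```

If initialisation still fails in a compiled binary, it should show up at the initialise call, before any device query is attempted.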
@mchakravarty

Contributor

commented Oct 12, 2016

That's good news — thanks for checking!

@tmcdonell

Owner Author

commented Mar 9, 2017

Since GHC-8.0.2 is out now, this is probably safe to close.

@tmcdonell tmcdonell closed this Mar 9, 2017
