libtorch-ffi errors when profiling with --profile #706

Open
kenhkan opened this issue May 16, 2024 · 9 comments
@kenhkan

kenhkan commented May 16, 2024

I have a program that runs as expected when running without profiling:

stack build .

However, when I run the same program with profiling turned on, I get a host of errors (at the end of this message) even though I have indicated --no-library-profiling to stack:

stack build --profile --no-library-profiling --executable-profiling --ghc-options "-fprof-auto" .

The only difference is the stack command run. I'm actually interested in profiling only my part of the codebase and don't need to profile any part of hasktorch, but the codebase is of course intertwined with references to Torch, so I can't really take out the dependency to test. Does anyone have any insight on profiling their codebases with hasktorch as a dependency?

  • lts-20.2
  • hasktorch commit 816334c
  • fpco/inline-c commit ef87fbae38ed9f646b912f94606d895d0582f1b4

The errors are:

libtorch-ffi> <command-line>: error: expected identifier before numeric constant
libtorch-ffi>
libtorch-ffi> /tmp/stack-a488220a1ae71b97/libtorch-ffi-2.0.0.0//home/kenhkan/present/lib/hasktorch/deps/libtorch/include/torch/csrc/jit/runtime/graph_executor.h:21:3: error:
libtorch-ffi>      note: in expansion of macro ‘PROFILING’
libtorch-ffi>        21 |   PROFILING,
libtorch-ffi>           |   ^~~~~~~~~
libtorch-ffi>    |
libtorch-ffi> 21 |   PROFILING,
libtorch-ffi>    |   ^
libtorch-ffi> <command-line>: error: expected ‘}’ before numeric constant
libtorch-ffi>
libtorch-ffi> /tmp/stack-a488220a1ae71b97/libtorch-ffi-2.0.0.0//home/kenhkan/present/lib/hasktorch/deps/libtorch/include/torch/csrc/jit/runtime/graph_executor.h:21:3: error:
libtorch-ffi>      note: in expansion of macro ‘PROFILING’
libtorch-ffi>        21 |   PROFILING,
libtorch-ffi>           |   ^~~~~~~~~
libtorch-ffi>    |
libtorch-ffi> 21 |   PROFILING,
libtorch-ffi>    |   ^
libtorch-ffi> In file included from /home/kenhkan/present/lib/hasktorch/deps/libtorch/include/torch/csrc/jit/api/function_impl.h:5,
libtorch-ffi>                  from /home/kenhkan/present/lib/hasktorch/deps/libtorch/include/torch/csrc/jit/api/method.h:7,
libtorch-ffi>                  from /home/kenhkan/present/lib/hasktorch/deps/libtorch/include/torch/csrc/jit/api/object.h:6,
libtorch-ffi>                  from /home/kenhkan/present/lib/hasktorch/deps/libtorch/include/torch/csrc/jit/api/module.h:4,
libtorch-ffi>                  from /home/kenhkan/present/lib/hasktorch/deps/libtorch/include/torch/csrc/api/include/torch/serialize/input-archive.h:6,
libtorch-ffi>                  from /home/kenhkan/present/lib/hasktorch/deps/libtorch/include/torch/csrc/api/include/torch/serialize/archive.h:3,
libtorch-ffi>                  from /home/kenhkan/present/lib/hasktorch/deps/libtorch/include/torch/csrc/api/include/torch/nn/pimpl.h:5,
libtorch-ffi>                  from /home/kenhkan/present/lib/hasktorch/deps/libtorch/include/torch/csrc/api/include/torch/optim/adagrad.h:3,
libtorch-ffi>                  from /home/kenhkan/present/lib/hasktorch/deps/libtorch/include/torch/csrc/api/include/torch/optim.h:3,
libtorch-ffi>
libtorch-ffi> /tmp/stack-a488220a1ae71b97/libtorch-ffi-2.0.0.0/                 from /tmp/ghc27419_0/ghc_433.cpp:8:0: error:

... <redacted> ...

libtorch-ffi> /tmp/stack-a488220a1ae71b97/libtorch-ffi-2.0.0.0//home/kenhkan/present/lib/hasktorch/deps/libtorch/include/torch/csrc/jit/runtime/interpreter.h:40:8: error:
libtorch-ffi>      note: forward declaration of ‘struct torch::jit::GraphExecutor’
libtorch-ffi>        40 | struct GraphExecutor;
libtorch-ffi>           |        ^~~~~~~~~~~~~
libtorch-ffi>    |
libtorch-ffi> 40 | struct GraphExecutor;
libtorch-ffi>    |        ^
libtorch-ffi> `gcc' failed in phase `C Compiler'. (Exit code: 1)
Progress 1/3

Error: [S-7282]
       Stack failed to execute the build plan.

       While executing the build plan, Stack encountered the error:

       [S-7011]
       While building package libtorch-ffi-2.0.0.0 (scroll up to its section to see the error) using:
       /home/kenhkan/.stack/setup-exe-cache/x86_64-linux/Cabal-simple_6HauvNHV_3.6.3.0_ghc-9.2.5 --verbose=1 --builddir=.stack-work/dist/x86_64-linux/ghc-9.2.5 build --ghc-options " -fdiagnostics-color=always"
       Process exited with code: ExitFailure 1
make: *** [Makefile:46: profile] Error 1

@chfin

chfin commented May 17, 2024

I get the same error just compiling with stack build --profile or stack build --trace on HEAD (16b7e3e).

@collinarnett

This also happened with libtorch-ffi 2.0 using nix for compilation.

@junjihashimoto
Member

junjihashimoto commented May 19, 2024

@kenhkan From the error message, I think it's an issue with the compiler version (C++14). Do you get the same error on the latest branch, which uses C++17? If you haven't tried it yet, could you?

@kenhkan
Author

kenhkan commented May 21, 2024

@junjihashimoto I use gcc 11.4 which should use C++17 by default.

~$ g++ --version
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I also tried setting the environment variable directly:

export CXXFLAGS="-std=c++17 $CXXFLAGS"

The same message resulted.

@kenhkan
Author

kenhkan commented May 21, 2024

I forgot to mention. I tried both my original hasktorch commit as well as the current master commit 16b7e3e.

@junjihashimoto
Member

@kenhkan Thank you for trying both cases!

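GHC defines the PROFILING macro when it compiles C sources with -prof, which collides with the PROFILING enumerator at graph_executor.h:21. In the test further down, only the `ghc -prof` build sees the macro; the plain ghc and gcc builds fail because PROFILING is undeclared.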

https://github.com/ghc/ghc/blob/00126a89da90ecef405f80644ee6f746f5a94fc3/hadrian/src/Settings/Packages.hs#L321

$ cat test.c
#include <stdio.h>

int
test(){
  printf("%d\n", PROFILING);
  return 0;
}
$ ghc test.c -prof -c
$ ghc test.c  -c
test.c: In function ‘test’:

test.c:5:18: error:
     error: ‘PROFILING’ undeclared (first use in this function)
       printf("%d\n", PROFILING);
                      ^~~~~~~~~
  |
5 |   printf("%d\n", PROFILING);
  |                  ^

test.c:5:18: error:
     note: each undeclared identifier is reported only once for each function it appears in
  |
5 |   printf("%d\n", PROFILING);
  |                  ^
`gcc' failed in phase `C Compiler'. (Exit code: 1)
$ gcc test.c -pg -c
test.c: In function ‘test’:
test.c:5:18: error: ‘PROFILING’ undeclared (first use in this function)
   printf("%d\n", PROFILING);
                  ^~~~~~~~~
test.c:5:18: note: each undeclared identifier is reported only once for each function it appears in
$ gcc test.c -c
test.c: In function ‘test’:
test.c:5:18: error: ‘PROFILING’ undeclared (first use in this function)
   printf("%d\n", PROFILING);
                  ^~~~~~~~~
test.c:5:18: note: each undeclared identifier is reported only once for each function it appears in

@junjihashimoto
Member

junjihashimoto commented May 23, 2024

No code in libtorch's headers uses PROFILING as a macro; the only bare PROFILING identifier is the enumerator at graph_executor.h:21. So it should work if you rename that PROFILING to something else, e.g. PROFILING__.

$ grep -rn  PROFILING deps/libtorch/include/
deps/libtorch/include//torch/csrc/distributed/rpc/message.h:80:  RUN_WITH_PROFILING_REQ = 0x15 | MessageTypeFlags::REQUEST_TYPE,
deps/libtorch/include//torch/csrc/distributed/rpc/message.h:81:  RUN_WITH_PROFILING_RESP = 0x16 | MessageTypeFlags::RESPONSE_TYPE,
deps/libtorch/include//torch/csrc/jit/runtime/graph_executor.h:21:  PROFILING,
deps/libtorch/include//xnnpack.h:49:#define XNN_FLAG_BASIC_PROFILING 0x00000008
deps/libtorch/include//ATen/core/dispatch/Dispatcher.h:633:#ifndef PYTORCH_DISABLE_PER_OP_PROFILING
deps/libtorch/include//ATen/core/dispatch/Dispatcher.h:638:#endif  // PYTORCH_DISABLE_PER_OP_PROFILING
deps/libtorch/include//ATen/core/dispatch/Dispatcher.h:672:#ifndef PYTORCH_DISABLE_PER_OP_PROFILING
deps/libtorch/include//ATen/core/dispatch/Dispatcher.h:690:#endif  // PYTORCH_DISABLE_PER_OP_PROFILING

@kenhkan
Author

kenhkan commented May 25, 2024

@junjihashimoto Ah! It didn't occur to me that it'd be something in libtorch itself that interferes with GHC.

It wouldn't work to replace individual symbols. I redacted the log in my original post for brevity; the full log is over 1000 lines of similar errors spanning 35 files. I've attached it here.

I assume that profiling does work for you. What is your setup? I don't know how to prevent those files from being included, so maybe I could try copying your setup.

@collinarnett

collinarnett commented May 26, 2024

Circling back to this issue, I'm getting a slightly different error when compiling with profiling support:

[ 42 of 105] Compiling Torch.Internal.Unmanaged.Native ( src/Torch/Internal/Unmanaged/Native.hs, dist/build/Torch/Internal/Unmanaged/Native.p_o )
[ 43 of 105] Compiling Torch.Internal.Managed.Native.Native9 ( src/Torch/Internal/Managed/Native/Native9.hs, dist/build/Torch/Internal/Managed/Native/Native9.p_o )
[ 44 of 105] Compiling Torch.Internal.Managed.Native ( src/Torch/Internal/Managed/Native.hs, dist/build/Torch/Internal/Managed/Native.p_o )
[ 45 of 105] Compiling Torch.Internal.Unmanaged.Optim ( src/Torch/Internal/Unmanaged/Optim.hs, dist/build/Torch/Internal/Unmanaged/Optim.p_o )
<command-line>: error: expected identifier before numeric constant
<command-line>: error: expected ‘}’ before numeric constant
In file included from /nix/store/znipp49fldq5waw9n45fmynh4wkpnrbn-libtorch-2.3.0-dev/include/torch/csrc/jit/api/function_impl.h:5,
                 from /nix/store/znipp49fldq5waw9n45fmynh4wkpnrbn-libtorch-2.3.0-dev/include/torch/csrc/jit/api/method.h:7,
                 from /nix/store/znipp49fldq5waw9n45fmynh4wkpnrbn-libtorch-2.3.0-dev/include/torch/csrc/jit/api/object.h:6,
                 from /nix/store/znipp49fldq5waw9n45fmynh4wkpnrbn-libtorch-2.3.0-dev/include/torch/csrc/jit/api/module.h:4,
                 from /nix/store/znipp49fldq5waw9n45fmynh4wkpnrbn-libtorch-2.3.0-dev/include/torch/csrc/api/include/torch/serialize/input-archive.h:6,
                 from /nix/store/znipp49fldq5waw9n45fmynh4wkpnrbn-libtorch-2.3.0-dev/include/torch/csrc/api/include/torch/serialize/archive.h:3,
                 from /nix/store/znipp49fldq5waw9n45fmynh4wkpnrbn-libtorch-2.3.0-dev/include/torch/csrc/api/include/torch/nn/pimpl.h:5,
                 from /nix/store/znipp49fldq5waw9n45fmynh4wkpnrbn-libtorch-2.3.0-dev/include/torch/csrc/api/include/torch/optim/adagrad.h:3,
                 from /nix/store/znipp49fldq5waw9n45fmynh4wkpnrbn-libtorch-2.3.0-dev/include/torch/csrc/api/include/torch/optim.h:3,

                 from /build/ghc1742_0/ghc_42.cpp:10:0: error: 

/nix/store/znipp49fldq5waw9n45fmynh4wkpnrbn-libtorch-2.3.0-dev/include/torch/csrc/jit/runtime/graph_executor.h:20:28: error:
     note: to match this ‘{’
       20 | enum ExecutorExecutionMode {
          |                            ^
   |
20 | enum ExecutorExecutionMode {
   |                            ^
<command-line>: error: expected unqualified-id before numeric constant

/nix/store/znipp49fldq5waw9n45fmynh4wkpnrbn-libtorch-2.3.0-dev/include/torch/csrc/jit/runtime/graph_executor.h:27:33: error:
     error: ‘Graph’ was not declared in this scope; did you mean ‘torch::jit::Graph’?
       27 |   ExecutionPlan(std::shared_ptr<Graph> graph, std::string function_name)
          |                                 ^~~~~
          |                                 torch::jit::Graph
   |
27 |   ExecutionPlan(std::shared_ptr<Graph> graph, std::string function_name)
   |                                 ^

In file included from /nix/store/znipp49fldq5waw9n45fmynh4wkpnrbn-libtorch-2.3.0-dev/include/torch/csrc/jit/api/function_impl.h:4:0: error:
    

/nix/store/znipp49fldq5waw9n45fmynh4wkpnrbn-libtorch-2.3.0-dev/include/torch/csrc/jit/ir/ir.h:1181:8: error:
     note: ‘torch::jit::Graph’ declared here
     1181 | struct Graph : std::enable_shared_from_this<Graph> {
          |        ^~~~~
     |
1181 | struct Graph : std::enable_shared_from_this<Graph> {
     |        ^

There are a couple of options I can change with Nix to fix this:

  • enableLibraryProfiling: Whether to enable profiling for libraries contained in the package. Enabled by default if supported.
  • enableExecutableProfiling: Whether to enable profiling for executables contained in the package. Disabled by default.
  • profilingDetail: Profiling detail level to set. Defaults to exported-functions.
