Built with LTO #238

Open
bgoodri opened this issue Nov 12, 2017 · 7 comments

Comments

@bgoodri
Contributor

bgoodri commented Nov 12, 2017

Summary:

Once the feature/2.17 branch is merged (or while it is being merged), we need to figure out how to build rstanarm with LTO whenever possible.

Description:

Adding -flto=8 to the ~/.R/Makevars reduces the compile time to about 30 seconds per Stan program (in parallel) and about 90 seconds to link.
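As a rough illustration of the setting described above, a `~/.R/Makevars` fragment might look like the following. This is a sketch under assumptions, not the configuration actually used: appending the flag to `CXX14` mirrors the approach shown in a later comment in this thread, and the `gcc-ar`, `gcc-nm`, and `gcc-ranlib` wrappers are assumed to be on the PATH.

```make
# ~/.R/Makevars (illustrative sketch, not the author's actual file)

# -flto=8 asks GCC to run up to 8 parallel jobs during link-time optimization.
# Appending it to the compiler variable means the flag is seen both when
# compiling and when linking, assuming R links with the same driver.
CXX14 += -flto=8

# GCC's wrapper tools load the LTO plugin so that static archives
# containing LTO objects can be handled (assumed to be installed).
AR = gcc-ar
NM = gcc-nm
RANLIB = gcc-ranlib
```

The `8` is the number of parallel LTO jobs, so it would normally be matched to the number of available cores.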

R Version:

3.4.x

Operating System:

Debian

@bgoodri
Contributor Author

bgoodri commented Nov 12, 2017

@aadler Is it possible to build an R package like rstanarm on Windows with LTO? I am not expecting it to execute any faster but hoping the build process takes fewer resources on win-builder.

@aadler

aadler commented Nov 13, 2017

@bgoodri Hi. LTO as implemented in GCC 4.9.3, which underlies the current version of Rtools, is either incomplete or incorrect. When I tried to build even base R with LTO, it interfered with various packages like dplyr, so I had to install some from source and some from binary. Even when installing from source, I had to change the Makevars and install some packages with LTO and some without, which then requires -ffat-lto-objects, which increases file size.
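To make the trade-off concrete, here is a minimal sketch of what enabling fat LTO objects could look like in a Makevars file. The choice of variables is an assumption for illustration; the thread does not show the file that was actually used.

```make
# Illustrative Makevars sketch (assumed variable choices, not from the thread)

# -ffat-lto-objects makes GCC emit both the LTO intermediate representation
# and regular object code in each object file. Such objects can be linked
# with or without LTO, at the cost of larger files.
CFLAGS += -flto -ffat-lto-objects
CXXFLAGS += -flto -ffat-lto-objects
```

Without fat objects, an LTO-only object file is usable only by a link step that also runs LTO, which is why mixing LTO and non-LTO builds calls for this flag.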

The one time I tried to build R with a custom-built version of GCC 7.1.0, if I recall correctly, I used LTO and it worked better, but I don't remember clearly.

Therefore, after months of trying, I stopped recommending LTO for building on Windows, and I will continue not to recommend it until Rtools is moved to GCC 7 at least. For that, @jeroen (aka @ropensci) is the new keeper of Rtools. All bribes should be sent his way ;)

@jeroen

jeroen commented Jun 2, 2018

I am considering whether we should enable LTO for the new toolchain. Can you give me an example that shows a case where this would be useful to support?

@bgoodri
Contributor Author

bgoodri commented Jun 2, 2018

@jeroen It would be great if LTO worked with the Windows C++ toolchain.

Currently, installing rstanarm on CRAN under r-release-windows-ix86+x86_64 takes 2135 seconds and the shared object consumes 17.3Mb of disk space.

On Linux with g++-7, installation time with LTO is 75% of the time without LTO and disk space consumed with LTO is also 75% of the disk space without LTO.

Additionally, although it is not a problem for CRAN, installing rstanarm in parallel with 8 cores on Linux consumes almost 16 GB of RAM. With LTO, I can get the peak RAM spike down to 13.5 GB.

Someone else has found ( http://discourse.mc-stan.org/t/thinlto-standard-benchmarks/3673?u=bgoodri ) that the execution time for a bunch of Stan models is around 5% less when compiled with LTO (under clang).
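For readers who want to try the clang variant mentioned above, a hedged sketch of a `~/.R/Makevars` fragment follows. It assumes `clang++` is installed and that the system linker supports LLVM LTO; neither detail comes from the linked benchmark post.

```make
# ~/.R/Makevars (illustrative sketch; assumes clang++ and an LTO-capable linker)

CXX14 = clang++

# -flto=thin selects clang's ThinLTO mode; the flag is needed when
# compiling and again when linking.
CXX14FLAGS += -flto=thin
LDFLAGS += -flto=thin
```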

@aadler

aadler commented Jun 3, 2018

@jeroen I'm not sure what you mean by enable. The toolchain, at least as of 3.4, has LTO enabled, but its implementation in GCC 4.9.3 was incomplete and often not worth it, at least in my many experiments. Once the toolchain for Windows is based on GCC 7+, it should certainly continue to be built with LTO enabled.

@bgoodri
Contributor Author

bgoodri commented Oct 24, 2018

@jeroen @aadler Is there a known trick to compiling a DLL using LTO with R-testing for Windows? In my ~/.R/Makevars, I have

CC = C:\rtools40\mingw64\bin\gcc -m$(WIN)
CXX = C:\rtools40\mingw64\bin\g++ -m$(WIN)
CXX11 = C:\rtools40\mingw64\bin\g++ -m$(WIN)
CXX14 = C:\rtools40\mingw64\bin\g++ -m$(WIN)
CXX14 += -flto=jobserver
LOCAL_CPPFLAGS = -Og -Wno-unused-variable -Wno-unused-function -Wno-unused-local-typedefs
LOCAL_CPPFLAGS += -Wno-ignored-attributes -Wno-deprecated-declarations -Wno-attributes -march=native -mtune=native
AR = C:\rtools40\mingw64\bin\gcc-ar
NM = C:\rtools40\mingw64\bin\gcc-nm
RANLIB = C:\rtools40\mingw64\bin\gcc-ranlib
endif

But when I try to compile a Stan program (e.g. with example(stan_model, package = "rstan")) using CXX14, I get several messages of the form

make[1]: [C:\Users\Stan\AppData\Local\Temp\ccYamjfZ.mk:15: C:\Users\Stan\AppData\Local\Temp\ccnmQXX2.ltrans4.ltrans.o] Error 1 (ignored)

although the DLL does compile and load. However, when I try to execute it, R crashes with a corrupted backtrace (under gdb and removing line breaks from the output)

(gdb) run
Starting program: C:\PROGRA~1\R\R-TEST~1\bin\x64\Rterm.exe
[New Thread 3832.0x1f58]
warning: Invalid parameter passed to C runtime function.
[New Thread 3832.0x1a38]
[New Thread 3832.0x1b80]
[New Thread 3832.0x7a0]
R version 3.6.0 Under development (Testing Rtools) (2018-08-14 r75146) -- "Blame Jeroen"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
library(rstan)
Loading required package: ggplot2
Registered S3 methods overwritten by 'ggplot2':
method from
[.quosures rlang
c.quosures rlang
print.quosures rlang
Registered S3 method overwritten by 'dplyr':
method from
as.data.frame.tbl_df tibble
Loading required package: StanHeaders
rstan (Version 2.18.1, GitRev: 2e1f913d3ca3)
For execution on a local, multicore CPU with excess RAM we recommend calling
options(mc.cores = parallel::detectCores()).
To avoid recompilation of unchanged Stan programs, we recommend calling
rstan_options(auto_write = TRUE)
stancode <- 'data {real y_mean;} parameters {real y;} model {y ~ normal(y_me$
mod <- stan_model(model_code = stancode)
[New Thread 3832.0xa5c]
[Thread 3832.0x1b80 exited with code 0]
[Thread 3832.0x1a38 exited with code 0]
[Thread 3832.0xa5c exited with code 0]
make[1]: [C:\Users\Stan\AppData\Local\Temp\ccwsyYkm.mk:15: C:\Users\Stan\AppData\Local\Temp\ccNiw63a.ltrans4.ltrans.o] Error 1 (ignored)
make[1]: [C:\Users\Stan\AppData\Local\Temp\ccwsyYkm.mk:3: C:\Users\Stan\AppData\Local\Temp\ccNiw63a.ltrans0.ltrans.o] Error 1 (ignored)
make[1]: [C:\Users\Stan\AppData\Local\Temp\ccwsyYkm.mk:24: C:\Users\Stan\AppData\Local\Temp\ccNiw63a.ltrans7.ltrans.o] Error 1 (ignored)
make[1]: [C:\Users\Stan\AppData\Local\Temp\ccwsyYkm.mk:6: C:\Users\Stan\AppData\Local\Temp\ccNiw63a.ltrans1.ltrans.o] Error 1 (ignored)
make[1]: [C:\Users\Stan\AppData\Local\Temp\ccwsyYkm.mk:21: C:\Users\Stan\AppData\Local\Temp\ccNiw63a.ltrans6.ltrans.o] Error 1 (ignored)
make[1]: [C:\Users\Stan\AppData\Local\Temp\ccwsyYkm.mk:27: C:\Users\Stan\AppData\Local\Temp\ccNiw63a.ltrans8.ltrans.o] Error 1 (ignored)
make[1]: [C:\Users\Stan\AppData\Local\Temp\ccwsyYkm.mk:18: C:\Users\Stan\AppData\Local\Temp\ccNiw63a.ltrans5.ltrans.o] Error 1 (ignored)
make[1]: [C:\Users\Stan\AppData\Local\Temp\ccwsyYkm.mk:12: C:\Users\Stan\AppData\Local\Temp\ccNiw63a.ltrans3.ltrans.o] Error 1 (ignored)
make[1]: [C:\Users\Stan\AppData\Local\Temp\ccwsyYkm.mk:30: C:\Users\Stan\AppData\Local\Temp\ccNiw63a.ltrans9.ltrans.o] Error 1 (ignored)
make[1]: [C:\Users\Stan\AppData\Local\Temp\ccwsyYkm.mk:9: C:\Users\Stan\AppData\Local\Temp\ccNiw63a.ltrans2.ltrans.o] Error 1 (ignored)
make[1]: [C:\Users\Stan\AppData\Local\Temp\ccwsyYkm.mk:33: C:\Users\Stan\AppData\Local\Temp\ccNiw63a.ltrans10.ltrans.o] Error 1 (ignored)
[New Thread 3832.0x1cfc]
[Thread 3832.0x1cfc exited with code 0]
post <- sampling(mod, data = list(y_mean = 0), chains = 1, cores = 1)
SAMPLING FOR MODEL '73fc79f8b1915e8208c736914c86d1a1' NOW (CHAIN 1).
Program received signal SIGSEGV, Segmentation fault.
0xffffffff9b3c0000 in ?? ()
(gdb) bt
#0 0xffffffff9b3c0000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)

Do either of you have any idea what I am doing wrong?

@aadler

aadler commented Oct 3, 2019

No, I gave up a while ago. I should try again one of these days with @jeroen's latest and greatest Rtools4. Sorry.
