Visual Studio reports higher CPU usage than MinGW #542

Closed
Laurae2 opened this Issue May 22, 2017 · 17 comments

@Laurae2
Collaborator

Laurae2 commented May 22, 2017

(Requires investigation)

This issue tracks a problem reported in the following issues:

  • #446: Installation of CMake&MinGW is slow than Visual Studio

I tried the two installation methods on my computer (6 cores, 3.6 GHz), training and predicting with the same dataset (one million samples and 150 features). The two methods show different CPU occupancy. Details follow:

CMake+MinGW:
training took 560 seconds (CPU occupancy 70%)
testing took 44 seconds (CPU occupancy 50%)

Visual Studio:
training took 547 seconds (CPU occupancy 100%)
testing took 23 seconds (CPU occupancy 100%)

  • #512: lightgbm not using all the CPU

[Image: Visual Studio]

[Image: MinGW]

  • (probably missed more issues)

We have to find out why there is such a discrepancy between Visual Studio and MinGW.

Even worse, the MinGW build uses far less CPU during training, yet finishes only slightly slower than the Visual Studio build, which uses 100% CPU.

MinGW vs Visual Studio is a known issue with OpenMP especially for xgboost in Windows. One can look there: dmlc/xgboost#2243

Reported in issue dmlc/xgboost#2165. Dynamic scheduling of OpenMP loops involves
implicit synchronization. To implement synchronization, libgomp uses futex
(fast userspace mutex), whereas MinGW uses kernel-space mutex, which is more
costly. With chunk size of 1, synchronization overhead may become prohibitive
on Windows machines.

Solution: use 'guided' schedule to minimize the number of syncs
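
To make the quoted fix concrete, here is a minimal sketch of a loop using a guided schedule. The function names and the uneven per-row workload are made up for illustration; this is not code from xgboost or LightGBM. It only shows where the `schedule(guided)` clause goes and checks the result against a serial loop.

```cpp
#include <vector>

// Illustrative per-row work whose cost varies with c (hypothetical workload).
static long long row_work(int c) {
    long long acc = 0;
    for (int k = 0; k < c; ++k) acc += k % 7;
    return acc;
}

// Serial reference implementation.
long long sum_rows_serial(const std::vector<int>& cost) {
    long long total = 0;
    for (int i = 0; i < static_cast<int>(cost.size()); ++i) total += row_work(cost[i]);
    return total;
}

// Same computation with an OpenMP guided schedule. Chunks start large and
// shrink, so threads hand off work far less often than with
// schedule(dynamic, 1), which synchronizes once per iteration.
// If OpenMP is not enabled at compile time, the pragma is ignored and the
// loop simply runs serially.
long long sum_rows_guided(const std::vector<int>& cost) {
    long long total = 0;
    const int n = static_cast<int>(cost.size());
    #pragma omp parallel for schedule(guided) reduction(+:total)
    for (int i = 0; i < n; ++i) {
        total += row_work(cost[i]);
    }
    return total;
}
```

Build with `-fopenmp` (GCC/MinGW) or `/openmp` (Visual Studio) to actually run the loop in parallel. Whether guided beats static for a given loop depends on how uneven the iterations are, so it is worth timing both.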

Possible theory: perhaps threads are locked in Visual Studio (so they are hot), while MinGW is freeing them when possible with OpenMP (cold cores). This would lead to MinGW compiled LightGBM having less CPU usage due to higher synchronization/overhead costs (explaining why it is slower) while Visual Studio compiled LightGBM would keep the cores "ready" for training (but both behaviors are unexplained so far).

@guolinke

Member

guolinke commented May 26, 2017

@Laurae2 Actually, I am thinking, can we just use the compiled DLL/lib for the R-package (like python-package)?

@Laurae2

Collaborator

Laurae2 commented May 26, 2017

@guolinke I think some people did it: https://erpcoder.wordpress.com/2016/06/15/how-to-develop-a-c-dll-for-r-in-visual-studio-2015/

But not sure how feasible it is in reality.

@Laurae2

Collaborator

Laurae2 commented May 26, 2017

@guolinke I am getting this error when using an externally compiled Visual Studio or MinGW DLL out of the box:

> train$construct()
Error in .Call(fun_name, ..., ret, call_state, PACKAGE = "lightgbm") : 
  "LGBM_DatasetCreateFromFile_R" not available for .Call() for package "lightgbm"
@guolinke

Member

guolinke commented May 27, 2017

Some simple tests:

  1. Higgs : >lightgbm.exe data=higgs.train num_leaves=255 num_threads=16

Mingw version:

[LightGBM] [Info] Trained a tree with leaves=255 and max_depth=21
[LightGBM] [Info] 131.781472 seconds elapsed, finished iteration 100
[LightGBM] [Info] Finished training

vs2013 version:

[LightGBM] [Info] Trained a tree with leaves=255 and max_depth=21
[LightGBM] [Info] 115.655364 seconds elapsed, finished iteration 100
[LightGBM] [Info] Finished training

  2. Yahoo : >lightgbm.exe data=yahoo.train num_leaves=255 num_threads=16

Mingw version:

[LightGBM] [Info] Trained a tree with leaves=255 and max_depth=19
[LightGBM] [Info] 68.384020 seconds elapsed, finished iteration 100
[LightGBM] [Info] Finished training

vs2013 version:

[LightGBM] [Info] Trained a tree with leaves=255 and max_depth=19
[LightGBM] [Info] 41.499685 seconds elapsed, finished iteration 100
[LightGBM] [Info] Finished training

  3. Bosch : >lightgbm.exe data=bosch.train num_leaves=255 num_threads=16

Mingw version:

[LightGBM] [Info] Trained a tree with leaves=255 and max_depth=55
[LightGBM] [Info] 64.543007 seconds elapsed, finished iteration 100
[LightGBM] [Info] Finished training

vs2013 version:

[LightGBM] [Info] Trained a tree with leaves=255 and max_depth=36
[LightGBM] [Info] 43.374675 seconds elapsed, finished iteration 100
[LightGBM] [Info] Finished training

I think the multi-threading performance of MinGW is actually poor, especially for sparse datasets...
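
To put numbers on the three runs above, here is a small sketch that computes how much longer the MinGW build took than the VS2013 build, using only the elapsed times from the logs. The struct and function names are illustrative. The ratios come out at roughly 1.14x for Higgs, 1.65x for Yahoo and 1.49x for Bosch.

```cpp
#include <cstdio>

// Elapsed seconds at iteration 100, copied from the logs above.
struct Run {
    const char* name;
    double mingw_s;
    double vs_s;
};

static const Run kRuns[] = {
    {"Higgs", 131.781472, 115.655364},
    {"Yahoo", 68.384020, 41.499685},
    {"Bosch", 64.543007, 43.374675},
};

// Ratio > 1 means the MinGW build took longer than the VS2013 build.
double slowdown(const Run& r) {
    return r.mingw_s / r.vs_s;
}

// Prints one line per dataset with the MinGW/VS time ratio.
void print_slowdowns() {
    for (int i = 0; i < 3; ++i) {
        std::printf("%-6s MinGW/VS = %.2fx\n", kRuns[i].name, slowdown(kRuns[i]));
    }
}
```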

@guolinke

Member

guolinke commented May 27, 2017

@Laurae2 I tried many ways to make the R package use a precompiled DLL, but haven't succeeded yet.

@Laurae2

Collaborator

Laurae2 commented May 27, 2017

@guolinke We can use .exe wrapper like I did on my package (https://github.com/Laurae2/Laurae) but this is not the best solution (we lose the ability to use callbacks, etc.).

R can dynamically load binaries (like it loads dlls/so/dylib), there must be a way to hack a way through it. VS and MinGW's externally compiled dll are loading fine in R: https://stat.ethz.ch/R-manual/R-devel/library/base/html/dynload.html - but how to use them becomes wizardry (like using dlls in VBA).

Or, we need someone who has compiled R with Visual Studio to provide us with the R-compiled DLL. Compiling R with Visual Studio is a very difficult task, if not impossible... (it is difficult enough just to get MinGW to compile R).

@Laurae2

Collaborator

Laurae2 commented May 27, 2017

In the past MinGW and VS for LightGBM were very close, now they are very different...

Did you test by changing OpenMP static scheduling to guided/dynamic scheduling? xgboost got a performance boost by using guided/dynamic scheduling.

@guolinke

Member

guolinke commented May 27, 2017

@Laurae2
Actually, our R DLL can be compiled with Visual Studio, since we don't include any third-party header files.
The only problem is how to put this DLL into the R package.
I tried putting it in R-package/libs/x64 and building the package with --binary.
The DLL file is included in the package file. However, installation fails, saying it cannot find the DLL...

As for static/dynamic/guided: most loops use static scheduling, and a few use guided. I didn't use dynamic because of its low performance.
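
One way to compare schedules without recompiling is OpenMP's `schedule(runtime)`, which picks the schedule from the standard `OMP_SCHEDULE` environment variable. The sketch below is a generic loop written for illustration, not a LightGBM loop, and the function name is made up.

```cpp
#include <vector>

// Sums x[i] * scale with the schedule chosen at run time.
// With OpenMP enabled, schedule(runtime) reads OMP_SCHEDULE, for example
// "static", "dynamic,1" or "guided". If OpenMP is not enabled at compile
// time, the pragma is ignored and the loop runs serially.
double scaled_sum(const std::vector<double>& x, double scale) {
    double total = 0.0;
    const int n = static_cast<int>(x.size());
    #pragma omp parallel for schedule(runtime) reduction(+:total)
    for (int i = 0; i < n; ++i) {
        total += x[i] * scale;
    }
    return total;
}
```

On Windows, running `set OMP_SCHEDULE=guided` (or `static`, or `dynamic,1`) before launching the same binary then lets the MinGW and Visual Studio builds be timed under each schedule.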

@Laurae2

Collaborator

Laurae2 commented Jun 1, 2017

@guolinke I'll retry with my 20 core server since it is available today. I will update here when I get new results: Laurae2/gbt_benchmarks#1

@Laurae2

Collaborator

Laurae2 commented Jun 2, 2017

@guolinke Do you have any idea about these results? VS does very well under heavy multithreading, while MinGW is better at lower thread counts.

i7-4600U, 4 threads:

Algorithm         Time (s)
v2.0 CLI (O3)     645.962
v2.0 R (O2)       673.568
master CLI (O3)   588.445
master R (O2)     607.209
Visual Studio     615.863

Dual Xeon Ivy Bridge, 2.7GHz, 2x 10 cores:

Algorithm         Time (s)   Threads
v2.0 R (O2)       244.542    40
master R (O2)     174.157    40
master CLI (O3)   164.336    40
master CLI (O3)   225.045    20
Visual Studio     139.214    40
Visual Studio     162.792    20

For the R package to use a DLL compiled with Visual Studio, I think it has to load the DLL dynamically and call its functions directly... I think this is not currently possible unless you create the datasets in memory and do everything in memory (like running the CLI in memory).

Do you think this is possible? I think this might be the "easiest" (but not best) alternative:

  • LightGBM binary datasets must be created beforehand in memory
  • LightGBM runs like CLI but using DLL
  • (a new) R-package wraps the "CLI"

This would also fix the issue we currently have with datasets being stuck in memory, or memory not being freed up without removing the model(s).

But that (new) R-package would lose:

  • Custom metric
  • Custom objective

Keeping them would require calling back into R every time a metric/objective is evaluated during training, on every iteration.

@Laurae2

Collaborator

Laurae2 commented Jun 3, 2017

@guolinke I found this, which says a Visual Studio DLL cannot be used with C++ code:

Cross-compiler linking works for C, it does not for C++ due to unstandardized name mangling.

http://r.789695.n4.nabble.com/RStudio-Calling-C-Visual-Studio-DLL-td4703642.html

It requires a C wrapper apparently.
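
A minimal sketch of such a C wrapper, assuming a made-up `demo::Counter` class and `Demo_*` function names (none of these are LightGBM symbols): the C++ class stays internal, and only `extern "C"` functions working on an opaque handle are exposed, so the exported names are not C++-mangled.

```cpp
// Hypothetical C++ implementation detail. Its member functions get
// compiler-specific mangled names, which differ between MSVC and GCC/MinGW.
namespace demo {
class Counter {
 public:
    explicit Counter(int start) : value_(start) {}
    int Add(int delta) { value_ += delta; return value_; }
 private:
    int value_;
};
}  // namespace demo

// C wrapper: extern "C" disables C++ name mangling for these functions.
// On Windows the declarations would also need an export attribute such as
// __declspec(dllexport), omitted here to keep the sketch portable.
extern "C" {

typedef void* DemoCounterHandle;

DemoCounterHandle Demo_CounterCreate(int start) {
    return new demo::Counter(start);
}

int Demo_CounterAdd(DemoCounterHandle handle, int delta) {
    return static_cast<demo::Counter*>(handle)->Add(delta);
}

void Demo_CounterFree(DemoCounterHandle handle) {
    delete static_cast<demo::Counter*>(handle);
}

}  // extern "C"
```

The caller only ever sees plain C types and an opaque handle, so no C++ types cross the compiler boundary.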

@guolinke

Member

guolinke commented Jun 3, 2017

@Laurae2 I think all exposed APIs are C APIs, so it can be used.

For the DLL in the R package, I think the VS version can be used normally. The only problem is how to put it into the R package correctly, without the "cannot find lightgbm.dll" error.

Another possible solution is solving the multi-threading problem in MinGW.

@Laurae2

Collaborator

Laurae2 commented Jun 3, 2017

@guolinke I think you have to put the dll in R-package/inst/<my_dll.dll> otherwise it will get removed on install when not in inst folder.

Using https://stat.ethz.ch/R-manual/R-devel/library/base/html/library.dynam.html on library load can load the DLL from package.

@Laurae2

Collaborator

Laurae2 commented Jun 3, 2017

I think the tip from here might help: http://lists.r-forge.r-project.org/pipermail/rcpp-devel/2011-April/002207.html

| .... now what would be the line I would have to add to Makevars.win, or how do I have to modify the PKG_LIBS Line in order to incorporate "Plastic.lib"?

You probably want to look at a few packages using external libraries; there
are a few that use the GSL. One of them is mvabund which also uses Rcpp. It
has

## This assumes that the LIB_GSL variable points to working GSL libraries
## It also assumes that we can call Rscript to ask Rcpp about its locations
PKG_CPPFLAGS=-std=c++0x -I$(LIB_GSL)/include
PKG_LIBS=-L$(LIB_GSL)/lib -lgsl -lgslcblas $(shell $(R_HOME)/bin/Rscript.exe -e "Rcpp:::LdFlags()")

but understand that the environment variable use here is 'just for CRAN' and
you do not have to worry about it.

You do need to understand what these two lines do though, which is why I
recommended that you start with something simple.

:-) Success!!
I studied, found out what most of that means, and -- voila -- in the end the (very simple!! :-)) ) working version of Makevars.win is like that:

PKG_LIBS = Plastic.lib $(shell "${R_HOME}/bin${R_ARCH_BIN}/Rscript.exe" -e "Rcpp:::LdFlags()")

with Plastic.lib, which resides directly in src, being the library I wanted to link against.

It works, I can invoke the first simple functions from within R, I can enumerate the connected devices, clear the list and give back build-numbers and the like.

The lib file was apparently linked without anything extra required:

Perfect!!  No transformation needed for Plastic.lib, ie no export table
business and all that as discussed in the MinGW FAQ?
@guolinke

Member

guolinke commented Jun 3, 2017

@Laurae2
Thanks for the info.

I just try a small experiment.

I built the R DLL from the r-withvs branch and used it to replace the lightgbm.dll in C:\Users\xxx\Documents\R\win-library\3.4\lightgbm\libs\x64 .
Then I ran some of the R demos, and they ran smoothly. This proves that a DLL built by VS can be used.

The only problem is how to put our precompiled DLL into the R package when building it.

@guolinke

Member

guolinke commented Jun 3, 2017

Finally, I found a solution in https://cran.r-project.org/doc/manuals/r-release/R-exts.html

In very special cases packages may create binary files other than the shared objects/DLLs in the src directory. Such files will not be installed in a multi-architecture setting since R CMD INSTALL --libs-only is used to merge multiple sub-architectures and it only copies shared objects/DLLs. If a package wants to install other binaries (for example executable programs), it should provide an R script src/install.libs.R which will be run as part of the installation in the src build directory instead of copying the shared objects/DLLs. The script is run in a separate R environment containing the following variables: R_PACKAGE_NAME (the name of the package), R_PACKAGE_SOURCE (the path to the source directory of the package), R_PACKAGE_DIR (the path of the target installation directory of the package), R_ARCH (the arch-dependent part of the path, often empty), SHLIB_EXT (the extension of shared objects) and WINDOWS (TRUE on Windows, FALSE elsewhere). Something close to the default behavior could be replicated with the following src/install.libs.R file:

files <- Sys.glob(paste0("*", SHLIB_EXT))
dest <- file.path(R_PACKAGE_DIR, paste0('libs', R_ARCH))
dir.create(dest, recursive = TRUE, showWarnings = FALSE)
file.copy(files, dest, overwrite = TRUE)
if(file.exists("symbols.rds"))
    file.copy("symbols.rds", dest, overwrite = TRUE)

With this, I built the package successfully with the precompiled DLL.

@guolinke

Member

guolinke commented Jun 6, 2017

Closing, since we can now build the R package with Visual Studio.
