Best practices for distributing TMB and R code #43

Open
James-Thorson opened this Issue Sep 4, 2014 · 19 comments

7 participants

James-Thorson commented Sep 4, 2014

Hi all,

I'm starting to plan projects where I'd like to distribute models that use TMB code as stand-alone, encapsulated functions in R (e.g., index standardization, growth analysis).

I imagine that CRAN will not accept packages that use TMB (because TMB isn't on CRAN), but I could imagine using R-Forge for cross-platform (at least Windows and Linux) package builds:

  1. Is there any better option I'm missing? (I'd prefer not to just source functions into the R environment)
  2. Is there any tutorial on how to bundle a package to build directly out of GitHub (as TMB itself does)?

Jim

markpayneatwork commented Sep 16, 2014

Hi Jim,

Another option is to incorporate compiled versions of the TMB library into the package. This can be done for infrequently-updated packages, but rapidly gets annoying, because you need one version for each platform (!)...

Mark

Contributor

bbolker commented Sep 18, 2014

I'm a little late to this discussion, but I wanted to point out that

library("devtools")
install_github("adcomp",username="kaskr",subdir="TMB")

should work fine, and easily, for any users who have compilation tools installed (which they'll have to have in order to use the package anyway, right??)

Downstream packages that are kept on GitHub can be installed just as easily -- in fact more easily, because if they don't have compiled components then install_github should work even without compilation tools available. (There's a chance of running into problems with compiling vignettes etc.; using build_vignettes=FALSE or quick=TRUE (see ?install in the devtools package) usually solves these problems.)

You don't need to do anything special at all to allow installation of a package from GitHub: in fact, if you have the package contents (DESCRIPTION, NAMESPACE, R/, man/, etc.) at the top level of your GitHub repository, then you don't even need the subdir argument specified in the call above. I maintain the lme4, betararef, ADmarsaglia, and R2admb packages on GitHub (some of these are on CRAN, some aren't).
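
For reference, newer releases of devtools (and the remotes package it builds on) take the repository as a single "user/repo/subdir" string, and the separate username argument used in the call above may no longer be accepted. A minimal sketch, assuming the remotes package is installed; github_spec() and install_tmb_from_github() are hypothetical helper names used only for illustration:

```r
# Build a "user/repo[/subdir]" specification for install_github().
github_spec <- function(user, repo, subdir = NULL) {
  paste(c(user, repo, subdir), collapse = "/")
}

# Hypothetical wrapper: installs TMB from the TMB subdirectory of kaskr/adcomp.
# It needs network access and a working compiler toolchain, so it is only
# defined here, not called.
install_tmb_from_github <- function() {
  remotes::install_github(github_spec("kaskr", "adcomp", "TMB"),
                          build_vignettes = FALSE)
}

github_spec("kaskr", "adcomp", "TMB")  # "kaskr/adcomp/TMB"
```

For a repository with the package at its top level, github_spec("user", "repo") is enough.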

Contributor

mebrooks commented Oct 6, 2014

Ben's suggestion is an elegant installation method. Unfortunately, it won't work on Windows computers because Windows installation is done by source("install_windows.R"). I tried it out with Roman Lustrik in Ljubljana this past week for fun and helped him fit some nonlinear models. Ben, do you know if there's a way for install_github() to do source("install_windows.R") when it detects a Windows operating system?

Contributor

bbolker commented Oct 6, 2014

I don't know, but it might be possible to look at the source code of install_github() for inspiration.

Contributor

mebrooks commented Nov 30, 2014

Let's assume I'm ignoring Windows computers for now. Also assume I expect users to install TMB and compile C++ code on their own computers. In which part of the package should I put files like model.cpp? I want install_github() to run TMB::compile("model.cpp") upon installation to create model.so, and I want library(myTMBpackage) to run dyn.load("model.so") so that the model can be used. Is there a good example I can look at, or can anyone point me to where in the package these commands should go?

Contributor

bbolker commented Nov 30, 2014

There are two ways to do this; I'm not sure which one will work for you.

  • Code that goes into the src directory will get compiled automatically on installation. You'll have to provide a Makefile to specify exactly how it should be compiled.
  • Anything you put into inst will be installed with the package (but the files will just be copied -- nothing else will be done with them); you can then locate the installed files via system.file().

I think my main advice would be to read the R extensions manual carefully, especially the sections on configuration and Makefiles ... try putting the instructions in src/Makefile ...
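
As a small illustration of the second option, the sketch below looks up a file that was shipped under inst/. The helper name installed_file() and the inst/tmb/model.cpp layout are assumptions made for the example, not something prescribed by TMB:

```r
# Hypothetical helper: return the installed path of a file shipped under inst/.
# Files in inst/ are copied to the top level of the installed package, so
# inst/tmb/model.cpp would be looked up as "tmb/model.cpp".
installed_file <- function(..., package) {
  # mustWork = TRUE turns a missing file into an error instead of ""
  system.file(..., package = package, mustWork = TRUE)
}

# Demonstrated with a file that every R installation has:
installed_file("DESCRIPTION", package = "stats")
```

In a package of your own, the call would look like installed_file("tmb", "model.cpp", package = "mypkg"), and the result could be passed to TMB::compile().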

Owner

kaskr commented Nov 30, 2014

I managed to get a test package working by following these steps:

  1. Run package.skeleton("mypkg").
  2. Copy model.cpp to mypkg/src.
  3. Add

     Depends: TMB
     LinkingTo: TMB

     to the file mypkg/DESCRIPTION to ensure that TMB is loaded when mypkg is loaded, and that the TMB source code is found when compiling mypkg.
  4. Create a file mypkg/R/zzz.R with

     .onLoad <- function(lib, pkg) {
         cat("Loading compiled code...\n")
         library.dynam("mypkg", pkg, lib)
     }

     to ensure that the compiled code is loaded when you load the package.
  5. Pass DLL="mypkg" to all MakeADFun calls.
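
The DESCRIPTION edit in step 3 can also be scripted. This is only a sketch: add_tmb_fields() is a hypothetical helper, and it assumes the DESCRIPTION file does not yet declare Depends or LinkingTo, which should be the case for a fresh package.skeleton() template:

```r
# Hypothetical helper: append the two TMB fields to a DESCRIPTION file.
add_tmb_fields <- function(desc_path) {
  existing <- colnames(read.dcf(desc_path))
  if (any(c("Depends", "LinkingTo") %in% existing)) {
    stop("DESCRIPTION already has Depends or LinkingTo; edit it by hand")
  }
  lines <- readLines(desc_path)
  lines <- lines[nzchar(trimws(lines))]  # a blank line would start a new record
  writeLines(c(lines, "Depends: TMB", "LinkingTo: TMB"), desc_path)
  invisible(desc_path)
}

# Try it on a throwaway DESCRIPTION in a temporary directory.
desc <- file.path(tempdir(), "DESCRIPTION")
writeLines(c("Package: mypkg", "Version: 0.1"), desc)
add_tmb_fields(desc)
read.dcf(desc, fields = c("Depends", "LinkingTo"))
```

If the package already lists other dependencies, it is safer to add TMB to the existing fields by hand.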

Contributor

mebrooks commented Nov 30, 2014

Thanks for the suggestions. I'll try it out.

markpayneatwork commented Nov 30, 2014

Hi, for future reference I added Kasper's answer to the wiki, under "development".

Contributor

mebrooks commented Feb 26, 2016

I dug around, and it seems that the steps we took for glmmTMB were to put these lines in the Makefile to compile glmmTMB.cpp (the recipe line must be indented with a tab):

R=R
PACKAGE=glmmTMB
$(PACKAGE)/src/glmmTMB.so: $(PACKAGE)/src/glmmTMB.cpp
	cd $(PACKAGE)/src; echo "library(TMB); compile('glmmTMB.cpp','-O0 -g')" | $(R) --slave

and this line in the NAMESPACE to load the compiled code:

useDynLib(glmmTMB)

Does that look right? These steps seemed to work for the other package I'm working on. I just want to document them for reference.

Contributor

bbolker commented Feb 26, 2016

I'm guessing that you don't even need the Makefile if you aren't trying to specify non-standard compilation flags. It would probably be better (in general) to use a Makevars file (see the relevant section in the R Extensions manual); you could probably just set PKG_CXXFLAGS = -O0 -g in this case, without quotes around the flags. (These are good settings for debugging, not for production ...)
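
A Makevars file with these flags could be generated as sketched below. write_debug_makevars() is a hypothetical helper and the package directory is a temporary one. Whether -O0 actually wins over R's default optimization level may depend on the order in which R combines PKG_CXXFLAGS with its own CXXFLAGS, so treat this as a starting point:

```r
# Hypothetical helper: write a src/Makevars that adds debugging flags.
# The flags are deliberately not quoted, because make would pass the quotes
# on to the shell and the compiler would see them as a single argument.
write_debug_makevars <- function(pkg_dir, flags = "-O0 -g") {
  src_dir <- file.path(pkg_dir, "src")
  dir.create(src_dir, recursive = TRUE, showWarnings = FALSE)
  makevars <- file.path(src_dir, "Makevars")
  writeLines(paste("PKG_CXXFLAGS =", flags), makevars)
  invisible(makevars)
}

pkg <- file.path(tempdir(), "mypkg")
readLines(write_debug_makevars(pkg))  # "PKG_CXXFLAGS = -O0 -g"
```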

Owner

kaskr commented Feb 27, 2016

I have updated the wiki under "development" with comments from @mebrooks and @bbolker.

Contributor

mebrooks commented Feb 27, 2016

Looks good. Thanks!

Contributor

mebrooks commented Mar 16, 2016

Is it possible to use two .cpp files in the same package? When I run make install, I get many duplicate symbol errors about the .o files. Here's the first one:

duplicate symbol __ZN5CppAD18traceforward0sweepEi in:
    randwalk.o
    randwalk2.o

Contributor

mebrooks commented Mar 16, 2016

Playing around with this a bit more, it seems that the .cpp file must have the same name as the package. Before, I had the impression that this was just a suggestion. So that answers my question about multiple .cpp files.

Contributor

alko989 commented Apr 2, 2016

I have an R package with three different .cpp files. Only one of them has the name of the package, and even that is not necessary. For compilation I use a Makefile in the src folder with:

# Recipe lines must be indented with a tab, not spaces.
all: calcFmsy.so s6model.so s6modelts.so

calcFmsy.so: calcFmsy.cpp
	Rscript --vanilla -e "TMB::compile('calcFmsy.cpp')"

s6model.so: s6model.cpp
	Rscript --vanilla -e "TMB::compile('s6model.cpp')"

s6modelts.so: s6modelts.cpp
	Rscript --vanilla -e "TMB::compile('s6modelts.cpp')"

# Remove only the object files and shared libraries.
clean:
	rm -f *.o *.so

So during installation, the compile function from TMB compiles the three .cpp files, and I use three useDynLib calls in the NAMESPACE file. To make it work on Windows I have a separate makefile, Makefile.win, that is the same except that it has .dll instead of .so.
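
The .so versus .dll difference is the platform's shared-library extension, which R exposes as .Platform$dynlib.ext. A small sketch (shared_lib_name() is a hypothetical name; as far as I know TMB's own dynlib() helper does the same thing):

```r
# Shared-library file name for the current platform, e.g. "s6model.so" on
# Linux and macOS and "s6model.dll" on Windows.
shared_lib_name <- function(name) {
  paste0(name, .Platform$dynlib.ext)
}

shared_lib_name(c("calcFmsy", "s6model", "s6modelts"))
```

This is handy when loading a library by hand, for example dyn.load(shared_lib_name("s6model")), without hard-coding the extension.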

alexfun commented May 16, 2017

@alko989 Can you please elaborate on what you mean when you say that you use "3 useDynLib" calls in the NAMESPACE file? My TMB::compile step always creates a DLL with the same name as my package.

I've figured it out. Adding a file called Makefile.win with the following contents (making sure the indentation uses tabs, not spaces):

all: my_dll_name.dll

my_dll_name.dll: my_dll_name.cpp
	Rscript --vanilla -e "TMB::compile('my_dll_name.cpp')"

clean:
	rm -f *.o *.dll

then allows me to call MakeADFun in the form MakeADFun(..., ..., DLL = "my_dll_name")

Contributor

alko989 commented May 16, 2017

@alexfun What I meant was that my NAMESPACE file has three useDynLib calls to load the three shared libraries of my package:

useDynLib(calcFmsy)
useDynLib(s6model)
useDynLib(s6modelts)
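
With several shared libraries loaded, each MakeADFun call has to name the one it wants through its DLL argument. A minimal sketch, assuming TMB is installed and the three libraries above are loaded; make_obj() is a hypothetical wrapper, and data and parameters stand in for whatever the chosen model expects:

```r
# Hypothetical wrapper: build a TMB objective function for one of the three
# models by selecting the matching shared library.
make_obj <- function(data, parameters,
                     model = c("calcFmsy", "s6model", "s6modelts")) {
  model <- match.arg(model)  # rejects names that are not one of the three DLLs
  TMB::MakeADFun(data = data, parameters = parameters, DLL = model)
}
```

A call such as make_obj(data, parameters, model = "s6model") then uses the s6model library, while an unknown model name fails before TMB is touched.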

alexfun commented May 16, 2017

Thanks for your reply. I had got that bit (except that I am using roxygen's @useDynLib my_dll_name to build my NAMESPACE appropriately). What I had not understood was your comment about Makefile.win. Apparently R is smart enough to compile the files without a Makefile, so my instructions (copied and pasted from yours) were not read because my file did not have the .win extension.
