Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Saving compiled code from cfunction objects in user defined locations is not supported #13

Closed
jranke opened this issue Nov 19, 2020 · 17 comments · Fixed by #16
Closed

Comments

@jranke
Copy link
Contributor

jranke commented Nov 19, 2020

I frequently generate cfunctions in my mkin package by generating C code from a user specified ODE model definition which is then compiled for fast solutions with deSolve within iterative optimizations. Generally, it is fine to have the compiled code under /tmp, but when I have large knitr documents and specific chunks take a long time to complete I use knitr chunk caching and I get problems with model objects in the knitr cache not finding their compiled code any more. Therefore, I would like to store the shared objects generated by cfunctions in a user defined path, e.g. in the current working directory.

The functions readDynLib and writeDynLib from R/utilities.R would fit the bill, but they are not exported, and the former has a bug making the example code in dontrun fail, even if the functions are found by using inline:::readDynLib and inline:::writeDynLib. An alternative to fixing and exporting these functions would be to add an argument to cfunction, defining a path for the shared object.

@jranke
Copy link
Contributor Author

jranke commented Nov 19, 2020

A reproducible example showing the bug, slightly adapted from man/utilites.Rd:

x <- as.numeric(1:10)
n <- as.integer(10)

code <- "
      integer i
      do 1 i=1, n(1)
    1 x(i) = x(i)**3
"
cubefn <- cfunction(signature(n="integer", x="numeric"), code, convention=".Fortran")
code(cubefn)

cubefn(n, x)$x

fname <- tempfile()
inline:::writeDynLib(cubefn, file = fname)
# load and assign different name to object
cfn <- inline:::readDynLib(fname)
print(cfn)
cfn(2, 1:2)

which gives

> x <- as.numeric(1:10)
> n <- as.integer(10)
> code <- "
+       integer i
+       do 1 i=1, n(1)
+     1 x(i) = x(i)**3
+ "
> cubefn <- cfunction(signature(n="integer", x="numeric"), code, convention=".Fo
rtran")
> code(cubefn)
  1:
  2:        SUBROUTINE file628954285952 ( n, x )
  3:       INTEGER n(*)
  4:       DOUBLE PRECISION x(*)
  5:
  6:       integer i
  7:       do 1 i=1, n(1)
  8:     1 x(i) = x(i)**3
  9:
 10:       RETURN
 11:       END
 12:
> cubefn(n, x)$x
 [1]    1    8   27   64  125  216  343  512  729 1000
> fname <- tempfile()
> inline:::writeDynLib(cubefn, file = fname)
> cfn <- inline:::readDynLib(fname)
Error in inline:::readDynLib(fname) : 'file' does not exist

@eddelbuettel
Copy link
Owner

Hm, can you not create a local helper that picks the created artifact, here an object file, up and stores it where you want it?

[ AFAIK knitr kinda / mostly / somewhat solved the chunk dependency problem many years ago; that is what we use in the Rcpp Gallery site (with some known restrictions, I think it prefers to compile all chunks at once ...) The rstan folks also built on this, and I think even persist compilation output but I am a little fuzzy. I don't really want to rock the boat here as this is a pretty stable package (that, as far as I can tell, is not that widely used anymore as e.g. Rcpp and cppFunction() ate its lunch at least over in my world of short C++ snippets....). ]

Also, the pair of writeDynLib and readDynLib are a contribution by @karlines so she may be a better person to talk to for details. Maybe the two of you can discuss an approach you could scope out for discussion?

@jranke
Copy link
Contributor Author

jranke commented Nov 19, 2020

Yes, sure, I could solve this in my package. The more I think about it, the more I think a path argument for cfunction would be the way to go, to have the object stored in a user defined location right away. Then, readDynLib and writeDynLib would not be necessary as far as I can see.

knitr chunk dependencies in conjunction with caching are nice, but not if objects contain references to /tmp/xyz.

@eddelbuettel
Copy link
Owner

Yes, it sounds like cfunction could be extended that way. I just don't have any personal use cases so I would rely on you to implement, test and document it carefully.

@jranke
Copy link
Contributor Author

jranke commented Nov 20, 2020

Meanwhile I tried both approaches...

One problem with writeDynLib that is hard to fix is that for completeness, it would need to modify the environment of the R function that was originally created, which it does not do. Therefore, the CFunc object that is saved together with the DLL still refers to the original libLFile via this environment. This becomes obvious when we call getDynLibon the new object obtained after reading with readDynLib. This is illustrated below, using modified functions writeDynLib and readDynLib from my branch saving_compiled_code:

> code <- "
+       integer i
+       do 1 i=1, n(1)
+     1 x(i) = x(i)**3
+ "
> cubefn <- cfunction(signature(n="integer", x="numeric"), code,
+   convention=".Fortran")
> writeDynLib(cubefn, bname = "testname", directory = tempdir())

After writing using the specified name "testname", we find the two new files in the specified location, next to the three files originally created by cfunction:

> dir(tempdir())
[1] "file20545001d463.f"  "file20545001d463.o"  "file20545001d463.so"
[4] "testname.CFunc"      "testname.so"

We can read the DLL from these files, but the object still contains references to the original locations.

> cfn <- readDynLib("testname", directory = tempdir())
> getDynLib(cfn)
DLL name: file20545001d463
Filename: /tmp/Rtmp9rtQUU/file20545001d463.so
Dynamic lookup: TRUE
>  environment(cfn@.Data)$f
[1] "file20545001d463"
>  environment(cfn@.Data)$libLFile
[1] "/tmp/Rtmp9rtQUU/file20545001d463.so"

Only the DLL attribute is modified.

>  attributes(cfn)
$code
[1] "\n       SUBROUTINE file20545001d463 ( n, x )\n      INTEGER n(*)\n      DO
UBLE PRECISION x(*)\n\n      integer i\n      do 1 i=1, n(1)\n    1 x(i) = x(i)*
*3\n\n      RETURN\n      END\n\n"

$class
[1] "CFunc"
attr(,"package")
[1] "inline"

$DLL
[1] "/tmp/Rtmp9rtQUU/testname.so"

$fname
[1] "file20545001d463"

This confirms yesterdays idea that it would be better to allow cfunction to store its output in a more permanent way in the first place. However, when I tried to follow this route, I found myself changing a lot in the cfunction code, with only a partial understanding of what I was doing, so I got nowhere until now.

jranke added a commit to jranke/inline that referenced this issue Nov 20, 2020
Instead of the argument "file", there are now two arguments "bname" and
"directory" (see updated documentation in man/utilities.Rd).

However, we still have remainders of the old DLL locations and names in
after writing and reading, see eddelbuettel#13.
@jranke
Copy link
Contributor Author

jranke commented Nov 20, 2020

OK, as I am unlikely to change cfunction, I completed writeDynLib in the way postulated above. Now I get

> getDynLib(cfn)
DLL name: testname
Filename: /tmp/RtmpgxKotq/testname.so
Dynamic lookup: TRUE
> environment(cfn@.Data)$f
[1] "testname"
> environment(cfn@.Data)$libLFile
[1] "/tmp/RtmpgxKotq/testname.so"
> attributes(cfn)
$code
[1] "\n       SUBROUTINE file4d044553b24d ( n, x )\n      INTEGER n(*)\n      DO
UBLE PRECISION x(*)\n\n      integer i\n      do 1 i=1, n(1)\n    1 x(i) = x(i)*
*3\n\n      RETURN\n      END\n\n"

$class
[1] "CFunc"
attr(,"package")
[1] "inline"

$DLL
[1] "/tmp/RtmpgxKotq/testname.so"

$fname
[1] "file4d044553b24d"

> cfn(3, 1:3)
$n
[1] 3

$x
[1]  1  8 27

I think this works, but needs some testing. @karlines would you like to check if this is what you intended with these functions? You can see the changes I made here.

@eddelbuettel I will send a PR when I am confident that everything is OK.

@eddelbuettel
Copy link
Owner

Looks like you are heading in a fruitful direction of generalizing what we currently, but don't rush a PR -- we have found that more discussion often has value. And it would be good to hear from @karlines first,

@jranke
Copy link
Contributor Author

jranke commented Nov 23, 2020

Even after modification of the environment, the finalizer set for the environment of the CFunc function object x sticks around and removes my dynlib as environment(x@.Data)$libLfile points to it. And I do not know how to replace the finalizer. Setting a finalizer on the environment via reg.finalizer a second time does not remove the first finalizer, but adds another one.

Is it important to unlink the DLL in the finalizer as done here
https://github.com/eddelbuettel/inline/blob/master/R/cfunction.R#L194
What happens if we don't?

@jranke
Copy link
Contributor Author

jranke commented Nov 24, 2020

Now I modified the finalizer to unlink only the original DLL that is intended to be temporary (libLFile_orig in the function environment). If we use writeDynLib and readDynLib, the path to the DLL libLFile in the function environment is set to a new path that is not cleaned up by the finaiizer.

The code of cfunction was a bit confusing for me, as the variable f that is generated in the very beginning of cfunction() is used for the function name (in case we have a single signature), but also for the DLL name. I have partially addressed this in 974bdea, see the commit message for details.

The current state of the full patch can be seen here.

@eddelbuettel
Copy link
Owner

I had glanced at your changes earlier (albeit only superficially). Any chance you could make the change to the readDynbLib and writeDynLib() interfaces just be a new (optional, with default) argument and rename file to bname ?

@jranke
Copy link
Contributor Author

jranke commented Nov 24, 2020

Mhm, yes, I reverted to file for this argument, as bname is not exactly a brilliant improvement over that.

jranke added a commit to jranke/inline that referenced this issue Nov 24, 2020
@jranke
Copy link
Contributor Author

jranke commented Nov 24, 2020

Note however that the functions do not work as they are on the master branch, see the original bug report. So I do not think anyone could have used them.

@jranke
Copy link
Contributor Author

jranke commented Nov 24, 2020

I added some tests based on the examples in man/cfunction.Rd. They run fine when run in a normal R session, but compiling fails when these tests are run via R CMD check:

* checking tests ...
  Running ‘cfunction.R’
 ERROR
Running the tests in ‘tests/cfunction.R’ failed.
Last 13 lines of output:
    4:       DOUBLE PRECISION x(*)
    5: 
    6:       integer i
    7:       do 1 i=1, n(1)
    8:     1 x(i) = x(i)**3
    9: 
   10:       RETURN
   11:       END
   12: 
  
  Compilation ERROR, function(s)/method(s) not created!
  Error in compileCode(f, code, language, verbose) : 
    Error in file(filename, "r", encoding = encoding) :   cannot open the connectionCalls: local ... eval.parent -> eval -> eval -> eval -> eval -> source -> fileIn addition: Warning message:In file(filename, "r", encoding = encoding) :  cannot open file 'startup.Rs': No such file or directoryExecution halted
  Calls: cfunction -> compileCode
  Execution halted

@eddelbuettel Do you have an idea what's going wrong here?

@eddelbuettel
Copy link
Owner

I do not. I only see the CRAN checks which are all OK. Don't rock the boat :)

@eddelbuettel
Copy link
Owner

That said, I am of course on board adding unit tests. But if were to decide to do that, I think we would want a different issue ticket for it, and I would likely make a case for tinytest, or for just scripts in tests/.

@jranke
Copy link
Contributor Author

jranke commented Nov 24, 2020

i know you don't like testthat, don't worry. I just have a script in tests/ but it is run differently than a normal R session. I also tried Rscript --vanilla tests/cfunction.R and it runs fine, while R CMD check complains as above.

@eddelbuettel
Copy link
Owner

while R CMD check complains as above.

I suspect that is just the usual dance R CMD check does to be rigoruous about a) temporary directories and other resources, b) its .libPaths() and what not c) black magic as usual. You will need to debug that as we (obviously) cannot use it if it breaks R CMD check.

jranke added a commit to jranke/inline that referenced this issue Nov 25, 2020
Instead of the argument "file", there are now two arguments "bname" and
"directory" (see updated documentation in man/utilities.Rd).

However, we still have remainders of the old DLL locations and names in
after writing and reading, see eddelbuettel#13.
jranke added a commit to jranke/inline that referenced this issue Nov 25, 2020
eddelbuettel pushed a commit that referenced this issue Nov 26, 2020
* Overhaul readDynLib and writeDynLib

Instead of the argument "file", there are now two arguments "bname" and
"directory" (see updated documentation in man/utilities.Rd).

However, we still have remainders of the old DLL locations and names in
after writing and reading, see #13.

* writeDynLib: Also adapt the environment of the function

* Export writeDynLib and readDynLib

* Test the reloaded CFunc object

* dyn.unload is unnecessary, as we have a new dllname

* Prevent removal of DLL in finalizer, clean up

cfunction and R/utilities.R:
- use DLLpath instead of DLLname if it is a path

cfunction:
- Add 'name' argument to make it possible to specify a function name
  that is used in the source code of the function
- In the finalizer, remove only 'libLFile_orig' which always points to
  the original DLL file, while 'libLFile' can be changed by
  'writeDynLib' to a user defined location

R/utilities.R (writeDynLib and readDynLib):
- Do not use additional attributes for DLL path, DLL name and function
  name, but only modify this information in the function environment
  in 'writeDynLib', and restore this information in 'readDynLib'
- Remove code that was meant to support 'CFuncList' objects in
  'writeDynLib' and 'readDynLib'

* Revert to using a 'file' argument instead of 'bname'

Based on #13 (comment)

* Add tests and fix bugs that turned up

Unfortunately compiling cfunction objects does not work in the R
testing environment. The error message upon compiling the first
cfunction in tests/cfunction.R:

```
* checking tests ...
  Running ‘cfunction.R’
 ERROR
Running the tests in ‘tests/cfunction.R’ failed.
Last 13 lines of output:
    4:       DOUBLE PRECISION x(*)
    5:
    6:       integer i
    7:       do 1 i=1, n(1)
    8:     1 x(i) = x(i)**3
    9:
   10:       RETURN
   11:       END
   12:

  Compilation ERROR, function(s)/method(s) not created!
  Error in compileCode(f, code, language, verbose) :
    Error in file(filename, "r", encoding = encoding) :   cannot open the connectionCalls: local ... eval.parent -> eval -> eval -> eval -> eval -> source -> fileIn addition: Warning message:In file(filename, "r", encoding = encoding) :  cannot open file 'startup.Rs': No such file or directoryExecution halted
  Calls: cfunction -> compileCode
  Execution halted
```

* Overhaul readDynLib and writeDynLib

Instead of the argument "file", there are now two arguments "bname" and
"directory" (see updated documentation in man/utilities.Rd).

However, we still have remainders of the old DLL locations and names in
after writing and reading, see #13.

* writeDynLib: Also adapt the environment of the function

* Export writeDynLib and readDynLib

* Test the reloaded CFunc object

* dyn.unload is unnecessary, as we have a new dllname

* Prevent removal of DLL in finalizer, clean up

cfunction and R/utilities.R:
- use DLLpath instead of DLLname if it is a path

cfunction:
- Add 'name' argument to make it possible to specify a function name
  that is used in the source code of the function
- In the finalizer, remove only 'libLFile_orig' which always points to
  the original DLL file, while 'libLFile' can be changed by
  'writeDynLib' to a user defined location

R/utilities.R (writeDynLib and readDynLib):
- Do not use additional attributes for DLL path, DLL name and function
  name, but only modify this information in the function environment
  in 'writeDynLib', and restore this information in 'readDynLib'
- Remove code that was meant to support 'CFuncList' objects in
  'writeDynLib' and 'readDynLib'

* Revert to using a 'file' argument instead of 'bname'

Based on #13 (comment)

* Add tests and fix bugs that turned up

Unfortunately compiling cfunction objects does not work in the R
testing environment. The error message upon compiling the first
cfunction in tests/cfunction.R:

```
* checking tests ...
  Running ‘cfunction.R’
 ERROR
Running the tests in ‘tests/cfunction.R’ failed.
Last 13 lines of output:
    4:       DOUBLE PRECISION x(*)
    5:
    6:       integer i
    7:       do 1 i=1, n(1)
    8:     1 x(i) = x(i)**3
    9:
   10:       RETURN
   11:       END
   12:

  Compilation ERROR, function(s)/method(s) not created!
  Error in compileCode(f, code, language, verbose) :
    Error in file(filename, "r", encoding = encoding) :   cannot open the connectionCalls: local ... eval.parent -> eval -> eval -> eval -> eval -> source -> fileIn addition: Warning message:In file(filename, "r", encoding = encoding) :  cannot open file 'startup.Rs': No such file or directoryExecution halted
  Calls: cfunction -> compileCode
  Execution halted
```

* Remove superseded test script

* Improve doc of name argument

* Remove superseded test script again

This came back with the merge commit and I didn't notice

* Write the CFunc object in two steps

By separating the process, it is easier to catch potential errors when
moving the DLL around (now done in moveDLL()). Also, the files can get
independent file names in this way. Only writing CFunc objects is
currently supported.

* With current cfunction moveDLL prevents removal of DLL

during garbage collection.

* Improve docs and testing of the 'name' argument

* Small doc improvement.
eddelbuettel added a commit that referenced this issue Nov 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants