Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finalizers, closes #92 #93

Merged
merged 2 commits into from Aug 22, 2016

Conversation

@gaborcsardi
Copy link
Member

commented Aug 22, 2016

Classes can define a public finalize method, and this will be
called automatically when objects are garbage collected.

The destructor is called with no arguments, but of course it can refer to self and private, and if the class is not portable, then it can also just refer to members.

@gaborcsardi

This comment has been minimized.

Copy link
Member Author

commented Aug 22, 2016

Btw. this PR also help accessing private members from the finalizer, which is quite hacky currently, as I can see you need to extract the private env manually: a$.__enclos_env__$private

@@ -0,0 +1,100 @@
context("destructor")

This comment has been minimized.

Copy link
@wch

wch Aug 22, 2016

Member

Can you rename all of the instances of "destructor" to "finalizer"?

This comment has been minimized.

Copy link
@gaborcsardi

gaborcsardi Aug 22, 2016

Author Member

Sure, no problem.

This comment has been minimized.

Copy link
@gaborcsardi

gaborcsardi Aug 22, 2016

Author Member

Done.

# last returned value is saved in .Last.value
1+1
# Instantiate an object:
O <- A$new()

This comment has been minimized.

Copy link
@wch

wch Aug 22, 2016

Member

Can you rename O to a, or obj? The name O is surprising and sort of distracts from the important parts.

This comment has been minimized.

Copy link
@gaborcsardi

gaborcsardi Aug 22, 2016

Author Member

Done.

gc()
# Remove the single existing reference to it, and force garbage collection
# (normally garbage collection will happen automatically from time
# time to time

This comment has been minimized.

Copy link
@wch

wch Aug 22, 2016

Member

This says "time time to time" - please remove extra "time" and add closing paren.

This comment has been minimized.

Copy link
@gaborcsardi

gaborcsardi Aug 22, 2016

Author Member

Done.

```

In the example above, we used `onexit=TRUE`, so that the finalizer will also be called when R exits. This is useful in some cases, like database connections, but it isn't necessary for others, like open files, since they will be closed anyway when the R process exits.
Destructors are implemented using the `reg.finalizer()` function, and they set `onexit=TRUE`, so that the destructor will also be called when R exits. This is useful in some cases, like database connections.

This comment has been minimized.

Copy link
@wch

wch Aug 22, 2016

Member

s/Destructor/Finalizer/

This comment has been minimized.

Copy link
@gaborcsardi

gaborcsardi Aug 22, 2016

Author Member

No worries, I replaced it everywhere.

@gaborcsardi gaborcsardi changed the title Destructors, closes #92 Finalizers, closes #92 Aug 22, 2016

Finalizers, closes #92
Classes can define a public finalize method, and this will be
called automatically when objects are garbage collected.
@gaborcsardi

This comment has been minimized.

Copy link
Member Author

commented Aug 22, 2016

PTAL.

@wch

This comment has been minimized.

Copy link
Member

commented Aug 22, 2016

I just had a thought: What about inheritance? Say that class A has a finalizer, and class B inherits from A. What should happen then?

@gaborcsardi

This comment has been minimized.

Copy link
Member Author

commented Aug 22, 2016

YEah, that's a good one.

@gaborcsardi

This comment has been minimized.

Copy link
Member Author

commented Aug 22, 2016

I think the "usual" thing is sensible: the finalizer is still called, and you can override it in the child.
It might be working like this already, although even then it is worth adding a test case.

@wch

This comment has been minimized.

Copy link
Member

commented Aug 22, 2016

If A and B both have finalizers, I can imagine cases where B's finalizer should override A's, but I can also imagine some where both finalizers should run. In the latter case, could the finalizer invoke super$finalizer()?

And some tests of this would be nice :) (including tests with two levels of inheritance)

@gaborcsardi

This comment has been minimized.

Copy link
Member Author

commented Aug 22, 2016

If A and B both have finalizers, I can imagine cases where B's finalizer should override A's, but I can also imagine some where both finalizers should run. In the latter case, could the finalizer invoke super$finalizer()?

Yes, I suspect that it already works like this, finalize is just a method like any other, after all.

And some tests of this would be nice :) (including tests with two levels of inheritance)

I'll write them in a minute.

@gaborcsardi

This comment has been minimized.

Copy link
Member Author

commented Aug 22, 2016

@wch Btw. what did you have in mind for two levels? There are a lot of combinations to test with two levels....

@wch

This comment has been minimized.

Copy link
Member

commented Aug 22, 2016

I've been thinking about it some more, and I think I've convinced myself that B's finalizer should have to explicitly call super$finalize() if it wants to invoke A's finalizer.

Here's an example of that illustrates the other side: why it can be awkward for B to have to invoke A's finalizer explicitly. Suppose A is a class that represents a database connection with a finalize method that closes the connection. And B is a subclass that buffers the reads (or something like that). That means that B (and every other subclass of A) will need to have a finalize that calls super$finalize(), even if B's finalizer does nothing else. And it's also possible to create a subclass of A which has broken behavior -- if the subclass has no finalize, it will leave the connection open.

On the other hand, if B's initializer wants to invoke A's initializer, the it needs to call super$initialize(). This is the current behavior. I think it makes sense for finalizers to have behavior that mirrors this.

@wch

This comment has been minimized.

Copy link
Member

commented Aug 22, 2016

For two levels: suppose C inherits from B, which inherits from A. Then if C's finalize calls super$finalize(), it should invoke B's finalize. And if B's finalize calls super$finalize(), it should invoke A's finalize.

I don't think it makes sense for C's finalize to be able to call A's finalize directly. Did you have any other combinations in mind?

@gaborcsardi

This comment has been minimized.

Copy link
Member Author

commented Aug 22, 2016

TBH, I am not sure what you prefer now. :) Sorry.

Right now

  • finalizers are inherited, i.e. they are called on the child automatically (unless overridden, see next point),
  • finalizers can be overridden, in which case the parent's finalizer is not called automatically,
  • the child's finalizer can call the immediate parent's finalizer via super$finalize.
  • AFAIK this is exactly the same behavior as for initialize.

I think this is all quite good. It is safe and flexible. In your DB example, B does not have to define a finalizer just to call A's finalizer. OTH if it defines one to do sg extra, it can still call the parent's finalizer via super.

I'm almost done with the test cases, 5 more minutes....

@gaborcsardi

This comment has been minimized.

Copy link
Member Author

commented Aug 22, 2016

For two levels: suppose C inherits from B, which inherits from A. Then if C's finalize calls super$finalize(), it should invoke B's finalize. And if B's finalize calls super$finalize(), it should invoke A's finalize.

Got it, so all three have finalizers and we destroy an object from the grandchild.

@wch

This comment has been minimized.

Copy link
Member

commented Aug 22, 2016

Oh right, I was confused -- I forgot that B would just inherit A's finalizer if it didn't define one. I agree that what you outlined sounds good, and I'm glad it just works that way!

@gaborcsardi

This comment has been minimized.

Copy link
Member Author

commented Aug 22, 2016

Never mind, it was good to think this through. And also, to write the test cases, which are done btw.

As for two levels of inheritance, you could test the case when B does not define a finalizer, only A and C, etc. so there are a lot of cases.

Hmmm, this reminds me, what if we call super$finalize() and the superclass does not have a finalize method? I'll check this out in a minute.

@gaborcsardi

This comment has been minimized.

Copy link
Member Author

commented Aug 22, 2016

Hmmm, this reminds me, what if we call super$finalize() and the superclass does not have a finalize method? I'll check this out in a minute.

OK, this fails, as expected. On the other hand it also fails for initialize, so maybe we want to fix them both at the same place?

I think it is good defensive programming to just call super$initialize and super$finalize, especially if another package (and person) provides the superclass, so ideally these should always succeed. Although

if (!is.null(super$finalize)) super$finalize()

is a simple workaround, so probably not a big deal.

@gaborcsardi

This comment has been minimized.

Copy link
Member Author

commented Aug 22, 2016

Wait a sec, I am fixing some test labels.

@gaborcsardi

This comment has been minimized.

Copy link
Member Author

commented Aug 22, 2016

OK, should be all good now.

Maybe we could generate the non-portable test cases from the portable ones, because it is easy to make copy-and-paste errors. :)

@wch

This comment has been minimized.

Copy link
Member

commented Aug 22, 2016

Looks good!

@wch wch merged commit 9ca3108 into r-lib:master Aug 22, 2016

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@HenrikBengtsson

This comment has been minimized.

Copy link

commented Oct 5, 2016

Just a quick comment sharing my experiences with "automagic" finalizers:

Make sure to test that the finalizers are re-entrant. There's a risk that finalizers depend on code that may not be available when R runs it. For instance, an object may still be around, but all package code may have been unloaded when it's finalizer runs. This becomes especially complicated when they run when R itself exits, cf. from help(".Last"):

Exactly what happens at termination of an R session depends on the platform and GUI interface in use. A typical sequence is to run .Last() and .Last.sys() (unless runLast is false), to save the workspace if requested (and in most cases also to save the session history: see savehistory), then run any finalizers (see reg.finalizer) that have been set to be run on exit, close all open graphics devices, remove the session temporary directory and print any remaining warnings (e.g., from .Last() and device closure).

Then there's the following comment from help("reg.finalizer"):

R's interpreter is not re-entrant and the finalizer could be run in the middle of a computation. So there are many functions which it is potentially unsafe to call from f: one example which caused trouble is options. As from R 3.0.3 finalizers are scheduled at garbage collection but only run at a relatively safe time thereafter.

Problems often shows up as non-reproducible sporadic errors that are quite hard to troubleshoot, because how and when finalizers are run is hard to predict.

I went through the above a while ago with R.oo - required lots of trial-and-error development and very defensive coding.

Just my $.02

@gaborcsardi

This comment has been minimized.

Copy link
Member Author

commented Oct 5, 2016

@HenrikBengtsson Thanks, this is very useful! Maybe it could be even in the docs.

@gaborcsardi

This comment has been minimized.

Copy link
Member Author

commented Oct 5, 2016

@HenrikBengtsson Reading now the R.oo code, wow!

@HenrikBengtsson

This comment has been minimized.

Copy link

commented Oct 5, 2016

Don't look to close; that piece of code is ad hoc and patchy - it's an instance where "if it works - don't change it" truly applies.

@gaborcsardi

This comment has been minimized.

Copy link
Member Author

commented Oct 5, 2016

@HenrikBengtsson What I am thinking about is that R6 objects are pretty self-contained, and usually do not need anything from the R6 package.

What might happen is that an R6 class needs its (package) environment to work. E.g. my process class needs the processx package environment. If this environment is not around after gc, that indeed restricts what we can run in the finalizer.

I'll experiment with this.

@wch

This comment has been minimized.

Copy link
Member

commented Oct 6, 2016

@HenrikBengtsson Very interesting, thanks for sharing your experiences. These do sound like really tricky problems to diagnose.

FWIW, when I run this code in an R session:

library(R6)
print(loadedNamespaces())

e <- new.env()
reg.finalizer(e, onexit = TRUE, function(e) {
  cat("==== Finalizer ====\n")
  print(loadedNamespaces())
})

.Last <- function() {
  cat("==== .Last ====\n")
  print(loadedNamespaces())
}
q()

It doesn't unload any of the packages before .Last() or e's finalizer are called. (Tested on R 3.2.3 on a Mac.)

$ R
> library(R6)
> print(loadedNamespaces())
[1] "R6"        "graphics"  "utils"     "grDevices" "stats"     "datasets" 
[7] "methods"   "base"     
> 
> e <- new.env()
> reg.finalizer(e, onexit = TRUE, function(e) {
+   cat("==== Finalizer ====\n")
+   print(loadedNamespaces())
+ })
NULL
> 
> .Last <- function() {
+   cat("==== .Last ====\n")
+   print(loadedNamespaces())
+ }
> q()
==== .Last ====
[1] "R6"        "graphics"  "utils"     "grDevices" "stats"     "datasets" 
[7] "methods"   "base"     
==== Finalizer ====
[1] "R6"        "graphics"  "utils"     "grDevices" "stats"     "datasets" 
[7] "methods"   "base"     
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.