Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should have a method for cloning objects #27

Closed
wch opened this issue Sep 3, 2014 · 27 comments

Comments

@wch
Copy link
Member

commented Sep 3, 2014

No description provided.

@richfitz

This comment has been minimized.

Copy link

commented Sep 3, 2014

👍

@wch

This comment has been minimized.

Copy link
Member Author

commented Oct 29, 2014

I think this will require storing references in the public binding env (aka self) to the enclosing env, private env, and possibly super binding env. They could be stored in attributes.

@gluc

This comment has been minimized.

Copy link

commented Mar 29, 2015

That would be awesome! Not sure I understand your comments, but if you give me some hints, I wouldn't mind helping you implementing this.

@wch

This comment has been minimized.

Copy link
Member Author

commented Mar 30, 2015

@gluc Right now, given the R6 object (AKA the public binding environment of the object), there's not a straightforward way to access the enclosing environment, private binding environment, or the super binding environment.

To clone an object, it's necessary to access all of these things, when you're given the public binding env. So it might be necessary to save references to these things as attributes.

See https://github.com/wch/R6/blob/master/doc_extra/R6.pdf for a diagram of the structure of the environments. These follow the diagramming convention used in Hadley's Advanced-R book.

@thomasp85

This comment has been minimized.

Copy link
Member

commented Apr 17, 2015

How is this progressing (if at all?) I came here to ask exactly for that...

Couldn't all R6 objects have a reserved method called clone/copy in the same way as initialize? This method would get set during the call to new() and the enclosing environment could be set to contain self, private, the constructer etc... Or am I missing an obvious problem with this approach?

@thomasp85

This comment has been minimized.

Copy link
Member

commented Apr 17, 2015

by using super$copy() subclasses could even do some additional work to its fields and methods and pass in these modifications as named arguments - this could be useful if one of the fields were an environment that needed to be copied before the object itself is copied...

@wch

This comment has been minimized.

Copy link
Member Author

commented Apr 21, 2015

@thomasp85 I haven't had time recently to get to this... that does sound like an interesting solution. When I get some downtime I'll think about it some more.

@CameronBieganek

This comment has been minimized.

Copy link

commented May 26, 2015

I also support the movement to have a built in copy method. This is actually forcing me to go back to R5 classes at the moment.

@gaborcsardi

This comment has been minimized.

Copy link
Member

commented May 26, 2015

@clbieganek Why don't you just implement a copy method? I guess for most classes with reference semantics (e.g. databases handles, shiny widgets, etc.) you would need to do that, anyway.

@richfitz

This comment has been minimized.

Copy link

commented May 27, 2015

Built-ins (and a standard interface) would have the advantage of having objects that don't know each other working well.

When doing a clone/copy and working recursively through the objects contained in one R6 instance, if an object is not a R6/reference class, it could be copied directly, if it supports clone / copy that would be called. Similarly, an object can become non-copyable by implementing a method that just throws an error (which would be useful for things like boost::noncopyable).

@gaborcsardi

This comment has been minimized.

Copy link
Member

commented May 27, 2015

@richfitz Well, yes, these are good points. Personally, I still would not go back to RC because of the lack of the copy method. :) I would rather code my copy method. Your points are valid, but chances are, you are not embedding/inheriting from classes by other people.

@CameronBieganek

This comment has been minimized.

Copy link

commented May 27, 2015

@gaborcsardi I did try to implement my own copy method, but I didn't get it to work right. Here's a simplified version of what I had:

top.class <- R6Class(
    'top.class',
    portable = FALSE,
    public = list(

        initial.model = NULL,
        final.model = NULL,

        intialize = function( ...stuff... )
        {
            # Fill the "models" slot of a "mixed.model" object with a
            # list of "single.model" objects and put the result in the
            # "initial.model" slot above.
        ),

        update.model = function( ...stuff... )
        {
            # Take the "mixed.model" in the "initial.model" slot, change
            # some parameters in the model, and put the result in the
            # "final.model" slot. The first step of this procedure is to
            # copy the initial "mixed.model" using the "mixed.model"
            # copy method.
        )

    )
)




mixed.model <- R6Class(
    'mixed.model',
    portable = FALSE,
    public = list(

        models = NULL,

        copy = function()
        {
            out <- mixed.model$new()
            out$models <- lapply(models, function(sm) sm$copy())
            out
        }

    )
)




single.model <- R6Class(
    'single.model',
    portable = FALSE,
    public = list(

        a = NULL,
        b = NULL,
        c = NULL,
        d = NULL,

        initialize = function(a, b, c, d)
        {
            a <<- a
            b <<- b
            c <<- c
            d <<- d
        },

        predict = function(x, y, z)
        {
            a + b * x + c * y + d * z
        },

        update.model = function( ...stuff... )
        {
            a <<- # non-trivial function of the arguments
            b <<- # non-trivial function of the arguments
            c <<- # non-trivial function of the arguments
            d <<- # non-trivial function of the arguments
        },

        copy = function()
        {
            out <- as.environment(as.list.environment(self))
            class(out) <- c('single.model', 'R6')
            out
        }

    )
)

However, after testing the code, I realized that the single.model copy method creates a new copy of the object at a different memory address, but the methods/closures within the copied object are still the same closures that are within the original object. In particular, when I used the single.model$update.model method on the copied object, it changed the parameters of the original object rather than the parameters of the copied object.

...Now that I right it down, I wonder if changing the single.model$update.model method to use
self$a <- ... would solve that particular problem.

I'm sure I could eventually figure out how to properly write the copy method. I briefly looked at the RC copy method, and it seems like there may be some clues there. However, I don't really have time right now to plumb the depths of environments and closures.

@gaborcsardi

This comment has been minimized.

Copy link
Member

commented May 27, 2015

@clbieganek Yes, I clearly underestimated the difficulties here, sorry about that. :( I'll try to work out a copy method later this week if I'll have the time.

@gluc

This comment has been minimized.

Copy link

commented May 27, 2015

Not sure if it helps you guys, but I implemented a clone method in the data.tree package. It works by converting the R6 to a nested list (see here) and back (see here ).
This works fine for my specific requirement, but it's a very incomplete implementation (e.g. no circular references are checked, all references are assumed to be R6, only converts public fields, etc.).

@CameronBieganek

This comment has been minimized.

Copy link

commented May 27, 2015

Ok, I came up with something that seems to work properly. I used the RC copy method for inspiration. Here's the RC copy method:

copy = function (shallow = FALSE) 
{
   def <- .refClassDef
   value <- new(def)
   vEnv <- as.environment(value)
   selfEnv <- as.environment(.self)
   for (field in names(def@fieldClasses)) {
      if (shallow) 
         assign(field, get(field, envir = selfEnv), envir = vEnv)
      else {
         current <- get(field, envir = selfEnv)
         if (is(current, "envRefClass")) 
            current <- current$copy(FALSE)
         assign(field, current, envir = vEnv)
      }
   }
   value
}

Here's my implementation of an R6 copy method:

copy = function(shallow = FALSE, parent = parent.frame())
{
    Copy <- new.env(parent = parent)
    class(Copy) <- class(self)

    for(field.name in setdiff(ls(self, all.names = TRUE), 'self'))
    {
        if(shallow)
        {
            Copy[[field.name]] <- self[[field.name]]

        } else
        {
            field.value <- self[[field.name]]
            if(inherits(field.value, 'R6')) field.value <- field.value$copy(FALSE)

            Copy[[field.name]] <- field.value
            if(is.function(field.value)) environment(Copy[[field.name]]) <- Copy
        }
    }

    Copy$self <- Copy

    Copy
}

There are at least two scenarios that I haven't accounted for in my copy method:

  • One of the fields contains an environment (that is not an R6 class).
  • One of the fields contains a closure where the enclosing environment is NOT the R6 object.

Comments and suggestions welcome.

@CameronBieganek

This comment has been minimized.

Copy link

commented May 27, 2015

Shucks. Here's another scenario that my new R6 copy method does not cover:

  • One of the fields contains a list of R6 objects.

...which is the case for my mixed.model class above. Granted, I could keep the mixed.model copy method the same as it is shown above, but it would be better to have one copy method that I could put in all of my R6 classes. Interestingly, I just tested the RC copy method and it has the same problem -- it fails to make a distinct copy for a field that contains a list of RC objects. This is because the RC copy method (and my R6 copy method) only checks the top-level class of the field when recursively calling copy, rather than also checking the classes of the elements of the list.

@thomasp85

This comment has been minimized.

Copy link
Member

commented Jun 4, 2015

Thinking some more about this I would propose the following construct (implementation will follow if the approach resonates):

During the call to $new() a public method of name 'copy' or 'clone' will automatically be set to a method having access to all relevant information (private, public, enclosing, super_binding etc. They are all available from $new()). This method will make a shallow copy of the object, returning a new object and should not be modified or overwritten (in fact it will overwrite any public method of the same name). To modify the copying process to make it deep, the developer can provide a private method copyFields which will be called by copy/clone and return a list with values for public and private fields that should take precedence over existing values. This gives the developer the power to make decisions about how fields containing environment/list of R6Classes, etc. should be handled, without having them worrying over the actual copy implementation. Furthermore it will support calls to super() so inheritance of copying behaviour is supported.

Does this makes sense to you? It supports some strong conventions (two protected field names etc.) and I don't know how you feel about that, but I think it is a gracefull way to give developers control over the copying process...

@gaborcsardi

This comment has been minimized.

Copy link
Member

commented Jun 4, 2015

This sounds too difficult to me to be honest. Why not just have a default clone method that the user can override? This should be public, I guess, so that people can call it.

@thomasp85

This comment has been minimized.

Copy link
Member

commented Jun 4, 2015

My intent was to abstract away all the possible failures of having people handling the different enclosing environments. It’s easy enough to create a standard shallow clone method, but if someone needs a deep copy of an environment in a private field they would be required to understand the full R6 object model and how the different enclosures are extracted and set. I don’t say that this is impossible for everyone to learn, but I think it will be very unappealing for most developers…

While my explanation might seems complicated I think it’s rather simple from a developer point of view:

$copy() is provided for free

To modify copying add a $fieldCopy() method that returns a list with a public and private element, each containing named values to put into the cloned object.

For instance:

R6Class(
    'test',
    public = list(
        x=5,
        y='a',
        env=new.env(parent=emptyenv())
    ),
    private = list(
        fieldCopy = function() {
            list(
                public=list(),
                private=list(
                    env=list2env(as.list(self$env))
                )
            )
        }
    )
)

For a developer I think this is a clean as it gets...

When you say difficult are you thinking of the R6 implementation?

@gaborcsardi

This comment has been minimized.

Copy link
Member

commented Jun 4, 2015

OK, I thought about this, and I think what you have is quite good.

Personally I would change fieldCopy to copy or clone, the user does not need to know that this is not the real copy, and there is another one. And then change the name of clone to something internal, or just make it part of new() somehow.

When the user is writing a copy method, they should be able to call the copy methods of fields, of course. E.g.

...
copy = function() {
  list(
    public = list(),
    private = list(
      embedded = self$embedded$copy(),
      simple_field = self$simple_field
    )
  )
)

Of course I might be still missing something....

@thomasp85

This comment has been minimized.

Copy link
Member

commented Jun 4, 2015

Thats exactly what I had in mind (I'm not married to the fieldCopy method name, and you might be right that a simpler name is more appealing).

Actually my intent was that only fields that required special attention should be returned by the fieldCopy method, so the simple_field in your example would not be required...

@thomasp85

This comment has been minimized.

Copy link
Member

commented Jun 4, 2015

I'll try to brew together a rudimentary implementation over the weekend and see if there are any obvious problems with this design...

@wch

This comment has been minimized.

Copy link
Member Author

commented Jun 4, 2015

Thanks guys, I'm going on a trip for a little while I'll try to take a look at this stuff when I get a chance.

@thomasp85

This comment has been minimized.

Copy link
Member

commented Jun 7, 2015

I've created an implementation with my proposed approach in #57. I identified a range of problems with @clbieganek approach, mainly that the enclosing environment of methods in super are not modified and thus still points to the original object. Also active bindings are lost and environment locking is reset.

My copy method is almost a full call to new(), which ensures that the returned object is fully unconnected. There is quite a lot of code copy/paste so maybe some refactoring would be in place if it is to be merged...

@wch wch referenced this issue Jun 20, 2015
@wch

This comment has been minimized.

Copy link
Member Author

commented Jun 20, 2015

@thomasp85 Thanks for the PR -- it's useful to refer to, although I think I'm going to go a slightly different way. I do like the idea of looking a $deepCopy() method.

I've opened PR #62 where, instead of automatically adding a clone method to all objects, the R6 package provides a clone() function, which is called with clone(obj). The programmer can add a public clone() method if they want to be able to call it with obj$clone().

  clone = function() clone(self)
@thomasp85

This comment has been minimized.

Copy link
Member

commented Jun 21, 2015

@wch no problem - it was a rather crude prototype to show a working example of what I had in mind - I learned a lot about enclosures in the process so no time wasted on my part :-)

I'll have a look at #62 and see if I have any comments...

@wch wch closed this in #62 Jun 22, 2015

@wch

This comment has been minimized.

Copy link
Member Author

commented Jun 23, 2015

Everyone who's been watching this issue: I wrestled with the implementation of deep cloning for a while, and came up with an implementation in #65. I'd appreciate any feedback on it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.