Skip to content

Rcpp_PreserveObject/Rcpp_ReplaceObject to protect underlying SEXP in String #322

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 20, 2015

Conversation

thirdwing
Copy link
Member

This is for #321 .

Rewriting String class using PreserveStorage will be much elegant, but I am afraid it might break existing code.

This PR just provides a simple solution using Rcpp_PreserveObject, the underlying SEXP will be protect in constructor and unprotect by Rcpp_ReleaseObject.

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector test() {
    Function gc("gc");

    SEXP character = Rf_mkChar("ouch");
    String string = String("ouch string");
    String otherString = string;

    // force big allocation + gc
    Shield<SEXP> other(Rf_allocVector(INTSXP, 1E6));
    gc();

    Rprintf("CHARSXP:\n");
    Rf_PrintValue(character);

    Rprintf("STRSXP:\n");
    Rf_PrintValue(wrap(string));

    Rf_PrintValue(wrap(otherString));

    return wrap(other);
}
> Rcpp::sourceCpp("ouch.cpp")
> gctorture(TRUE)
> invisible(test())
CHARSXP:
$warning
function (w)
{
    .warningsEnv$warnings <- append(.warningsEnv$warnings, w$message)
    invokeRestart("muffleWarning")
}
<environment: namespace:Rcpp>

STRSXP:
[1] "ouch string"
[1] "ouch string"

@kevinushey
Copy link
Contributor

Good point -- having the String class inherit from the storage meta-class would be an ABI breaking change, so this is definitely the less destructive way of enforcing protection in the class.

I think this looks good. We might slate a time to move to using the PreserveStorage metaclass when we're comfortable with ABI breakage. @jjallaire, @eddelbuettel?

@eddelbuettel
Copy link
Member

Was just (ahem) chasing a seg.fault between on Rcpp-using package of mine and one of a colleague. Likely string related ...

We sorta-kinda decided that we may as well call the next one 0.12.0, given the R_xlen_t. So call it a breaking change for this too? Any other things we want to include and break?

And just to be plain, I think we all hate breaking ABIs. But leaving bugs out in the open is worse, no?

@thirdwing
Copy link
Member Author

Of course, fixing bugs has the highest priority.

Give me some time and let's see how many packages will be broken. Then we can make better decision.

@eddelbuettel
Copy link
Member

Thanks, KK, but this may be different: we are not wondering what not longer compiles (== API change) but what, when not recompiled, no longer runs (== ABI change). This is also nasty, and we already required rebuild-due-to-ABI twice. We'd rather avoid it, but bugs are bugs...

@thirdwing
Copy link
Member Author

I think I know what you mean. Just let me do some research. Actually we are
facing a similar ABI problem in the company right now.
On Jul 7, 2015 6:39 PM, "Dirk Eddelbuettel" notifications@github.com
wrote:

Thanks, KK, but this may be different: we are not wondering what not
longer compilers (== API change) but what, when not recompiles, no longer
runs (== ABI change). This is also nasty, and we already required
rebuild-due-to-ABI twice. We'd rather avoid it, but bugs are bugs...


Reply to this email directly or view it on GitHub
#322 (comment).

@jjallaire
Copy link
Member

I don't think it will be ABI breaking if the entire implementation of
String is in a header file (everyone will just have their own
representation compiled into their modules, and the Rcpp library changing
underneath them won't affect this representation). Does that sound right
to everyone else?

On Tue, Jul 7, 2015 at 5:22 PM, Kevin Ushey notifications@github.com
wrote:

Good point -- having the String class inherit from the storage meta-class
would be an ABI breaking change, so this is definitely the less destructive
way of enforcing protection in the class.

I think this looks good. We might slate a time to move to using the
PreserveStorage metaclass when we're comfortable with ABI breakage.
@jjallaire https://github.com/jjallaire, @eddelbuettel
https://github.com/eddelbuettel?


Reply to this email directly or view it on GitHub
#322 (comment).

@eddelbuettel
Copy link
Member

Yes, good point.

@eddelbuettel
Copy link
Member

And FWIW Conrad tossed an Armadillo release candidate at me which I am currently rev.dep-testing, along with Rcpp 'HEAD'. All good so far with 28 out 129 built and only one R CMD check fail (for AFAIK as usual, a missing Suggests: not tested for).

@jjallaire
Copy link
Member

I think we might need to do a bit more here to make sure that copying and
assignment are handled correctly.

Whenever the object is destroyed we'll call Rcpp_ReleaseObject on the data.
I also noticed that in all forms of construction including copying we do a
fresh Rcpp_PreserveObject on the data. I haven't looked at
Rcpp_ReleaseObject and Rcpp_PreserveObject but if they are effectively
reference counted then this is fine (otherwise we need to provide some sort
of reference counting wrapper around the data so it doesn't get prematurely
released e.g. after a copy).

Assignment looks good to me via Rcpp_ReplaceObject.

@kevinushey Could you take one more look with the above issues in mind?

On Wed, Jul 8, 2015 at 8:42 AM, Dirk Eddelbuettel notifications@github.com
wrote:

And FWIW Conrad tossed an Armadillo release candidate at me which I am
currently rev.dep-testing, along with Rcpp 'HEAD'. All good so far with 28
out 129 built and only one R CMD check fail (for AFAIK as usual, a missing
Suggests: not tested for).


Reply to this email directly or view it on GitHub
#322 (comment).

@eddelbuettel
Copy link
Member

Once upon a time these two functions just set the magic bit R itself would consult before doing the release. As R is not (yet) using reference counting simply setting the bit again (and again and ...) should do no harm.

@jjallaire
Copy link
Member

But then when you unset the bit in a destructor are you freeing the other
instances that e.g. copies were made from?

On Wed, Jul 8, 2015 at 9:27 AM, Dirk Eddelbuettel notifications@github.com
wrote:

Once upon a time these two functions just set the magic bit R itself would
consult before doing the release. As R is not (yet) using reference
counting simply setting the bit again (and again and ...) should do no
harm.


Reply to this email directly or view it on GitHub
#322 (comment).

@romainfrancois
Copy link
Contributor

What on earth are you all talking about unsetting bits. No such thing is done by R_ReleaseObject, it just removes it from the R_PreciousList.

If that's the only place this SEXP was reachable from, then the GC will do its thing, but under no circumstance does Rcpp or R_ReleaseObject unsets bits.

Please show me otherwise if the implementation has evolved to tweaking with the internal bits, I'd like to see that.

@jjallaire
Copy link
Member

Based on that here is my concern:

  1. A copy of String is made here:
    https://github.com/thirdwing/Rcpp/blob/master/inst/include/Rcpp/String.h#L67-L70
    which calls R_PreserveObject on the same SEXP owned within the source
    String (now two String instances have called R_PreserveObject on the same
    SEXP).
  2. One of the two String instances is destroyed here:
    https://github.com/thirdwing/Rcpp/blob/master/inst/include/Rcpp/String.h#L136-L138

Does this effectively free the SEXP from underneath the String instance
that wasn't destroyed?

Seems like it does but I could easily be missing something.

On Wed, Jul 8, 2015 at 9:47 AM, Romain François notifications@github.com
wrote:

What on earth are you all talking about unsetting bits. No such thing is
done by R_ReleaseObject, it just removes it from the R_PreciousList.

If that's the only place this SEXP was reachable from, then the GC will do
its thing, but under no circumstance does Rcpp or R_ReleaseObject unsets
bits.

Please show me otherwise if the implementation has evolved to tweaking
with the internal bits, I'd like to see that.


Reply to this email directly or view it on GitHub
#322 (comment).

@romainfrancois
Copy link
Contributor

So it is preserved (by R_PreserveObject) twice. At this point it appears twice in the R_PreciousList. then released once, by R_ReleaseObject so it only appears once in the precious list, which prevents the GC to collect it.

This can be debugged, with some hackery (not cran proof stuff), you can get access to the precious list and traverse it yourself to check how many times a given SEXP appears ...

All of Rcpp would fall apart if it was not preserving the same SEXP multiple times, just think about a function that passes a NumericVector by value for example.

@jjallaire
Copy link
Member

Okay, very good point re: the rest Rcpp. I think we are good here.

On Wed, Jul 8, 2015 at 9:57 AM, Romain François notifications@github.com
wrote:

So it is preserved (by R_PreserveObject) twice. At this point it appears
twice in the R_PreciousList. then released once, by R_ReleaseObject so it
only appears once in the precious list, which prevents the GC to collect
it.

This can be debugged, with some hackery (not cran proof stuff), you can
get access to the precious list and traverse it yourself to check how many
times a given SEXP appears ...

All of Rcpp would fall apart if it was not preserving the same SEXP
multiple times, just think about a function that passes a NumericVector
by value for example.


Reply to this email directly or view it on GitHub
#322 (comment).

@thirdwing
Copy link
Member Author

So at least we have a temporary fix (not elegant).

@jjallaire I also think it will not be ABI-breaking, so give me some time to rewrite the String class, maybe this weekend.

@eddelbuettel
Copy link
Member

This passed a reverse-dependency check with only one new issue -- a seg.fault in package cccp by Bernhard which I'll look into separately -- so I will merge this now.

eddelbuettel added a commit that referenced this pull request Jul 20, 2015
Rcpp_PreserveObject/Rcpp_ReplaceObject to protect underlying SEXP in String
@eddelbuettel eddelbuettel merged commit 21baa12 into RcppCore:master Jul 20, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants