Skip to content

Conversation

@eddelbuettel
Copy link
Member

Closes #1429

As detailed by @ltierney in a post to the r-devel list two days ago, and as tracked in issue #1429, ATTRIB() may go away for the R 4.6.0 cycle. We use it in three spots, and this PR replaces the use by employing the (brand-new, had to re-build the 'r-devel' container we use here in CI as its last build on Monday did not reflect it yet) R_mapAttrib() which treats attributes as an opaque object and 'walks' a user-supplied function over it.

Checklist

  • Code compiles correctly
  • R CMD check still passes all tests
  • Preferably, new tests were added which fail without the change
  • Document the changes by file in ChangeLog

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR replaces the use of the deprecated ATTRIB() macro with the new R_mapAttrib() API introduced in R 4.6.0. The ATTRIB() macro is being removed in R 4.6.0 as announced on the r-devel mailing list, requiring this update for forward compatibility.

Key Changes:

  • Introduced two new helper callback functions (get_attr_names and get_row_count) that work with R_mapAttrib() to iterate over attributes
  • Updated attribute-related methods to use version-conditional compilation, maintaining backward compatibility with R < 4.6.0
  • Replaced direct attribute list traversal with the new opaque attribute handling approach

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/utilities.cpp New file containing helper callback functions get_attr_names and get_row_count for use with R_mapAttrib()
src/rcpp_init.cpp Registers the two new helper functions to make them callable via the R API
inst/include/Rcpp/routines.h Adds function declarations and inline wrappers for the new helper functions
inst/include/Rcpp/proxy/AttributeProxy.h Updates attributeNames() and hasAttribute() methods to use R_mapAttrib() for R >= 4.6.0 while preserving old implementation for earlier versions
inst/include/Rcpp/DataFrame.h Rewrites nrow() method to use R_mapAttrib() instead of direct ATTRIB access, with version-conditional compilation
DESCRIPTION Updates version number and date
ChangeLog Documents all changes made in this PR

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Member

@Enchufa2 Enchufa2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would simplify nrow and hasAttribute as suggested above. attributeNames is the only one that requires R_mapAttrib, but it may be further simplified if Luke adds R_getAttribNames, so I would just inline an anonymous visitor there for now, instead of creating another routine, another compilation unit...

@ltierney
Copy link

On the hasAttribute and attributeNames usage: If we added Rf_existsAttribute and R_getAttribNames one issue is how to handle the implicit names attribute for pairlist and language objects. Adding the implicit attribute to the names and returning TRUE for exists would be most consistent with the R level and with Rf_getAttrib, but would affect code now looking only at ATTRIB. Any thoughts from your end?

@Enchufa2
Copy link
Member

On the hasAttribute and attributeNames usage: If we added Rf_existsAttribute and R_getAttribNames

This would be very convenient for us.

one issue is how to handle the implicit names attribute for pairlist and language objects. Adding the implicit attribute to the names and returning TRUE for exists would be most consistent with the R level and with Rf_getAttrib, but would affect code now looking only at ATTRIB. Any thoughts from your end?

Since ATTRIB is not part of the API from now on, this is not a problem.

@Enchufa2
Copy link
Member

Enchufa2 commented Dec 22, 2025

Oh, you meant that, from now on, pairlists and language objects would return a new attribute names compared to previous versions of Rcpp, right? It's highly unlikely this breaks any existing code, because I don't think many people use Rcpp to manipulate pairlists and language objects.

@eddelbuettel
Copy link
Member Author

eddelbuettel commented Dec 22, 2025

All good. I mostly wanted to see if I could get away from ATTRIB now as we have the next upload coming up. We can likely simplify further. Too bad I didn't 'click' on the already possible simpler nrow() (esp as I looked at do_shortRowNames() just below it !!). Also, getting Rf_existsAttribute and R_getAttribNames sounds promising and would have simplified this now. We will have to check if something is affected by returning an attribute names, that is likely an empirical question.

@eddelbuettel
Copy link
Member Author

The thought of going the column length for a simpler nrow occurred to me too (though I didn't "finish" the thought with a four-line implementation as @ltierney suggested). Somehow going with R_RowNamesSymbol) feels cleaner. Using it with Shield<> makes it a two-liner

Shield<SEXP> rn = Rf_getAttrib(Parent::get__(), R_RowNamesSymbol);
return Rf_length(rn);

which stays closer to R itself at the cost of the one allocation in Shield. I believe we can use Rf_xlength() here to and return the R_xlen_t. We 'upgraded' from int before without issues. Of course, needs a reverse dependency check too.

On the other hand rewriting what @ltierney suggested as follows (with a second XLENGTH, and a missing )) is pretty tempting too

SEXP x = Parent::get__();
// We can simplify: zero row DFs have no names, else length of all vector same
// by design constraint so can use vector length for desired row count
return (XLENGTH(x) == 0) ? 0 :  XLENGTH(VECTOR_ELT(x, 0))

All those variants pass our 1600+ tests (of which several dozen hit data.frame objects).

@ltierney
Copy link

On the other hand rewriting what @ltierney suggested as follows (with a second XLENGTH, and a missing )) is pretty tempting too

SEXP x = Parent::get__();
// We can simplify: zero row DFs have no names, else length of all vector same
// by design constraint so can use vector length for desired row count
return (XLENGTH(x) == 0) ? 0 :  XLENGTH(VECTOR_ELT(x, 0))

Unfortunately my comment is wrong: you can have a data frame with zero columns and non-zero rows:

data.frame(row.names = 1:3)

So for now going with R_RowNamesSymbol is probably best until we figure out whether Rf_nrows can be modified to handle this or maybe add another function that does. Definitely seems like something that should be in the API.

@eddelbuettel
Copy link
Member Author

Thanks so much for checking. The other two-liner relying on Shield<SEXP> for the one allocation here is much safer, and closer to what R does.

I am about to commit the re-done attribute proxy using two lambda visitors too. I still think copilot is wrong and that row.names always the last element meaning we need all of them anyway but heck, on the chance it gains at least sometimes it is worth it.

Copy link
Member

@Enchufa2 Enchufa2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@eddelbuettel
Copy link
Member Author

Those were really helpful by all -- big thanks to you all: @ltierney @Enchufa2 @JanMarvin !

@eddelbuettel eddelbuettel merged commit 2ac55e0 into master Dec 22, 2025
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Replace ATTRIB in three places

5 participants