New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
when a column has not classes but has the object bit set, pretend it … #3394
Conversation
Perhaps can be deprecated once dependency on R passes if/when the underlying R bug is fixed? Certainly in that case the test here will fail; given that, according to the |
Yeah. We can leave the defensive code just in case, but disable the test once the issue is fixed upstream in R. |
sounds good. maybe add a comment to tag for later readers? not sure the best pipeline for dealing with something minor like this that will (hopefully) be hardly visible. |
tests/testthat/test-binds.R
Outdated
@@ -571,6 +571,13 @@ test_that("bind_rows() accepts lists of dataframe-like lists as first argument", | |||
expect_identical(bind_rows(list(list(a = 1, b = 2))), tibble(a = 1, b = 2)) | |||
}) | |||
|
|||
test_that("columns that are OBJECT but have NULL class are handled gracefully (#3349)", { | |||
mod <- lm(y~ ., data=freeny) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're trying to follow http://style.tidyverse.org/, the styler package gets most of it right with style_pkg(strict = FALSE)
.
inst/include/dplyr/Collecter.h
Outdated
if (!is_class_known(x)) { | ||
SEXP classes = Rf_getAttrib(x, R_ClassSymbol); | ||
SEXP classes = Rf_getAttrib(x, R_ClassSymbol); | ||
if (!Rf_isNull(classes) && !is_class_known(x)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to not implement this change in is_class_known()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
avoiding to do Rf_getAttrib(x, R_ClassSymbol)
twice.
while we are here, I don't see the point of
if (!OBJECT(x)) {
return false;
}
in inherits_from
which is only called when OBJECT(x)
is set.
And the std::set_intersection
looks a bit overkill, this could exit earlier, as soon as there is at least one class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this avoid querying the class attribute twice? I'm fine with removing the classes
variable in this function, fix is_class_known()
, and only query the classes when constructing the warning (we don't need to optimize for the failure case).
Happy to look at the other problems you mentioned in a separate PR/issue.
…wn`, `inherits_from` so as to avoid `SEXP classes =`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move checks into inherits_from()
.
I agree that the inheritance check in that function can be improved, if I were to reimplement it today I would:
- let
known_classes
be a set (or maybe sorted array) ofCHARSXP
- iterate manually over all elements of the
"class"
vector and do a lookup in the set/array
But I'm not sure it has an impact on the overall performance. Let's not spend too much time on that.
inst/include/dplyr/Collecter.h
Outdated
@@ -34,7 +34,7 @@ static bool is_class_known(SEXP x) { | |||
known_classes.insert("integer64"); | |||
known_classes.insert("table"); | |||
} | |||
if (OBJECT(x)) { | |||
if (OBJECT(x) && !Rf_isNull(Rf_getAttrib(x, R_ClassSymbol))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm starting to understand the problem here. If you want to avoid querying the class symbol and the OBJECT()
bit twice (and also clean up the code a bit), maybe this check could be added to line 17 in this file? I suspect inherits_from()
will fail if the class attribute is NULL
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inherits_from
and is_class_known
are only used once each anyway. it all looks kind of confusing. But I feel that this patch is good enough for fixing the problem, and we can revisit the code later.
I suspect inherits_from() will fail if the class attribute is NULL.
yes that was part of the initial problem.
Thanks! Could you please resolve conflicts and merge? |
This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/ |
…has not. closes #3349
This is pretty straightforward.
stats::model.frame
does weird stuff, and broom was an innocent bystander. Just added some defensive code.