Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for recent Fedora and ASAN errors #3098

Merged
merged 14 commits into from Sep 29, 2017
Merged

Fix for recent Fedora and ASAN errors #3098

merged 14 commits into from Sep 29, 2017

Conversation

@krlmlr
Copy link
Member

@krlmlr krlmlr commented Sep 12, 2017

@krlmlr krlmlr requested review from hadley and lionel- Sep 12, 2017
@lionel-
Copy link
Member

@lionel- lionel- commented Sep 13, 2017

Should class OrderVectorVisitorImpl : public OrderVisitor takeVECTOR by const ref?

@krlmlr
Copy link
Member Author

@krlmlr krlmlr commented Sep 13, 2017

const& members seem dangerous because the class that holds the reference may live longer than the referenced object: 0b40356.

@lionel-
Copy link
Member

@lionel- lionel- commented Sep 13, 2017

Sorry I got confused while trying to decipher the stack trace. Most of the classes involved assign members by const ref, I wondered if having one class in the mix taking by value could create problems somehow but I guess that doesn't make sense.

The changes you are making in this patch seem to be in another method than the one reported in the SAN reports by the way. The report also indicates:

  • The faulty memory access occurs while subsetting the Rcpp integer vector of the slicing index.

  • The report says that "Memory access at offset 280 is inside this variable". This is a temporary object created while instantiating slice

  • There's a warning about a pointer overflow

    pointer index expression with base 0x000000000000 overflowed to 0xfffffffffffffff8
    

I think the temporary object is created when constructing VectorSliceVisitor<RTYPE>. Its constructor takes the vector by const ref but it is passed a SEXP. The implicit coercion creates a temporary object.

@lionel-
Copy link
Member

@lionel- lionel- commented Sep 13, 2017

So yes we should probably get rid of all const ref members because it seems all too easy to create hard-to-debug UB. The compiler will probably elide many copies anyway.

@hadley
Copy link
Member

@hadley hadley commented Sep 13, 2017

Alternatively, you could make them pointers to make the ownership more explicit.

@krlmlr
Copy link
Member Author

@krlmlr krlmlr commented Sep 14, 2017

I'd rather avoid naked pointers, I don't see how this makes ownership more explicit.

Thanks @lionel- for looking into this. I agree about removing const& members and replacing them by smart classes like Vector or perhaps shared_ptr, which are essentially pointers with reference count semantics. With C++11 the copy overhead should go away in many cases.

@lionel-
Copy link
Member

@lionel- lionel- commented Sep 14, 2017

I think Hadley was thinking about something like "scope ownership", i.e. making sure that const refs refer to objects that belong to a scope, which is problematic with implicit coercions. Pointers could help here. I agree that overloading the concept of ownership for this case is a bit confusing especially since we are talking about pointers.

I wouldn't be surprised if shared pointers have more overhead than simple values (copy elisions vs counting). I think the best thing to do is to try both and measure performance.

@lionel-
Copy link
Member

@lionel- lionel- commented Sep 14, 2017

An alternative or intermediate step would be to add an explicit keyword to constructors of all classes that wrap const refs.

@lionel-
Copy link
Member

@lionel- lionel- commented Sep 14, 2017

I have reviewed the headers and added the explicit keyword to all classes that wrap refs: krlmlr/dplyr@b-ub...lionel-:b-ub

In theory this should disallow temporary objects as well as copy initilisation (things like Comparer x = Comparer(y)). However in the second commit (krlmlr@1e0748b) I reintroduce both of those to check that the build fails and it does not. Any idea why @krlmlr?

@krlmlr
Copy link
Member Author

@krlmlr krlmlr commented Sep 14, 2017

I think you need, for the Compare_Single_OrderVisitor template class, either a private copy constructor or (in >= C++11) a copy constructor marked with = delete.

Inheriting from boost::noncopyable seems to be a safer way to forbid copying.

Also, there seems to be no need to mark constructors with != 1 arguments as explicit, because AFAIR this modifier only avoids automatic coercion.

@krlmlr
Copy link
Member Author

@krlmlr krlmlr commented Sep 14, 2017

Anyway, I'd rather resolve these issues separately. At this point we have UB which needs to be handled with as few changes as possible, to speed up the release process. I couldn't yet replicate the UBSAN problems locally, so CRAN is currently the only way to check if our fixes work.

@hadley
Copy link
Member

@hadley hadley commented Sep 14, 2017

CRAN accepted the last dplyr release so the fixed isn't needed urgently.

@krlmlr
Copy link
Member Author

@krlmlr krlmlr commented Sep 14, 2017

That's true, but we've been fighting this particular problem long enough, and now is a window where we can simply push an update addressing this with very little effort.

@krlmlr
Copy link
Member Author

@krlmlr krlmlr commented Sep 14, 2017

I could replicate the Fedora problems on Ubuntu + clang 5.0 (with -O3 only).

hadley
Copy link
Member

hadley commented on 3e9cf81 Sep 15, 2017

Isn't this a potentially costly change since we'll now be copying vectors of data? Or does Vector contain a SEXP?

krlmlr
Copy link
Member

krlmlr commented on 3e9cf81 Sep 15, 2017

Rcpp::Vector contains an SEXP (ref-counted via R_PreserveObject() and R_ReleaseObject()). No copies or duplication involved. It does fix the problem, but a more sustainable fix involves changing references to pointers in the affected constructors, which would make passing temporary objects much harder.

hadley
Copy link
Member

hadley commented on 3e9cf81 Sep 15, 2017

Ok, great.

@krlmlr krlmlr changed the title Potential fix for recent Fedora and ASAN errors Fix for recent Fedora and ASAN errors Sep 15, 2017
hadley
hadley approved these changes Sep 15, 2017
hadley
hadley approved these changes Sep 17, 2017
@krlmlr
Copy link
Member Author

@krlmlr krlmlr commented Sep 17, 2017

Okay to merge after 0.7.4 hits CRAN?

@hadley
Copy link
Member

@hadley hadley commented Sep 17, 2017

Yup

@krlmlr krlmlr merged commit 34b0c8a into tidyverse:master Sep 29, 2017
1 of 2 checks passed
@krlmlr krlmlr deleted the b-ub branch Sep 29, 2017
@lock
Copy link

@lock lock bot commented Jan 18, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Jan 18, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants