Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempt to factor out cast_to_layout_compatible #13

Merged
merged 13 commits into from Oct 12, 2017

Conversation

cbeck88
Copy link
Contributor

@cbeck88 cbeck88 commented Sep 19, 2017

Hi,

Here's my attempt at implementing the idea I described in issue #12

I had some issues testing it -- even without my commit, I am on ubuntu 16.04 xenial and I don't have a c++14 standard library on my system, so I don't have some things like std::less<void> and std::greater<void> which are used in some of the tests.

However, the same set of tests are passing and failing before and after my commit, using gcc 5.4 and clang 4.0, nor do I have new warnings. So I think this commit is at least close to correct.

(As a minor complaint -- it would be nice if there were a simple script to run the tests. To figure out how to run them locally I had to sort of reverse engineer from the travis.yml file. I guess that the test mechanism relies on the boost repo and so can't be run without cloning boost.)

I would like to test also on my macbook, and I would also like to do some tests with godbolt to inspect the generated assembly and make sure this is not significantly different in terms of efficiency.

However, I don't have time to do that stuff until about Thursday, so you might want to hold off until then. I will try to test it with c++14 conforming standard library at that time.


The commit message describes the changes I made. Basically, where previously we used cast_to_layout_compatible and then make_tuple_of_references, what I tried to do was, generalize make_tuple_of_references so that it takes instead an arbitrary object, together with a getter that can implement get<i> on that object.

Then, for cases when you needed to apply this to a tuple, there is a sequence_tuple_getter which wraps sequence_tuple::get. And for user-defined structures, there is an offset_based_getter, which takes as template parameters the user-structure S, and the sequence_tuple<T1, T2, ...> generated by using the ubiqs. Assuming that the sequence_tuple is indeed layout compatible with S, the offset_based_getter I believe should optimize to code that is just as good as cast_to_layout_compatible, and I don't think it should violate the standard anywhere. There is some more technical discussion in the commit message.


Another thing to be considered, now that I think of it, is that maybe the cast_to_layout_compatible should be kept around and enabled by use of a flag -- that way if on some compilers this change causes a regression, people can revert to the old way.

Cheers,
Chris

The idea here is to try to remove "cast_to_layout_compatible", which
technically violates strict aliasing rules. Instead, we want to try
to compute something like "offsetof" at compile-time for each struct member,
and use pointer arithmetic and reinterpret casts to do this legally.

The basic idea is to take the function "make_flat_tuple_of_references"
and generalize it, so that it need not actually work with tuples.
Instead, it can work with any class type, provided that it is also given
a "getter" function object which it can use to do "get<idx>".

We provide a trivial getter, which calls sequence_tuple::get.

We also provide a "offset_based_getter" which can work with user-defined
structure types, provided we are given a layout-compatible tuple as a template
parameter.

The basic idea is to take the layout-compatible tuple, then replace all the
members with corresponding `std::aligned_storage_t`. This doesn't change the
layout, but it makes everything trivially constructible and constexpr and all that.

Then we can construct that on the stack, and take differences between the main tuple
address, and the members' addresses.

At time of writing the tests are passing for me, but I haven't tested with compilers
other than gcc, or examined assembly to see if this method is being optimized the
way I hope it is.

This commit changes both core14_classic and core14_loophole to use the offset_based_getter,
but until further testing it might have been better to keep the other method around also.
@cbeck88 cbeck88 changed the title Attempt to resolve issue #12 Attempt to factor out cast_to_layout_compatible Sep 19, 2017
@cbeck88
Copy link
Contributor Author

cbeck88 commented Sep 19, 2017

I played around with some sample code on godbolt to see how well this technique optimizes. It seems to get totally optimized by gcc 4.9 and clang 3.5, but on MSVC I couldn't get it to compile, and ICC didn't optimize it fully. I will have to investigate msvc issues later when I come back to this. Here's my code sample, based on your sequence_tuple and my offset_based_getter: godbolt

Edit: Here's a reduced case that only has the sequence_tuple::tuple. It doesn't seem to work on icc 16 or icc 17, so I'm not worried that the offset_based_getter doesn't work for those. But I do need to figure it out for msvc.

@apolukhin
Copy link
Member

This looks very good. I'll fix some other issues and come back to your pull request.

Try to fox the tests untill then (make a member template function get accept an additional size_t_<I> parameter. This will remove the need in template and <I> in .template get<I>. This may bypass the compiler related issues).

@cbeck88
Copy link
Contributor Author

cbeck88 commented Sep 28, 2017

It looks like there are still some issues, binding rvalues to non-const references in make_flat_tuple_of_references... I must have changed something subtle by accident. Also the enum test is not passing, but I remember that I changed cast_to_layout_compatible to a static_cast in the enum handling, so maybe that was an incorrect change.

i didn't think intuitively that these changes were significant but they
appear to be necessary to pass the tests
this mirrors the earlier practice of commenting this out in
cast_to_layout_compatible
this partially reverts a subtle change to static_cast in
551b36a

the reason that is wrong is that the static_cast produces a new value,
while we need to produce a reference to the underlying type of the enum
essentially.
@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 95.652% when pulling 3aba48c on cbeck88:develop into 1bf21ae on apolukhin:develop.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 95.652% when pulling af8be96 on cbeck88:develop into 1bf21ae on apolukhin:develop.

@cbeck88
Copy link
Contributor Author

cbeck88 commented Sep 28, 2017

Hmm -- I can't see a way to avoid using cast_to_layout_compatible with enums.

Do you think it's okay to just return a reference to the enum? So that make_tuple_of_references would contain references to the enum and not the underlying type.

I guess that this breaks compatibility with previous behavior, but I don't see a way to get a reference to the underlying type without breaking strict aliasing. Gonna think about it some more

this might be useful if on some compilers, cast_to_layout_compatible
leads to better codegen
@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 95.652% when pulling 96b7c38 on cbeck88:develop into 1bf21ae on apolukhin:develop.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 95.664% when pulling 29ab252 on cbeck88:develop into 1bf21ae on apolukhin:develop.

@cbeck88
Copy link
Contributor Author

cbeck88 commented Sep 28, 2017

I figured out why it doesn't compile on msvc. It looks to me like there is some problem with the msvc std::aligned_storage which causes it not to be default constructible, or something like this. When I use a naive definition, it seems to compile and optimize completely for all compilers

@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 95.664% when pulling d37163c on cbeck88:develop into 1bf21ae on apolukhin:develop.

@cbeck88
Copy link
Contributor Author

cbeck88 commented Sep 30, 2017

I wonder if it makes sense to essentially offer the "precise" reflection interface to C++14 users, based on loophole implementation, and with the offset_based_getter? Because then we probably can give a totally strict-aliasing friendly implementation, unless I'm missing something. We could leave the flat reflection implementation using the cast_to_layout_compatible for structs and enums, since afaik you need to do that for enums anyways.

@apolukhin apolukhin merged commit 6814449 into boostorg:develop Oct 12, 2017
@apolukhin
Copy link
Member

Super awesome pstch! Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants