-
-
Notifications
You must be signed in to change notification settings - Fork 695
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve std.typecons.Unique #3139
Conversation
Whenever D is brought up to the general programming public, the garbage collector is quickly raised as a point of contention. Regardless of how legitimate or well-informed these concerns are, it would be a massive public relations boon --- and great for the language, to boot --- if we could trot out a solid said of RAII-based smart pointers for those who prefer to use them. We have a solid start in std.typecons.Unique and std.typecons.RefCounted. Unfortunately, these classes seem to be victims of bit rot and compiler bugs of days long gone. An overview of the changes in this commit is as follows: - Unique's underlying data now uses malloc and free instead of the garbage collector. Given that many people use RAII smart pointers to escape the GC, it seems to make more sense to avoid it here. On a related note, isn't delete deprecated? The current destructor uses it. - std.algorithm.move is used instead of a special release member function. Whether by design or by happy accident, move transfers ownership between Unique pointers in a very similar manner to C++'s std::move with std::unique_ptr. Along with being a familiar paradigm to C++ users, using move to transfer ownership makes more intuitive sense and builds consistency with the rest of Phobos. - With std.algorithm.move transferring ownership, release now just frees the underlying pointer and nulls the Unique. - Unique.create is no longer compiled out using version(None). Regardless of whether or not there is language support for checking uniqueness, a utility function that creates a Unique, taking the same arguments as the underlying type's constructor, is extremely useful, as demonstrated by the addition of make_unique to C++14. - Because Unique.create is now in place and Unique is backed with malloc, constructors taking a pointer have been removed. This encourages the use of create as the idiomatic, consistent method to, well, create Unique objects. If one can only get a Unique by calling create or moving another into it, we also ensures uniqueness in one fell swoop. - A new method, get, returns the underlying pointer, for use in functions and code that do not play a role in the life cycle of the object. Smart pointers are as much about ownership semantics as they are about allocating and freeing memory, and non-owning code should continue to refer to data using a raw pointer or a reference.
immutable size_t allocSize = T.sizeof; | ||
|
||
void* rawMemory = enforce(malloc(allocSize), "malloc returned null"); | ||
u._p = cast(RefT)rawMemory; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appear to be having a segfault on classes that have a reference to their context, such as the one on the third unit test, on line 281. When I step through create
in GDB, rawMemory
is correctly set to our newly allocated memory, but after stepping past this line, u._p
is some bad address, such as 0x14. It's like the assignment is ignored or the cast messes with the address somehow. Is this some failing of my understanding? The D site says that "Casting a pointer type to and from a class type is done as a type paint (i.e. a reinterpret cast)." so I'm not sure how this is happening.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Breakpoint 1, std.typecons.Unique!(std.typecons.__unittestL276_3().C).Unique.create!().create() (
__HID19=0x7fffffffe0d8) at std/typecons.d:101
101 u._p = cast(RefT)rawMemory;
(gdb) print rawMemory
$1 = (void *) 0x92d860
(gdb) n
111 void[] init = typeid(T).init[];
(gdb) print u._p
$2 = (struct std.typecons.__unittestL276_3.C *) 0x7fffffffe060
ಠ_ಠ
Great to see some effort here. One quick thing that caught my eye is |
My rationale behind doing so was that if we remove constructors accepting I agree that making |
else | ||
immutable size_t allocSize = T.sizeof; | ||
|
||
void* rawMemory = enforce(malloc(allocSize), "malloc returned null"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enforce
wants to allocate to throw an exception, so you can't use it with malloc
.
Do:
void* rawMemory = malloc(allocSize);
if(!rawMemory)
onOutOfMemoryError();
(See also #3031)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! For now I'll do this, and once both this and #3031 are hopefully merged, we can move over to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JakobOvrum After a bit of review, I recall that I was following a pattern noticed elsewhere in Phobos:
~/s/d/phobos [v2.067.0 ?] % ag 'enforce\(malloc'
std/container/array.d
485: auto p = enforce(malloc(sz));
std/regex/package.d
565: _memory = (enforce(malloc(size))[0..size]);
641: _memory = (enforce(malloc(size))[0..size]);
671: void[] memory = enforce(malloc(size))[0..size];
std/stdio.d
346: _p = cast(Impl*) enforce(malloc(Impl.sizeof), "Out of memory");
std/typecons.d
4053: _store = cast(Impl*) enforce(malloc(Impl.sizeof));
std/uni.d
1745: auto ptr = cast(T*)enforce(malloc(T.sizeof*size), "out of memory on C heap");
6875: auto p = cast(ubyte*)enforce(malloc(raw_cap));
6917: ubyte* p = cast(ubyte*)enforce(malloc(3*(grow+1)));
Should these be corrected in another PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#3031 fixes a few of them but focuses on more important changes, but yes, we'll fix all of them eventually.
I picked up the trick of converting a pointer into an array using the [0 .. size] syntax from std/regex/package.d. Unique.create is still segfaulting, but this seems to be an issue with the class version of emplace regardless of this Unique work. The bug can be found here: https://issues.dlang.org/show_bug.cgi?id=14402
The segfaults I've been getting seem to be caused by a separate issue in |
Not entire at the very least. For me personally it is not even important point. The way I see it, best thing that can happen with It is still rather far away but I'd prefer to avoid any hard requirements about internal memory management. Though that does seem a bit out of scope of this PR so no real objections as long as you keep that in mind in the long term. |
It is currently impossible (or so it seems) to use malloc and emplace to create a nested class or struct, so we'll return to using the GC for now. We'll also restore the constructors that take a RefT, as using new _inside_ the context of the nested class or struct is apparently the only way to create one currently.
Alright. So, I've done a bit of exploring and realized that context-aware objects cannot be created with So we're all on the same page, this PR now makes the following changes:
|
Instead of hiding this dangerous operation in an unassuming, anonymous constructor, we should probably formalize it in a named, greppable constructor function. Ideally it would be named |
@JakobOvrum - I couldn't agree more. I only left those constructors because they were there already, and I was going under the assumption that fewer breaking changes would mean less debate to get this PR accepted. I'm more than happy to remove those constructors and work them into something more... excplicit in its stated intentions. |
Alright, this is a breaking change but |
Needs an entry in changelog with migration instructions in that case |
I can provide those, along with some better DDoc documentation in |
btw I'd really love |
From the related pull request (#3139), there seems to be a general consensus that it is more important to do Unique "right", even if that means breaking changes, so long as there is a clean migration path. With that in mind, I have made the following additional changes: - Instead of constructors that take a RefT, Uniques can now be created one of two ways: via .create or .fromNested. See the DDocs of both for details. - opDot is replaced with "alias _p this". A cursorty Google search indicates that opDot is deprecated and that alias this is the preferred method. Like C++'s unique_ptr, Unique now enjoys pointer-like operations (such as dereferencing), but cannot be set to null or assigned from a different pointer due to opAssign and the disabled postblit constructor. - Consequently, isEmpty has been removed. Instead, just use is null as you would with a pointer. - Removal of redundant unit tests - Various comment and unit test cleanup
There seems to be a general consensus that it is more important to do
|
Why create? We don't have any create in Phobos. The common pattern is to use a normal constructor and maybe add a lowercase |
I just made a related patch for RefCounted, #3171. |
But assumeUnique does something very different, it casts to immutable. It was named after the precondition for converting something to immutable. |
|
||
To ensure uniqueness, be sure to provide the direct result of $(D new). | ||
|
||
Example: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can make this example a documented unittest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a corresponding unit test - see https://github.com/mrkline/phobos/blob/better-unique/std/typecons.d#L225-L236
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, but if you make it a documented unittest, you don't have to copy the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, fantastic! I wasn't aware of this feature. Thanks much.
Which is exactly my main problem with |
I used As @Dicebot discussed, I'm not touching |
@MartinNowak A thought just occurred to me - |
Allows you to dereference the underlying $(D RefT) | ||
and treat it like a $(D RefT) in other ways (such as comparing to null) | ||
*/ | ||
alias _p this; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both of those functions can escape references making Unique unsafe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, they can, and perhaps I should add a comment saying as much. However, this is not inherently bad. Unique
conveys ownership, and non-owning functions shouldn't pass around smart pointers if they aren't going to play a role in the underlying object's lifetime.
See this bit (starting at the 14 minute mark or so) of a recent Herb Sutter talk. Yes, it's about C++, but the exact same semantics are at play here and he discusses it extensively. He makes these points much more eloquently than I do, so I hope you take the time to watch it, but to summarize, doing this bakes the ownership semantics of calls directly into the function signatures. Observe:
void consume(Unique!T u) // Takes ownership of u
void poke(ref T u) // Uses u but plays no impact on its lifetime
void opt(T* u) // u is optional; has no impact on its lifetime
If you're still not convinced, another important point is that a non-owning function shouldn't care how ownership of an object is being handled.
void antipatternIfNotOwning(ref Unique!T u)
is just fine to tell the user "this function may reseat the ownership of the T
". But it's just bad if the function doesn't affect ownership. It demands a certain means of ownership (as opposed to GC, or RefCounted
, or stack-allocated) despite not actually caring, it's more verbose, and its intentions aren't as clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C++ is hardly a decent example to follow. The stuff they are forced to use here is all about using convention in absence of working compile-verified solution. I believe D can and should do better in this regard.
Allowing @safe
escaping of unique reference is absolutely unacceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you take the time to watch the linked section of the talk? D can (unlike C++) do compile-time checks for safety using @safe
, but since it can't verify life cycles at compile time (a la Rust), that doesn't get around the inherent ownership issues at hand.
If we want to mark this as unsafe, that's fine, but I still strongly believe it's necessary. What's the alternative? Passing ref Unique!T
around? That's problematic for the reasons listed above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes and I have been using that approach in C++ for ages. It is different in D though as we actually promise @safe
code to be always memory-safe if it compiles. No conventions, no style rules - if it compiles, it must be good to go. This can't be compromised no matter what, even simply banning all borrowing of Unique
would be more practical approach.
And making Unique
@system
is just delaying another full rewrite of the utility.
Also please note there are actually various proposals about enforcing borrowing / lifetime semantics at compile-time in a way similar to Rust, with http://wiki.dlang.org/DIP69 being primary candidate for getting implemented. I haven't checked it in a while but it should allow adding scope ref T borrow() { return _p; }
method while keeping it all @safe
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Point taken. My apologies if my tone was somewhat... irreverent.
So, what's the plan? Drop get
and just pass Unique
s by ref? And what should we do about opDot
? I only moved away from it because there was no documentation for it anywhere, and several posts said this was because it was deprecated for alias this
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What Dicevot says. I know that talk btw.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No offense, I simply wanted to stress the point how important guarantess of @safe
are.
Another alternative for opDot
is using std.typecons.Proxy
: https://github.com/D-Programming-Language/phobos/blob/master/std/typecons.d#L4821-L4829
I don't know how robust it is in practice but is supposed to provide wrappers for all members without allowing implicit conversion (like alias this
does).
As for get
- yes, I am tempted to just drop it completely for now and add better facility once DIP69
or similar is implemented. But this is a delicate topic, I wonder what Phobos maintainers think about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at DIP25 we should implement this like so.
ref T get() return { return *_p; }
alias get this;
That should be safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
We could make |
Yeah making |
It's not an issue of classes vs anything else, it's an issue of types with indirection. |
@JakobOvrum is it? Indirection on its own should be fine as long as that data is |
@Dicebot, right, I don't think I've wrapped my head around DIP25 completely yet. Regarding C heap vs GC heap; if we want to allow transferring |
Another observation regarding cross-thread moves - the uniqueness of That means not just any Does this make sense? |
Can we please keep that highly speculative GC discussion out of here?
Makes sense, and once we merge this we should make std.concurrent aware of Unique. |
Unique!C uc = new C; | ||
Unique!Object uo = uc.release; | ||
Unique!C uc = unique!C(); | ||
Unique!Object uo = move(uc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think that polymorph conversion works, should turn this into a documented unittest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, it's actually is supposed to work, nice.
Looks pretty good already, just a few more details. |
Alright, I added the suggested docs and made |
They don't go well together.
Auto-merge toggled on |
Improve std.typecons.Unique
Thanks, I'll follow up with RefCounted support for classes and factoring out |
Revert "Merge pull request #3139 from mrkline/better-unique"
Whenever D is brought up to the general programming public, the garbage collector is quickly raised as a point of contention. Regardless of how legitimate or well-informed these concerns are or are not, it would be a massive public relations boon — and great for the language, to boot — if we could trot out a solid set of RAII-based smart pointers for those who prefer to use them. We have a solid start in
std.typecons.Unique
andstd.typecons.RefCounted
. Unfortunately, these seem to be victims of bit rot and compiler bugs of days long gone.This is the first of several pull requests that attempt to clean these up. An overview of the changes in this request is as follows (updated 2015-04-21):
Unique
usesmalloc
andfree
instead of the GC for backing memory storage. Unfortunately this rules out nested types for the time being (asemplace
cannot set the frame pointer for closures).std.algorithm.move
is used instead of a specialrelease
member function. Whether by design or by happy accident,move
transfers ownership betweenUnique
pointers in a very similar manner to C++'sstd::move
withstd::unique_ptr
. Along with being a familiar paradigm to C++ users, usingmove
to transfer ownership makes more intuitive sense and builds consistency with the rest of Phobos.std.algorithm.move
transferring ownership,release
is deprecated.Unique.create
has transformed into a freestandingunique
function. Regardless of whether or not there is language support for checking uniqueness, a utility function that creates aUnique
, taking the same arguments as the underlying type's constructor, is extremely useful, as demonstrated by the addition ofmake_unique
to C++14.get
, returns the underlying pointer, for use in functions and code that do not play a role in the life cycle of the object. Smart pointers are as much about ownership semantics as they are about allocating and freeing memory, and non-owning code should continue to refer to data using a raw pointer or a reference.This pull request is not meant to be merged prima facie, but to initiate a dialogue. I strongly believe this is a good step in the right direction and would love to hear commentary from the core devs. Perhaps some of the implementation details need to be hashed out some more, but we can all agree that a stronger showing from the smart pointers in
std.typecons
will only bolster D and Phobos.