This repository has been archived by the owner on Oct 12, 2022. It is now read-only.

Add destroy overload for pointers #2115

Closed

Contributor

@JinShil JinShil commented Feb 26, 2018

This PR adds an overload for destroy that automatically dereferences pointers in an attempt to make the usage of destroy more intuitive (at least to some).

This is an alternative to #2112. See discussion there, and the discussion starting at dlang/dmd#7947 (comment) for context.

Also see http://ddili.org/ders/d.en/memory.html#ix_memory.destroy

Caveat:
To keep the contract of destroy, it also destroys the pointer itself, reinitializing it to null. The user will therefore no longer have a valid pointer to pass to GC.free. A user who does want to call GC.free will have to call destroy(*ptr); themselves, which destroys the object referenced by the pointer rather than the pointer itself.
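
The caveat above can be sketched as follows. This is a minimal illustration, not code from the PR: `Resource` and `destroyThenFree` are hypothetical names, and the sketch assumes that `GC.free` releases the block without running any finalizer, so the destructor has to be triggered explicitly through `destroy(*p)`.

```d
import core.memory : GC;

// Hypothetical type that counts how often its destructor runs.
struct Resource
{
    static int dtorCount;
    int id;
    ~this() { ++dtorCount; }
}

// Destroys the pointee (not the pointer), then releases its GC block.
// Returns the number of destructor calls that happened on the way.
int destroyThenFree()
{
    Resource.dtorCount = 0;
    Resource* p = new Resource(42);

    destroy(*p);        // runs ~this on the pointee; p itself stays valid
    assert(p !is null);

    GC.free(p);         // assumed not to finalize the block again
    p = null;           // do not keep a dangling pointer around
    return Resource.dtorCount;
}

unittest
{
    assert(destroyThenFree() == 1);
}
```

Had `destroy(p)` been used with the overload proposed here, `p` would already be null by the time `GC.free` is reached, which is the situation the caveat describes.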

If this is merged, #2112 is not needed.

cc @schveiguy @wilzbach

@dlang-bot
Contributor

Thanks for your pull request, @JinShil!

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Member

@schveiguy schveiguy left a comment

I'm a bit uneasy with adding private symbols like isAggregateType without the underscore to object.d, simply because it pollutes the namespace of every module. I still am not sure this is going to fly when doing recursive destruction.

src/object.d Outdated
template _isStaticArray(T : U[N], U, size_t N)
private enum bool isPointer(T) = is(T == U*, U) && !isAggregateType!T;

private enum bool isAggregateType(T) = is(T == struct) || is(T == union) || is(T == class) || is(T == interface);
Member

Not sure if this is a good idea. private doesn't mean it's not visible, just that you can't use it.

Contributor Author

I'm sorry, but I don't understand. Are you saying there will be name collisions even if they can't be used?

Contributor Author

Can you provide an example that illustrates the problem?

Member

It appears that this is no longer a problem. It used to be. So objection withdrawn.

Member

It still might confuse, because isAggregateType is in phobos. If you forget to include the correct module, you may still see a private symbol error. But this isn't as bad as it was before, where it would interfere with public symbols.

destroy(s);
assert(s is null);
assert(S.dtorCount == 1);
}
Member

What I would like to ensure somewhere (doesn't have to be a ddoc unit test), is that the way destroy is implemented, it's not going to recursively follow pointers.

So for instance:

struct S
{
   int *x;
}

S* s = new S;
int x = 5;
s.x = &x;
destroy(s);
assert(x == 5);

And if I'm reading the code correctly, I think this is going to fail.

Contributor Author

@JinShil JinShil Feb 26, 2018

Just tested the following using the implementation in this PR:

void main()
{
    struct S
    {
        int *x;
    }

    S* s = new S;
    int x = 5;
    s.x = &x;
    destroy(s);
    assert(x == 5);
}

There is no assertion failure.

Contributor Author

Added unit test for this scenario

Member

Cool, thanks. Yeah, I thought _destructRecurse was going to destroy recursively, but it correctly just calls the __xdtor of the struct.

@schveiguy
Member

I'm ok with this or #2112. I prefer this one, but I think we need @andralex approval for the change in behavior.

Member

@PetarKirov PetarKirov left a comment

Looks good, modulo the union handling.

@@ -2621,7 +2621,7 @@ unittest
 }

 private void _destructRecurse(S)(ref S s)
-if (is(S == struct))
+if (is(S == struct) || is(S == union))
Member

I don't think this can work for unions whose members have conflicting values of T.init, or any union members with non-trivial ctor/dtors.

Contributor Author

The union support was added after Jenkins failed due to this code:

https://github.com/BBasile/iz/blob/a83d3bdcf49d74fa6f0e7d1eddfc0cba1f65ad4a/import/iz/memory.d#L715-L724

@bbasile Do you have any thoughts on this?

Contributor Author

It appears destructors are not allowed for unions:

union U
{
    bool b;
    uint u;
    ~this() { }  // Error: destructor `onlineapp.U.~this` destructors, postblits and
                 //        invariants are not allowed in union U
}

So, we should probably disallow destroy for unions, emitting a compile-time error if it is attempted. Thoughts?

Member

Agreed. Users should manually specify the union member they want to destroy.
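
A rough sketch of what "error for unions, destroy the member yourself" could look like in user code. Everything here is hypothetical: `destroyNoUnion` is an illustrative wrapper rather than the druntime implementation, and `Handle`, `Payload` and `Tagged` exist only to show a discriminating field in the containing aggregate.

```d
// Hypothetical resource whose destructor tolerates running on Handle.init.
struct Handle
{
    static int closed;
    int fd;
    ~this() { if (fd != 0) { ++closed; fd = 0; } }
}

union Payload
{
    int number;
    Handle handle;
}

// Illustrative wrapper: rejects unions at compile time, forwards otherwise.
void destroyNoUnion(T)(ref T obj)
{
    static assert(!is(T == union),
        "destroy is ambiguous for unions; destroy the active member instead");
    destroy(obj);
}

// The containing aggregate knows which union member is live.
struct Tagged
{
    bool holdsHandle;
    Payload payload;

    void release()
    {
        if (holdsHandle)
        {
            destroyNoUnion(payload.handle); // destroy only the active member
            holdsHandle = false;
        }
    }
}

// Releases the same value twice; returns how many handles were closed.
int releaseTwice()
{
    Handle.closed = 0;
    Tagged t;
    t.payload.handle.fd = 3;
    t.holdsHandle = true;
    t.release();
    t.release(); // second call is a no-op thanks to the discriminator
    return Handle.closed;
}

unittest
{
    Payload u;
    static assert(!__traits(compiles, destroyNoUnion(u)));
    assert(releaseTwice() == 1);
}
```

The design choice is that only `Tagged` can decide which member to destroy; the generic function merely refuses to guess.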

}

// Ensures pointers are not recursively followed
nothrow @system unittest
Member

I think adding pure here makes sense, given that destroying structs with trivial destructors is completely deterministic and won't cause access to any shared mutable state.

src/object.d Outdated

private enum bool isAggregateType(T) = is(T == struct) || is(T == union) || is(T == class) || is(T == interface);

private template isStaticArray(T : U[N], U, size_t N)
Member

Replace with:

private enum isStaticArray(T) = is(T : U[N], U, size_t N);


unittest
{
static union uni {
Member

@PetarKirov PetarKirov Feb 27, 2018

As mentioned above, I believe this would be a misfeature, as only user code can know how to correctly destroy a union, usually by means of a discriminating field in the containing aggregate. I think the behavior implemented here is "the last union member wins", i.e. the last union member will determine the value of the whole union.

Also, here's an example which should never compile:

import core.stdc.stdlib : free;

struct Str
{
    char* s;
    ~this() { free(s); }
}

struct Num
{
    size_t value = 0xDEAD_BEEF;
}

union U
{
    Num number;
    Str string;  
}

U u;
u.number.value = 42;
destroy(u);

In all likelihood the code above will end up attempting to free the memory at address 0xDEAD_BEEF, after setting u.number.value to this value.

@schveiguy
Member

Not sure why, but it appears freebsd32 keeps timing out on some test.

@JinShil
Contributor Author

JinShil commented Feb 27, 2018

it appears freebsd32 keeps timing out on some test

See this: braddr/d-tester#70

@andralex
Member

I think we should go a different way with destroy. Starting from the documented behavior in https://dlang.org/phobos/object.html#.destroy:

"Destroys the given object and puts it in an invalid state. It's used to destroy an object so that any cleanup which its destructor or finalizer does is done and so that it no longer references any other objects. It does not initiate a GC cycle or free any GC memory."

First comment: "puts it in an invalid state" is a very, very lax terminology that provides no guarantee of the value of the object thereafter. So an observation is that for non-aggregate types we can simply delete the line https://github.com/dlang/druntime/blob/master/src/object.d#L3150. Even structs without a destructor we don't need to reinit - just leave them be.

I think a better wording is: "Destroys the given object. It's used to destroy an object so that any cleanup which its destructor or finalizer does is done and so that it no longer references any other objects. It does not initiate a GC cycle or free any GC memory. If the object has a destructor then it is default-initialized from T.init, otherwise it is left unchanged."

Second matter: Some effort (which this PR is taking a bit further) has been invested in making destroy generic in the sense that destroy(x) can be called uniformly against x regardless of its type. However, that's a C++ism because a D coder can always write static if (is(typeof(destroy(x)))) destroy(x); so there's no need for destroy overloads that do nothing interesting. Going with this philosophy, we'd do well to only keep destroy for class, interface, struct, and static array objects (morally equivalent to structs) and deprecate the overload for built-in types including pointers. That obviates the rationale of this PR and eliminates useless overloads that do no interesting work.
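
The static if idiom in the paragraph above could be sketched like this. The names `strictDestroy`, `cleanup` and `Tracked` are hypothetical; `strictDestroy` merely stands in for a destroy that is restricted to class, interface, struct and static array types, which is not how druntime's destroy is constrained today.

```d
// Hypothetical type that counts destructor runs.
struct Tracked
{
    static int dtorCount;
    ~this() { ++dtorCount; }
}

// Stand-in for a restricted destroy: only types where real work may happen.
void strictDestroy(T)(ref T obj)
if (is(T == struct) || is(T == class) || is(T == interface)
    || __traits(isStaticArray, T))
{
    destroy(obj);
}

// Generic code opts in explicitly instead of relying on no-op overloads.
void cleanup(T)(ref T x)
{
    static if (is(typeof(strictDestroy(x))))
        strictDestroy(x);
}

// Returns the number of destructor calls triggered by cleanup.
int cleanupDemo()
{
    Tracked.dtorCount = 0;
    auto t = new Tracked;   // heap-allocated, so scope exit adds no extra call
    cleanup(*t);            // struct: destructor runs

    int n = 5;
    int* p = &n;
    cleanup(p);             // pointer: strictDestroy does not apply, nothing happens
    assert(p is &n && n == 5);
    return Tracked.dtorCount;
}

unittest
{
    int* q;
    static assert(!__traits(compiles, strictDestroy(q)));
    assert(cleanupDemo() == 1);
}
```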

(Related to this PR in particular: the fact that destroy(p) nullifies p can introduce a confusion whilst eliminating one. Consider:

void fun(S * p) {
    destroy(p);
    p.x = 42; // oops
}

The code above is correct with destroy(*p) but will crash in the current form. That's an oddity.)
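
For comparison, a small sketch of how the two spellings behave without the overload from this PR, assuming the generic destroy in druntime only resets a pointer to null and leaves the pointee alone. `Node` and `pointerVersusPointee` are hypothetical names.

```d
// Hypothetical type that counts destructor runs.
struct Node
{
    static int dtorCount;
    int x;
    ~this() { ++dtorCount; }
}

// Returns the destructor count after destroy(p) and after destroy(*q).
int[2] pointerVersusPointee()
{
    Node.dtorCount = 0;
    Node* p = new Node(7);
    Node* q = p;              // second reference to the same object

    destroy(p);               // assumed behavior: only the pointer is reset
    assert(p is null);
    assert(q.x == 7);         // pointee untouched
    int afterPointer = Node.dtorCount;

    destroy(*q);              // struct overload: runs ~this, resets to Node.init
    assert(q.x == 0);
    return [afterPointer, Node.dtorCount];
}

unittest
{
    assert(pointerVersusPointee() == [0, 1]);
}
```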

@andralex
Member

To create an action item out of these, I think we should:

The milder version that does not break its documentation:

I think we should do the mild version now and discuss deprecation after that.

@schveiguy
Member

@andralex, I agree with pretty much all you said, having destroy that "does nothing" is a placebo that says "You did it right, because it compiled!" Much better to error so the user cannot accidentally use it.

One note:

we'd do good to only keep destroy for class, interface, struct, and static array objects

What we still are missing (even with this PR) is something that destroys all the elements of a dynamic array. As I said elsewhere, delete was deprecated with the promise that you can "just do destroy(x); GC.free(x) if you need delete functionality". This is true for everything EXCEPT dynamic arrays. For those, you have to loop through the elements, destroying each. The new __delete function does this, but of course, still frees the memory.

Would it make sense to modify destroy on an array such that it does this? Definitely a discussion for another day, but I just wanted to put it on your radar.
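
The manual loop described above might look like the following sketch. `destroyElements` and `Item` are hypothetical names, and the sketch deliberately stops short of freeing the memory.

```d
// Hypothetical element type that counts destructor runs.
struct Item
{
    static int dtorCount;
    int id;
    ~this() { ++dtorCount; }
}

// Illustrative helper: destroys every element, leaves slice and memory alone.
void destroyElements(T)(T[] arr)
{
    foreach (ref e; arr)
        destroy(e);
}

// Returns how many element destructors ran.
int destroyArrayDemo()
{
    Item.dtorCount = 0;
    auto items = new Item[3];

    destroyElements(items);
    int count = Item.dtorCount;

    destroy(items);                    // as noted above: only resets the slice
    assert(items is null);
    assert(Item.dtorCount == count);   // no further destructor calls
    return count;
}

unittest
{
    assert(destroyArrayDemo() == 3);
}
```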

@JinShil
Contributor Author

JinShil commented Feb 28, 2018

Invalid State

I think we should go a different way with destroy. Starting from the documented behavior in https://dlang.org/phobos/object.html#.destroy:

"Destroys the given object and puts it in an invalid state. It's used to destroy an object so that any cleanup which its destructor or finalizer does is done and so that it no longer references any other objects. It does not initiate a GC cycle or free any GC memory."

That documentation is already out of date. 2.079 (the upcoming release) already has an update to that documentation (as a result of #2054) that reads:

"Destroys the given object and sets it back to its initial state. It's used to destroy an object, calling its destructor or finalizer so it no longer references any other objects. It does not initiate a GC cycle or free any GC memory. "

So the verbiage about "invalid state" has already been removed, though it still may not be correct, depending on the semantics we decide on here.

Re-initializing only aggregate types

So an observation is that for non-aggregate types we can simply delete the line https://github.com/dlang/druntime/blob/master/src/object.d#L3150. Even structs without a destructor we don't need to reinit - just leave them be.

Please forgive my ignorance, but I'm not sure I understand the rationale. I'd prefer uniform behavior, if possible. That is, if reference types are re-initialized in destroy, why not value types as well so the user doesn't have to allocate mind-share to special cases, and said special cases don't need to be communicated in the documentation.

So, unless I'm missing something, I'd prefer it if all types were re-initialized in destroy or all types were left alone in destroy.

destroy that does nothing

having destroy that "does nothing" is a placebo that says "You did it right, because it compiled!" Much better to error so the user cannot accidentally use it.

I'm leaning towards this as well. This is actually where the pointer overload helps.

no-op for non-aggregates

Consider the following code.

struct S { }

S* s = new S();
destroy(s);

Without the pointer overload, and implementing @andralex's action list above, this will do absolutely nothing. Nothing is probably not what the user intended, so I think we should emit a compile-time error that calling destroy on pointers (or any non-aggregate) is useless, or attempt to do what the user likely intended by implementing the pointer overload.

no-op for unions

Similarly I think it would be better to emit a static assertion failure if destroy is called on a union, rather than a no-op. This completes a dialog with the user to inform the user that what they think would happen is probably not what would actually happen.

Implementing destroy for dynamic arrays

I'm planning a PR for that, so let's save that discussion for then.

Moving on from here

This PR has exposed a number of issues in the implementation of destroy, so I will be submitting additional PRs in an attempt to address each one individually.

  • no-op for unions, or emit a compile-time error
  • no-op for non-aggregates or emit a compile-time error
  • uniform re-initialization behavior, or special cases for non-aggregates and structs without destructors
  • dynamic array support

I'm closing this for now, but I welcome continued discussion to help me get this right. A brief discussion on Slack would be helpful.

@JinShil JinShil closed this Feb 28, 2018
@schveiguy
Member

no-op for unions, or emit a compile-time error

Error please.

no-op for non-aggregates or emit a compile-time error

Error please.

uniform re-initialization behavior, or special cases for non-aggregates and structs without destructors

Re-initialization for structs without destructors. It is what is expected I would think. Yes, it's more expensive than no-op, but uniformity is more important here.

dynamic array support

If destroy isn't going to actually destroy the elements of the array, I would say it should Error (similar to the pointer case). Otherwise, I would support destroying the elements. Currently, it just sets the array to null.

You may ask why I would want different behavior from arrays and pointers? It's because dynamic arrays are considered to be the elements themselves, not just pointer and length. We use that to great effect everywhere.

Alternatively, you could make destroy on an array an Error, and introduce a new function that will do the destruction of the elements.

@JinShil
Contributor Author

JinShil commented Feb 28, 2018

Re-initialization for structs without destructors. It is what is expected I would think. Yes, it's more expensive than no-op, but uniformity is more important here.

What about removing re-initialization for all types? Is that an option?

@rainers
Member

rainers commented Feb 28, 2018

It does not initiate a GC cycle or free any GC memory.

I think this needs a slight modification as it sounds like a guarantee for nogc, but called destructors are still able to produce arbitrary GC activity.

@schveiguy
Member

What about removing re-initialization for all types? Is that an option?

It is a possible option, but I think that is more disruptive. People expect struct reinitialization, their code probably counts on it.

I think this needs a slight modification as it sounds like a guarantee for nogc

Yes, but @nogc is the guarantee for nogc. Probably reword to "does not directly initiate".

@JinShil
Contributor Author

JinShil commented Feb 28, 2018

Re-initialization for structs without destructors. It is what is expected I would think. Yes, it's more expensive than no-op, but uniformity is more important here.

What about removing re-initialization for all types? Is that an option?

It is a possible option, but I think that is more disruptive. People expect struct reinitialization, their code probably counts on it.

So I've been working on this for a few hours and I now realize I can't do all of these:

  • emit a compiler error for unions and non-aggregates
  • re-initialize only aggregates
  • maintain uniformity

If we decide to re-initialize for all types, then I have to implement destroy for all types, to ensure uniformity especially for recursive destruction. And, if we do that we're back to requiring destroy for pointers, too, to complete the implementation.

So I'm faced with the following two implementation choices:

Implementation 1 - destroy means call destructor AND re-initialize

  • re-initialize for all types
  • implement destroy for all types including pointers

Implementation 2 - destroy means ONLY call destructor

  • don't re-initialize for any type
  • implement destroy only for types that have destructors
  • emit compile-time errors for types that don't have destructors.

Either 1 or 2, or we break uniformity.

@andralex
Member

Uniformity is not what we're looking for here. We look for simple, composable, efficient primitives. It's much easier to detect "does destroy apply to type T" than "would calling destroy on type T result in a no-op?"

So we want to implement destroy for the cases in which it does real work.

For types with destructor, it's important that destroy leaves them in a destructible state, i.e. the type is re-inited. Otherwise any failed attempt to reinitialize them (or just doing nothing) will leave them in an invalid state when the object goes out of scope. So that's important. For all other types, it's destroy not destroyAndInit.

For static and dynamic arrays that have elements with destructors, destroy should destroy each element carefully - attempt to destroy each element and accumulate the possible exceptions created.
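
One possible reading of "destroy each element carefully" is sketched below. It collects failures in an array instead of chaining them, which is a simplification chosen for this sketch; `Flaky` and `destroyEach` are hypothetical names.

```d
// Hypothetical element whose cleanup can fail.
struct Flaky
{
    int id;
    ~this()
    {
        if (id < 0)
        {
            id = 0; // leave a benign state first, so a later destructor call is harmless
            throw new Exception("cleanup failed");
        }
    }
}

// Attempts to destroy every element, collecting failures instead of
// stopping at the first one.
Exception[] destroyEach(T)(T[] items)
{
    Exception[] failures;
    foreach (ref item; items)
    {
        try
        {
            destroy(item);
        }
        catch (Exception e)
        {
            failures ~= e;
        }
    }
    return failures;
}

// Returns the number of elements whose destructor threw.
size_t flakyDemo()
{
    auto items = new Flaky[3];
    items[0].id = 1;
    items[1].id = -2;   // this one will throw during destruction
    items[2].id = 3;
    return destroyEach(items).length;
}

unittest
{
    assert(flakyDemo() == 1);
}
```

The third element is still destroyed even though the second one failed, which is the property the paragraph asks for.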

Now it comes down to code breakage. If we deprecate all calls that do nothing, well that's some breakage. If we leave them as they are, people can just call destroy without caring and know that the object has no extra resources and can be written over.

@andralex
Member

I reckon I'm not being the most coherent because I'm unclear on when I talk "what I think we could do if we did destroy from scratch" and when I talk about "what I think we can do with destroy right now".

Briefly, destroy must ensure the object has no additional resources allocated and is in a state in which it can safely be destroyed (e.g. go out of scope, or be destroyed again during collection). This is the only uniformity we're after - we're not looking for leaving the object in a specific state, only one that is not dangerous. So for an int etc. there is no reason to overwrite it.

@JinShil
Contributor Author

JinShil commented Mar 1, 2018

So, if I understand correctly, destroy should call the instance's destructor and set the instance to a benign (destructible) state. What is an example of a non-benign (non-destructible) state? How would not re-initializing instances of types with destructors put the instance in a non-benign (non-destructible) state?

@jmdavis
Member

jmdavis commented Mar 1, 2018

Destructors aren't generally designed to be called multiple times on the same object. For instance, what if a struct's destructor calls a member function on a class reference and then calls destroy on it but doesn't make it null? The class object is then in an invalid state to do much of anything - among other things, its vtable is gone. I'm not sure that the class object is in a state where destroy can be safely called on it again at that point, but even if destroy is able to handle a destroyed class object, if the struct's destructor gets called again, then it's going to try to call that member function on the class reference again before calling destroy on it, and those function calls will be trying to operate on an invalid class object and will probably segfault.

@andralex
Member

andralex commented Mar 1, 2018

So, if I understand correctly, destroy should call the instance's destructor and set the instance to a benign (destructible) state. What is an example of a non-benign (non-destructible) state?

A very simple situash would be:

struct S { void* p; ... ~this() { ... free(p); ... } }

Calling the destructor twice in a row is liable to double-free memory.
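
To make the double-free scenario concrete, here is a sketch of a destructor written so that it copes with running on T.init, which is what happens when destroy is followed by the normal end-of-scope destruction. `Buffer` and `destroyBeforeScopeExit` are hypothetical names.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical owner of a malloc'ed block.
struct Buffer
{
    static int frees;
    void* p;

    @disable this(this);    // no copies, so each allocation has one owner

    ~this()
    {
        if (p !is null)
        {
            free(p);
            p = null;       // a repeated destructor call becomes harmless
            ++frees;
        }
    }
}

// Returns how many times memory was actually freed.
int destroyBeforeScopeExit()
{
    Buffer.frees = 0;
    {
        Buffer b;
        b.p = malloc(16);
        destroy(b);         // frees once, then resets b to Buffer.init
        assert(b.p is null);
    }                       // ~this runs again here, on Buffer.init
    return Buffer.frees;
}

unittest
{
    assert(destroyBeforeScopeExit() == 1);
}
```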

How would not re-initializing instances of types with destructors put the instance in a non-benign (non-destructible) state?

Not sure I understand the question - two negations make it difficult. Let me try to rephrase whilst eliminating both negations: "How would re-initializing instances of types with destructors put the instance in a benign (destructible) state?"

If that's the question: Types need to handle the situation whereby the destructor is called against T.init. Those that don't have a bug that goes squarely on the programmer. There is an exception - types that @disable this(). For those the creator may legitimately define a destructor that doesn't take into account that the value is T.init. For those (disabled this and defined destructor), we should probably not make destroy available.
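
A sketch of how the "disabled this() plus defined destructor" case might be detected and rejected. All names (`defaultConstructible`, `destroyAllowed`, `guardedDestroy`, `Plain`, `NoDefault`) are hypothetical, the check via `__traits(compiles, { T t; })` is just one way to probe for a disabled default constructor, and `hasElaborateDestructor` comes from Phobos, so druntime itself could not use it as written.

```d
import std.traits : hasElaborateDestructor;

// Ordinary struct: destructor must already cope with Plain.init.
struct Plain
{
    static int dtorCount;
    ~this() { ++dtorCount; }
}

// Default construction disabled: the destructor may assume fd was set.
struct NoDefault
{
    int fd;
    @disable this();
    this(int fd) { this.fd = fd; }
    ~this() { /* may legitimately rely on the constructor having run */ }
}

// True when `T t;` compiles, i.e. default construction is not disabled.
enum bool defaultConstructible(T) = __traits(compiles, { T t; });

// Reject exactly the combination "has destructor" and "no default construction".
enum bool destroyAllowed(T) =
    !hasElaborateDestructor!T || defaultConstructible!T;

void guardedDestroy(T)(ref T obj)
if (is(T == struct) && destroyAllowed!T)
{
    destroy(obj);
}

// Returns the number of destructor calls on the allowed type.
int guardedDemo()
{
    Plain.dtorCount = 0;
    auto p = new Plain;
    guardedDestroy(*p);

    auto n = new NoDefault(3);
    static assert(!__traits(compiles, guardedDestroy(*n)));
    return Plain.dtorCount;
}

unittest
{
    static assert(destroyAllowed!Plain);
    static assert(!destroyAllowed!NoDefault);
    assert(guardedDemo() == 1);
}
```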

@JinShil
Contributor Author

JinShil commented Mar 1, 2018

So part of the contract of destroy is that it must permit being called multiple times, correct?

There is an exception - types that @disable this(). For those the creator may legitimately define a destructor that doesn't take into account that the value is T.init. For those (disabled this and defined destructor), we should probably not make destroy available.

What about allowing destroy for that case, but not reinitializing? That is, the programmer is responsible for handling multiple calls to the destructor in their custom ~this().

@andralex
Member

andralex commented Mar 2, 2018

So part of the contract of destroy is that it must permit being called multiple times, correct?

No, just that destroy leaves the object in a destructible state. If the object has no destructor, any state is destructible (including states that would cause problems with multiple calls to destroy).

What about allowing destroy for that case, but not reinitializing? That is, the programmer is responsible for handling multiple calls to the destructor in their custom ~this().

To the extent possible it's best to move from more restrictive to less - the other way around breaks code. So I'd just disable that use for the time being.

@JinShil
Contributor Author

JinShil commented Mar 2, 2018

I need a better definition of "destructible state". Is it "not referring to any resources not managed by the GC"?

@andralex
Member

andralex commented Mar 2, 2018

@JinShil bummer about the long exchange - this would be hashed out in 10 minutes in a room with a whiteboard.

Consider:

class C { ... }
void fun() {
    auto c = new C;
    destroy(c);
    c = null;
    GC.collect;
}

This code should be safe. In particular the call to C's destructor during collection should be legit. Also:

struct S { ... }
void fun() {
    S s = ...;
    destroy(s);
}

Again the code should be safe. Note that the dtor is called when s goes out of scope.

So... "destructible state" means "the object should be in a state such that its destructor can be called with well-defined behavior".
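
A runnable variant of the class case, as a sketch. `Conn` and `destroyThenCollect` are hypothetical names, and the expectation that a later collection does not finalize the object a second time reflects my understanding of druntime's finalization, not something stated in this thread.

```d
import core.memory : GC;

// Hypothetical class that counts destructor runs.
class Conn
{
    static int closed;
    ~this() { ++closed; }
}

// Returns how many times the destructor ran for one object.
int destroyThenCollect()
{
    Conn.closed = 0;
    auto c = new Conn;
    destroy(c);         // runs the destructor now
    c = null;
    GC.collect();       // collection must be safe for the destroyed object
    return Conn.closed;
}

unittest
{
    assert(destroyThenCollect() == 1);
}
```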

7 participants