Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(array): add .length property to Appender and RefAppender #8432

Closed
wants to merge 1 commit into from

Conversation

ljmf00
Copy link
Member

@ljmf00 ljmf00 commented Apr 16, 2022

This patch adds a .length property to either get and set the length of the
managed array.

@dlang-bot
Copy link
Contributor

dlang-bot commented Apr 16, 2022

Thanks for your pull request, @ljmf00!

Bugzilla references

Auto-close Bugzilla Severity Description
19259 enhancement std.array.Appender needs a way to set/get the length

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + phobos#8432"

std/array.d Outdated
size_t oldLength = _data.arr.length;
reserve(newLength - oldLength);
_data.arr = _data.arr.ptr[0 .. newLength];
_data.arr[oldLength - 1 .. newLength] = T.init;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be emplace and not assignment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is possible because you need to repeat T.init times newLength-oldLength, but can you suggest a change? This needs to be done similarly to _d_arraysetlengthT and _d_arraysetlengthiT runtime hooks. I'm doing a change locally to do that.

This patch adds a .length property to either get and set the length of the
managed array.

Fix issue 19259.

Reference: https://issues.dlang.org/show_bug.cgi?id=19259
Signed-off-by: Luís Ferreira <contact@lsferreira.net>
@ljmf00 ljmf00 changed the title feat(array): add .length property feat(array): add .length property to Appender and RefAppender Apr 16, 2022
* Params:
* a = The new length
*/
@property void length(size_t newLength) @trusted
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the use case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Behaving like a dynamic array while expanding/shrinking it like an appender. There is no proper way to expand with the appender factor while setting initialized memory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it's not a dynamic array - it's a tool for building one. I don't understand what the use case would be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For building arrays with the help of other language APIs (i.e. C). Reserve a capacity, then feed the pointer/length pair to an API to fill it, then set the length. Can't be @trusted though like it seems to be now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it's not a dynamic array - it's a tool for building one. I don't understand what the use case would be.

It should mimic a dynamic array. The appender is literally used as an optimization to ~= operator that mimics a dynamic array but with a larger factor for .capacity to do fewer allocations. There are also other things missing in the Appender like other ranges interfaces, but not related to this PR. I see no reasons to restrict this interface to an OutputRange. It makes no sense to me that people need to use opSlice to do everything else, making it a special case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user may want to shift from a dynamic array to an appender

Then they're not using appender the way it was intended.

What is the reason for the user to require amending all the occurrences of the array and use opSlice,

That wouldn't be the fix. Appender is meant to be used to construct an array, so the fix is to do that and return that array where it'll be used as normal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for the user to require amending all the occurrences of the array and use opSlice,

That wouldn't be the fix. Appender is meant to be used to construct an array, so the fix is to do that and return that array where it'll be used as normal.

The fact that the appender is meant to be used to construct/append to an array doesn't mean that it shouldn't have the other normal interfaces. It is literally interfaces. I don't see drawbacks to implementing common/stable/standard array interfaces that doesn't hurt on performance.

This interface is, in fact, the most concerned one because it behaves differently due to capacity factor and there is no easy and safe way to implement it.

Tell me a way the user can do the following:

Appender!(int[]) a;
// other append operations

// I want to increase the rest of the space with initialized memory
foreach(a; iota(restOfTheSpace))
	a ~= int.init;

This need to be:

  • efficiently without accessing an external capacity
  • be able to set struct with this(this) @disable
  • compatible with dynamic array interfaces

The only way I see this fitting in both cases is:

Appender!(int[]) a;
// other append operations

// I want to increase the rest of the space with initialized memory
auto len = a[].length;
a.reserve(restOfTheSpace);
initializeAll(a[][len .. $]); // if `this(this) @disable` get fixed

This case can be problematic if a Range is given with opSlice interface, but the opSlice is expensive.

The way I'm proposing is a more natural way, either it's a dynamic array:

int[] a;
// other append operations

a.length = restOfTheSpace;

or an Appender:

Appender!(int[]) a;
// other append operations

a.length = restOfTheSpace;

To complete my point on the growth rate, run the following:

void main()
{
    Appender!(string[]) ap;
    string[] ar;
    foreach(_; 0 .. 30)
    {
        writeln("ap: ", ap.capacity);
        writeln("ar: ", ar.capacity);
        ap.reserve(ap.capacity + 1);
        ar.reserve(ar.capacity + 1);
    }
}

or

void main()
{
    Appender!(ubyte[]) ap;
    ubyte[] ar;
    foreach(_; 0 .. 30)
    {
        writeln("ap: ", ap.capacity);
        writeln("ar: ", ar.capacity);
        ap.reserve(ap.capacity + 1);
        ar.reserve(ar.capacity + 1);
    }
}

to see the different growing rates of a dynamic array and an appender.


It is also hacky to support other standard array/range ops, making it pretty much a special case. I see that your argument is biased to "it is not meant to be that way" without further ellaboration and it is unfortunate that I can't find any other reason. I would like to see an ellaborated version of your argument and a list of drawbacks that such addition would have. Although, if you don't see practical value on this anyway, I might just close this and move on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that the appender is meant to be used to construct/append to an array doesn't mean that it shouldn't have the other normal interfaces

I disagree.

I don't see drawbacks to implementing common/stable/standard array interfaces

As mentioned earlier, the drawback is the existence of the code given that, in my opinion, it doesn't justify itself. Worse, it incentivises using the appenders as arrays.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned earlier, the drawback is the existence of the code given that, in my opinion, it doesn't justify itself. Worse, it incentivizes using the appenders as arrays.

I agree with this. I don't think Appender should be used as an array at all. I think the problem here is the design approach of it, as seen by your interpretation. An appender should be no more no less than a mechanism to append, not an array. The issue is that by this design the user is inclined to think otherwise since he creates a new instance from an array, which gives the illusion that it might be usable as an array itself. IMO the correct way to view the Appender is to look at it as some sort of policy that alters the way memory usage is made while appending. So maybe it's the design of it that's misleading.

Maybe a more explicit approach to what Appender is would be to make appending functions take an appender or some sort of policy plus a predicate implementing the appending logic. That way it would be more explicit the meaning of it and would ensure the policy is temporary, as it should be used that way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this. I don't think Appender should be used as an array at all. I think the problem here is the design approach of it, as seen by your interpretation. An appender should be no more no less than a mechanism to append, not an array. The issue is that by this design the user is inclined to think otherwise since he creates a new instance from an array, which gives the illusion that it might be usable as an array itself. IMO the correct way to view the Appender is to look at it as some sort of policy that alters the way memory usage is made while appending. So maybe it's the design of it that's misleading.

Designing an API to be explicitly restrictive because it is the only intended purpose is not a generic approach to be on a modern standard library. Taking the same example of Nullable, we could just strictly implement the methods needed to make nullable work as an actual nullable and not add those range interfaces, but is it handy for the user? It adds value and doesn't remove value.

I can't understand the described practical drawbacks of such addition:

The idea of misusing Appender as an array seems incorrect. No one should use it like a normal array unless it fulfils the documented use-cases, which are the characteristics that make it different from a normal array. You can argue that I can always use a DList versus a SList because DList does the same things as SList, but that doesn't mean it is always adapted for your use-case. You can also do other comparisons with other data structures. For me, arguing that is not a valid reason to explicitly restrict an interface.

Maybe a more explicit approach to what Appender is would be to make appending functions take an appender or some sort of policy plus a predicate implementing the appending logic. That way it would be more explicit the meaning of it and would ensure the policy is temporary, as it should be used that way.

Having append functions that uses Appender struct would make a local capacity field useless, hence it needs to keep its context, otherwise, the performance would be always worse. This is due to how Appender and dynamic arrays capacity are implemented, otherwise, a bigger growing factor would suffice.


My vision of these changes is ultimately have a configurable Array container with a Grower hook or something similar and maybe other synchronisation features, apart from Appender interface, but the containers on Phobos doesn't seem to follow the range interface anyway, hence community often use emsi containers along with other features.

I see no productiveness/progress on continuing arguing here, so I'm going to close this for now.

Comment on lines +3583 to +3587
doInitialize(
_data.arr.ptr,
_data.arr.ptr + newLength,
(cast(void*)&initializer)[0 .. T.sizeof]
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
doInitialize(
_data.arr.ptr,
_data.arr.ptr + newLength,
(cast(void*)&initializer)[0 .. T.sizeof]
);
import core.lifetime : emplace;
foreach (ref e; _data.arr.ptr[0 .. newLength])
emplace(&e);

Also, even though that obviates doInitialize, there's a __traits(initSymbol, T) for structs, classes and unions, introduced specifically to not have to have a static immutable initiailzer = T.init;.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. __traits(initSymbol, T) mimics the TypeInfo.initializer, not the T.init. For classes, it returns the emplaced memory block when dereferenced, not the actual reference.

Also emplace doesn't work here as arrays of abstract classes or structs with disabled constructors would not work. Although, I think emplaceInitializer can fit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For classes (at least when you're merely storing references and not instances), the zero init path should do the trick. But yes, emplaceInitializer is more a way to go.

@ljmf00 ljmf00 closed this May 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants