This repository has been archived by the owner on Oct 12, 2022. It is now read-only.

This adds a require and update function for Associative Arrays #2162

Merged
merged 10 commits into from
Jun 21, 2018

Conversation

GilesBathgate
Contributor

@GilesBathgate GilesBathgate commented Apr 14, 2018

The require function provides a means to get a value corresponding to the key, but if the value doesn't exist it will evaluate the lazy argument to create a new value, add this to the associative array and then return it.

auto p = lookup.require("giles", new Person);
p.eyeColor = Color.Brown;
//...
assert("giles" in lookup);

The update function provides a means to perform additional operations during an aa create or update.

Person newer;
Person older;
lookup.update("giles", {
    newer = new Person; // assign a new value
    return newer;
}, (ref Person p) {
    older = p; // read the old value
    newer = new Person; // assign a new value
    return newer;
});

Corresponding documentation added in dlang/dlang.org#2343

Implementation notes

Traditionally the first example could be done like so:

auto p = "giles" in lookup;
if (p is null) {
    p = new Person;
    lookup["giles"] = p;
}
p.eyeColor = Color.Brown;
//...
assert("giles" in lookup);

I find this latter code clunky, and it requires two hashes/lookups. The proposed implementation adds support directly to rt/aaA.d to avoid this. I should also point out that the allocation of a new Person in the example is trivial, but it might often be the case that the Person instance is created by fetching a record from a db, or that the value is an Image loaded from disk or a movie downloaded from the internet.
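As a rough sketch of the lazy-evaluation point, assuming the final name require from this PR: loadRecord and the loads counter are hypothetical stand-ins for an expensive fetch and are not part of the PR.

```d
// Hypothetical stand-in for an expensive load (db record, image, download).
int loads;

string loadRecord(string key)
{
    ++loads; // count how often the expensive work actually runs
    return "record for " ~ key;
}

void main()
{
    string[string] cache;

    // Key is missing: the lazy argument is evaluated and the result is stored.
    assert(cache.require("giles", loadRecord("giles")) == "record for giles");
    assert(loads == 1);

    // Key is present: the lazy argument is not evaluated a second time.
    assert(cache.require("giles", loadRecord("giles")) == "record for giles");
    assert(loads == 1);
    assert("giles" in cache);
}
```

Because the third parameter is lazy, the expensive call sits inline at the call site yet only runs on a miss.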

Furthermore, the second example could be done like so:

Person newer;
Person older;
auto p = "giles" in lookup;
if (p is null) {
    newer = new Person;
    lookup["giles"] = newer;
} else {
    older = *p;
    newer = new Person;
    *p = newer;
}

While this isn't any more clunky, it still requires two hashes/lookups.
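A smaller, self-contained sketch of the same create-or-update pattern, assuming the update function as described in this PR: countWords is a hypothetical helper, and both callbacks return the value to store.

```d
// Hypothetical helper: counts occurrences with one update call per word.
int[string] countWords(string[] words)
{
    int[string] counts;
    foreach (word; words)
    {
        counts.update(word,
            () => 1,                // create: first time the key is seen
            (ref int n) => n + 1);  // update: returns the new value to store
    }
    return counts;
}

void main()
{
    auto counts = countWords(["a", "b", "a"]);
    assert(counts == ["a": 2, "b": 1]);
}
```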

Other languages

Several other languages have support for these operations:
C# has GetOrAdd, AddOrUpdate
https://msdn.microsoft.com/en-us/library/ee378677(v=vs.110).aspx
https://msdn.microsoft.com/en-us/library/ee378675(v=vs.110).aspx

Java has computeIfAbsent
https://docs.oracle.com/javase/8/docs/api/java/util/Map.html#computeIfAbsent-K-java.util.function.Function-

Python has setdefault
https://docs.python.org/2/library/stdtypes.html#dict.setdefault

@dlang-bot
Contributor

Thanks for your pull request and interest in making D better, @GilesBathgate! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the annotated coverage diff directly on GitHub with CodeCov's browser extension)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + druntime#2162"

@DmitryOlshansky
Member

Save for the name, the idea is awesome, and it was a blocker for e.g. the use of built-in AAs in DMD. I have never seen add mean insert in the D language, except maybe in my codepoint set stuff in std.uni.

Bikeshedding the name can happen elsewhere though ;)

@GilesBathgate
Contributor Author

GilesBathgate commented Apr 15, 2018

@DmitryOlshansky I don't really like the name getOrAdd either ;) (EDIT: now updated to require) But I wanted something terse that would fit with the existing get. Here are a few alternatives: getOrInsert, upsert, merge, emplace, upget, getsert, introduce, getOrCreate.
Where is an appropriate place to bikeshed the name?

@wilzbach
Member

The NG (aka forum.dlang.org - general)

@GilesBathgate
Contributor Author

src/object.d Outdated
assert("bar" !in aa);
}

auto getOrAdd(K, V)(ref V[K] aa, K key, lazy V value)
Contributor

Return by ref, please. Could also use V.init as a default for value. Then we can write code like this:

struct S { int value; }
S[string] aa;
aa.getOrAdd("foo").value = 42;
assert(aa == ["foo": S(42)]);

Contributor Author

I wondered whether it should be ref (which is why I added the basic value unit test). I hadn't considered this scenario; I will add a test for it and update.

Contributor Author

The default value is also a nice idea, so I will add that too (I didn't know lazy args could have a default value).

Member

@schveiguy schveiguy Apr 17, 2018

Hm... I just suggested a name of getPtr on the newsgroup, as I assumed that's what you were doing, but it seems you actually are returning by value. I think returning by ref is preferable to either (maybe getRef?). Note that if you wanted to create a pointer to the value, you have to go through a triple lookup today (Edit: it's not necessary to do it this way, but not as obvious, see my comment below):

Value *p = void;
if(!(p = key in aa)) // one
{
    aa[key] = initializeValue(); // two
    p = key in aa;  // three
}

Having this function return by ref makes this very easy to do in one lookup.

Note that a class value type is going to be confusing here unless you do a default value of new Value.
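To illustrate both points, here is a small sketch that assumes the by-ref signature discussed in this thread, with a default lazy argument of V.init. Node and slot are hypothetical names used only for this example.

```d
class Node { int weight; }

// Hypothetical helper: one lookup yields a pointer to the stored value.
int* slot(ref int[int] aa, int key)
{
    return &aa.require(key, 10);
}

void main()
{
    // Value type: write through the pointer, then read it back from the AA.
    int[int] counts;
    int* p = slot(counts, 1);
    *p += 5;
    assert(counts[1] == 15);

    // Class type: the default lazy argument is V.init, which is null,
    // so the key is inserted with a null reference.
    Node[string] nodes;
    assert(nodes.require("a") is null);
    assert("a" in nodes);

    // Passing an explicit `new Node` avoids storing null.
    nodes.require("b", new Node).weight = 3;
    assert(nodes["b"].weight == 3);
}
```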

Contributor Author

@schveiguy getPtr and getRef are certainly both terse, which I like. My only criticism is that they don't as clearly communicate that the value will be added/inserted into the associative array.

I am a little bit confused by the triple lookup example. Can I not just do p = (aa[key] = initializeValue())?

That said, this PR will make this much simpler: Value* p = aa.getOrAdd(key, initializeValue()); Is there a unittest you can suggest to ensure this works as you expected?

Member

Can I not just do p = (aa[key] = initializeValue())

You are still thinking in terms of classes. You may want a pointer to a value type (think int[int]), where you need to use key in aa to avoid further lookups.

Member

Hm... I guess you can do p = &(aa[key] = initializeValue()), which I didn't realize is possible. So no, it's not necessarily a triple lookup.

Member

they don't as clearly communicate that the value will be added/inserted into the associative array

A fair criticism. My response would be, it's just something people will learn. If you are getting a reference to the AA value, it needs to be in there.

In all honesty, before looking closer, I thought the get function already did this.

Contributor Author

@GilesBathgate GilesBathgate Apr 17, 2018

In all honesty, before looking closer, I thought the get function already did this.

Ha, yeah, I made the same mistake... that led to a few hours of head scratching ;)

Contributor Author

@schveiguy I have added a unittest which I think covers your example. @aG0aep6G I've made the updates as you've suggested too.

src/object.d Outdated
assert(b == S(2));

S* c = &aa.getOrAdd("baz", S(4));
assert(c is &aa["baz"]);
Contributor Author

@schveiguy This is the unittest I am referring to.

@GilesBathgate GilesBathgate force-pushed the getOrAdd branch 2 times, most recently from 9373b0b to 3843de8 Compare April 17, 2018 18:38
src/object.d Outdated

unittest
{
struct S
Member

static struct here, otherwise there is a connection between the unittest stack frame and every S instance.
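A minimal sketch of the difference, assuming the final name require: the struct S and the helper storeAndDouble are illustrative only. Marking the local struct static guarantees it carries no hidden context pointer to the enclosing frame, which `__traits(isNested, ...)` can confirm at compile time.

```d
// Hypothetical helper that declares a local struct and uses it as an AA value.
int storeAndDouble()
{
    // `static` means S stores no hidden pointer to this function's frame.
    static struct S
    {
        int value;
        int doubled() const { return value * 2; }
    }
    static assert(!__traits(isNested, S));

    S[string] aa;
    aa.require("foo").value = 42; // insert S.init, then modify it in place
    return aa["foo"].doubled();
}

void main()
{
    assert(storeAndDouble() == 84);
}
```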

@GilesBathgate GilesBathgate force-pushed the getOrAdd branch 2 times, most recently from e08c689 to 1c7e2f0 Compare April 19, 2018 16:56
@GilesBathgate GilesBathgate changed the title This adds an Associative Array function getOrAdd This adds a require function for Associative Arrays Apr 19, 2018
@GilesBathgate GilesBathgate force-pushed the getOrAdd branch 6 times, most recently from 50efe1a to 6a8b06f Compare April 20, 2018 11:20
@GilesBathgate GilesBathgate changed the title This adds a require function for Associative Arrays This adds a require and update function for Associative Arrays Apr 20, 2018
@GilesBathgate GilesBathgate force-pushed the getOrAdd branch 3 times, most recently from df11c74 to 1ac1702 Compare April 20, 2018 17:24
@JinShil JinShil removed the @andralex Approval from Andrei is required label Jun 5, 2018
@JinShil JinShil dismissed jacob-carlborg’s stale review June 5, 2018 00:51

I believe the requested changes were made

@JinShil
Contributor

JinShil commented Jun 5, 2018

I do think this could benefit from a changelog entry, however. It's quite an important change, and I think it would be good to advertise its availability. See https://github.com/dlang/druntime/blob/master/changelog/README.md

@JinShil
Contributor

JinShil commented Jun 8, 2018

@GilesBathgate I'm just waiting on a changelog entry.

@JinShil JinShil added 72h no objection -> merge The PR will be merged if there are no objections raised. auto-merge and removed auto-merge labels Jun 16, 2018
@JinShil
Contributor

JinShil commented Jun 19, 2018

Disabled auto-merge so I can manually squash commits when all tests pass.

Member

@Geod24 Geod24 left a comment

Nice!

src/object.d Outdated
* update = The delegate to apply on update.
*/
void update(K, V, C, U)(ref V[K] aa, K key, C create, U update)
if ((is(C : V delegate()) || is(C : V function())) && (is(U : V delegate(ref V)) || is(U : V function(ref V))))
Member

I'm not sure this is the right approach. It forbids opCall and restricts some conversions which would be legal under the type system.
Maybe use is(typeof({ V* ptr; *ptr = create(); }))?
Also, shouldn't the arguments create and update be scope to avoid pointless allocation when passing delegate literals?
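For context, this is roughly what passing functors would look like. The sketch assumes the released update accepts any callable with a matching opCall, which this thread does not confirm; StartAtOne and Increment are hypothetical types, each with a single opCall to keep overload resolution out of the picture.

```d
// Hypothetical functor used as the create callback.
struct StartAtOne
{
    int opCall() { return 1; }
}

// Hypothetical functor used as the update callback.
struct Increment
{
    int opCall(ref int n) { return n + 1; }
}

void main()
{
    int[string] hits;
    StartAtOne create;
    Increment bump;

    hits.update("home", create, bump); // key absent: create() is called
    hits.update("home", create, bump); // key present: bump(value) is called
    assert(hits["home"] == 2);
}
```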

src/object.d Outdated

unittest
{
static class C{}
Member

Nit: add a space

Contributor Author

@Geod24 Between C and {}?

Member

Yes

src/object.d Outdated
* Returns:
* The value.
*/
ref V require(K, V)(ref V[K] aa, K key, lazy V value = V.init) pure
Member

I would remove the pure annotation and let the template work its magic, or is there a specific reason you added it?

Contributor Author

@Geod24 I just copied the function attributes from the get function.

Contributor Author

@Geod24 Just so I am clear, this is the 'magic' you are referring to: http://klickverbot.at/blog/2012/05/purity-in-d/#templates-and-purity i.e. templates infer pure?
Presumably this wasn't the case when 'get' was implemented.

Member

Templates infer attributes, so @nogc, @safe/@system, and pure.
In this case, since only pure and nothrow are put on the extern (C) declaration, that's as much as could get inferred. Maybe there was an inference bug at the time?

You can make sure this works by making a pure unittest.
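Such a check might look like the sketch below, assuming require carries no explicit pure annotation and relies on inference. The helper bump is hypothetical; it only compiles if the instantiated require is inferred pure for these argument types.

```d
// Hypothetical helper: explicitly pure, so the call to `require` inside it
// is accepted only when the template instance is inferred pure.
int bump(ref int[string] counts, string key) pure
{
    return ++counts.require(key, 0);
}

void main()
{
    int[string] counts;
    assert(bump(counts, "a") == 1); // inserted as 0, then incremented
    assert(bump(counts, "a") == 2); // found, incremented in place
    assert(counts["a"] == 2);
}
```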

@JinShil JinShil removed the 72h no objection -> merge The PR will be merged if there are no objections raised. label Jun 19, 2018
@JinShil
Contributor

JinShil commented Jun 19, 2018

I'll refrain from merging this until @Geod24's comments are addressed.

@JinShil
Contributor

JinShil commented Jun 20, 2018

@GilesBathgate Nice Work! And, thank you for sticking with it.

@Geod24 Could you please give this one last look? I'm itching to merge this.

Member

@Geod24 Geod24 left a comment

Looks very good to me, thanks!

@JinShil
Contributor

JinShil commented Jun 21, 2018

Disabled auto-merge so I can manually squash commits when all tests pass.

@JinShil JinShil merged commit 0c92d13 into dlang:master Jun 21, 2018
@GilesBathgate GilesBathgate deleted the getOrAdd branch June 23, 2018 20:36
@CyberShadow
Member

I'm proposing a small modification (addition) to the API:
#3012

Thank you @GilesBathgate for originally adding these, I got a lot of use out of them! :)
