Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indirect weak map and cache #5065

Draft
wants to merge 11 commits into
base: main
Choose a base branch
from
Draft

Conversation

kraenhansen
Copy link
Member

@kraenhansen kraenhansen commented Nov 3, 2022

What, How & Why?

This adds two data structures we could add to improve the developer experience.

ImplicitWeakMap

A map from some type of object (the key) into another type of object (the value), where a function (the hasher supplied at construction) is called to derive a hash of the key, which is used when looking up the value.
This makes it possible for multiple different key objects to get the same value object.
The map is considered weak in the sense that values are wrapped in a WeakRef before being inserted in the underling map. A value is also registered with a finalization registry, ensuring that their entry in the underlying map is removed when they get garbage collected, in an effort to make the entire IndirectWeakMap avoid leaks.

ImplicitWeakCache

A cache of objects (the value) which can either be constructed on demand or retrieved from cache.
The cache is considered weak as it extends the IndirectWeakMap to store its values, making them available for garbage collection.

@kraenhansen kraenhansen self-assigned this Nov 3, 2022
@cla-bot cla-bot bot added the cla: yes label Nov 3, 2022
* @returns An existing or new value.
* @throws If `args` are not supplied and no object existed in the cache.
*/
get(key: K, args?: Args) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometimes they call these getOrCreate

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - I think you're right. It would be good to signal the potential side-effect 👍

* @internal
*/
export class IndirectWeakCache<
K extends object,
Copy link
Contributor

@takameyer takameyer Nov 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, we should start naming our generics. I always have to read the implementation to deduce what these letters actually mean. Could be as simple as:

Key extends object,
Value extends object,
Args extends unknown[],
Hash = unknown

Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree .. ol' habits stick around.
I pushed a commit with some less abbreviated names 🙂

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meh. I think K and V are pretty classic abbreviations for Key and Value. And I don't think the Type suffix adds any value, both because they are capitalized, and because types are the only thing that TS supports as generic arguments (even things that look "valuey" like true and "hello" are promoted to "literal types" and are still considered types).

I agree that Args and Hash are good names though. But we don't need to go all Java Enterprise Edition on all of our names.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Time to flip a 🪙 regarding Key vs K and Value vs V?
I don't feel strong either way.

As much as I like a good name, please also read / review the other code too 😉

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, i'm not gonna die on this hill, but i've always felt being more explicit is better. As long as we don't start using T, K, V for anything else that Type, Key and Value, i'm fine with it. Just be explicit where it matters.

Comment on lines 47 to 59
set(key: KeyType, value: ValueType): this {
const hash = this.hasher(key);
const ref = new WeakRef(value);
// Unregister the finalization registry on value being removed from the map
// to avoid its finalization to prune the new value from the map of values.
const existingRef = this.values.get(hash);
if (existingRef) {
this.registry.unregister(existingRef);
}
// Register the new value with the finalization registry, to prune its WeakRef from
// the map of values.
this.registry.register(value, hash, ref);
this.values.set(hash, ref);
return this;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a valid way to use a hash. It is legal for two semantically non-equal objects (meaning "real" equal, not the broken js == address equality check) to have the same hash. You seem to be treating the hash as if it is a bijection, and that if the hashs match, the backing objects must be semantically equal.

Perhaps the problem is just naming? If the behavior you want is that the function returns a value that can be used to uniquely identify the value in a map, then the conventional name for that is a "Key Function"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. It's probably the naming. By hash I simply mean a one-way function. This is why I named it IndirectWeakMap instead of HashWeakMap.

The hasher is the function that produce the value which is used as the basis of equality checks. Do you have a better name for that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps id is better than hash in this case?

@RedBeard0531
Copy link
Contributor

My 2¢: I don't think we should do this right now. I don't think any other SDK does this (based on slack discussion) and the current JS SDK doesn't either. Adding weak maps for this use case is biting off a lot of complexity budget and I don't think it pays for itself. I'd rather we focus on reaching feature parity, and cross-platform support prior to tackling things like this. We can always add this later, even after we release V12.

GH comments probably isn't the best place to discuss this. Happy to discuss in person or on zoom.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants