Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rationale of current design and comparison of different approaches #6

Closed
hax opened this issue Oct 1, 2019 · 8 comments
Closed

Rationale of current design and comparison of different approaches #6

hax opened this issue Oct 1, 2019 · 8 comments

Comments

@hax
Copy link
Member

hax commented Oct 1, 2019

I think upsert have solid use cases, but it's not clear why we choose current api design, not Java (more complement methods) or Rust (entry view) way. For example, upsert only cover set case, not cover get (map.has(key) ? map.get(key) : defaultValue) and delete (if (map.has(key) && test(map.get(key))) map.delete(key)) cases.

I also feel current api may have risk of misunderstanding like:

  1. Programmer may expect upsert(key, value1, value2) instead of upsert(key, () => value1, () => value2). It will throw TypeError in most cases with the exception of value1/value2 happen to be functions.
  2. Someone may write map.upsert(key, old => map.set(key, value1), () => map.set(key, value2))
  3. I was caught by map.upsert(key, noop, old => ...) where noop is () => {}

It seems Java or Rust way are little bit clear. Of coz they may have other issues.

I think it will be helpful to compare different approaches to find the pros and cons of each, and find the best way we should follow. Thank you.

@bmeck
Copy link
Member

bmeck commented Oct 1, 2019

I also feel current api may have risk of misunderstanding like ...

Misunderstandings are true of all API design possibilities we should focus on the nature problems and solving for them rather than 1 off examples of potentially confusing ways to use APIs. There is no API that will ensure it cannot be misused.

Programmer may expect upsert(key, value1, value2) instead of upsert(key, () => value1, () => value2). It will throw TypeError in most cases with the exception of value1/value2 happen to be functions.

Using values however is not ideal for certain use cases, and if we use values we have the same kind of problem for dynamic allocation depending on insert or update being used:

map.upsert(getIDOfObject(o), updateValue, costlyValueToCreate);

This would always perform the costly creation of the inserted value.

If we want to allow avoiding expensive allocation ahead of the upsert call we would have to use a user provided function, which would mean that function becomes the complementary form of the footgun you describe:

map.upsert(getIDOfObject(o), updateValue, fnCreatingCostlyValueToCreate);

If updateValue is a function then it would mean complications since supporting dynamic creation of values would need it to actually be:

map.upsert(getIDOfObject(o), () => updateValue, fnCreatingCostlyValueToCreate);

A variety of use cases not associated with costly values also exist here, from shared values like:

const response = await httpCache.upsert(url, undefined, () => fetch(url));

or from auto-generated values:

people.upsert(url, undefined, createPersonWithNextID);

Where you want to ensure each person has a sequential ID and not to have an ID skipped.

Someone may write map.upsert(key, old => map.set(key, value1), () => map.set(key, value2))

I'm not sure this is a problem, but it does look odd.

It seems Java or Rust way are little bit clear. Of coz they may have other issues.

I'm not sure I understand.

I think it will be helpful to compare different approaches to find the pros and cons of each, and find the best way we should follow. Thank you.

We have a listing of languages we took the time to audit the APIs of in the README, but I don't think explicitly comparing pros/cons of ever API is worth the time vs seeing the workflows they enable.

We do have a callout as well in the README about "Why use functions instead of values for the parameters?", that perhaps needs expanding. Is there a way we might expand that item such that it is clearer without creating too much text that it doesn't act as a simple answer to the question?

@gibson042
Copy link

It isn't necessarily obvious that the parameter signature is (key, updateFn, insertFn) rather than (key, insertFn, updateFn). A suggestion from the meeting was to have a single callback and let its arguments differentiate update vs. insert, e.g. map.upsert(key, (oldValue, isUpdate) => isUpdate ? transform(oldValue) : getNewValue()).

@devoto13
Copy link

devoto13 commented Oct 3, 2019

Using values however is not ideal for certain use cases, and if we use values we have the same kind of problem for dynamic allocation depending on insert or update being used:

map.upsert(getIDOfObject(o), updateValue, costlyValueToCreate);

This would always perform the costly creation of the inserted value.

I would argue, that API should be optimised for a more common use case. I don't have a data, but from the experience dealing with cheap values is more common use case than costly values. Furthermore if the value is so costly, that it worths creation of the function and calling this function maybe two lookups are not so expensive after all?

Overall the current API design feels too focused on the performance and too little on the usability: passing functions is less convenient than raw values, function has a lot of responsibilities for a single method. Can we consider making it more balanced and accept two lookups for a heavy insert/update value? Python's (less versatile) or Java's (more versatile) approaches look much more clean:

// Python's way
Map.prototype.setIfMissing(key, newValue) -> value

// or Java's way
Map.prototype.computeIfAbsent(key, () => value)) -> value
Map.prototype.computeIfPresent(key, (existinValue) => newValue) -> value

The approach with a single function parameter suggested above also looks like a good tradeoff between performance and usability. The drawback being that usage is less nice for cases where you only want to do either computeIfAbsent or computeIfPresent, but not both at the same time.

@bmeck
Copy link
Member

bmeck commented Oct 3, 2019

@devoto13

I would argue, that API should be optimised for a more common use case. I don't have a data, but from the experience dealing with cheap values is more common use case than costly values. Furthermore if the value is so costly, that it worths creation of the function and calling this function maybe two lookups are not so expensive after all?

I think we should start with the more comprehensive API that covers all the forms but would be open to adding a 2nd different method if you find the proposal not useful while the insertion must be done using a function. I think there is too much focus purely on the performance perhaps being passed around. Part of the intent, is to allow a method to cover all basic workflows around insertion/update, without suffering the common case forcing multiple workflows (upsert vs current multiple method calls) that differ based upon.

A completely different API design like https://doc.rust-lang.org/std/collections/hash_map/enum.Entry.html is also possible for covering the workflows and multiple more, but is verbose. I think the current design is trying to make a fair trade-off and would want significant explanation of the burdens imposed by making a function.

In addition, even if literals are common, other scenarios like insertion being dependent on key require a function.

I highly recommend we focus on the workflows provided by each design rather than on familiarity.

Overall the current API design feels too focused on the performance and too little on the usability: passing functions is less convenient than raw values, function has a lot of responsibilities for a single method. Can we consider making it more balanced and accept two lookups for a heavy insert/update value? Python's (less versatile) or Java's (more versatile) approaches look much more clean:

Does the passing of function cause significant inconvenience to drop the workflow of dealing with insertions that are costly/effectful? I'm not clear that adding an arrow function is a burden that is enough value to limit the utility here.


In any direction, I feel strongly we should not mix parameters being functions or raw values. Personally I use both insert and update workflows frequently and still find the function form to be clear in intent. I would find having 1 to be function and the other to not be a function would likely be confusing to myself at least.

@hax
Copy link
Member Author

hax commented Oct 3, 2019

Someone may write map.upsert(key, old => map.set(key, value1), () => map.set(key, value2))

I'm not sure this is a problem, but it does look odd.

Actually this is a problem because map.set returns map itself (Map.prototype.get returns value but Map.prototype.set returns map!) and I guess map will be upserted instead of value1/value2.

@bmeck
Copy link
Member

bmeck commented Oct 3, 2019

@hax that is what the code they wrote does, yes. I don't see how it is an actual problem though. The method is just doing what they told it to do.

@bmeck
Copy link
Member

bmeck commented Jul 9, 2020

I've moved feedback from several issues to a revised design in #21

@hax
Copy link
Member Author

hax commented Jul 12, 2020

Though I still hope we can have some words in FAQ to explain why not choose Java or Rust way, I am happy to close this issue because the specific risk of misunderstanding in my original issue has been solved in the new design. Thank you @bmeck !

@hax hax closed this as completed Jul 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants