Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++] properties of Map(Array|Type) are confusingly named #22170

Closed
asfimport opened this issue Jun 26, 2019 · 7 comments
Closed

[C++] properties of Map(Array|Type) are confusingly named #22170

asfimport opened this issue Jun 26, 2019 · 7 comments

Comments

@asfimport
Copy link

In the context of ListArrays, "values" indicates the elements in a slot of the ListArray. Since MapArray isa ListArray, "values" indicates the same thing and the elements are key-item pairs. This naming scheme is not idiomatic; these should be called key-value pairs but that would require propagating the renaming down to ListArray.

Reporter: Ben Kietzman / @bkietz
Assignee: Ben Kietzman / @bkietz

Note: This issue was originally created as ARROW-5745. Please see the migration documentation for further details.

@asfimport
Copy link
Author

Francois Saint-Jacques / @fsaintjacques:
Just note that for PrimitiveArray, it doesn't even return the same type (Buffer* instead of Array*).

@asfimport
Copy link
Author

Wes McKinney / @wesm:
It might make things more clear to call these methods MapArray::entry_keys and MapArray::entry_values. I suspect that having a MapValue struct (representing semantically a Python dictionary-alike object) improve the usability of this class in general

@asfimport
Copy link
Author

Ben Kietzman / @bkietz:
Using "values" at all is very confusing since we already use that as the noun
which indicates what occupies an array slot (see Layout.rst).
It would really
help in explaining List if we used another word here to indicate what occupies
slots in List's child array; "items" or "entries" would be equally expressive
and are less heavily overloaded. As it is, "value" is used for both purposes,
in the same sentence of Layout.rst.

Using "values" in the context of Map is probably unavoidable due to the very
strong convention of naming elements of the range and domain of a mapping its
"keys" and "values". (A related Python-specific convention is to name the sequence
of key-value pairs "items", as in dict.)

However by renaming ListArray::values at least we can avoid collision between the
expectation that map_array.values() will yield the array of all mapped values
and the reality that it will return the array of key-value pairs, and without adding a prefix.

@asfimport
Copy link
Author

Ben Kietzman / @bkietz:
One way to partially resolve this would be to remove Map's inheritance of List so that there is no name collision over "values". Code which handles both map and list could be adapted with minimal boilerplate by adding
std::shared_ptr<Array> MapArray::\{To,From\}KeyValuePairs()
(which will also have minimal runtime overhead since the structures of Map and List arrays are identical)

@asfimport
Copy link
Author

Wes McKinney / @wesm:
I don't have a strong opinion

@asfimport
Copy link
Author

Wes McKinney / @wesm:
Do we want to change something here?

@asfimport
Copy link
Author

Antoine Pitrou / @pitrou:
I think that unfortunately the ship has sailed here. Changing the semantics of MapArray::values() would probably annoy a lot of users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants