Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Go][Array] Expose UntypedValue(i int) interface #34657

Closed
yevgenypats opened this issue Mar 21, 2023 · 9 comments · Fixed by #34986
Closed

[Go][Array] Expose UntypedValue(i int) interface #34657

yevgenypats opened this issue Mar 21, 2023 · 9 comments · Fixed by #34986

Comments

@yevgenypats
Copy link
Contributor

Describe the enhancement requested

Similar to #34377

It will be nice to expose UntypedValue(i int) interface on array so any subclass array can implement it.

Component(s)

Go

@kou
Copy link
Member

kou commented Mar 21, 2023

Could you use "[Go]" not "[GO]" in title?

@yevgenypats yevgenypats changed the title [GO][Array] Expose UntypedValue(i int) interface [Go][Array] Expose UntypedValue(i int) interface Mar 21, 2023
@yevgenypats
Copy link
Contributor Author

cc @zeroshade for feedback. If that's something possible/reasonable then I can also contribute this PR.

@zeroshade
Copy link
Member

hmm, I think it's perfectly reasonable to implement this. Do you think this is better than the ValueString idea? ie: instead of adding a ValueString method to just extension arrays, you add UntypedValue to all arrays and then extension arrays can override this to produce something with a String() method or just let it go to the underlying storage array?

Also, perhaps we can name it something like InterfaceValue(i int) or something else. I don't think I like UntypedValue but then again, I can't think of something "better" at the moment, so we can go with UntypedValue if you don't have any better suggestions.

@yevgenypats
Copy link
Contributor Author

After further investigation seems there is a parallel implementation in c/c++ with GetScalar so we should prob stick to that - https://arrow.apache.org/docs/cpp/api/array.html . WDYT?

@zeroshade
Copy link
Member

Sorry for the delayed response, I'm currently at a conference.

We already have a function in the scalar package for getting the scalar from an array. Do we really need to shift it to the array object instead?

@yevgenypats
Copy link
Contributor Author

No worries. @zeroshade . I think we have to if we want to implement GetScalar(i) on array. We can implement also GetScalar(arr array.Array, i int) inside scalar but imo this will just spread each array functionality into two different packages.

@hermanschaaf
Copy link
Contributor

hermanschaaf commented Mar 30, 2023

I'd like to add my two cents here: I've been looking at implementing array.Diff functionality for #34790. To implement this, we need to compare values in the arrays. Right now the way to go would be to use scalar.GetScalar, but unfortunately because the scalar.GetScalar imports the array package, we cannot use it from array due to circular imports.

This problem could be avoided if the GetScalar(arr array.Array, i int) function, and other scalar functions that deal with arrays, were placed inside the array package instead. So we'd have array.GetScalar, array.MakeFromScalar, etc. This also makes sense to me from a semantic point of view, since scalars are smaller building blocks for arrays, the imports should go from array -> scalar and not the other way round.

Update: In the array.Diff case, I was able to use SliceEqual instead, which also has the benefit of not allocating extra memory, so it wasn't a big problem in the end.

@zeroshade
Copy link
Member

@yevgenypats There's already GetScalar(arr arrow.Array, i int) in the scalar package. This is because even if you move the interface into the arrow package, you unfortunately can't actually construct the scalar for a GetScalar method without causing a cyclic import problem.

I believe the benefit here would be to have an UntypedValue method that returns the underlying value type. For a nested type, that might be an array itself (like for a list array) or it might be an []interface{} (like the case for a struct) etc.

@yevgenypats
Copy link
Contributor Author

@zeroshade After playing a bit more with arrow, csv, json seems that the you are right indeed :) first easiest and most helpful step would be indeed having two methods:

on Array interface: ValueString(i int) string that will return a string representation (we actually already had this on extensions)

and the inverse on Builder interface: AppendValueFromString(i int) error which will take the string representation parse and add to the builder.

This would simplify CSV reader,writer and in general should solve lot of issues when handling unknown types and providing sane defaults via string representation and not only json marshaling.

I've a first stab at this here: #34986

zeroshade pushed a commit that referenced this issue Apr 13, 2023
### Rationale for this change

### What changes are included in this PR?

### Are these changes tested?

### Are there any user-facing changes?

* Closes: #34657

Authored-by: Yevgeny Pats <16490766+yevgenypats@users.noreply.github.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
@zeroshade zeroshade added this to the 13.0.0 milestone Apr 13, 2023
liujiacheng777 pushed a commit to LoongArch-Python/arrow that referenced this issue May 11, 2023
…4986)

### Rationale for this change

### What changes are included in this PR?

### Are these changes tested?

### Are there any user-facing changes?

* Closes: apache#34657

Authored-by: Yevgeny Pats <16490766+yevgenypats@users.noreply.github.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
ArgusLi pushed a commit to Bit-Quill/arrow that referenced this issue May 15, 2023
…4986)

### Rationale for this change

### What changes are included in this PR?

### Are these changes tested?

### Are there any user-facing changes?

* Closes: apache#34657

Authored-by: Yevgeny Pats <16490766+yevgenypats@users.noreply.github.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
rtpsw pushed a commit to rtpsw/arrow that referenced this issue May 16, 2023
…4986)

### Rationale for this change

### What changes are included in this PR?

### Are these changes tested?

### Are there any user-facing changes?

* Closes: apache#34657

Authored-by: Yevgeny Pats <16490766+yevgenypats@users.noreply.github.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
@raulcd raulcd modified the milestones: 13.0.0, 12.0.1 May 30, 2023
raulcd pushed a commit that referenced this issue May 30, 2023
### Rationale for this change

### What changes are included in this PR?

### Are these changes tested?

### Are there any user-facing changes?

* Closes: #34657

Authored-by: Yevgeny Pats <16490766+yevgenypats@users.noreply.github.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment