[Go] Support Array Diff #34790

Closed
hermanschaaf opened this issue Mar 30, 2023 · 2 comments · Fixed by #34806

Comments

@hermanschaaf
Contributor

Describe the enhancement requested

Similar to the C++ implementation of Array Diff (Python docs: https://arrow.apache.org/docs/python/generated/pyarrow.Array.html#pyarrow.Array.diff), it would be good to have the same in Go. As far as I can tell, Go currently has array.Equal but no array.Diff.

Like its C++ counterpart, given the two string arrays ["one", "two", "three"] and ["two", None, "two-and-a-half", "three"], it would output a string like this:

@@ -0, +0 @@
-"one"
@@ -2, +1 @@
+null
+"two-and-a-half"
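
To make the requested output concrete, here is a minimal, self-contained sketch that reproduces the example above for nullable strings. It is not the Arrow implementation: it uses a plain longest-common-subsequence table, and the names `diffStrings`, `same`, `render`, and `str` are made up for illustration, with a nil `*string` standing in for a null element.

```go
package main

import (
	"fmt"
	"strings"
)

// str returns a pointer to v; a nil *string stands in for a null element.
func str(v string) *string { return &v }

// same reports whether two nullable elements are equal (null equals null).
func same(a, b *string) bool {
	if a == nil || b == nil {
		return a == b
	}
	return *a == *b
}

// render formats one element: null, or the quoted string.
func render(v *string) string {
	if v == nil {
		return "null"
	}
	return fmt.Sprintf("%q", *v)
}

// diffStrings is a hypothetical helper that prints deletions ("-") and
// insertions ("+") needed to turn base into target, with a position
// header at the start of every run of changes.
func diffStrings(base, target []*string) string {
	n, m := len(base), len(target)
	// lcs[i][j] is the LCS length of base[i:] and target[j:].
	lcs := make([][]int, n+1)
	for i := range lcs {
		lcs[i] = make([]int, m+1)
	}
	for i := n - 1; i >= 0; i-- {
		for j := m - 1; j >= 0; j-- {
			switch {
			case same(base[i], target[j]):
				lcs[i][j] = lcs[i+1][j+1] + 1
			case lcs[i+1][j] >= lcs[i][j+1]:
				lcs[i][j] = lcs[i+1][j]
			default:
				lcs[i][j] = lcs[i][j+1]
			}
		}
	}

	var out strings.Builder
	inHunk := false
	i, j := 0, 0
	for i < n || j < m {
		if i < n && j < m && same(base[i], target[j]) {
			i, j, inHunk = i+1, j+1, false // unchanged element ends the hunk
			continue
		}
		if !inHunk {
			fmt.Fprintf(&out, "@@ -%d, +%d @@\n", i, j)
			inHunk = true
		}
		if j == m || (i < n && lcs[i+1][j] >= lcs[i][j+1]) {
			fmt.Fprintf(&out, "-%s\n", render(base[i]))
			i++
		} else {
			fmt.Fprintf(&out, "+%s\n", render(target[j]))
			j++
		}
	}
	return out.String()
}

func main() {
	base := []*string{str("one"), str("two"), str("three")}
	target := []*string{str("two"), nil, str("two-and-a-half"), str("three")}
	fmt.Print(diffStrings(base, target))
}
```

The table costs O(n·m) time and memory, which is fine for a sketch but would likely be too heavy for large arrays.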

Component(s)

Go

@hermanschaaf
Contributor Author

hermanschaaf commented Mar 30, 2023

I am happy to submit a PR for this.

My current thinking is that it would go into a diff.go file inside go/arrow/array, with the function signature array.Diff(left, right arrow.Array) (string, error). An error would be returned if, for example, two arrays of different types are passed in. (The C++ version also leaves some cases unhandled right now; see the following excerpt.)

Status Visit(const NullType&) { return Status::NotImplemented("null type"); }
Status Visit(const ExtensionType&) { return Status::NotImplemented("extension type"); }
Status Visit(const DictionaryType&) {
  return Status::NotImplemented("dictionary type");
}
Status Visit(const RunEndEncodedType&) {
  return Status::NotImplemented("run-end encoded type");
}

We can also consider exposing an internal function that returns the differences in Struct format, rather than as a string.

@hermanschaaf
Contributor Author

I have broken the implementation into two parts:

@zeroshade zeroshade added this to the 12.0.0 milestone Apr 5, 2023
zeroshade pushed a commit that referenced this issue Apr 5, 2023
This adds an `array.Diff` function that returns an edit script that, when applied to `base`, produces `target`. This mirrors the C++ implementation. 

It does not yet include a string diff formatter; that can be done in a follow-up.

* Closes: #34790

Authored-by: Herman Schaaf <hermanschaaf@gmail.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
zeroshade pushed a commit that referenced this issue Apr 10, 2023
This adds a `UnifiedDiff(base, target arrow.Array)` method to the `array.Edits` type. It returns a string diff in Unified Diff format. This makes use of the `array.Edits` type returned by the `array.Diff()` function added in #34806

- Part of #34790

Authored-by: Herman Schaaf <hermanschaaf@gmail.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
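
The split into an edit script plus a separate formatter can be sketched as follows. This is only an illustration under assumptions: the `edit` type and `unifiedDiff` function are hypothetical, the script layout (leading run first, then one insert or delete per entry followed by a run) is assumed from the C++ design, and a nil `*string` stands in for null. The script in `main` is written by hand for the example from the issue description.

```go
package main

import (
	"fmt"
	"strings"
)

// edit is a hypothetical stand-in for one entry of an edit script.
type edit struct {
	insert    bool // true: element inserted from target; false: element deleted from base
	runLength int  // unchanged elements following this edit
}

// str returns a pointer to v; nil stands in for a null element.
func str(v string) *string { return &v }

// render formats one element: null, or the quoted string.
func render(v *string) string {
	if v == nil {
		return "null"
	}
	return fmt.Sprintf("%q", *v)
}

// unifiedDiff renders a script as text, writing a position header
// whenever a new group of consecutive edits starts.
func unifiedDiff(edits []edit, base, target []*string) string {
	var sb strings.Builder
	bi, ti := 0, 0 // cursors into base and target
	inHunk := false
	for k, e := range edits {
		if k > 0 { // the first entry only carries the leading run
			if !inHunk {
				fmt.Fprintf(&sb, "@@ -%d, +%d @@\n", bi, ti)
				inHunk = true
			}
			if e.insert {
				fmt.Fprintf(&sb, "+%s\n", render(target[ti]))
				ti++
			} else {
				fmt.Fprintf(&sb, "-%s\n", render(base[bi]))
				bi++
			}
		}
		if e.runLength > 0 {
			inHunk = false // unchanged elements close the current group
		}
		bi += e.runLength
		ti += e.runLength
	}
	return sb.String()
}

func main() {
	base := []*string{str("one"), str("two"), str("three")}
	target := []*string{str("two"), nil, str("two-and-a-half"), str("three")}
	script := []edit{
		{insert: false, runLength: 0}, // no leading run
		{insert: false, runLength: 1}, // delete "one", keep "two"
		{insert: true, runLength: 0},  // insert null
		{insert: true, runLength: 1},  // insert "two-and-a-half", keep "three"
	}
	fmt.Print(unifiedDiff(script, base, target))
}
```

Keeping the formatter separate from the diff computation means callers who want structured edits do not have to parse a string.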
liujiacheng777 pushed a commit to LoongArch-Python/arrow that referenced this issue May 11, 2023
ArgusLi pushed a commit to Bit-Quill/arrow that referenced this issue May 15, 2023
ArgusLi pushed a commit to Bit-Quill/arrow that referenced this issue May 15, 2023
rtpsw pushed a commit to rtpsw/arrow that referenced this issue May 16, 2023
rtpsw pushed a commit to rtpsw/arrow that referenced this issue May 16, 2023
kou pushed a commit to apache/arrow-go that referenced this issue Aug 30, 2024