Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Go] Compute package closed to adding custom functions. #36936

Closed
cube2222 opened this issue Jul 29, 2023 · 4 comments · Fixed by #36959
Closed

[Go] Compute package closed to adding custom functions. #36936

cube2222 opened this issue Jul 29, 2023 · 4 comments · Fixed by #36959
Assignees
Labels
Component: Go Type: usage Issue is a user question
Milestone

Comments

@cube2222
Copy link
Contributor

Describe the usage question you have. Please include as many useful details as possible.

Hey!

I've tried to use the Go arrow module, specifically the compute package. It contains some handy "infrastructure" for defining functions, kernels for them, matching the right kernels to the type, etc.

However, it seems like adding custom functions to a FunctionRegistry as a consumer of the module is impossible?

Implementing the Function interface requires returning an exec.Kernel - which is a struct from the internal arrow/compute/internal/kernels package, so that's impossible. Same goes for reusing existing utility structures, like the ScalarFunction.

Could you please explain the reason for the API disallowing custom functions (or, probably, showing me what I'm missing to define custom functions)? Thanks a lot in advance!

Component(s)

Go

@cube2222 cube2222 added the Type: usage Issue is a user question label Jul 29, 2023
@cube2222
Copy link
Contributor Author

@zeroshade I hope you don't mind the direct tag, but I've seen your name in most of the compute packages commits, would you be able to help me here?

@zeroshade
Copy link
Member

Hey @cube2222 no worries about the tag!

So, part of disallowing the external custom functions like that was just overlooking it as the interface grew and expanded while trying to keep people from shooting themselves in the foot. Another part of it was to try to encourage people to contribute functions back to the repo, lol.

But you're right, I should make the top level interfaces like exec.Kernel externally accessible to allow registering more custom functions like that. I'll add that to my list of things to do, but if you wanna take a stab at it feel free to file PRs and it'll auto-tag me to review them!

@cube2222
Copy link
Contributor Author

cube2222 commented Jul 31, 2023

So, part of disallowing the external custom functions like that was just overlooking it as the interface grew and expanded while trying to keep people from shooting themselves in the foot. Another part of it was to try to encourage people to contribute functions back to the repo, lol.

Hah, I guess that makes sense @zeroshade! Thanks for the explanation.

Not sure if you have any specific vision on what should and what should not be exported, but based on the structures and interfaces required to register a function, it seems that it makes sense to just make the whole exec package module-public.

In case that makes sense, here's a PR #36959, but no hard feelings if you prefer to approach it differently.

With this change registering functions from "userspace" works like a charm. Quick example:

	myFunc := compute.NewScalarFunction("papaya", compute.Arity{
		NArgs: 1,
	}, compute.FunctionDoc{})
	if err := myFunc.AddNewKernel(
		[]exec.InputType{
			{
				Kind: exec.InputExact,
				Type: arrow.PrimitiveTypes.Int64,
			},
		},
		exec.NewOutputType(arrow.PrimitiveTypes.Int64),
		func(ctx *exec.KernelCtx, span *exec.ExecSpan, result *exec.ExecResult) error {
			inSlice := GetSpanValues(&span.Values[0].Array, 1)
			outSlice := GetSpanValues(result, 1)

			for i := range inSlice {
				outSlice[i] = inSlice[i] + 42
			}
			return nil
		},
		nil,
	); err != nil {
		panic(err)
	}
	execCtx.Registry.AddFunction(myFunc, true)

	arr1Builder := array.NewInt64Builder(allocator)
	for i := 0; i < 1024; i++ {
		arr1Builder.Append(int64(i))
	}
	arr1 := arr1Builder.NewArray()

	arr2Datum, err := compute.CallFunction(
		ctx,
		"papaya",
		nil,
		&compute.ArrayDatum{Value: arr1.Data()},
	)
	if err != nil {
		panic(err)
	}

	log.Println(array.NewInt64Data(arr2Datum.(*compute.ArrayDatum).Value))

works as expected.

All tests pass too.

@zeroshade
Copy link
Member

I think that's a fine solution, ideally I shouldn't have to make many changes to the current structure of those interfaces and If I do they are interfaces which should be easy to use without breaking. Hopefully I don't regret this statement in the future 😆

zeroshade pushed a commit that referenced this issue Aug 7, 2023
### Rationale for this change

As discussed in the issue #36936 , this change makes it possible for the consumers of the module to register custom functions, while taking advantage of the existing infrastructure for matching types and choosing kernels.

### What changes are included in this PR?

Just moving the package one level higher, so that it's module-exported.

### Are these changes tested?

There's no actual code changes other than moving files around and updating imports - existing tests still pass.

### Are there any user-facing changes?

The `compute/exec` is now exported.

* Closes: #36936

Authored-by: Jakub Martin <jakub.wit.martin@gmail.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
@zeroshade zeroshade added this to the 14.0.0 milestone Aug 7, 2023
loicalleyne pushed a commit to loicalleyne/arrow that referenced this issue Nov 13, 2023
…pache#36959)

### Rationale for this change

As discussed in the issue apache#36936 , this change makes it possible for the consumers of the module to register custom functions, while taking advantage of the existing infrastructure for matching types and choosing kernels.

### What changes are included in this PR?

Just moving the package one level higher, so that it's module-exported.

### Are these changes tested?

There's no actual code changes other than moving files around and updating imports - existing tests still pass.

### Are there any user-facing changes?

The `compute/exec` is now exported.

* Closes: apache#36936

Authored-by: Jakub Martin <jakub.wit.martin@gmail.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Go Type: usage Issue is a user question
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants