Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Go] Unable to build with tinygo #32832

Closed
asfimport opened this issue Aug 31, 2022 · 8 comments · Fixed by #35723
Closed

[Go] Unable to build with tinygo #32832

asfimport opened this issue Aug 31, 2022 · 8 comments · Fixed by #35723
Assignees
Milestone

Comments

@asfimport
Copy link

I was hoping to use TinyGo to build WASM binaries with Arrow. TinyGo can generate builds that are 1% the size of those generated with Go (significant for applications hosted on the web).

Arrow's use of reflect.SliceHeader fields limits the portability of the code. For example, the Len and Cap fields are assumed to be int here: https://github.com/apache/arrow/blob/go/v9.0.0/go/arrow/bitutil/bitutil.go#L158-L159

Go's reflect package warns that the SliceHeader "cannot be used safely or portably and its representation may change in a later release."

Attempts to build a WASM binary using the github.com/apache/arrow/go/v10 module result in failures like this:

tinygo build -tags noasm -o test.wasm ./main.go
             
# github.com/apache/arrow/go/v10/arrow/bitutil
../../go/pkg/mod/github.com/apache/arrow/go/v10@v10.0.0-20220831082949-cf27001da088/arrow/bitutil/bitutil.go:158:10: invalid operation: h.Len / uint64SizeBytes (mismatched types uintptr and int)
../../go/pkg/mod/github.com/apache/arrow/go/v10@v10.0.0-20220831082949-cf27001da088/arrow/bitutil/bitutil.go:159:10: invalid operation: h.Cap / uint64SizeBytes (mismatched types uintptr and int)

This happens because TinyGo uses uintptr for the corresponding types: https://github.com/tinygo-org/tinygo/blob/v0.25.0/src/reflect/value.go#L773-L777

This feels like an issue with TinyGo, and it has been ticketed there multiple times (see tinygo-org/tinygo#1284). They lean on the warnings in the Go sources that use of the SliceHeader fields makes code unportable and suggest changes to the libraries that do not heed this warning.

I don't have a suggested fix or alternative for Arrow's use of SliceHeader fields, but I'm wondering if there would be willingness on the part of this package to make WASM builds work with TinyGo. Perhaps the TinyGo authors could also offer suggested changes.

Reporter: Tim Schaub

PRs and other links:

Note: This issue was originally created as ARROW-17584. Please see the migration documentation for further details.

@asfimport
Copy link
Author

Matthew Topol / @zeroshade:
[~tschaub] So far we've been keeping the N-2 concept for support, as unsafe.Slice was introduced in go1.17 it would make sense for us to bump our required version to go1.17 and convert our references to the sliceheader to instead use unsafe.Slice. Can you check / confirm whether or not unsafe.Slice works correctly with TinyGo such that it would be a viable alternative?

@asfimport
Copy link
Author

Tim Schaub:
@zeroshade - it looks like things should work with unsafe.Slice. I found that tinygo also does not yet implement map.LoadAndDelete for sync, but I've proposed adding that here tinygo-org/tinygo#3118

I started to put together a change that converts the casting functions to use unsafe.Slice, but I'm struggling with what the appropriate len/cap should be.

For example, I was assuming bytesTOUint64 in arrow/bitutil/bitutil.go might look something like this:

func bytesToUint64(b []byte) []uint64 {
	return unsafe.Slice((*uint64)(unsafe.Pointer(&b[0])), len(b)/uint64SizeBytes)
}

But I'm uncertain if that is the correct length/capacity. If you have thoughts on the correct implementation, I would be happy to put together a pull request that changes this and the other casting functions.

@asfimport
Copy link
Author

Matthew Topol / @zeroshade:
From what I've been able to determine about it, the value for the second parameter should be the number of values of the resulting type. So, to use your example:

func bytesToUint64(b []byte) []uint64 {
        return unsafe.Slice((*uint64)(unsafe.Pointer(&b[0])), len(b)/uint64SizeBytes)
}

Would be the correct implementation (no need to bother with pulling the SliceHeader in this case.

If you're up for putting together a pull request that would be fantastic. Go for it and just tag me on the PR when you file it and I'll give it a review. Thanks much!

@asfimport
Copy link
Author

Tim Schaub:
Thanks, @zeroshade. Looks like I edited my comment to avoid the use of SliceHeader at about the same time you left your comment :)

I think there are still issues with the changes (tests are failing for me), but I put up a draft PR: #14026

@asfimport
Copy link
Author

Matthew Topol / @zeroshade:
Hey [~tschaub] I figured out the issue you ran into on the draft PR and commented as such with how to fix them :)

@asfimport
Copy link
Author

Apache Arrow JIRA Bot:
This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned per project policy. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.

@chriscasola
Copy link
Contributor

@zeroshade -- Hi from FDS! I just ran across this issue and a previous attempt in #14026 to support tinygo.

We are successfully building using the native GOOS=js GOARCH=wasm go build but the binary size is not ideal for loading into web browsers.

I'm not sure what the cost of supporting tinygo would be but if there is any interest in doing so we would be game to help and test. We are wanting to compile to WASM so that we can use a lot of tooling around Arrow tables that we've implemented in Go, in the browser as well.

@zeroshade
Copy link
Member

Howdy @chriscasola! Hope all is well over there! 😄

If I remember correctly, the thing that was holding back tinygo at that point was an issue with thrift (because Parquet still uses thrift under the hood) and a net interface thing that wasn't implemented/didn't exist in tinygo.

It's likely possible to muck about with build constraints to exclude the parquet libs from a tinygo/wasm build and avoid that issue if it still exists. Feel free to give it a shot and poke about with it in a PR. I don't have the time at the moment to try this myself but I can definitely review a PR if you submit one.

zeroshade pushed a commit that referenced this issue Jun 12, 2023
### Rationale for this change

To support compiling with tinygo which enables use of arrow in environments where binary size is important, like web assembly.

### What changes are included in this PR?

Using an internal JSON package that uses `goccy/go-json` for regular builds as it does currently, but uses the native `encoding/json` for tinygo builds. This is necessary because go-json has a lot of code that is incompatible with tinygo.

Remove dependency on `parquet` package from non-parquet code since it is also incompatible with tinygo.

Other minor tweaks for compatibility with tinygo.

### Are these changes tested?

Should we add a build step that compiles the example with tinygo?

### Are there any user-facing changes?

None.

* Closes: #32832

Lead-authored-by: Chris Casola <ccasola@factset.com>
Co-authored-by: Adam Gaynor <adam.gaynor@factset.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
@zeroshade zeroshade added this to the 13.0.0 milestone Jun 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants