Panic in page.go on bigendian arch #56
Comments
Hello @pavolloffay, thanks for reporting! We might need a few more details to track down this issue. Would you be able to share a file that causes the panic, or steps to generate one? |
The IBM Z (s390x) is big-endian. Is this causing the issue? https://github.com/search?q=repo%3Aparquet-go%2Fparquet-go%20LittleEndian&type=code |
@achille-roussel see #58 to reproduce failing tests on s390x |
The panic in #58 happens as well in |
Hi all, may I ask what the status of this is, please? It is blocking us from upgrading Tempo. |
@pavolloffay I'm going to work on a patch to Parquet. I'm not quite sure I can do it, but I'll try. Without it we can't upgrade. Are you in Zurich? |
Hello @JavaPerformance, progress on the resolution was tracked in #58. There hasn't been much progress lately, but it would be extremely helpful if you have the cycles and can pick it up. I don't have access to a system running on a big-endian architecture, so it's a bit tricky to investigate the remaining issues. |
@pavolloffay, @achille-roussel I tried reproducing the issue on the s390x platform (IBM Z) and I am able to reproduce it. Below is the stack trace from the panic.
In the TestOpenFile table-driven test, the case "TestOpenFile/testdata/alltypes_tiny_pages.parquet" is the one causing the issue.
I fixed this by adding an endianness check and swapping the bytes on big-endian. After this fix I observed other failures that still cause the same test to fail.
Any pointers on the code path executed when the bitpacked condition is true would be really helpful. I suspect the issue lies in this code flow. |
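The "endianness check plus byte swap" approach described in this comment can be sketched roughly as follows. This is an illustrative sketch only, not the patch that was applied: the helper names hostIsBigEndian, nativeLoad32 and toHostOrder32 are hypothetical and do not come from parquet-go.

```go
package main

import (
	"fmt"
	"math/bits"
	"unsafe"
)

// hostIsBigEndian reports whether this CPU stores the most significant
// byte of an integer at the lowest address.
func hostIsBigEndian() bool {
	x := uint16(0x0102)
	return *(*byte)(unsafe.Pointer(&x)) == 0x01
}

// nativeLoad32 reinterprets four bytes as a uint32 in host byte order,
// which is effectively what an unsafe cast of file data does.
func nativeLoad32(b [4]byte) uint32 {
	var v uint32
	*(*[4]byte)(unsafe.Pointer(&v)) = b
	return v
}

// toHostOrder32 (hypothetical helper) fixes up a value that was loaded
// natively from little-endian bytes. It is a no-op on little-endian hosts.
func toHostOrder32(v uint32) uint32 {
	if hostIsBigEndian() {
		return bits.ReverseBytes32(v)
	}
	return v
}

func main() {
	// Little-endian encoding of 0x04030201, as it would appear in a file.
	raw := [4]byte{0x01, 0x02, 0x03, 0x04}
	fmt.Printf("%#x\n", toHostOrder32(nativeLoad32(raw)))
}
```

With the fix-up applied, the decoded value is the same on little-endian and big-endian hosts. Without it, a big-endian host would see the bytes in reversed significance.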
Hi all, is there any update on this? I know that the s390x Tempo user base is small, but we are doing interesting things, and being stuck on Tempo 1.5 is holding us back. @joe-elliott |
unfortunately, i think those using big endian architectures are going to have to dig this one out on their own. we do not have the bandwidth or incentive right now. |
@joe-elliott I understand, I'll make an effort. Forgive me if I ask the odd naive question in the process. |
no worries! i just want to be clear about where we are on this. huge thx for making an attempt 🙏 |
I've been playing around. Although there will probably be many other places in the code that need work, I started with rle.go, which @srinivas-pokala found. The private encodeInt32 and decodeInt32 work just fine and pass the simple test I wrote. However, the exported function DecodeInt32 is a problem and fails on big-endian because of the unsafecasts (I was lazy and just replicated the unsafecasts from DecodeInt32 in my test that directly calls the private decodeInt32). At least it gives me somewhere to start. @joe-elliott is RLE used often? Could it be the reason for the panic in page.go, or is this a case of fix this anyway but keep looking? === RUN TestEncodeDecodeInt32 === RUN TestEncodeDecodeInt32 |
@joe-elliott I've started work on this. A few of those questions I warned you about :)
|
Are you asking from a parquet-go perspective? Like should a file written by a big endian system be readable by a little endian system? I think ideally the storage format would not be dependent on the cpu architecture it was created with. also, it seems like the parquet specification itself has endian requirements and we'd like the repo to be consistent: https://parquet.apache.org/docs/file-format/data-pages/encodings/
Yeah, I think same answer as above. Presumably the tests are reading parquet files that are written using the standard encodings detailed above and a big endian system should be able to read those. cc @achille-roussel in case he has a different viewpoint. |
Agreed on both points. *_bigendian.go and *_bigendian_test.go OK for you? |
binary.LittleEndian is safe. See parquet-go/internal/unsafecast/unsafecast.go, lines 62 to 80 in c916150.
"The layouts mismatch" refers to casting little-endian file data to a big-endian int. There is no way to fix the unsafecast package without incurring significant performance overhead. |
@forsaken628 I realise that I'll have to take a performance hit on Mainframe but my changes won't affect other platforms which will continue using unsafe casts and copies and maintain their existing performance. |
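The trade-off discussed in the two comments above can be illustrated with a portable decode that copies instead of casting. This is a minimal sketch under stated assumptions: decodeInt32LE is a hypothetical helper, not a function from parquet-go, and it only shows why the portable route costs an allocation and a per-value conversion where an unsafe cast costs nothing on little-endian hosts.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeInt32LE (hypothetical helper) converts little-endian bytes into
// a freshly allocated []int32. The result is the same on every CPU
// architecture, at the price of a copy. Trailing bytes that do not form
// a full 4-byte value are ignored.
func decodeInt32LE(src []byte) []int32 {
	dst := make([]int32, len(src)/4)
	for i := range dst {
		dst[i] = int32(binary.LittleEndian.Uint32(src[4*i:]))
	}
	return dst
}

func main() {
	// Two little-endian int32 values: 1 and -1.
	page := []byte{0x01, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff}
	fmt.Println(decodeInt32LE(page))
}
```

A big-endian-only code path could use something along these lines while other platforms keep the zero-copy cast, which is the split proposed in the comment above.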
Hi @pavolloffay, We see that many of the test cases are currently failing on s390x. We tried our best to handle the failures by changing the endianness of the data types. Although that helps in some places, the tests eventually fail at later stages. These are the test case groups that were failing when we started looking into the parquet issue. Of the above test cases, we have successfully fixed the "TestConvertValue/string_to_int96" test case. We need your help and support in understanding the complete parquet package and its functionality, so that we can work on fixing the other test case failures. Warm Regards, |
@Vishwanatha-HD feel free to send your questions, I'll do my best to respond promptly. We can chat on the #parquet-go Gopher Slack channel for more direct communication as well! |
@Vishwanatha-HD @achille-roussel Hi Vishwa, I have been attempting a port for a little while, but I am struggling. If you haven't already, you might want to look at sparse/array.go and add 8 - sizeof(type) to the Index functions. For example:
|
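The commenter's own example did not survive in this copy of the thread. As a stand-in, here is one possible reading of the "8 - sizeof(type)" suggestion, assuming values live in 8-byte slots and a narrower type is read out of a slot through a pointer. The names narrowOffset and lowUint32 are hypothetical and are not taken from sparse/array.go.

```go
package main

import (
	"fmt"
	"unsafe"
)

// hostIsBigEndian reports whether this CPU stores the most significant
// byte of an integer at the lowest address.
func hostIsBigEndian() bool {
	x := uint16(0x0102)
	return *(*byte)(unsafe.Pointer(&x)) == 0x01
}

// narrowOffset (hypothetical helper) returns the byte offset inside an
// 8-byte slot where a value of the given size holds the slot's
// low-order bytes: 0 on little-endian, 8 - size on big-endian.
func narrowOffset(size uintptr) uintptr {
	if hostIsBigEndian() {
		return 8 - size
	}
	return 0
}

// lowUint32 reads the low 32 bits of an 8-byte slot through a pointer,
// adjusting the address for the host byte order.
func lowUint32(slot *uint64) uint32 {
	p := unsafe.Add(unsafe.Pointer(slot), narrowOffset(unsafe.Sizeof(uint32(0))))
	return *(*uint32)(p)
}

func main() {
	slot := uint64(42)
	fmt.Println(lowUint32(&slot))
}
```

Reading at offset 0 on a big-endian host would return the high-order half of the slot, which is zero for small values. The offset adjustment is what keeps the narrow read pointing at the bytes that carry the value.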
Hi, we are seeing the following error on IBM Z 64-bit with Grafana Tempo and the Parquet format. Oddly, the exact same setup works well on other architectures.
I have tried rebuilding Tempo with -tags purego and got the same panic. Any help would be appreciated.