Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Big-endian seems to work: maybe remove misleading requirement on CRAN? #275

Open
barracuda156 opened this issue Mar 6, 2023 · 3 comments · May be fixed by #276
Open

Big-endian seems to work: maybe remove misleading requirement on CRAN? #275

barracuda156 opened this issue Mar 6, 2023 · 3 comments · May be fixed by #276
Assignees
Milestone

Comments

@barracuda156
Copy link

barracuda156 commented Mar 6, 2023

CRAN lists little-endian as a requirement. Why is it so? What may be needed to add big-endian support?

P. S. fstlib claims that it can be compiled on all major platforms, and zstd and lz4 certainly build and work fine on Big-endian platforms.
@MarcusKlik Could you please comment on this?

@barracuda156
Copy link
Author

Hmm, it seems to work fine on ppc:


R version 4.2.3 (2023-03-15) -- "Shortstop Beagle"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: powerpc-apple-darwin10.8.0 (32-bit)

> 
> # required packages
> library(testthat)
> library(fstcore)
> library(lintr)
> 
> # run tests
> test_check("fstcore")
[ FAIL 0 | WARN 0 | SKIP 2 | PASS 11 ]

══ Skipped tests ═══════════════════════════════════════════════════════════════
• On CRAN (2)

[ FAIL 0 | WARN 0 | SKIP 2 | PASS 11 ]
> 
> proc.time()
   user  system elapsed 
  4.561   0.287   4.878 

@barracuda156 barracuda156 changed the title Big-endian support? Big-endian seems to work: maybe remove misleading requirement on CRAN? Apr 9, 2023
@barracuda156
Copy link
Author

Everything works:

R version 4.2.3 (2023-03-15) -- "Shortstop Beagle"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: powerpc-apple-darwin10.8.0 (32-bit)


> 
> # required packages
> library(testthat)
> library(fst)
> 
> # run tests
> test_check("fst")
[ FAIL 0 | WARN 0 | SKIP 1 | PASS 823 ]

══ Skipped tests ═══════════════════════════════════════════════════════════════
• On CRAN (1)

[ FAIL 0 | WARN 0 | SKIP 1 | PASS 823 ]
> 
> proc.time()
   user  system elapsed 
 99.967   1.810 101.811 

@MarcusKlik
Copy link
Collaborator

Hi @barracuda156, you're absolutely right, Big-endian reads and writes work correctly if done on the same system.

The problem is when you transport the resulting fst files from a Big-endian system to a Little-endian system or visa versa. In that case, integer reads can mixed up for example because the byte-orders are reversed. This problem can be solved with an adjustment to the reading algorithm for those cases if there would be a need for that.

So the LZ4 and ZSTD compression used by fst is already Endian-safe (see the block format description) but the read from the decompressed integers to memory is not...

@MarcusKlik MarcusKlik self-assigned this Dec 1, 2023
@MarcusKlik MarcusKlik added this to the Candidate milestone Dec 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants