Builders: Implement non-host endianness writes using a single write. #531

Merged: 1 commit, Jul 14, 2022

AndreasPK
Contributor

We do this by swapping the byte order in memory before we write.

This speeds up int64BE by roughly a factor of three on my machine (Skylake): from 22.7 μs to 7.20 μs in the benchmarks below.

int16BE hardly changes; I guess the write buffer is not a bottleneck for 16-bit words. For wider types, however, it makes a large difference.
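
To illustrate the idea, here is a minimal sketch built only on the public `Data.ByteString.Builder.Prim` API. It is not the code from this PR: the names `word64BEBytewise`, `word64BESwapped`, `word64BESingleWrite` and `encodeBytes` are made up for this example, and the byte-at-a-time version is only an assumed stand-in for the kind of multi-write encoding being replaced. On a little-endian host the value is byte-swapped first and then stored with one host-order write.

```haskell
import Data.Bits (shiftR)
import Data.ByteString.Builder (Builder, toLazyByteString, word8)
import qualified Data.ByteString.Builder.Prim as P
import Data.ByteString.Builder.Prim ((>$<))
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word64, Word8, byteSwap64)
import GHC.ByteOrder (ByteOrder (..), targetByteOrder)

-- Hypothetical baseline: emit the eight bytes one at a time,
-- most significant byte first.
word64BEBytewise :: Word64 -> Builder
word64BEBytewise w =
  mconcat [word8 (fromIntegral (w `shiftR` s)) | s <- [56, 48 .. 0]]

-- Hypothetical single-write variant: if the host is little-endian,
-- swap the byte order of the value, then do one host-order write.
-- On a big-endian host the host order is already the wanted order.
word64BESwapped :: P.FixedPrim Word64
word64BESwapped = case targetByteOrder of
  LittleEndian -> byteSwap64 >$< P.word64Host
  BigEndian    -> P.word64Host

word64BESingleWrite :: Word64 -> Builder
word64BESingleWrite = P.primFixed word64BESwapped

-- Helper for inspecting the encoded bytes.
encodeBytes :: Builder -> [Word8]
encodeBytes = BL.unpack . toLazyByteString
```

Both variants should produce identical output, e.g. `encodeBytes (word64BESingleWrite 0x0102030405060708)` and `encodeBytes (word64BEBytewise 0x0102030405060708)` both give `[1,2,3,4,5,6,7,8]`. The difference is only in how many stores hit the write buffer.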

Before

    int16BE (10000):                                          OK (0.54s)
      7.68 μs ± 215 ns
    int32BE (10000):                                          OK (0.43s)
      11.6 μs ± 342 ns
    int64BE (10000):                                          OK (0.42s)
      22.7 μs ± 669 ns
    word16BE (10000):                                         OK (0.55s)
      7.68 μs ± 218 ns
    word32BE (10000):                                         OK (0.42s)
      11.6 μs ± 355 ns
    word64BE (10000):                                         OK (0.42s)
      22.7 μs ± 728 ns
    floatBE (10000):                                          OK (0.52s)
      14.6 μs ± 392 ns
    doubleBE (10000):                                         OK (0.47s)
      25.7 μs ± 689 ns

After

    int16BE (10000):                                          OK (0.53s)
      7.69 μs ± 223 ns
    int32BE (10000):                                          OK (0.30s)
      7.84 μs ± 397 ns
    int64BE (10000):                                          OK (0.29s)
      7.20 μs ± 590 ns
    word16BE (10000):                                         OK (0.55s)
      7.67 μs ± 217 ns
    word32BE (10000):                                         OK (0.30s)
      7.83 μs ± 393 ns
    word64BE (10000):                                         OK (0.29s)
      7.21 μs ± 595 ns
    floatBE (10000):                                          OK (0.34s)
      8.98 μs ± 328 ns
    doubleBE (10000):                                         OK (0.23s)
      10.8 μs ± 664 ns

Member

@sjakobi sjakobi left a comment

Cheers! :)

@sjakobi sjakobi requested a review from Bodigrim July 14, 2022 15:31
@Bodigrim Bodigrim added this to the 0.11.4.0 milestone Jul 14, 2022
@Bodigrim Bodigrim merged commit e38782a into haskell:master Jul 14, 2022
sjakobi pushed a commit that referenced this pull request Jul 25, 2022
…531)

We do this by swapping the byte order in memory before we write.
This speeds up int64BE by a factor of three.

(cherry picked from commit e38782a)
Development

Successfully merging this pull request may close these issues.

Highly inefficient implementation of word*BE builders for x86 arch.