Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize the implementation of Data.Text.concat #551

Merged
merged 1 commit into from
Feb 1, 2024

Conversation

meooow25
Copy link
Contributor

@meooow25 meooow25 commented Jan 11, 2024

Avoids allocating intermediate lists. Does not make any change to the algorithm.

Fixes #550.

As mentioned in #550 (comment), concat can alternately be implemented using a strategy which makes one pass over the list, filling and growing a buffer. This PR only optimizes the existing implementation and does not change to this strategy.

Output of cabal run text-benchmarks -- -p '$4 == "concat" && $5 == "Text"' +RTS -T with GHC 9.6.3:

Before:
All
  Pure
    tiny
      concat
        Text: OK
          15.5 ns ± 1.3 ns,  79 B  allocated,   0 B  copied,  35 MB peak memory
    ascii-small
      concat
        Text: OK
          24.4 μs ± 1.3 μs, 147 KB allocated,  17 B  copied,  62 MB peak memory
    ascii
      concat
        Text: OK
          41.1 ms ± 3.5 ms, 136 MB allocated,  15 MB copied, 511 MB peak memory
    english
      concat
        Text: OK
          2.17 ms ± 213 μs, 8.5 MB allocated,  14 KB copied, 511 MB peak memory
    russian
      concat
        Text: OK
          2.53 μs ± 194 ns,  18 KB allocated,   0 B  copied, 511 MB peak memory
    japanese
      concat
        Text: OK
          6.00 μs ± 251 ns,  39 KB allocated,   2 B  copied, 511 MB peak memory

After:

  Pure
    tiny
      concat
        Text: OK
          4.67 ns ± 346 ps,  23 B  allocated,   0 B  copied,  35 MB peak memory, 69% less than baseline
    ascii-small
      concat
        Text: OK
          12.1 μs ± 964 ns,  66 KB allocated,   0 B  copied,  73 MB peak memory, 50% less than baseline
    ascii
      concat
        Text: OK
          16.3 ms ± 726 μs,  57 MB allocated, 951 B  copied, 498 MB peak memory, 60% less than baseline
    english
      concat
        Text: OK
          874  μs ±  73 μs, 3.8 MB allocated,  25 B  copied, 498 MB peak memory, 59% less than baseline
    russian
      concat
        Text: OK
          1.65 μs ± 159 ns,  12 KB allocated,   0 B  copied, 498 MB peak memory, 34% less than baseline
    japanese
      concat
        Text: OK
          3.22 μs ± 157 ns,  12 KB allocated,   0 B  copied, 498 MB peak memory, 46% less than baseline

@Bodigrim Bodigrim requested a review from Lysxia January 11, 2024 22:41
Copy link
Contributor

@Lysxia Lysxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good. I left a tiny suggestion.

src/Data/Text.hs Show resolved Hide resolved
src/Data/Text.hs Show resolved Hide resolved
Avoid allocating intermediate lists. The algorithm used is not changed,
the total length is calculated in one pass over the input and the output
Text populated in another pass.

Note: In case there is one non-empty Text and one or more empty Texts,
the non-empty Text is copied whereas it would previously be returned as
is. The behavior is unchanged for every other case.
Copy link
Contributor Author

@meooow25 meooow25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.
Also added a note in the commit message about a case I thought of where the behavior changes from the current implementation.

Note: In case there is one non-empty Text and one or more empty Texts,
the non-empty Text is copied whereas it would previously be returned as
is. The behavior is unchanged for every other case.

src/Data/Text.hs Show resolved Hide resolved
src/Data/Text.hs Show resolved Hide resolved
@meooow25 meooow25 requested a review from Lysxia January 30, 2024 19:18
@Lysxia Lysxia merged commit 1a6a06a into haskell:master Feb 1, 2024
27 checks passed
@meooow25 meooow25 deleted the concat branch February 7, 2024 20:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

concat can be faster
3 participants