Stateful span #437

Merged
merged 2 commits into from
May 18, 2022

Lysxia commented May 16, 2022

Closes #178 and #362
Depends on PR #436

This PR implements spanM (span from the left) and spanEndM (span from the right) for both strict and lazy Text.

spanM is a combination of span and foldr.

  • spanM, as its name indicates, generalizes span to a monadic predicate, of type Char -> m Bool rather than Char -> Bool.
  • spanM generalizes a short-circuiting foldr, by also remembering the location where it stopped so you can easily resume with another search from there.
  • spanM can also implement foldl' with the same space and time efficiency, by making the predicate always True and only using the state effect.
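To make the three points above concrete, here is a minimal sketch. `spanMModel` is a hypothetical reference model written for this comment, not the implementation in this PR: it counts the matching prefix and then calls `T.splitAt`, so it pays for a second traversal and is not tail recursive, unlike what the real function is meant to do. The helper names (`pureSpan`, `digitsUpTo`, `foldlViaSpan`) are also made up for illustration.

```haskell
import Control.Monad.Trans.State.Strict (execState, get, modify', put, runState)
import Data.Char (digitToInt, isDigit)
import Data.Functor.Identity (Identity (..))
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical reference model of spanM (NOT the PR's implementation).
-- It runs the predicate left to right, stops at the first False, and
-- then splits the input at the number of accepted characters.
spanMModel :: Monad m => (Char -> m Bool) -> Text -> m (Text, Text)
spanMModel p t = do
  n <- prefixLen (T.unpack t)
  pure (T.splitAt n t)
  where
    prefixLen [] = pure (0 :: Int)
    prefixLen (c : cs) = do
      ok <- p c
      if ok then (1 +) <$> prefixLen cs else pure 0

-- 1. With a pure predicate lifted into Identity, this is just span.
pureSpan :: (Char -> Bool) -> Text -> (Text, Text)
pureSpan p = runIdentity . spanMModel (pure . p)

-- 2. Short-circuiting fold that remembers where it stopped: accept
--    digits while their running sum stays within the limit. The result
--    carries both the split and the final state.
digitsUpTo :: Int -> Text -> ((Text, Text), Int)
digitsUpTo limit t = runState (spanMModel step t) 0
  where
    step c
      | not (isDigit c) = pure False
      | otherwise = do
          total <- get
          let total' = total + digitToInt c
          if total' > limit
            then pure False
            else put total' >> pure True

-- 3. A left fold: the predicate always answers True and only the
--    state effect matters. modify' forces the accumulator each step.
foldlViaSpan :: (b -> Char -> b) -> b -> Text -> b
foldlViaSpan f z t = execState (spanMModel step t) z
  where
    step c = modify' (\acc -> f acc c) >> pure True
```

For example, `digitsUpTo 6 (T.pack "12345")` yields `(("123","45"),6)`: the search stopped at `'4'`, and the suffix `"45"` is available to resume from.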

I argue these two functions pass the Fairbairn threshold as they provide the ability to safely move back and forth inside a Text. As mentioned in #178, this is currently not possible to do both efficiently and safely: either you keep track of an offset in code points, which then requires a redundant traversal to do splitAt, or you keep track of an offset in bytes, which is unsafe.
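For reference, the safe-but-redundant workaround described above looks roughly like the sketch below. The example task (splitting just after the parenthesis that closes the first group) and the names `balancedOffset` and `splitBalanced` are made up for illustration. The offset is counted in code points, so the split is safe, but `T.splitAt` has to walk the prefix a second time to find the position.

```haskell
import Data.Text (Text)
import qualified Data.Text as T

-- First traversal: offset, in code points, just past the ')' that
-- closes the first parenthesized group (or the end of the input if
-- the group is never closed).
balancedOffset :: Text -> Int
balancedOffset = go 0 0 . T.unpack
  where
    go :: Int -> Int -> String -> Int
    go off _ [] = off
    go off depth (c : cs)
      | c == '(' = go (off + 1) (depth + 1) cs
      | c == ')' && depth <= 1 = off + 1
      | c == ')' = go (off + 1) (depth - 1) cs
      | otherwise = go (off + 1) depth cs

-- Second traversal: T.splitAt walks the prefix again to turn the
-- code-point offset into an actual split position.
splitBalanced :: Text -> (Text, Text)
splitBalanced t = T.splitAt (balancedOffset t) t
```

With spanM, the nesting depth would live in the monadic state of the predicate, and the split would come back directly without the offset or the second pass.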

With spanM one can easily implement a takeWhileM (#362) by ignoring the remaining suffix. I'm not sure if its allocation would be optimized out already, but if not, I can generalize `spanM` with some continuation-passing trick to do that in a future PR.
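A minimal sketch of that derivation, under stated assumptions: `takeWhileMVia` is a hypothetical helper that takes any spanM-shaped function as an argument, and `identitySpanM` is a stand-in built from the existing pure `T.span`, so the sketch does not depend on a text release that ships spanM.

```haskell
import Data.Char (isAlpha)
import Data.Functor.Identity (Identity (..))
import Data.Text (Text)
import qualified Data.Text as T

-- The shape of spanM discussed in this thread.
type SpanM m = (Char -> m Bool) -> Text -> m (Text, Text)

-- Hypothetical helper: takeWhileM from any spanM-shaped function,
-- keeping the matching prefix and dropping the suffix.
takeWhileMVia :: Functor m => SpanM m -> (Char -> m Bool) -> Text -> m Text
takeWhileMVia spanM' p t = fst <$> spanM' p t

-- Stand-in for spanM at m = Identity, for illustration only.
identitySpanM :: SpanM Identity
identitySpanM p = Identity . T.span (runIdentity . p)

-- Example use: the leading alphabetic prefix of the input.
leadingAlpha :: Text -> Text
leadingAlpha = runIdentity . takeWhileMVia identitySpanM (pure . isAlpha)
```

Once the spanM from this PR is available, it would be passed in place of `identitySpanM`. Whether the unused suffix is then optimized away is the open question raised above.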

@Bodigrim

How about spanM :: Monad m => (Char -> m Bool) -> Text -> m (Text, Text)? It is more general and easier to guess semantics from the type signature.

Lysxia commented May 16, 2022

How about spanM :: Monad m => (Char -> m Bool) -> Text -> m (Text, Text)?

Very nice idea, thanks!


@Bodigrim Bodigrim left a comment

Please rebase.

@Lysxia Lysxia force-pushed the statefulSpan branch 2 times, most recently from eb6341e to a5865c8 Compare May 17, 2022 20:16
@Bodigrim Bodigrim merged commit 8cc6ba2 into haskell:master May 18, 2022
Successfully merging this pull request may close these issues.

Add stateful 'span'?