Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose SIMD UTF-8 validation functions from internal module #483

Merged
merged 2 commits into from
Apr 18, 2023

Conversation

andrewthad
Copy link
Contributor

This makes it possible for users to validate ByteArray# when compiling with the SIMDUTF8 flag on. It does not, however, provide any fallbacks for when that flag is off. That is left as future work.

Related to #479

@andrewthad
Copy link
Contributor Author

I've added portable functions that work without SIMDUTF being turned on. This is ready for review.

@andrewthad
Copy link
Contributor Author

Fixed a problem with GHC 8.0. I still need to add some tests to the test suite. The ByteArray# functions have no coverage.

@andrewthad andrewthad force-pushed the expose-simd-validation-functions branch from 21bda65 to ea6a3d6 Compare December 28, 2022 15:50
@andrewthad
Copy link
Contributor Author

Just squashed this down to remove a bunch of meaningless commits.

@andrewthad andrewthad force-pushed the expose-simd-validation-functions branch 2 times, most recently from 7e72724 to 0780ba6 Compare January 5, 2023 15:25
@andrewthad
Copy link
Contributor Author

Just squashed everything down.

@andrewthad
Copy link
Contributor Author

This is ready for another round of review now that support for GHC 8.0 is gone. The only changes since last time are documentation. I've provided the motivation for the module in the module's haddock, and I've documented the behaviors of the functions.

src/Data/Text/Internal/Validate.hs Outdated Show resolved Hide resolved
src/Data/Text/Internal/Validate.hs Outdated Show resolved Hide resolved
src/Data/Text/Internal/Validate.hs Outdated Show resolved Hide resolved
src/Data/Text/Internal/Validate.hs Show resolved Hide resolved
text.cabal Outdated Show resolved Hide resolved
src/Data/Text/Internal/Validate.hs Outdated Show resolved Hide resolved
src/Data/Text/Internal/Validate.hs Show resolved Hide resolved
src/Data/Text/Internal/Validate.hs Outdated Show resolved Hide resolved
@andrewthad
Copy link
Contributor Author

Failure on GHC 8.6.5 looks spurious.

@andrewthad andrewthad force-pushed the expose-simd-validation-functions branch from 8dfa76f to b7ffc31 Compare January 12, 2023 16:49
@andrewthad
Copy link
Contributor Author

Just squashed. This will kick off another round of CI.

@andrewthad
Copy link
Contributor Author

CI is still failing, but I’m pretty sure that it’s spurious. This is ready for another round of review

@andrewthad
Copy link
Contributor Author

Wait, something really is wrong on the ppcle64 architecture. For some reason, the dependency on the byte array compatibility library doesn’t get picked up, and so it ends up missing the module. I don’t see anything wrong in the Cabal file though.

@Lysxia
Copy link
Contributor

Lysxia commented Jan 12, 2023

The emulated.yml workflow uses ghc --make instead of cabal to build, due to memory constraints according to the comment in emulated.yml. So the extra package probably needs to be installed explicitly somehow.

@Bodigrim
Copy link
Contributor

@andrewthad necessary changes are in d0b70f7. I think you squashed a bit overzealously.

@andrewthad
Copy link
Contributor Author

I’ve incorporated the diff manually. Thanks. Looks like that fixed it.

@andrewthad
Copy link
Contributor Author

Only remaining CI failure seems unrelated to the change.

@andrewthad
Copy link
Contributor Author

Is there any additional feedback, or is this good to go?

@Bodigrim
Copy link
Contributor

It's good to go, but on hold waiting for #474 to land (which in its turn waits until we cut a release to accompany GHC 9.6).

@Bodigrim
Copy link
Contributor

@Lysxia given that #448 has landed, could you take a look at what remains of this PR? I assume the new design is flexible enough already, but we might still want to expose low-level calls to simdutf?..

I’m AFK and cannot take a close look myself.

@andrewthad
Copy link
Contributor Author

Is there anything I can do to move this forward? Should I resolve the merge conflict, or are there concerns about this feature no longer fitting into the design of text?

@Lysxia
Copy link
Contributor

Lysxia commented Mar 9, 2023

We can still accept this. The bytearray validation functions are new. And while we now have a fancier bytestring validation function (now in Data.Text.Internal.Encoding), adding the bool version would look consistent with the bytearray ones.

src/Data/Text/Internal/Validate.hs Show resolved Hide resolved
-- @safe@ FFI call is executing. The byte array argument /must/ be pinned,
-- and the calling context is responsible for enforcing this. If the
-- byte array is not pinned, this function's behavior is undefined.
isValidUtf8ByteArraySafe ::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of Unsafe/Safe why not Unpinned/Pinned?

The Safe/Unsafe naming is confusing since "safe FFI" calls are unsafe and "unsafe FFI" calls are safe. The FFI functions themselves don't need to be renamed.

We could also consider Unsafe/Pinned but this still gives the wrong impression that the pinned one is safer than Unsafe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right. Pinned/unpinned is much more clear. I'll change this.

src/Data/Text/Internal/Validate.hs Outdated Show resolved Hide resolved
@andrewthad
Copy link
Contributor Author

Responding to the comment about the 128KiB limit for switching from unsafe to safe ffi:

No reason for that particular number (it's just an arbitrary large power of two). Here is everything that I understand about safe FFI calls. First, I'm provide a list of primary sources, and then I'll offer my interpretation of this. My interpretation could be wrong since I've never actually read the source code for this part of the GHC runtime.

  • GHC User Manual section 6.17.1.2: "A safe call calling into haskell is run on a bound thread by the RTS. This means any nesting of safe calls will be executed on the same operating system thread. Sequential safe calls however do not enjoy this luxury and may be run on arbitrary OS threads."
  • GHC Issue 13296: Slyfox writes "ghc's RTS does not limit amount of outstanding safe FFIs". rwbarton measures that the overhead of safe FFI call is 100ns in a particular situation. Simon Marlow implies that the estimate of 100ns is reasonable.
  • GHC Issue 11075: A discussion of how safe FFI calls can spawn an unlimited number of OS threads, which essentially thwarts the GHC scheduler (this one is a little pathological and seems like it was more of a thought experiment than something observed in a production environment). Simon Marlow suggests a lightweight extension to the FFI. Some code for this actually existed at some point, but it was never merged into GHC.

Here's is my analysis. The phrase "nesting of safe calls" in the GHC User Manual is talking about calling Haskell from C, but I interpret this entire section to mean either, or both, of these:

  • If you perform a safe FFI call and then call back into Haskell code from C, and then that Haskell code makes another safe FFI call, then you stay in the same thread. (the original intent)
  • If you perform a safe FFI call inside of a bound thread (created by forkOS), GHC does not have to transfer control to another OS thread. It uses the one you are already in. (something that would also make sense)

Regardless of which one of these is the case, it is certain that if you make a safe FFI call from an unbound thread (a green thread), the runtime has to change the execution context from the green thread's current capability's OS thread to a different OS thread. These steps must happen:

  1. Find an OS thread in the bound thread pool, or create a new one if all the existing ones are busy.
  2. Communicate to this OS thread the task it needs to do (not sure how this works exactly)
  3. Put the green thread to sleep, relinquishing its capability, and add the green thread to a wait queue that will wake it back up when the result from the bound OS thread is done (this is very similar to calling takeMVar but it might be totally different code in the runtime that handles this). This might cause a different green thread (one that had been waiting for a capability to become available) to start running on that capability.
  4. Wake the green thread back up once the result is ready. Once a capability becomes available, the green thread can start running on it, although this will not necessarily be the same capability it was originally running on.

That's a lot of stuff going on. There are two pitfalls to the safe FFI:

  1. It involves transferring control between thread and synchronization between threads, both of which can be expensive, but the context (like if you are already running in a bound thread) might change the cost.
  2. The concerns from GHC Issue 11075, where the safe FFI thwarts the GHC scheduler.

For these reasons, I tend to be cautious about using the safe FFI for compute intensive things. For anything that might take longer than 1ms, I think the safe FFI makes sense because blocking GC sync for a stop-the-world collector is really bad. But for things that are certainly under 500us, I think the unsafe FFI makes sense. The numbers 1ms and 500us are arbitrary. I cannot justify those with anything other than intuition about what makes sense. But for the 128KiB breakpoint that I chose, at a speed of validating 1B/ns (a conservative estimate), it takes 128us to validate. I think that performing an uninterruptible (preventing GHC across all threads) 128us operation ought to be fine. But maybe that threshold should be lower. Maybe 64KiB or 32KiB would be better. I do think that just using "is it pinned?" alone to make the decision is not great. Everything above 3KiB is pinned, and I don't think that the pitfalls of the safe FFI are worth making sure that GC can run concurrently with a 3us operation.

It's a little flimsy on hard evidence, but that's the best argument for it I can provide. I'm fine with changing the switch-to-safe-ffi-at-128kib behavior to something else if you prefer though.

@Lysxia
Copy link
Contributor

Lysxia commented Mar 9, 2023

That's a very clear analysis! That makes sense to me. Confirming this further, the GHC User Manual also says:

This means that if you need to make a foreign call to a function that takes a long time or potentially blocks, then you should mark it safe and use -threaded. Some library functions make such calls internally; their documentation should indicate when this is the case.

On the other hand, a foreign call to a function that is guaranteed to take a short time, and does not call back into Haskell can be marked unsafe.

@Bodigrim
Copy link
Contributor

@andrewthad could you please rebase?

@andrewthad andrewthad force-pushed the expose-simd-validation-functions branch from 924ada3 to 5620537 Compare March 20, 2023 12:36
This makes it possible for users to validate that a ByteArray is
a well formed UTF-8 sequence. This works both with and without the
SIMDUTF flag, falling back to a pure-Haskell implementation when
SIMDUTF is off.

Pick up a dependency on data-array-byte to be able to refer to
the lifted ByteArray type on older versions of GHC. Use data-array-byte
dep for emulated workflow.
@andrewthad andrewthad force-pushed the expose-simd-validation-functions branch from 5620537 to e04674e Compare March 20, 2023 12:51
@andrewthad
Copy link
Contributor Author

I've just now squashed everything down and rebased.

src/Data/Text/Internal/Validate.hs Outdated Show resolved Hide resolved
text.cabal Outdated Show resolved Hide resolved
text.cabal Show resolved Hide resolved
src/Data/Text/Internal/Validate/Simd.hs Outdated Show resolved Hide resolved
@Bodigrim Bodigrim requested a review from Lysxia April 3, 2023 19:35
@andrewthad
Copy link
Contributor Author

I'm not able to make sense of the NetBSD test failure output.

@andrewthad
Copy link
Contributor Author

Is there anything else that I can improve in this PR, or is this ready to be merged?

@Bodigrim Bodigrim merged commit 6c2ad14 into haskell:master Apr 18, 2023
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants