Provide conversions between PosixPath and WindowsPath #214

Open
Bodigrim opened this issue Dec 11, 2023 · 3 comments

Comments

@Bodigrim
Contributor

haskell/tar#88 introduces

-- | We assume UTF-8 on posix and UTF-16 on windows.
toWindowsPath :: MonadThrow m => PosixPath -> m WindowsPath
toWindowsPath posix = do
  str <- PFP.decodeUtf posix
  win <- WFP.encodeUtf str
  pure $ WS.map (\c -> if WFP.isPathSeparator c then WFP.pathSeparator else c) win

-- | We assume UTF-8 on posix and UTF-16 on windows.
toPosixPath :: MonadThrow m => WindowsPath -> m PosixPath
toPosixPath win = do
  str   <- WFP.decodeUtf win
  posix <- PFP.encodeUtf str
  pure $ PS.map (\c -> if PFP.isPathSeparator c then PFP.pathSeparator else c) posix

IMHO such utilities would be better provided by filepath itself, ideally optimized to a single pass without any intermediate structures.

@hasufell
Member

hasufell commented Dec 12, 2023

Our primitives rely on the TextEncoding modules in base and use peekCStringLen etc.

See:

-- | Decode with the given 'TextEncoding'.
decodeWithTE :: TextEncoding -> BS8.ShortByteString -> Either EncodingException String
decodeWithTE enc ba = unsafePerformIO $ do
  r <- try @SomeException $ BS8.useAsCStringLen ba $ \fp -> GHC.peekCStringLen enc fp
  evaluate $ force $ first (flip EncodingError Nothing . displayException) r

-- | Encode with the given 'TextEncoding'.
encodeWithTE :: TextEncoding -> String -> Either EncodingException BS8.ShortByteString
encodeWithTE enc str = unsafePerformIO $ do
  r <- try @SomeException $ GHC.withCStringLen enc str $ \cstr -> BS8.packCStringLen cstr
  evaluate $ force $ first (flip EncodingError Nothing . displayException) r

The encoder/decoder API doesn't work well with non-String types, AFAIR: https://hackage.haskell.org/package/base-4.19.0.0/docs/GHC-IO-Encoding.html

Because e.g. TextDecoder is fixed to Char: type TextDecoder state = BufferCodec Word8 CharBufElem state.

How do you propose we get the API with TextEncoding for free and avoid intermediate representations? Can we not just rely on list fusion or so?

@Bodigrim
Contributor Author

> Can we not just rely on list fusion or so?

Given that PFP.decodeUtf is monadic, all effects must happen before we proceed to the next line. This almost certainly prevents any list fusion.

I suggest writing toWindowsPath and toPosixPath manually, without relying on GHC.IO.Encoding. Converting UTF-8 to UTF-16 and back is reasonably simple.
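
To give a rough idea of what a manual conversion involves, here is a minimal sketch of the UTF-8 to UTF-16 direction with the separator rewrite folded into the same pass. It is not filepath's API: the names utf8ToUtf16Win and continue are hypothetical, and it works on plain lists of code units purely for illustration, whereas a real implementation would presumably read from and write to the underlying ShortByteString directly. It rejects truncated sequences, overlong encodings and encoded surrogates.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word16, Word32, Word8)

-- | Hypothetical helper (not part of filepath): transcode UTF-8 bytes to
-- UTF-16 code units in one pass, rewriting '/' (0x2F) to '\\' (0x5C).
-- Returns Nothing on malformed input.
utf8ToUtf16Win :: [Word8] -> Maybe [Word16]
utf8ToUtf16Win [] = Just []
utf8ToUtf16Win (b0 : rest)
  -- ASCII: one byte, one code unit; this is where separators are rewritten.
  | b0 < 0x80 = (sep b0 :) <$> utf8ToUtf16Win rest
  -- Lead bytes of 2-, 3- and 4-byte sequences, with the smallest code
  -- point each length may legally encode (used to reject overlong forms).
  | b0 .&. 0xE0 == 0xC0 = continue 1 (fromIntegral (b0 .&. 0x1F)) 0x80 rest
  | b0 .&. 0xF0 == 0xE0 = continue 2 (fromIntegral (b0 .&. 0x0F)) 0x800 rest
  | b0 .&. 0xF8 == 0xF0 = continue 3 (fromIntegral (b0 .&. 0x07)) 0x10000 rest
  -- Stray continuation byte or invalid lead byte.
  | otherwise = Nothing
  where
    sep b = if b == 0x2F then 0x5C else fromIntegral b

-- | Consume the remaining continuation bytes of a sequence, then validate
-- the accumulated code point and emit one or two UTF-16 code units.
continue :: Int -> Word32 -> Word32 -> [Word8] -> Maybe [Word16]
continue 0 cp minCp bs
  | cp < minCp = Nothing                    -- overlong encoding
  | cp >= 0xD800 && cp <= 0xDFFF = Nothing  -- surrogates are not valid in UTF-8
  | cp > 0x10FFFF = Nothing                 -- beyond the Unicode range
  | cp < 0x10000 = (fromIntegral cp :) <$> utf8ToUtf16Win bs
  | otherwise =
      -- Supplementary plane: split into a high/low surrogate pair.
      let v = cp - 0x10000
          hi = 0xD800 .|. fromIntegral (v `shiftR` 10)
          lo = 0xDC00 .|. fromIntegral (v .&. 0x3FF)
       in (\xs -> hi : lo : xs) <$> utf8ToUtf16Win bs
continue n cp minCp (b : bs)
  | b .&. 0xC0 == 0x80 =
      continue (n - 1) ((cp `shiftL` 6) .|. fromIntegral (b .&. 0x3F)) minCp bs
continue _ _ _ _ = Nothing -- truncated sequence or bad continuation byte
```

For example, the bytes of "a/b" come out as the code units for "a\b", and a 4-byte sequence becomes a surrogate pair. The opposite direction would follow the same pattern: combine surrogate pairs into a code point, emit one to four bytes, and rewrite the separator on the way.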

@hasufell
Member

That sounds hard. Can we cry for help?
