[AFP] provide an (AbstractFilePath -> Data.ByteString.Builder.Builder) #157
Comments
In GitLab by @maerwald on Jun 22, 2022, 17:05
What is the use case for converting to bytestring? The main reason we make this hard is that Windows and Unix filepaths have different encodings (and underlying syscall types), and converting to bytes has difficult-to-understand semantics. If you want to print the bytes, that may be fine, but it may lead people into thinking they can use some bytestring APIs safely. This way you're forced to use internal modules, at least. On the other hand, I already added https://hackage.haskell.org/package/filepath-2.0.0.3/candidate/docs/System-AbstractFilePath-Internal.html#v:bytesToAFP so a variant that does the opposite could be added too, but definitely only in internal modules to signal this is unsafe.
Absolutely not! The entire point of AFPP is to make writing cross platform code safer.
#if defined(mingw32_HOST_OS) || defined(__MINGW32__)
type PlatformString = WindowsString
#else
type PlatformString = PosixString
#endif
So matching on This translates to AbstractFilePath, because:
newtype OsString = OsString PlatformString
type AbstractFilePath = OsString
OsString/AbstractFilePath hide those implementation details behind a newtype. So, if you write cross platform code, you have to do it like this:
fromPlatformStringEnc :: TextEncoding
                      -> AbstractFilePath
                      -> Either EncodingException String
#if defined(mingw32_HOST_OS) || defined(__MINGW32__)
fromPlatformStringEnc winEnc (OsString (WS ba)) = unsafePerformIO $ do
  r <- try @SomeException $ BS8.useAsCStringLen ba $ \fp -> GHC.peekCStringLen winEnc fp
  evaluate $ force $ first (flip EncodingError Nothing . displayException) r
#else
fromPlatformStringEnc unixEnc (OsString (PS ba)) = unsafePerformIO $ do
  r <- try @SomeException $ BS.useAsCStringLen ba $ \fp -> GHC.peekCStringLen unixEnc fp
  evaluate $ force $ first (flip EncodingError Nothing . displayException) r
#endif
This allows you to safely write APIs for AbstractFilePath (whose semantics depend on the current platform), as well as modules that are platform specific and work the same across all platforms (see https://hackage.haskell.org/package/filepath-2.0.0.3/candidate/docs/System-AbstractFilePath-Windows.html). So if you want to deal with tar archives, you simply use So no, we don't want any coercions. I will explain this more in depth in a blog post after release. Also note: using
In GitLab by @frasertweedale on Jun 22, 2022, 21:21
To clarify, the Alternatively (equivalently?), something like
In GitLab by @frasertweedale on Jun 22, 2022, 21:57
There must be many. Here's one: browse filesystem, choose a file to attach to mail, set Granted, you do not want to attach the bytes as-is. In general they must be decoded first, then re-encoded to what the application requires. But filepath only gives a way to convert to So thinking this through more, I agree that offering a conversion to Maybe all of these performance issues are all in my head as paths are typically not that long, and all those fully evaluated
In GitLab by @maerwald on Jun 22, 2022, 22:34
My thinking was:
I'm not sure I entirely understand how
In GitLab by @maerwald on Jun 22, 2022, 22:43
Ok, so from what I understand, you basically want a function like this:
reEncode :: (TextEncoding, TextEncoding) -- ^ unix (decoder, encoder)
         -> (TextEncoding, TextEncoding) -- ^ windows (decoder, encoder)
         -> AbstractFilePath
         -> Either EncodingException ShortByteString
So basically re-encoding. E.g. if you want all filepaths to be converted to UTF-8, regardless of platform. And then you want the intermediate common representation
I don't think we can fuse it, because we can't use
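The re-encoding idea discussed in this comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the filepath API: `reEncodeBytes` and `sampleUtf16` are hypothetical names, the function works on a plain `ByteString` rather than on `AbstractFilePath`, and it runs in `IO` (encoding failures surface as exceptions) instead of returning `Either EncodingException`. It goes through an intermediate `String`, which is exactly the unfused step the thread is concerned about.

```haskell
import qualified Data.ByteString as BS
import qualified GHC.Foreign as GHC
import System.IO (TextEncoding, utf16le, utf8)

-- | Decode raw path bytes with one encoding, then encode the resulting
-- String with another. Hypothetical helper, not part of filepath.
-- Decoding or encoding failures are raised as IO exceptions.
reEncodeBytes :: TextEncoding      -- ^ decoder for the input bytes
              -> TextEncoding      -- ^ encoder for the output bytes
              -> BS.ByteString
              -> IO BS.ByteString
reEncodeBytes decoder encoder bytes = do
  -- Step 1: bytes -> String (the intermediate representation).
  str <- BS.useAsCStringLen bytes (GHC.peekCStringLen decoder)
  -- Step 2: String -> bytes in the target encoding.
  GHC.withCStringLen encoder str BS.packCStringLen

-- | The two-character name "ab" encoded as UTF-16LE, as a Windows-style
-- path would be stored (assumption for illustration only).
sampleUtf16 :: BS.ByteString
sampleUtf16 = BS.pack [0x61, 0x00, 0x62, 0x00]
```

Calling `reEncodeBytes utf16le utf8 sampleUtf16` converts the UTF-16LE input to UTF-8. A cross-platform wrapper would pick the decoder based on the platform, which is what the two `(decoder, encoder)` pairs in the signature above are for.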
In GitLab by @frasertweedale on Jun 22, 2022, 22:49
Fair enough. The more I think about it, I think what I'm asking for is premature optimisation and I just ought to use the AFP as it spreads through enough of the core libraries. The pain points and performance issues, if any, will reveal themselves in time. Thanks for all your feedback; I'll close this ticket now.
In GitLab by @maerwald on Jun 23, 2022, 05:42
Well, I'm glad for input. Not many users so far have commented on this whole thing.
In GitLab by @frasertweedale on Jun 22, 2022, 09:54
The main API only gives you a way to convert to String, but doesn't give you any
convenient and efficient way of getting at the bytes (e.g. to print or send to a
handle).
It is unfortunate to have to dig into System.OsString.Internal.Types and get at the inner values via the constructors just to print/send AbstractFilePath values without intermediate conversion to String. Furthermore, without Coercions (or equivalent) it is tricky to write cross-platform code - because you don't know if you have PosixString or WindowsString "under the hood".
Ideally there should be a function in the main module that gives you a Builder for the AbstractFilePath. Perhaps also Maybe Coercions to the underlying types (in the Internal module).
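For illustration, the requested conversion could look roughly like the sketch below. It does not use the real filepath types: `RawPath` is a hypothetical stand-in for a newtype over `ShortByteString`, and `rawPathBuilder` and `example` are made-up names. The only library calls are `shortByteString` and `toLazyByteString` from bytestring.

```haskell
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Short as SBS

-- | Hypothetical stand-in for the library's path type: a newtype over
-- the raw, platform-encoded bytes. Not the actual filepath definition.
newtype RawPath = RawPath SBS.ShortByteString

-- | Emit the path bytes as-is, with no decoding or re-encoding.
-- The output is whatever encoding the platform stored, so the same
-- logical path may yield different bytes on different platforms.
rawPathBuilder :: RawPath -> B.Builder
rawPathBuilder (RawPath bytes) = B.shortByteString bytes

-- | The bytes of "/tmp" in a single-byte-per-character encoding,
-- run through the Builder and materialised as a lazy ByteString.
example :: BL.ByteString
example =
  B.toLazyByteString (rawPathBuilder (RawPath (SBS.pack [47, 116, 109, 112])))
```

Because the Builder copies the stored bytes unchanged, a consumer that expects one particular encoding (for example UTF-8) would still need to know which platform produced the path.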