Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add AFPP support #136

Merged
merged 1 commit into from
Sep 8, 2022
Merged

Add AFPP support #136

merged 1 commit into from
Sep 8, 2022

Conversation

hasufell
Copy link
Member

@hasufell hasufell commented May 8, 2022

This solution is quite CPP heavy.

Context

Dependent PRs

Caveats

  • some code from https://github.com/hasufell/file-io needed to be inlined
  • there's probably some more code we need from base
  • we drop support for older unix and Win32, otherwise we can't provide the new API
  • this also raises the minimum version for bytestring and filepath, there's really no way around this
  • it should still be theoretically buildable with older GHCs

There are currently no additional tests, but the FilePath version shares most of the code with AbstractFilePath.

CI seems to partly fail, because it uses v1.

cabal.project Outdated Show resolved Hide resolved
directory.cabal Outdated Show resolved Hide resolved
System/Directory/Internal/Posix/Template.hs Outdated Show resolved Hide resolved
System/Directory/Template.hs Outdated Show resolved Hide resolved
System/Directory/Internal/Common/AbstractFilePath.hs Outdated Show resolved Hide resolved
@hasufell
Copy link
Member Author

hasufell commented May 8, 2022

Generally, there's probably some optimization that can be done in a couple of places by avoiding the unpack/pack roundtrips and instead use something like uncons/unsnoc etc. on the underlying ShortByteStrings, but that would somewhat complicate the pattern matching code. Probably not worth it for now.

@hasufell
Copy link
Member Author

hasufell commented May 8, 2022

@Bodigrim

Copy link
Member

@Rufflewind Rufflewind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A change of this magnitude merits some more upfront discussion, e.g. starting a thread in Issues. Without discussing the design beforehand, there is a risk that a Pull Request may incur significant churn.

directory.cabal Outdated Show resolved Hide resolved
directory.cabal Outdated Show resolved Hide resolved
System/Directory/Internal/Posix/Template.hs Outdated Show resolved Hide resolved
System/Directory/Template.hs Outdated Show resolved Hide resolved
System/Directory/Template.hs Outdated Show resolved Hide resolved
Copy link
Member

@Rufflewind Rufflewind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Updating PR status)

@hasufell
Copy link
Member Author

hasufell commented May 9, 2022

A change of this magnitude merits some more upfront discussion, e.g. starting a thread in Issues. Without discussing the design beforehand, there is a risk that a Pull Request may incur significant churn.

I've opened several discussions and threads wrt AFPP over the past year, on discourse, mailing lists etc, but there wasn't really much engagement. I don't have the energy to single handedly convert the entire ecosystem.

The PRs I've done until now already constitute 20k+ LOC patches and several months worth of work and discussion.

So I'm relying on people being excited about this change.

@Rufflewind
Copy link
Member

I've opened several discussions and threads wrt AFPP over the past year, on discourse, mailing lists etc, ...

There is a distinction between having a high-level discussion about the ecosystem as a whole, which you have done appropriately on the forums and mailing lists, and having a discussion specific to each of the impacted libraries like directory, which is what the Issue tracker is better suited for. If a discussion had been made, then we could have reviewed the potential approaches ahead of time, avoiding wasted efforts.

In the current state, the Pull Request would lead to significant regressions in maintainability and consequently I am unable to recommend merging it as is. I will leave the PR open for a couple months in case other people have the bandwidth to iterate on it further. Feel free to ask around.

I do appreciate the work you have put into this project so far. AFPP is a large and ambitious effort. I wish you the best of luck.

@hasufell
Copy link
Member Author

hasufell commented Jul 1, 2022

Anything left here?

@hasufell hasufell force-pushed the AFPP branch 3 times, most recently from 539e41a to 15338c0 Compare July 1, 2022 23:53
@Rufflewind Rufflewind self-assigned this Jul 6, 2022
Copy link
Member

@Rufflewind Rufflewind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update. The main issues that I see as blockers are:

  • There are several widely path-related helper functions that don't exist in filepath, but are being implemented in directory. This signals a usability problem; it may be that there are missing utility functions in the new API. Pushing the responsibility onto directory just masks the problems.

    Specifically, I think the new filepath API is severely underestimating how often a user might want to convert from String into OsString and back. The API for doing those are very clunky right now, which will make it hard to advocate adoption in practical projects. At the very least, an official IsString instance might be useful to have?

  • I also have concerns regarding the excessive use of the word "unsafe", which can end up creating a bit of a cry-wolf problem. "Unsafe" should be reserved for things that are truly dangerous and therefore very rarely used.

  • System.File.Posix/Windows don't belong here. They rightfully belong in unix and Win32. Yes, directory does carry a bunch of platform-specific baggage today, but those are just workarounds for missing functionality. Now that we are dropping compatibility with older versions of unix and Win32, there's no reason to not upstream those capabilities.

Separately, I feel like the whole AFPP project is pushing a lot of additional burden on the library maintainers (non-end-users) who need to maintain two parallel APIs.

  • Is there any way this cost can be mitigated better so that library maintainers don't have to individually cook up their own hacky solutions with CPP or other code generation trickery? Would a type class work better here? Is there a forum where people can exchange ideas?

  • Is there a clear timeline on when the old FilePaths will go away?

tests/TestUtils.hs Show resolved Hide resolved
System/Directory/Internal/Common/OsPath.hs Outdated Show resolved Hide resolved
System/Directory/Internal/Common/OsPath.hs Outdated Show resolved Hide resolved
System/Directory/Internal/Common/OsPath.hs Outdated Show resolved Hide resolved
System/Directory/Internal/Common/OsPath.hs Outdated Show resolved Hide resolved
System/Directory/Internal/WindowsFFI/OsPath.hsc Outdated Show resolved Hide resolved
System/Directory/Internal/WindowsFFI/OsPath.hsc Outdated Show resolved Hide resolved
System/Directory/Internal/Windows/OsPath.hs Outdated Show resolved Hide resolved
System/Directory/Internal/Windows/OsPath.hs Outdated Show resolved Hide resolved
System/Directory/Internal/OsPath.hs Outdated Show resolved Hide resolved
@hasufell
Copy link
Member Author

hasufell commented Jul 10, 2022

Thanks for the update. The main issues that I see as blockers are:

  • There are several widely path-related helper functions that don't exist in filepath, but are being implemented in directory. This signals a usability problem; it may be that there are missing utility functions in the new API. Pushing the responsibility onto directory just masks the problems.

I cannot follow. Can you be more specific?

Specifically, I think the new filepath API is severely underestimating how often a user might want to convert from String into OsString and back. The API for doing those are very clunky right now, which will make it hard to advocate adoption in practical projects. At the very least, an official IsString instance might be useful to have?

I disagree. There are 3 functions to convert from String, all are well documented:

Likewise, there are 3 functions to convert from OsPath to String:

Can you explain precisely what is wrong with them? Your comment it too vague to be actionable.

  • I also have concerns regarding the excessive use of the word "unsafe", which can end up creating a bit of a cry-wolf problem. "Unsafe" should be reserved for things that are truly dangerous and therefore very rarely used.

This again is vague. unsafeFromChar (which you might be referring to) is safe for ascii literals, which is where it's used.

  • System.File.Posix/Windows don't belong here. They rightfully belong in unix and Win32. Yes, directory does carry a bunch of platform-specific baggage today, but those are just workarounds for missing functionality. Now that we are dropping compatibility with older versions of unix and Win32, there's no reason to not upstream those capabilities.

No, they don't belong in unix or Win32. Those packages are bindings, they don't add additional functions. These functions (as I already explained) will be added to the file-io package. But since that package is not a boot library, we have to inline these functions here.

Separately, I feel like the whole AFPP project is pushing a lot of additional burden on the library maintainers (non-end-users) who need to maintain two parallel APIs.

Excuse me? I implemented two approaches for you from scratch. The second approach does exactly what you wanted: it uses AFPP types for the core and provides a shim wrapper for String. There is no additional maintenance burden.

  • Is there any way this cost can be mitigated better so that library maintainers don't have to individually cook up their own hacky solutions with CPP or other code generation trickery? Would a type class work better here? Is there a forum where people can exchange ideas?

There are no code generation tricks involved in the latest patch. I don't understand what you mean.

  • Is there a clear timeline on when the old FilePaths will go away?

They will not go away.


I'd appreciate if I can get a clear answer here whether this patch will be accepted. I'm suffering from RSI and patches like these cause me physical pain, because they require a lot of typing.

I am fine with forking the directory package too, if that is what is required to get adoption for abstract filepath.

@Rufflewind
Copy link
Member

I cannot follow. Can you be more specific?

I disagree. There are 3 functions to convert from String, all are well documented:

Let's continue the discussion upstream: haskellfoundation/tech-proposals#35 (comment)

Can you explain precisely what is wrong with them? Your comment it too vague to be actionable.

System.Directory.Internal.Common.OsPath should not be necessary to begin with. Leaving aside the re-exports, which can be avoided trivially, it provides several helper functions to cover several common use cases in the rest of the directory modules. I think those utility functions are better off in filepath instead. After all, if directory has uses for them, wouldn't other libraries find them useful too?

This again is vague. unsafeFromChar (which you might be referring to) is safe for ascii literals, which is where it's used.

There are several comments regarding the use of "unsafe":

No, they don't belong in unix or Win32. Those packages are bindings, they don't add additional functions. These functions (as I already explained) will be added to the file-io package. But since that package is not a boot library, we have to inline these functions here.

In that case, would it be possible to either (1) convert file-io into a boot package or (2) merge file-io into filepath instead?

Excuse me?

I am referring to the issue that the API and documentation have been duplicated in both Directory and Directory.OsPath.

While I'm not asking for further major changes to this directory patch, I think there needs to be some reflection on the general migration cost to the ecosystem as a whole. The AFPP proposal is not just causing churn, but it is asking a lot of libraries to maintain two APIs in parallel, which is not just a one-time cost but an ongoing one.

Ideally, it'd be nice to see these costs are either addressed or mitigated by the AFPP proposal. At the very least, it is worth documenting these drawbacks in the main proposal or the tutorial (hasufell/hasufell.github.io#4).

I can probably find my own mechanism to mitigate the costs, but if this proposal goes live, other maintainers are going to be asking the same question.

I'd appreciate if I can get a clear answer here whether this patch will be accepted.

The main blockers for merging this PR are:

  • Items 1 and 3 in Add AFPP support #136 (review) which concern the boundaries of ownership. Agreement is needed here.
  • There needs to be a release of filepath, unix, and Win32 with the AFPP changes (or at least a committed timeline for them).

@hasufell
Copy link
Member Author

In that case, would it be possible to either (1) convert file-io into a boot package or (2) merge file-io into filepath instead?

No, both is not possible. I don't have time or energy for (1) and (2) would cause circular package imports.

I am referring to the issue that the API and documentation have been duplicated in both Directory and Directory.OsPath.

Afaiu, that is what you asked for here: #136 (comment)

You wanted two APIs, one of them being a shim wrapper. At this point I think I don't really understand what you want and suggest that you do further architectural changes yourself.

I'm fine with cleaning up the PR, but I don't want to start another approach.

Items 1 and 3 in #136 (review) which concern the boundaries of ownership. Agreement is needed here.

To point 1 I can only repeat that I disagree with you. The new API is not clunky at all. It provides 3 functions to convert from String to OsPath. Each of them have their own caveats... that's the entire point of the AFPP proposal: viewing filepaths as strings is wrong, so users have to take a step back and figure out what their code really means. If your filepaths are in the ascii subset, you can just use encodeUtf with fromJust. If you know the underlying encoding, you can use encodeWith. If you don't know anything and want the old broken behavior of base (which directory uses as well), you can use encodeFS... which is also used in the wrapper layer to maintain existing behavior.

So please explain to me what is clunky here. Those are the 3 options. The IsString interface is broken: haskell/bytestring#140

We can't express failure, we can't restrict String to a scalar value subset. Text has the same problem and just replaces invalid stuff with Unicode replacement char. All of this is broken and dangerous for filepaths.

It is possible to add an unsafeFromString for ascii literals that just truncates without errors. But that really goes against the entire AFPP. We want to signal failure. Partial functions have been heavily debated here as well: haskell/core-libraries-committee#70

While I'm not a friend of deprecating them, I don't want to make it easy for end-users either. E.g. Data.ByteString.Char8 has caused a lot of problems, because people don't understand what it does.

If you're confident, you can use fromJust . encodeUtf. This is a perfectly valid pattern for ASCII literals. Please explain what functions are missing.


Let me know if you're still interested about this.

The timeline for these changes are this month. We want to release filepath, unix and Win32 in time for GHC 9.6. Maintainers have agreed on this. Only directory has not. So time is critical here, because users will be confused that they can't really make much use of it.

If you don't want this change to happen here or now, please make this clear, so that I know what my next steps will be.

@Rufflewind
Copy link
Member

The timeline for these changes are this month. We want to release filepath, unix and Win32 in time for GHC 9.6. Maintainers have agreed on this. Only directory has not.

There was no deadline communicated at the start of this Pull Request. The PR would have received higher priority if the timeline had been clearly communicated. The fact that this PR is a blocker for GHC 9.6 was not indicated.

I think, for the sake of the broader ecosystem, directory is probably the preferred place to host this code. However, given the short notice, there is no guarantee that it can be merged and released by the end of the month.

At this point, I am not expecting any further contributions from you on this PR. I will make necessary changes, aiming for a release by end of August, unless you decide to pursue a different route.

@hasufell
Copy link
Member Author

That sounds like a good strategy.

@Bodigrim
Copy link
Contributor

At this point, I am not expecting any further contributions from you on this PR. I will make necessary changes, aiming for a release by end of August, unless you decide to pursue a different route.

@Rufflewind any news on this? It would be crucial to get directory updated in time for GHC 9.6 feature freeze, which could happen pretty soon.

I very much appreciate your work on directory, and indeed this patch requires a significant effort to review. Since directory is a core library, feel free to ask CLC to provide additional maintainers to help you with it or any other assistance, if you need it.

@Rufflewind
Copy link
Member

Sorry for the delay and thanks for the reminder. I will attempt merging this over the next few days and provide an update this weekend.

@Rufflewind
Copy link
Member

Update: I have completed review of all the changes and anticipate closing this pull request and pushing out a release some time tomorrow, barring unforeseen outages.

This is tidied up commit containing the following changes by
https://github.com/hasufell:

* haskell@66b3f48
* haskell@15338c0

Not all changes made their way through. Beyond the stylistic cleanup,
notable reversions include:

* System.File.OsPath and its submodules have been removed as very little
  of directory relies on them. The few remaining portions have been
  inlined into submodules of System.Directory.Internal. For file
  operations, users are expected to use upstream packages at
  https://github.com/hasufell/file-io .

* Renaming of internal modules have been undone to untangle the AFPP
  revamp from unrelated refactors.

* Dependence on bytestring has been removed as it was found to be
  unnecessary after the cleanup.

Closes haskell#136.
System/Directory/Internal/Windows/OsPath.hs Outdated Show resolved Hide resolved
directory.cabal Outdated Show resolved Hide resolved
@Rufflewind Rufflewind merged commit 78b3e59 into haskell:master Sep 8, 2022
@Rufflewind
Copy link
Member

Published: https://hackage.haskell.org/package/directory-1.3.8.0

@hasufell
Copy link
Member Author

hasufell commented Sep 8, 2022

Great work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants