Pure Haskell implementation of GHC.Unicode #59

Closed
wismill opened this issue May 6, 2022 · 15 comments
Labels
approved Approved by CLC vote base-4.18 Implemented in base-4.18 (GHC 9.6)

Comments

@wismill

wismill commented May 6, 2022

Following this discussion, the goal of this proposal is to refactor GHC.Unicode from a C implementation to a pure Haskell implementation.

Motivation

Implementation

I opened a merge request with a working implementation.

This work is based on the package unicode-data, developed by @Bodigrim, @harendra-kumar, @adithyaov, me and others.

The re-licensing is discussed in this issue.

Relevant links

Further discussion

Other interesting functions could be imported from unicode-data. This proposal only uses those already in base, in order to keep the interface unchanged.
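To illustrate what "interface unchanged" means for callers, here is a minimal editorial sketch (not part of the merge request). It uses only functions that Data.Char already exports from base; the characters checked are arbitrary, long-stable examples chosen for illustration, and the `checks` name is hypothetical.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, isAlpha, isUpper, toLower, toUpper)

-- Checks on long-stable characters. The results should not depend on
-- whether the Unicode lookup tables are implemented in C or in Haskell.
checks :: [Bool]
checks =
  [ generalCategory 'a' == LowercaseLetter
  , generalCategory '1' == DecimalNumber
  , toUpper '\xE9' == '\xC9'    -- U+00E9 (e with acute) to U+00C9
  , toLower '\x3A3' == '\x3C3'  -- Greek capital sigma to small sigma
  , isAlpha '\xDF'              -- U+00DF (sharp s) is a letter
  , isUpper '\x3A3'             -- Greek capital sigma is uppercase
  ]

main :: IO ()
main = print (and checks)
```

Code written against Data.Char in this way should compile and behave the same before and after the change, since only the implementation behind these functions moves.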

Acknowledgement

I would like to thank @Bodigrim, @harendra-kumar and @adithyaov for making unicode-data and @nomeata for his guidance.

@tomjaguarpaw
Member

This sounds absolutely tremendous!

@nomeata

nomeata commented May 6, 2022

Thanks for pushing this forward! Highly appreciated

@harendra-kumar

Nice to see one significant dependency on C libraries removed. I appreciate the contributions of @wismill to the unicode-data package, making it a drop-in replacement for Data.Char and now pushing this into GHC, actually replacing Data.Char! And thanks to @nomeata for taking the initiative and helping out on getting this done.

@Bodigrim
Collaborator

I conducted a technical review of https://gitlab.haskell.org/ghc/ghc/-/merge_requests/8072 and it looks good to me.

Dear CLC members, let's vote on the proposal to speed up Data.Char 4 times :-P
@tomjaguarpaw @cigsender @cgibbard @emilypi @chessai

+1 from me.

@chessai
Member

chessai commented May 17, 2022

+1

@tomjaguarpaw
Member

+1

@mixphix
Collaborator

mixphix commented May 18, 2022

+1

@cgibbard
Contributor

+1

@Bodigrim
Collaborator

With 5 votes in favor out of 6 possible, the proposal is now approved by CLC. Thanks @wismill.

@Bodigrim Bodigrim added the approved Approved by CLC vote label May 19, 2022
ghc-mirror-bot pushed a commit to ghc/ghc that referenced this issue Jun 1, 2022
Switch to a pure Haskell implementation of base:GHC.Unicode, based on the implementation of the package unicode-data (https://github.com/composewell/unicode-data/).

Approved by CLC as per haskell/core-libraries-committee#59 (comment).

- Remove current Unicode cbits.
- Add generator for Unicode property files from Unicode Character Database.
- Generate internal modules.
- Update GHC.Unicode.
- Add unicode003 test for general categories and case mappings.
- Add Python scripts to check 'base' Unicode tests outputs and characters properties.

Fixes #21375

-------------------------
Metric Decrease:
    T16875
Metric Increase:
    T4029
    T18304
    haddock.base
-------------------------
@chshersh
Member

chshersh commented Mar 22, 2023

I'm trying to summarise the state of this proposal as part of my volunteering effort to track the progress of all approved CLC proposals.

| Field | Value |
| --- | --- |
| Authors | @wismill |
| Status | merged |
| base version | 4.18.0.0 |
| Merge Request (MR) | https://gitlab.haskell.org/ghc/ghc/-/merge_requests/8072 |
| Blocked by | nothing |
| CHANGELOG entry | missing (please, raise an MR to base to update changelog) |
| Migration guide | not needed |

Please, let me know if you find any mistakes 🙂


@wismill I find it hard to determine the exact version of base that implemented this proposal. Could you open an MR to base with the CHANGELOG entry for the corresponding base version?

@wismill
Author

wismill commented Mar 22, 2023

@chshersh thanks for your effort. base-4.18.0.0 implements this proposal. I think there is no changelog entry because it is not an API change, merely an implementation one.

@chshersh chshersh added the base-4.18 Implemented in base-4.18 (GHC 9.6) label Mar 22, 2023
@chshersh
Member

@wismill Thanks for confirming the base version!

In my understanding, if a change requires a CLC proposal, it should go into the CHANGELOG, and I don't view the CHANGELOG as an API-only list of changes. But views of other CLC members may differ 😌

@parsonsmatt

I would personally prefer to see nearly all changes in the changelog - subtle implementation differences can be hard to track down, and a changelog makes that much easier

@chessai
Member

chessai commented Mar 22, 2023

> I would personally prefer to see nearly all changes in the changelog - subtle implementation differences can be hard to track down, and a changelog makes that much easier

I agree, and I think base should be held to a very strict standard on this

@wismill
Author

wismill commented Mar 24, 2023

MR sent.
