Replace unicode-general-category w/ unicode-properties. #71

Merged

eddyb
Contributor

@eddyb eddyb commented Jun 29, 2023

What prompted me to look at unicode-general-category was these build timings:

While ttf-parser prevents rustybuzz from starting its compilation earlier anyway, it still seemed weird that only one of several unicode-* deps was so much more involved than the others.
(And within a large build, such a serial dependency could have an impact.)

The description of unicode-general-category seemed to suggest it was used for performance reasons, and I spent some time picking off low-hanging fruit in its build (it wastes over half a second just by not using buffered IO, it generates Rust source code instead of binary blobs, and there are a few more things like that).
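To illustrate the buffered IO point, here is a minimal sketch using only the Rust standard library. It is not the crate's actual build script: the file name, the `write_table` and `count_lines` helpers, and the line format are all hypothetical. It only shows the general pattern of wrapping a `File` in `BufWriter`/`BufReader` so that many small writes and reads are batched.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader, BufWriter, Write};
use std::path::Path;

/// Writes `count` made-up table entries, one per line, through a buffer.
/// (Hypothetical format; not the real generated table.)
fn write_table(path: &Path, count: u32) -> io::Result<()> {
    // Writing to a bare `File` would issue one or more write syscalls per
    // `writeln!`; `BufWriter` collects the bytes and writes them in chunks.
    let mut out = BufWriter::new(File::create(path)?);
    for i in 0..count {
        writeln!(out, "{:04X};Lu", i)?;
    }
    // Flush explicitly so a late write error is returned instead of being
    // silently dropped when the buffer goes out of scope.
    out.flush()
}

/// Counts the lines in a file, reading through a buffer.
fn count_lines(path: &Path) -> io::Result<usize> {
    let reader = BufReader::new(File::open(path)?);
    let mut n = 0;
    for line in reader.lines() {
        line?; // propagate read errors
        n += 1;
    }
    Ok(n)
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("buffered_io_sketch.txt");
    write_table(&path, 10_000)?;
    println!("{} lines", count_lines(&path)?);
    std::fs::remove_file(&path)
}
```

How much time this saves depends on the platform and on how many small reads and writes the build script performs; the half-second figure above is from the measurement described in this comment, not from this sketch.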


But then I noticed something: the yeslogic org has a bunch of unicode-* crates, but most of them are prefixed with yeslogic- on crates.io (since they're alternatives to e.g. unicode-rs crates).

And there isn't any other obvious choice of crate for querying "general category", not on crates.io. Which leads me to believe the use of a yeslogic crate wasn't intentional but rather accidental/incidental (as it looked like the only option, and it didn't have a prefix that would make it seem out of place).


There are two blockers for this PR:

@RazrFalcon
Owner

Hi! I genuinely do not remember why I chose unicode-general-category. It was 3 years ago.
And yes, rustybuzz has an absurd amount of low-hanging fruit. I simply have no time to work on it and it's semi-abandoned.
As long as the tests pass, I'm fine with it.

Sadly, there are no alternatives (yes, I know about swash) and I hope someone else will keep maintaining it. Which is unlikely, because the domain is so absurdly complex and niche.

@RazrFalcon
Owner

RazrFalcon commented Jun 29, 2023

> Which leads me to believe the use of a yeslogic crate wasn't intentional but rather accidental/incidental (as it looked like the only option, and it didn't have a prefix that would make it seem out of place).

No, I knew it was by yeslogic.

Maybe at that time, early 2020, there were no stand-alone Unicode general category crates. And I did not want to blow up the binary size by including the whole Unicode table data.

The current rustybuzz status is: thank God it finally works... It took 8 months to port, so I was fine with any solution that got it to the finish line.

@Manishearth

Btw, filed unicode-rs/unicode-properties#1 about the names of the enums

@eddyb eddyb force-pushed the replace-unicode-general-category branch from 377e961 to fdcdc4d Compare July 27, 2023 15:54
@Manishearth

It's now published, sorry for the delay

@eddyb eddyb force-pushed the replace-unicode-general-category branch from fdcdc4d to eefccfb Compare July 27, 2023 19:53
@eddyb eddyb marked this pull request as ready for review July 27, 2023 19:53
@eddyb
Contributor Author

eddyb commented Jul 27, 2023

> It's now published, sorry for the delay

@Manishearth no worries, I lost track of this as well, thanks for publishing!


@RazrFalcon should be good to go (and the PR is trivial now, thanks to the renaming).

@RazrFalcon RazrFalcon merged commit a0b8aa3 into RazrFalcon:master Jul 28, 2023
1 check passed
@RazrFalcon
Owner

Thanks!

@eddyb eddyb deleted the replace-unicode-general-category branch August 1, 2023 09:49