Improve Performance of TTFunk::Table::OS2#group_original_code_points_by_bit #83


Contributor

@kokuyouwind kokuyouwind commented Jun 8, 2020

Abstract

Fixes #82.

The original code takes O(N * M) time, where N is the size of UNICODE_RANGES and M is the number of code_map keys.
The improved code takes O(max(N, M)) time.

Idea

The original code calls r.cover? for every pair of a UNICODE_RANGES entry and a code_map key.
This takes O(N * M) time.

If UNICODE_RANGES and the code_map keys are both sorted, the first UNICODE_RANGE contains a (possibly empty) fragment at the head of the code_map keys, the next UNICODE_RANGE contains the next fragment, and so on.
This strategy walks UNICODE_RANGES and the code_map keys once each, rather than in a nested loop, so it takes O(max(N, M)) time.
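
The idea above can be sketched as follows. This is only an illustration, not the code in this PR: `RANGES`, `group_naive`, and `group_sweep` are hypothetical names, and the four ranges are a small stand-in for the real UNICODE_RANGES table. The sketch sorts its inputs defensively, which itself costs more than linear time; the linear bound assumes both inputs are already sorted and that the ranges do not overlap.

```ruby
# Hypothetical stand-in for UNICODE_RANGES: bit number => code point range.
# Assumed to be non-overlapping; this is NOT TTFunk's real table.
RANGES = {
  0 => (0x0000..0x007F),
  1 => (0x0080..0x00FF),
  7 => (0x0370..0x03FF),
  9 => (0x0400..0x04FF)
}.freeze

# Pairwise version: every code point is tested against every range, O(N * M).
def group_naive(ranges, code_points)
  ranges.each_with_object({}) do |(bit, range), groups|
    hits = code_points.select { |cp| range.cover?(cp) }
    groups[bit] = hits unless hits.empty?
  end
end

# Single sweep: with both inputs sorted, one shared index into the code
# points only ever moves forward, so each input is walked once.
def group_sweep(ranges, code_points)
  sorted = code_points.sort
  i = 0
  groups = {}
  ranges.sort_by { |_bit, range| range.first }.each do |bit, range|
    # Skip code points that fall in the gap before this range.
    i += 1 while i < sorted.size && sorted[i] < range.first
    # Collect the fragment of code points that lies inside this range.
    while i < sorted.size && sorted[i] <= range.last
      (groups[bit] ||= []) << sorted[i]
      i += 1
    end
  end
  groups
end
```

For example, `group_sweep(RANGES, [0x41, 0x3B1, 0x410])` groups the three code points under bits 0, 7, and 9 respectively, and gives the same grouping as `group_naive`.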

Performance Test Result

I measured the TTFunk::Subset#encode time using the reproduction code from #82 (with the English font changed to 'spec/fonts/DejaVuSans.ttf').
This branch is roughly 6x to 8x faster than master, but still about 3.5x slower than 1.5.1.

| Branch | Font | Time [s] |
|---|---|---|
| master | DejaVuSans.ttf | 0.451248 |
| this branch | DejaVuSans.ttf | 0.077370 |
| 1.5.1 (reference) | DejaVuSans.ttf | 0.021442 |
| master | GenShinGothic-Normal.ttf | 1.706043 |
| this branch | GenShinGothic-Normal.ttf | 0.220942 |
| 1.5.1 (reference) | GenShinGothic-Normal.ttf | 0.064324 |

@kokuyouwind kokuyouwind force-pushed the improve_group_original_code_points_by_bit branch from 950edc9 to 406528d Compare June 8, 2020 07:41
@pointlessone
Member

@kokuyouwind Thank you for your contribution. The numbers look great.

@camertron Could you please take a quick look?

@camertron camertron self-requested a review June 9, 2020 16:11
Member

@camertron camertron left a comment


Wow, nice! Looks good to me, nice improvement :)

@f440

f440 commented Jan 5, 2021

@pointlessone

Is there any progress? I'm looking forward to this fix being merged 🥺

@pointlessone
Member

@kokuyouwind Could you please rebase your branch on top of the current master?

…by_bit

Original Code takes O(N * M) time
  where N is UNICODE_RANGES size and M is code_map keys size
Improved Code takes O(MAX(N, M)) time
@kokuyouwind kokuyouwind force-pushed the improve_group_original_code_points_by_bit branch from 406528d to 461c00b Compare January 7, 2021 08:20
@kokuyouwind
Contributor Author

@pointlessone OK, rebased.

@shrkw

shrkw commented Feb 1, 2021

@pointlessone @camertron or somebody authorized,
Hi, would you please merge this PR?

@pointlessone pointlessone merged commit 08dcaf4 into prawnpdf:master Feb 1, 2021
@kokuyouwind kokuyouwind deleted the improve_group_original_code_points_by_bit branch February 1, 2021 08:29
Development

Successfully merging this pull request may close these issues.

Performance Problem: TTFunk::Subset#encode 30x slower than v1.5.1