
Don’t calculate whole sets of unicode codepoints #1984

Merged: 1 commit into fonttools:master on Jun 5, 2020

Conversation

@liZe (Contributor) commented Jun 3, 2020

_getUnicodeRangeSets used to calculate sets containing lots of numbers, only to get intersections between a set and ranges. Creating and manipulating lots of big sets is both slow and memory-consuming.

The function has been replaced by _getUnicodeRanges, which returns ranges instead of sets. Ranges are both very lightweight and very fast.

Tests on intersectUnicodeRanges are now 3 times faster and save about 130 MB (!) of RAM.
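
To make the difference concrete, here is a minimal sketch of the two approaches, using a toy two-range table in place of the real OS/2 range data (all names and data below are illustrative, not the actual fontTools code):

# Toy stand-in for the OS/2 range table: (start, stop, bit) triples.
RANGES = [(0x0000, 0x007F, 0), (0x0080, 0x00FF, 1)]

# Set-based approach: materialize every codepoint of every range up front.
RANGE_SETS = [(set(range(start, stop + 1)), bit) for start, stop, bit in RANGES]

def intersect_with_sets(unicodes):
    unicodes = set(unicodes)
    return {bit for codepoints, bit in RANGE_SETS if unicodes & codepoints}

# Range-based approach: keep only the boundaries and compare them directly.
def intersect_with_ranges(unicodes):
    return {bit for start, stop, bit in RANGES
            for cp in unicodes if start <= cp <= stop}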

Before: 0.28 seconds, 148 MB RAM

time python fonttools/Lib/fontTools/ttLib/tables/O_S_2f_2.py
0.23user 0.05system 0:00.28elapsed 100%CPU (0avgtext+0avgdata 147760maxresident)k

After: 0.08 seconds (-70%), 14 MB RAM (-90%)

time python fonttools/Lib/fontTools/ttLib/tables/O_S_2f_2.py
0.08user 0.00system 0:00.08elapsed 100%CPU (0avgtext+0avgdata 13940maxresident)k

@anthrotype (Member) commented:

Thanks. You only timed the module loading time. It would also be interesting to benchmark the time it takes to run the intersectUnicodeRanges function on random sets of unicode codepoints, comparing the set-based and range-based implementations.
If it turns out that building the sets at module import makes intersection checks faster, despite a slower import time and increased memory usage, it may be a reasonable trade-off.

@anthrotype (Member) commented:

As I was expecting, running intersectUnicodeRanges with your approach is much slower.
This is the benchmark code I used for testing:

import timeit

# Average per-call time over 1000 iterations; each iteration draws 1000
# random codepoints from the full Unicode range and intersects them (the
# random sampling runs inside the timed statement, so it is measured too).
num_executions = 1000
total_time = timeit.timeit(
    "bits = intersectUnicodeRanges(random.choices(range(0x110000), k=1000))",
    setup="import random; from fontTools.ttLib.tables.O_S_2f_2 import intersectUnicodeRanges",
    number=num_executions,
)

print(total_time / num_executions)

With the current set implementation, on my MBP, I get:

0.0004779480810000223

With your range- and for-loop-based implementation, I get:

0.015389551754999956

That is, roughly 30 times slower.
So I guess it all depends on how many times one runs the intersectUnicodeRanges function per session, and on the size of the font in terms of the number of unicode characters to test for intersection.

I am still not 100% convinced that your solution would be preferable.

@liZe (Contributor, Author) commented Jun 4, 2020

> You only timed the module loading time.

That’s not exactly true, because the module launches 3 unit tests…

> It would also be interesting to benchmark the time it takes to run the intersectUnicodeRanges function on random sets of unicode codepoints, comparing the set-based and range-based implementations.

… but you’re right: measuring the set generation plus 3 calls to intersectUnicodeRanges is not the same as testing only the speed of intersectUnicodeRanges.

> So I guess it all depends on how many times one runs the intersectUnicodeRanges function per session, and on the size of the font in terms of the number of unicode characters to test for intersection.

👍

> I am still not 100% convinced that your solution would be preferable.

Storing 130 MB in RAM during the whole execution time of the program is really bad for my use cases, but I get your point. I’ll try to improve my solution.

@liZe (Contributor, Author) commented Jun 4, 2020

I’ve changed the implementation; here are the results using the script you provided.

Before: 0.82 ms / loop, 147 MB RAM

time python speed.py
0.0008236442100023851
0.90user 0.07system 0:00.98elapsed 99%CPU (0avgtext+0avgdata 146996maxresident)k

After: 0.85 ms / loop, 13 MB RAM

time python speed.py
0.0008471214559976943
0.90user 0.01system 0:00.92elapsed 100%CPU (0avgtext+0avgdata 13296maxresident)k

@anthrotype (Member) commented:

Nice! bisect did the trick :)
+1 on simplifying the building of the steps list like Just suggested.
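
For context, here is a minimal sketch of what a bisect-based lookup can look like, using the flattened starts/stops-plus-bits structure that the commit message below describes (toy data and names, not the actual implementation):

from bisect import bisect_right

# Toy flattened structure: sorted range starts, parallel (stop, bit) entries.
RANGE_STARTS = [0x0000, 0x0080, 0x0100]
RANGE_STOPS_AND_BITS = [(0x007F, 0), (0x00FF, 1), (0x017F, 3)]

def intersect_unicode_ranges(unicodes):
    bits = set()
    for cp in unicodes:
        # Rightmost range whose start is <= cp, found in O(log n).
        i = bisect_right(RANGE_STARTS, cp) - 1
        stop, bit = RANGE_STOPS_AND_BITS[i]
        if cp <= stop:
            bits.add(bit)
    return bits

Each lookup is logarithmic in the number of ranges, which is why the per-call time stays close to the set-based version while the import-time sets, and their memory cost, disappear.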

@justvanrossum (Collaborator) commented:

Another approach would be to special-case the non-BMP codepoints, as they make up more than 81% of the current set-based data.
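
A purely illustrative sketch of that idea, with hypothetical toy tables (the 81% figure is from the comment above; the real OS/2 bit assignments differ):

# Hypothetical split: route supplementary-plane codepoints to a small
# dedicated table, so the main table only ever covers the BMP.
BMP_RANGES = [(0x0000, 0x007F, 0)]                 # toy data
SUPPLEMENTARY_RANGES = [(0x10000, 0x100FF, 101)]   # toy data

def find_unicode_range_bit(cp):
    table = SUPPLEMENTARY_RANGES if cp > 0xFFFF else BMP_RANGES
    for start, stop, bit in table:
        if start <= cp <= stop:
            return bit
    return None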

@liZe (Contributor, Author) commented Jun 4, 2020

New results are equivalent, with less code.

time python speed.py
0.000836255980990245
0.90user 0.00system 0:00.91elapsed 100%CPU (0avgtext+0avgdata 13360maxresident)k

I’m sure it could be even better, but I think it’s good enough for now 😉. Thanks a lot for the quick review.

Commit message (babca16):

_getUnicodeRangeSets used to calculate sets containing lots of numbers, only to get intersections between a set and ranges. Creating and manipulating lots of big sets requires a lot of memory.

The function has been replaced by _getUnicodeRanges, returning a list of range start boundaries and a list of range stops plus the corresponding bits.

Tests on intersectUnicodeRanges save about 130 MB (!) of RAM, with no significant speed penalty.

@anthrotype (Member) left a review:

LGTM, thanks!

@anthrotype merged commit babca16 into fonttools:master on Jun 5, 2020