
tayloraswift
Contributor

(Addresses #287)

There are probably many better ways of doing this than adding a field to hb_glyph_position_t, but as a proof of concept, it is possible to achieve a roughly 3x speedup by resolving the glyph info/position lookups in C rather than in Python through GObject.

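import array
from gi.repository import HarfBuzz as hb  # GObject-introspection binding

# `text` (a str) and `font` (a HarfBuzz font) are assumed to be defined earlier.

def shape_new_buffer(extractor=lambda b: b):
    # Shape `text` in a fresh buffer, then pass the buffer to `extractor`.
    buf = hb.buffer_create ()
    hb.buffer_add_utf32 (buf, array.array('I', text.encode('utf-32')), 0, -1)
    hb.buffer_guess_segment_properties (buf)
    hb.shape (font, buf, [])
    return extractor(buf)

def unpack_hb_buffer1(HBB):
    # Existing approach: four Python attribute lookups per glyph.
    infos = hb.buffer_get_glyph_infos(HBB)
    positions = hb.buffer_get_glyph_positions(HBB)
    return [(N.cluster, N.codepoint, P.x_advance, P.x_offset) for N, P in zip(infos, positions)]

def unpack_hb_buffer2(HBB):
    # Proposed approach: buffer_compact_glyphs and the `compact` field are added by this PR.
    hb.buffer_compact_glyphs(HBB)
    return [P.compact for P in hb.buffer_get_glyph_positions(HBB)]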

import timeit
t1 = timeit.timeit("shape_new_buffer(unpack_hb_buffer1)", number=1000, setup="from __main__ import shape_new_buffer, unpack_hb_buffer1")

t2 = timeit.timeit("shape_new_buffer(unpack_hb_buffer2)", number=1000, setup="from __main__ import shape_new_buffer, unpack_hb_buffer2")

print(t1)
print(t2)
print(t1/t2)

shaped_buffer = shape_new_buffer()

t3 = timeit.timeit("unpack_hb_buffer1(shaped_buffer)", number=1000, setup="from __main__ import shaped_buffer, unpack_hb_buffer1")

t4 = timeit.timeit("unpack_hb_buffer2(shaped_buffer)", number=1000, setup="from __main__ import shaped_buffer, unpack_hb_buffer2")

print(t3)
print(t4)
print(t3/t4)

Output:

0.5077813390016672
0.1881202860022313
2.6992375452567856 (overall stack performance improvement factor)
0.5054081650014268
0.16407813999830978
3.0802894584655407 (glyph extraction performance improvement factor)

This reduces the number of Python attribute lookups required from 4 to 1 per glyph. Since sticking the values into the hb_glyph_position_t struct probably isn't the best way to go, if there were a way to generate the array independently, testing shows it would probably achieve close to a 1000% performance improvement (since the bottleneck is still the one Python attribute lookup).
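
The four-versus-one lookup pattern can be illustrated without HarfBuzz at all. The following is only a sketch: `Glyph` is a hypothetical stand-in class, not a HarfBuzz type, and plain Python attributes are much cheaper than introspected GObject fields, so the ratio it prints should not be expected to match the numbers above.

```python
import timeit

class Glyph:
    """Hypothetical stand-in record; not a HarfBuzz type."""
    def __init__(self, cluster, codepoint, x_advance, x_offset):
        self.cluster = cluster
        self.codepoint = codepoint
        self.x_advance = x_advance
        self.x_offset = x_offset
        # Precomputed tuple, playing the role of the PR's `compact` field.
        self.compact = (cluster, codepoint, x_advance, x_offset)

glyphs = [Glyph(i, 65 + i % 26, 600, 0) for i in range(1000)]

def four_lookups(items):
    # Four attribute lookups per glyph.
    return [(g.cluster, g.codepoint, g.x_advance, g.x_offset) for g in items]

def one_lookup(items):
    # One attribute lookup per glyph.
    return [g.compact for g in items]

t_four = timeit.timeit(lambda: four_lookups(glyphs), number=200)
t_one = timeit.timeit(lambda: one_lookup(glyphs), number=200)
print(t_four / t_one)  # ratio varies by machine and Python version
```

Both functions return the same list of tuples; only the amount of per-glyph attribute access differs.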

@behdad
Member

behdad commented Aug 5, 2016

You can convert the glyph positions and glyph infos into Python array.array objects and access those. Let me make it absolutely clear: no change to the library is going to be accepted for this purpose.
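
One possible reading of this suggestion, as a minimal sketch: if the position records are available as raw bytes, array.array can view them as a flat run of integers and strided slices can pull out one field for every glyph at once. The sketch assumes the hb_glyph_position_t layout of five 32-bit signed integers (x_advance, y_advance, x_offset, y_offset, and one private field). How to obtain those raw bytes from the GObject binding is not shown here, and `unpack_positions` and the sample data are hypothetical.

```python
import array
import struct

# Assumed layout: hb_glyph_position_t is five 32-bit signed ints
# (x_advance, y_advance, x_offset, y_offset, plus one private field).
FIELDS_PER_GLYPH = 5

def unpack_positions(raw):
    """Return (x_advance, x_offset) pairs from raw position-record bytes."""
    flat = array.array('i')
    assert flat.itemsize == 4, "sketch assumes a 4-byte C int"
    flat.frombytes(raw)
    # Strided slices select one field across all glyphs.
    x_advances = flat[0::FIELDS_PER_GLYPH]
    x_offsets = flat[2::FIELDS_PER_GLYPH]
    return list(zip(x_advances, x_offsets))

# Stand-in data: two fake glyph records packed in native byte order.
raw = struct.pack('=5i', 600, 0, 0, 0, 0) + struct.pack('=5i', 550, 0, -20, 0, 0)
print(unpack_positions(raw))  # [(600, 0), (550, -20)]
```

The same approach would apply to glyph infos, assuming a similar fixed-size record of unsigned 32-bit fields (type code 'I' instead of 'i').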

@behdad behdad closed this Aug 5, 2016
@tayloraswift
Contributor Author

@behdad how do I convert a <HarfBuzz.glyph_info_t object> into a Python array?
