Implement 'rand' #803

dscorbett · 2018-02-18T22:13:01Z

This adds support for 'rand'. If the 'rand' feature appears in a font but is not explicitly set to a certain value, the feature value is randomized for each glyph. The algorithm is deterministic, seeded from a hash of the input glyphs’ indices. The entire buffer is unsafe to break. The reason to seed based on the whole buffer instead of a constant or user-specified seed is so that strings with the same initial characters do not have the same initial glyphs, which wouldn’t look very random.

This implementation is unoptimized and minimally viable. It probably shouldn’t be merged as is. The point of the pull request is to get some feedback so I don’t spend time optimizing something that doesn’t matter.
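For readers following along, the whole-buffer seeding described above can be sketched roughly as follows. This is a minimal illustration, not the code in this pull request: `seed_from_glyphs`, `next_random`, and `pick_alternates` are hypothetical names, the FNV-1a hash is a stand-in for whatever hash is used, and the minstd-style generator constants are an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: fold all glyph indices into one 32-bit value (FNV-1a).
std::uint32_t seed_from_glyphs(const std::vector<std::uint32_t>& glyphs) {
  std::uint32_t h = 2166136261u;  // FNV-1a 32-bit offset basis
  for (std::uint32_t g : glyphs) {
    for (int shift = 0; shift < 32; shift += 8) {
      h ^= (g >> shift) & 0xFFu;  // mix in one byte of the glyph index
      h *= 16777619u;             // FNV-1a 32-bit prime
    }
  }
  return h;
}

// Hypothetical helper: one step of a minstd-style generator (assumed constants).
std::uint32_t next_random(std::uint32_t& state) {
  state = static_cast<std::uint32_t>(
      static_cast<std::uint64_t>(state) * 48271u % 2147483647u);
  return state;
}

// Picks one alternate (0-based) per glyph, seeded from the whole buffer.
std::vector<std::uint32_t> pick_alternates(
    const std::vector<std::uint32_t>& glyphs, std::uint32_t alt_count) {
  assert(alt_count > 0);
  // Map the hash into [1, 2147483646], the valid state range of the generator.
  std::uint32_t state = seed_from_glyphs(glyphs) % 2147483646u + 1u;
  std::vector<std::uint32_t> picks;
  picks.reserve(glyphs.size());
  for (std::size_t i = 0; i < glyphs.size(); ++i)
    picks.push_back(next_random(state) % alt_count);
  return picks;
}
```

Because the seed depends on every glyph, changing any glyph in the buffer can change every pick, which is why the whole buffer is unsafe to break under this scheme.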

behdad · 2018-02-18T22:48:05Z

Interesting approach! Thanks. Will take a look.

behdad · 2018-02-19T21:09:09Z

> The reason to seed based on the whole buffer instead of a constant or user-specified seed is so that strings with the same initial characters do not have the same initial glyphs, which wouldn’t look very random.

But this also means that when typing in a box, everything will change shape as one types...

Same starting words resulting in same shape is consistent with some definition of random, just not other definitions.

dscorbett · 2018-02-23T17:36:54Z

I removed the hashing of the buffer, so previous glyphs won’t change when typing in a box. This also removes the one inefficiency I was aware of, that it hashed the buffer for each 'rand' lookup instead of just once.

Now that randomization does not depend on all the glyphs, it is not necessarily unsafe to break the whole buffer. It is only unsafe to break from the first randomized glyph to the last randomized glyph. I figured a random font would probably randomize most glyphs, so I kept the call to unsafe_to_break_all, as greater precision wasn’t worth the bookkeeping complexity.
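The more precise alternative mentioned above (flagging only the span from the first to the last randomized glyph) could look something like this sketch. `unsafe_range` and the per-glyph `randomized` flags are hypothetical; the pull request itself keeps the simpler whole-buffer call.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Returns the half-open range [first, last) spanning the first to the last
// randomized glyph, or {0, 0} if nothing was randomized.
std::pair<std::size_t, std::size_t> unsafe_range(
    const std::vector<bool>& randomized) {
  std::size_t first = randomized.size();
  std::size_t last = 0;
  for (std::size_t i = 0; i < randomized.size(); ++i) {
    if (!randomized[i]) continue;
    if (i < first) first = i;  // remember the earliest randomized glyph
    last = i + 1;              // one past the latest randomized glyph
  }
  if (last == 0) return {0, 0};
  return {first, last};
}
```

The bookkeeping here is the per-glyph flag: something has to record which glyphs were actually randomized, which is the extra complexity the comment decides is not worth it.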

behdad · 2018-02-23T19:48:13Z

Thanks. I'm busy reviewing the hb-subset changes. Please remind me to review this in a few weeks. Cheers

khaledhosny · 2018-02-23T22:13:18Z

I can now use https://github.com/khaledhosny/punk-otf 🎉

behdad · 2018-02-23T22:14:27Z

Woohoo!

dscorbett · 2018-03-23T18:50:21Z

> Please remind me to review this in a few weeks.

@behdad

behdad · 2018-04-13T17:19:11Z

I reviewed the code. Looks great!

Let's add some API on the buffer so the user can configure whether the entire buffer should set the state, plus a way to set and get the random state so that higher-level code can do smart things across lines, etc. Maybe also a way to set the RNG function.

I'll finish and merge in a few days.

ebraminio · 2018-06-09T23:09:29Z

> I'll finish and merge in a few days.

So, a friendly ping :)

behdad · 2018-07-23T03:31:14Z

I'd like to get this in. Any chance you can rebase and resolve the conflict? Thanks.

dscorbett · 2018-07-23T13:56:23Z

Done.

Randomization only happens by default. If the user specifies a value for 'rand', that value is respected.

behdad · 2018-09-10T14:17:04Z

I'm trying to merge this. I don't understand the default_rand thing in hb-ot-shape.cc

behdad · 2018-09-10T14:19:27Z

Actually, I think we need a better way to specify how random interacts with user-specified values.

behdad · 2018-09-10T14:22:56Z

> I'm trying to merge this. I don't understand the default_rand thing in hb-ot-shape.cc

You want that to allow user-specified value to override random behavior. It doesn't work though, if user specified rand feature value for part of the input only. We need to figure out how to handle that. For now, I'm making it ignore user-value, only allow disabling random, not setting specific value.

behdad · 2018-09-10T14:26:46Z

Also missing is API to set buffer random state, and query it, such that client can chain it.

Humm. Right now, using the same buffer, calling shape continuously produces different results, right?

behdad · 2018-09-10T14:26:57Z

> Humm. Right now, using the same buffer, calling shape continuously produces different results, right?

Or guess not.

behdad · 2018-09-10T14:28:15Z

We should add random-seed setter/getter to hb-buffer and use it to initialize the random_state in gsubgpos.hh. I'm not sure how to query the buffer state after shaping.

behdad · 2018-09-10T14:29:28Z

I pushed my modified branch into https://github.com/harfbuzz/harfbuzz/tree/rand

behdad · 2018-09-10T14:39:14Z

Okay, I made it respect user value if set to > 1. Setting to 1 means "randomize". I don't like the exception, but that's better than before.

behdad · 2018-09-10T14:40:04Z

Also, maybe we should add a setting to buffer to update the random seed from random state after hb_shape call. That way, one will get different results from subsequent calls to shape.

dscorbett · 2018-09-10T15:14:12Z

How about making UINT_MAX be the special value to enable randomness? That way explicitly setting 'rand' to 1 would work as expected.

behdad · 2018-09-10T15:17:02Z

> How about making UINT_MAX be the special value to enable randomness? That way explicitly setting 'rand' to 1 would work as expected.

But then the user would need to enter, on the command line: --features=rand=4294967295. The reason 1 is special is that that's what gets passed down when the user does --features=rand.

Umm. Ok, that's a crappy argument, since random is default-on. Problem with UINT_MAX is that it would consume all our bits. Wouldn't even fit. Can use 255 for that. Would consume 8 bits. But yeah, I like using a large number. Let me do.

behdad · 2018-09-10T15:22:14Z

Another thing that would be nice / cool is, when user specifies a value, feed that value back into random state. Such that if user changes glyph directly, the glyphs after that also change. Or is that a bad idea? Guess one can say it's a bad idea.

behdad · 2018-09-10T20:38:31Z

Ok, implemented rand=255. Like that. Thanks for the idea!
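To make the resulting semantics concrete, here is a minimal sketch of how a feature value might be interpreted for a single glyph. It assumes that 0 leaves the glyph alone, 255 means "pick at random", and any other value is a 1-based alternate index. `apply_rand`, `Rng`, and `kRandomize` are illustrative names, and the generator constants are an assumption rather than a quote of the merged code.

```cpp
#include <cstdint>
#include <vector>

constexpr unsigned kRandomize = 255;  // assumed stand-in for the "randomize" value

// Hypothetical minstd-style generator; the state starts at 1.
struct Rng {
  std::uint32_t state = 1;
  std::uint32_t next() {
    state = static_cast<std::uint32_t>(
        static_cast<std::uint64_t>(state) * 48271u % 2147483647u);
    return state;
  }
};

// Returns the substituted glyph, or the input glyph if no alternate applies.
std::uint32_t apply_rand(std::uint32_t glyph,
                         const std::vector<std::uint32_t>& alternates,
                         unsigned value, Rng& rng) {
  const unsigned count = static_cast<unsigned>(alternates.size());
  if (count == 0) return glyph;
  unsigned index = value;  // 1-based alternate index requested by the user
  if (value == kRandomize) index = rng.next() % count + 1;
  if (index == 0 || index > count) return glyph;  // disabled or out of range
  return alternates[index - 1];
}
```

With this shape, `rand=1` selects the first alternate explicitly, and only the reserved value triggers the generator, which is the point of moving the special case away from 1.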

behdad · 2018-09-10T20:44:09Z

Ok so let's talk about buffer random seed api:

Are we ok with unsigned int? We don't quite have uint64_t in the headers.

Also do we need a 64-bit random algorithm? Or can we downgrade to 32 bits?

void hb_buffer_set_random_seed (buffer, unsigned int);
unsigned int hb_buffer_get_random_seed (buffer);

Then, how to get random state?

unsigned int hb_buffer_get_random_state (buffer);

?

How to make it loop back the state into seed?

hb_buffer_set_randomness (...);

?
takes an enum?

Not sure. Need ideas.

dscorbett · 2018-09-10T21:30:46Z

A 32-bit state would be sufficient, but I don’t recommend using unsigned int: it has a variable size, so the algorithm would produce different results on different platforms. It would be nice to keep it consistent for debuggability. HarfBuzz currently statically asserts that an unsigned int is 32 bits but the API would be clearer if it didn’t rely on that detail.

A really ambitious idea would be to leave it to the caller to set the randomization function as a callback and to determine the type of the random state, which would have to be passed around as an opaque hb_random_state_t.

Another idea is to provide an option to randomly seed based on the entire buffer, as I originally had it, which would, I think, give better results for non-interactive text buffers.

behdad · 2018-09-10T21:34:19Z

> A 32-bit state would be sufficient, but I don’t recommend using unsigned int: it has a variable size, so the algorithm would produce different results on different platforms. It would be nice to keep it consistent for debuggability. HarfBuzz currently statically asserts that an unsigned int is 32 bits but the API would be clearer if it didn’t rely on that detail.

Right. Internally we would keep it uint32_t. In the API we assume unsigned int is uint32_t anyway.

> A really ambitious idea would be to leave it to the caller to set the randomization function as a callback and to determine the type of the random state, which would have to be passed around as an opaque hb_random_state_t.

Right. Might be over-engineering.

> Another idea is to provide an option to randomly seed based on the entire buffer, as I originally had it, which would, I think, give better results for non-interactive text buffers.

Yes I think I like that as an option as well.

I was thinking today about my future task to add lookup-direction flags. With those, one can make the random lookup work backwards, which would make it process back to front, so strings ending the same will get same ending glyphs... But whole-buffer is also useful, I agree.

dscorbett · 2018-09-10T21:51:21Z

> Might be over-engineering.

It probably is. I guess it depends on the use case for the API to get and set the random state, which I’m not clear on.

behdad · 2018-09-10T23:00:21Z

> > Might be over-engineering.
>
> It probably is.

I think that's overengineering because that forces every client to have to set it up. Or if there would be a default, we don't have a compelling reason why a client might want to change the default RNG algorithm.

> I guess it depends on the use case for the API to get and set the random state, which I’m not clear on.

That's different. That one is so that a client can cascade the random state from one text run to the next, such that a full document gets randomness flowing through it. I.e., if you have five lines, each with the same text, each of them looks different.
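The cascading idea can be sketched without any real buffer API: each line starts from the state the previous line ended with. Everything below is hypothetical (`LineResult`, `shape_line`, and the minstd-style constants are assumptions); it only illustrates why identical lines stop looking identical once the state is chained.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result of "shaping" one line: the alternate chosen for each
// glyph, plus the generator state left over for the next line.
struct LineResult {
  std::vector<unsigned> choices;
  std::uint32_t end_state;
};

// Picks one alternate per glyph. Assumes alt_count > 0 and a seed in
// [1, 2147483646], the valid state range of this generator.
LineResult shape_line(std::size_t glyph_count, unsigned alt_count,
                      std::uint32_t seed) {
  LineResult result;
  result.end_state = seed;
  for (std::size_t i = 0; i < glyph_count; ++i) {
    result.end_state = static_cast<std::uint32_t>(
        static_cast<std::uint64_t>(result.end_state) * 48271u % 2147483647u);
    result.choices.push_back(result.end_state % alt_count);
  }
  return result;
}
```

A client would call `shape_line` for the first line with some initial seed, then pass `end_state` as the seed of the next line. Reusing the same seed for every line reproduces the same choices each time, which is the behavior a getter and setter for the random state would let clients avoid.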

behdad · 2018-09-11T08:48:30Z

Merged!

Let's switch to 32bit RNG internally.

behdad · 2018-09-11T08:58:03Z

> Let's switch to 32bit RNG internally.

Done.

khaledhosny · 2020-03-10T21:10:10Z

I’m adding a rand feature to an Arabic font, where many glyphs have alternate forms, but I noticed that the rand lookup is executed first regardless of its order (which I believe is a result of 71c9f84; before that commit it would have been executed last).

Now this breaks my font because if an isolated glyph has an alternate, init/medi etc. features will not affect the alternate glyph and shaping will break. I can work around this in some laborious way, but I’m wondering if we should move the rand lookups after those of other default features, e.g. with:

diff --git a/src/hb-ot-shape.cc b/src/hb-ot-shape.cc
index 00ecdfab..a7cf6433 100644
--- a/src/hb-ot-shape.cc
+++ b/src/hb-ot-shape.cc
@@ -332,9 +332,6 @@ hb_ot_shape_collect_features (hb_ot_shape_planner_t          *planner,
   map->add_feature (HB_TAG ('d','n','o','m'));
 #endif
 
-  /* Random! */
-  map->enable_feature (HB_TAG ('r','a','n','d'), F_RANDOM, HB_OT_MAP_MAX_VALUE);
-
 #ifndef HB_NO_AAT_SHAPE
   /* Tracking.  We enable dummy feature here just to allow disabling
    * AAT 'trak' table using features.
@@ -364,6 +361,9 @@ hb_ot_shape_collect_features (hb_ot_shape_planner_t          *planner,
     map->enable_feature (HB_TAG ('v','e','r','t'), F_GLOBAL_SEARCH);
   }
 
+  /* Random! */
+  map->enable_feature (HB_TAG ('r','a','n','d'), F_RANDOM, HB_OT_MAP_MAX_VALUE);
+
   for (unsigned int i = 0; i < num_user_features; i++)
   {
     const hb_feature_t *feature = &user_features[i];

or even better, respect the lookup order (not sure how to do that, though).

behdad · 2020-03-11T04:47:38Z

Humm. I don't think that change has anything to do with it. It's simply a side effect of how we apply features in stages. In particular, to match Uniscribe, we pause before each of init, medi, isol, fina... So rand can be applied either before or after, unless we come up with a completely different scheme.

This is a common problem; same applies to rvrn etc.

khaledhosny · 2020-03-11T15:36:04Z

I see. So no way to make it respect lookup order i.e. execute rand lookups based on their order in the lookup list?

behdad · 2020-04-22T21:48:21Z

> I see. So no way to make it respect lookup order i.e. execute rand lookups based on their order in the lookup list?

We do things in stages and within each stage by lookup-order. So the final order of lookups is not sorted. How would you fit rand in there? You end up putting in either the first or last stage and sort within that...

Same applies to features like rvrn. Some designers like to do their mapping before everything else. Some after. We can't have both, so spec put it first.
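As an illustration of "stages first, lookup order within a stage", here is a small sketch. The `Lookup` struct and the stage numbers are made up for the example and do not reflect HarfBuzz's internal data structures.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical record: which stage a lookup belongs to and its position in
// the font's lookup list.
struct Lookup {
  unsigned stage;
  unsigned index;
};

// Orders lookups by stage first, then by lookup-list index within a stage.
std::vector<Lookup> execution_order(std::vector<Lookup> lookups) {
  std::sort(lookups.begin(), lookups.end(),
            [](const Lookup& a, const Lookup& b) {
              return std::tie(a.stage, a.index) < std::tie(b.stage, b.index);
            });
  return lookups;
}
```

If a rand lookup with index 9 sits in stage 0 and a positional lookup with index 2 sits in stage 1, the rand lookup still runs first: the index only breaks ties inside a stage, so the overall sequence is not sorted by lookup-list order.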

khaledhosny · 2020-04-23T00:07:07Z

I overlooked the stages bit. All fine then. I ended up not adding rand to that font after all, but if I get to do it for another font I think I’ll randomize the isolated glyphs (even if some look identical) and map these to different positional glyphs.

KrasnayaPloshchad mentioned this pull request Feb 19, 2018

OpenType 'rand' (Randomize) feature not correctly implemented #427

Closed

dscorbett added 4 commits September 7, 2018 15:52

Implement 'rand'

0380ad4

Allow requesting a specific glyph for 'rand'

0e28de0

Randomization only happens by default. If the user specifies a value for 'rand', that value is respected.

Test 'rand'

469fe78

Don't seed the RNG from the contents of the buffer

f63cfe1

behdad added a commit that referenced this pull request Sep 10, 2018

Make --features rand=1 available to the user

de801b7

Use rand=255 to mean "randomize". Part of #803

behdad added a commit that referenced this pull request Sep 11, 2018

Make --features rand=1 available to the user

71c9f84

Use rand=255 to mean "randomize". Part of #803

behdad closed this Sep 11, 2018

behdad mentioned this pull request Sep 11, 2018

Add random-seed API to buffer #1155

Closed

dscorbett deleted the rand branch September 11, 2018 12:59

pontaoski mentioned this pull request Mar 13, 2024

HarfBuzz's rand implementation does not expose the seed #4620

Closed
