Programatically creating the font from the handwriting sample #16

pelson · 2017-04-21T04:10:29Z

Transcribed below:

I'm opening this issue because I've done some work with the handwriting
dataset that Randall produced, and am able to programmatically produce
a pleasing font file from it.

First and foremost, I wanted to get your feedback on the approach that
I've taken. I'd be more than happy to discuss options for bringing this
work into this repo (and the logistics/code organisation etc.).

A whole saga has been written up @ https://pelson.github.io/2017/xkcd_font/
with way more detail than is necessary for the everyday interested
party. My intention was to document the whole process so that the
font is entirely reproducible from the source document (the handwriting
sample). Because of the detail, I'm not widely advertising the article's
existence at this point, but will do in the next few weeks/months once
we have decided what is best to do with its findings.

Any input on next steps would be greatly appreciated.

pelson · 2017-04-21T04:17:07Z

P.S. My spelling is terrible, and there are a number of typos that need fixing in my writeup (I'm on it 😄 ).

Just in case, CC @rgbkrk & @spu7nik as the most recent contributors.

rgbkrk · 2017-04-21T06:38:08Z

Well that's awesome Phil!

takluyver · 2017-04-21T09:29:01Z

Nice! I really enjoyed the detailed write up, thanks. :-)

HughP · 2017-04-21T15:37:58Z

Nice. I'm going to work my way through the write up. I've been looking for a tutorial on how to do this kind of stuff.

rgbkrk · 2017-04-21T18:10:35Z

That was a fabulous article, I loved the ligature notes.

Carreau · 2017-04-21T21:46:58Z

Sweet! On the part where you map the segmented drawing to the actual text, would it be possible to classify the bounding boxes by "width" to autodetect the character pairs (or trios)? It might be a bit annoying because of narrow characters, but this can likely be automated.

cc @mpacer as well who will enjoy this.

Carreau · 2017-04-21T21:49:40Z

Also, now that we have the glyphs, can we map them on all the existing XKCD comics to "learn" the kerning?

pelson · 2017-04-21T22:27:57Z

> Also, now that we have the glyphs, can we map them on all the existing XKCD comics to "learn" the kerning?

Definitely plausible, but also possibly into the territory of diminishing returns... 😉

> On the part where you map the segmented drawing to the actual text, would it be possible to classify the bounding boxes by "width" to autodetect the character pairs (or trios)?

You mean, rather than my defining the paragraph complete with ligatures, I just define the paragraph and auto-detect the wide images as ligatures? I think that is definitely possible, and think it might work well iteratively (as in, while there are more characters than images, pick the widest image as a ligature and figure out which characters they were [somehow]).
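A minimal sketch of that iterative idea, assuming the segmented images are in reading order and that whitespace has already been removed from the paragraph text. The names `assign_char_counts` and `split_text` are hypothetical and are not taken from the write-up's code. To allow trios, the "widest" image is taken to be the one with the largest width per character assigned so far, which is one possible reading of the idea rather than a tested method.

```python
def assign_char_counts(image_widths, n_chars):
    """Guess how many characters each segmented image contains.

    Every image starts with one character. While there are more characters
    than assigned slots, the image with the largest width per assigned
    character gets one more character (i.e. it is treated as a ligature).
    """
    if not image_widths or n_chars < len(image_widths):
        raise ValueError("need at least one character per image")
    counts = [1] * len(image_widths)
    while sum(counts) < n_chars:
        widest = max(range(len(counts)),
                     key=lambda i: image_widths[i] / counts[i])
        counts[widest] += 1
    return counts


def split_text(text, counts):
    """Cut the text into consecutive groups, one group per image."""
    groups, pos = [], 0
    for count in counts:
        groups.append(text[pos:pos + count])
        pos += count
    return groups


# Illustrative widths in pixels (made up for this example).
widths = [10, 11, 31, 9, 20]
counts = assign_char_counts(widths, n_chars=8)
groups = split_text("abcdefgh", counts)
```

Narrow characters such as "i" or "l" can mislead a purely width-based guess, so the output would still need a manual check.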

With respect to integrating the work into this repo (if there is interest in doing so), what would you consider the primary generator of the font to be? The code + handwriting sample? The PPM glyphs? The SVG glyphs? The sfd font file? No matter what level we choose, should we include each of these steps in the repo, and if so, how do we manage the fact that they are all build artefacts?

Carreau · 2017-04-21T23:27:05Z

> as in, while there are more characters than images, pick the widest image as a ligature and figure out which characters they were [somehow]

You might need to take the "average" bounding box length, and say "oh, my bounding box is about 3 times the average, so it's likely a 3-character ligature". The position of the BB on the x axis may also give you hints about the width of the characters being ligatured and whether they are narrow or wide.
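A rough sketch of the "about 3 times the average" rule. It uses the median rather than the mean as the typical width, which is an assumption made here because ligatures themselves inflate the mean. The function name `estimate_char_counts` is hypothetical.

```python
from statistics import median


def estimate_char_counts(box_widths):
    """Estimate characters per bounding box from its width ratio.

    A box about k times the typical (median) width is guessed to hold
    k characters; every box holds at least one character.
    """
    typical = median(box_widths)
    return [max(1, round(width / typical)) for width in box_widths]


# Illustrative widths in pixels (made up for this example).
estimated = estimate_char_counts([10, 12, 31, 9, 11])
```

Here the third box is nearly three times the median width, so it is flagged as a likely 3-character ligature. A pair of narrow characters may still look like a single wide one.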

You might also be able to say:

I have N characters on my line, with typical widths w_i,
and B bounding boxes of widths b_j,
so there are L = N - B ligatures (if each ligature joins exactly two characters).

A ligature's expected width is the sum of the w_i of its characters.

Then optimize the positions of the ligatures to minimize the distance between each BB width b_j and the matching w_i (single character) or sum of w_i (ligature).

You can use simulated annealing, but it may be too much.
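For a single line of text the search space is small, so an exact dynamic programme may be enough instead of simulated annealing. The sketch below swaps in that technique. It assumes the boxes are in reading order, that each box covers between 1 and `max_group` consecutive characters, and that a typical width per character is already known. The name `align_boxes` and the `max_group` parameter are hypothetical.

```python
def align_boxes(char_widths, box_widths, max_group=3):
    """Split the characters into one consecutive group per bounding box.

    Minimizes the total |box width - sum of character widths in its group|
    and returns the group size for each box.
    """
    n, m = len(char_widths), len(box_widths)
    prefix = [0.0]
    for width in char_widths:
        prefix.append(prefix[-1] + width)

    inf = float("inf")
    # cost[j][i]: best mismatch using the first j boxes for the first i chars.
    cost = [[inf] * (n + 1) for _ in range(m + 1)]
    back = [[0] * (n + 1) for _ in range(m + 1)]
    cost[0][0] = 0.0
    for j in range(1, m + 1):
        for i in range(j, n + 1):
            for k in range(1, min(max_group, i) + 1):
                previous = cost[j - 1][i - k]
                if previous == inf:
                    continue
                group_width = prefix[i] - prefix[i - k]
                candidate = previous + abs(box_widths[j - 1] - group_width)
                if candidate < cost[j][i]:
                    cost[j][i] = candidate
                    back[j][i] = k
    if cost[m][n] == inf:
        raise ValueError("no valid assignment of characters to boxes")

    # Walk the back-pointers from the last box to recover the group sizes.
    sizes, i = [], n
    for j in range(m, 0, -1):
        sizes.append(back[j][i])
        i -= back[j][i]
    sizes.reverse()
    return sizes


# Illustrative widths (made up): five characters, three boxes.
sizes = align_boxes([10, 4, 10, 10, 4], [10, 14, 14])
```

In this example the only zero-mismatch split is one single character followed by two pairs.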

> With respect to integrating the work into this repo (if there is interest in doing so), what would you consider the primary generator of the font to be? The code + handwriting sample? The PPM glyphs? The SVG glyphs? The sfd font file? No matter what level we choose, should we include each of these steps in the repo, and if so, how do we manage the fact that they are all build artefacts?

I would minimize the chances of having out-of-date artifacts and of people modifying autogenerated files. Maybe we can get it to build on Travis...

pelson · 2017-04-22T01:21:48Z

I'm up for having it build on Travis. If we are doing that though, we will also need to have some integration test(s) to confirm the font continues to look correct (not a biggy, just making it clear).
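One possible shape for such a test is a pixel comparison against a stored reference rendering. The sketch below covers only the comparison step and assumes the rendered text is already available as a grid of grayscale values. How the font gets rendered is left open, and the names `fraction_different` and `tolerance` are hypothetical.

```python
def fraction_different(image, reference, tolerance=0):
    """Return the fraction of pixels differing by more than `tolerance`.

    Both arguments are lists of rows of grayscale values with equal shape.
    """
    same_shape = len(image) == len(reference) and all(
        len(row) == len(ref_row) for row, ref_row in zip(image, reference)
    )
    total = sum(len(row) for row in reference)
    if not same_shape or total == 0:
        raise ValueError("images must be non-empty and have the same shape")
    differing = sum(
        abs(pixel - ref_pixel) > tolerance
        for row, ref_row in zip(image, reference)
        for pixel, ref_pixel in zip(row, ref_row)
    )
    return differing / total


# Tiny made-up bitmaps: one pixel out of six has changed.
reference = [[0, 255, 0], [0, 255, 0]]
rendered = [[0, 255, 0], [0, 250, 255]]
changed = fraction_different(rendered, reference, tolerance=10)
```

A build could then fail when the fraction exceeds some agreed threshold, with the threshold itself being a judgment call.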

damianavila · 2017-05-15T22:17:02Z

Lovely saga @pelson!!

> Because of the detail, I'm not widely advertising the article's
> existence at this point, but will do in the next few weeks/months once
> we have decided what is best to do with its findings.

We can share it now, am I right?

rgbkrk · 2017-06-02T01:54:54Z

When are you comfortable with this post being widely dispersed? We just brought this up at the dev meetings and I personally would love to share it more widely.

rgbkrk · 2017-06-02T01:55:49Z

As for what to do, I think it would be great to incorporate your new glyphs into the current font.

pelson · 2017-06-14T07:17:56Z

I've opened a PR. Hopefully that might lead to a happy ending to the saga 😉 .

takluyver · 2017-06-21T09:35:12Z

Closing as that PR was merged; thanks @pelson !

pelson · 2017-06-21T09:44:26Z

Cool. I'm going to do a bit more tidy work on the code in the repo, then I'll publicise the blog post in the next few days/week or so to see if we can get a burst of potential collaborators.

damianavila · 2017-06-21T10:47:07Z

Let us know when you publicize it so we can spread the word 😉

mpacer · 2017-06-21T16:00:57Z

I'm really excited to figure out how to work on the kerning pairs. @suchow and I have a paper from last year's CogSci about using crowdsourcing and transmission chains to find people's inductive biases toward kerning pair expectations (https://cocosci.berkeley.edu/papers/zerothprinciples.pdf). It should be possible to modify it to allow designing kerning pairs as well as just extracting the bias. Also, I'd just love to play around with each of the pieces of the code. I'm still really impressed at your persistence in this, as I've tried to do similar things and always got stymied in just setting up the system to work at all. Any word on getting conda-forge versions of the various packages you use?

pelson · 2017-06-22T09:58:00Z

> Any word on getting conda-forge versions of the various packages you use?

I haven't yet gone down that road, though it is the obvious next step in terms of the software stack. In truth, this has been a hobby project that I got so far into that I couldn't bear to see it just dropped, but I don't have the capacity to maintain a fully functional conda-forge build of fontforge. The compromise I made was to have a pre-built (albeit unreproducible) Docker image with the tools necessary for the job. I'm sure you've seen that you can currently get hold of that on Docker Hub as pelson/fontbuilder.

Hopefully, the work that I've done here has reduced the barrier to entry for improvements such as the one you pointed out. We will at some point get to the interesting situation where "improvements" will be highly subjective, and I recommend we find some way to meet that head-on (perhaps a BDOFFL? [benign dictator of fonts for life], or a rate-my-changes voting policy)

pelson mentioned this issue Jun 14, 2017

Auto generated font from Randall's handwriting #17

Merged

takluyver closed this as completed Jun 21, 2017
