Programmatically creating the font from the handwriting sample #16

Closed
pelson opened this issue Apr 21, 2017 · 19 comments

Comments

@pelson
Member

pelson commented Apr 21, 2017

screen shot 2017-04-21 at 14 07 37


Transcribed below:

I'm opening this issue because I've done some work with the handwriting
dataset that Randall produced, and am able to programmatically produce
a pleasing font file from it.

First and foremost, I wanted to get your feedback on the approach that
I've taken. I'd be more than happy to discuss options for bringing this
work into this repo (and the logistics/code organisation etc.).

A whole saga has been written up @ https://pelson.github.io/2017/xkcd_font/
with way more detail than is necessary for the everyday interested
party. My intention was to document the whole process so that the
font is entirely reproducible from the source document (the handwriting
sample). Because of the detail, I'm not widely advertising the article's
existence at this point, but will do in the next few weeks/months once
we have decided what is best to do with its findings.

Any input on next steps would be greatly appreciated.

@pelson
Member Author

pelson commented Apr 21, 2017

P.S. My spelling is terrible, and there are a number of typos that need fixing in my writeup (I'm on it 😄 ).

Just in case, CC @rgbkrk & @spu7nik as the most recent contributors.

@rgbkrk
Member

rgbkrk commented Apr 21, 2017

Well that's awesome Phil!

@takluyver
Member

Nice! I really enjoyed the detailed write up, thanks. :-)

@HughP

HughP commented Apr 21, 2017

Nice. I'm going to work my way through the write up. I've been looking for a tutorial on how to do this kind of stuff.

@rgbkrk
Member

rgbkrk commented Apr 21, 2017

That was a fabulous article, I loved the ligature notes.

@Carreau
Member

Carreau commented Apr 21, 2017

Sweet! On the part where you map the segmented drawing to the actual text, would it be possible to classify the bounding boxes by "width" to autodetect the character pairs (or trios)? It might be a bit annoying because of narrow characters, but this can likely be automated.

cc @mpacer as well who will enjoy this.

@Carreau
Member

Carreau commented Apr 21, 2017

Also, now that we have the glyphs, can we map them on all the existing XKCD comics to "learn" the kerning?

@pelson
Member Author

pelson commented Apr 21, 2017

Also, now that we have the glyphs, can we map them on all the existing XKCD comics to "learn" the kerning?

Definitely plausible, but also possibly into the territory of diminishing returns... 😉

On the part where you map the segmented drawing to the actual text, would it be possible to classify the bounding boxes by "width" to autodetect the character pairs (or trios)?

You mean, rather than my defining the paragraph complete with ligatures, I just define the paragraph and auto-detect the wide images as ligatures? I think that is definitely possible, and think it might work well iteratively (as in, while there are more characters than images, pick the widest image as a ligature and figure out which characters they were [somehow]).
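One possible sketch of that iterative idea in Python (not the code used to build the font). It assumes the glyph images arrive in reading order with known pixel widths; the name `assign_ligatures` and its inputs are hypothetical. While characters outnumber images, the image that is widest per character it currently holds absorbs one more character, and the text is then handed out left to right:

```python
def assign_ligatures(text, image_widths):
    """Greedily decide how many characters each glyph image holds.

    text         -- the known transcription of the line (whitespace ignored)
    image_widths -- pixel widths of the segmented images, in reading order
    Both inputs are illustrative assumptions, not part of the xkcd-font code.
    """
    chars = [c for c in text if not c.isspace()]
    if not 0 < len(image_widths) <= len(chars):
        raise ValueError("need at least one image and no more images than characters")

    # Every image starts out holding exactly one character.
    counts = [1] * len(image_widths)

    # Each surplus character goes to the image that is currently
    # widest per character it holds, so a very wide image can grow to a trio.
    for _ in range(len(chars) - len(image_widths)):
        widest = max(range(len(counts)),
                     key=lambda i: image_widths[i] / counts[i])
        counts[widest] += 1

    # Walk the text left to right, giving each image its share of characters.
    groups, pos = [], 0
    for n in counts:
        groups.append("".join(chars[pos:pos + n]))
        pos += n
    return groups


print(assign_ligatures("the cat", [10, 11, 9, 22, 10]))
```

Because the text is assigned strictly in order, one wrong guess shifts every later glyph, so a real pipeline would probably still want a manual check of the result.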

With respect to integrating the work into this repo (if there is interest in doing so), what would you consider the primary generator of the font to be? The code + handwriting sample? The PPM glyphs? The SVG glyphs? The sfd font file? No matter what level we choose, should we include each of these steps in the repo, and if so, how do we manage the fact that they are all build artefacts?

@Carreau
Member

Carreau commented Apr 21, 2017

as in, while there are more characters than images, pick the widest image as a ligature and figure out which characters they were [somehow]

You might need to take the "average" bounding box length, and say "oh, my bounding box is about 3 times the average, it's likely a 3 char ligature". The position of the BB on the x axis may also give you hints about the width of the characters being ligatured and whether they are narrow or wide.
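A rough Python sketch of that ratio heuristic, under stated assumptions: the function name `estimate_char_counts` is hypothetical, and it uses the median rather than the mean as the "average" so that a few wide ligatures do not inflate the typical width.

```python
from statistics import median


def estimate_char_counts(box_widths):
    """Guess how many characters each bounding box contains.

    A box about k times the typical width is treated as a k-character
    ligature. Narrow glyphs (e.g. "i") are clamped to one character.
    """
    typical = median(box_widths)  # assumption: median stands in for "average"
    return [max(1, round(width / typical)) for width in box_widths]


print(estimate_char_counts([10, 12, 11, 31, 9, 22]))
```

The estimates can disagree with the known character count of the line, which is a useful signal that a narrow pair has been mistaken for a single wide character or vice versa.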

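You might also be able to say:

  • I have N characters on my line, with typical widths w_i
  • B bounding boxes of widths b_i
  • So N - B characters have been absorbed into ligatures (L = N - B ligatures if they are all pairs).

A ligature's expected width is the sum of the widths of its characters, \sum w_i.

Then optimize the positions of the ligatures to minimize the distance between each bounding-box width and the expected width of the characters assigned to that box (a single w_i, or \sum w_i for a ligature).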

You can use simulated annealing, but it may be too much.
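Because the characters and boxes are both in reading order, this particular optimization can also be solved exactly with dynamic programming instead of simulated annealing. The sketch below is that swapped-in technique, not something proposed in this thread: `align_boxes`, `char_widths` (typical width per character, assumed known or estimated) and `max_group` (longest ligature considered) are all hypothetical names.

```python
from itertools import accumulate


def align_boxes(char_widths, box_widths, max_group=3):
    """Split the characters into one contiguous group per bounding box.

    Minimizes the sum over boxes of |box width - summed character widths|.
    Returns the number of characters assigned to each box, in order.
    """
    n, m = len(char_widths), len(box_widths)
    prefix = [0.0] + list(accumulate(char_widths))  # prefix[j] = w_1 + ... + w_j
    inf = float("inf")

    # cost[i][j]: best mismatch when the first i boxes cover the first j characters.
    cost = [[inf] * (n + 1) for _ in range(m + 1)]
    back = [[0] * (n + 1) for _ in range(m + 1)]
    cost[0][0] = 0.0

    for i in range(1, m + 1):
        for j in range(i, n + 1):
            # Try letting box i hold the last k characters of the first j.
            for k in range(1, min(max_group, j) + 1):
                prev = cost[i - 1][j - k]
                if prev == inf:
                    continue
                group_width = prefix[j] - prefix[j - k]
                candidate = prev + abs(box_widths[i - 1] - group_width)
                if candidate < cost[i][j]:
                    cost[i][j] = candidate
                    back[i][j] = k

    if cost[m][n] == inf:
        raise ValueError("no alignment fits these counts and max_group")

    # Follow the back-pointers from the last box to recover group sizes.
    sizes, j = [], n
    for i in range(m, 0, -1):
        sizes.append(back[i][j])
        j -= back[i][j]
    return sizes[::-1]


# Five characters, four boxes: the second box is a wide + narrow pair.
print(align_boxes([10, 10, 4, 10, 10], [10, 14, 10, 10]))
```

The run time is O(B · N · max_group), which is tiny for a line of handwriting, so a stochastic search is probably only worth it if the cost function becomes more elaborate (for example, also using the x positions of the boxes).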

With respect to integrating the work into this repo (if there is interest in doing so), what would you consider the primary generator of the font to be? The code + handwriting sample? The PPM glyphs? The SVG glyphs? The sfd font file? No matter what level we choose, should we include each of these steps in the repo, and if so, how do we manage the fact that they are all build artefacts?

I would minimize the chances of having out-of-date artifacts and of people modifying autogenerated files. Maybe we can get it to build on Travis...

@pelson
Member Author

pelson commented Apr 22, 2017

I'm up for having it build on travis. If we are doing that though, we will also need to have some integration test(s) to confirm the font continues to look correct (not a biggy, just making it clear).

@damianavila
Member

Lovely saga @pelson!!

Because of the detail, I'm not widely advertising the article's
existence at this point, but will do in the next few weeks/months once
we have decided what is best to do with its findings.

We can share it now, am I right?

@rgbkrk
Member

rgbkrk commented Jun 2, 2017

When are you comfortable with this post being widely dispersed? We just brought this up at the dev meetings and I personally would love to share it more widely.

@rgbkrk
Member

rgbkrk commented Jun 2, 2017

As for what to do, I think it would be great to incorporate your new glyphs into the current font.

@pelson
Member Author

pelson commented Jun 14, 2017

I've opened a PR. Hopefully that might lead to a happy ending to the saga 😉 .

@takluyver
Member

Closing as that PR was merged; thanks @pelson !

@pelson
Member Author

pelson commented Jun 21, 2017

Cool. I'm going to do a bit more tidy work on the code in the repo, then I'll publicise the blog post in the next few days/week or so to see if we can get a burst of potential collaborators.

@damianavila
Member

Let us know when you publicize it so we can spread the word 😉

@mpacer
Member

mpacer commented Jun 21, 2017 via email

@pelson
Member Author

pelson commented Jun 22, 2017

Any word on getting conda-forge versions of the various packages you use?

I haven't yet gone down that road, though it is the obvious next step in terms of the software stack. In truth, this has been a hobby project that I got so far into that I couldn't bear to see it just dropped, but I don't have the capacity to maintain a fully functional conda-forge build of fontforge. The compromise I made was to have a pre-built (albeit unreproducible) Docker image with the tools necessary for the job. I'm sure you've seen that you can currently get hold of that on Docker Hub as pelson/fontbuilder.

Hopefully, the work that I've done here has reduced the barrier to entry for improvements such as the one you pointed out. We will at some point get to the interesting situation where "improvements" will be highly subjective, and I recommend we find some way to meet that head-on (perhaps a BDOFFL? [benign dictator of fonts for life], or a rate-my-changes voting policy)
