Draft: Introduce a file name parser #706

Closed
wants to merge 21 commits into from

Conversation

Finii
Collaborator

@Finii Finii commented Dec 5, 2021

This is obsolete => See #717

Description

The font-patcher shall patch (almost) any input font file and generate a useful new font file. This can be complicated because the font name and font name parts need to be changed in the process. At least the "Nerd Font" label shall be added to the name, so that the fonts are differentiated from the original fonts if both are installed.
This is furthermore complicated, because the "Nerd Fonts" has to be placed at the right position in the font name.

Then there are some fonts with issues, where the grouping of related fonts is not correct in some applications, or some become invisible.

The current solution (in font-patcher:setup_font_names) is not very systematic. This MR shall develop a new way to set up the names (and naming parts) of the patched fonts and more.

The goal is to have a very robust font (re)naming implemented so that any set of fonts can be patched without loss of usability (and possibly with increased usability) apart from the added glyphs. That would make adding new source fonts a breeze (like all the Cascadia styles) and additionally helps people who patch other fonts for their own use.

What does this Pull Request (PR) do?

At the moment the proposed algorithm is added as a test script. It uses exclusively the font filenames to generate the font naming parts.
To show that this approach is viable:

  • It is added as a test script
  • The script creates all the relevant naming parts (*) of all files in src/unpatched-fonts and compares these to the embedded (original) information in all these files
  • While comparing, a well-defined set of 'lenience' rules (where we do not compare too exactly) is used
  • For cases that still differ there is a file with extra allowances that each has the reason documented (usually the generated names are more correct than the original names)
  • The test script can be re-run any time and it reports in detail which font changed in which respect compared to the last run. That helps a lot when tweaking the new name generation algorithm

More will be explained later.

The proposed algorithm is a Python class like this:

class FontnameParser:             """Deconstruct a font filename to get standardized name parts"""
    
    def __init__(self, filename): """Parse a font filename and store the results"""
    
    def fullname(self):           """Get the SFNT Fullname (ID 4)"""

    def psname(self):             """Get the SFNT PostScriptName (ID 6)"""
    
    def preferred_family(self):   """Get the SFNT Preferred Familyname (ID 16)"""
    
    def preferred_styles(self):   """Get the SFNT Preferred Styles (ID 17)"""

    def family(self):             """Get the SFNT Familyname (ID 1)"""
    
    def subfamily(self):          """Get the SFNT SubFamily (ID 2)"""
    
    def ps_familyname(self):      """Get the PS Familyname"""

    def ps_fontname(self):        """Get the PS fontname"""
    
    def macstyle(self, style):    """Modify a given macStyle value for current name, just bits 0 and 1 touched"""
    
    def fs_selection(self, fs):   """Modify a given fsSelection value for current name, bits 0, 5, 6, 8, 9 touched"""

# Of course there are some methods that we need for our renaming purposes like for example:

    def inject_suffix(self, fullname, fontname, family): """Add a custom additional string that shows up in the resulting names"""
    
    def set_for_windows(self, for_windows):              """Create slightly different names, suitable for Windows use"""
        
    def add_name_substitution_table(self, table):        """Have some fonts renamed, takes list of tuples (regex, replacement) (SIL table)"""
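
The two bit-twiddling methods above can be illustrated with a minimal sketch. The bit meanings follow the OpenType 'head' (macStyle) and 'OS/2' (fsSelection) tables; the function names and the flag parameters are hypothetical and not the PR's actual implementation, and setting the Regular bit only when no Bold, Italic or Oblique is present is an assumption.

```python
# Bit positions as defined by the OpenType 'head' and 'OS/2' tables.
MACSTYLE_BOLD = 1 << 0
MACSTYLE_ITALIC = 1 << 1

FS_ITALIC = 1 << 0
FS_BOLD = 1 << 5
FS_REGULAR = 1 << 6
FS_WWS = 1 << 8
FS_OBLIQUE = 1 << 9


def update_macstyle(style, bold, italic):
    """Return macStyle with bits 0 and 1 recomputed, all other bits kept."""
    style &= ~(MACSTYLE_BOLD | MACSTYLE_ITALIC)
    if bold:
        style |= MACSTYLE_BOLD
    if italic:
        style |= MACSTYLE_ITALIC
    return style


def update_fs_selection(fs, bold, italic, oblique=False, wws=False):
    """Return fsSelection with bits 0, 5, 6, 8 and 9 recomputed, others kept."""
    fs &= ~(FS_ITALIC | FS_BOLD | FS_REGULAR | FS_WWS | FS_OBLIQUE)
    if italic:
        fs |= FS_ITALIC
    if bold:
        fs |= FS_BOLD
    if oblique:
        fs |= FS_OBLIQUE
    if wws:
        fs |= FS_WWS
    # Assumption: Regular is only flagged when no other style bit applies.
    if not (bold or italic or oblique):
        fs |= FS_REGULAR
    return fs
```

Passing the old value in and getting the new value back mirrors the signatures `macstyle(self, style)` and `fs_selection(self, fs)` listed above, so unrelated bits survive the renaming.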

The usage rather goes along these lines:

n = FontnameParser(fname).enable_short_style_when('Noto').add_name_substitution_table(SIL_TABLE)
n.set_keep_regular_in_family(is_keep_regular(n)) # Standard is FALSE
n.inject_suffix("Nerd Font Complete Mono", "Nerd Font Complete M", "Nerd Font Mono")

How should this be manually tested?

$ fontforge bin/scripts/name_parser_test1 src/unpatched-fonts/**/*.[ot]tf 2>/dev/null

Any background context you can provide?

What are the relevant tickets (if any)?

#663 #690 #695

Screenshots (if appropriate or helpful)

Requirements / Checklist

Put this last, as this is not so important at the moment.

  • Read the Contributing Guidelines
  • Read or at least glanced at the FAQ
  • Read or at least glanced at the Wiki
  • Scripts execute without error (if necessary):
    • If any of the scripts were modified they have been tested and execute without error, e.g.:
      • ./font-patcher Inconsolata.otf --fontawesome --octicons --pomicons
      • ./gotta-patch-em-all-font-patcher\!.sh Hermit
  • Extended the README and documentation if necessary, e.g. You added a new font please update the table

DO NOT MERGE

Invoke:
$ fontforge bin/scripts/name_parser_test1 src/unpatched-fonts/**/*.[ot]tf 2>/dev/null

This looks into name_parser_test1.known_issues for explained issues
and generates a new known_issues file from the findings.

If there are new issues they will turn up as 'AUTOGENERATED', and
obsolete issues (fixed ones) will be listed at the end of the new file.

In this way one can tweak the code and compare very easily what a change
means for all the fonts, which will break or be repaired.

This can also be used if new fonts are added, one can check the parser
output and potentially add the needed new known_issue rules by just
moving known_issues.new to known_issues.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
@Finii Finii added the not ready label Dec 5, 2021
@Finii
Collaborator Author

Finii commented Dec 5, 2021

I just wanted to get this going/official.
Maybe @ryanoasis can comment if this is out of the question or if there is room that I can convince :-)

Further explanations, tables, screenshots, reference websites, etc pp will be added tomorrow (or so). Please bear with me.

[why]
We want to reuse the code in different tests.

[how]
Modularize.

[note]
Also remove wrong 'close' call.
Also correct remark on Lilex.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
Finii added a commit to adam7/delugia-code that referenced this pull request Dec 7, 2021
[why]
The Light fonts are not grouped to the Regular and Bold fonts in
typographic names aware applications.

[how]
This is a longer standing problem with font-patcher, that does not fill
the typographic family and style with meaningful values. We could do
better with the `do_rename` script, but we did not.

Instead of repairing the `do_rename` script the `name_parser.py` module
is included, that I would like to get into `font-patcher` to repair its
behaviour for all fonts.
As no one knows when or if at all it will end up in the patcher script,
we add the same code here.

The module will set the 4 names (family, subfamily, typogr. family and
typogr. style) in a way that we get correct grouping in both application
types: typographic aware one and 'ordinary' ones.

[note]
The `name_parser.py` originates here:
ryanoasis/nerd-fonts#706

Fixes: #72

Reported-by: Rashil Gandhi <mail@rashil2000.me>
Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
[why]
The fontforge Python interface is not able to easily remove existing
SFNT entries.

[how]
Because it is not so easy to apply the findings of the FontnameParser to
a font a helper function is added that does all the necessary bits and
pieces.

[note]
Also fix instances where name_parser uses Python features that are
too 'new' on some machines (i.e. Gitlab Ubu 18.04 runner).

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
@Finii
Collaborator Author

Finii commented Dec 7, 2021

Added a convenience function to the parser, that applies all the needed stuff to a font.

Example:

font = fontforge.open(fname)

n = FontnameParser(fname)
n.enable_short_style_when('Noto')
n.add_name_substitution_table(SIL_TABLE)
n.inject_suffix("Nerd Font Complete Mono", "Nerd Font Complete M", "Nerd Font Mono")

n.rename_font(font)        #    <=-  does all the magic

Use this thing now at adam7/delugia-code@203324a

@Finii
Collaborator Author

Finii commented Dec 7, 2021

The last commit (555b4ac) handles the stuff mentioned in #690.

This MR is favored for fixing, as it fixes also other stuff.

@Finii
Collaborator Author

Finii commented Dec 7, 2021

Codeclimate ... *cough*

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
The very low number of lines per file forces us to split up the stuff
into two files...
To still keep this a bit organized (and with the background that the
code probably will end up in the font-patcher file anyhow), all this
stuff is put into a dedicated subdir.

The call changes thus:

$ fontforge bin/scripts/name_parser/name_parser_test1 src/unpatched-fonts/**/*.[ot]tf 2>/dev/null

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
@Finii Finii force-pushed the feature/rewrite-setup_font_names branch from d85f8af to 694e999 Compare December 8, 2021 08:39
@Finii
Collaborator Author

Finii commented Dec 8, 2021

Fixed the code climate stuff. There are some changes for the better but also some that I would rather have put to 'wontfix'. Especially the limit to 250 LOC is ridiculously a bit low.
I also guess the mental burden calculation is ... interesting: list comprehension seems to count zero, a readable-for-non-pythonesioans loop construct is expensive... right :-(

Whatever. Let's turn to the actual issues this shall fix, in the next comment.

[why]
Fontforge makes a lot of assumptions when we want to manipulate the SFNT
table. We do not want that; we want to get what we asked for.

[how]
Manipulate the SFNT tuples directly.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
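
Working on the SFNT tuples directly, as the commit above describes, could look roughly like the following sketch. It assumes the names are held as a tuple of `(language, strid, string)` entries, which is how fontforge exposes `font.sfnt_names`; the helper name `set_sfnt_name` is hypothetical and this is not the code from the PR.

```python
def set_sfnt_name(sfnt_names, language, strid, value):
    """Return a new tuple with the (language, strid) entry set to value.

    If value is None the entry is removed instead of replaced.
    """
    # Drop any existing entry for this language/strid pair.
    kept = [entry for entry in sfnt_names
            if (entry[0], entry[1]) != (language, strid)]
    if value is not None:
        kept.append((language, strid, value))
    return tuple(kept)


# Example data in the assumed (language, strid, string) layout.
names = (
    ("English (US)", "Family", "Cascadia Code Light"),
    ("English (US)", "SubFamily", "Regular"),
)
names = set_sfnt_name(names, "English (US)", "Family", "CaskaydiaCove NF Light")
names = set_sfnt_name(names, "English (US)", "SubFamily", None)
```

In a fontforge script the resulting tuple would presumably be assigned back to `font.sfnt_names`; that assignment is an assumption here and is not exercised by the sketch.
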
[why]
Suffixes should not start with a blank or end with one, the resulting
names would look wrong. But `font-patcher` calculates them with a
beginning blank (for easy addition in the old code).

It is more convenient and also more robust if the parser makes sure that
no superfluous blanks are added.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
[why]
These scripts can help to detect issues with fonts.
Of course there are also a lot of other tools that do the same.

These tools document how the comments in the PR were generated.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
[why]
To correct a lot of problems we want to use a new way to set the
various name parts in the patched font.

[how]
Keep the old behavior, add new command line flag that enables the new
method (i.e. --parser).

The result will be unchanged if the flag is not given.

The FontnameParser is only used to set the names if the flag is given.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
@Finii
Collaborator Author

Finii commented Dec 8, 2021

Here one motivation example, why we want a better renaming process.

The helper scripts are also in this branch. I removed some 'noise' from the output.

$ git status
On branch feature/rewrite-setup_font_names
Your branch is up to date with 'origin/feature/rewrite-setup_font_names'.

$ echo Examine one font from patched_fonts/
$ fontforge bin/scripts/name_parser/query_names patched-fonts/CascadiaCode/Light/complete/Caskaydia\ Cove\ Light\ Nerd\ Font\ Complete\ Windows\ Compatible.otf 2>/dev/null
Examining 1 font files
 |Filename                                           | | Fullname                                           | | Family                         | | Subfamily                      | | Typogr. Family                 | | Typogr. Subfamily
 |-------------------------------------------------- |-| -------------------------------------------------- |-| ------------------------------ |-| ------------------------------ |-| ------------------------------ |-| ------------------------------
 |Caskaydia Cove Light Nerd Font Complete Windows Co | | Caskaydia Cove Light Nerd Font Complete Windows Co | | Cascadia Code Light            | | Regular                        | | CaskaydiaCove NF               | | Light

$ echo The Family name is strange, lets see:
$ rm Ca*
$ fontforge --script font-patcher --powerline -w  src/unpatched-fonts/CascadiaCode/Light/CascadiaCode-Light.otf >/dev/null 2>&1
$ fontforge bin/scripts/name_parser/query_names Ca* 2>/dev/null
Examining 1 font files
 |Filename                                           | | Fullname                                           | | Family                         | | Subfamily                      | | Typogr. Family                 | | Typogr. Subfamily
 |-------------------------------------------------- |-| -------------------------------------------------- |-| ------------------------------ |-| ------------------------------ |-| ------------------------------ |-| ------------------------------
 |Caskaydia Cove Light Nerd Font Windows Compatible. | | Caskaydia Cove Light Nerd Font Windows Compatible  | | Cascadia Code Light            | | Regular                        | | CaskaydiaCove NF               | | Light

$ echo Is also present when we patch afresh...
$ echo Now lets use the FontnameParser, just add --parser
$ rm Ca*
$ fontforge --script font-patcher --powerline -w  --parser src/unpatched-fonts/CascadiaCode/Light/CascadiaCode-Light.otf >/dev/null 2>&1
$ fontforge bin/scripts/name_parser/query_names Ca* 2>/dev/null
Examining 1 font files
 |Filename                                           | | Fullname                                           | | Family                         | | Subfamily                      | | Typogr. Family                 | | Typogr. Subfamily
 |-------------------------------------------------- |-| -------------------------------------------------- |-| ------------------------------ |-| ------------------------------ |-| ------------------------------ |-| ------------------------------
 |CaskaydiaCove Nerd Font Windows Compatible Light.o | | CaskaydiaCove Nerd Font Windows Compatible Light   | | CaskaydiaCove NF Light         | | Regular                        | | CaskaydiaCove NF               | | Light

$ echo Fine :-)

The details are here:
image

  • The "Nerd Font" is attached after the weight, that is rather 'unusual' for a name part
  • The Family name is NOT overwritten but contains Cascadia
  • The Fullname contains a blank between Caskaydia and Code, while the Familyname does not. That is wrong.

Whether the renaming should result in Caskaydia Code or CaskaydiaCode can be easily tweaked in the SIL_TABLE; it can be changed to whatever we want there.
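
As a rough illustration of how such a table of `(regex, replacement)` tuples might be applied, here is a minimal sketch. The table entries, the case-insensitive matching and the "first matching rule wins" behavior are assumptions for illustration, not the contents of the real SIL_TABLE.

```python
import re

# Hypothetical table: (regex, replacement) pairs. The optional blank is
# captured so that it survives the renaming unchanged.
EXAMPLE_TABLE = [
    (r"cascadia( ?)code", r"Caskaydia\1Cove"),
    (r"cascadia( ?)mono", r"Caskaydia\1Mono"),
]


def apply_substitutions(basename, table):
    """Return basename with the first matching rule applied, else unchanged."""
    for pattern, replacement in table:
        renamed, count = re.subn(pattern, replacement, basename,
                                 count=1, flags=re.IGNORECASE)
        if count:
            return renamed
    return basename
```

With a capture group for the blank, the decision "with or without blank" moves out of the table and into the input name; dropping the `\1` from the replacement would force the CamelCase form instead.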

[why]
The `font-patcher` suppresses name parts that are 'for powerline' or
'powerline'. The reason is that all Nerd Fonts contain these
anyhow.

[note]
Change `font-patcher` to use the new function.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
[why]
Somehow the line breaks of now-dropped known issues are missing

[how]
Add a newline to output files.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
[why]
When we apply one substitution table and then switch to another the old
table seems to keep being active.

[how]
If there is no new replacement the old replacement is not undone.
Restore the basename now always to the default/startvalue.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
[why]
The `font-patcher` (historically?) created the Family names of 'Caskaydia Cove'
without the blank as 'CaskaydiaCove' (same for Mono). This SIL table
adapted that behavior.
But then, it seems almost wrong.

[how]
Change it back to include the blank (if existing).

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
@Finii
Collaborator Author

Finii commented Dec 8, 2021

I changed the SIL table, so that name parts are separated by blanks, as in the original fonts. Ultimately this is a matter of taste and I do not care much.

But Fullname = Family + SubFamily should hold, so we need to do it either in both (Fullname and Family) or in neither. The current font-patcher removes (some) blanks in the Family but not in the Fullname.
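
The consistency rule can be written down as a small check. This is only a sketch: the helper names are hypothetical, omitting a plain 'Regular' is an assumption, and real fonts need more exceptions (for example abbreviated family names for Windows).

```python
def expected_fullname(family, subfamily):
    """Join Family and SubFamily; a plain 'Regular' is assumed to be omitted."""
    if subfamily.lower() == "regular":
        return family
    return "{} {}".format(family, subfamily)


def is_consistent(fullname, family, subfamily):
    """True if Fullname equals Family + SubFamily (blanks are significant)."""
    return fullname == expected_fullname(family, subfamily)


# Names taken from the Noto Sans Display query outputs in this thread.
with_parser = is_consistent(
    "Noto Sans Display Nerd Font ExtraCondensed Thin Italic",
    "Noto Sans Display Nerd Font ExtraCondensed Thin",
    "Italic",
)
classic = is_consistent(
    "Noto Sans Display ExtraCondensed Thin Italic Nerd Font Complete Mono",
    "NotoSansDisplay Nerd Font Mono",
    "Italic",
)
```

Here `with_parser` is true and `classic` is false, which is the mismatch between Family and Fullname described above.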

Nice that we can test if code changes resulted in any unexpected changes... but no:

$ fontforge bin/scripts/name_parser/name_parser_test1 src/unpatched-fonts/**/*.[ot]tf 2>/dev/null 
[...]
Fonts with different name rendering: 70/740 (70/70 are in known_issues)

So all the 'issues' are already known and explained for / accepted. ;-)


CodeClimate is so bad ...

image

Refactor one function away ... yes, that improves code quality for sure *raise eyebrows*

[why]
I have no clue.

[how]
Just drop a debugging function. Who needs debugging.

And we can get the data out of the object anyhow.
Encapsulation in Python is a farce ;-)

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
[why]
It can be good to query more than one SFNT value at once.

[how]
The key can now be given as comma-separated-list.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
@Finii
Collaborator Author

Finii commented Dec 8, 2021

Here is another Issue of the current font-patcher naming scheme:

$ fontforge bin/scripts/name_parser/query_sftn \
    'Fullname,Family,SubFamily,Preferred Family,Preferred Styles' \
    src/unpatched-fonts/Noto/Sans-Display/NotoSansDisplay-ExtraCondensedThinItalic.ttf 2>/dev/null                                        
SFNT Fullname             is 'Noto Sans Display ExtraCondensed Thin Italic'                                  
SFNT Family               is 'Noto Sans Disp ExtCond Thin'                                                   
SFNT SubFamily            is 'Italic'                                                                        
SFNT Preferred Family     is 'Noto Sans Display'                                                             
SFNT Preferred Styles     is 'ExtraCondensed Thin Italic'                                                    

$ fontforge bin/scripts/name_parser/query_sftn \
    'Fullname,Family,SubFamily,Preferred Family,Preferred Styles' \
    patched-fonts/Noto/Sans-Display/complete/Noto\ Sans\ Display\ ExtraCondensed\ Thin\ Italic\ Nerd\ Font\ Complete\ Mono.ttf 2>/dev/null
SFNT Fullname             is 'Noto Sans Display ExtraCondensed Thin Italic Nerd Font Complete Mono'          
SFNT Family               is 'NotoSansDisplay Nerd Font Mono'                                                
SFNT SubFamily            is 'Italic'                                                                        
SFNT Preferred Family     is 'NotoSansDisplay Nerd Font Mono'                                                
SFNT Preferred Styles     is 'ExtraCondensed Thin Italic'

image

Both the ExtraCondensed and the Thin are missing from the Family name. That means that all weights of that font will fall into one font group for 'not typographically aware' applications (which are most, I assume), and thus are unreachable.
It is in the Preferred stuff for 'typogr. aware' applications.

@Finii
Collaborator Author

Finii commented Dec 8, 2021

Compare with a font-patcher --parser run (i.e. with this MR enabled), we get a correct Family:

$ fontforge font-patcher --powerline --parser src/unpatched-fonts/Noto/Sans-Display/NotoSansDisplay-ExtraCondensedThinItalic.ttf
[...] Generated: NotoSansDisplayNerdFont-ExtraCondensedThinItalic
$ fontforge bin/scripts/name_parser/query_sftn \
    'Fullname,Family,SubFamily,Preferred Family,Preferred Styles' \
    Noto* 2>/dev/null
SFNT Fullname             is 'Noto Sans Display Nerd Font ExtraCondensed Thin Italic'                        
SFNT Family               is 'Noto Sans Display Nerd Font ExtraCondensed Thin'                               
SFNT SubFamily            is 'Italic'                                                                        
SFNT Preferred Family     is 'Noto Sans Display Nerd Font'                                                   
SFNT Preferred Styles     is 'ExtraCondensed Thin Italic'      

[why]
The font-patcher historically grouped all name parts of the original
font together (if there are any) to form a camel case name:
  Cascadia Code => CaskaydiaCove

But only in the Family and Preferred Family names.

[how]
We already have an option to create short(er) Families by usage of
abbreviated style names (stemming from Noto).

Add an additional option that CamelCases the basic fontname in the
Family names.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
@Finii
Collaborator Author

Finii commented Dec 9, 2021

Now we mimic the historic font-patcher behavior with CamelCase font base names in the Families, but not in the Fullnames:

$ fontforge font-patcher --powerline --parser src/unpatched-fonts/Noto/Sans-Display/NotoSansDisplay-ExtraCondensedThinItalic.ttf
[...] Generated: NotoSansDisplayNerdFont-ExtraCondensedThinItalic
$ fontforge bin/scripts/name_parser/query_sftn \
    'Fullname,Family,SubFamily,Preferred Family,Preferred Styles' \
    Noto* 2>/dev/null
SFNT Fullname             is 'Noto Sans Display Nerd Font ExtraCondensed Thin Italic'                        
SFNT Family               is 'NotoSansDisplay Nerd Font ExtraCondensed Thin'                                 
SFNT SubFamily            is 'Italic'                                                                        
SFNT Preferred Family     is 'NotoSansDisplay Nerd Font'                                                     
SFNT Preferred Styles     is 'ExtraCondensed Thin Italic'  

I am not sure if the behavior makes any sense or is accidental.

If we really strive for short Family names, we should maybe activate 'short styles' also, for all fonts.

But for the moment lets stick as close to the old naming as reasonably possible.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
The script patches a given font two times with font-patcher
* 'normal', just with --powerline and nothing more
* one time with --parser specified (will use the FontnameParser)

The font files that resulted will be opened and the embedded names
compared. It is important to really patch a file and see the outcome
because fontforge is a bit unpredictable when it comes to what actually
is set in a font file ;)

The names are compared with a bit of lenience, like with
name_parser_test1.

What is missing is a known-issues file. Will add that later.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
@Finii Finii force-pushed the feature/rewrite-setup_font_names branch from 4370add to e77f382 Compare December 9, 2021 16:16
@Finii
Collaborator Author

Finii commented Dec 9, 2021

(Force push because commit title was wrong 🙄 (CWD vs PWD))

@Finii
Collaborator Author

Finii commented Dec 10, 2021

I guess this MR has too much noise.
And it explains it from the wrong end.

Will close this and reopen a new one, with hopefully better reasoning.

(All the explanatory scripts are ready now, I guess.)

@Finii Finii closed this Dec 10, 2021
@ryanoasis
Owner

hey @Finii I was somewhat following along in the shadows. Sorry I haven't commented until now. All this work is really impressive and to be honest I feel a lot of it is above my head (at least at this moment) so I wasn't sure what to comment specifically. It's impressive work, I'll be reviewing some other PRs (including yours) in the meantime.

Will close this and reopen a new one, with hopefully better reasoning.

Sure if that makes more sense to you.

Also I did see that the font patcher test workflows finally showed up and that was of particular interest to me at that point.

👍🏻

@Finii
Collaborator Author

Finii commented Dec 11, 2021

Yes, sorry; to prepare all the additional material (reasoning, tests, explanations) takes much more time than the actual coding itself. One test-run with all fonts takes like 3 hours on my machine. But I guess I'm getting close to presenting this in an easier-to-follow way. This here is way too convoluted in train of thought.

The problem is to present the results (741 fonts with all possibly different names) in a way that is easy to follow and is not more than some lines long, but still explains everything :-D

@Finii
Collaborator Author

Finii commented Dec 11, 2021

If you like some preliminary reading, this is what I have written so far for the upcoming PR. ;-)
No need to read it now, just if you are bored.


Creating Consistently Grouped Patched Fonts

This is a small sub-project to font-patcher that uses a little bit more knowledge
to come up with font names and name parts. In applications multiple fonts are grouped
under a 'Family'. Each member of the Family has a different 'SubFamily' or 'Style'.

Consider a font named 'Times' that has two variants: normal and bold. For this font the
Family would be 'Times' and the 'Style' would be 'Regular' (i.e. normal) in one file and
'Bold' in the other file.

With this applications are able to group all 'Times' together and additionally choose the
'Bold' font if the user pushes the 'B' button on the font style dialog in that application.

Motivation

Quite a number of patched fonts have inconsistent or simply wrong font grouping. The naming in
general is sometimes surprising and not following naming conventions. This is in part due to
the font-patcher, but in part the source fonts are already strange.
This results in invisible (but installed) fonts in some applications, inconsistent naming
(Familyname differs from Fullname) and not correctly working bold/italic selectors in some applications.

And we would like to have the information within the names sorted in a consistent way.
Usually a font name consists of these parts (in this order):

  1. Name base (e.g. Noto)
  2. Variant (e.g. Sans)
  3. Subvariant (e.g. Display)
  4. Weight (e.g. Black)
  5. Style (e.g. Italic)

This is important because we want to add subvariant information, namely the Nerd Font part.

Example:

  • (old) Iosevka Term Light Italic Nerd Font
  • (new) Iosevka Term Nerd Font Light Italic
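
The reordering in the example above can be sketched as follows. The word list is an assumed, incomplete vocabulary and the function name is hypothetical; the real parser categorizes name parts far more thoroughly.

```python
# Assumed (incomplete) vocabulary of weight and style words.
WEIGHT_AND_STYLE_WORDS = {
    "thin", "extralight", "light", "regular", "medium", "semibold",
    "bold", "extrabold", "black", "italic", "oblique",
}


def insert_before_weight(name, label="Nerd Font"):
    """Place label after the base/variant words and before weight/style."""
    words = name.split()
    for index, word in enumerate(words):
        if word.lower() in WEIGHT_AND_STYLE_WORDS:
            return " ".join(words[:index] + [label] + words[index:])
    # No weight or style word found: the label simply goes last.
    return " ".join(words + [label])
```

Called with `"Iosevka Term Light Italic"` this yields the (new) ordering from the list above, because 'Light' is the first word recognized as a weight.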

The Plan

To solve these issues the font name parts have to be analyzed more thoroughly and then categorized.
These categories are then used to assemble the names in correct order. The simple (not
typographically aware) applications shall always get groups of at most four styles, and these
are Regular, Bold, Italic, and Bold-Italic. Other styles turn up as Families, because this is the
only way they would work in these more simple applications.

Typographically aware applications, on the other hand, get all styles grouped under one Family name.
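
A minimal sketch of this grouping idea, assuming the name has already been split into a base name and a list of style words. The function name and the simple 'Bold'/'Italic' test are assumptions for illustration, not the parser's actual logic.

```python
RIBBI_WORDS = ("Bold", "Italic")  # styles that simple applications understand


def split_for_grouping(base, styles):
    """Return (family, subfamily, typographic family, typographic subfamily)."""
    ribbi = [s for s in styles if s in RIBBI_WORDS]
    other = [s for s in styles if s not in RIBBI_WORDS and s != "Regular"]
    # Simple applications: everything beyond Bold/Italic moves into the Family.
    family = " ".join([base] + other)
    subfamily = " ".join(ribbi) or "Regular"
    # Typographically aware applications: one Family, all styles below it.
    typographic_subfamily = " ".join(other + ribbi) or "Regular"
    return family, subfamily, base, typographic_subfamily


noto = split_for_grouping(
    "Noto Sans Display Nerd Font", ["ExtraCondensed", "Thin", "Italic"]
)
```

For the Noto example this gives the Family 'Noto Sans Display Nerd Font ExtraCondensed Thin' with SubFamily 'Italic', and the typographic pair 'Noto Sans Display Nerd Font' / 'ExtraCondensed Thin Italic', matching the `--parser` output quoted earlier in the thread.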

First experiments showed that the full information can usually be restored already from the file
names that our source fonts have.

This new naming is completely optional (but recommended). Give the option --parser to font-patcher
and it will try to come up with reasonable grouping and naming. Leave the option out and it will
work as it always did.

The Tests

In this directory there are two tests.

  1. The first test checks the basics of the algorithm. It takes the filenames of all fonts in
    src/unpatched-fonts, then it calculates the naming and compares it to the original
    naming in the font files. Ideally they would be equal.
  2. The second test does a 'production run'. It takes each font in src/unpatched-fonts/
    and patches it two times: Once without --parser and once with. Then it compares the
    naming, and it also shows the original font naming (for comparison).

All tests are based on these assumptions:

  • Fullname must be roughly equal
  • Fontname must be roughly equal
  • Familyname must be roughly equal, order of all words does not matter
    (Order of words is ignored with test 2 only)
  • SubFamilyname must be equal, order of words does not matter
    (First word must be equal, order of other words is ignored with test 2 only)
  • Typographic names can be empty if the correct typographic name would be equal to the ordinary name
  • Tests are done case insensitive
  • Some special exemptions are made (see lenient_cmp() in test scripts)
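
A rough sketch of what such a lenient comparison could look like is given below. It is not the `lenient_cmp()` from the test scripts; the name `roughly_equal` and the exact normalization (case folded, blanks and dashes ignored) are assumptions.

```python
def roughly_equal(generated, stored, ignore_word_order=False):
    """Compare two names case-insensitively, ignoring blanks and dashes.

    With ignore_word_order the names are compared word by word instead,
    so the words must match but may appear in any order.
    """
    def normalize(name):
        words = name.lower().replace("-", " ").split()
        return sorted(words) if ignore_word_order else "".join(words)

    return normalize(generated) == normalize(stored)
```

The `ignore_word_order` switch corresponds to the rules above where the order of words does not matter; in that mode blanks become significant again because they define the words.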

Test 1

fontforge name_parser_test1 ../../../src/unpatched-fonts/**/*.[ot]tf 2>/dev/null

This test takes the filename of a font, parses it and generates names from it. Then the actual
font is opened and the generated names are compared with the stored names. This test is used
to test the algorithm itself. Of course no SIL table is active as we want to preserve the original
names.

The output shows all the names, always two lines: first the generated names, then the readout
names. If there are differences the generated names are tagged with + and the readout ones
with -. If there are differences the actually different name part is marked with an X.

The differences have reasons, and there is a file with textual explanations for them. So far
all differences are 'ok'. A new run of the script will compare all differences with the stored
ones and alert the user if a new difference is detected (or a difference vanished). In this
way changes of the algorithm can be tested with a wide base of inputs.
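
The bookkeeping of known, new and vanished differences might be sketched like this. The data layout (a set of differing font names and a dict of explanations) and the function name are assumptions; only the 'AUTOGENERATED' marker is taken from the commit message of this PR.

```python
def classify_issues(found, known):
    """Split issues into explained, new and obsolete ones.

    found: set of font names that differ in the current run
    known: dict mapping font name -> textual explanation
    """
    explained = {name: known[name] for name in sorted(found) if name in known}
    new = {name: "AUTOGENERATED" for name in sorted(found) if name not in known}
    obsolete = {name: text for name, text in sorted(known.items())
                if name not in found}
    return explained, new, obsolete


# Hypothetical font names, for illustration only.
explained, new, obsolete = classify_issues(
    found={"FontA-Bold", "FontB-Light"},
    known={"FontA-Bold": "original Family is wrong", "FontC-Thin": "fixed meanwhile"},
)
```

Writing `explained` and `new` to a fresh known-issues file, followed by `obsolete` at the end, would reproduce the workflow where a new file is generated on every run and reviewed by hand.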

Test 2

fontforge name_parser_test2 ../../../src/unpatched-fonts/**/*.[ot]tf 2>/dev/null

This test compares actually patched fonts. Every font in src/unpatched-fonts/ is patched two
times: First with the 'old/classic' font-patcher naming, and second with the new naming
algorithm in action (by specifying --parser). Again the name parts are compared with some
lenience and an output generated like test 1 does.

Also again a file with known differences (with explanations) is read, and any new or vanished
differences are reported. In the report an additional line is given, tagged with >, that
contains the names of the original font, for human interpretation (often the reason
for a difference is obvious, because the classic font-patcher dropped information).

Further steps

One can examine all the (current) naming differences in the name_parser_test2.known_issues
file. The Explanation is followed by three lines of names: source-file, patched-with-parser,
and patched-classic.

The Explanation sorts most differences into common groups. This helps to weed out
explanations that might not need much attention.

Helper scripts

There are some helper scripts that help with examining the font files. Of course there are other,
more professional tools to dump font information, but here we get all we need in a concise
way.

@Finii
Collaborator Author

Finii commented Dec 11, 2021

One example, where the current font-patcher has an issue, here: Nerd Font suffix missing in Fullname (??!)

image

@ryanoasis
Owner

Fixed the code climate stuff. There are some changes for the better but also some that I would rather have put to 'wontfix'. Especially the limit to 250 LOC is ridiculously a bit low.
I also guess the mental burden calculation is ... interesting: list comprehension seems to count zero, a readable-for-non-pythonesioans loop construct is expensive... right :-(

Just so my thought isn't lost, on the code climate. I just threw that integration in a few years maybe ago? I'm not tied to it and up for any better suggestions.

That said there isn't much fine tuning available via the UI however there is some advanced configuration to make note of that is interesting: https://docs.codeclimate.com/docs/advanced-configuration#default-checks

@Finii
Collaborator Author

Finii commented Dec 13, 2021

One example, where the current font-patcher has an issue, here: Nerd Font suffix missing in Fullname (??!)

image

> is original font
+ is new 'parser'
- is current behavior

Forgot to push Comment 🙄

@Finii
Collaborator Author

Finii commented Dec 15, 2021

I just threw that integration in a few years maybe ago? I'm not tied to it and up for any better suggestions.

I reckon it's better than nothing. It did trigger some improvements in my code here. Something should be there.

Personally I have no experience with code quality for Python, I do C++ the whole day ;)

That said there isn't much fine tuning available via the UI however there is some advanced configuration to make note of that is interesting: https://docs.codeclimate.com/docs/advanced-configuration#default-checks

I added some tweaks to it in my new (to be completed) MR, namely:

| name              | new limit | old limit |
|-------------------|-----------|-----------|
| file-lines        | 500       | 250       |
| method-complexity | 9         | 5         |
| method-count      | 30        | 20        |
| method-lines      | 50        | 25        |

I believe these values are still low enough to encourage more thoughts being put into code structure, but the code can leave the 'casual script' level.
