New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
makeutype is_Ligatures.c #2762
makeutype is_Ligatures.c #2762
Conversation
The "python" program to build these files contributed to a headache worth of problems earlier between 2012 and 2014 when Fontforge built python as an "optional" choice and not a "forced" dependency. Removing the python program allowed us to continue onto 20140101 and onwards... Example problems/issues fontforge#384, fontforge#387, fontforge#229, fontforge#565, fontforge#382, fontforge#241, fontforge#249, fontforge#233 makeutype.c already uses the Unicode list, so it can do this here too.
if ( index>0x11ffff ) | ||
return; | ||
if ( index<0 ) return( -1 ); | ||
if ( index>0x11ffff ) return( 0 ); | ||
++pt; /* move past semicolon */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does pt originate? Are we sure that it is of non-zero length?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hehehe ;-) your turn to go off-topic ;-P
see: #2735
"Could we just fix the header and not mess with the whitespace and minor verbiage? I don't think it's worth the changelog/blame clutter."
I've already had to rebase patches a few times for throwing in extras, so, I'll just leave that pt alone for now. ;-P
A lot of code assumes things work fine and doesn't deal with "what if?" scenarios, so...as you note here...it's too easy to go fix one-more thing and throw it into a patch while you're there now.
This entire pull request is sort-of a subset of adding authors and copyrights to satisfy #2735, but this patch focuses on creating is_Ligatures.c as a substitute for the python code that had to be removed in the past. Would prefer to throw-in python script commands to keep this all together as a mini-howto of what needs attention, but this single patch is pretty big already without it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not whitespace, and you're already adjusting this file. Surely adding if (pt[0] == '\0') return -1
would not ruin your day? I'm happy to do it if necessary, but it's much more efficient if you do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the structure of unialt.c shows that fontforge is unable to cope with codepoint==0, or has difficulty and needing workaround exceptions. Older bitmaps have references to char 0, and as you point-out, it won't kill us to make it 1 instead of zero, but why limit ourselves? These functions here will work fine if the value is zero. Youll note that there is another routine in ustring that also handles zero too. unialt could probably use a rewrite and I believe we can add zero capability into it too. Where 0 is a problem is with higher level fonts, so it makes sense to test for zero upthere...but down here should be okay to look for zero without having to limit ourselves from the get-go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not what I meant. What I meant was that there is a potential segfault here if argument pt is a zero-length string because it gets incremented past the first value before it gets used. All we need is if (pt[0] == '\0') return -1
, and I'm hoping that you can just put it into this patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if ( s<m ) | ||
fprintf( data, " else\n\treturn( (int32)(%s32[n-%d]) );\n", t, s ); | ||
fprintf( data, "}\n\n" ); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So m
is table size. What are t
, n
, and s
in this context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'm' is the entire maximum length of the table *dt which will got from reading unicode list and then be named 't' when building the tables and functions in is_Ligatures.c. Looking at a byte count, it made less waste to have an array of 500+ ligatures, 20+ vulgars, and about 50 odd fraction definitions (approx 600 x 4bytes = 2400bytes) instead of assigning an individual 1bit for each in the lookup table (approx 1bit x MAXC = 1/8byte x 65536 = 8k). doing this for ligs, vulgs and frac would make it 8k x 3 = 24k used in flag space. Since I was already building a program, it seemed worth going the extra step and compressing everything in {0..65535} into uint16 (so we split the table 0...s), and anything left afterwards goes into {65536....->) which is uint32 (and is the remainder of the table from s+1...m. so instead of using 2400bytes, now is closer to 1200bytes to define about 600 ligs/vuls/fracs.
'n' is used to name the various functions using the tables.
These routines are reusable 3x to build ligs, vuls, fracs., so testing against one verifies the other two will build similarly the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great. Can we get this (or a shorter version of it) in a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably ought to rename parameter n to something like nam so as to avoid confusion with the other n.
Are these comments right?
// m is the maximum size of the table.
// s is the size of the lower partition of the table.
// t is the first part of the data type name.
// nam is the function name base.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added notes
And thanks for making tests! |
I was thinking of adding more stuff, but the only thing I got working ATM is vulgar alt expansions. The unicode list appears inconsistent for getting ligatures and other fractions ATM (needs more thought, and ver9.0 is a larger table, maybe improved too), plus you announced a planned 2week window on the next update, so no point on making it more complex at the moment. |
acf8ee8
to
ca6acd9
Compare
|
||
import sys, fontforge | ||
|
||
print "Get Table Array Totals." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty sure all tests should conform to Python 3 standards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm why was the test skipped on Travis? Anyway adding brackets to the print calls should be enough for python 3 compatibility. test1009.py
also needs to be added to Makefile.am
EXTRA_DIST
in the test folder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok yep. The reason why it wasn't failing on Travis when it should have (which compiles FontForge with Python3 scripting) is because test1009.py
wasn't in Makefile.am
.
Good catch with the python3 @jtanx |
I had a brief look at 926, but it looks like it's harder to get that working on Python 3 due to bytes not being implicitly converted to strings. I'd definitely leave that one for another PR. |
21940f9
to
6e6fe3b
Compare
had to pull-out test1009.py to avoid merging conflict in tests/Makefile.am due to #2811 - will add it later after rebasing (later). |
d0a8852
to
1d5f48b
Compare
Added copyright Authors to satisfy Debian Lint copyright issue fontforge#2643. Added several unicode chart lookup functions, native scripting access, python scripting, test129.pe and test1009.py.
I'm done here. @jtanx? |
@JoesCat are you going to rebase to add back test1009 to the Makefile? Otherwise lgtm |
Gets sort of messy trying to do a PR and rebase at the same time, so test1009.py would sit somewhere after the rebase. I'll add it afterwards.with another isLigature.c PR. |
Weird - seems like we both hit merge at the same time - hehehe finally done - time to move-on. |
makeutype is_Ligatures.c
Follow-up to fontforge#2762
Allow makeutype to build an up-to-date is_Ligature.c file, and add several native script functions to allow read access to internal ligature and fraction tables.