Get supported characters #52

Closed
RoelN opened this issue Nov 25, 2019 · 5 comments

@RoelN
Collaborator

RoelN commented Nov 25, 2019

What'd be the intended way to get a list of supported chars from the cmap table, other than looping over every unicode value and passing it to supports()?

I figured I could use startCode and endCode to get unicode ranges, but there appears to be some noise in those arrays, e.g.:

[32, 38, 44, 48, 58, 63, 65, 91, 95, 97, 105, 160, 171, 187, 191, 8211, 8216, 8220, 8230, 8249, 65535, 0, 0, 0, 45, 9, 65534, 65474, 0, 65533, 65468, 0, 0, 65441, 65426, 65407, 57415, 57392, 57386, 57375, 57365, 1]

(Last supported character is 8249, after which it jumps to 65535 and some zeroes etc.)

@Pomax
Owner

Pomax commented Nov 25, 2019

Depends entirely on the subtables covered by cmap, really. We could add a "getSupportedCharCodes()", but remember that only subtables with a Unicode platform/encoding/language tuple use character codes based on Unicode code point values.

For subformat 4, it'd indeed involve running through the segments - if you're seeing consecutive segments with unordered endCode values, we'll have to look at that more closely: that'd be a parsing bug.
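As a rough illustration of "running through the segments", here is a minimal sketch. It assumes each parsed segment is a plain object with `startCode` and `endCode` properties (as the `segments.map(v => v.endCode)` call later in this thread suggests); the helper names `endCodesAreSorted` and `expandSegments` are made up for this example and are not part of the library. Note that it only expands the ranges and does not check whether each code actually maps to a non-zero glyph.

```javascript
// Hypothetical helpers, not library API. Assumed segment shape: { startCode, endCode }.

// Format 4 segments should be sorted by strictly increasing endCode;
// out-of-order values would hint at a parsing problem.
function endCodesAreSorted(segments) {
  return segments.every(
    (seg, i) => i === 0 || segments[i - 1].endCode < seg.endCode
  );
}

// Expand segment ranges into a flat list of character codes.
function expandSegments(segments) {
  const codes = [];
  for (const { startCode, endCode } of segments) {
    // Format 4 ends with a 0xFFFF..0xFFFF terminator segment; skip it.
    if (startCode === 0xffff && endCode === 0xffff) continue;
    for (let code = startCode; code <= endCode; code++) codes.push(code);
  }
  return codes;
}

// Made-up sample data, just to show the shape.
const segments = [
  { startCode: 32, endCode: 34 },
  { startCode: 65, endCode: 66 },
  { startCode: 0xffff, endCode: 0xffff },
];

console.log(endCodesAreSorted(segments)); // true
console.log(expandSegments(segments)); // [ 32, 33, 34, 65, 66 ]
```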

@Pomax
Owner

Pomax commented Nov 25, 2019

Looking at the subformat 4 segments and subformat 12 groups for the Adobe Source Pro fonts used for testing, I don't see out-of-sequence data:

// subtable format 4
otf.opentype.tables.cmap.get(0).segments.map(v => v.endCode).join(', ')
"47, 64, 96, 126, 191, 209, 223, 241, 384, 394, 399, 403, 417, 432, 450, 476, 483, 487,
491, 501, 511, 539, 567, 579, 593, 600, 604, 616, 618, 622, 630, 635, 638, 644, 658,
661, 665, 671, 674, 676, 679, 688, 691, 697, 700, 703, 705, 716, 721, 734, 740, 780,
787, 800, 810, 812, 817, 820, 829, 834, 837, 863, 865, 885, 890, 894, 906, 908, 912,
929, 939, 944, 962, 974, 977, 981, 983, 985, 987, 989, 993, 1039, 1071, 1119, 1123,
1141, 1171, 1179, 1187, 1195, 1203, 1207, 1211, 1218, 1233, 1241, 1251, 1257,
1263, 1267, 7491, 7497, 7501, 7504, 7506, 7512, 7515, 7580, 7584, 7611, 7687,
7697, 7703, 7713, 7723, 7739, 7753, 7763, 7779, 7791, 7813, 7831, 7838, 7929,
8129, 8143, 8159, 8175, 8190, 8199, 8208, 8222, 8226, 8230, 8240, 8243, 8245,
8250, 8255, 8260, 8265, 8305, 8313, 8319, 8329, 8334, 8340, 8353, 8356, 8359,
8361, 8364, 8366, 8370, 8373, 8378, 8381, 8453, 8467, 8471, 8480, 8482, 8486,
8494, 8530, 8538, 8542, 8585, 8597, 8601, 8616, 8659, 8704, 8707, 8710, 8719,
8722, 8725, 8730, 8735, 8745, 8747, 8759, 8776, 8801, 8805, 8962, 8976, 8993,
9633, 9644, 9658, 9668, 9670, 9676, 9679, 9689, 9702, 9745, 9749, 9788, 9792,
9794, 9824, 9827, 9830, 9835, 10003, 10066, 10084, 10215, 11800, 11813, 57506,
57523, 64258, 65039, 65279, 65535"
// subtable format 12
otf.opentype.tables.cmap.get(1).groups.map(v => v.endCharCode).join(', ')
"32, 33, 34, 35, 36, 37, 38, 39, 41, 42, 43, 44, 45, 46, 47, 57, 59, 60, 61,
62, 63, 64, 90, 91, 92, 93, 94, 95, 96,
122, 123, 124, 125, 126, 160, 161, 162, 163, 164, 165, 166, 167,
168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 179, 180, 181, 182, 183, 184, 185,
186, 187, 190, 191, 196, 197, 198, 199, 202, 203, 206, 207, 208, 209, 214, 215, 216,
219, 220, 221, 222, 223, 228, 229, 230, 231, 234, 235, 238, 239, 240, 241, 246, 247,
248, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266,
267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283,
284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300,
301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317,
318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334,
335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351,
352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368,
369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 394,
399, 402, 403, 416, 417, 431, 432, 450, 461, 462, 463, 464, 465, 466, 467, 468, 469,
470, 471, 472, 473, 474, 475, 476, 482, 483, 486, 487, 490, 491, 500, 501, 504, 505,
506, 507, 508, 509, 510, 511, 536, 537, 538, 539, 567, 579, 592, 593, 600, 604, 615,
616, 618, 622, 630, 635, 638, 644, 658, 661, 664, 665, 668, 670, 671, 674, 676, 679,
688, 690, 691, 695, 696, 697, 700, 703, 705, 716, 721, 728, 729, 730, 731, 732, 733,
734, 736, 737, 738, 739, 740, 768, 769, 770, 771, 772, 773, 774, 775, 776, 777, 778,
779, 780, 783, 784, 785, 786, 787, 800, 806, 807, 808, 810, 812, 817, 820, 829, 834,
837, 863, 865, 885, 890, 894, 900, 901, 902, 903, 906, 908, 910, 911, 912, 929, 937,
938, 939, 943, 944, 961, 962, 969, 970, 971, 973, 974, 977, 981, 983, 985, 987, 989,
993, 1039, 1071, 1119, 1122, 1123, 1138, 1139, 1140, 1141, 1168, 1169, 1170, 1171,
1174, 1175, 1176, 1177, 1178, 1179, 1184, 1185, 1186, 1187, 1194, 1195, 1198, 1199,
1200, 1201, 1202, 1203, 1206, 1207, 1210, 1211, 1217, 1218, 1231, 1232, 1233, 1236,
1237, 1238, 1239, 1240, 1241, 1250, 1251, 1254, 1255, 1256, 1257, 1262, 1263, 1266,
1267, 7491, 7495, 7497, 7501, 7503, 7504, 7506, 7510, 7512, 7515, 7580, 7584, 7611,
7686, 7687, 7692, 7693, 7694, 7695, 7696, 7697, 7702, 7703, 7712, 7713, 7716, 7717,
7718, 7719, 7720, 7721, 7722, 7723, 7730, 7731, 7732, 7733, 7734, 7735, 7736, 7737,
7738, 7739, 7742, 7743, 7744, 7745, 7746, 7747, 7748, 7749, 7750, 7751, 7752, 7753,
7762, 7763, 7768, 7769, 7770, 7771, 7772, 7773, 7774, 7775, 7776, 7777, 7778, 7779,
7788, 7789, 7790, 7791, 7806, 7807, 7808, 7809, 7810, 7811, 7812, 7813, 7822, 7823,
7824, 7825, 7826, 7827, 7828, 7829, 7830, 7831, 7838, 7840, 7841, 7842, 7843, 7844,
7845, 7846, 7847, 7848, 7849, 7850, 7851, 7852, 7853, 7854, 7855, 7856, 7857, 7858,
7859, 7860, 7861, 7862, 7863, 7864, 7865, 7866, 7867, 7868, 7869, 7870, 7871, 7872,
7873, 7874, 7875, 7876, 7877, 7878, 7879, 7880, 7881, 7882, 7883, 7884, 7885, 7886,
7887, 7888, 7889, 7890, 7891, 7892, 7893, 7894, 7895, 7896, 7897, 7898, 7899, 7900,
7901, 7902, 7903, 7904, 7905, 7906, 7907, 7908, 7909, 7910, 7911, 7912, 7913, 7914,
7915, 7916, 7917, 7918, 7919, 7920, 7921, 7922, 7923, 7924, 7925, 7926, 7927, 7928,
7929, 8125, 8126, 8127, 8128, 8129, 8141, 8142, 8143, 8157, 8158, 8159, 8174, 8175,
8189, 8190, 8199, 8208, 8210, 8212, 8213, 8214, 8215, 8217, 8218, 8219, 8221, 8222,
8225, 8226, 8230, 8239, 8240, 8243, 8245, 8250, 8252, 8253, 8255, 8260, 8263, 8264,
8265, 8304, 8305, 8313, 8318, 8319, 8329, 8334, 8340, 8353, 8355, 8356, 8359, 8361,
8363, 8364, 8366, 8370, 8373, 8376, 8378, 8381, 8453, 8467, 8470, 8471, 8480, 8482,
8486, 8494, 8528, 8530, 8538, 8542, 8585, 8595, 8597, 8601, 8616, 8659, 8704, 8706,
8707, 8710, 8719, 8721, 8722, 8725, 8729, 8730, 8734, 8735, 8745, 8747, 8759, 8776,
8800, 8801, 8805, 8962, 8976, 8991, 8993, 9631, 9633, 9643, 9644, 9651, 9653, 9655,
9657, 9658, 9661, 9663, 9665, 9667, 9668, 9670, 9673, 9674, 9675, 9676, 9679, 9688,
9689, 9702, 9745, 9749, 9787, 9788, 9792, 9794, 9824, 9827, 9829, 9830, 9835, 10003,
10066, 10084, 10215, 11800, 11813, 57506, 57523, 64258, 65039, 65279, 127926,
128169, 128274, 129302"

@Pomax
Owner

Pomax commented Nov 25, 2019

I've rejiggered the cmap a little so that you can get a (segmented) list of supported charcodes for all tables by using:

myFont.opentype.tables.cmap.encodingRecords.map(r => r.table.getSupportedCharCodes());

(But remember that non-unicode subtables are still very much "who knows, might even include charcodes for things that aren't in unicode" =D)

As you can see, records now have a direct .table property that gets you the cmap subtable, and I've also renamed the cmap-level get(tableID) to getSubTable(tableID) just so that it's clearer what it's fetching for you.
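For anyone who wants actual characters rather than ranges, a sketch along these lines might help. It assumes each subtable's result is an array of ranges shaped like `{ start, end }`; the thread only says the list is "segmented", so check the real return shape before relying on this. `rangesToCharacters` is a hypothetical helper, the sample data is made up, and the conversion only makes sense for Unicode subtables.

```javascript
// Hypothetical helper, not library API. Assumed range shape: { start, end }.
function rangesToCharacters(ranges) {
  const chars = [];
  for (const { start, end } of ranges) {
    for (let code = start; code <= end; code++) {
      // Skip values that are not valid Unicode scalar values
      // (surrogates, or anything above U+10FFFF).
      if (code > 0x10ffff || (code >= 0xd800 && code <= 0xdfff)) continue;
      chars.push(String.fromCodePoint(code));
    }
  }
  return chars;
}

// Made-up stand-in for the per-subtable results, one array per encoding record.
const perSubtable = [
  [{ start: 65, end: 67 }],
  [{ start: 97, end: 98 }, { start: 0x1f512, end: 0x1f512 }],
];

// Merge all subtables and drop duplicates.
const unique = [...new Set(perSubtable.flatMap((r) => rangesToCharacters(r)))];
console.log(unique.join("")); // ABCab🔒
```

With the real font object, `perSubtable` would be replaced by the result of the `encodingRecords.map(...)` call above, provided the assumed shape holds.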

@Pomax Pomax closed this as completed Nov 25, 2019
@RoelN
Collaborator Author

RoelN commented Nov 26, 2019

I'm getting this for Fraunces, a random font I was testing with:

console.log(e.detail.font.opentype.tables.cmap.get(1).startCode);
(42) [32, 38, 44, 48, 58, 63, 65, 91, 95, 97, 105, 160, 171, 187, 191, 8211, 8216, 8220, 8230, 8249, 65535, 0, 0, 0, 45, 9, 65534, 65474, 0, 65533, 65468, 0, 0, 65441, 65426, 65407, 57415, 57392, 57386, 57375, 57365, 1]

EDIT: This was still on the previous version. After a pull, the rejiggered version works like a charm! Thanks!

@Pomax
Owner

Pomax commented Nov 26, 2019

Oh, weird. Good to know the rejigger jiggered things into the correct sequence =D
