This standalone utility, written in C++, converts TrueType fonts to Type 0 composite fonts that can be accessed from a PostScript program. Without this conversion, a PostScript program cannot access a TrueType font.
Adobe supports three approaches for converting ttf fonts, all using Type 42 as the base font.
- If the ttf file's character set contains up to 256 glyphs, the conversion simply wraps the TrueType font in a PostScript Type 42 font. The PostScript show operator extracts a single byte every time.
- If the ttf file's character set contains more than 256 glyphs, a Type 0 composite font is required, with Type 42 as the base font. The TrueType font is split into a hierarchy of multiple Type 42 fonts, each giving access to up to 256 glyphs. That is, the first Type 42 font accesses the first 256 glyphs (0 to 255), the second accesses the next 256 glyphs (256 to 511), and so on. The last one accesses the remaining glyphs in the character set, which never number more than 256. This is called 8/8 mapping, whose FMapType is 2. The PostScript show operator extracts two bytes every time: the first byte is the font number and the second is the character code.
- If the TrueType font's character set contains more than 256 glyphs, the font can also be converted into a CIDFont with TrueType outlines, which likewise falls under Type 0 composite font conversion. This is called CMap mapping, whose FMapType is 9. The CMap resource (which maps character codes to CIDs) can be constructed so that the mapping in CIDMap (CIDs to glyph indices) reduces to an identity mapping, i.e. glyph index = CID. A CIDMap can be a string, an array of strings, or a dictionary. In this software, CIDMap is implemented as a string, and Adobe calls these fonts Type 2 CIDFonts with Type 42 as the base font. In this composite font (Type 0), the hierarchy has only one level, the CIDFont itself, which can access 65535 glyphs. Bear in mind that the PostScript show operator extracts two bytes every time. This 16-bit value is the character code that maps to a glyph index through CIDMap.
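The difference between the second and third approaches in how the two bytes consumed by show are interpreted can be sketched in C++ as follows. This is only an illustration, not code taken from the utility: the names EightEightCode, splitGlyphIndex and cidFromBytes are hypothetical, and the sketch assumes that glyphs are distributed over the Type 42 fonts in glyph-index order and that the CMap is an identity mapping.

```cpp
#include <cstdint>

// Hypothetical helper type: the two bytes that show consumes under 8/8 mapping.
struct EightEightCode {
    std::uint8_t fontNumber;     // selects one of the Type 42 fonts in the hierarchy
    std::uint8_t characterCode;  // selects one of up to 256 glyphs in that font
};

// Second approach (8/8 mapping, FMapType 2).
// Assumption: glyphs are assigned to fonts in index order, 256 per font.
inline EightEightCode splitGlyphIndex(std::uint16_t glyphIndex) {
    EightEightCode code;
    code.fontNumber = static_cast<std::uint8_t>(glyphIndex >> 8);
    code.characterCode = static_cast<std::uint8_t>(glyphIndex & 0xFF);
    return code;
}

// Third approach (CMap mapping, FMapType 9).
// Assumption: identity CMap and identity CIDMap, so the 16-bit value formed
// from the two bytes (high byte first) is the CID and also the glyph index.
inline std::uint16_t cidFromBytes(std::uint8_t high, std::uint8_t low) {
    return static_cast<std::uint16_t>((high << 8) | low);
}
```

Under these assumptions, glyph index 300 becomes font number 1 and character code 44 in the second approach, while in the third approach the bytes 0x01 0x2C are read directly as CID 300.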
Since the number of glyphs exceeds 256 in ttf files for Indian languages, the conversion utility has to be based on either the second or the third approach.
- The third approach is more flexible than the second, and support for Unicode (UTF-8) is readily implementable with it.
- In the third approach the hierarchy of composite fonts never exceeds one level, whereas in the second approach the depth is at least 2 and the number of glyphs in the character set decides the actual depth: Depth of Hierarchy = numGlyphs/256 + ((numGlyphs % 256) > 0), i.e. add 1 to the quotient if the remainder is non-zero.
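The depth formula for the second approach is an integer division rounded up. A minimal C++ sketch, with hierarchyDepth as a hypothetical function name that is not taken from the utility's source:

```cpp
#include <cstdint>

// Depth of the hierarchy under the second approach (8/8 mapping):
// numGlyphs / 256, plus 1 if the remainder is non-zero.
inline std::uint32_t hierarchyDepth(std::uint32_t numGlyphs) {
    return numGlyphs / 256 + ((numGlyphs % 256) > 0 ? 1u : 0u);
}
```

For a character set of 534 glyphs this gives 534/256 = 2 with remainder 22, so the depth is 3. A character set of exactly 512 glyphs gives 2.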
This conversion utility has been developed based on the third approach.
- This utility is a console application developed with Microsoft Visual Studio Community 2022 (64-bit) Edition, Version 17.4.2, under Windows 10.
- This utility is available on Linux too.
- The program is fully portable across Windows and Linux: the source files (main.cpp and ttf.h) are identical on both platforms.
- Ghostscript version is 10.0.0 (64-bit) and GSView version is 5.0 (64-bit).
This program has been developed based on the following documents.
- Microsoft OpenType® Specification Version 1.9
- Fonts - TrueType Reference Manual - Apple Developer
- PostScript Language Reference, third edition - Adobe Corporation (912 pages PDF, 7410K) Feb/1999
- The Type 42 Font Format Specification #5012 (28 pages PDF, 159k) 31/Jul/1998
- Adobe CMap and CID Font Files Specification #5014 (102 pages PDF, 541k) 11/Jun/1996 Version 1.0
Create a folder D:\cidfonts on the D drive (or any other drive) and store in it the ttf file that needs conversion. Then issue the following command:
ttf2postscriptcid.exe -d "D:\cidfonts\filename.ttf"
If option -d is specified, the ttf table data is displayed during execution.
This utility generates the following two files as output:
- filename.t42 is the converted Type 42 font file.
- filename.ps is a PostScript program whose execution displays the glyphs present in the character set along with their CIDs and Unicode points. Since only around 12% of the glyphs of Indian languages are allotted code space (Unicode points), glyphs with no associated Unicode point have none printed instead.

Invoke Ghostscript to execute the PostScript program and display the glyphs as follows:
gswin64c.exe "D:\cidfonts\filename.t42" "D:\cidfonts\filename.ps"
Google has developed a family of Tamil fonts called Noto Sans Tamil. To convert the NotoSansTamil-Regular.ttf font to a Type 42 CID-keyed font, issue the following command:
ttf2postscriptcid.exe -d "D:\cidfonts\NotoSansTamil-Regular.ttf"
The command generates two files, NotoSansTamil-Regular.t42 and NotoSansTamil-Regular.ps. To view the glyphs of the entire character set along with their CIDs and Unicode points, issue the following command:
gswin64c.exe "D:\cidfonts\NotoSansTamil-Regular.t42" "D:\cidfonts\NotoSansTamil-Regular.ps"
There are 534 Tamil and Latin glyphs in the character set, which Ghostscript displays in 5 pages.
To test the NotoSansTamil-Regular.t42 CID-keyed font file, use Notepad to create a file tamil.ps in the folder D:\cidfonts\ with the following PostScript code:
%!PS-Adobe-3.0
/myNoTo {/NotoSansTamil-Regular findfont exch scalefont setfont} bind def
13 myNoTo
100 600 moveto
% தமிழ் தங்களை வரவேற்கிறது!
<0019001d002a005e00030019004e00120030002200030024001f002f0024005b0012002a0020007a00aa> show
100 550 moveto
% Tamil Welcomes You!
<0155017201aa019801a500030163018801a5017f01b101aa018801c20003016901b101cb00aa00b5> show
showpage
Issue the following Ghostscript command to execute the tamil.ps PostScript program:
gswin64c.exe "D:\cidfonts\NotoSansTamil-Regular.t42" "D:\cidfonts\tamil.ps"
This displays the two strings தமிழ் தங்களை வரவேற்கிறது! and Tamil Welcomes You! on successive lines.
Note that the strings for the show operator are in hexadecimal format, enclosed in angle brackets. The show operator extracts 2 bytes at a time and maps this CID (a 16-bit value) to a glyph. For example, the first 4 hex digits of the first string are 0019, whose decimal equivalent is 25. This maps to the glyph த.
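To make the 2-bytes-at-a-time reading concrete, the following C++ sketch splits the digits of such a hex string into 16-bit CIDs. It is an illustration only and not part of the utility: hexValue and hexToCids are hypothetical names, and the input is assumed to be the hex digits without the enclosing angle brackets or any whitespace.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Converts one hexadecimal digit to its value; throws on anything else.
inline unsigned hexValue(char c) {
    if (c >= '0' && c <= '9') return static_cast<unsigned>(c - '0');
    if (c >= 'a' && c <= 'f') return static_cast<unsigned>(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return static_cast<unsigned>(c - 'A' + 10);
    throw std::invalid_argument("not a hexadecimal digit");
}

// Splits the digits of a PostScript hex string (without the < > delimiters)
// into 16-bit CIDs: four hex digits per CID, high byte first.
inline std::vector<std::uint16_t> hexToCids(const std::string& hex) {
    if (hex.size() % 4 != 0)
        throw std::invalid_argument("expected a multiple of four hex digits");
    std::vector<std::uint16_t> cids;
    for (std::size_t i = 0; i < hex.size(); i += 4) {
        unsigned value = 0;
        for (std::size_t j = 0; j < 4; ++j)
            value = (value << 4) | hexValue(hex[i + j]);
        cids.push_back(static_cast<std::uint16_t>(value));
    }
    return cids;
}
```

Calling hexToCids("0019001d002a"), the first twelve digits of the Tamil string in tamil.ps, yields the CIDs 25, 29 and 42.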
This utility has been tested for the following Indian languages successfully.
- Malayalam
- Telugu
- Kannada
- Gujarati
- Gurmukhi (Punjabi)
- Oriya
- Bengali (Assamese)
- Devanagari (Sanskrit, Hindi and Marathi)
PostScript and TrueType fonts differ in two main respects.
- The first difference is that PostScript supports cubic Bézier curves, where each arc of each glyph is described by four control points. TrueType uses quadratic splines instead of cubic ones, with each arc having only three control points. This offers less control over the shape of the curve.
- The second difference is the way they perform hinting. Since TrueType was originally targeted at low-resolution screen rendering, its hinting system works by adjusting the curves to fit nicely on pixel lattice points, using a fairly elaborate bytecode mechanism. To edit the hints of a TrueType font, you must learn ttf's low-level programming language, which is a daunting task for the typical font designer. PostScript fonts, on the other hand, were intended for higher-resolution paper prints and use guidelines to snap curves to right angles at appropriate places. For an Adobe font designer, these hints are fairly easy to understand and declare.
For a TrueType font to be recognized by a PostScript interpreter, it must be enclosed in a PostScript font dictionary as a CID font with CIDFontType 2 and FontType 42. CIDMap maps each CID (Character Identifier) to a glyph index and enforces identity mapping as follows:
Character code 0 maps to Glyph index 0
Character code 1 maps to Glyph index 1
Character code 2 maps to Glyph index 2
......
......
Character code NumGlyphs-1 maps to Glyph index NumGlyphs-1
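An identity CIDMap stored as a string can be sketched in C++ as follows. This is not the utility's actual code: identityCidMapHex is a hypothetical name, and the sketch assumes two bytes per glyph index (GDBytes = 2) with the high byte first, written out as hex digits suitable for a PostScript hex string.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Builds the hex digits of an identity CIDMap string.
// Assumption: GDBytes = 2, so entry n holds glyph index n as two bytes,
// high byte first, i.e. four hex digits per entry.
inline std::string identityCidMapHex(std::uint32_t numGlyphs) {
    static const char digits[] = "0123456789ABCDEF";
    std::string hex;
    hex.reserve(static_cast<std::size_t>(numGlyphs) * 4);
    for (std::uint32_t cid = 0; cid < numGlyphs; ++cid) {
        hex += digits[(cid >> 12) & 0xF];
        hex += digits[(cid >> 8) & 0xF];
        hex += digits[(cid >> 4) & 0xF];
        hex += digits[cid & 0xF];
    }
    return hex;
}
```

For three glyphs, identityCidMapHex(3) returns 000000010002, meaning CID 0 maps to glyph 0, CID 1 to glyph 1 and CID 2 to glyph 2.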
These t42 fonts, converted from ttf, are of little use unless they support Unicode (UTF-8). This has been demonstrated by the tamil.ps PostScript program, in which hexadecimal strings in angle brackets are supplied as operands to PostScript's show operator. Of course, such a hex string must represent some UTF-8 encoded string supplied by application software (written in C, C++, PostScript or any other language) that uses the t42 font.
For an answer to this important question, read the post "How to implement Unicode (UTF-8) support for a CID-keyed font (Adobe's Type 0 Composite font) converted from ttf?"