Add Encoding to Font Dictionary #28
Hello @jerp. Thank you for submitting this PR! I've done a little bit of research on this and it looks like you're on the right track. However, I think a few more things will need to be done before this is complete.
For example, this is how the text
And this is how it would look as a Hex String:
Notice that each pair of numbers in the hex string is the hexadecimal representation of the corresponding character in the Literal String:
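As a rough illustration of the literal-to-hex relationship described above, here is a minimal TypeScript sketch. `toPdfHexString` is a hypothetical helper written only for this example, not a function from the library, and the input `'Hello'` is an arbitrary sample. It assumes every character code fits in a single byte.

```typescript
// Hypothetical helper for illustration; not part of the library's API.
// Assumes every character code is in the range 0-255.
const toPdfHexString = (text: string): string => {
  const hex = text
    .split('')
    .map((char) => char.charCodeAt(0).toString(16).toUpperCase())
    // Pad single-digit values so each character yields exactly two hex digits.
    .map((digits) => (digits.length < 2 ? '0' + digits : digits))
    .join('');
  // PDF hex strings are delimited by angle brackets.
  return '<' + hex + '>';
};

console.log(toPdfHexString('Hello')); // <48656C6C6F>
```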
However, JavaScript's Please let me know if you're interested in working on these additional changes. I'm happy to help point you in the right direction and find your way around the Here are some other useful bits of
Here are some of the relevant sections of the PDF specification:
|
I committed a version that uses PDFKit's mapping. I did not check this mapping, but having used that library for many years, I trust it more than I trust myself. The way I implemented it requires passing a PDFHexString instance to drawText instead of a string. This hex string is encoded properly. See the example, where I modified the following lines:
Although I'm an experienced JavaScript developer, I know little about TypeScript, so your input is more than welcome. The next thing I'm looking into is font subsetting. This is an important requirement for me (file size). |
Wow, this is great Jerome! You churned this out very quickly. I'm impressed. I'll do a full review later this evening. It looks good overall. There are a few stylistic and linting-related things that will need to be tweaked. I'd also like for the P.S. You may have already found this, but the easiest way to try out code changes is by running the integration tests with |
I didn't know About the above, implementing it within |
What do you think about passing the font object to the For example, this is how it works right now:
const [helveticaFontRef, helveticaFont] = pdfDoc.embedStandardFont(HELVETIVA_FONT);
// ...
drawText(helveticaFont.encode('Olé! - Œ')[0], {
  x: PAGE_1_WIDTH * 0.5 - 30,
  y: PAGE_1_HEIGHT - 48 - 30,
  font: HELVETIVA_FONT,
  size: 12,
}),
But it could potentially work like this:
const [helveticaFontRef, helveticaFont] = pdfDoc.embedStandardFont(HELVETIVA_FONT);
// ...
drawText('Olé! - Œ', {
  x: PAGE_1_WIDTH * 0.5 - 30,
  y: PAGE_1_HEIGHT - 48 - 30,
  font: HELVETIVA_FONT,
  fontObject: helveticaFont,
  size: 12,
}),
My primary concern here is not technical. Clearly, it works either way. I'm just thinking about the API design. It's a balancing act between creating APIs that are too low-level and confusing for people to quickly grok and use, vs creating a super slick high-level API that requires more effort to maintain. In this case, neither of these APIs is super elegant and easy to use. However, I think passing the font object will be easier to document, and users will be less likely to make mistakes this way. It also paves the way for supporting automatic text wrapping by the I'd like to hear your thoughts on this. And of course, I'm open to alternative solutions as well. |
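To make the second API shape more concrete, here is a minimal, self-contained sketch of how a text-drawing helper might encode the string itself when it is handed a font object. Every name in it (`FontEncoderSketch`, `DrawTextOptionsSketch`, `drawTextSketch`) is hypothetical and invented for this illustration; it is not the library's actual `drawText`. It only assumes the standard PDF text operators (`BT`, `Tf`, `Td`, `Tj`, `ET`).

```typescript
// All names below are hypothetical and exist only for this sketch.
interface FontEncoderSketch {
  // Returns the hex digits for the text, without the <> delimiters.
  encodeText(text: string): string;
}

interface DrawTextOptionsSketch {
  x: number;
  y: number;
  size: number;
  font: string;
  fontObject?: FontEncoderSketch;
}

const drawTextSketch = (text: string, options: DrawTextOptionsSketch): string[] => {
  // With a font object, the helper encodes the text as a hex string itself.
  // Without one, it falls back to a literal string with (, ) and \ escaped.
  const shown = options.fontObject
    ? '<' + options.fontObject.encodeText(text) + '>'
    : '(' + text.replace(/([\\()])/g, '\\$1') + ')';
  return [
    'BT',
    '/' + options.font + ' ' + options.size + ' Tf',
    options.x + ' ' + options.y + ' Td',
    shown + ' Tj',
    'ET',
  ];
};
```

With this shape the caller passes a plain string and never touches the encoded form, which is the property argued for above.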
Can you please run P.S. You can prevent the linter from reformatting a block of code by adding a |
Can this comment be moved into the PDFStandardFontFactory along with the font dictionary?
import PDFFontEncoder from 'core/pdf-structures/factories/PDFFontEncoder'

const toWinAnsi= (charCode: number): number => {
Can this be converted from a switch statement to a mapping object? The switch statement feels a little heavyweight for this purpose:
const toWinAnsi = (charCode: number): number => ({
402: 131, // ƒ
8211: 150, // –
...,
381: 142, // Ž
382: 158, // ž
}[charCode] || charCode);
I was a bit puzzled by the index signature message, but I've now done it like this:
const UnicodeToWinAnsiMap: { [index: number]: number } = {
402: 131, // ƒ
...,
}
const toWinAnsi = (charCode: number): number => UnicodeToWinAnsiMap[charCode]
I think you'll also need to include the default value (|| charCode) if the map doesn't contain a mapping?
const toWinAnsi = (charCode: number): number => UnicodeToWinAnsiMap[charCode] || charCode
Also, while your type definition for the UnicodeToWinAnsiMap is correct, there's no need to include it for a simple mapping object. TypeScript should infer the type for you:
const UnicodeToWinAnsiMap = {
402: 131, // ƒ
...,
}
Type inference is one of TypeScript's strong points and it helps avoid a lot of type definition boilerplate. It puts languages like Java or C++ to shame 😄
I tried several options, and I get an error message if I omit the type definition of UnicodeToWinAnsiMap; see a simplified case on TS Playground.
When the noImplicitAny flag is on, this triggers an error message. I saw that this option is on in tsconfig.json.
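For reference, the two points from this exchange (the explicit index signature and the fallback to the original char code) can be combined as in the sketch below. The map holds only the four entries quoted in this thread, so it is deliberately incomplete, and the lowercase name `unicodeToWinAnsiSketch` is a placeholder rather than the name used in the PR.

```typescript
// Partial map for illustration: only entries quoted in this thread.
// The explicit index signature lets the map be indexed by an arbitrary
// number without an implicit-any error under noImplicitAny.
const unicodeToWinAnsiSketch: { [codePoint: number]: number } = {
  402: 131, // ƒ
  8211: 150, // –
  381: 142, // Ž
  382: 158, // ž
};

// Fall back to the input when no mapping exists. The `in` check is used
// instead of `||` so that a mapped value of 0 would not be skipped.
const toWinAnsiSketch = (charCode: number): number =>
  charCode in unicodeToWinAnsiSketch ? unicodeToWinAnsiSketch[charCode] : charCode;
```

For the entries shown, `||` and the `in` check behave identically; the difference would only matter if some code point mapped to 0.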
Agreed on the default charCode; the unit tests caught it!
About inference: I got error messages, but I’ll give it another go.
… On 31 Aug 2018, at 23:58, Andrew Dillon ***@***.***> wrote:
@Hopding commented on this pull request.
In src/core/pdf-structures/factories/PDFStandardFontFactory.ts:
> +class PDFStandardFontFactory implements PDFFontEncoder {
+ static for = (fontName: IStandard14FontsUnion): PDFStandardFontFactory =>
+ new PDFStandardFontFactory(fontName);
+
+ fontName: IStandard14FontsUnion;
+
+ constructor(fontName: IStandard14FontsUnion) {
+ validate(
+ fontName,
+ oneOf(...Standard14Fonts),
+ 'PDFDocument.embedStandardFont: "fontName" must be one of the Standard 14 Fonts: ' +
+ values(Standard14Fonts).join(', '),
+ );
+ this.fontName = fontName
+ }
+ encodeText(text: string): PDFHexString {
I'd suggest including the type for the method arrow function's argument (I think it's required, actually), but not for the .map lambda arrow functions:
// Include the param type here
encodeText = (text: string): PDFHexString =>
PDFHexString.fromString(text
.split('')
// but not here
.map(char => char.charCodeAt(0))
.map(toWinAnsi)
// or here
.map(charCode => charCode.toString(16))
.join(''))
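One detail worth noting about the `toString(16)` step: `Number.prototype.toString(16)` does not zero-pad, so a code below 16 (a tab, for instance) yields a single hex digit and shifts every following pair. The sketch below shows one way to pad. `toHexByteSketch` and `encodeCodesSketch` are hypothetical names, and the sketch assumes the codes have already been mapped into the 0-255 range.

```typescript
// Hypothetical helpers for illustration; not taken from the PR.
// Assumes each code is already in the range 0-255.
const toHexByteSketch = (charCode: number): string => {
  const digits = charCode.toString(16);
  // Pad to two digits so every character occupies exactly one hex pair.
  return digits.length < 2 ? '0' + digits : digits;
};

const encodeCodesSketch = (charCodes: number[]): string =>
  charCodes.map(toHexByteSketch).join('');

console.log(encodeCodesSketch([9, 79])); // 094f
```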
|
Just tried defining that |
@jerp Is there anything else you plan to do on this PR before it's merged? If not, then I'll go ahead and merge it. |
If you’re satisfied with the latest change, go ahead. I’m looking at
fontkit, font subsetting and how to lighten it all up. Fancy a font-lib
repo?
…On Mon, 3 Sep 2018 at 22:26, Andrew Dillon ***@***.***> wrote:
@jerp <https://github.com/jerp> Is there anything else you plan to do on
this PR before it's merged? If not, then I'll go ahead and merge it.
|
Yeah, this looks great! I'll go ahead and merge it. I still think I might want to change the API for the Thanks again for the work you did on this! It's a real help to me, and provides a much needed feature for Regarding The reason I forked it (and the Are you planning on making any changes in particular? Or still just exploring the codebase and what it supports? |
See #17