Using drawText on a paragraph that uses multiple fonts #577

Closed
ahwitz opened this issue Aug 26, 2020 · 7 comments

Comments

ahwitz commented Aug 26, 2020

I'm currently porting a bunch of Python code that used reportlab/FPDF to JS. This library is significantly more straightforward to use, but I've run into one problem that I haven't found a solution for yet.

I need to embed a citation that uses the following syntax:

Author, Author, Author. Title of article. Title of containing journal. Additional metadata at the end.

I've got maxWidth, multiple fonts, and manual typesetting down. My question is how to get the "Additional metadata" bit at the end to line-break automatically, because I have to create a separate block of text for the italics. A textIndent option on drawText, or something like that, could make this work.

Is there a way to do multiple fonts in a single paragraph, and/or is there a way to set the indent of the first line of a block of text?

Thanks for any answers/help you can provide!

Author

ahwitz commented Aug 28, 2020

I spent a bit of time trying to write a PR for this, and got stuck in one of two places:

  • First, it looks like all text is done in single lines, without the "paragraph-like" elements (per Adobe's docs) that support the StartIndent attributes. Could also solve [Feature Request] API for text alignment #578 by implementing them, but that's far beyond my understanding of the library, and I think would take a somewhat major refactor.
  • Second, I was trying to use the ability to pass PDFNumbers into TJ operators to prepend spacing before the first character. TJ operators are supposed to take arrays, but it looks like the library's internal PDFArray isn't compatible with PDFOperatorArg. If you've got any pointers on how to pass a PDFArray as the second parameter of PDFOperator.of, let me know and I can clean up the rest of my code pretty quickly.
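
For reference, the TJ spacing idea in the second point can be sketched at the content-stream level. This is only a rough illustration: `escapePdfLiteral`, `indentToTJAdjustment`, and `indentedTJ` are hypothetical names, not pdf-lib APIs, and the sketch assumes a simply encoded literal string and the default horizontal scaling of 100%. The one firm detail is the sign convention: numbers in a TJ array are in thousandths of a text-space unit and are subtracted from the current position, so moving right takes a negative value.

```typescript
// Hypothetical helpers for illustration only; none of these are pdf-lib APIs.

// Escape the characters that are special inside a PDF literal string.
function escapePdfLiteral(text: string): string {
  return text.replace(/([\\()])/g, '\\$1');
}

// TJ numbers are in thousandths of a text-space unit and are SUBTRACTED from
// the horizontal position, so a rightward shift needs a negative value.
// Assumes horizontal scaling is left at its default of 100%.
function indentToTJAdjustment(indentPts: number, fontSize: number): number {
  return -(indentPts / fontSize) * 1000;
}

// Build the raw content-stream text for a TJ call whose first element is
// the spacing adjustment, followed by the string to show.
function indentedTJ(text: string, indentPts: number, fontSize: number): string {
  const adjustment = indentToTJAdjustment(indentPts, fontSize);
  return `[${adjustment} (${escapePdfLiteral(text)})] TJ`;
}

// A 24pt indent at font size 12 becomes an adjustment of -2000.
const tjExample = indentedTJ('Additional (meta)', 24, 12);
```

Note that this only shifts the start of one shown string; it does not wrap text, so it would cover a first-line indent but not the line breaking itself.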

Owner

Hopding commented Sep 17, 2020

Hello @ahwitz!

  • Here's an example showing how to convert a PDFArray to a PDFOperatorArg[]:

    const arr = pdfDoc.context.obj(['foo', 'bar']);
    const myTJOperator = PDFOperator.of(
      PDFOperatorNames.ShowTextAdjusted,
      arr.asArray() as PDFOperatorArg[],
    );

    Note that the cast is required. This is because arr.asArray() returns a PDFObject[], which cannot be assigned to PDFOperatorArg[]. If you know that your PDFArray contains only elements of type PDFOperatorArg, this cast is safe.

  • Regarding the StartIndent attribute and "paragraph-like" elements you mentioned, you are correct that pdf-lib does not produce tagged elements. This would be a nice feature to provide and I'd be happy to accept PRs for it. However, these are just structural metadata elements. They do not in any way affect the appearance of a PDF page's content. They're just intended to make it easier to extract content from a document. So I'm not sure that implementing them would necessarily solve [Feature Request] API for text alignment #578.

  • Regarding your original question, I'm not sure I quite understand what you're trying to do. If you just want Additional metadata at the end. to always appear on the next line, could you do something like this?

    pdfPage.drawText('Author, Author, Author. Title of article. Title of containing journal.', { y: 50 })
    pdfPage.drawText('Additional metadata at the end.', { y: 30 })

Author

ahwitz commented Sep 18, 2020

We've actually gone a slightly different direction internally for what we're looking for, but the problem is that we want to write roman text THEN italicized text THEN roman text again in the same paragraph. My implementation right now is something along the lines of:

  • drawText with {font: roman}, note and save width
  • drawText with {font: italic, x: previous width}, note and save width
  • drawText with {font: roman, x: sum of previous widths}

This works, but only if we know the exact width of previous lines in advance. Because these citations can stretch across multiple lines in a single paragraph (and because the "title of containing journal" might be rather long), and because we're using maxWidth to make them look like left-aligned paragraphs, the math to figure out exactly where to start the block of text after the italic is rather difficult.
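
The three steps above, for the simple case where everything fits on one line, might look roughly like the sketch below. `placeRunsOnOneLine` and the `Measure` type are hypothetical names rather than pdf-lib APIs. The measuring functions are stand-ins so the sketch runs on its own; with pdf-lib they could wrap `font.widthOfTextAtSize(text, size)`.

```typescript
// Hypothetical helper for illustration; not part of pdf-lib.
type Measure = (text: string) => number;

interface Run {
  text: string;
  measure: Measure; // e.g. (t) => font.widthOfTextAtSize(t, size) with pdf-lib
}

interface PlacedRun {
  text: string;
  x: number;
}

// Place runs side by side on ONE line, starting at startX.
// Each run begins where the previous one ended.
function placeRunsOnOneLine(runs: Run[], startX: number): PlacedRun[] {
  const placed: PlacedRun[] = [];
  let x = startX;
  for (const run of runs) {
    placed.push({ text: run.text, x });
    x += run.measure(run.text); // advance by this run's measured width
  }
  return placed;
}

// Stand-in metrics (fixed width per character) so the sketch needs no PDF library.
const romanWidth: Measure = (t) => t.length * 5;
const italicWidth: Measure = (t) => t.length * 4;

const placedRuns = placeRunsOnOneLine(
  [
    { text: 'Author. ', measure: romanWidth },
    { text: 'Journal. ', measure: italicWidth },
    { text: 'Metadata.', measure: romanWidth },
  ],
  50,
);
```

This is exactly the part that breaks down once maxWidth wrapping is involved, because the x offset of the next run then depends on where the last wrapped line of the previous run ended.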

Does that make sense? I can try to mock up a few PDFs if it's not clear.

Owner

Hopding commented Sep 18, 2020

I think that makes sense. So what you're after is a way to know what bounding box (x,y,width,height) contains the text applied to the page by drawText, so that you can position subsequent text blocks below it?

Author

ahwitz commented Sep 19, 2020

Unfortunately, no. I've actually built that on top already without touching the base code, and that was relatively straightforward.

For more clarity:

[Image: Screenshot from 2020-09-19 11-33-47]

The vertical bounding box components for the blue text would be two lines. The horizontal bounding box components would be 0 to wherever the line break was inserted. Given the current API, to position the green text, we would need to know specifically where the blue text ended on the third line, so that we could start the green text at the proper kerning/spacing from it.

I think something like widthOfTextWithIndentsStartingAtXPos could work to follow what I remember of the current API, but that might start to get a bit annoying. Simplest way to phrase this might be "How can I make it look like one drawText call continues a paragraph started by another drawText call, without knowing how long the first block of text was?"
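
One way to get that effect without changing the library would be to do the line breaking outside of drawText and then draw each piece at an explicit position. The following is a minimal sketch under several assumptions: `layoutRuns`, `StyledRun`, and `Fragment` are hypothetical names (not pdf-lib APIs), whitespace is assumed to sit at the end of each run rather than the start, and per-word widths are summed, which ignores any kerning across word boundaries.

```typescript
// Hypothetical sketch; none of these names are pdf-lib APIs.
type MeasureFn = (text: string) => number;

interface StyledRun {
  text: string;
  style: string; // a label such as 'roman' or 'italic'
  measure: MeasureFn; // e.g. (t) => font.widthOfTextAtSize(t, size) with pdf-lib
}

interface Fragment {
  text: string;
  style: string;
  x: number; // offset from the paragraph's left edge
  line: number; // 0-based line index
}

// Greedy word wrap across runs that use different fonts. The cursor (x, line)
// carries over from one run to the next, so a run continues the paragraph
// exactly where the previous run stopped.
function layoutRuns(runs: StyledRun[], maxWidth: number, firstLineIndent = 0): Fragment[] {
  const fragments: Fragment[] = [];
  let x = firstLineIndent;
  let line = 0;
  let lineEmpty = true;
  for (const run of runs) {
    // Split into words, keeping trailing whitespace attached to each word.
    const words = run.text.match(/\S+\s*/g) ?? [];
    for (const word of words) {
      const visibleWidth = run.measure(word.replace(/\s+$/, ''));
      // Wrap when the word would overflow, unless the line has nothing on it yet.
      if (!lineEmpty && x + visibleWidth > maxWidth) {
        line += 1;
        x = 0;
      }
      const last = fragments[fragments.length - 1];
      if (last && last.line === line && last.style === run.style) {
        last.text += word; // extend the current same-style fragment
      } else {
        fragments.push({ text: word, style: run.style, x, line });
      }
      x += run.measure(word);
      lineEmpty = false;
    }
  }
  return fragments;
}

// Stand-in metric (one unit per character) so the sketch runs without a PDF library.
const unitWidth: MeasureFn = (t) => t.length;

const fragments = layoutRuns(
  [
    { text: 'aaa bbb ', style: 'roman', measure: unitWidth },
    { text: 'ccc ddd ', style: 'italic', measure: unitWidth },
    { text: 'eee', style: 'roman', measure: unitWidth },
  ],
  12,
);
```

Each fragment could then be drawn with its own call, along the lines of `page.drawText(fragment.text, { x: left + fragment.x, y: top - fragment.line * lineHeight, font, size })`, where `left`, `top`, and `lineHeight` are values chosen by the caller. The `firstLineIndent` parameter covers the textIndent idea from the original question.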

Owner

Hopding commented Oct 2, 2020

@ahwitz That makes sense. It would be nice if pdf-lib provided text layout information to facilitate more complicated use cases like this. However, since the necessary primitives already exist, I will not personally be creating these APIs for some time. There are a number of other features I need to implement first. So I'm going to close this issue for now. That being said, I'm happy to accept PRs if anybody would like to work on this.

Hopding closed this as completed Oct 2, 2020

Author

ahwitz commented Oct 2, 2020

I'm clearly missing something that you're seeing: what primitives exist that would allow the green box to be positioned dynamically after the blue box? Or, at least, how can I know where the line break around "extravagantly long word" will be positioned, or how much extra space will be left on the line before it?
