simplify API for creating PDF #277

Merged
merged 1 commit into from
Mar 4, 2024

asolntsev
Contributor

@asolntsev asolntsev commented Feb 28, 2024

Users now need to write less code to generate a PDF from HTML.

This commit adds two new APIs:

  1. class Html2Pdf (a single method for generating PDF from HTML)
  2. renderer.createPDF(doc, os) (instead of old sequence setDocument, layout, createPDF)

The work continues...
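
As a rough illustration of the second item, the sketch below shows how a single `createPDF(doc, os)` overload can wrap the older `setDocument`, `layout`, `createPDF` sequence. `SketchRenderer` and `SketchDemo` are hypothetical stand-ins written for this example, not Flying Saucer's actual classes, and the "rendering" just writes a marker string instead of a real PDF.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a renderer; NOT Flying Saucer's real implementation.
class SketchRenderer {
    private String document;
    private boolean laidOut;

    // Old style, step 1: remember the document to render.
    void setDocument(String doc) {
        this.document = doc;
        this.laidOut = false;
    }

    // Old style, step 2: lay the document out (only tracked as a flag here).
    void layout() {
        if (document == null) {
            throw new IllegalStateException("setDocument must be called first");
        }
        laidOut = true;
    }

    // Old style, step 3: write the result to the stream.
    void createPDF(OutputStream os) throws IOException {
        if (!laidOut) {
            throw new IllegalStateException("layout must be called first");
        }
        os.write(("rendered:" + document).getBytes(StandardCharsets.UTF_8));
    }

    // New style: one call that runs the three steps in the required order.
    void createPDF(String doc, OutputStream os) throws IOException {
        setDocument(doc);
        layout();
        createPDF(os);
    }
}

class SketchDemo {
    // Convenience wrapper returning the output as bytes.
    static byte[] render(String doc) {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        try {
            new SketchRenderer().createPDF(doc, os);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return os.toByteArray();
    }
}
```

With the overload, the caller can no longer forget a step or call the steps in the wrong order, which is the main benefit of collapsing the sequence.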

@asolntsev asolntsev added this to the 9.6.0 milestone Feb 28, 2024
@asolntsev asolntsev self-assigned this Feb 28, 2024
@asolntsev asolntsev changed the title simplify API for creating PDF: now user needs to write less code to g… simplify API for creating PDF Feb 28, 2024
@andreasrosdal

This comment was marked as outdated.

@asolntsev
Contributor Author

@andreasrosdal Yes, generally we could add some validation/html cleanup.
But using regular expressions doesn't seem to be a good idea for this purpose: https://medium.com/thecyberfibre/stop-parsing-x-html-with-regular-expression-2cf13215b411

@andreasrosdal
Contributor

How about using Jsoup to fix invalid XHTML?
https://jsoup.org/

In general I agree that regular expressions should not be used to fix bad HTML. However, we could use something, and regular expressions are at least one of the possible options. Jsoup is probably better.

@asolntsev
Contributor Author

@andreasrosdal Yes, JSoup sounds good.
Could you please share some examples of such "invalid XHTML" needing a cleanup? I am trying to understand what problem we want to solve.

@andreasrosdal
Contributor

andreasrosdal commented Mar 1, 2024

Here are some examples of HTML changes that had to be made to turn the HTML into valid XHTML for Flying Saucer PDF export:
[four screenshots of the HTML changes; images not preserved]

@asolntsev
Contributor Author

@andreasrosdal Thank you for the samples. Some of these could indeed be fixed automatically (e.g. `<link>` -> `<link></link>`), but others are genuinely invalid, and FS should throw an exception in those cases (e.g. `<<i class="">`).
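
To make the distinction concrete, here is a minimal sketch that checks markup for XML well-formedness with the JDK's built-in `DocumentBuilder`. It is a stand-in for illustration only: the `XhtmlCheck` class is hypothetical, and the exact parser configuration Flying Saucer uses is not shown in this thread.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Flying Saucer.
class XhtmlCheck {
    // Returns true if the markup parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors but does not print to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (Exception e) {
            // Malformed markup (or a parser setup problem) ends up here.
            return false;
        }
    }
}
```

An unclosed `<link>` and a stray `<<` both fail this check, but only the former has an obvious mechanical fix (closing the tag), whereas the latter has no single correct repair.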

@andreasrosdal
Contributor

I wish that FS would handle these cases and not throw an exception, in the same way that Firefox and Chrome are fault-tolerant of invalid HTML in many cases.

@asolntsev asolntsev merged commit 80a757b into main Mar 4, 2024
2 checks passed
@asolntsev asolntsev deleted the refactoring/simplify-api branch March 4, 2024 15:12