Inserting HTML to PDF #134
Comments
Thank you for your kind words. I really appreciate them 😁 Can you please share what features are missing? I will consider placing them somewhere on the roadmap.
Generally speaking, I am against creating functionality that resembles HTML. HTML and CSS are really complex and trying to recreate them would be an enormous effort. On the other hand, if we decide to limit the scope, I expect a constant stream of requests (looking like bug reports) to add various functionalities to the HTML parser. However, please elaborate more about your idea 😁 |
My use case might be a little bit special. We are getting a certain HTML snippet from a third party and need to wrap it into a PDF. This HTML snippet can contain tables, text, images, etc. There is not a lot of CSS going on, and the content can span multiple pages. But I need control over where the page breaks go, and I may need to add some text above the footer on every page except the last, and so on. I know that the main tasks are not to be handled by QuestPDF but rather on my side, but since there is no HTML support yet, I can't even start figuring out the rest. |
I understand your point of view. I am also afraid that QuestPDF is not the right library in your case. You specifically want to render HTML content as PDF files. The easiest way is just to use any HTML-to-PDF converter that exists on the market. As stated before, HTML and CSS are so complex that I am not planning to support anything that resembles such a format, especially since the aforementioned converters (usually based on Chromium, which is just a webpage engine) are simply better in this regard (if you accept their paging limitations). I want to give more granular control, specifically designed for dynamic PDF generation with paging support in mind. I hope you understand 😁 |
Take a look at the HTML-to-PDF converter https://github.com/Kemsty2/HtmlConverter. |
Yes, this library is one of the potential solutions. There are free and paid products based on this idea. They are based on either wkhtmltopdf or the Chromium engine. Basically, they run an entire web browser emulation inside (including JavaScript). This is slow, usually unstable, and has many limitations. I strongly believe that QuestPDF offers and will continue to offer features that are very specific to the PDF generation domain, e.g. advanced paging support. However, I do not deny that there are use cases (like yours) where alternatives fit better. I am afraid that I will never attempt to support the HTML format in QuestPDF at this level of complexity. |
I too have a similar use case, as we have tried several solutions but the page break is never followed. What if we pre-rendered the html to an image and then placed it into the document? Would that work? |
|
So if I have an image that spans 2 and a half pages, QuestPDF will not
break it up over those pages but just have 1 very tall page, is that
correct? Bummer. We have users who input formatted text and unfortunately
that can't be removed.
…On Sun, Mar 13, 2022 at 1:59 PM Marcin Ziąbek ***@***.***> wrote:
1. Prerendering html as image is a good idea. I suggest rendering with
higher resolution so the content looks sharp.
2. Please notice that QuestPDF will display images as is. It is on
your side to properly divide html content for each page so to achieve
correct page breaks.
3. QuestPDF currently treats images as solid blocks. That means,
images cannot be broken into multiple pages automatically.
|
I guess the next possible step would be to somehow parse the html and convert it to QuestPDF syntax. |
That would be a preferred solution if your HTML content is shaped in a well-known and predictable way.
|
I was not aware of such a requirement. If more people ask for it, I can implement native support for image breaking. So far, I expect that the vast majority of use cases want to scale or wrap the image and keep it as a whole. Please remember that you can always predict the size of the available space in the document and divide your image into smaller image chunks 😁 |
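The chunking suggested above comes down to a little arithmetic. Here is a minimal sketch in Python, assuming the pre-rendered image is scaled to the full width of the content area and that every page offers the same free height. The function name `slice_rows` and its parameters are hypothetical; they are not part of QuestPDF.

```python
from typing import List, Tuple


def slice_rows(image_width_px: int, image_height_px: int,
               area_width_pt: float, area_height_pt: float) -> List[Tuple[int, int]]:
    """Return (top, bottom) pixel-row ranges, one per page-sized chunk.

    Assumption: the image is scaled so its width fills the content area,
    so one source pixel occupies area_width_pt / image_width_px points.
    """
    if min(image_width_px, image_height_px) <= 0:
        raise ValueError("image dimensions must be positive")
    if min(area_width_pt, area_height_pt) <= 0:
        raise ValueError("content area must be positive")

    points_per_pixel = area_width_pt / image_width_px
    # Number of source rows that fit into the free height of one page.
    rows_per_page = int(area_height_pt / points_per_pixel)
    if rows_per_page < 1:
        raise ValueError("content area is too short for a single pixel row")

    return [
        (top, min(top + rows_per_page, image_height_px))
        for top in range(0, image_height_px, rows_per_page)
    ]


if __name__ == "__main__":
    # An image two and a half "pages" tall yields three chunks.
    for top, bottom in slice_rows(1000, 2500, 500.0, 500.0):
        print(f"crop rows {top}..{bottom}")
```

Each range can then be cropped with any imaging library and placed in the document as a separate image. Note that this cuts at fixed row offsets, so a cut can land in the middle of a line of text; choosing cut points at whitespace rows would need extra analysis of the rendered image.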
Hi @bgiromini I have faced the same issues when getting HTML snippets from third parties that need to be integrated. A solution, could be to use With this, you can load in an HTML file or snippet, then loop over the nodes inside. If you know what to expect, you can make sure everything is covered. Maybe if there is enough need for this, the community can come together and create a separate package that includes some basic HTML components, for example render |
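The "load the snippet, then loop over the nodes" idea can be sketched as follows. This is only an illustration in Python using the standard library's `html.parser` as a stand-in, not the library the comment refers to. The tag tables, the `SnippetWalker` class, and the `walk` function are hypothetical names. The output is a flat list of blocks that a caller could then map onto PDF layout calls. Unknown tags are collected rather than silently dropped, so gaps in coverage show up early.

```python
from html.parser import HTMLParser
from typing import List, Tuple

# Hypothetical mapping from block-level tags to layout block kinds.
BLOCK_TAGS = {"p": "paragraph", "h1": "heading", "h2": "heading", "li": "list_item"}
# Tags accepted without action in this sketch (inline styling, list containers).
PASS_THROUGH = {"b", "strong", "i", "em", "u", "span", "ul", "ol", "br", "div"}


class SnippetWalker(HTMLParser):
    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.blocks: List[Tuple[str, str]] = []
        self.unsupported = set()
        self._kind = None   # kind of the block currently being collected
        self._text = []     # text fragments of that block

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            kind = self._kind
            self.flush()    # emit text collected so far to keep source order
            self.blocks.append(("image", dict(attrs).get("src") or ""))
            self._kind = kind
        elif tag in BLOCK_TAGS:
            self.flush()
            self._kind = BLOCK_TAGS[tag]
        elif tag not in PASS_THROUGH:
            self.unsupported.add(tag)

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if self._kind is not None:
            self._text.append(data)

    def flush(self) -> None:
        text = " ".join("".join(self._text).split())  # collapse whitespace
        if self._kind is not None and text:
            self.blocks.append((self._kind, text))
        self._kind, self._text = None, []


def walk(html: str):
    """Return (blocks, unsupported_tags) for an HTML snippet."""
    walker = SnippetWalker()
    walker.feed(html)
    walker.close()
    walker.flush()
    return walker.blocks, sorted(walker.unsupported)


if __name__ == "__main__":
    blocks, unknown = walk('<h1>Offer</h1><p>Hello <b>world</b></p><img src="a.png">')
    print(blocks, unknown)
```

The sketch ignores inline styling, tables, and text outside the known block tags. It suits the situation described in the thread, where the shape of the incoming HTML is known in advance.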
I started on a lightweight HTML conversion library here: https://github.com/adamfoneil/QuestPdfUtil |
I am excited to observe your progress! It really depends on how many HTML/CSS features we want to support 😁 At this moment, I am not sure if a semi-HTML parser has any benefits over the existing API. I expect that developers will constantly hit some incompatibilities or missing features, making such an effort a hell. Let's hope that I am wrong! P.S. I am very sorry for replying so late. There is just a lot going on in my personal life, very positive yet time-consuming events. I hope that you understand 😁 |
Thank you @MarcinZiabek ! I was pleasantly surprised you starred my repo. No idea what will come of it. I just needed something in a pinch to handle the HTML fragments that are edited in my application. I totally understand why it would not be mainline functionality in a PDF library. Yes I think it would be pretty difficult to make it more full-featured, but what's there now works for my use case. |
In my case our users use TinyMCE to edit text for a proposal.
|
TinyMCE sounds like a limited and well-known environment. In such a case, where you can accurately predict all requirements and corner cases, the solution proposed by @adamfoneil may work really nicely. Of course, assuming that his library is extended to support output from TinyMCE 😁 |
@adamfoneil, your idea is very useful. I looked at your repository and found it interesting. |
I needed HTML support, so I wrote a small library for this. I would be glad if it is useful to someone else. @MarcinZiabek I used the outline of your icon. If you are against it, let me know, and I will replace it. |
@GeeSuth This is really an interesting concept, but I am not planning to integrate it into the library. The complexity of HTML and CSS (especially modern versions) is overwhelming. It is just not possible to create a translation layer that will work for so many cases. Not to mention, QuestPDF is not that powerful. After all, behind Chromium (and any other web rendering engine) there are dozens of full-time experts. Whatever such a library does, it will always cover a small subset of the actual HTML+CSS technology. That being said, I am very happy to see new libraries that are attempting to fill that gap. All similar efforts are very welcome! I can't wait to observe their development. |
@Relorer Your library looks very interesting and is already quite impressive 😁 Keep going!
I am totally OK with using the library logo (in the original or modified form). I only suggest using the full name of QuestPDF, as this may help you position your NuGet package in the search results. For example, Relorer.QuestPDF.HTML (or something similar). Also, I have a question 😁 In line with my message above, writing a fully capable HTML+CSS to QuestPDF converter is just not possible, for multiple reasons. No matter how many features you add, there will always be more features to support. This was nicely shown in your first issue, Relorer/HTMLToQPDF#1. However, I expect that many projects may benefit from supporting a very limited and predictable set of HTML content, e.g. something that is the output of a WYSIWYG editor. Do you plan to support this type of scenario? I think this may be a great niche for your project 😁 |
@MarcinZiabek I hadn't thought of such editors, and I didn't even know they had such a name 😆 Now I plan to add minimal CSS support. And then I'll probably really try to explore the existing WYSIWYG editors to add the missing features. Thank you for the advice 😁 |
Slightly off-topic from HTML parsing, but this is significantly easier when using a reduced markup such as markdown. Unfortunately I can't share it, but I do have a (mostly) working markdown to PDF converter that was created without too much trouble with the Markdig library. |
I was thinking about markdown too 😁 Great idea, thank you! I am not sure though, if all editors are capable of outputting markdown. Also, markdown is not as powerful compared to HTML. Maybe there is space for both approaches? |
In my use case, I have over 10 years' worth of data stored as HTML that would need to be converted or deprecated. |
You can always use chrome to convert HTML to PDF using PowerShell https://gist.github.com/ilovefreesw/da435865a443a62923d67e6af6c6b2a8 If it's business oriented that's the shortest way |
You're correct that it's not as powerful, however that probably works in your favour, given it means there's a significantly limited set of possibilities for formatting, layout, etc, compared with HTML |
Finally, after a lot of searching, I decided to learn the QuestPDF concepts. I find this very useful. |
If you have to work with html it might be easier to use something else like Puppeteer Sharp to load/inject a html page in to a headless Chromium browser and just save html to PDF https://github.com/hardkoded/puppeteer-sharp#generate-pdf-files |
The problem with headless Chrome is the flaky paging support for table headers and footers |
What about something simple like Markdown? I suspect there would almost be a 1:1 correlation between Markdown features and QuestPDF features. The use case would be for those complex PDFs that have snippets of text or cover letter content that you might want the end user to be able to edit. |
@kurtisane @humphrey In case you are still looking for something like this using markdown, I have decided to create a small library that does exactly that, and expect to release a non-preview version in the upcoming days. You can find it here: https://github.com/christiaanderidder/QuestPDF.Markdown |
Hi everyone! I am enjoying QuestPDF very much, @MarcinZiabek thanks! I need to output simple HTML content that is the product of a WYSIWYG editor. I have noticed there are a few suggested solutions, some from a while ago, others quite new. My question is, which is currently the most viable solution? |
At this moment, there is no official support for this feature 😥 Based on nuget and github statistics, this https://github.com/Relorer/HTMLToQPDF is a viable solution. Huge shout out to its authors! Once I finish working on a couple of high-priority features, I will consider introducing official HTML support or (more likely) helping existing projects if their authors are interested in collaboration 😀 |
Hi,
first off, thanks for all of this!
We are currently not using this library due to the lack of some features but I love, appreciate and track this already for some time.
I see this as the "rescue" from headless chromium to render stuff.
One of the requirements for us is most definitely the rendering of HTML content.
So I would love to basically create a PDF and insert HTML snippets into it.
Is this something that can be added to the roadmap?
Regards!