TOC support broken after upgrade to latest QT #1509
Comments
Did this work with 0.11rc2? |
I'm not sure about 0.11rc2, but it does work on wkhtmltopdf-0.11.0_rc1 |
I just now tried this on Windows, and it looks like it is working ... can you confirm it with a screenshot? Which PDF viewer are you using? |
I've just tested the TOC option on Windows 32-bit and 64-bit, and neither of them created a 'table of contents' page. Has something changed in how to enable this option? |
Sorry, I never used the TOC option -- I thought you were talking of the outlines shown in Adobe Reader. I'll look into it a bit further. |
Looks like the XSL transformation is not happening. I'm trying to find out if it is due to a change in the underlying QT, or due to some build configuration changes in wkhtmltopdf. |
I had removed the dependency on XmlPatterns in 303957c -- I was under the impression that reverting this would fix this issue. However, the root cause turned out to be something else. This was due to QTBUG-10309 in QtXmlPatterns, which caused breakage of Google Calendar, and hence XSLT support was disabled in WebKit -- apparently, there are other issues in the QT XSLT support. This was imported just after QT 4.8.0-beta in wkhtmltopdf/qt@5f4e810 and hence affected wkhtmltopdf due to the upgrade to QT 4.8.5 (the earlier release 0.11rc2 was built with QT 4.7). There has been no progress on the bug at the QT end, with even a merge request on Gitorious being ignored. I'm not really sure what the way forward is; I am not comfortable with enabling XSLT support as it would possibly break a lot of things (but solve this use case). Using libxslt (and libxml) would be an option, but would require additional dependencies and take a lot of effort. Applying the patch above may fix the issue, but I don't know what other issues will pop up. So as it stands now, TOC support is broken and will remain broken for the near future (or unless we change the approach). How many people use this feature? Please chime in here with comments, as I will have to balance working on this issue against other features/issues. |
The best option going forward seems to be dropping the XSL support completely and rolling it by hand. We have a few options in front of us, in order of simplicity (least to most):
Option 3 will require us to use an external template library like HTML Template C++, which is also license-compatible with wkhtmltopdf (it is LGPLv2.1). We have to hardcode the generation of nested LIs, as there is no support for nested recursive templates in most template engines. Option 4 is ruled out due to the complexity involved; I am leaning more towards 2 or 3. Either way, support will only be added when I have the time (or a PR with the necessary changes is welcome). |
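To illustrate what "hardcoding the generation of nested LIs" could involve, here is a minimal sketch in Python rather than the C++ that wkhtmltopdf itself would use. The function name `render_toc` and the `(level, title, page)` tuple layout are hypothetical, and the sketch assumes heading levels start at 1 and never jump by more than one level when going deeper.

```python
from html import escape


def render_toc(entries):
    """Render (level, title, page) tuples as nested <ul>/<li> HTML.

    Hypothetical helper: assumes levels start at 1 and never skip a
    level when going deeper (for example 1 -> 3 is not handled).
    """
    parts = []
    depth = 0  # number of <ul> elements currently open
    for level, title, page in entries:
        if level > depth:
            # Going deeper: open a nested list inside the still-open <li>.
            while depth < level:
                parts.append("<ul>")
                depth += 1
        else:
            # Same or shallower level: close the previous item, then
            # close any deeper lists together with their parent items.
            parts.append("</li>")
            while depth > level:
                parts.append("</ul></li>")
                depth -= 1
        parts.append(f"<li>{escape(title)} <span>{page}</span>")
    # Close whatever is still open at the end of the outline.
    while depth > 0:
        parts.append("</li></ul>")
        depth -= 1
    return "".join(parts)


if __name__ == "__main__":
    print(render_toc([(1, "Intro", 1), (2, "Scope", 2), (1, "Usage", 3)]))
```

The loop keeps a single depth counter instead of recursing, which is the kind of logic that would have to live in code if the template engine cannot express recursion.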
We used to use hardcoded HTML+CSS, but it seemed hard to support everything. |
But does anyone use it? Other than @Jimbobnz, no one has chimed in ... right now, TOC does not work at all -- one would have thought there would be lots of "me too!" reports (which would be welcome, for a change). |
I can't speak for everyone, but I use the TOC feature for our reporting. Looking back at the original Google issue tracker (see the links below), I can find quite a few users who do use this feature. Having the TOC, cover page, headers & footers is what makes wkhtmltopdf a powerful tool; this is what makes it stand out from its friendly nemesis, phantomjs. As the wkhtmltopdf project has been on hiatus for over 2 years (Oct. 2011), it's going to take time for developers/users to come around to the new version 0.12.x. I'd imagine most people are sticking with version 0.11.rc1 (the "if it's not broken, why fix it" attitude). So over time you might find more and more developers asking for this bug/issue to be fixed. An alternative solution (if at all possible) is to do something similar to the header & footer URL page and post a JSON/XML URL-encoded string containing the TOC data into a user-defined custom HTML page, which could parse the TOC content using JavaScript to generate a custom TOC page. The generated TOC page could then be inserted back into the PDF output. Hope that all makes sense. An example command line would look something like this: wkhtmltopdf cover mycoverpage.html toc --toc-url mycustomtoc.html somewebpage.html cooloutput.pdf It's just a rough concept and I can fully understand how this might not be feasible. |
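As a rough illustration of the idea above, the TOC data could be serialized to JSON and carried in the query string of the custom page's URL. This is only a sketch: `--toc-url` is the commenter's proposed flag, not an existing wkhtmltopdf option, and the `toc` parameter name, the entry fields, and both function names are assumptions made for the example.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def toc_page_url(template_url, entries):
    """Append the TOC entries to template_url as a URL-encoded JSON string.

    The 'toc' query parameter is a hypothetical name for this sketch.
    """
    payload = json.dumps(entries, separators=(",", ":"))
    return f"{template_url}?{urlencode({'toc': payload})}"


def read_toc(url):
    """Decode the entries again, as the custom page's script would have to."""
    query = parse_qs(urlsplit(url).query)
    return json.loads(query["toc"][0])


if __name__ == "__main__":
    entries = [
        {"level": 1, "title": "Intro", "page": 1},
        {"level": 2, "title": "Scope & goals", "page": 2},
    ]
    url = toc_page_url("mycustomtoc.html", entries)
    print(url)
    print(read_toc(url))
```

One practical limit of this design is URL length: a long outline may not fit in a query string, so a real implementation might need another transport.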
I'm using the TOC feature a lot. The biggest problem with disabling the TOC feature is IMO the lack of page numbers. While it is no problem (in my use case) to create a separate HTML document from the source and prepend it via No matter how it will be done, removing TOC XSLT without introducing something equivalent would mean that I have to stick with an older version. |
I am currently using the default XSL as a base template with just a few minor formatting styles in place. I must confess I don't fully understand the option 3. But as long as it works, Great. |
Can you post the changed XSL so that I can see if it can be easily done in option 3? |
Doh. There is no easy way to submit my custom version of the xsl file via github. |
Post a gist? |
Still learning how to use github, hopefully you can see this. |
I don't know if option 3 could be a replacement. For example, I'm inserting a caption before the list of TOC entries and I'm parsing the list entries in XSL to change the enumeration style from decimal to roman for certain entries (see https://gist.github.com/mn4367/8972015). So the old decision mentioned by @antialize to offer XSLT was the best way to serve almost all needs. For option 3, if you allow JavaScript in the TOC HTML page, it could be possible to modify the entries or do other things to customize the TOC. @Jimbobnz's suggestion to post the TOC structure in a simple format to the HTML page sounds good to me, since it allows a user to do with it what she/he wants to do. In this case I think that a standard example or default page would be useful for those users who aren't comfortable with hacking JavaScript just to get a default TOC. Still, retaining page numbers is crucial, and forward and backward links like in older versions are very nice. And to push it further, I'm also using headers and footers in the TOC, for example to display page numbers =8). |
@mn4367: read the notes for the above commit and try to see if it works for you. The XSLT support in XmlPatterns is not complete, and the advanced stuff you are using may/may not work. |
Mac OS X binaries for the development snapshot are available. |
Sorry for being very late on this topic. For OS X I can report that TOC generation doesn't work at all (issue #1534). On Windows 7 64-bit it's a little bit different:
So I think it's fair to reopen this issue. |
I'm using the TOC feature as well. It's working again on Ubuntu 64bit even with more complicated xsl than default one. I am very grateful for this patch. |
@mn4367: you have to make it XSLT 2.0, as otherwise QtXmlPatterns does not work properly. If that doesn't work, please open a new issue with those details (you seem to have created a lot of issues as well ...) |
OK, I checked it again, with PS: |
[Someone asked for me too reports?] We use TOC and it is broken in 0.12. It works as expected in 0.11.0-rc2. |
@jwernerny: I don't think anyone really likes them, other than the reporter ... did the 0.12.1 build not fix it for you? |
I just downloaded the 0.12.1-dev build (wkhtmltopdf 0.12.1-61b740ee72b5830ad1d07a9bea5246622ed4defb). It appears the issue is fixed in it. Thanks. |
Another "me too" here: I use the TOC feature, and the 0.12.1-dev build seems to generate it again. However, it appears the --toc-depth option is gone. I was using that as well (I don't want my h3's in the TOC). Is there a fix for that? |
In case --toc-depth is no longer available, as a workaround you could always use your own XSL stylesheet to customize the result. |
Okay I did exactly that, turned out to be simple even for an xslt noob like me. |
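The workaround above is done in XSL. The depth cut-off it implements can also be shown outside XSL; the following is a minimal sketch in Python, assuming an outline made of nested `item` elements with `title` and `page` attributes. The sample document, `prune_outline`, and `titles` are illustrative and are not taken from wkhtmltopdf.

```python
import xml.etree.ElementTree as ET

# Hypothetical outline, simplified for the example (no XML namespace).
SAMPLE = """<outline>
  <item title="Chapter 1" page="1">
    <item title="Section 1.1" page="2">
      <item title="Detail 1.1.1" page="3"/>
    </item>
  </item>
  <item title="Chapter 2" page="4"/>
</outline>"""


def prune_outline(xml_text, max_depth):
    """Parse the outline and drop every item nested deeper than max_depth."""
    root = ET.fromstring(xml_text)

    def prune(node, depth):
        # 'depth' is the nesting level of 'node'; its children sit at depth + 1.
        for child in list(node):
            if depth >= max_depth:
                node.remove(child)
            else:
                prune(child, depth + 1)

    prune(root, 0)
    return root


def titles(root):
    """Collect the remaining titles in document order."""
    return [item.get("title") for item in root.iter("item")]


if __name__ == "__main__":
    print(titles(prune_outline(SAMPLE, 2)))
```

With `max_depth=2` the h3-level entry ("Detail 1.1.1" in the sample) is dropped while chapters and sections are kept, which matches the use case of keeping h3 headings out of the TOC.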
I think that the option is now called |
Would it be possible to make an OS X dev build? Or is there an older OS X build available that does have TOC support? |
The one linked from the website should have this fix, although not all the later fixes. Unfortunately, I don't have access to an OS X environment, so I can't really make builds for that platform. |
Sorry, which website do you mean? I tried the builds I could find, and none of them had TOC support. Is there anyone on the team who's doing OS X builds? Or if not, would you like me to give it a try? |
There is only one website, http://wkhtmltopdf.org; try the development version. I think that the build_osx.sh script could possibly work, but ymmv. @npinchot was going to port it to the common build script, but he doesn't seem to have the time right now. |
Unfortunately, the one at http://wkhtmltopdf.org does not have TOC support as of now. I'll try build_osx.sh when I get a chance. Maybe if I'm feeling ambitious, I'll have a look at the common build script and see if I can contribute. |
That's a bit unlikely as the snapshot posted above by @npinchot on 14-Feb had the changes. |
I think I may have found the source of my trouble. It appears that the command line arguments changed at some point. The top Google hit for "wkhtmltopdf manual" is this: http://madalgo.au.dk/~jakobt/wkhtmltoxdoc/wkhtmltopdf-0.9.9-doc.html I was invoking the program per that manual. But just now, I was able to find a newer version of the manual, which has different arguments for the TOC: http://madalgo.au.dk/~jakobt/wkhtmltoxdoc/wkhtmltopdf_0.10.0_rc2-doc.html I don't know if you have any control over that website, but if so, perhaps a big, bold deprecation notice should be added to the 0.9.9 manual. As it was the top Google hit, it didn't occur to me that a) I was reading an old version, and b) the command line flags had changed. |
The authoritative reference is always the official website. I think that @antialize can possibly remove/redirect those links. |
I raise my hand to this "request" too. Using wkhtmltopdf 0.12.0 on Fedora 20. |
@ensensis: which request are you referring to? This issue has been fixed in the development build. |
Perhaps @ensensis is referring to my suggestion: That a deprecation notice be placed on the old manual. The old one, which is no longer correct, is still the top Google hit for "wkhtmltopdf manual." |
Sorry for the delay. I was not notified of replies. @ashkulz, yes, you are correct, the issue has been fixed in the dev build. Thank you so much |
Hi there. We recently had to upgrade our wkhtmltopdf version when changing servers - we are now running 0.12.2.1 with the latest QT patch, and I am experiencing the missing TOC issue. I understand from the last posts that this should be fixed - is it possible that something in the most recent wkhtml / QT build is causing problems again? There is just a blank page where the TOC is supposed to be. Thanks in advance! |
An update on this issue, for anyone else who might be experiencing it: For interest, is there any other way to have the TOC without a specific header? I checked the latest documentation at http://wkhtmltopdf.org/usage/wkhtmltopdf.txt but couldn't see any such option. |
What was the version you were using previously? |
Previous version was wkhtmltopdf 0.10.0 rc2. A previous developer implemented it and everything was working until our server change, so had no need to change before now |
If you can produce a small, reproducible test case with the latest version, please report it as a separate issue. If you want, you can still download the previous version from the downloads page (see archive). |
Version: wkhtmltopdf 0.12.0
OS: Centos 6.4
No TOC is being created and no error generated.
bin/wkhtmltopdf toc http://www.w3schools.com/html/html_headings.asp w3schools.com.pdf
Even when specifying to use xsl file, it has the same results.
bin/wkhtmltopdf toc --xsl-style-sheet toc.xsl http://www.w3schools.com/html/html_headings.asp w3schools.com.pdf