New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

large gaps and undesirable page breaking for tables #1640

Closed
tenshon opened this Issue Apr 15, 2014 · 33 comments

Comments

Projects
None yet
9 participants
@tenshon

tenshon commented Apr 15, 2014

After the latest patch to fix blank pages that occur with large nested tables, there are no more blank pages at the start of the document, but in certain scenarios there are large gaps between tables within pages (not always a page break), and even medium sized nested tables (<50% page height) are being put onto the next page.

Could we fix this by only breaking the page if the TD (that will be moved to the next page) contains no nested tables? Isn't this really what we're trying to achieve?

@ashkulz

This comment has been minimized.

Member

ashkulz commented Apr 15, 2014

Can you post a sample test case?

@ashkulz ashkulz added this to the future milestone Apr 18, 2014

@ashkulz ashkulz changed the title from Large gaps and undesirable page breaking with new patch to large gaps and undesirable page breaking for tables Apr 24, 2014

@ashkulz ashkulz removed this from the future milestone Apr 24, 2014

@ashkulz

This comment has been minimized.

Member

ashkulz commented Apr 24, 2014

A sample test case would be helpful to understand the exact scenario.

@tenshon

This comment has been minimized.

tenshon commented Apr 24, 2014

It's occurring in one of my client's reports that had a lot of sensitive data in it. When I tried filtering out all the sensitive stuff, it wouldn't exhibit the problem. I'll try again when I get a moment.

@ashkulz ashkulz added the NeedInfo label May 4, 2014

@ashkulz

This comment has been minimized.

Member

ashkulz commented May 9, 2014

Please reopen the issue when you are able to reproduce it.

@ashkulz ashkulz closed this May 9, 2014

@tenshon

This comment has been minimized.

tenshon commented Jun 5, 2014

Sorry this took so long. The issue still exists with the current version.
http://myworkflowfirst.com/hotfix/portrisk.htm
...generates this:
http://myworkflowfirst.com/hotfix/portrisk.pdf

The report is auto-generated and consists of multiple nested tables.

Would it be possible to change the page break logic so that it only pushes the TD to the next page if it DOESN'T contain a table?

@ashkulz - not sure how to re-open, could you do it for me?

@ashkulz

This comment has been minimized.

Member

ashkulz commented Jun 6, 2014

@tenshon: it looks like the HTML was generated from Word, is that correct? You might want to clean it up before passing it on to wkhtmltopdf.

@ashkulz ashkulz reopened this Jun 6, 2014

@tenshon

This comment has been minimized.

tenshon commented Jun 6, 2014

@ashkulz - I think it has additional styles in there for Word because the users wanted to open it in Word or something - not sure, I didn't create it. I could try to create a cleaner HTML to reproduce, but still I don't think wkhtmltopdf should be creating those page breaks...

What do you think of my idea of only breaking if the TD doesn't contain a table?

@ashkulz

This comment has been minimized.

Member

ashkulz commented Jun 6, 2014

With the kind of junk HTML Word generates, the document switches to "Quirks mode" -- which may cause all sorts of rendering issues. Can you try to see if you get different output after cleaning up the HTML so that it renders in "Standards compliance mode"?

quirks-mode

@tenshon

This comment has been minimized.

tenshon commented Jun 6, 2014

@ashkulz - Ok I'll edit the HTML to get it out of quirks mode and let you know.

@tenshon

This comment has been minimized.

tenshon commented Jun 7, 2014

@ashkulz - ok the guy who obfuscated the figures used Word to find/replace, which messed it up.

I've attached a cleaner version here:
http://myworkflowfirst.com/hotfix/portrisk2.htm
http://myworkflowfirst.com/hotfix/portrisk2.pdf

This does look better (much better than the normal generated report) - but still at the bottom of the first page it's moving the table onto the next page. I presume this is because it's wrapped in a TD - but again, if you only pushed TDs to the next page that don't contain tables, wouldn't it fix it?

@ashkulz

This comment has been minimized.

Member

ashkulz commented Jun 7, 2014

I don't think that is really fixable without changing the current approach -- you should never implement any row moving logic unless you have a page-break-inside: avoid defined at the tr level. Then the default behaviour would be as earlier, and you could use CSS to enable the row moving logic for specific tables/rows insead of the current hacky approach.

@ashkulz ashkulz added PatchNeeded and removed NeedInfo labels Jun 7, 2014

@tenshon

This comment has been minimized.

tenshon commented Jun 7, 2014

Is there any way to disable the current logic, in factor of explicitly putting in "page-break-inside: avoid" ?

@tenshon

This comment has been minimized.

tenshon commented Jun 7, 2014

@ashkulz - ok I just read through the previous thread and looked at the code changes that were made, and right - it doesn't look like it can be disabled in CSS. Would you be able to implement the "less hacky" approach of requiring page-break-inside:avoid any time soon? My clients are pretty eager to have the issue resolved, it's a pretty big deal to them...

@ashkulz

This comment has been minimized.

Member

ashkulz commented Jun 7, 2014

Nope, not really. I work on wkhtmltopdf in my spare time, and this doesn't look like something which can be tackled very easily (although the actual changes required may be less, understanding the Webkit code enough to know what needs to be done will take a lot of time).

@tenshon

This comment has been minimized.

tenshon commented Jun 7, 2014

@ashkulz - I appreciate all the work you do on this project - but I don't suppose you consider doing work on wkhtmltopdf on a paid basis do you?

@ashkulz ashkulz closed this in 5dea253 Nov 21, 2014

@ashkulz ashkulz added this to the 0.12.2 milestone Nov 21, 2014

@ashkulz ashkulz added Fixed and removed PatchNeeded labels Nov 21, 2014

@ashkulz

This comment has been minimized.

Member

ashkulz commented Nov 22, 2014

A development snapshot 0.12.2-dev-5dea253 is available, which should fix this issue. Please refer to the breaking changes in this release and report back if your issue is not solved with the above snapshot or there is a regression from existing behavior.

@Ognian

This comment has been minimized.

Ognian commented Nov 29, 2014

@ashkulz OK continuing here. So I tried to check what the difference might be between your html and mine but without success. It is easier said than done, I spent hours trying out css woodoo ...

I think that we need a more deterministic approach by specifying and building the right TOOLS to find this kind of problems so we can fix them.
A first step would be if wkhtmltopdf could provide a debug output focused on this kind of issues
i.e.when page break happens and what where the effective css rules leading to this...

What do you think?

@ashkulz

This comment has been minimized.

Member

ashkulz commented Nov 30, 2014

Strip out all the CSS and see if it still gets cut, then it is a clear content related issue. If not, then go on adding till you find the offending declaration (it can help to use a binary search approach -- remove half of the CSS which is know to cause the problem and repeat this step).

@Ognian

This comment has been minimized.

Ognian commented Dec 1, 2014

See https://gist.github.com/Ognian/813dedee3bd28633784c
I've striped out all css, left only the
thead, tfoot {display: table-row-group; }
generating the pdf via

wkhtmltopdf --page-size A2 --orientation landscape test1.html test1.pdf

and still get a cut line
bildschirmfoto 2014-12-01 um 10 58 32
even if removing thead, tfoot {display: table-row-group; } it is still cut at that line...
So, what do you mean that this is a content related issue? (It is a correct html and it is displayed correct in the browser)....

@ashkulz

This comment has been minimized.

Member

ashkulz commented Dec 1, 2014

@Ognian: it looks like you are using nested tables. If you have a row in the outer table which spans multiple pages, it is going to get cut anyway -- so it makes sense to apply the rule to the inner table only e.g. tr.json-array-table { page-break-inside: avoid } so that its rows are kept together. Still, the line should not get cut even then -- I'll have to investigate it further.

@ashkulz

This comment has been minimized.

Member

ashkulz commented Dec 3, 2014

@Ognian: 715cc3d is a step in this direction, but unfortunately the reported output is not accurate for table cells. That's something to be done in the future 😄

@Ognian

This comment has been minimized.

Ognian commented Dec 3, 2014

tried tr.json-array-table { page-break-inside: avoid } but, the line is still cut; so I hope you'll find something...

@repher

This comment has been minimized.

repher commented Jan 2, 2015

Just stumbled accross this discussion after having the same issue as most of you (overlapping, line cut in half). After installing 0.12.2-dev-5dea253 and adding tr { page-break-inside: avoid; } everything works for me. Even when using thead for repeating headers and extrem combination of rowspan="x".

Thanks for your work!

@hendricius

This comment has been minimized.

hendricius commented Jan 5, 2015

Unfortunately I am having the same issues as @Ognian. The fix @repher mentioned does not work for me.

@ashkulz thanks for the awesome work. I am also not using nested tables. I am just using a default table.

Here is my code. I hope it's ok to post my slim template.

  table.table
    thead
      tr
        th.span-1 WID
        th.span-4 Work Item Description
        th.span-2 Unit
        th.span-1 Price
        th.span-1 Quantity
        th.span-1 Discount
        th.span-1 Subtotal
    tbody
      - section.line_items.each do |li|
        tr
          td= li.wid
          td= li.description
          td= li.unit
          td= humanized_money_with_symbol @report.convert_money(li.price)
          td= li.quantity
          td= number_to_percentage li.discount, precision: 2
          td= humanized_money_with_symbol @report.convert_money(li.budget)
        - if li.custom_description.present?
          tr
            td colspan="7"
              .pre-wrap= li.custom_description

Here is a screenshot of how it looks like:
image

Edit: I removed all stylesheets. Except the following:

.hidden-in-print
  display: none

body
  background: #fff

.main-container
  background: #fff
  padding: 0px

.page-break
  height: 1px
  page-break-before: always

table
  border-collapse: collapse
  tr, td, th
    page-break-inside: avoid !important

It still looks like this:

image

The wkhtml version I am using is: wkhtmltopdf 0.12.2-dev-5dea253 (with patched qt) for OSX

@ashkulz

This comment has been minimized.

Member

ashkulz commented Jan 6, 2015

@hendricius: Please post actual HTML and CSS so that I can verify it.

@hendricius

This comment has been minimized.

@ashkulz

This comment has been minimized.

Member

ashkulz commented Jan 10, 2015

0.12.2 has been released, which includes changes related to this issue.

@Ognian

This comment has been minimized.

Ognian commented Jan 11, 2015

I tried with 0.12.2 but unfortunately it doesn't solve my issues...
Any idea when a 0.13 alpha for OS X will be available, so that I can test with that?
Thanks
Ognian

@ashkulz

This comment has been minimized.

Member

ashkulz commented Jan 11, 2015

0.13 alpha won't help you, see the release plan. I'll have to port all these patches to the 0.13 version (which should be done at 0.13 rc). Your problem is basically that

when page-break-inside: avoid for rows is used, lines within a row may break in between pages

which is a consequence of the way the patch was done -- in RenderBlock::layoutRunsAndFloats the position of the line is not adjusted if it is a table cell which has pagination -- which causes the break. There's no easy way to avoid it.

Please open up a new issue to track this.

@richardwan

This comment has been minimized.

richardwan commented Feb 2, 2015

was having the same issue, everything fixes when i add a style table tr {page-break-inside: avoid}

@ghprod

This comment has been minimized.

ghprod commented Feb 1, 2016

Thanks @repher , your solution works for me

tr { page-break-inside: avoid; }

Its nice ..

image

Even with tfoot

image

👍

@montao

This comment has been minimized.

montao commented Oct 20, 2017

Your "solutions" don't work in my case. I get both your issues (1) Overlapping table headers (2) Lineabreaks which b0rk the table content. I regret becoming a programmer because of CSS.

@akhileshsaipangallu

This comment has been minimized.

akhileshsaipangallu commented Nov 13, 2017

I was facing the same issue add after a lot of trial n errors this css solved the issue

tr {
display: inline-table;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment