Object Renderer link placement #477

Closed
hbergmey opened this issue May 12, 2020 · 19 comments

@hbergmey

hbergmey commented May 12, 2020

As an outcome of #475 I am now rendering a tree graph using a custom object renderer based on Graphics2D, but I am having extreme difficulty getting the hyperlink boxes for the nodes in my graph right. They link to description pages in my document. By watching where the mouse pointer turns into a finger icon in Acrobat Reader, I can see that all link shapes seem to cover almost the whole region covered by the object. It does not matter where I click, it opens the same link, so this single link covers the whole region.

The display itself works fine and I am able to scale the graph to fit it in the page width. The graph consists of nodes connected by arrows. Each node is built from a Swing JPanel that lays out JLabels for Node ID, title, node information and outgoing edge ports using GridBagLayout. The nodes themselves are organized in a large JPanel using two nested GridBagLayouts. Again positioning works great and using SwingUtilities.convertRectangle I am able to determine the exact position relative to the main panel of each Title and NodeID label that is supposed to carry a link.

To collect the link shapes, I have extended the JLabel component with an interface that provides the URL. So during rendering in the custom object drawer, I am able to collect all linked components after layout completes, calculate their bounding shapes relative to the Graphics2D and create the link map, with shapes as keys and URLs as values.
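
The collection step described above could look roughly like the following sketch. The names `Linked`, `LinkedLabel` and `LinkShapeCollector` are hypothetical and not taken from the actual project; the only real API used is `SwingUtilities.convertRectangle`, and the sketch assumes the component tree has already been laid out.

```java
import java.awt.Component;
import java.awt.Container;
import java.awt.Rectangle;
import java.awt.Shape;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.swing.JLabel;
import javax.swing.SwingUtilities;

/** Hypothetical marker interface: a component that carries a link target. */
interface Linked {
    String getUrl();
}

/** Hypothetical JLabel subclass that exposes its link target. */
class LinkedLabel extends JLabel implements Linked {
    private final String url;

    LinkedLabel(String text, String url) {
        super(text);
        this.url = url;
    }

    @Override
    public String getUrl() {
        return url;
    }
}

final class LinkShapeCollector {
    /** Maps each linked component's bounds, expressed in root coordinates, to its URL. */
    static Map<Shape, String> collect(Container root) {
        Map<Shape, String> links = new LinkedHashMap<>();
        collectInto(root, root, links);
        return links;
    }

    private static void collectInto(Container root, Container current, Map<Shape, String> links) {
        for (Component child : current.getComponents()) {
            if (child instanceof Linked) {
                // getBounds() is relative to the parent, so convert it to the root panel's space.
                Rectangle inRoot = SwingUtilities.convertRectangle(current, child.getBounds(), root);
                links.put(inRoot, ((Linked) child).getUrl());
            }
            if (child instanceof Container) {
                // Descend into nested panels (all Swing components are Containers).
                collectInto(root, (Container) child, links);
            }
        }
    }
}
```

Note that two linked labels with identical bounds would collapse into one map entry, because `Rectangle` compares by value.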

I have verified that the positions of the collected shapes match the drawn IDs and titles by drawing them as red boxes at the end of graph drawing. They cover exactly the areas that are supposed to be the hotspots of the links. But if I return the shape-to-URL map and let openhtmltopdf render even just the first link, its hotspot covers almost the whole page instead of the title it is supposed to cover.

I have tried several corrective transformations, even transforms I would expect to result in an extremely small region in the upper left corner on the page, but it does not change anything. It is still the whole region reacting to the click.

Since it does not seem to matter, how large or small I transform the collected link shapes, I think this is a bug and I do not have a clue, how to get it right.

@hbergmey
Author

It appears to me as if the complete object element becomes linked, instead of the individual shapes rendered by its object renderer. For the following screenshot I have placed the mouse pointer on top of the graph object.

[image]
It is scaled extremely small to obfuscate the proprietary info. Sorry for that, but I think what I would like to show still becomes apparent.

You see a couple of nodes organized in columns. The red marks depict the link shapes I am returning from drawObject. The object is highlighted in yellow, as this is how Firefox's internal PDF viewer styles hovered links. The object was laid out as follows:

      <object type="custom/decisiontreegraph" treeId="585" title="585"
              style="width:50%;height:200px;">
      </object>

One aside: apart from the whole object being highlighted as a link, the scaling in the PDF apparently does not match the defined style. The graph object is wider than 50%. It appears as if the canvas width is calculated for landscape orientation, even though I have set "a4 portrait" in all stylesheets and PageFormats. The width matches 50% only if I multiply the render scale by 0.707, which is the width-to-height ratio of the DIN paper formats, 1/sqrt(2).

Back to the topic. I experimentally limited the shape-link export to only the first element in the upper left corner. The link shape still fills the whole object frame. Then I applied a scale factor of 0.0001 to each link shape. The whole object was still highlighted, and as a single link only. Then I reduced the scale even further and found that at a scale of 0.0000033 the whole link suddenly disappeared. Maybe I have reached the limit of float precision.

So what am I doing wrong? Does each element that should be rendered as a link require some kind of explicit data object as a child of the object element? Do I have to return a specific kind of Shape implementation instead of an AffineTransform-scaled JLabel or its getBounds() result?

@hbergmey
Author

I have debugged my way into PdfBoxFastLinkManager.checkLinkArea which returns the Rectangle2D for the hotspot and generates a "key" for the linked target. Actually the rectangle is always the bounding box of the object, not of the linked shape. I do not fully understand the purpose of the "key". As far as I understand, it serves to recognize and prevent overlaps. Or does it define the actual hot spot for the PDF reader?

From PdfBoxFastLinkManager.placeAnnotation(AffineTransform transform, Shape linkShape, Rectangle2D targetArea, PDAnnotationLink annot) I get the impression that the annotation is always generated for the area of the object, not of the shape. Still, I am seeing QuadPoints being set.
[image]
[image]

@danfickle, @rototor, do you have an idea whether this makes any sense? The final result shows that the whole Rect works as a link and that the area marked by the QuadPoints seems to be irrelevant.
[image]

@hbergmey
Author

One more thing: the QuadPoints values appear to be outside of the Rect in the last screenshot. But maybe these values are OK if the reader knows the transform, too. The transform does not seem to be part of the annotation.
[image]
I do not know enough about the semantics of the QuadPoints to validate these values.

@rototor
Contributor

rototor commented May 12, 2020

I just had a quick look; something seems to be broken here. My FreeMarker example with the JFreeChart pie no longer has correct shapes for the pie. See featuredocumentation.ftl
It used to work...

Just for your understanding:

  • A Hyperlink is always an annotation in the PDF.
  • Most PDF Readers only use / respect the rectangle of the Annotation.
  • Only the Acrobat Reader correctly respects the QuadPoints. The QuadPoints define the real shape of the Annotation within the rectangle.
  • The Shape is converted to a list of triangles, which then are exported to form a quad each. See PdfBoxFastLinkManager.mapShapeToQuadPoints().

See also https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf - 12.5.6.5
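
To make the last bullet concrete, here is a minimal sketch of how one triangle could be packed into the eight numbers of one quadrilateral by repeating its last vertex. This is an illustration under assumptions and not the code of PdfBoxFastLinkManager.mapShapeToQuadPoints(); the class name `QuadPointsSketch` is hypothetical, and the vertex order that a particular viewer expects is not addressed here.

```java
import java.awt.geom.Point2D;
import java.util.List;

final class QuadPointsSketch {
    /** Packs one triangle into a degenerate quad (x1 y1 x2 y2 x3 y3 x4 y4) by repeating the last vertex. */
    static float[] triangleToQuad(Point2D a, Point2D b, Point2D c) {
        return new float[] {
            (float) a.getX(), (float) a.getY(),
            (float) b.getX(), (float) b.getY(),
            (float) c.getX(), (float) c.getY(),
            (float) c.getX(), (float) c.getY()
        };
    }

    /** Concatenates the quads of all triangles (three vertices each) into one flat array. */
    static float[] toQuadPoints(List<Point2D[]> triangles) {
        float[] out = new float[triangles.size() * 8];
        int offset = 0;
        for (Point2D[] tri : triangles) {
            float[] quad = triangleToQuad(tri[0], tri[1], tri[2]);
            System.arraycopy(quad, 0, out, offset, 8);
            offset += 8;
        }
        return out;
    }
}
```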

I'll try to investigate that, but I can not tell you exactly when I will find time. Maybe it got broken when porting to the "Fast" mode.

@hbergmey
Author

Thank you for checking. Well, Adobe Reader unfortunately behaves the same as the Firefox internal PDF viewer. I just chose the latter for the screenshot because it highlights links.

I am currently diving into the PDF specification about annotations and this raises a number of questions.
https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/pdf_reference_archives/PDFReference.pdf#%5B%7B%22num%22%3A631%2C%22gen%22%3A0%7D%2C%7B%22name%22%3A%22XYZ%22%7D%2C88%2C418%2Cnull%5D

For example, F=4 means NoZoom:

If the NoZoom flag is set, the annotation always maintains the same fixed size on the screen and is unaffected by the magnification level at which the page itself is displayed. - Chapter 8, "Interactive Features", p. 494

Should the shape of the annotation not zoom with the page?
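
For cross-checking an F value, a small decoding sketch may help. It assumes the bit positions of the annotation flags table in the PDF specification (bit 1 = Invisible, 2 = Hidden, 3 = Print, 4 = NoZoom, and so on); the class name `AnnotationFlags` is hypothetical. Under that table, F=4 would decode to Print and NoZoom would correspond to F=8, so the reading above is worth double-checking.

```java
import java.util.ArrayList;
import java.util.List;

final class AnnotationFlags {
    // Flag names by bit position (index 0 = bit 1), as assumed from the annotation flags table.
    private static final String[] NAMES = {
        "Invisible", "Hidden", "Print", "NoZoom", "NoRotate",
        "NoView", "ReadOnly", "Locked", "ToggleNoView", "LockedContents"
    };

    /** Returns the names of all flags that are set in the integer value of the F entry. */
    static List<String> decode(int f) {
        List<String> set = new ArrayList<>();
        for (int bit = 0; bit < NAMES.length; bit++) {
            if ((f & (1 << bit)) != 0) {
                set.add(NAMES[bit]);
            }
        }
        return set;
    }
}
```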

And Adobe's own specification does not define QuadPoints as supported for annotations of type "link", only for Freetext-Markup.

@hbergmey
Author

There's something strange happening with the transformations where the QuadPoints are mapped. I understand why the coordinate system must be mirrored vertically, but I am unsure where the original scale and translation originate from, and whether the transformations are applied in the correct order.

I managed to see a few link rects after dropping the translation part from the transform, but the scales were not correct and the order of columns seemed randomized. It appears as if somewhere the x and y axes had been confused, scaled, and a quantization applied, maybe through casting from double to float or float to int.
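
To illustrate why the order matters, here is a small sketch using plain java.awt.geom.AffineTransform. It is not taken from openhtmltopdf; the class name `TransformOrderSketch` and the page height of 842 points (A4) in the usage note are assumptions for illustration.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

final class TransformOrderSketch {
    /** Maps a point from a top-left-origin space to a bottom-left-origin page: y' = pageHeight - y. */
    static AffineTransform flipY(double pageHeight) {
        AffineTransform t = AffineTransform.getTranslateInstance(0, pageHeight);
        // scale() appends, so the mirror is applied to the point first, then the translation.
        t.scale(1, -1);
        return t;
    }

    /** Point is translated first, then scaled: x' = s * (x + dx). */
    static AffineTransform translateThenScale(double s, double dx) {
        AffineTransform t = new AffineTransform();
        t.scale(s, s);
        t.translate(dx, 0);
        return t;
    }

    /** Point is scaled first, then translated: x' = s * x + dx. */
    static AffineTransform scaleThenTranslate(double s, double dx) {
        AffineTransform t = new AffineTransform();
        t.translate(dx, 0);
        t.scale(s, s);
        return t;
    }

    static Point2D apply(AffineTransform t, double x, double y) {
        return t.transform(new Point2D.Double(x, y), null);
    }
}
```

With these helpers, `apply(flipY(842), 100, 50)` yields (100, 792), while the point (5, 0) ends up at x = 30 under `translateThenScale(2, 10)` but at x = 20 under `scaleThenTranslate(2, 10)`. The same pair of operations therefore gives different hotspots depending on the order in which they are concatenated.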

I still do not have a real trace. Any help is much appreciated.

@hbergmey
Author

@rototor

I'll try to investigate that, but I can not tell you exactly when I will find time. Maybe it got broken when porting to the "Fast" mode.

The issue is equally present in "Slow" mode, too.

Like always in developer reality, when issues arise, they are instantaneously pressing. :) But I am in no position to press on you. I am already impressed and thankful for how quickly you and Dan responded to my issue.

I will gladly help on solving the issue, but I am having a really hard time, understanding the way the transformation context is developing while traversing the input and in which coordinate space I am on what level of the scene graph. I've already forked openhtml2pdf and managed to compile the library locally. Some tests are failing, though.

@rototor
Contributor

rototor commented May 13, 2020

Thanks to git bisect I know for sure that 1ffd271 broke the shape creation. That does not make much sense from just looking at the commit, as it seems fine. I'll investigate further.

rototor added a commit to rototor/openhtmltopdf that referenced this issue May 13, 2020
For whatever reason we only want the DPI scaling from the given transform, and
not any other property of the transform.

When adding a link to process later, the transform has at that moment also a translate in it.

Before 1ffd271 the used transform was after the last page was
processed. At that moment the transform only had a scale component and no translate.
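
As a rough illustration of "only the scaling, no translate", a transform can be reduced to its axis scale factors as sketched below. This is not the code of the referenced commit; the class name `ScaleOnly` is hypothetical, and the sketch assumes the transform contains no rotation or shear, since getScaleX()/getScaleY() only read the diagonal matrix entries.

```java
import java.awt.geom.AffineTransform;

final class ScaleOnly {
    /** Returns a new transform that keeps only the axis scale factors of t and drops its translation. */
    static AffineTransform of(AffineTransform t) {
        // Assumption: t has no rotation or shear, so the diagonal entries are the true scale factors.
        return AffineTransform.getScaleInstance(t.getScaleX(), t.getScaleY());
    }
}
```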
@rototor
Contributor

rototor commented May 13, 2020

I found a fix for this, see the pull request #480

I'm not 100% sure that this is always the correct fix, but at least it fixes my test case. And it "mildly" makes sense.

@hbergmey
Author

hbergmey commented May 13, 2020

I applied that to my local clone and, yes, it works for my graph, too. So it is at least more correct than before. :)

@hbergmey
Author

So do you think, this fix could be released soon?

@hbergmey
Author

Two more things still worry me:

  • If it only works in Acrobat Reader I might run into trouble, when the documents are viewed in Browsers. So how complex is it, to annotate the link shapes instead of the replaced object and does that cause any other problems, I might not be aware of?
  • How can I render the target page numbers next to the link, as I do not have the CSS counter macro available in object to draw. Otherwise, for printing I will have to provide tables with page numbers for the nodes.

@rototor
Contributor

rototor commented May 14, 2020

Two more things still worry me:

  • If it only works in Acrobat Reader I might run into trouble, when the documents are viewed in Browsers. So how complex is it, to annotate the link shapes instead of the replaced object and does that cause any other problems, I might not be aware of?
    This is something I actually thought about when waking up today ;) The annotation rectangle could be reduced to the intersection of the current annotation rectangle and the bounding box of the shape. That would match rectangular shapes perfectly, but would also help for other shapes. I'll look into that.
  • How can I render the target page numbers next to the link, as I do not have the CSS counter macro available in object to draw. Otherwise, for printing I will have to provide tables with page numbers for the nodes.

At the time the objects are drawn, the following pages have not yet been laid out. So there is no way to know at that time what page number a link target will get.

But if you are willing to go an extra mile you should be able to get the target page numbers:

  • You first create the PDF from your document, at that time without page numbers for the link targets.
  • You can then inspect the PDF using PDFBox and get the page numbers for the links. To understand what to do for that you should use the PDFBox Debugger and inspect the document.
    [image]
    Ok, here it gets really tricky. The A entry has the reference to the target page, so you can use that to get the page number (document.getPages().indexOf( page)). But you need to identify your source link that resulted in the annotation to be generated. You only have the Quads and Rect for that. And they are scaled and transformed. You should have some internal list from your first run you then use to identify the links. But this will only work if the Rect is the bounding box of the Quads.

I'll try to implement the reduced bounding box for the annotation. After that you could try implementing that.

If generating the PDF twice is no option for you, you will need to build some kind of table of content yourself.

@hbergmey
Author

hbergmey commented May 14, 2020

The rectangle intersection sounds like an adequate solution. Just ping me, when I can apply a patch or if you need help.

Yes rendering page numbers is a multi-pass task in dynamic layouts. If I remember correctly, LaTeX requires three passes to ensure line- and page-breaks are corrected after insertion of the page number references, because they themselves change the text length again. And then you still have no guarantee to have the optimal solution.

How does OpenHtmlToPdf accomplish this in general? Placeholders? Two-Pass rendering?

I wonder whether generating an abstract intermediate tree before materializing the actual PDF stream would help. The intermediate representation could use extended implementations of Rects managed by a factory, which "know" that they are not final until the first pass has completed. That's the method I am using for calculating arrow endpoints in the graph. The nodes have to be laid out first, because I do not even know their sizes and order before. And SwingUtilities can calculate absolute positions for me, implicitly flattening the transformation graph from the container hierarchy. I am not afraid to parse the PDF tree; it is just always less error-prone to save additional information explicitly when generating it, instead of parsing and interpreting intermediate results in a foreign format.

rototor added a commit to rototor/openhtmltopdf that referenced this issue May 14, 2020
Yes, those classes are copies, so there is duplication. But before extending this method
I rather have it only in one place.
rototor added a commit to rototor/openhtmltopdf that referenced this issue May 14, 2020
@rototor
Contributor

rototor commented May 14, 2020

I've implemented this reduced bounding box in the #480 pull request branch. You can test it there.

I know that the link annotations are placed in the pages at the very end of the process, after all pages have been generated, and that the page break logic is rather complex and also modifies the render DOM in place while performing page breaks. I.e. an object that spans multiple pages will be drawn multiple times, once for every page it is on, with changed positions. I think that is a little bit dirty, but it has been that way for a very long time (it was already like that in Flying Saucer). I have no idea how the CSS counter stuff is implemented.
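
The idea of the reduced rectangle can be sketched with plain java.awt.geom as follows. This is only an illustration and not the code from the pull request; the class name `ReducedLinkRect` is hypothetical, and it assumes the object rectangle and the link shape are given in the same coordinate space.

```java
import java.awt.Shape;
import java.awt.geom.Rectangle2D;

final class ReducedLinkRect {
    /** Intersects the object's rectangle with the bounding box of the link shape. */
    static Rectangle2D reduce(Rectangle2D objectRect, Shape linkShape) {
        Rectangle2D shapeBounds = linkShape.getBounds2D();
        // If the two do not overlap, the result has a negative width or height and isEmpty() is true.
        return objectRect.createIntersection(shapeBounds);
    }
}
```

For a rectangular link shape inside the object, the result is exactly the shape itself; for other shapes it is their bounding box, clipped to the object.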

@hbergmey
Author

Awesome! Link placement now works in Firefox as well. The yellow highlight marks the hotspot and it is placed precisely where it is supposed to be.
[image]
(Let's ignore FF's zoomed rendering resolution, which is quite underwhelming.)

@hbergmey
Author

And Adobe Reader still works, too.

@hbergmey
Author

About the printable page numbers: my first step will be a separate reference list, because I need a quick solution as part of explicitly splitting the graph for display on several pages with a guaranteed effective font size in print. So the number of nodes per page will be limited as well.

danfickle added a commit that referenced this issue May 16, 2020
@danfickle
Owner

Assumed solved by #480 - please reopen as required.
