Add more thorough searching for :MediaBox entry #468

Closed
wants to merge 1 commit into
from

4 participants

@krishicks

This is a solution, though perhaps not the best one, for this issue in prawnpdf/prawn: #386

I think it might be relevant to change the search in document.state.store to be the following:

def media_box
  if dictionary.data.has_key?(:MediaBox)
    dictionary.data[:MediaBox]
  else
--  document.state.store.detect { |ref|
++  document.state.store.to_a.reverse.detect { |ref|
      ref.data.has_key?(:MediaBox)
    }.data[:MediaBox]
  end
end

But I think that would only matter if you want to search the previous pages in the order they were added (if my understanding of how ObjectStore works is correct).

I'm happy to take suggestions or a full rewrite from someone that knows the internals of Prawn better.

Cheers.

@bradediger
prawnpdf member

The code looks fine to me. @yob any concerns or suggestions?

@yob
prawnpdf member
yob commented Apr 5, 2013

If I'm reading the diff correctly it's just searching for a :MediaBox entry anywhere in the file and using that?

That will work fine for most PDFs with consistent page sizes, but only through luck.

Page objects are leaf nodes in a tree of Pages objects, and walking up the tree until a MediaBox is found is the correct thing to do.

@krishicks

Indeed, it does just search for any MediaBox entry.

@yob: I'm curious to know what a better solution is. Given no MediaBox entry for a PDF, should a proper exception be raised stating that the PDF cannot be used as a template, or is there something else that can be done to work around the fact that it doesn't have one?

@yob
prawnpdf member
yob commented Apr 9, 2013

Am I reading the diff wrong? It looks like it replaces the search up the tree with a "select any MediaBox from the document" search.

Are you certain the PDF you're trying to use as a template has no MediaBox, even in the parent Pages objects? That's quite rare, but I guess possible.

@yob
prawnpdf member
yob commented Apr 9, 2013

Missing a MediaBox is against the spec, but Adobe Acrobat allows it and assumes the MediaBox is 0,0,612,792 (US Letter).

I think we should keep the inheritance of page attributes as per the spec and assume US Letter in the very rare case that a PDF without a MediaBox is used as a template.

@bradediger
prawnpdf member

Thanks for catching that, @yob. Yes, upon further inspection, that is my reading of the diff as well, and does not appear to be correct as it could catch the MediaBoxes of siblings (or unrelated branches) in the page tree.

@bradediger
prawnpdf member

I think we should keep the inheritance of page attributes as per the spec and assume US Letter in the very rare case that a PDF without a MediaBox is used as a template.

Yes, I agree. So does that "very rare case" cover the issue that this ticket was opened to cover, @krishicks?

@krishicks

I'm happy with that result. I wasn't really happy with the fix I proposed, as it still allowed for the case where the method would return nil. The method should never return nil. Making it return [0,0,612,792] as a default sounds fantastic. I'll update the pull request later.

@practicingruby
prawnpdf member

Can someone confirm for me whether this is a purely template related issue, or whether it has uses outside of templates as well?

@practicingruby
prawnpdf member

Closing out as a templates-related issue that needs revision. Please re-open if and when revisions are made //cc @cheba

@krishicks

This is a templates-only problem as only when a template is used is the MediaBox potentially missing.

I see a few problems:

  • Page#init_from_object throws away options set in start_new_page, such as size, layout, margins. This means that it's not even possible to hint when the size is known but the MediaBox is missing
  • Page#inherited_dictionary_value may return nil, which causes Document#generate_margin_box to explode

And a new problem I've discovered with another PDF:

  • The MediaBox returned by PDF::Reader may be a PDF::Reader::Reference which needs to be dereferenced
@krishicks krishicks referenced this pull request Dec 2, 2013
Closed

Better MediaBox Support #593

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment