Implementation of file-embed link. #640

Merged: 2 commits, Feb 1, 2021

danfickle (Owner) commented Jan 21, 2021

#508 #509 #636

Based heavily on code by @syjer whom I'm indebted to. Thanks.

Still todo:

  • Prevent duplicate file embeds. Possibly create a map of uris to PDComplexFileSpecification and reuse if encountered again. I have to peruse the PDF spec to see if this is allowed.
  • Logging on fail.
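
The duplicate-embed idea in the first todo item could look roughly like the sketch below. It is only an illustration: EmbedSpecCache and its loader are hypothetical names, and the generic type S stands in for PDFBox's PDComplexFileSpecification so the snippet compiles without PDFBox. Whether the PDF spec allows several links to share one file specification remains the open question noted in that item.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of this PR: keeps one file specification per URI.
// S stands in for PDComplexFileSpecification.
final class EmbedSpecCache<S> {
    private final Map<String, S> specsByUri = new HashMap<>();
    private final Function<String, S> loader;

    EmbedSpecCache(Function<String, S> loader) {
        this.loader = loader;
    }

    // Returns the cached spec for this URI, building it on first use only.
    S specFor(String resolvedUri) {
        return specsByUri.computeIfAbsent(resolvedUri, loader);
    }

    // Number of distinct URIs embedded so far.
    int size() {
        return specsByUri.size();
    }
}
```

Keying on the resolved URI rather than the raw href would mean that two relative links pointing at the same file share one entry.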

syjer (Contributor) commented Jan 21, 2021

Thank you @danfickle , I'll close the old PR :)

@danfickle (Owner, Author)

OK, here is a valid file embed link for PDF/A3 standards:

   <a href="file-embed.html"
      download="source.html"
      data-content-type="text/html"
      title="File embedded"
      relationship="Source">

      Link to source code

   </a>

The attributes are:

  • href: The file source. Required.
  • download: The file name as presented in the PDF viewer. Required.
  • data-content-type: The mime type of the embedded file. Defaults to application/octet-stream.
  • title: Optional short description of the file. Strongly recommended for PDF/A3 documents.
  • relationship: Should only be used in PDF/A3 documents. The relationship of the embedded file to the link. Must be one of the following:
    • Source - Example: source markup of the document.
    • Supplement
    • Data - Example tabular data.
    • Alternative - Example audio providing an alternative view of the material.
    • Unspecified - Fallback catch all.

NOTE: If the relationship attribute is present, the file's modification date is set to the server's current date (Calendar.getInstance()). The modification date is required by PDF/A3.
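
The attribute rules above can be sketched as a small validation helper. This is not the code from the PR: FileEmbedAttributes and its method names are hypothetical, and exact, case-sensitive matching of the relationship value is an assumption.

```java
import java.util.Optional;

// Hypothetical helper mirroring the attribute rules listed above; not the PR's code.
final class FileEmbedAttributes {
    static final String DEFAULT_CONTENT_TYPE = "application/octet-stream";

    // The five relationship values named in the attribute list.
    enum Relationship {
        SOURCE("Source"), SUPPLEMENT("Supplement"), DATA("Data"),
        ALTERNATIVE("Alternative"), UNSPECIFIED("Unspecified");

        final String attributeValue;

        Relationship(String attributeValue) {
            this.attributeValue = attributeValue;
        }
    }

    // Assumption: the attribute is matched exactly, including case.
    static Optional<Relationship> parseRelationship(String value) {
        if (value == null) {
            return Optional.empty();
        }
        for (Relationship r : Relationship.values()) {
            if (r.attributeValue.equals(value)) {
                return Optional.of(r);
            }
        }
        return Optional.empty();
    }

    // Falls back to the documented default when data-content-type is absent or blank.
    static String contentTypeOrDefault(String value) {
        return (value == null || value.trim().isEmpty()) ? DEFAULT_CONTENT_TYPE : value.trim();
    }
}
```

For example, parseRelationship("Source") yields SOURCE, while an unknown value yields an empty Optional that the caller can log or reject.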

Useful links in writing this implementation:

@danfickle danfickle marked this pull request as ready for review February 1, 2021 12:26
@danfickle danfickle merged commit c808b63 into open-dev-v1 Feb 1, 2021
@danfickle danfickle deleted the file_embeds_for_508 branch February 1, 2021 12:29
houssainy commented Jan 23, 2023

@danfickle
I want to embed an XML file in the generated PDF/A3 file, but I don't want the link to appear in the rendered PDF. Is there a way to hide the <a> tag?

I tried several ways of hiding the tag, but all of them failed. I also tried an empty tag:

 <a href="file-embed.html"
      download="source.html"
      data-content-type="text/html"
      title="File embedded"
      relationship="Source">
   </a>

but this didn't work either: the generated PDF did not contain the expected attachment.

@maciejtoporowicz

Hello, does it work in fast mode? It only works for me in slow mode.

In fast mode I can see these logs:

com.openhtmltopdf.load WARNING:: URI file:///<path> with type FILE_EMBED was rejected by resource access controller
com.openhtmltopdf.load WARNING:: Was not able to load an embedded file for embedding with uri file:///<path>

The PDF is created and the link is rendered, but it's not clickable.

@maciejtoporowicz

Ok, figured it out. The DefaultAccessController used in com.openhtmltopdf.swing.NaiveUserAgent rejects embedded files.
I don't fully understand this mechanism, but it turns out to be configurable. I created my own "AccessController":

private static class AllowAllAccessController implements BiPredicate<String, ExternalResourceType> {
  // Allows every URI and resource type, including FILE_EMBED.
  @Override
  public boolean test(String uri, ExternalResourceType resourceType) {
    return true;
  }
}

and then configured the pdf renderer to use it:

new PdfRendererBuilder()
    .useExternalResourceAccessControl(new AllowAllAccessController(), ExternalResourceControlPriority.RUN_BEFORE_RESOLVING_URI)
    .useExternalResourceAccessControl(new AllowAllAccessController(), ExternalResourceControlPriority.RUN_AFTER_RESOLVING_URI)
    (...)
    .run();

Now it works fine.
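
Allowing everything also lifts the restriction for every other resource type and location, so a narrower predicate may be preferable. The following is a minimal sketch under stated assumptions: StandInResourceType is a stand-in for the library's ExternalResourceType so the snippet compiles on its own, and EmbedOnlyAccessController and the attachments directory are hypothetical.

```java
import java.util.function.BiPredicate;

// Stand-in for the library's ExternalResourceType so the sketch compiles alone.
enum StandInResourceType { FILE_EMBED, IMG }

// Hypothetical controller: allows file embeds only beneath one trusted URI prefix.
final class EmbedOnlyAccessController implements BiPredicate<String, StandInResourceType> {
    private final String allowedPrefix;

    EmbedOnlyAccessController(String allowedPrefix) {
        this.allowedPrefix = allowedPrefix;
    }

    @Override
    public boolean test(String uri, StandInResourceType resourceType) {
        return resourceType == StandInResourceType.FILE_EMBED
                && uri != null
                && uri.startsWith(allowedPrefix)
                && !uri.contains("..");   // crude guard against path traversal
    }
}
```

As written, this rejects everything that is not a file embed, so a real controller would also need a rule for images, fonts and other resource types. With a prefix of file:///srv/attachments/, an embed of file:///srv/attachments/invoice.xml passes while file:///etc/passwd is refused.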
