Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement support for FileAttachment annotations #6988

Merged
merged 3 commits into from
Feb 24, 2016

Conversation

timvandermeij
Copy link
Contributor

Fixes #6276.

The file from the unit testing commit can be used for testing, as well as the file in #6276 and the extra file I attach here (thanks to Terry Corbet for providing it): FileAtt_ex_3.8_01.pdf

@Snuffleupagus Since you were so kind to review my previous annotation layer PRs, would you be have time to review this one too? If not, feel free to unassign yourself.

@Snuffleupagus
Copy link
Collaborator

Sure, I'll try to review this as soon as possible!
But one question before doing so: Does this actually work in the Firefox addon/built-in version?
The reason for asking, is that we're not using the "standard" DownloadManager there, but rather a special one, please see https://github.com/mozilla/pdf.js/blob/master/web/firefoxcom.js#L77-L119.

var pdfUrl = combineUrl(window.location.href,
'../pdfs/annotation-fileattachment.pdf');
loadingTask = PDFJS.getDocument(pdfUrl);
waitsForPromiseResolved(loadingTask.promise, function(pdfDocument) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small suggestion, to make this code a bit simpler. Remember that you can chain promises, i.e. this could instead become:

waitsForPromiseResolved(loadingTask.promise, function (pdfDocument) {
  return pdfDocument.getPage(1).then(function (pdfPage) {
    return pdfPage.getAnnotations().then(function (pdfAnnotations) {
      annotations = pdfAnnotations;
    });
  });
});

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this, but unfortunately it does not seem to work. If I don't explicitly use waitsForPromiseResolved everywhere, the annotations variable is undefined when we reach the unit test.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this?

      waitsForPromiseResolved(loadingTask.promise, function (pdfDocument) {
        return pdfDocument.getPage(1).then(function (pdfPage) {
          return pdfPage.getAnnotations().then(function (pdfAnnotations) {
            annotations = pdfAnnotations;
            return Promise.resolve();
          });
        });
      });

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really strange, but that does not work either. This is the result:

TEST-UNEXPECTED-FAIL | should correctly parse a file attachment | TypeError: annotations is undefined in http://127.0.0.1:63625/test/unit/annotation_layer_spec.js (line 215)

There is #6905 for replacing waitsForPromiseResolved when we upgrade Jasmine, so maybe we can live with it for now.

@Snuffleupagus
Copy link
Collaborator

I did some quick testing with the PDF file from https://bugzilla.mozilla.org/show_bug.cgi?id=1230933, and it appears that there are some edge-cases left to handle. For example, the annotation icons doesn't show up, and the file names have weird initial characters (e.g ..__..__License.rtf).

Also, two quick observations when comparing with Adobe Reader.

  • Using your attached PDF file. When hovering over the icon, there is an box visible with text and file name (perhaps a Popup annotation?), that this patch doesn't seem to have. Is that intentional?
  • Adobe Reader also seem to add files from annotations to the "Attachments" tab. I realize that it might not be simple to implement that, but is it something that we want to try and emulate?

@Snuffleupagus
Copy link
Collaborator

Does this actually work in the Firefox addon/built-in version?

It seems the answer is no :-(
I think that you need to pass in a DownloadManager instance in annotation_layer_builder.js, similar to how the LinkService is handled.

For example, the annotation icons doesn't show up, and the file names have weird initial characters (e.g ..__..__License.rtf).

The missing icons here are probably connected with missing AP entries, so this should be covered by issue #6810.
As far as the file names goes, it seems that the PDF file itself sets them to ..//..//License.rtf. Do we want to follow Adobe Reader here, and ignore the initial characters?

@timvandermeij
Copy link
Contributor Author

Thank you for the thorough checks! Do you mean that we should not move the DownloadManager to the core, but instead keep it in the web folder and handle binding the links in annotation_layer_builder.js? And does the FirefoxCom DownloadManager override the default download manager when it is used? I'll look into the other issues. Indeed the missing AP entries are related to the mentioned issue.

@Snuffleupagus
Copy link
Collaborator

Do you mean that we should not move the DownloadManager to the core, but instead keep it in the web folder and handle binding the links in annotation_layer_builder.js?

In https://github.com/mozilla/pdf.js/blob/master/web/annotation_layer_builder.js#L62, we're passing in a PDFLinkService instance. Shouldn't passing a DownloadManager instance in the same way work, and still let you handle the binding in src/display/annotation_layer.js?
Edit: So, no need to move the DownloadManager code.

And does the FirefoxCom DownloadManager override the default download manager when it is used?

Yes, note how it's not included in Firefox: https://github.com/mozilla/pdf.js/blob/master/web/viewer.js#L61-L63.

@timvandermeij
Copy link
Contributor Author

Oh, I see now! Thanks, I'll make the necessary adjustment for this PR soon.

@timvandermeij timvandermeij force-pushed the fileattachment-annotation branch 4 times, most recently from 9c564a0 to f9785f2 Compare February 15, 2016 14:28
@timvandermeij
Copy link
Contributor Author

I'm going over your feedback to make sure I addressed it all:

  • I have removed the commit that moved the download manager to the core and made the corresponding changes to the other commits to use the already available download manager code and simply pass it to the annotation layer builder. It should now work in the add-on too.
  • I could not get the promise chaining working as indicated in the commit. I think we can keep it like this, unless you have another idea?
  • We now use stringToBytes in the unit test to reduce the amount of code.
  • Support for appearance streams is tracked in Support annotations without appearance streams #6810 and is therefore out of scope for this patch.
  • The issue with the weird initial characters is solved in the third commit. More explanation is available in the commit message. It is in a separate commit because of the move of the utility function, to not make the first commit too large.
  • The popups are now available. I don't know how I missed that, but handling them is very similar to how it's handled for Text annotations. It is not common for FileAttachment annotations to have separate Popup annotations, but there are generators that do that (for example http://www.gnostice.com/nl_article.asp?id=193&t=Embed_Files_As_Attachments_To_PDF_Using_Java), so we handle that too.
  • Adding the attachments to the attachments tab is rather difficult at this point, so I suggest we open a follow-up issue for that after this patch lands. It's not important for the scope of this PR, more like an extra feature.

Could you check the commits again?

@@ -757,7 +757,8 @@ var PDFViewer = (function pdfViewer() {
return new AnnotationLayerBuilder({
pageDiv: pageDiv,
pdfPage: pdfPage,
linkService: this.linkService
linkService: this.linkService,
downloadManager: new DownloadManager()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not too sure that this will work with the viewer components.

You probably need to add PDFJS.DownloadManager = DownloadManager; to https://github.com/mozilla/pdf.js/blob/master/web/pdf_viewer.component.js, and pass in a new DownloadManager() when initializing PDFViewer, instead of initializing it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in the new commit. Since I'm not too familiar with the components, could you check if this works? I ran node make generic components and inspected the code. I see the DownloadManager class there now, so I think this will work.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still wondering if it wouldn't make more sense to pass it into PDFViewer, similar to this: https://github.com/mozilla/pdf.js/blob/master/web/pdf_viewer.js#L42. And the do (in PDFViewer), this.downloadManager = options.downloadManager || null;

The reason for asking is that DownloadManager won't be available in the pageviewer component (the LinkService is stubbed out too, so I wouldn't worry about that).
Similarily, you probably want to check that it's defined before calling it here: https://github.com/mozilla/pdf.js/pull/6988/files#diff-1097e3b131ba22f8e8a6c2f20e3ab5ecR786.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I should be done like you suggested in the new commit.

@timvandermeij timvandermeij force-pushed the fileattachment-annotation branch 7 times, most recently from 909fcd4 to 91e55be Compare February 15, 2016 15:52
This is required to be able to use it in the annotation display code,
where we now apply it to sanitize the filename of the FileAttachment
annotation. The PDF file from https://bugzilla.mozilla.org/show_bug.cgi?id=1230933 has shown that some PDF generators include the path of the file rather than the filename, which causes filenames with weird initial characters. PDF viewers handle this differently (for example Foxit Reader just replaces forward slashes with spaces), but we think it's better to only show the filename as intended.

Additionally we add unit tests for the `getFilenameFromUrl` helper
function.
@timvandermeij
Copy link
Contributor Author

@Snuffleupagus Yes, sorry for the delay on this. I have implemented and tested your two ideas and they look really good. I have also updated the patch to do the same as 266cedd did.

@timvandermeij
Copy link
Contributor Author

/botio-linux preview

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_preview from @timvandermeij received. Current queue size: 0

Live output at: http://107.21.233.14:8877/dba2a25c1982c7a/output.txt

@timvandermeij
Copy link
Contributor Author

/botio test

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_test from @timvandermeij received. Current queue size: 0

Live output at: http://107.21.233.14:8877/eb88712052ad642/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @timvandermeij received. Current queue size: 0

Live output at: http://107.22.172.223:8877/d35d3b1fc907736/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://107.22.172.223:8877/d35d3b1fc907736/output.txt

Total script time: 20.44 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot
Copy link

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/eb88712052ad642/output.txt

Total script time: 21.84 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@Snuffleupagus
Copy link
Collaborator

Looks good, thank you for the patch!

/botio makeref

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_makeref from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.22.172.223:8877/509eeb6d2acb2a1/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_makeref from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.21.233.14:8877/315e64df63bf6ad/output.txt

Snuffleupagus added a commit that referenced this pull request Feb 24, 2016
Implement support for FileAttachment annotations
@Snuffleupagus Snuffleupagus merged commit 41efb92 into mozilla:master Feb 24, 2016
@pdfjsbot
Copy link

From: Bot.io (Linux)


Failed

Full output at http://107.21.233.14:8877/315e64df63bf6ad/output.txt

Total script time: 13.78 mins

  • Lint: Passed
  • Make references: FAILED

@Snuffleupagus
Copy link
Collaborator

/botio-linux makeref

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_makeref from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.21.233.14:8877/f08d181bcc678a8/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://107.22.172.223:8877/509eeb6d2acb2a1/output.txt

Total script time: 19.80 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

@pdfjsbot
Copy link

From: Bot.io (Linux)


Failed

Full output at http://107.21.233.14:8877/f08d181bcc678a8/output.txt

Total script time: 13.76 mins

  • Lint: Passed
  • Make references: FAILED

@Snuffleupagus
Copy link
Collaborator

/botio-linux makeref

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_makeref from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.21.233.14:8877/7248b3cbd0f6f3c/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/7248b3cbd0f6f3c/output.txt

Total script time: 21.02 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

@Snuffleupagus
Copy link
Collaborator

Adding the attachments to the attachments tab is rather difficult at this point, so I suggest we open a follow-up issue for that after this patch lands. It's not important for the scope of this PR, more like an extra feature.

Actually, I'm not sure if this would be too difficult to implement :-)
Given that we use incremental loading/rendering in the viewer, we obviously cannot add all FileAttachments when the document loads (since that would require us to check all pages).

However, very recently, I started re-factoring the sidebar code since it seemed in need of some clean-up.
I just realized that based on the new code I've been writing, appending FileAttachments to the sidebar once they become available (i.e. the corresponding page is rendered) is actually feasible!

The current state of this work is available here: master...Snuffleupagus:PDFSidebar.

@timvandermeij
Copy link
Contributor Author

Thank you for reviewing this! I have opened a new issue for tracking the sidebar feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants