[Feature] Advanced Highlighting #61

Open
valh1996 opened this issue Jul 14, 2022 · 8 comments

valh1996 commented Jul 14, 2022

Hi,

I want to do extensive highlighting on the text of my PDF, but not all the features I need are supported by pdfjs-dist.

I would like the option to search with:

  • diacritics insensitive (unsupported), case insensitive (supported), entireWord search (supported)
  • several words to search, e.g. "alex" and "alice" (unsupported)
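
To make the diacritics-insensitive request concrete, here is a minimal sketch that works on a plain string rather than on pdf.js internals. The helper names (`stripDiacritics`, `findAllFolded`) are hypothetical and not part of pdfjs-dist or of this package. It folds case and strips combining marks, while remembering which original character each folded character came from, so the returned offsets point into the original text.

```typescript
// Offsets into the ORIGINAL (unfolded) text; `end` is exclusive.
interface Match {
  start: number;
  end: number;
}

// Decompose characters (NFD) and drop combining marks, e.g. "é" -> "e".
function stripDiacritics(text: string): string {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Hypothetical helper: case- and diacritics-insensitive search on a string.
function findAllFolded(text: string, query: string): Match[] {
  // Fold the text one code unit at a time, recording the original index
  // of every folded character so matches can be mapped back.
  let folded = "";
  const origIndex: number[] = [];
  for (let i = 0; i < text.length; i++) {
    const f = stripDiacritics(text[i].toLowerCase());
    for (let k = 0; k < f.length; k++) {
      folded += f[k];
      origIndex.push(i);
    }
  }

  const needle = stripDiacritics(query.toLowerCase());
  const matches: Match[] = [];
  if (needle.length === 0) {
    return matches;
  }

  let from = 0;
  while (true) {
    const at = folded.indexOf(needle, from);
    if (at === -1) {
      break;
    }
    matches.push({
      start: origIndex[at],
      end: origIndex[at + needle.length - 1] + 1,
    });
    from = at + needle.length;
  }
  return matches;
}
```

For example, `findAllFolded("Élodie et elodie", "elodie")` finds both spellings. Mapping such offsets onto the PDF text layer is a separate problem that this sketch does not address.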

My first proposal would be to expose PDFFindController so that it can be accessed via the $ref. I could then fork the library to add the functionality I need.

My other proposal would be to expose an event (like beforeRender) that would let us easily alter the text of the rendered PDF, but I am not sure this second option is possible.

Could you help me with this feature, please? I really need it, and your package seems to be the "best" available right now.

Author

valh1996 commented Jul 14, 2022

Owner

hrynko commented Jul 17, 2022

Hi @valh1996,

Do you think updating PDFJS to 2.13.216 would cover the diacritics insensitive search problem (check this PR)? As for "several words to search", I'm not sure how this is supposed to work.

Author

valh1996 commented Jul 18, 2022

> Hi @valh1996,
>
> Do you think updating PDFJS to 2.13.216 would cover the diacritics insensitive search problem (check this PR)? As for "several words to search", I'm not sure how this is supposed to work.

Hi @hrynko,

Yes, thanks, I think that would be perfect for the diacritics. However, how can I access the pdfFindController to execute a find with your package?

I think it should be possible to do it via the ref, by making pdfFindController public:

pdfEmbedRef.pdfFindController.executeCommand('find', {
  caseSensitive: false,
  findPrevious: undefined,
  highlightAll: true,
  phraseSearch: false,
  query: query
});

As for the "several words to search", the problem is that if I run the above find with the word "Alice" and then re-run it with the word "Alex", the second search overwrites all previous matches for "Alice".

But for that, I'm not sure you can do anything at your level. I could, for example, simply modify this part of the PDF.js lib with patch-package.
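
One way to picture the "several words" behaviour is to collect the matches for all terms in a single pass, so that a later term never replaces the matches of an earlier one. The sketch below does this on a plain string with one combined regular expression. `findTerms` and `escapeRegExp` are hypothetical names, and this is not how PDFFindController works internally.

```typescript
interface TermMatch {
  term: string;  // the matched text, lowercased
  start: number; // offset of the first matched character
  end: number;   // offset just past the last matched character
}

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: case-insensitive search for several terms at once.
function findTerms(text: string, terms: string[]): TermMatch[] {
  const cleaned = terms.filter((t) => t.length > 0);
  if (cleaned.length === 0) {
    return [];
  }
  // Longest terms first, so "alexander" is preferred over "alex"
  // when both could match at the same position.
  const sorted = cleaned.slice().sort((a, b) => b.length - a.length);
  const pattern = new RegExp(sorted.map(escapeRegExp).join("|"), "gi");

  const matches: TermMatch[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    matches.push({
      term: m[0].toLowerCase(),
      start: m.index,
      end: m.index + m[0].length,
    });
  }
  return matches;
}
```

Calling `findTerms("Alice met Alex. alice left.", ["alex", "alice"])` returns the matches for both names in document order, which is the result one would want the highlighter to keep.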

Therefore, could you please release an update that gives access to the find command? And when will a new version with the changes on master (for PDFJS > 2.13.216) be available?

EDIT: It looks like we now have to go through the eventBus to highlight the text, instead of calling executeCommand. Do you have an example of highlighting this way?
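
For reference, a hedged sketch of what going through the event bus might look like. It assumes that the find controller listens for a "find" event on the same event bus and reads fields similar to the executeCommand state shown above. The event name and the field names follow my understanding of the pdf.js default viewer and may differ between pdfjs-dist versions, so they should be checked against the installed release. `EventBusLike`, `FindRequest`, and `dispatchFind` are hypothetical names introduced for illustration.

```typescript
// Minimal shape of the event bus this sketch relies on (an assumption):
// any object with a compatible dispatch method will do.
interface EventBusLike {
  dispatch(eventName: string, data: Record<string, unknown>): void;
}

// Hypothetical request type; fields loosely mirror the find state above.
interface FindRequest {
  query: string;
  caseSensitive?: boolean;
  entireWord?: boolean;
  highlightAll?: boolean;
  findPrevious?: boolean;
  phraseSearch?: boolean;
}

// Hypothetical helper: send a new search request over the event bus.
function dispatchFind(
  eventBus: EventBusLike,
  source: unknown,
  request: FindRequest
): void {
  eventBus.dispatch("find", {
    source,   // whoever triggered the search
    type: "", // assumed to mean "start a new search"
    query: request.query,
    caseSensitive: request.caseSensitive ?? false,
    entireWord: request.entireWord ?? false,
    highlightAll: request.highlightAll ?? true,
    findPrevious: request.findPrevious ?? false,
    phraseSearch: request.phraseSearch ?? false,
  });
}
```

Even if the event shape matches, this only has an effect when a find controller is attached to that event bus and to rendered text layers, which is the open question in this thread.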

Owner

hrynko commented Jul 20, 2022

I've tried exposing something like the following, with no success so far:

import { EventBus, PDFFindController } from 'pdfjs-dist/legacy/web/pdf_viewer.js'
...
const findController = new PDFFindController({
  eventBus: new EventBus(),
  linkService: this.linkService,
})
findController.setDocument(this.document)

I'm not sure if this will work outside of PDFViewer yet, but if you could continue this experiment, I would appreciate a PR.

Author

valh1996 commented Jul 22, 2022

> I've tried exposing something like the following, with no success so far:
>
> import { EventBus, PDFFindController } from 'pdfjs-dist/legacy/web/pdf_viewer.js'
> ...
> const findController = new PDFFindController({
>   eventBus: new EventBus(),
>   linkService: this.linkService,
> })
> findController.setDocument(this.document)
>
> I'm not sure if this will work outside of PDFViewer yet, but if you could continue this experiment, I would appreciate a PR.

Yes, that's what I tried too, but as you say, I don't think it can work without a viewer. I tried asking about it, but it seems that in this case we have to initialize everything manually...

Wouldn't it be easier to refactor using the simple viewer?

What are the advantages of having created the component outside the viewer?

Owner

hrynko commented Jul 23, 2022

> What are the advantages of having created the component outside the viewer?

I expected that not using the viewer component could be more flexible and predictable, although it would have some limitations.

> Wouldn't it be easier to refactor using the simple viewer?

It might be, but it would require additional refactoring of the component and could lead to unexpected side effects. So if it can be done without using a viewer, I would do it like this. Otherwise, I would postpone it until the next minor release.

@valh1996
Author

I tried an alternative solution: applying the highlight while rendering the text layer, since at that point we get the text with its exact position. But this way we lose all the advantages of the PDFFindController.
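
As a rough illustration of that alternative, the sketch below wraps given character ranges of a single text item in highlight spans and returns an HTML string. It is independent of pdf.js: `highlightHtml`, `escapeHtml`, and the `highlight` class name are hypothetical, and it assumes the match ranges for the text item have already been computed.

```typescript
interface TextRange {
  start: number; // offset of the first highlighted character
  end: number;   // offset just past the last highlighted character
}

// Escape text so it can be safely placed inside HTML.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: wrap each range of `text` in a highlight span.
function highlightHtml(
  text: string,
  ranges: TextRange[],
  className: string = "highlight"
): string {
  const sorted = ranges.slice().sort((a, b) => a.start - b.start);
  let html = "";
  let cursor = 0;
  for (const range of sorted) {
    // Clamp to the text and skip parts already covered by a previous range.
    const start = Math.max(range.start, cursor);
    const end = Math.min(range.end, text.length);
    if (end <= start) {
      continue;
    }
    html += escapeHtml(text.slice(cursor, start));
    html +=
      '<span class="' + className + '">' +
      escapeHtml(text.slice(start, end)) +
      "</span>";
    cursor = end;
  }
  html += escapeHtml(text.slice(cursor));
  return html;
}
```

This keeps several highlighted terms side by side, but, as noted above, it gives up what PDFFindController provides, such as match navigation and matches that span more than one text item.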

So, after trying several things, I can't get a conclusive result. Could you help me with this, please?

Owner

hrynko commented Jul 27, 2022

I'm having issues updating PDFJS, so I'd like to resolve them first. Will have another look at the highlighting issue afterwards.
