Bug 810636 - Poor copy & paste behavior with pdf.js #2989

Open
jviereck opened this Issue Mar 25, 2013 · 14 comments

Contributor

jviereck commented Mar 25, 2013

See also Bug 810636 on Bugzilla: https://bugzilla.mozilla.org/show_bug.cgi?id=810636 for reference.

Basic problem: The textLayer is built up from multiple <span> elements. If the user selects text across multiple spans and copies it, a newline is inserted between each span. However, in most cases it makes more sense to insert just a space. Firefox now supports changing the clipboardData during the copy/cut event (background: https://bugzilla.mozilla.org/show_bug.cgi?id=407983).

Implementation idea:

  1. Add an onCopy and an onCut event listener to PDF.js
  2. When the event fires, look at the currently selected text in the textLayer
  3. For each span in the selection, take its selected text and join the pieces with a space
  4. Put the resulting string on the clipboardData object by using

This shouldn't be hard to implement and I have a plan for it, but I lack the time to do it myself and make sure it lands properly.

Let me know if someone is interested in fixing this.
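
A minimal sketch of the steps in the implementation idea above, assuming the selected text has already been collected into an array of strings, one per text-layer element. The names `joinSelectedText`, `handleCopy` and `getSelectedPieces` are hypothetical and not part of PDF.js; `clipboardData.setData` and `preventDefault` are the standard clipboard event calls.

```javascript
// Hypothetical helper: join the text of each selected text-layer element
// with a single space instead of the newline the browser would insert.
function joinSelectedText(pieces) {
  return pieces
    .map((piece) => piece.trim())
    .filter((piece) => piece.length > 0)
    .join(' ');
}

// Hypothetical copy/cut handler. `getSelectedPieces` is an assumed callback
// that returns the selected text of each element in document order.
function handleCopy(event, getSelectedPieces) {
  const text = joinSelectedText(getSelectedPieces());
  event.clipboardData.setData('text/plain', text);
  // Stop the browser from overwriting the clipboard with its own serialization.
  event.preventDefault();
}
```

In a browser this could be attached with `document.addEventListener('copy', ...)` and the same for `'cut'`, passing whatever function collects the selected pieces.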

Contributor

vyv03354 commented Mar 26, 2013

The textLayer is build up from multiple <span> elements.

Actually the textLayer is built up from <div> elements, not <span>. The newline is inserted because <div> is a block-level element.
Simply changing <div> to <span> will improve the copy result without adding any JavaScript code.

Contributor

mduan commented Mar 27, 2013

If we just changed all <div>s to <span>s, wouldn't we have the opposite problem? That there wouldn't be spaces between paragraphs? I guess in general this would still be better behaviour than having newlines where there should not be since it's less common to select multiple paragraphs.

Contributor

vyv03354 commented Mar 27, 2013

Correct, but we can add <br> or something similar where a newline is expected. The opposite (removing a newline where it is inappropriate) is impossible without adding onCopy and onCut handlers.
Note that I don't oppose the clipboardData solution. However, the more natural markup will work as a fallback even if the clipboardData object is not supported or is disabled (for security reasons, for example).
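
To illustrate the markup-only fallback, here is a simplified model of how copied text comes out for block-level versus inline elements. It is not the browser's actual serialization algorithm; the node shape (`{ tag, text }`) and the function name `modelCopiedText` are assumptions made for this sketch.

```javascript
// Simplified model: a block element (<div>) ends its text with a newline,
// an inline element (<span>) contributes only its text, and <br> forces a newline.
function modelCopiedText(nodes) {
  let out = '';
  for (const node of nodes) {
    if (node.tag === 'br') {
      out += '\n';
    } else if (node.tag === 'div') {
      out += node.text + '\n';
    } else {
      out += node.text; // inline, e.g. <span>
    }
  }
  return out.replace(/\n+$/, ''); // drop trailing newlines
}

const asDivs = [
  { tag: 'div', text: 'Hello ' },
  { tag: 'div', text: 'world.' },
];
const asSpans = [
  { tag: 'span', text: 'Hello ' },
  { tag: 'span', text: 'world.' },
  { tag: 'br' },
  { tag: 'span', text: 'Next paragraph.' },
];
console.log(JSON.stringify(modelCopiedText(asDivs)));  // "Hello \nworld."
console.log(JSON.stringify(modelCopiedText(asSpans))); // "Hello world.\nNext paragraph."
```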

Contributor

jviereck commented Mar 27, 2013

+1 for using <span>.

Contributor

SSk123 commented Apr 18, 2013

Hi, I am interested in fixing this issue. Can anyone mentor me on it? Where should I start looking in the code?

I got the span working well but can't find the newline character. Is it getting trimmed somewhere earlier?

Contributor

timvandermeij commented Sep 22, 2013

@rishib1988 Are you still working on this? If so and you need any help, you can always contact us using IRC. It would be nice to have this feature in PDF.js :)

Contributor

lpy commented Oct 26, 2013

Hello. I am interested in fixing this issue. Could anyone help me? Where should I start looking?

I will try to read viewer.js. Is there anything else?

Contributor

timvandermeij commented Oct 26, 2013

@lpy I think @SSk123 is also working on this, but I'm not 100% sure. If you want help with this, the best thing to do is to contact us using the PDF.js IRC channel (irc.mozilla.org, #pdfjs).

/cc @yurydelendik @SSk123

Contributor

SSk123 commented Nov 9, 2013

@lpy, good to see you are interested in fixing this issue, just wanted to make sure you are working on it, or else I would be happy to work on it :)

Any update on this issue? We have to tell our customers to use Adobe Reader because of this problem. Copy/paste is a must-have for us.

Contributor

jviereck commented Jul 1, 2014

Just gave it a quick try and replaced the divs with spans. This removes the newlines as mentioned above; however, it also removes the space between lines. E.g. the paragraph:

block, a trace can contain join nodes. Since a trace always only
follows one single path through the original program, however, join 

gets copied to a single line:

block, a trace can contain join nodes. Since a trace always onlyfollows one single path through the original program, however, join

Note the missing space in "onlyfollows".

PS: you can find my work here: jviereck/pdf.js@47b97bf
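
One way the lost space could be restored is to add one at each line boundary when joining. A minimal sketch, assuming the text layer can tell which elements end a visual line; `joinLines` is a hypothetical name and is not taken from the linked commit.

```javascript
// Hypothetical helper: join visual lines, inserting a single space at each
// line boundary unless whitespace is already present there.
function joinLines(lines) {
  return lines.reduce((acc, line) => {
    if (acc === '') return line;
    const needsSpace = !/\s$/.test(acc) && !/^\s/.test(line);
    return acc + (needsSpace ? ' ' : '') + line;
  }, '');
}

const lines = [
  'block, a trace can contain join nodes. Since a trace always only',
  'follows one single path through the original program, however, join',
];
console.log(joinLines(lines));
```

This sketch does not handle words hyphenated across a line break, which would need separate treatment.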

Member

Snuffleupagus commented Jul 2, 2014

There is also PR #4629, but work on that seems to have stopped.
