Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce Paste Tools and Paste from GDocs #3257

Merged
merged 53 commits into from
Sep 16, 2019
Merged

Introduce Paste Tools and Paste from GDocs #3257

merged 53 commits into from
Sep 16, 2019

Conversation

Comandeer
Copy link
Member

What is the purpose of this pull request?

New feature

Does your PR contain necessary tests?

All patches which change the editor code must include tests. You can always read more
on PR testing,
how to set the testing environment and
how to create tests
in the official CKEditor documentation.

This PR contains

  • Unit tests
  • Manual tests

What is the proposed changelog entry for this pull request?

* [#835](https://github.com/ckeditor/ckeditor-dev/issues/835): Added new `pastetools` and `pastefromgdocs` plugins.

What changes did you make?

I've introduced pastetools plugin, which allows registering paste handlers:

class PasteHandler {
	// Array of external filters that have to be loaded before executing current paste handler handle method.
	filters: string[]

	// Checks if handler can handle current paste event.
	canHandle( evt ): boolean

	// Actual paste handling
	handle( evt ): void
}

I've refactored pastefromword to use the new mechanism. I've also split PfW filter into two:

  • pastefromword/filter/default.js – Word specific stuff,
  • pastetools/filter/common.js – stuff that can be useful also in other filters.

Additionally I've introduced simple API for creating filters: CKEDITOR.plugins.pastetools.createFilter. Every filter definition is divided into two parts:

  • rules – array of functions that will return filter rules. Every rule-creating function receives html (pasted HTML), editor and filter parameters.
  • additionalTransforms – function that transforms pasted HTML before passing it into filter. It receives html and editor parameters.

I've also added pastefromgdocs plugin, with its own filter and paste handler.

Currently pastefromgdocs does not have any tests, except the generic manual one.

I've also had to modify some PfW tests:

  • generated/functions.js – part of it moved to pastetools directory,
  • generated/heuristics.js,
  • helpers.js
  • image.js
  • parsestyles.js – moved to pastetools directory
  • pasteimage.js.

Some API docs are missing or not updated.

There is also small issue with config variables. Currently some parts of common filter (CKEDITOR.plugins.pastetools.filters.common.styles) use some Paste from Word config variables:

  • config.pasteFromWordRemoveFontStyles,
  • config.pasteFromWord_keepZeroMargins.

I'm thinking about introducing their pastetools counterparts:

  • config.pasteTools_removeFontStyles,
  • config.pasteTools_keepZeroMargins.

It would also mean that PfW variables become deprecated and if user set both of them, pastetools ones will take precedence.

I'm also thinking about namespaces. Currently CKEDITOR.plugins.pastetools.filters is used to store filters' config and CKEDITOR.pasteFilters is used to store filters' instances. However it introduces superlong namespaces (CKEDITOR.plugins.pastetools.filters.common.styles.normalizedStyles – ouch…). Not sure if it's fixable in some really sensible way?

I'm also not a big fan of pastetools name. Maybe something like pastehandler, specialpaste, maybe even uberpaste? pastetools isn't really about tools for pasting…

Big part of this PR is based on #2472 by @jacekbogdanski, so big kudos to him, as I could skip a huge part of initial research! 👏👏👏Too bad that simple cherry-picking wasn't possible due to the fact how much current approach is different from the one in mentioned PR.

Closes #835.

@Comandeer
Copy link
Member Author

There is also one more important change from the previous PR: Paste from GDocs is no longer a part of Paste from Word, therefore it doesn't fire pastefromword or afterpastefromword events. Additionally it doesn't have its own paste button. Filtering GDocs content is totally transparent for the user.

@Comandeer
Copy link
Member Author

I've introduced middleware architecture described in #3258.

@Comandeer
Copy link
Member Author

Introducing fixture-based tests revealed that filter has one visible flaw: it removes elements with [id^=docs-internal-guid-]. However in Firefox and Edge such element also contains styles and without them pasted content has changed layout. I'll provide fix in the nearest days.

@Comandeer
Copy link
Member Author

I've updated rules for dealing with elements with GDocs specific id. Instead of removing these elements, I just remove their ids.

I've also added rule to remove all comments from the code.

@Comandeer Comandeer removed their assignment Jul 14, 2019
@f1ames f1ames self-requested a review July 16, 2019 09:17
@f1ames f1ames self-assigned this Jul 16, 2019
Copy link
Contributor

@f1ames f1ames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code

Check my in-code comments.

There is also small issue with config variables. Currently some parts of common filter (CKEDITOR.plugins.pastetools.filters.common.styles) use some Paste from Word config variables:

  • config.pasteFromWordRemoveFontStyles,
  • config.pasteFromWord_keepZeroMargins.

I'm thinking about introducing their pastetools counterparts:

  • config.pasteTools_removeFontStyles,
  • config.pasteTools_keepZeroMargins.

It would also mean that PfW variables become deprecated and if user set both of them, pastetools ones will take precedence.

🤔 I think that would be a good direction. However, I'm not sure if we should deprecated the Word ones as one may still want to use them only for pasting from Word and not from other office-like applications. Do you think this might be common use case? Or we can deprecate, but mention that they still can be used if one wants to affect only Word input.

I'm also thinking about namespaces. Currently CKEDITOR.plugins.pastetools.filters is used to store filters' config and CKEDITOR.pasteFilters is used to store filters' instances. However it introduces superlong namespaces (CKEDITOR.plugins.pastetools.filters.common.styles.normalizedStyles – ouch…). Not sure if it's fixable in some really sensible way?

Yes, it's quite long TBH. We may think of introducing e.g. CKEDITOR.filters.common.styles.normalizedStyles, but then it might cause naming conflicts as different plugins might try to use the same namespace. I think we can live with long namespace like that.

I'm also not a big fan of pastetools name. Maybe something like pastehandler, specialpaste, maybe even uberpaste? pastetools isn't really about tools for pasting…

specialpaste and uberpaste has even less precise meaning that pastetools. If we thinking about changing it I would be for pastehanlder, pastecore or something similar.

Tests

Few test for build versions fails on CI (check recent Travis build) and some unit tests are failing on Edge (checked with Edge 17, but you may also check on Edge 18 if possible):

Screenshot 2019-07-17 at 10 44 00

And two manual tests related to loading filters fails on IE (checked on IE11 and IE8):

ie11 filters

When it comes to generic paste manual test I would add some nested lists to sample GDocs document.

One more thing, do we assume manual test with pasting from GDocs could be also run on browsers which doesn't support GDocs itself? If so it should be mentioned that you could open the document in different browser (however, I am not sure if specific browser would not somehow affect final result). I have tested pasting on IE8 from Chrome and it's almost perfect (👏❤️):

Screenshot 2019-07-17 at 11 35 18

Docs

I have checked API docs and there are a few empty namespaces - not sure what we can do here 🤔

Screenshot 2019-07-17 at 11 47 07

Screenshot 2019-07-17 at 11 47 47

And also an uberpaste leftover:

Screenshot 2019-07-17 at 11 48 08

And some globals:

Screenshot 2019-07-17 at 11 45 01

UI

I was thinking if we should unify pasting buttons as suggested in one issue (somewhere here #3258) - there would be only Paste and Paste as plain text buttons. However, I'm not sure if this would not cause some users to thing that we dropped support for paste from Word (like after dropping paste dialog - #469) - but I think it would be good to extract to a follow-up issue about unifying paste buttons.

plug = {};

/**
* Set of Paste from Word plugin helpers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be "Set of Paste from Google Docs plugin helpers."? 🤔

plugins/pastefromgdocs/filter/default.js Show resolved Hide resolved
return value === 'ltr' ? false : value;
},
'style': function( styles, element ) {
// Returning false deletes the attribute.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is probably not needed. Or it can be moved in the beginning of attributes object as it concerns all attributes filters.

},
'style': function( styles, element ) {
// Returning false deletes the attribute.
return Style.normalizedStyles( element, editor ) || false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use falseIfEmpty() function here too?

* @private
* @member CKEDITOR.plugins.pastetools.filters.gdocs
*/
plug.rules = function( html, editor, filter ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also do this like:

CKEDITOR.plugins.pastetools.filters.gdocs  = {
    rules: function...
};

was there any particular reason you went with the plug object?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such method was used in the original PR. I've used it here without much reflection TBH. I'll simplify it.

plugins/pastefromgdocs/plugin.js Show resolved Hide resolved
plugins/pastefromgdocs/plugin.js Show resolved Hide resolved
plugins/pastefromword/plugin.js Show resolved Hide resolved
@@ -124,6 +124,21 @@
}

return tests;
},

loadFilters: function loadFilters( filters, callback ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is also defined in tests/plugins/pastetools/_helpers/ptTools.js - https://github.com/ckeditor/ckeditor-dev/pull/3257/files#diff-dc1abc73b7621243c62e4e78abb3be72R19, do we need it in both places?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I did it this way to avoid complicated bender imports. However it can be moved to ptTools and removed from pfwTools.

tests/plugins/pastetools/manual/loadingfilters.html Outdated Show resolved Hide resolved
@f1ames
Copy link
Contributor

f1ames commented Jul 17, 2019

Also extracted follow-ups:

@msamsel
Copy link
Contributor

msamsel commented Aug 1, 2019

During work on Past from Google Docs in CKE5, I noticed a few cases which maybe you would also like to check in CKE4.

  1. Content in Safari like this one, after cmd + a, cmd + c and cmd + v in editor doesn't contain guid-... so might won't be recognized as content from google docs. It might be importnatn if there is no general content fixer regardless of data source.
    image
  2. There are cases with lists which have invalid html syntax. Things worth checking:

@Comandeer Comandeer self-assigned this Aug 5, 2019
@Comandeer
Copy link
Member Author

I think that would be a good direction. However, I'm not sure if we should deprecated the Word ones as one may still want to use them only for pasting from Word and not from other office-like applications. Do you think this might be common use case? Or we can deprecate, but mention that they still can be used if one wants to affect only Word input.

I'm afraid that it's not possible with our current architecture, as all filters connected with fonts are moved to the shared filter. It will have to know that the content being filtered is from Word, but I don't know if there is any easy way to pass there such info 🤔

@Comandeer
Copy link
Member Author

Content in Safari like this one, after cmd + a, cmd + c and cmd + v in editor doesn't contain guid-... so might won't be recognized as content from google docs. It might be importnatn if there is no general content fixer regardless of data source.

We have very similar issue with Excel in Safari – #2609 (comment)
I've reported bug to Safari: https://bugs.webkit.org/show_bug.cgi?id=200569

@Comandeer
Copy link
Member Author

Few test for build versions fails on CI (check recent Travis build)

As errors were nonsensical (claiming that basic pastetools methods aren't defined), I triggered a custom build → https://travis-ci.org/ckeditor/ckeditor-dev/builds/569799069

It seems that Travis is unstable again 😞

some unit tests are failing on Edge

Can't reproduce.

@Comandeer
Copy link
Member Author

When it comes to generic paste manual test I would add some nested lists to sample GDocs document.

I've updated tests, however I'm not able to generate sensible fixtures for Edge and IE11 due to unexpected bug in GDocs (seems to be introduced recently): content is copied incorrectly, when there's is an image at its beginning (in fact, in Edge only the image is copied).

And two manual tests related to loading filters fails on IE (checked on IE11 and IE8):

It was connected with the fact that IE does not expose proper MIME type of pasted content. I relaxed the conditional to fire paste handler.

One more thing, do we assume manual test with pasting from GDocs could be also run on browsers which doesn't support GDocs itself? If so it should be mentioned that you could open the document in different browser (however, I am not sure if specific browser would not somehow affect final result).

I've added the note. Probably more troublesome is copying in such browser, not pasting (as the pasted HTML will be the same as in modern browser).

I've also added handling legacy config names.

I haven't touch nested lists normalizations yet.

@Comandeer
Copy link
Member Author

I've fixed issue with pasting fragments of nested lists.

@Comandeer Comandeer requested a review from f1ames August 14, 2019 11:59
@f1ames f1ames self-assigned this Aug 28, 2019
Copy link
Contributor

@f1ames f1ames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit tests

CI fails for build editor (one test checking handler priorities) - restarted it but the error is the same.

Also, some unit tests still fails on Edge for me:

Screenshot 2019-08-28 at 17 34 12

Screenshot 2019-08-28 at 17 39 33

Screenshot 2019-08-28 at 17 39 42

Screenshot 2019-08-28 at 17 40 11

Tested with http://localhost:1030/#tests/is:unit,path:/tests/plugins/pastefromgdocs,path:/tests/plugins/pastefromword,path:/tests/plugins/pastetools,path:/tests/plugins/pastetext

Manual tests

Copying nested lists fails on IE8:

From Chrome (other type of content is also bolded):
pfgd ie8

From Firefox:

pfgd ie8-ff

Copying to IE8 from Google Docs (from different browser) is rather an edge case IMHO. Please take a look at the above case, but if it requires any significant amount of work we will extract it as a follow-up.


In IE11 the first image is not centered after paste:

Screenshot 2019-08-29 at 10 43 36


I have also checked some cross browser pasting and it works quite well 👍

Docs

Some JSDuck errors during docs building are also visible:

image

plugins/pastefromgdocs/filter/default.js Show resolved Hide resolved
plugins/pastefromgdocs/filter/default.js Show resolved Hide resolved
if ( editor.config.forcePasteAsPlainText === true ) {
// If `config.forcePasteAsPlainText` set to true, force plain text even on Word content (#1013).
data.type = 'text';
} else if ( !CKEDITOR.plugins.clipboard.isCustomCopyCutSupported && editor.config.forcePasteAsPlainText === 'allow-word' ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My proposal is to leave allow-word to work as before (so only for Word) and extract new flags to a separate issue so we can handle it later if necessary, WDYT?

// In browsers using pastebin when pasting from Word, evt.data.type is 'auto' (not 'html') so it gets converted
// by 'pastetext' plugin to 'text'. We need to restore 'html' type (#1013) and (#1638).
data.type = 'html';
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok 👍 Could you extract it to a separate issue then? I think we could cover also additional allow-* flags (as mentioned in https://github.com/ckeditor/ckeditor-dev/pull/3257/files#r303850250 thread) in this new issue, WDYT?

'test Basic gdocs firefox': CKEDITOR.env.ie && CKEDITOR.env.version <= 11,
'test Basic gdocs safari': CKEDITOR.env.ie && CKEDITOR.env.version <= 11,
// 'test Basic gdocs edge': CKEDITOR.env.ie && CKEDITOR.env.version <= 11,
// 'test Basic gdocs ie11': !CKEDITOR.env.ie || CKEDITOR.env.version > 11
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the above are commented?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#3257 (comment)

I've updated tests, however I'm not able to generate sensible fixtures for Edge and IE11 due to unexpected bug in GDocs (seems to be introduced recently): content is copied incorrectly, when there's is an image at its beginning (in fact, in Edge only the image is copied).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that GDocs resolved the issue, so I updated fixtures.

@Comandeer
Copy link
Member Author

The bug with list is not fixable on our side… GDocs just pass such data to us: last list item is outside the list. Sample data: https://gist.github.com/Comandeer/d851b67e47c81baeef1e416dc4e1521b

As you can see, the last element is after </ul>, enclosed inside span.

There is also no way to detect such situation, as there are no signs that such span is from the list. Even if we assume that all spans after lists are in fact last list elements (which is a very silly assumption), we still don't know, on which level of nesting the last item was.

I've updated docs with deprecated info for both CKEDITOR.plugins.pastefromword namespace and several PfW config variables. I've also updated @since tags for elements in CKEDITOR.plugins.pastetools as it didn't make any sense to have something added in 4.13.0 with @since 3.1.0 tag. However there is one issue with such approach:

CKEDITOR.config.pasteTools_removeFontStyles added and deprecated in 4.13.0

Despite this little issue, I feel that the current shape of docs is much more readable and sensible than before.

@f1ames f1ames assigned f1ames and unassigned Comandeer Sep 16, 2019
@f1ames f1ames self-requested a review September 16, 2019 13:00
Copy link
Contributor

@f1ames f1ames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍 Good job 🎉🍰


The bug with list is not fixable on our side… GDocs just pass such data to us: last list item is outside the list. Sample data: https://gist.github.com/Comandeer/d851b67e47c81baeef1e416dc4e1521b

As you can see, the last element is after </ul>, enclosed inside span.

There is also no way to detect such situation, as there are no signs that such span is from the list. Even if we assume that all spans after lists are in fact last list elements (which is a very silly assumption), we still don't know, on which level of nesting the last item was.

👍 Let's leave it this way then.

I've updated docs with deprecated info for both CKEDITOR.plugins.pastefromword namespace and several PfW config variables. I've also updated @since tags for elements in CKEDITOR.plugins.pastetools as it didn't make any sense to have something added in 4.13.0 with @since 3.1.0 tag. However there is one issue with such approach:

CKEDITOR.config.pasteTools_removeFontStyles added and deprecated in 4.13.0

Well, looks kind of strange but that's the way it is, we have introduced deprecated config option (because of duplicating already deprecated one) to have backward compatibility.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Paste from Google docs
3 participants