Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Paste from google docs support #2472

Closed
wants to merge 37 commits into from
Closed

Paste from google docs support #2472

wants to merge 37 commits into from

Conversation

jacekbogdanski
Copy link
Member

@jacekbogdanski jacekbogdanski commented Oct 11, 2018

What is the purpose of this pull request?

New feature

Does your PR contain necessary tests?

All patches which change the editor code must include tests. You can always read more
on PR testing,
how to set the testing environment and
how to create tests
in the official CKEditor documentation.

This PR contains

  • Unit tests
  • Manual tests

What changes did you make?

This is the early PR of pastefromword plugin refactoring into a single plugin which will handle different paste sources. Currently, except Word this plugin handles pasting from Google Docs. A new plugin has been named uberpaste, but be aware that the name may change.

This PR will require a lot of additional polishing like deprecating pastefromword docs, renaming variables, updating lang files etc. which I'm aware of - however, changes are pretty wide so I think it's good time to start talking about this feature.

I also didn't spend much time on testing tables because they are a target to change #2219.

Closes #835

@Comandeer Comandeer self-requested a review November 13, 2018 10:11
@Comandeer Comandeer self-assigned this Nov 13, 2018
@jacekbogdanski jacekbogdanski self-assigned this Apr 15, 2019
@jacekbogdanski jacekbogdanski removed their assignment Apr 15, 2019
@jacekbogdanski
Copy link
Member Author

jacekbogdanski commented Apr 15, 2019

I rebased this PR into the latest upstream and did some refactoring.

@Comandeer Comandeer self-assigned this Apr 16, 2019
Copy link
Member

@Comandeer Comandeer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the fact that in the current shape uberpaste is basically pastefromword with different name. Actually I was thinking about something different:

  1. uberpaste is just a mechanism for registering new paste handlers.
  2. pastefromword registers paste handler for pasting from Word.
  3. pastefromgdocs registers paste handler for pasting from GDocs.

Some other configuration like uberpaste and pastefromoffice (Word + GDocs) is also possible. The point is: uberpaste IMO should be very generic and allow creating more specific plugins.

Additionally I don't like the idea of paste handlers that can register themselves. I'd move register to uberpaste plugin itself:

editor.plugins.uberpaste.register( {
    cleanFn: function() {},
    canHandle: function() {}
} );

Also the whole mechanism of handling images – can't it be moved into its own paste handler or into helper in pastefromword? It doesn't seem to fit generic PasteHandler class. With this and above changes we can even eliminate it in favour of pure objects.

@jacekbogdanski
Copy link
Member Author

jacekbogdanski commented Apr 18, 2019

How you would like to refactor the file structure of the original pastefromword? Currently, we have only a single filter file loaded asynchronously when needed, to avoid bundling it for built version (less bundle size). We have a bunch of helper methods inside pastefromworld/filter/default.js and also filter definition which is shared between gdocs and word filters, making it a bit complicated.

If I understand you correctly, you would like to see 3 different plugins: uberpaste, pastefromworld, pastefromgdocs. As it seems reasonable, when we started talking about google docs support, we decided that we want a single monolithic plugin which will handle all our paste integrations with external editors. The reason was simple - your proposition will require a lot of work with the same result for the end user. It's also a very small chance that a user wants to have support for MS Word but not Gdocs (and vice versa). pastefromword plugin has been left only for compatibility reasons.

Additionally I don't like the idea of paste handlers that can register themselves

The idea was to avoid loading filter file if paste event cannot be handled by Word or Gdocs handlers. If we move handling verification methods (canHandle) into cleaning one, we can just remove the whole async file loading as it won't make sense anymore.

Additionally, some filter definition can be shared between google docs and MS Word, like extracted table rules. How would you like to share such rules between plugins? The simplest way seems to extract additional plugin like pastetools, but it also makes this structure more complicated and we will have to rewrite (and document) helper methods inside pfw filter. We will end up with 5 plugins for pastefromword refactoring which cannot be easily loaded asynchronously.

I also had the same concerns as you when doing this refactoring. Probably we should revisit our decision if we want to implement it "easy way" proposed by this PR or do it correct fixing tech dept of PFW plugin - which also will take much more time.

@f1ames
Copy link
Contributor

f1ames commented Apr 24, 2019

@Comandeer @jacekbogdanski From what I understand the PR adds handling GDocs content into PFW logic and moves the whole thing to uberpaste plugin? The PFW plugin is itself quite complicated so adding additional mechanisms would make it uber-complex :P

TBH I don't remember the initial approach which we took while starting implementing pasting from GDocs support, but my first thought was that it will be a separated plugin. Then I thought that in the same manner we will have PasteFromLibreOffice, PasteFromX, PasteFromY which might be quite confusing (however, gives good separation). Adding everything to one plugin makes it easier and simpler (but introduces some additional complexity and tech debt).

So we have few options:

  1. Go with UberUberPaste which will be a single plugin containing all logic for all supported 3rd-party like-office apps.
  2. Go with @Comandeer proposal, add uberpaste which is the main entry for paste handlers and then create separate plugin for GDocs and refactor PFW plugin so it utilizes uberpaste. And then for each new app supported, additional plugin will be added.
  3. Create a separate plugin for handling GDocs (and eventually other office apps) and extract common functions/tools from PFW (without refactoring it) - so we will have a PFW plugin which is quite complex and rest in a different single plugin.

I'm not sure TBH, refactoring PFW seems like big, unnecessary task, but having everything in one plugin doesn't seem like a good solution either. We can go with 3rd proposal which is somehow in-between. WDYT?

@Comandeer
Copy link
Member

Actually I'd be happy even with the first proposal from @f1ames list (single uberpaste plugin), but with exposed PasteHandler class and ability to register new, custom handlers.

The idea was to avoid loading filter file if paste event cannot be handled by Word or Gdocs handlers. If we move handling verification methods (canHandle) into cleaning one, we can just remove the whole async file loading as it won't make sense anymore.

I don't see why these two things are connected. The whole inner mechanism of paste handler would be the same, the only thing changed is the place of registering the handler.

How would you like to share such rules between plugins?

Every PasteHandler would have info about used filters. I don't see why several plugins can't use the same filter or use several filters (e.g. our original PfW one and some custom one).

@f1ames
Copy link
Contributor

f1ames commented Apr 25, 2019

@Comandeer Could you elaborate a little bit on your solution? How exactly you would like it to work. The general approach seems reasonable, but I'm not sure if I have the same concept in mind after reading your description.

Should the filters by somehow extracted and then reused by code handling Word, GDocs in their paste handler? So you will have main plugin file which enables registering paste handlers, then some common filters and then for each supported app (Word, GDocs) the handler will specific filters should be registered?

@jacekbogdanski
Copy link
Member Author

jacekbogdanski commented Apr 25, 2019

I don't see why these two things are connected. The whole inner mechanism of paste handler would be the same, the only thing changed is the place of registering the handler.

If you move registering handler into the handler file with canHandle method, you will have to load a file to check if pasted HTML can be handled by the handler which we should avoid or any paste event will load handler file. PFW filter weight around 27KB (minified) so it's probably worth to load it conditionally.

Create a separate plugin for handling GDocs (and eventually other office apps) and extract common functions/tools from PFW (without refactoring it) - so we will have a PFW plugin which is quite complex and rest in a different single plugin.

And we will end up with two unmanageable (PFW + common tools/filters ) files instead of one :D Nevertheless, it probably will help us to not touch PFW but use it for GDOC purposes.

Maybe we could write a bit more advanced file loader for PFW, so it can load files like uberpaste/tools, uberpaste/pfw, uberpsate/gdcs conditionally based on the information of pasted content (tools loaded always before office like editors)? In that case, we will hide complicated code under 'private' namespace, so no refactoring PFW except extracting private tools, and loaded files will be even smaller because there will be no need to load PFW file for GDOCS content? Also, it could contain additional uberpaste/default file with filters shared between editor filters. It's probably still better to async load (2-3) files for the first time if needed than including huge filter bundle. WDYT?


Actually, we could even end up with some kind of micro plugins which only contain canHandle method and information about required filters. Filters will be loaded asynchronously when needed. Something between 2/3 of @f1ames comment.

So, instead of single uberpaste, we will have pastefromword, pastefromgdocs and pastetools plugins. pastefromword and pastefromgdocs will require pastetools plugin and use it like:

pastetools.register( {
   filters: 'pastefromword,common',
   canHandle: function( html ) { ... },
   // Note shared tools for `clean` function argument, also loaded asynchronously by
   // `pastetools`
   clean: function( tools ) { ... }
} );

pastetools will firstly check for pastefromword inside plugins/pastefromword/filters location, and then plugins/pastetools/filters for common or other defined by filters. Obviously, filters will be only loaded when canHandle is positive. Handling images could also be defined by separate micro pasteimage plugin like pfw and gdocs.

@Comandeer
Copy link
Member

If you move registering handler into the handler file with canHandle method, you will have to load a file to check if pasted HTML can be handled by the handler which we should avoid or any paste event will load handler file.

I was thinking about moving it into uberpaste plugin, not filter itself. TBH your proposed solution (pastetools.register) is exactly what I had in mind.

@f1ames
Copy link
Contributor

f1ames commented Apr 26, 2019

@jacekbogdanski It seems reasonable to me. The one thing I would like to avoid is to much refactoring in PFW so just modify what is needed to work with pastetools. We may eventually extract refactoring to another ticket if you decide along the way that it should be done. Same for handling images, could be a separate ticket - also remember that some image handling is specific for Word only and some mechanisms may be common for more than one source app.

I have only some doubts about common filters? Do we need these ATM? Will they contain the common logic for both PFW and GDocs? If there will be a the common logic there remember that some cases may require a specific order of content processing (e.g. when pasting from Word - transforming p to li and then do some further processing on lis). But I'm not sure if we will encounter any such issues here, but it's worth to remember.

@jacekbogdanski
Copy link
Member Author

They share the same table filter currently - https://github.com/ckeditor/ckeditor-dev/blob/b9752a95f05abce86b4c0dc10cf06450bc66b521/plugins/uberpaste/filter/default.js#L12-L27 but it may change soon with #2219

So, to sum up we will go with #2472 (comment) proposition about creating a middlemen pastetools plugin which will handle module loading and choosing the correct paste plugin handler.

@jacekbogdanski
Copy link
Member Author

jacekbogdanski commented Jul 10, 2019

As the whole approach changed it should be safe to close this PR now. We already have new PR for Paste From World / Google docs support - #3257 Please, follow it for updates on the feature state as this PR won't be further developed. Also, see #3258 for more information about changes into Rich Text paste mechanism.

(Keeping branch alive for reference for #3257)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Paste from Google docs
3 participants