TinyMCE Paste Filter doesn't carry over formatting from MS Word or Google Docs #1866
Comments
@jlahijani I don't have MS Word, but did try with Google Docs. I wasn't able to duplicate it. When I copy/paste a Google Docs document full of headlines, lists and links, they all come through properly in TinyMCE. Is it possible that the document you are copying/pasting is using [for example] font sizes rather than the editor's heading styles for things like headlines?
With the update to 3.0.235 (which updated TinyMCE to the newest version), it looks like the issues with MS Word (in Windows, didn't test macOS) are now fixed. Google Docs is still imperfect. I will look into it further.
@ryancramerdesign I made a video demonstrating a bug with pastefilter and some other considerations when pasting from MS Word:
I made a couple mistakes in my video:
Diving deeper into this... based on some quick research, there doesn't seem to be any open-source JS code that handles this stupid age-old problem. However, I experimented with ChatGPT to write the necessary JS and it looks promising. This is what I got with two very basic prompts, which were:
Result...
Anyway, that is to say, the tricky part of detecting a list and wrapping it in a ul tag is something GPT knows how to program, and probably all the other silliness with Word formatting too, which may be helpful. Remember it can simply be asked to convert the code to jQuery style as well.
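As a rough illustration of the list detection described above, here is a minimal sketch. It is not the pasteFilter code and not the ChatGPT output from this thread. It assumes Word marks list items as `<p>` elements whose class starts with `MsoListParagraph` and puts the bullet glyph inside `<![if !supportLists]>...<![endif]>`; the function name `wrapWordLists` is hypothetical. It works on the raw HTML string and always emits `<ul>`, since it makes no attempt to tell ordered from unordered lists.

```javascript
// Hypothetical helper: wrap runs of Word "list paragraphs" in a single <ul>.
// Assumption: list items look like <p class=MsoListParagraph ...>...</p>.
function wrapWordLists(html) {
  // One list paragraph, capturing its inner markup.
  const item = /<p\b[^>]*\bclass=["']?MsoListParagraph[^>]*>([\s\S]*?)<\/p>/gi;
  // One or more adjacent list paragraphs (whitespace between them allowed).
  const run = /(?:<p\b[^>]*\bclass=["']?MsoListParagraph[^>]*>[\s\S]*?<\/p>\s*)+/gi;
  // The bullet/number glyph that Word emits for non-list-aware consumers.
  const marker = /<!\[if !supportLists\]>[\s\S]*?<!\[endif\]>/gi;

  return html.replace(run, (block) => {
    // Convert each paragraph in the run to <li>, dropping the bullet marker.
    const items = block.replace(item, (match, inner) => {
      return '<li>' + inner.replace(marker, '').trim() + '</li>';
    });
    return '<ul>' + items.trim() + '</ul>';
  });
}

// Usage with a hand-written sample, not real clipboard data:
const sample =
  "<p>Intro</p>" +
  "<p class=MsoListParagraph style='mso-list:l0 level1 lfo1'>" +
  "<![if !supportLists]><span>·</span><![endif]>One</p>\n" +
  "<p class=MsoListParagraph><![if !supportLists]><span>·</span><![endif]>Two</p>" +
  "<p>Outro</p>";
console.log(wrapWordLists(sample));
```

Paragraphs that are not list paragraphs are left alone, so existing `<ul>`/`<ol>` markup in the paste is not double-wrapped.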
One other library that may be helpful is Summernote Cleaner, which is a 3rd party plugin for the Summernote rich text editor. I'm sure their cleaner is pretty advanced, although I haven't tested it. May be worth looking into:
…option to the Markup Toggle settings. Plus refactoring of the pasteFilter JS in attempt to fix processwire/processwire-issues#1866 which should improve pasting from MS Word.
Thanks @jlahijani That video was helpful. While I don't have MS Word to duplicate the issue, I was able to copy the Word markup out of your video and substitute it in pasteFilter to see how it would clean it up. I found that it cleaned it up reasonably well but left the conditional comments and
Btw, I don't think we can do anything with the word ordered/unordered lists, as it's MS Word that's converting them to
Regarding the other conversion methods, those rely on having the markup in the DOM. In our case, we are operating on the raw HTML/text, as that's what TinyMCE gives us, plus it's probably not safe to place into the DOM at this stage. Once TinyMCE inserts it into the editor, we could always go back and manipulate as DOM elements, which is possible, but probably outside the scope of the pasteFilter.
That's a good point and one I didn't consider. I will test the changes a bit further when time permits as well as Google Docs (and a little more Word). I will also provide the raw HTML that gets pasted so you don't have to rewrite that by hand.
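To make the "raw HTML/text" approach concrete, here is a minimal sketch of string-level cleanup that never touches the DOM. It is only an illustration, not the actual pasteFilter implementation; the function name `stripWordDebris` is hypothetical, and the patterns assume Word's usual conditional comments (`<!--[if ...]>...<![endif]-->`), its `<![if ...]>` / `<![endif]>` markers, and namespaced tags such as `<o:p>`.

```javascript
// Hypothetical helper: remove common Word debris from a raw HTML string.
function stripWordDebris(html) {
  return html
    // Ordinary comments and hidden conditional comments, including
    // everything between <!--[if ...]> and <![endif]-->.
    .replace(/<!--[\s\S]*?-->/g, '')
    // Bare markers such as <![if !supportLists]> and <![endif]>;
    // the content between the two markers is kept.
    .replace(/<!\[(?:if\b[^\]]*|endif)\]>/gi, '')
    // Namespaced Office tags such as <o:p> or </o:p>; their content is kept.
    .replace(/<\/?[a-z]+:[a-z]+\b[^>]*>/gi, '');
}

// Usage with a hand-written sample, not real clipboard data:
const pasted =
  '<!--[if gte mso 9]><xml><o:x>1</o:x></xml><![endif]-->' +
  '<p class="MsoNormal">Hi<o:p></o:p></p>';
console.log(stripWordDebris(pasted)); // <p class="MsoNormal">Hi</p>
```

Because this is plain string replacement, nothing from the clipboard is parsed or executed by the browser, which fits the safety concern above. The trade-off is that regexes cannot reason about nesting the way DOM methods can.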
I made two videos about Google Docs: Raw code video 1:
Raw code video 2:
@jlahijani Thanks. Just looking at the first example to start. But here is the input markup from Google Docs. It's strange because it doesn't seem like there's any bold or italic retained in it, and instead the entire batch of markup is wrapped in a
And here it is after TinyMCE inserts it into the editor. Meaning, it's gone through TinyMCE's content filtering rules, which disallow things like block level elements wrapped with inline elements, which is why the
The part that we've got some control over is what converts the original input (1) to 2 above. But it looks to me like we might have a garbage-in-garbage-out scenario here, at least with regard to the bold and italic. I'll have a look at the second bit of code next.
@jlahijani Here's the same data for example 2:
I'm thinking pasteFilter should replace
When using TinyMCE with the default paste filter settings, it doesn't include the formatting when pasting from MS Word or Google Docs.
However it does work correctly when copying and pasting from something like the rich text editor here (which I'm telling clients to paste to then copy from as a temporary work-around):
https://html-cleaner.com/