This issue was moved to a discussion.

Paste from Microsoft Word #729

Closed
sei-jdshimkoski opened this issue Apr 15, 2019 · 12 comments

Comments

@sei-jdshimkoski

A common use case for a WYSIWYG editor is to allow users to paste from Microsoft Word and match the styling as found in the document.

EditorJs does not have this functionality.

Is pasting from Microsoft Word on the roadmap?

@gohabereg
Member

Hi @sei-jdshimkoski

The problem is that Editor.js is not actually WYSIWYG. Some plugins might look like WYSIWYG components, but the way of rendering is up to you, and the HTML output (or any other output) might look completely different from how it looks in the editor.

I've tried to handle paste from MS Word. The data from Word comes in RTF format (text/rtf). It can be parsed, converted, and passed to the plugins on the client, but there is too much data: the browser just can't process it and crashes.

If you are able to help with that and maybe do some investigation, it would be really helpful!

@sei-jdshimkoski
Author

Thank you for the information. I guess this is one of the few tradeoffs that EditorJs needs to make at the moment.

I will try to do some investigation to see if I can figure out some sort of solution to this issue. If I figure anything out, I'll report back.

Thanks again.

@rtpHarry

Perhaps a compromise would be a "work area" where you can paste in the document just to have it right there while you rebuild it into sections inside the editor.

I guess it would be a single rich text block, but it wouldn't render client-side; it's just for reference.

@Ximore

Ximore commented May 31, 2019

Pasting from MS Word causes the text to be converted to an embedded image, but if I copy from Word, paste into Pages on Mac, and then copy from Pages and paste into the editor, everything works great, and each paragraph and headline is created as an individual block of content.

If I open the Word document inside Pages directly (not just copying the content), Pages will inform me that some changes have been made:
[Screenshot: Pages informing about changes to the Word file]

So if that is the case, it makes sense that Editor.js processes Word content as an image, because it may include bitmap data or similar due to the background in the document.

UPDATE 02-06-2019

I have tested with a Word document on Windows, and Editor.js works fine with copying from Word on Windows. It doesn't work on Mac, so I did some further testing.

I created a file in TextEdit with some dummy text. Then I copied it, ran osascript -e 'the clipboard as record' | less in the terminal, and got the following:
[Screenshot: simple text in TextEdit]
[Screenshot: the content saved in the clipboard]

When pasting the text into an MS Word document and copying the text again to the clipboard, I saw different results:
[Screenshot: simple text in MS Word]
[Screenshot: the MS Word content saved in the clipboard]

When copying from MS Word, even just two words, I get a huge list of data in the clipboard (note that you can't even see the end flag at the bottom of the terminal because the list is so long). This obviously has nothing to do with Editor.js; it is more about the way Microsoft chooses to copy content in their apps on Mac. The funny thing is that it works perfectly on Windows. There is no problem copying content from Word on Windows to Editor.js.

UPDATE 09-06-2019

I tested it all again today, and suddenly it all worked. That was until I realised that I was using the Safari browser. In Safari, copy and paste from MS Word works very well, with the exception of bold text not being interpreted correctly. But pasting in Google Chrome on Mac causes the content to be shown as an embedded image.

@jakekara

I've made a few observations while trying to get copy + paste from MS Word into Chrome on macOS working, and here's what I've found. My only goal has been to support bold, italic, and anchor tags with the href property. I do not want to copy in any other markup or images.

When I copied and pasted from MS Word into Chrome, nothing would happen. However, it works properly in both Firefox and Safari without any intervention from me. So I did a little digging.

If I copy and paste from MS Word, the ClipboardEvent.clipboardData.types is ["text/plain", "text/html", "text/rtf", "Files"]

If I copy the same text into TextEdit then into Chrome, ClipboardEvent.clipboardData.types is ["text/plain", "text/html", "text/rtf"] (note there is no "Files" type).
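
To make the difference between the two type lists concrete, here is a minimal sketch of a paste helper that reads a textual flavour and ignores the "Files" entry. It assumes the extra "Files" type is what gets in the way in Chrome, which this thread does not confirm. The names ClipboardLike, pickTextFlavour, and readPastedText are hypothetical and not part of Editor.js; ClipboardLike only mirrors the types and getData members of ClipboardEvent.clipboardData.

```typescript
// Minimal shape of ClipboardEvent.clipboardData used here, so the sketch
// compiles without DOM typings. (Hypothetical name.)
interface ClipboardLike {
  types: readonly string[];
  getData(format: string): string;
}

type PasteFlavour = "text/html" | "text/plain";

// Hypothetical helper: prefer a textual flavour and never pick "Files",
// even when it is present in the list.
function pickTextFlavour(types: readonly string[]): PasteFlavour | null {
  if (types.indexOf("text/html") !== -1) return "text/html";
  if (types.indexOf("text/plain") !== -1) return "text/plain";
  return null;
}

// Returns the pasted text in the preferred flavour, or null if the
// clipboard carries no textual data at all.
function readPastedText(data: ClipboardLike): string | null {
  const flavour = pickTextFlavour(data.types);
  return flavour === null ? null : data.getData(flavour);
}
```

In a browser, the clipboardData object of a paste event could be passed to readPastedText directly, since it exposes both members used above.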

I focused on parsing the "text/html" data. I don't know much about MS Word's formatting, but I saw that there are <!--StartFragment--> and <!--EndFragment--> comments surrounding the content I'm trying to paste, still with a fair bit of inline styling junk I don't care about. At first I regexed out everything outside of these markers, but I found that doesn't seem to be necessary. Instead, I found that, using the API that is passed to the tool, I could call API.sanitizer.clean() against this string and populate the result into the block's innerHTML.
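
The approach described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the comment: extractFragment, wordHtmlToInline, and ALLOWED_TAGS are hypothetical names, and the sanitizer is passed in as a plain function so the sketch stays independent of the editor. In a tool, that function would presumably be the sanitizer.clean method of the API object, and the rule format shown here is assumed to match what it accepts.

```typescript
const START_MARKER = "<!--StartFragment-->";
const END_MARKER = "<!--EndFragment-->";

// Returns the HTML between the fragment markers, or the input unchanged
// when the markers are missing or out of order.
function extractFragment(html: string): string {
  const start = html.indexOf(START_MARKER);
  const end = html.lastIndexOf(END_MARKER);
  if (start === -1 || end === -1 || end < start) return html;
  return html.slice(start + START_MARKER.length, end);
}

// Shape of the sanitizer function this sketch expects (an assumption).
type CleanFn = (taintString: string, config: Record<string, unknown>) => string;

// Assumed rule set for the stated goal: bold, italic, and links with href.
const ALLOWED_TAGS: Record<string, unknown> = {
  b: true,
  i: true,
  a: { href: true },
};

// Hypothetical wrapper: trim to the fragment, then hand off to the sanitizer.
function wordHtmlToInline(html: string, clean: CleanFn): string {
  return clean(extractFragment(html), ALLOWED_TAGS);
}
```

As the comment notes, trimming to the fragment may be unnecessary when the sanitizer is applied; it is kept here only to show both steps.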

@bsodmike

Hi all,

Has there been any further progress on this issue?

Thanks!

@gabrielmoterani

Hey guys,

Any update about this issue?

@bsodmike

> Hey guys,
>
> Any update about this issue?

Probably not that helpful given the context, but the team I'm with just switched everything over to TinyMCE v5. It doesn't use the block approach but is surprisingly feature-complete and handles content from Word amazingly well.

@Teebo

Teebo commented May 25, 2021

@bsodmike I have seen Tiny, it's a pretty awesome tool. Can I get JSON-like output from it (or is that what you meant by it not using the block approach)?

@bsodmike

> @bsodmike I have seen Tiny, it's a pretty awesome tool. Can I get JSON-like output from it (or is that what you meant by it not using the block approach)?

As per the docs, JSON output should be possible.

@quaidesbalises

@bsodmike what is it exactly?

@mhmttosun

Hi,
When writing online, it is common to copy some text and paste it into an editor. Clean Word paste would encourage usage of this awesome editor. Some editors (CKEditor, TinyMCE, Froala, WordPress Gutenberg, etc.) support copy and paste from MS Word. People may want to make use of the content of their Word files. I don't mean copying lots of Word pages and pasting them into the editor. There should be a character limit on pasted data to prevent a crash.
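
A character limit like the one suggested above could be as simple as the guard below. This is only a sketch: isPasteTooLarge and MAX_PASTE_CHARS are hypothetical names, and the threshold value is an arbitrary assumption, not a number measured against any browser or taken from Editor.js.

```typescript
// Assumed threshold; a real value would need to be tuned by testing.
const MAX_PASTE_CHARS = 100000;

// Returns true when the pasted payload exceeds the limit and should be
// rejected (or truncated) before any parsing is attempted.
function isPasteTooLarge(payload: string, maxChars: number = MAX_PASTE_CHARS): boolean {
  return payload.length > maxChars;
}
```

A paste handler could call this on the raw clipboard string first and skip parsing when it returns true.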

@codex-team codex-team locked and limited conversation to collaborators Jan 17, 2022
@talyguryn talyguryn converted this issue into discussion #1876 Jan 17, 2022


9 participants