Issues with data processor noticed during initial implementation #2305
Comments
\_test\_ is rendered by GitHub as <p>_test_</p>, and there is no problem with that. But when this HTML is converted back to markdown, it is represented as _test_, so the next conversion will turn it into emphasis.
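The round-trip problem above could be avoided by escaping text on the way back to markdown. A minimal sketch, assuming a hypothetical `escapeMarkdownText` helper (it is not a function of marked or to-markdown) that handles only backslashes, asterisks and underscores:

```javascript
// Hypothetical helper, not part of any library mentioned in this thread.
// Backslash-escapes the characters that markdown would read as emphasis markers
// (and the backslash itself, so existing escapes stay literal).
function escapeMarkdownText( text ) {
    return text.replace( /[\\*_]/g, '\\$&' );
}

// Text content taken from <p>_test_</p>.
const textFromHtml = '_test_';

console.log( escapeMarkdownText( textFromHtml ) );
// logs: \_test\_
```

With the underscores escaped, a second markdown->HTML conversion would see literal characters rather than emphasis.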
It sounds like many of these issues are not really important for us, right? I mean, they're mainly about whitespace and don't make a difference when rendered. If I'm right, is it worth tracking them here? Maybe we should at least mark those that we're ok with.
I agree – whitespace in HTML can be ignored. Whitespace in MD is trickier because it can affect how GH renders the content later on. OTOH, I don't know how well the engine cleans up insignificant whitespace in loaded HTML so we may have useless text nodes between And I'd write that this is the engine's problem, not MdDP's, but I'm unsure about it. If the DP produces a view (not a DOM), should that view be already normalised? I guess not, the normalisation should happen later, inside the engine. So perhaps we don't have to care about excessive whitespace in the DPs. tl;dr: I agree with Fred, but we must remember that the engine may not be ready yet to handle some things.
The DP's role is to create valid HTML.
I reported a ticket for this: https://github.com/ckeditor/ckeditor5-engine/issues/606.
I moved the comment to the issue linked above https://github.com/ckeditor/ckeditor5-engine/issues/606.
I don't see how DP must be related to "view". It's all about "data". If for some reason it has been connected to the creation of a If DP deals with whitespace sequences that need to be preserved, its job is to do it the HTML way, for example by replacing such spaces with a non-breaking space character. It should certainly not expect that such spaces would be magically preserved. All the above is very important because DPs will, many times, be created on top of "(Some Format) to HTML" conversion libraries.
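The non-breaking-space idea mentioned above can be sketched roughly like this. `preserveSpaces` is a hypothetical helper written for illustration, not code from any DP or library discussed here. It assumes plain text containing only regular spaces:

```javascript
// Hypothetical helper, for illustration only.
// HTML collapses runs of regular spaces, so every second space in a run,
// plus a leading or trailing space, is replaced with U+00A0 (non-breaking space).
function preserveSpaces( text ) {
    return text
        .replace( / {2}/g, ' \u00A0' ) // "  " -> " " + nbsp, pair by pair
        .replace( /^ /, '\u00A0' )     // a leading space would be dropped by HTML
        .replace( / $/, '\u00A0' );    // and so would a trailing one
}

const preserved = preserveSpaces( 'foo   bar' );

// No two regular spaces are adjacent any more, so nothing is left to collapse.
console.log( / {2}/.test( preserved ) );
// logs: false
```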
If you create such a converter, then you use the HtmlDP for the final HTML->view transformation, as @szymonkups did in https://github.com/ckeditor/ckeditor5-markdown-gfm/blob/71a5d43888ab0202016b7ec3c3274775c2794c63/src/gfmdataprocessor.js#L27. If every DP was meant to convert formatX<->DOM, we'd lose a lot of flexibility: why require the DOM as an intermediary step? There can be DPs converting formatX straight to the view. At the same time, we realised that only upon rendering to the DOM (or reading back) do we know how to handle some specific, DOM-related quirks, like the rendering of spaces. See @scofalik's comment for more info. To sum up – how does this work in MdDP's case? An external library converts MD to HTML (string), then we take that string and pass it through HtmlDP, which takes care of normalising spaces (the MD->HTML converter can try to format the HTML – e.g. add indentation – and this must be cleared).
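To make the "formatting must be cleared" step concrete, here is a rough string-level sketch. It does not reproduce HtmlDP. `stripFormattingWhitespace` is a hypothetical helper, and the sample converter output is made up for illustration:

```javascript
// Hypothetical helper, for illustration only; this is a crude approximation.
// Removes whitespace-only runs that contain a line break and sit between two tags,
// i.e. the newlines and indentation a MD->HTML converter may add for readability.
// Spaces between tags on the same line are kept, because they can be significant.
function stripFormattingWhitespace( html ) {
    return html.replace( />[ \t]*\n\s*</g, '><' ).trim();
}

// Made-up example of indented converter output.
const converterOutput = '<blockquote>\n  <p>foo bar</p>\n</blockquote>\n';

console.log( stripFormattingWhitespace( converterOutput ) );
// logs: <blockquote><p>foo bar</p></blockquote>
```

A real implementation would have to tell block elements from inline ones, which is easier to do on a parsed tree than on a string.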
This gives the impression that all DPs will end up doing the same trick that the MdDP did, even if just for space normalization... the thin line between flexibility and over-engineering.
I don't feel we are over-engineering anything. We are clearing whitespace at the only possible and sane place in the code. Please take a look at the discussion here: https://github.com/ckeditor/ckeditor5-engine/issues/379#issuecomment-249537890 To quickly sum it up:
This leads to two conclusions:
I understand that the MD processor works the way it does because @szymonkups wanted to use an existing library, which is fine. It certainly feels like other processors may follow this way. Why don't we want to keep trash in the view?
In other words, it's easier for us to work on a clean model and view and keep the data that the user input there. The less magic we do, the better. Now we know that all spaces in the model and view are important and the renderer only has to change some of them to
The above is what I was referring to, when I talked about flexibility.
This doesn't solve the problem that you brought to discussion:
Anyway, there is nothing to be changed in the approach you guys want to take. Let's just assume that we are eventually complicating things a bit... but really just a bit ;)
By complicating you mean doing it right? :) Edit: I am not mocking you, I really don't see how we can do it right any other way... and I don't understand your approach. Maybe I am missing something, but AFAICS what you propose would just bring more problems.
I was ready for a "Whatever" comment :D
We encountered several issues when we tried to 'sync' up the same editor data across multiple windows (or could be tabs for other users). For example, user-entered paragraphs would go missing in mozilla/notes#407, and list styles would get confused due to missing line breaks that signify a "new list": mozilla/notes#421. I had to fork the plugin and make
Doesn't it mean that you switched to storing HTML and need this data processor only to load legacy data?
@Reinmar A lot of users asked for "Markdown" support so we want to process their markdown input such as
But do you mean that those users enter markdown in the visual mode? Like with the autoformatting? Because if that's only about new content, then autoformatting should be enough. If it's about loading content created with the previous editor, then vladikoff/ckeditor5-markdown-gfm@407840c looks fine.
There has been no activity on this issue for the past year. We've marked it as stale and will close it in 30 days. We understand it may be relevant, so if you're interested in the solution, leave a comment or reaction under this issue.
We've closed your issue due to inactivity over the last year. We understand that the issue may still be relevant. If so, feel free to open a new one (and link this issue to it).
Here's a list of issues that I've found while working on the initial implementation of the data processor. Some of the issues are just differences between the HTML rendered by GitHub and by the marked library.

GitHub sometimes adds new lines between rendered HTML elements, but the marked library does not. This is actually not a big deal (I think), since we don't want text nodes in our view that contain only a newline character. For example, this blockquote:
> foo bar
Is rendered as:
but the marked library renders it as:

Every paragraph is preceded by two newlines. This markdown:
this is paragraph # this is header
is rendered by GitHub as:
but marked renders it as:

Additional newlines are also added after rendering each <br/> element:

GitHub adds a newline at the end of each rendered code block:
Where marked renders without it:

To-markdown does not process empty code blocks, so this from the view:
will be converted to an empty markdown string.

All reference links are correctly converted to <a> elements. But when converting back, they are represented not as reference links but as direct ones. For example:
Is converted to view element:
So when it is converted back to markdown it looks like this:

Marked trims white spaces from list items. For example:
is rendered on GitHub as:
but marked trims white spaces from the beginning:
The same happens with ordered lists.

Some strange things happen with nested strong and emphasis. For example, this markdown:
is rendered by GitHub as:
but marked renders it as: