WP_HTML_Block: keep innerHTML as a string #2
HTML blocks are essentially unidentified, unsupported, or simply DIY blocks. As such, we don't want to parse their contents. Instead, we want to preserve the contents verbatim.
@mcsf I'm not sure how I feel about this approach, which ends up doing more application-level parsing. It makes me nervous about feature creep and bloat, though I'm not diametrically opposed to it. The parser (or maybe we should more aptly call it a lexer at this point) already handles
{
type: 'block',
blockType: 'html',
value: '…'
} |
@dmsnell Isn't the current parser trying to break the HTML block into children, etc.?
{
"type": "HTML_Tag",
"name": "div",
"attrs": {
"class": "custom-stuff"
},
"startText": "<div class=\"custom-stuff\">",
"endText": "</canvas>",
"children": [
{
"type": "Text",
"value": "\n "
},
{
"type": "HTML_Tag_Open",
"name": "canvas",
"attrs": {},
"text": "<canvas>"
}
]
}
I was thinking we would parse HTML blocks as a single string value. |
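To make the "single string value" idea concrete, here is a minimal sketch of a lexer that captures everything between a block's delimiters verbatim instead of descending into it. The `<!-- wp:NAME -->` and `<!-- /wp:NAME -->` delimiters, the `lexBlocks` name, and the `text` node type are assumptions made for illustration; the thread does not spell out the delimiter syntax, and this is not the project's actual parser.

```javascript
// Hypothetical delimiter syntax: <!-- wp:NAME --> ... <!-- /wp:NAME -->
// The backreference \1 requires the closing name to match the opening name.
const BLOCK_RE = /<!--\s*wp:([a-z][a-z0-9-]*)\s*-->([\s\S]*?)<!--\s*\/wp:\1\s*-->/g;

// Split a document into plain text nodes and block nodes.
// The inner HTML of each block is kept as one unparsed string.
function lexBlocks(source) {
  const nodes = [];
  let lastIndex = 0;

  for (const match of source.matchAll(BLOCK_RE)) {
    const [whole, blockType, inner] = match;

    // Anything between two blocks is passed through as text.
    if (match.index > lastIndex) {
      nodes.push({ type: 'text', value: source.slice(lastIndex, match.index) });
    }

    // No HTML parsing happens here: `value` is the raw inner string.
    nodes.push({ type: 'block', blockType, value: inner });
    lastIndex = match.index + whole.length;
  }

  // Trailing text after the last block.
  if (lastIndex < source.length) {
    nodes.push({ type: 'text', value: source.slice(lastIndex) });
  }
  return nodes;
}
```

Because the inner content is never tokenized, malformed markup such as an unclosed `<canvas>` survives untouched. The trade-off is that this sketch cannot handle a block nested inside another block of the same name, since the lazy match stops at the first matching closer.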
@mtias Yeah, we're parsing inside of the blocks and allowing for nested blocks. In Slack, @mcsf and I discussed something I did with the Simplenote parser, which I plan to add here, which is a |
Whether we pre-parse it depends very much on what the plan is for the HTML fragment. Is it used as the ultimate source of truth? Is it used as a cache for the rendered component? Or sometimes one, sometimes the other? Do we need to parse it back to structured data? What should the experience of a developer writing a block be? |
@nb Good question. Personally, I like to consider the stored post content as the serialized form of the data, which happens to use a bulky and displayable syntax: HTML. At this point I wouldn't see much reason why we couldn't also enforce a certain well-formedness in that HTML to keep the experience smooth. The developer's job would be to guarantee that nothing is funny about the HTML he or she chooses to serialize. Still, I think that having the structure is more valuable than having the raw text. If I were writing some block I would prefer to check something like |
What happens if somebody edits the HTML by hand?
Does this happen on the server-side?
This decision has huge implications for both developer and user flows. If the HTML fragment is not the ultimate source of truth, what happens if a user in a legacy editor changes it? Do we reject their changes, do we warn them, do we disallow editing in a block-enabled editor? |
When would it have to? If all the changes to a post are persisted by way of generating HTML from the blocks node tree and saving that HTML to
That was the main decision from the start: the HTML definitely is the source of truth, for all sorts of compatibility (back-, forward-) reasons, to properly and minimally degrade. |
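If the HTML is the source of truth, saving a post amounts to turning the node list back into HTML. The following is a rough sketch of that step, assuming nodes shaped like the `{ type: 'block', blockType, value }` object earlier in this thread, a hypothetical `text` node type for content outside any block, and hypothetical `<!-- wp:NAME -->` delimiters. The `serialize` name is also made up for illustration.

```javascript
// Turn a flat list of nodes back into post HTML.
// Block delimiters here are an assumed syntax, not one confirmed by the thread.
function serialize(nodes) {
  return nodes
    .map((node) => {
      if (node.type === 'block') {
        // `value` is emitted exactly as stored, with no re-encoding or cleanup.
        return `<!-- wp:${node.blockType} -->${node.value}<!-- /wp:${node.blockType} -->`;
      }
      // Non-block content is passed through unchanged.
      return node.value;
    })
    .join('');
}
```

Since `value` is written out untouched, whatever a user typed by hand into an HTML block comes back out the same way, which is the property that keeping `innerHTML` as a string is meant to protect.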
Here’s a challenge – is there a way to retain most of the user properties without sacrificing developer experience? If we think of a component as
For me, a developer writing the code for a new block, writing a |
Closing since this is old and I'm not sure it's relevant anymore… feel free to reopen |