Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try adding more permissive block validation modes #19188

Closed
wants to merge 7 commits into from

Conversation

gwwar
Copy link
Contributor

@gwwar gwwar commented Dec 17, 2019

One option as part of iterating on block-invalidation #7604 is to create a less strict validation mode. As a first pass I'm calling this "strict" vs "no-save-error".

  • "strict" follows current behavior, where we try and see if blocks serialize mostly exactly to expected output
  • "no-save-error" which basically allows anything provided that the block save does not throw an error. (mimics behavior of "attempt block recovery" without prompting the user)

Screen Shot 2019-12-18 at 1 41 35 PM

I've added unit tests that mimic real bug reports I've spotted in the wild.
Screen Shot 2019-12-18 at 12 54 14 PM

  1. Most folks don't understand what this error means
  2. This error is almost all of the time is not caused by the user. (Can be a side-effect of older customizations that modify post content directly, downgrading editor versions, and so on).

New Option (Not Final)
Screen Shot 2019-12-18 at 12 34 12 PM

strict no-save-error
Screen Shot 2019-12-18 at 12 35 16 PM Screen Shot 2019-12-18 at 12 35 51 PM

Overall I'm thinking there are two main cases for block validation:

Block Development

  • Current behavior makes a lot of sense while creating new blocks. It's very easy to mess up serialization, and data hydration without these guards and warnings in place.
  • In the process of creating a new block, can I add a block, modify the it, and reload the editor without the editor complaining in a tightly controlled environment?
  • We shouldn't have any surprises here with attributes, and it can help catch non-deterministic behavior like surprises with floating point rounding errors (we should round to some precision value).

Production or For Everyday Use

  • We should be more permissive for this use case. Basically when someone sets up a site (likely with misc customizations that are not Gutenberg aware) it's incredibly easy for block invalidations to be triggered on trivial changes, by adding unexpected attributes or CSS classes
  • For this first pass, I think letting the block try to salvage and update serialized content here is fine.
    Post updates still require a manual save, and folks can use revision history to roll back if needed. With the current data structures, it's in my opinion too expensive to define what is a mostly harmless change, and what is not, even if said changes look trivial to human eyes. This would require faster and more accurate parsing and likely very strict definition of types.
  • We should only show this UI when the block is broken and cannot properly serialize (throws an error on save). At that point it's reasonable to ask if we should change this to an HTML block or try to convert to something else.
  • For super-users that want control, using HTML blocks directly is still possible
  • Folks of course might note that this is too permissive, but I'm open to discussion.
  • Validation, here I suspect is also not very performant on large posts. I'll see if I can move the sanity check (see if things throw on save) async if folks are open to this, as a follow up.

Testing Instructions

  • Install the Classic Editor Plugin
  • Add some invalid block content to a post and publish. An easy one is adding an extra CSS class or modifying a header block and updating the "levels" attribute to a different number.
  • Load the post in the block editor, verify that we see the block invalidation warning
  • In the Editor Options Modal > Toggle "Disable Block Validation"
  • Reload the editor by pressing refresh
  • Verify that the Block is valid
 npm run test-unit ~/yourcheckout/gutenberg/packages/blocks/src/api/test/validation.js

@gwwar gwwar self-assigned this Dec 17, 2019
@talldan talldan added [Feature] Block Validation/Deprecation Handling block validation to determine accuracy and deprecation [Type] Enhancement A suggestion for improvement. labels Dec 17, 2019
@gwwar gwwar force-pushed the try/block-validation branch 3 times, most recently from 4f1ac4b to 343eedf Compare December 18, 2019 21:07
@gwwar gwwar changed the title WIP: Try adding more permissive block validation modes Try adding more permissive block validation modes Dec 18, 2019
@gwwar
Copy link
Contributor Author

gwwar commented Dec 19, 2019

Leaving a note for myself to experiment with when the block serialization "wins" for mismatching block attributes vs serialized content.

@kwight
Copy link

kwight commented Dec 19, 2019

Tried editing block posts in the classic editor to add unexpected classes and attributes; worked well!

@gwwar gwwar marked this pull request as ready for review December 20, 2019 01:46
@bfintal
Copy link
Contributor

bfintal commented Dec 20, 2019

Tried editing block posts in the classic editor to add unexpected classes and attributes; worked well!

But won't those unexpected classes & attributes get stripped out once the block is updated? Won't that lead to unwanted surprises to the user?

@mmtr
Copy link
Contributor

mmtr commented Dec 20, 2019

Tried editing block posts in the classic editor to add unexpected classes and attributes; worked well!

But won't those unexpected classes & attributes get stripped out once the block is updated? Won't that lead to unwanted surprises to the user?

That's right. For instance, the following markup:

<!-- wp:heading -->
<h4 class="heading" data-attr="h4" style="font-style:italic;">Title</h4>
<!-- /wp:heading -->

changes to the markup below when loading the post on the block editor:

<!-- wp:heading {"className":"heading"} -->
<h2 class="heading">Title</h2>
<!-- /wp:heading -->

But I think this is kinda expected since this is what actually "attempt block recovery" does and we're trying to mimic behavior here. But yeah, a user might expect the previous block content to be preserved when disabling the block validation so I wonder if a different name for the option (something like "auto-recover invalid blocks") would help setting clear expectations to users.

@gwwar
Copy link
Contributor Author

gwwar commented Dec 20, 2019

@bfintal we see these validation errors because the HTML doesn't match block attributes. So without user prompting we then have a choice of preferring the HTML markup, or the block attributes and letting the block re-serialize. In this PR, especially since the user is explicitly editing in the block-editor, I'm proposing that we favor the Block Attributes and let the block do it's best job at salvaging the block content.

Won't that lead to unwanted surprises to the user?

71122479-8bf2a480-2195-11ea-9ed5-63b1c3d8f437

It might lead to markup changes, but as is, I feel like the current messaging and warning both looks scary to folks and that many don't understand the options and tradeoffs presented. Like why did this break, and what do these buttons do? "attempt block recovery" is usually the best option in many cases, but this one is buried under the more options dialog.

Updates to make the validation smarter and a little less strict are also very sensitive to performance. See an example in the following PR of what this looks like:

#19207

so I wonder if a different name for the option (something like "auto-recover invalid blocks") would help setting clear expectations to users.

@mmtr totally open to names and folks brainstorming on alternative approaches. I shouldn't be allowed to name anything.

@gwwar
Copy link
Contributor Author

gwwar commented Dec 20, 2019

As a general note I'll be out for the next two weeks, so folks are welcome to help drive this if the community is interested in this approach. Otherwise I'll be back in the new year 💖

@bfintal
Copy link
Contributor

bfintal commented Dec 23, 2019

@gwwar & @mmtr since any manual edits (that would cause invalidation) would always get removed, it may be a good idea to disable the "Edit as HTML" block option entirely when this is turned on.

@youknowriad
Copy link
Contributor

We've been very resistant to introduce such behavior in previous occasions because block authors will be tempted to always use it in their blocks and forgetting that markup can change over time which means previously saved versions of these blocks could be broken (style-wise). Handling deprecated versions will be completely ignored which means potential harm for the end-user in favor of the developer. I agree that the current behavior is more constraining.

Let's try to get thoughts from other people here @mtias @aduth

@gziolo gziolo added the [Feature] Extensibility The ability to extend blocks or the editing experience label Dec 23, 2019
@aduth
Copy link
Member

aduth commented Jan 2, 2020

Can you clarify: Do you expect this is an option that would be used by end-users moreso than by block developers? It's a surprising take to me, since based on the discussion from #7604, my reading was that it's the plugin authors which tend to struggle with this behavior in the course of implementing their blocks. I would tend to agree with you that block developers should want these warnings to serve as indications that a save failing to rehydrate on the next edit can cause data loss and is worth attention, but it's this rationale which has applied to both developers and end-users as the reason it's so strict in the first place.

We should be more permissive for this use case. Basically when someone sets up a site (likely with misc customizations that are not Gutenberg aware) it's incredibly easy for block invalidations to be triggered on trivial changes, by adding unexpected attributes or CSS classes

The thing here which is alarming to me is not that the editor warns about them, but that it is so easy to introduce the invalidations, And as such, disabling the validation seems to mask an underlying problem. You mention a few general examples, but: (1) are these actions often taken by users, or faults of a specific piece of code? and (2) are there some concrete cases you can share?

While I sympathize with the frustrations surrounding the block invalidation, I have to imagine that the alternative would be much worse: that by bypassing an invalidation, it would make data loss a certainty, since the invalidation is a direct result of a procedure which checks that the editor can recreate the same content from source.

I'm glad that we are exploring improvements to this, and I'd say that if this is a path we continue to explore, we would want to be very considerate of how we communicate the ramifications to the user. You mention that a block invalidation can be a jarring and confusing experience for users, but likewise I have to think that if we present this as an option, it won't be very obvious what repercussions could result, and that most people will select it as simply to "make the warnings go away". At which point, if we're okay with that, it begs the question why the validation ought to exist at all (other than for development). I'm not arguing that we should remove it, but as a thought experiment it seems it could be a natural conclusion of the proposal.

@gwwar
Copy link
Contributor Author

gwwar commented Jan 2, 2020

Do you expect this is an option that would be used by end-users moreso than by block developers?

In my opinion, I think we should be strict during block development and remove this in production. Most folks are confused by this and treat this more as "the editor is broken". Any view of this in the wild tends to get categorized as a high priority bug, for what is often a very trivial data error.

In addition to this, I think we can make block migration and bulk block upgrades easier for folks to wrangle, but those would be separate PRs.

The toggle here is mostly to help display the proof of concept. So, as a TLDR; I don't think end users should ever see this, but it's useful during block development to catch common coding mistakes.

You mention a few general examples, but: (1) are these actions often taken by users, or faults of a specific piece of code? and (2) are there some concrete cases you can share?

Usually never by the user. The main cause is usually older custom code and the block editor making the incorrect assumption that nothing else modifies post-content. It's also easy to create this situation when copying content to other sites when block plugin versions don't match.

I have plenty of WordPress.com specific examples, which I'd be happy to share, but I'm not sure if it'd be appropriate to link here.

@vietvho
Copy link

vietvho commented Jan 3, 2020 via email

@aduth
Copy link
Member

aduth commented Jan 6, 2020

@vietvho I don't disagree that this should be made to be simpler, and it's efforts such as this one which are exploring improvements to this experience (as well as many previous pull requests which improve validation leniency, such as #19207, #7538, #13512, #11771, #10474, and others).

The problem is and always has been a technical one: The block editor cannot make sense of the saved post content, and if it tried to regenerate it, it would destroy the user's content (@bfintal summarized this quite well at #7604 (comment)). Since the block's code has changed from how it was when the post was originally saved, it is impossible to interpret meaning from the previous content (at least not without some indication of how the changes affect the content, in our case using a block deprecation). We choose to prevent the destruction from happening and present the user with a warning message.

Tying it back to the current conversation, I think some of the options to explore will continue to be:

  • Improve communication and action steps surrounding an invalid block. Per @gwwar's comment at Try adding more permissive block validation modes #19188 (comment), I agree in the expectation that an invalid block will appear as an error to many users, when often it does not need to be such a scary event. The problem is, even when many times these are trivial issues to resolve, from a framework level we often cannot categorize what is and isn't trivial so far as what content destruction we deem allowable. But sometimes we can, in which case...
  • ... we can consider those as additional "tolerances" in the validation procedure, such as the examples above. I think for these, we need more data for the sorts of invalidations which occur where it is very commonly a trivial issue that we can programmatically resolve. I think we will find that often what we consider to be trivial is not always trivial. For example, one of the ideas floated in the past is to consider differences in a CSS class as negligible. However, even the simple addition or removal of a CSS class could have a dramatic negative impact on how the content would be displayed on the front of a site, so it's not easy to call this a "trivial" tolerance.
  • Simplifying the developer experience surrounding block deprecations. I would grant that the current deprecation experience is cumbersome, albeit necessary. It was disappointing to me to see in Gallerys ids are saved as numbers #19163 how much code is necessary to simply enforce an attribute value as being numeric. Here, I would like to see some explorations around (1) allowing more of a block definition to be inherited from the current implementation of a block and (2) possible tooling for continuous iterations on a block in a single development session (something like a hot module replacement?).

I'm still not convinced that there is any value in allowing developers to "turn off" the validation. At best, the content is mangled in such a way that there is little difference to starting with a fresh post (or at least deleting and recreating the block) every time one makes a change in their block. At worst, it could mislead the developer to unknowingly introduce a breaking change to their block which would destroy (or at least invalidate) user content when the updated block is released.

@gwwar
Copy link
Contributor Author

gwwar commented Jan 6, 2020

I'm still not convinced that there is any value in allowing developers to "turn off" the validation.

Data (especially data in living formats) can be extremely messy in the wild, I'm still of the opinion that the editor should try and recover as much as it can, in an opinionated way here since folks are explicitly using the block editor. Bugs will still surface, but severity will then naturally bucket into the appropriate categories. Instead of seeing this warning, a property might not stick.

Overall, we should aim at making the block development and maintenance less of a burden (It's very easy imo to get the in-memory representation, serialization and hydration wrong while building a block). And I hope I can help contribute to that in the future.

I'll close this out for now, and aim for a PR to see if folks are interested in changing the primary CTA. I don't think most folks actually want to convert content to an HTML block, or they would have specifically used one from the start.

@gwwar gwwar closed this Jan 6, 2020
@dmsnell
Copy link
Member

dmsnell commented Jan 6, 2020

This has been an insightful conversation.

The problem is and always has been a technical one:

To me this doesn't seem accurate. Parsing block attributes has been a technical problem but the fragility built into Gutenberg has been a problem for end-users and remains a psychological one. As @gwwar described, the need to disable the warning isn't about letting developers get past the gate - it's about giving the editors relief when every time they open their post their blocks are invalid-yet-legitimate.

That is, there's two different problems contained in this issue and I don't feel like it's right to dismiss the user-focused issue because the developer-focused one is harder to solve.

I'm still not convinced that there is any value in allowing developers to "turn off" the validation.

Originally I had recommended that we remove some hard-coding and install a filter into block validation. @aduth what are your thoughts about running a filter that gives site administrators the ability to bypass these invalidations? I can agree with you about block developers turning off validation but systems administrators and site-building agencies have legitimate reason to limit the fear and inconvenience the warnings send to their customers.

This isn't a hypothetical problem - the concrete issue is that currently valid blocks are being flagged as invalid when re-running the save() function automatically produces exactly what the user expected.

At best, the content is mangled in such a way that there is little difference to starting with a fresh post (or at least deleting and recreating the block) every time one makes a change in their block.

This reads a little hyperbolic. Experience tells us that at best it's like attempting to run the block given the possibly-corrupted-but-most-likely-benignly-differnt attributes. At best the output is the same for every meaningful purpose.


I'm sad that this continues to be such a pain point in Gutenberg. So much of this I think could be solved if we stopped considering validation/stopped looking at it as 100% or 0%. If blocks specify their own attribute parser then they can be so much more resilient to external changes and we could preserve non-conflicting external changes to the attributes. Something like a parser could obviate and simplify the deprecation interface.

// some block
parseAttributes( { attributes, jsonAttributes, sourcedAttributes } ) => {
  return {
    // prefer updates to JSON
    code: jsonAttributes.code || sourcedAttributes.code,
    // handle changes to attributes
    language: jsonAttributes.language || jsonAttributes.mode,
    // more complicated parsing - hard to capture in a schema
    classes: intersection( sourcedAttributes.classes, [ 'code-block', 'code-js', 'code-php' ] ),
  }
}

Nevertheless, our lack of providing an escape hatch to site operators means we're forcing the end-users to deal with unnecessary and unnerving nags until we have found a technical solution that we like. Do you have objections to opening up such a hatch with a filter in the meantime (one that the project can officially leave unimplemented)?

@gwwar
Copy link
Contributor Author

gwwar commented Jan 6, 2020

Here's my follow up on moving around some CTA options here: #19449

@gwwar
Copy link
Contributor Author

gwwar commented Jan 6, 2020

Something like a parser could obviate and simplify the deprecation interface.

+1 and also adding block boundaries. We often run into trouble with runaway regexes (older customizations) also trying to helpfully transform content in block markup in unexpected ways.

Also was there any past discussions on why we don't have block versioning?

I probably need to provide a code example later but it might be worth an exploration for letting blocks version (with either explicit or implicit schemas), transforms can then happen in a chained pipeline. Eg one transform function only needs to worry about changing a data shape from A to B, and we might have another transform function that changes something from B to C. Changing a shape from A to C then, is just a matter of running those functions together. The benefit to this is that specific transform functions can also do a best guess for missing data or defaults. Add in a server piece, and we can then declutter and remove deprecations from client code, while only running the transforms it needs to, while either preparing block content or running a mass migration. I've built similar things in the past for other editors.

@mtias
Copy link
Member

mtias commented Jan 7, 2020

In my opinion, it tends to become unproductive to discuss the ontological nature of validation at large. Offering ways to diverge (escape hatches and so on) at a site level seems prone to end up causing more issues for end-users than it'd alleviate in the current situation. Minimizing the occasions a user sees one of these dialogs while ensuring there is no meaningful data loss remains the heart of the issue. We should attempt to break it down and solve it generally as best as possible before offering ways to short circuit it, which will necessarily add overhead.

The main problem is that when a user sees one of these messages they have no clue what to do. This will remain true no matter what we consider to be the right defaults. We have attempted to improve the user experience when things fail by providing diffing (toyed with visual diffing as well) and various options to choose from. Arguably, this will never be better than making an educated guess for the user and providing a way to roll back if things are not what they expected — which is worth emphasizing, will happen no matter what we do.

This isn't a hypothetical problem - the concrete issue is that currently valid blocks are being flagged as invalid when re-running the save() function automatically produces exactly what the user expected.

Defining what the user expected can be hard, and is what @aduth alludes to. The original balance was tuned to limit data loss of any kind while the block editor was being developed. It stands to reason that those levers can be tweaked now in a more pragmatic and less purist way. This requires at the very minimum to compile a list of situations where validation fails and we think that it shouldn't (because the change is considered trivial, not meaningful user data, and so on), otherwise it feels we are talking about all or nothing dichotomies. (I also remember discussing a threshold of amount of changes to be considered before showing something to the user.)

Maybe then we see that it can indeed be conceptualized as a technical problem if it's possible to discriminate between cases where changes don't matter.

Other thoughts that come to mind are ways to tie expectations of shape-validation with the elements used in save attributes, which is where the block author can provide more information about their intentions. For the sake of discussion, if there was a component called <BlockWrapper> there's more we can assume about incidental differences compared to what we can infer from a generic <div>.

Also was there any past discussions on why we don't have block versioning?

Quite a few. Generally, Gutenberg wants to avoid versioning leaking to user space and its content (that is, marking the current version for a block in comment attributes or similar mechanisms) in any capacity. The array of deprecations is essentially a list of versions to attempt to match with but it doesn't have an explicit reference to the current shape of the content. Would be great to discuss some other ideas if you have thoughts on it!

@aduth
Copy link
Member

aduth commented Jan 7, 2020

@dmsnell It's disappointing to me if a reading of my comment would lead one to the conclusion that I am dismissing the user-facing impact of this problem. To be clear, I believe this is of critical importance. There is still value in outlining the technical reasons for the situation we're in today, in that I would hope for it to provide some context for discussing a solution in a way which doesn't lose sight of the ultimate goal. I too would be much happier if I would never be presented with an invalidation warning. In these conversations, we have a tendency to minimize its reason for existing; that the majority of cases amount to "negligible differences". But what about the situations where it is not negligible? Should it be the responsibility of the framework to make such a judgment call, and on what basis?

The problem I see with providing an option to disable the validation (whether that be presented to end-users or developers) is that it may train users that it is safe to ignore those disparities. But what do we expect should happen in a situation where the differences are not intended, and not negligible? Assuming that the user has noticed the impact (considering that the framework is no longer informing them), I personally feel this would leave them in a much more distressing situation than the one we're seeking to address, because (a) it's more likely to be noticed only after the fact (after the content changes are published) and (b) they were never informed about the change, nor are they presented with any options for resolving the problem.

To your point about "100% or 0%", we might consider this validation disabling as one of those two extremes. Even in writing this, I can see how there might be some more middle-ground approaches to add to what's already been discussed. For example, we could consider some way to allow all or more of the invalidations to pass through, while still providing some sort of cue to the user in a way which is more informative ("this happened") than it is consequential ("you must make a choice"), optionally with follow-up actions for when it was not an intended effect.

@gwwar
Copy link
Contributor Author

gwwar commented Jan 7, 2020

Thanks for the feedback @mtias @aduth and @dmsnell ! I think we can probably say skipping or disabling validation here (even as a custom filter) doesn't seem tenable to folks at the moment. It is nice that we all agree that this can be improved on for both users and the developer experience via different iterations 💖

Quite a few. Generally, Gutenberg wants to avoid versioning leaking to user space and its content

Oh interesting, @mtias any older PR threads or posts I could catch up on?

Would be great to discuss some other ideas if you have thoughts on it!

I would be happy to ideate a bit once I understand some of the technical design constraints we want to stick with.

As a start for clearer user options, I'll play around a bit with making the "resolve" modal provide more context in #19449

@youknowriad youknowriad deleted the try/block-validation branch January 9, 2020 09:01
@joshm33333
Copy link

joshm33333 commented Jan 12, 2020

just want to add my 2 cent here
i am an active supporting developer for a finite number of websites and clients. that means i am here and ready to solve problems. i often make small changes to blocks that the user would never notice.

i would ideally see one of two solutions, in my own perfect world anyway:

the best solution being a way to recover blocks with server side code. what i really want is a recover all button, but of course that wouldn't benefit the user. in wordpress's current state, there is absolutely no way to parse invalid blocks. or is there.. ? do let me know if i missed something big here.

the second best solution would be much easier. Just pull the recover button out of the dropdown. As a developer, i dont EVER want my users looking at html. most of them are 70 years old and will fire me if they get confused. if i could use css to disable every option except recover, that would solve the problem for me, but because of the way the dropdown is built, this isnt possible... once again please correct me if im wrong :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Feature] Block Validation/Deprecation Handling block validation to determine accuracy and deprecation [Feature] Extensibility The ability to extend blocks or the editing experience [Type] Enhancement A suggestion for improvement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.