Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor: ChangeType in TextChange #486

Closed
JordanMartinez opened this issue Apr 10, 2017 · 28 comments
Closed

Refactor: ChangeType in TextChange #486

JordanMartinez opened this issue Apr 10, 2017 · 28 comments

Comments

@JordanMartinez
Copy link
Contributor

To address #322, I reimplemented the solution in #402 by using more readable code via ChangeType, which was through lazy evaluation initially. This caused regressions, as discovered in #472, and was later evaluated immediately in #470 to help find these regressions. Besides simplifying the logic, I used ChangeType because I thought developers would be able to use it in their code to quickly determine what kind of change the change was. By doing this, I realize now that I was solving the first problem and indirectly trying to solve another problem (adding convenience methods to TextChange that allows one to quickly determine what kind of change it was).

And that is the issue: my solution to the second problem was not fully thought out because it was not the issue I was trying to solve. I didn't realize it then (but do realize it now because of the regressions, both known and yet unknown, that have appeared since 0.7-M3) that a RichTextChange can be a number of different changes (insertion, deletion, replacement, text style change, or a paragraph style change)--change tyes which ChangeType does not cover.

So I see two options for addressing this bug:

  1. Refactor the code to always create a TextChange that can get exactly what kind of change type it is
  2. Change the way the code knows that a TextChange is an insertion or deletion type of change and thus whether to merge it with another TextChange.

I'm against the former because it will be more invasive to the current API and using a TextChange does not always require knowing what kind of change it is. For example, sometimes one just needs to know how much to adjust the caret based on how much content was removed and inserted.
I think the latter is best because it will immediately address all possible (known and currently unknown) regressions related to #458.

To still maintain code readability, I don't think we should just implement methods like TextChange#isInsertion and TextChange#isDeletion. Rather, I think ChangeType (which I think needs to be better named since it doesn't address all the changes a TextChange could be and since it pollutes the namespace that developers might want to use later in their own code) should be as follows:

// Enum needs a better name!
public class ChangeType {
    INSERTION,
    DELETION,
    NEITHER // indicates it is neither an insertion/deletion

And then the TextChange#mergeWith code would still read:

// this is either a insertion/deletion
if (this.getType() != ChangeType.NEITHER 
    // the two types match: both are insertion or both are deletion
    && this.getType() == that.getType()) { 
    // merge code
}

Unfortunately, I can't think of a better name to address the namespace pollution issue I raised above. Does anyone else have a better name?

However, #472 did show that sometimes RTFX is inefficient because it tries to make an empty replacement. I wonder if there are others areas that could be further optimized so that replace always replaces something and ignores calls that ultimately do nothing. For example, ignoring "style change" calls that don't actually restyle the content because that content already has that style.

@alt-grr
Copy link

alt-grr commented Apr 11, 2017

Does anyone else have a better name?

NONTEXTUAL? FORMATTING?

@JordanMartinez
Copy link
Contributor Author

Thanks for the ideas. However, neither makes much sense to me at first glance. And if one needs to explain it, that's also problematic.

As I thought about this today, I thought the nature of the data is the type of change stored. However, that data is used is to determine whether a merge should occur. So I thought, why not call it MergeType?

@JordanMartinez
Copy link
Contributor Author

@kuc Before merging 487, I'd be curious to hear your thoughts on the naming.

@alt-grr
Copy link

alt-grr commented Apr 12, 2017

@JordanMartinez OK, I will look on it in a free moment...

@JordanMartinez
Copy link
Contributor Author

@kuc Thanks for your feedback! Closed by #487.

@JordanMartinez
Copy link
Contributor Author

@TomasMikula #487 should resolve any known and unknown regressions introduced by #458. Could you make a new milestone release?

@TomasMikula
Copy link
Member

The choice that only two insertions or only two deletions will be merged is somewhat arbitrary.
(The previous implementation was not arbitrary in the sense that it was an extremal choice: it merged whenever possible.)

We should avoid making arbitrary choices, at least in the core.

I understand that this is in an attempt to provide more reasonable default behavior. This should be possible to do on top of the universal core, though. Specifically, limiting merges by type of change can be implemented in the function passed to undo manager constructor (it doesn't have to just directly invoke PlainTextChange::mergeWith). On the other hand, users will not be able to reuse the new PlainTextChange::mergeWith if they do want to merge an insertion followed by a deletion.

Aside, I'm skeptical that MergeType will be of much use to the user.

@JordanMartinez
Copy link
Contributor Author

The choice that only two insertions or only two deletions will be merged is somewhat arbitrary.
(The previous implementation was not arbitrary in the sense that it was an extremal choice: it merged whenever possible.)

We should avoid making arbitrary choices, at least in the core.

I understand that this is in an attempt to provide more reasonable default behavior. This should be possible to do on top of the universal core, though. Specifically, limiting merges by type of change can be implemented in the function passed to undo manager constructor (it doesn't have to just directly invoke PlainTextChange::mergeWith). On the other hand, users will not be able to reuse the new PlainTextChange::mergeWith if they do want to merge an insertion followed by a deletion.

In other words, #322 is (in some ways) a symptom of the developer not having complete control over how the undo/redo works. Thus, the approach of #402 / #458 is incorrect and something like #333 would need to be implemented (meaning, giving the developer more control as to how undo/redo works).

Aside, I'm skeptical that MergeType will be of much use to the user.

It wasn't supposed to be of much use, just provide the bare minimum functionality to fix #322 and fix any regressions introduced by #458 so that a milestone release could be made that has all the new functionality since 0.7-M3 without any of the regressions of #458. As stated above in my opening comment, we could make a TextChange store exactly what kind of change it is (option 1), but that would require more work. Since a regression-free milestone release was my goal, I didn't do it. However, I do think it would be worth implementing in a later release.

@JordanMartinez
Copy link
Contributor Author

Specifically, limiting merges by type of change can be implemented in the function passed to undo manager constructor (it doesn't have to just directly invoke PlainTextChange::mergeWith).

True, but this was private and in StyledTextAreaModel until #463. To expose it now, we just need to make it public and final. However, that is inefficient as the constructor will construct a default UndoManager using TextChange::changeWith, which will be overridden by the updated one via setUndoManager(UndoManagerFactory). An UndoManager cannot be created before the area is created (as #333 suggests) now that ESD is the model. So, it makes more sense to include a BiFunction parameter in the constructor that is used as the mergeWith method when the constructor calls create[Plain/Rich]UndoManger().

On the other hand, users will not be able to reuse the new PlainTextChange::mergeWith if they do want to merge an insertion followed by a deletion.

But if they do use the original TextChange::mergeWith, that will still cause #322 at some point. I'm guessing you're suggesting that we update the original version of that method to stop that from happening?

@TomasMikula
Copy link
Member

TomasMikula commented Apr 13, 2017

#322 has to be fixed independently of what merging strategy is implemented. We shouldn't get an exception with any strategy.

I see a couple of ways to fix #322.

Solution A

Notice that with the current arguments passed to UndoManagerFactory#create

    <C> UndoManager create(
            EventStream<C> changeStream,
            Function<? super C, ? extends C> invert,
            Consumer<C> apply,
            BiFunction<C, C, Optional<C>> merge);

the UndoManager has no way of knowing whether the result of merge is an empty change. When merge results in an empty change, the UndoManager has no other choice than to store it anyway and, when asked to undo, apply the inverse of that change (which will be empty as well). It then expects this (inverted) empty change to arrive in changeStream.

For RichTextFX to work properly with this behavior of UndoFX, the only possibility is to accept edits which are no-ops and emit corresponding empty changes.

This solution provides the most guarantees to clients (such as the UndoManager): every edit will result in a change event.

Solution B (requires UndoFX changes)

Another option is to give the UndoManager a way to know whether merge resulted in an empty change. This could be done for example by adding an extra parameter Predicate<C> isEmpty to the factory method create above. Undo manager would then test changes for emptiness and automatically discard empty changes without trying to apply them.


Note also that the two solutions can coexist; we could do both.


I would also like to mention that the set of changes resembles a group. (We don't quite have a group, because not any two changes can be merged.) What this means is that

  • merge is associative, i.e. for any changes a, b, c:
    merge(a, merge(b, c)) = merge(merge(a, b), c)
  • merging any change a with an empty change e leaves a unaffected:
    merge(a, e) = a
    merge(e, a) = a
  • every change has an inverse:
    merge(a, invert(a)) = e
    merge(invert(a), a) = e
    where e is the empty change.

These are useful properties to keep in mind when implementing merge.

@JordanMartinez
Copy link
Contributor Author

For RichTextFX to work properly with this behavior of UndoFX, the only possibility is to accept edits which are no-ops and emit corresponding empty changes.

This solution provides the most guarantees to clients (such as the UndoManager): every edit will result in a change event.

While I get the idea behind Solution A, accepting and emitting empty changes seems silly; an empty change, by its definition, isn't a change at all, just wasted memory/time. Isn't it just better to return immediately on calls of replace(n, n, /* empty document */) where 0 <= n <= length()? There might be good reasons to do a no-op replace call in a developer's project, but I can't imagine such a scenario.
On another hand, how long does it take for a no-op replace call to return? On very large documents, could this be a performance issue? (I doubt it would be too much of a hit because ReactFX's efficient FingerTree lies behind the implementation.)

At the very least, Solution B should be implemented because this issue will arise in any project that uses UndoFX (whether that be RTFX or something else).

@TomasMikula
Copy link
Member

When someone's calling replace(n, n, /* empty document */), it is likely because they didn't care to check whether the document is empty (it will not be empty in general, and they don't care for the few cases when it is). If they are going to do N replace operations, some of which potentially empty, they typically have to "allocate budget" for N full replace operations anyway.

But yeah, to give us more flexibility, I would leave it undefined whether empty changes will be emitted. Then users cannot count on empty changes being emitted, neither can they count on any emitted change being non-empty.

At the very least, Solution B should be implemented because this issue will arise in any project that uses UndoFX (whether that be RTFX or something else).

Not if there is a notion of an empty change and empty changes are being silenced.

But I agree it should be implemented anyway.

@JordanMartinez
Copy link
Contributor Author

When someone's calling replace(n, n, /* empty document */), it is likely because they didn't care to check whether the document is empty (it will not be empty in general, and they don't care for the few cases when it is).

Right, which is why the replace method can check if the document is empty and return immediately if it is. The user does not have to worry about that.

If they are going to do N replace operations, some of which potentially empty, they typically have to "allocate budget" for N full replace operations anyway.

"Allocate budget?" Not sure what you mean by that. Regardless, doesn't the "return immediately on empty document" approach deal with anyway?

Not if there is a notion of an empty change and empty changes are being silenced.

What do you mean by "empty changes are being silenced?" You seem to be implying that there are projects that might have an empty change (the first half of your statement). But it's the second half that leaves me puzzled.

@TomasMikula
Copy link
Member

TomasMikula commented Apr 13, 2017 via email

@JordanMartinez
Copy link
Contributor Author

CPU budget. That they have to account for the worst case: none of the operations being no-op.

Oh ok.

It should have been "Only if" instead of "Not if". By being silenced I mean not emitted.

Ok, I thought that's what you meant. Then, couldn't one return false on the UndoManager's isEmpty (or the perhaps better-named isNoOp) predicate if they didn't want empty changes being silenced:

Predicate<Change> isNoOpChange = c -> false;

@TomasMikula
Copy link
Member

Ah, my email reply didn't get formatted well. But you figured out what was quoted and what was new, anyway :)

Then, couldn't one return false on the UndoManager's isEmpty (or the perhaps better-named isNoOp) predicate if they didn't want empty changes being silenced:

Predicate<Change> isNoOpChange = c -> false;

Yes, but the discussion was about the current UndoFX. You said

this issue will arise in any project that uses UndoFX

and I said (well, meant to say) that only if that project has a notion of empty changes and silences them.

@JordanMartinez
Copy link
Contributor Author

Ah, my email reply didn't get formatted well. But you figured out what was quoted and what was new, anyway :)

Yeah... I was a bit confused about the formatting at first 😄

Yes, but the discussion was about the current UndoFX.... and I said (well, meant to say) that only if that project has a notion of empty changes and silences them.

Ah, yeah, I was talking about the updated version of UndoFX

@JordanMartinez
Copy link
Contributor Author

I don't have the time right now to re-read through this issue, but right now replace returns immediately if the replacement is an empty document. I'm simply making note of this so that, if that is an issue, I don't forget to remind everyone of it so we can fix it later.

@TomasMikula
Copy link
Member

Released UndoFX 1.3.0 with the required feature.

@JordanMartinez
Copy link
Contributor Author

Referring to your comment, I still think Solution A is silly, but I agree that its necessary in case others want to implement their own merge method. So, we'll need to remove the return statement that gets called if an empty document is the parameter in replace. After that, we can use the new "ignore identity changes" feature.

However, one question arises. In a RichTextChange, what is the identity change since style changes also count as a change? We can't use initialParagraphStyle or initialTextStyle because those can change after initialization.

Before I forget, I should reopen this issue.

@TomasMikula
Copy link
Member

I still think Solution A is silly, but I agree that its necessary in case others want to implement their own merge method.

Not if they also provide a compatible isIdentity function?

However, one question arises. In a RichTextChange, what is the identity change since style changes also count as a change?

Is there a problem with defining identity change as a change where the inserted document equals the removed document?

@JordanMartinez
Copy link
Contributor Author

Is there a problem with defining identity change as a change where the inserted document equals the removed document?

Ah, that would be it.

@JordanMartinez
Copy link
Contributor Author

Now that #495 is merged, should we now revert back to the previous merge method? Or is that the new issue to tackle?

@TomasMikula
Copy link
Member

TomasMikula commented Apr 28, 2017

I think that at the very least MergeType doesn't need to exist as a type.

I realize I had a bad memory of the previous implementation when I said

The previous implementation was not arbitrary in the sense that it was an extremal choice: it merged whenever possible.

It didn't merge whenever possible, but only when the next change continued at the place where the previous one finished. I think this was a somewhat reasonable default, and did not suffer from #493.

As a matter of principle, I still think providing the most extreme choice is a good idea, but doesn't need to be addressed now.

In the future, we might want to support non-contiguous changes, so don't spend too much time on something that might be replaced anyway.

@JordanMartinez
Copy link
Contributor Author

I think that at the very least MergeType doesn't need to exist as a type.

Agreed.

I think this was a somewhat reasonable default, and did not suffer from #493. As a matter of principle, I still think providing the most extreme choice is a good idea, but doesn't need to be addressed now.

So, what I'm reading is we should revert back to the original merge method now and figure out a more extreme choice later on. Those who don't want to use that method can implement their own.

In the future, we might want to support non-contiguous changes

How would that work?

@TomasMikula
Copy link
Member

How: The representation of a change would basically be a list of things we today call a change.

Why: Ability to represent global find/replace as a single undo item.

@JordanMartinez
Copy link
Contributor Author

How: The representation of a change would basically be a list of things we today call a change.

I assumed such would be the case when #222 gets implemented (List<TextChange<S>>), so that one could rename a method. But the context of the merge method and the way you phrased it sounded like you were going to do some voodoo magic or something 😆

@JordanMartinez
Copy link
Contributor Author

Closed by #496

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants