Skip to content

editing: move text input to document-space composition#595

Draft
waywardmonkeys wants to merge 3 commits intolinebender:mainfrom
waywardmonkeys:plain-editor-changes
Draft

editing: move text input to document-space composition#595
waywardmonkeys wants to merge 3 commits intolinebender:mainfrom
waywardmonkeys:plain-editor-changes

Conversation

@waywardmonkeys
Copy link
Copy Markdown
Contributor

Rework PlainEditor’s text-input/composition model so host integrations can operate in document UTF-8 byte space instead of emulating Parley’s old raw preedit internals.

This changes the internal composition model from a single hidden-preedit span to two distinct states:

  • hidden preedit text, which is excluded from PlainEditor::text()
  • visible composing regions over existing document text, which remain part of PlainEditor::text()

On top of that model, add a host-facing editing surface on PlainEditorDriver for:

  • setting document selection
  • setting composing regions
  • insert/replace with optional replacement ranges
  • atomic composition updates
  • surrounding delete in document space
  • explicit composition commit/clear
  • semantic delete-to-edge helpers for future edit-command integration

Move encoded-offset conversion over visible editor text into Parley itself via SplitString::to_utf8_range and TextIndexEncoding, so adapters do not need to allocate or duplicate range-conversion logic.

Also:

  • remove the legacy mutating compose APIs
  • replace raw_compose with composition
  • rename ime_cursor_area to text_input_area
  • migrate examples/vello_editor to the ui-events text-input path

@waywardmonkeys
Copy link
Copy Markdown
Contributor Author

This is related to endoli/ui-events#114 and linebender/xilem#1700

Rework `PlainEditor`’s text-input/composition model so host integrations can
operate in document UTF-8 byte space instead of emulating Parley’s old raw
preedit internals.

This changes the internal composition model from a single hidden-preedit
span to two distinct states:

* hidden preedit text, which is excluded from `PlainEditor::text()`
* visible composing regions over existing document text, which remain part
  of `PlainEditor::text()`

On top of that model, add a host-facing editing surface on
`PlainEditorDriver` for:

* setting document selection
* setting composing regions
* insert/replace with optional replacement ranges
* atomic composition updates
* surrounding delete in document space
* explicit composition commit/clear
* semantic delete-to-edge helpers for future edit-command integration

Move encoded-offset conversion over visible editor text into Parley itself
via `SplitString::to_utf8_range` and `TextIndexEncoding`, so adapters do
not need to allocate or duplicate range-conversion logic.

Also:

* remove the legacy mutating compose APIs
* replace `raw_compose` with `composition`
* rename `ime_cursor_area` to `text_input_area`
* migrate `examples/vello_editor` to the `ui-events` text-input path
@nicoburns
Copy link
Copy Markdown
Collaborator

This changes the internal composition model from a single hidden-preedit span to two distinct states:

  • hidden preedit text, which is excluded from PlainEditor::text()
  • visible composing regions over existing document text, which remain part of PlainEditor::text()

My understanding is that the "visible composing region" corresponds to the pre-existing functionality, and the "hidden preddit text" is new in this PR? For my understanding: what is the use case for a hidden pre-edit / when does this happen?

Move encoded-offset conversion over visible editor text into Parley itself via SplitString::to_utf8_range and TextIndexEncoding, so adapters do not need to allocate or duplicate range-conversion logic.

This makes a lot of sense to me. A pain that we have to deal with UTF16, but we do.

@waywardmonkeys
Copy link
Copy Markdown
Contributor Author

This changes the internal composition model from a single hidden-preedit span to two distinct states:

  • hidden preedit text, which is excluded from PlainEditor::text()
  • visible composing regions over existing document text, which remain part of PlainEditor::text()

My understanding is that the "visible composing region" corresponds to the pre-existing functionality, and the "hidden preddit text" is new in this PR? For my understanding: what is the use case for a hidden pre-edit / when does this happen?

It’s actually the other way around: Parley historically only had the hidden-preedit model. This PR adds visible composing-region support so set_composing_region can correctly mark existing document text as composing without hiding or deleting it.

Hidden preedit is still needed for the common IME draft-text flow (winit/web/AppKit/UIKit composition snapshots), where temporary composition text should affect layout/caret/underline behavior but should not yet be exposed as committed document text:

  • the IME is showing temporary composition text
  • it is not committed document content yet
  • canceling composition should remove it
  • committing composition should make it normal document text

@nicoburns
Copy link
Copy Markdown
Collaborator

Hmm... I see. I think I was confused because in the existing mode, the text is usually rendered as visible on screen. Perhaps there is a better naming scheme for this? "temporary" vs. "committed"? "staged" vs. "comitted"?

Same question (for my intuition) around "when does this happen"? Is this for things like "just comitted word is marked as still marked as a compose area so that things like spellcheck can act on it"?

@waywardmonkeys
Copy link
Copy Markdown
Contributor Author

I think I was confused because in the existing mode, the text is usually rendered as visible on screen.

Yeah, understandable. Hidden from PlainEditor::text(). Although some desktop environments allow you to hide pre-edit text visually, I think.

Copy link
Copy Markdown
Member

@DJMcNab DJMcNab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems pretty reasonable to me. I've not dug into the weeds of the semantics of visible text areas, and I feel like we still need to be careful about editing when there is a "hidden preedit". I've not carefully reviewed the example or test changes.

The utf-16 and such helpers are really nice.

I also don't hugely like the term hidden for this. I understand who it's hidden from, but it doesn't feel intuitive for the app to be writing things in terms of "hidden from the app".

pub text: &'source str,
/// The UTF-8 byte offset in [`PlainEditor::text`] where the composition is
/// currently located.
pub document_offset: usize,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to expose whether or not this Composition is transient here? I don't have a concrete case where it would be useful, but it feels a bit weird to have "this may be one or the other, and we don't tell you which one".

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a single platform or host could plausibly exercise both modes, because the distinction here is really about how the host describes the current composition state, not a permanent property of a platform. The common preedit-snapshot path maps to transient composition, while protocols that explicitly mark an existing document range as composing map to the visible-region case.

But this isn't yet an area where I understand the downstream needs well enough to predict what people would need (since we're in early days of building out some of this new pathway).

So I did consider exposing that distinction on Composition, but I held off because:

  1. I don’t have a concrete caller need yet, and
  2. I’m not sure “transient vs visible-region” is actually the most useful public question for embedders.

The API I wanted to preserve here was “what composing text is active, and where is it in document space?” If we later find that hosts genuinely need to branch on the composition kind, we can add that.

Comment thread parley/src/editing/editor.rs Outdated
Comment on lines +322 to +323
/// Delete the selection or to the start of the physical line.
pub fn delete_to_line_start(&mut self) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method name and docs combination have some dissonance. I see that it is the behaviour from doing , at least on this mac. I presume that there's an equivalent macOS event, which we should probably link to?
I unfortunately am struggling to come up with a concise name for "delete to the start of the line, unless the user has a selection in which case ignore that command and clear their selection instead"

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's a commonly implemented shortcut for this on Windows/Linux, even if it isn't interpreted by the system, we should probably mention this here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point. I’ve tightened the docs here so they say explicitly that these delete the current selection if there is one, and otherwise delete to the relevant boundary. I also called out the AppKit semantic command correspondence for the line variants, since that’s the clearest existing source for these operations.

Comment thread parley/src/editing/editor.rs Outdated

// --- MARK: IME ---
/// Set the IME preedit composing text.
/// Delete the selection or to the start of the physical line.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the term physical line here; I'd previously been using "soft line" for it.

Comment thread parley/src/editing/editor.rs
Comment thread parley/src/editing/editor.rs Outdated
/// This starts composing. Composing is reset by calling [`clear_compose`](Self::clear_compose).
/// Alternatively, the preedit text can be committed by calling [`finish_compose`](Self::finish_compose).
/// Returns `false` if the provided range is reversed, out of bounds, or
/// does not land on character boundaries in the document text.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume this is utf-8 character boundaries. If so, we should specify behaviour around landing inside grapheme clusters (or whatever the right term for "reasonable selection boundaries" is).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this is UTF-8 scalar-value boundaries, not a grapheme-cluster guarantee. I tightened the docs to say that explicitly.

Comment on lines +1487 to +1496
if compose.range.is_empty() {
self.layout = builder.build(&self.buffer);
self.layout.break_all_lines(self.width);
self.layout
.align(self.width, self.alignment, AlignmentOptions::default());
self.selection = self.selection.refresh(&self.layout);
self.layout_dirty = false;
self.generation.nudge();
return;
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm finding it hard to work out how this is different to the "no compose" path. That should be commented.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I added a comment there clarifying that a collapsed composition still exists as editor state, but there is no visible range to underline in the layout.

Comment thread parley/src/editing/split_string.rs Outdated
@@ -0,0 +1,161 @@
// Copyright 2026 the Parley Authors
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this copyright year is right.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, because originally this file contained some other code that was all fresh / new and then the actual SplitString definition got moved in subsequently.

Comment on lines +128 to +130
if utf16_count == target {
return Some(byte_offset);
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't clear to me when this condition could ever be true...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment there. That branch is for the case where the requested UTF-16 boundary lands exactly at the seam between the two visible segments.

Copy link
Copy Markdown
Member

@DJMcNab DJMcNab Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I was trying to express is that:

  1. If target is zero, that is caught at the very start of the method.
  2. There is no case where code could run between the "upper" and "lower" iterations of this exact same block, so if this would happen, it would be caught by the "lower" check.

Comment thread parley/src/editing/split_string.rs Outdated
Utf8Bytes,
/// Offsets are UTF-16 code-unit indices.
Utf16CodeUnits,
/// Offsets are Unicode scalar-value counts.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So Rust chars, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Comment thread CHANGELOG.md Outdated
Comment on lines +23 to +25
- `PlainEditorDriver` now exposes UTF-8 document-range editing helpers for host-side text input and composition handling, plus delete-to-edge helpers for semantic edit-command integration.
- `PlainEditorDriver::commit_composition` to explicitly commit an active composition.
- `SplitString::to_utf8_range` and `TextIndexEncoding` to convert encoded offsets over visible editor text without allocating.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should have the PR number and author attribution.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Comment on lines +327 to +340
pub fn delete_to_line_start(&mut self) {
self.refresh_layout();
if self.editor.selection.is_collapsed() {
let range = self
.editor
.selection
.line_start(&self.editor.layout, true)
.text_range();
self.editor
.replace_range_with_selection(self.font_cx, self.layout_cx, range, "");
} else {
self.delete_selection();
}
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this function should just be the if case, and the user should be expected to do the if-else themselves?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we could have a generic delete_selection_or(cb: impl FnOnce(&mut PlainEditorDriver) method?

Comment on lines +387 to +393
/// Insert a newline at the current selection.
///
/// This is useful for hosts that surface newline insertion as a semantic
/// editing action rather than ordinary text input.
pub fn insert_newline(&mut self) {
self.insert_or_replace_selection("\n");
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be a standalone function in this API? Couldn't hosts just call self.insert_or_replace_selection("\n"); themselves? macOS exposes this as a separate action because normal insertions can trigger actions, but that isn't the case with PlainEditor's API.

Same for tab.

/// input system for candidate box placement. This bounds the area of the
/// active composition if present, otherwise it bounds the selection on the
/// focused line.
pub fn text_input_area(&self) -> BoundingBox {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: is this suitable for passing to Winit as the "ime_cursor_area"?

Comment on lines +1345 to +1347
fn visible_text_len(&self) -> usize {
self.buffer.len() - self.hidden_composition_range().map_or(0, Range::len)
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we document that this returns the length in bytes?

self.buffer.len() - self.hidden_composition_range().map_or(0, Range::len)
}

fn document_byte_to_raw_byte(&self, index: usize) -> Option<usize> {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these methods could do with doc comments explaining what they do.

(range.start <= range.end && text.get(range.clone()).is_some()).then_some(range)
}

fn transformed_range_start(index: usize, replaced: &Range<usize>, inserted_len: usize) -> usize {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tranformed from what to what?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants