@krassowski krassowski commented Nov 29, 2025

@krassowski krassowski added the enhancement New feature or request label Nov 29, 2025
@krassowski krassowski marked this pull request as ready for review November 29, 2025 10:20

# for very different strings, just replace the whole content;
# this avoids generating a huge number of operations
if matcher.ratio() < 0.6:
Collaborator Author

Using 0.6 because:

"As a rule of thumb, a ratio() value over 0.6 means the sequences are close matches"

As per https://docs.python.org/3/library/difflib.html#sequencematcher-examples

Instead of ratio() we could use quick_ratio() or real_quick_ratio(). If the ratio is above the threshold this does not matter, because we would end up computing matching_blocks anyway (this is computed and cached by a call to either ratio() or get_opcodes()); in that case calling the quick_ratio() heuristic first could actually take more time, as we would end up doing more work. It all boils down to whether we think we are more likely to see smaller updates or larger updates. One very cheap heuristic could be using the len of both strings, which is what real_quick_ratio() does.
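
To make the trade-off concrete, here is a minimal sketch using only the standard library `difflib`. The helper name `should_replace_whole` and the sample strings are made up for illustration and are not taken from this PR; it simply tries the cheap upper bounds before falling back to `ratio()`.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.6  # rule-of-thumb value quoted from the difflib docs


def should_replace_whole(old: str, new: str, threshold: float = THRESHOLD) -> bool:
    """Return True when the strings look too different for a granular diff.

    Hypothetical helper: quick_ratio() and real_quick_ratio() are documented
    as upper bounds on ratio(), so if either is already below the threshold
    the expensive ratio() call can be skipped.
    """
    matcher = SequenceMatcher(None, old, new)
    # Cheapest check: depends only on the lengths of the two strings.
    if matcher.real_quick_ratio() < threshold:
        return True
    # Slightly more work: compares character counts, ignoring order.
    if matcher.quick_ratio() < threshold:
        return True
    # Full computation; this also caches the matching blocks that
    # get_opcodes() would need afterwards.
    return matcher.ratio() < threshold
```

Whether the two extra calls pay off depends on the expected workload: for small edits they are pure overhead, since `ratio()` runs anyway, while for wholesale rewrites they may avoid the expensive matching step.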

Collaborator

@davidbrochart davidbrochart left a comment

That's great, I just have minor comments.

operations = matcher.get_opcodes()
offset = 0
for tag, i1, i2, j1, j2 in operations:
    if tag == "replace":
Collaborator

Since we require python 3.10+, maybe use match?

Collaborator Author

Done in b61975ef55b4e8011016e7fbdd83f601e5ae84df.

        del self._ysource[i1 + offset : i2 + offset]
        offset -= i2 - i1
    elif tag == "insert":
        self._ysource[i1 + offset : i2 + offset] = value[j1:j2]
Collaborator

Why is this not using Text.insert?

Collaborator Author

Changed to Text.insert in ca32120.

as `__setitem__` also checks if index is a number or slice
and then checks the range of the slice; we can skip those
knowing that `i1 == i2` in the `insert` opcode.
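
The `i1 == i2` property can be checked with the standard library alone. The snippet below is an illustrative sketch with made-up sample strings; it uses a plain list and string slicing as stand-ins and does not touch the actual `Text` type.

```python
from difflib import SequenceMatcher

old, new = "hello world", "hello there world"
opcodes = SequenceMatcher(None, old, new).get_opcodes()

# Every "insert" opcode describes an empty range in the old sequence.
inserts = [op for op in opcodes if op[0] == "insert"]
assert inserts and all(i1 == i2 for _, i1, i2, _, _ in inserts)

# With an empty range, slice assignment and a positional insert agree,
# so the index-type and range checks of slice assignment add nothing.
_, i1, i2, j1, j2 = inserts[0]
via_slice = list(old)
via_slice[i1:i2] = new[j1:j2]
via_insert = old[:i1] + new[j1:j2] + old[i1:]
assert "".join(via_slice) == via_insert == new
```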
@davidbrochart davidbrochart merged commit 87e2052 into jupyter-server:main Dec 1, 2025
22 checks passed
@krassowski krassowski deleted the granular-file-reload branch December 1, 2025 10:09

Development

Successfully merging this pull request may close these issues.

Do not reload text document if text was apened out-of-band?