Skip to content
This repository has been archived by the owner on Mar 1, 2019. It is now read-only.

Handle UTF-16 code unit offsets in file changes API #24

Merged
merged 5 commits into from
Nov 8, 2018

Conversation

Xanewok
Copy link
Collaborator

@Xanewok Xanewok commented Nov 2, 2018

This adds the ability to change text with ranges defined in UTF-16 code units (per LSP spec, needed for the RLS).

Since we embed index base in Span type, we could also embed the 'text atom' there as well. Should we do that?
The tricky bit is that conversion between those requires a source buffer - to change from 0-based to 1-based indexing we only need to add/subtract 1 but to translate column offset we need actual string of characters for which the range is defined.
Since we need the source buffer, I decided that it'd be best to implement that API on VFS directly, because it operates on the buffer directly.

Should I come up with more tests? Previously crashing UTF-16 test case now works and RLS using this patched vfs version doesn't crash and reformats succesfully the test case from rust-lang/rls#1104.

Copy link
Member

@nrc nrc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

Since we embed index base in Span type, we could also embed the 'text atom' there as well. Should we do that?

I think we should. It seems like it would be easy to go wrong otherwise. Also, the SpanAtom seems more like a property of the span or length or whatever, rather than of the change, so I think it would be good to tie those together.

The tricky bit is that conversion between those requires a source buffer - to change from 0-based to 1-based indexing we only need to add/subtract 1 but to translate column offset we need actual string of characters for which the range is defined.

We don't need to provide conversion methods as part of the type, they can be in the VFS or wherever.

Should I come up with more tests? Previously crashing UTF-16 test case now works and RLS using this patched vfs version doesn't crash and reformats succesfully the test case from rust-lang/rls#1104.

No, this looks fine.

@Xanewok
Copy link
Collaborator Author

Xanewok commented Nov 6, 2018

Done!
Similar to Indexed trait in rls-span, I introduced Split trait implemented for UnicodeScalarValue and Utf16CodePoint void structs.

However, this seems like ergonomics loss - while this binds range length and span under a single type, this introduces more type-level machinery and makes the actual types more verbose; also we can't process now series of changes with different Split markers, because we have &[Change<S>] with a single type and Change::AddFile also has to carry this marker trait, although it's unrelated.

Maybe it'd make more sense to implement an enum ChangeText with those two USV/UTF-16 variants instead? This wouldn't infect outer types and corresponding function signatures with another generic parameter. @nrc what do you think?

@nrc
Copy link
Member

nrc commented Nov 6, 2018

Maybe it'd make more sense to implement an enum ChangeText with those two USV/UTF-16 variants instead? This wouldn't infect outer types and corresponding function signatures with another generic parameter. @nrc what do you think?

This sounds like a good compromise to me

@Xanewok
Copy link
Collaborator Author

Xanewok commented Nov 6, 2018

Pushed a commit with second approach. Does this look good now?

EDIT: For comparison, previous commit with type parameter is at Xanewok@62ee1b9.

Copy link
Member

@nrc nrc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good apart from the little nit about byte_in_str

src/lib.rs Outdated Show resolved Hide resolved
@nrc nrc merged commit e66ae60 into rust-dev-tools:master Nov 8, 2018
@nrc
Copy link
Member

nrc commented Nov 8, 2018

Thanks!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants