Problems with blank-masking #367

birkenfeld · 2015-08-26T21:15:38Z

The code chunking iterator works in terms of byte offsets. But the `mask_comments` function (and `mask_sub_scopes` too, probably) constructs a new string based on differences of byte offsets.

That means that if a comment or string contains non-ASCII characters, the "masked-out" string will have more characters than the original source, since multi-byte characters got replaced by multiple single-byte characters (spaces).
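
A minimal sketch of the effect described above, assuming masking writes one ASCII space per masked byte. `mask_range` is a hypothetical helper written for illustration, not racer's actual `mask_comments`:

```rust
/// Replace the bytes in `start..end` (byte offsets into `src`) with ASCII
/// spaces, one space per byte, so every later byte offset stays the same.
/// Hypothetical helper for illustration; not racer's real masking code.
/// Panics if `start` or `end` is not on a UTF-8 character boundary.
fn mask_range(src: &str, start: usize, end: usize) -> String {
    let mut out = String::with_capacity(src.len());
    out.push_str(&src[..start]);
    // One space per *byte* of the masked region, not per character.
    out.extend(std::iter::repeat(' ').take(end - start));
    out.push_str(&src[end..]);
    out
}
```

Masking the comment in `"let a = 1; // größe\nlet b = 2;"` this way leaves the byte length and the byte offset of `let b` unchanged, but the masked string has two more characters than the source, because `ö` and `ß` are two bytes each in UTF-8.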

I can't say if this causes problems such as wrong indices in the matches...

A related problem is that if a character literal contains an escape sequence, the "masked-out" version of the code is not valid Rust anymore, since e.g. `'\''` gets masked to `'  '` (with two spaces). Again, I can't say if that causes an actual problem.


phildawes · 2015-08-30T15:54:50Z

Good catch.
I think internally it makes sense if we hold the byte offset in the file, so replacing a multibyte char with the same number of spaces is the right thing to do from the perspective of masking.

I'm not sure what is best to do about character literals with an escape sequence - I assume the parser balks at multi-space char literals (I haven't tested). Maybe we should replace them with a one-space char literal and an extra space after the closing quote?
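
One way the suggestion above might look, as an untested sketch: emit a valid one-space char literal and pad with trailing spaces so the byte length is preserved. `mask_char_literal` is a hypothetical name, and it assumes the caller passes the whole literal including both quotes:

```rust
/// Mask a char literal (including both quotes) as `' '` followed by enough
/// spaces to keep the original byte length, so byte offsets stay valid.
/// Hypothetical helper sketching the idea; not part of racer.
fn mask_char_literal(literal: &str) -> String {
    let len = literal.len(); // length in bytes, not characters
    assert!(len >= 3, "a char literal is at least 3 bytes, e.g. 'a'");
    let mut out = String::with_capacity(len);
    // A syntactically valid single-space char literal...
    out.push_str("' '");
    // ...then padding after the closing quote for the remaining bytes.
    out.extend(std::iter::repeat(' ').take(len - 3));
    out
}
```

With this, `'\''` (4 bytes) would become `' '` plus one trailing space, and a longer escape such as `'\u{1F600}'` would become `' '` plus eight trailing spaces.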

phildawes added the bug label Sep 2, 2015

micbou mentioned this issue Jun 9, 2016

Escaping issue in Rust completer ycm-core/ycmd#520

Closed
