tabs_in_doc_comments: Fix ICE due to char indexing
This is a quick fix for an ICE in `tabs_in_doc_comments`. The problem
was that we were indexing into possibly multi-byte characters, such as '位'.

More specifically, `get_chunks_of_tabs` was returning indices that could
point inside multi-byte characters. Those were passed on to a `Span`
creation, which then caused the ICE.
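
To illustrate the mismatch, here is a minimal sketch (not code from this
commit; the helper names `tab_char_position` and `tab_byte_offset` are made
up for the example). Counting with `chars()` gives a position in characters,
while `char_indices()` gives the byte offset that a byte-based span expects:

```rust
/// Position of the first tab, counted in characters.
/// This mirrors the kind of index the old `chars()`-based code produced.
fn tab_char_position(s: &str) -> Option<usize> {
    s.chars().position(|c| c == '\t')
}

/// Position of the first tab, counted in bytes.
/// `char_indices` yields the byte offset at which each character starts.
fn tab_byte_offset(s: &str) -> Option<usize> {
    s.char_indices().find(|&(_, c)| c == '\t').map(|(i, _)| i)
}
```

For `" 位\t"` the first helper returns `Some(2)` and the second `Some(4)`,
because '位' takes three bytes in UTF-8. Byte offset 2 falls inside '位', so
it is not a valid character boundary.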

This fix makes sure that we don't return indices that point inside a
multi-byte character. *However*, we are still iterating over Unicode
code points, not grapheme clusters. So a seemingly single character like y̆,
which actually consists of two code points, will probably still cause
incorrect spans in the output.
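
As a small sketch of that remaining limitation (again not part of this
commit; `code_point_offsets` is a hypothetical helper), `char_indices` splits
y̆ into its two code points, 'y' followed by the combining breve U+0306:

```rust
/// Byte offset and code point for every `char` in `s`.
/// Note that this iterates over code points, not grapheme clusters.
fn code_point_offsets(s: &str) -> Vec<(usize, char)> {
    s.char_indices().collect()
}
```

Calling it on `"y\u{306}"` gives `[(0, 'y'), (1, '\u{306}')]`, so an offset
of 1 is a valid character boundary even though it sits in the middle of what
a reader sees as one character. Iterating by grapheme cluster would need
something beyond the standard library, for example the `unicode-segmentation`
crate.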
phansch committed Apr 6, 2021
1 parent e315437 commit 1573d10
Showing 3 changed files with 45 additions and 11 deletions.
28 changes: 17 additions & 11 deletions clippy_lints/src/tabs_in_doc_comments.rs
@@ -104,30 +104,29 @@ fn get_chunks_of_tabs(the_str: &str) -> Vec<(u32, u32)> {
     // tracker to decide if the last group of tabs is not closed by a non-tab character
     let mut is_active = false;

-    let chars_array: Vec<_> = the_str.chars().collect();
+    let char_indices: Vec<_> = the_str.char_indices().collect();

-    if chars_array == vec!['\t'] {
+    if char_indices.len() == 1 && char_indices.first().unwrap().1 == '\t' {
         return vec![(0, 1)];
     }

-    for (index, arr) in chars_array.windows(2).enumerate() {
-        let index = u32::try_from(index).expect(line_length_way_to_long);
-        match arr {
-            ['\t', '\t'] => {
+    for entry in char_indices.windows(2) {
+        match entry {
+            [(_, '\t'), (_, '\t')] => {
                 // either string starts with double tab, then we have to set it active,
                 // otherwise is_active is true anyway
                 is_active = true;
             },
-            [_, '\t'] => {
+            [(_, _), (index_b, '\t')] => {
                 // as ['\t', '\t'] is excluded, this has to be a start of a tab group,
                 // set indices accordingly
                 is_active = true;
-                current_start = index + 1;
+                current_start = *index_b as u32;
             },
-            ['\t', _] => {
+            [(_, '\t'), (index_b, _)] => {
                 // this now has to be an end of the group, hence we have to push a new tuple
                 is_active = false;
-                spans.push((current_start, index + 1));
+                spans.push((current_start, *index_b as u32));
             },
             _ => {},
         }
@@ -137,7 +136,7 @@ fn get_chunks_of_tabs(the_str: &str) -> Vec<(u32, u32)> {
     if is_active {
         spans.push((
             current_start,
-            u32::try_from(the_str.chars().count()).expect(line_length_way_to_long),
+            u32::try_from(char_indices.last().unwrap().0 + 1).expect(line_length_way_to_long),
         ));
     }

@@ -148,6 +147,13 @@ fn get_chunks_of_tabs(the_str: &str) -> Vec<(u32, u32)> {
 mod tests_for_get_chunks_of_tabs {
     use super::get_chunks_of_tabs;

+    #[test]
+    fn test_unicode_han_string() {
+        let res = get_chunks_of_tabs(" 位\t");
+
+        assert_eq!(res, vec![(4, 5)]);
+    }
+
     #[test]
     fn test_empty_string() {
         let res = get_chunks_of_tabs("");
8 changes: 8 additions & 0 deletions tests/ui/crashes/ice-5835.rs
@@ -0,0 +1,8 @@
+#![rustfmt::skip]
+
+pub struct Foo {
+    /// 位
+    pub bar: u8,
+}
+
+fn main() {}
20 changes: 20 additions & 0 deletions tests/ui/crashes/ice-5835.stderr
@@ -0,0 +1,20 @@
+error[E0658]: custom inner attributes are unstable
+  --> $DIR/ice-5835.rs:1:4
+   |
+LL | #![rustfmt::skip]
+   |    ^^^^^^^^^^^^^
+   |
+   = note: see issue #54726 <https://github.com/rust-lang/rust/issues/54726> for more information
+   = help: add `#![feature(custom_inner_attributes)]` to the crate attributes to enable
+
+error: using tabs in doc comments is not recommended
+  --> $DIR/ice-5835.rs:4:10
+   |
+LL |     /// 位
+   |           ^^^^ help: consider using four spaces per tab
+   |
+   = note: `-D clippy::tabs-in-doc-comments` implied by `-D warnings`
+
+error: aborting due to 2 previous errors
+
+For more information about this error, try `rustc --explain E0658`.
