find_str returns false positives #16878

@mfronczyk

It seems that for UTF-8 strings the find_str method can return a byte index as if it had found a match, even though the string contains no matching substring. Here's the code:

let m = "śśdóahżłjnckpyzjah";
let n = "hah";
println!("({})", m.as_bytes());
println!("({})", n.as_bytes());
let pos = m.find_str(n).unwrap();
let substring = m.slice(pos, pos + 3).as_bytes();
println!("{}", substring);

Compiled with "rustc parser.rs"

And it returns:

([197, 155, 197, 155, 100, 195, 179, 97, 104, 197, 188, 197, 130, 106, 110, 99, 107, 112, 121, 122, 106, 97, 104])
([104, 97, 104])
[106, 97, 104]

This indicates that find_str reported "hah" at the end of the string m, but it isn't there: m ends with "jah".
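
For anyone wanting to cross-check the result, here is a minimal sketch that compares the library search against a naive byte-wise scan. It is written against present-day Rust, where `str::find` and range slicing replace the `find_str` and `slice` calls used above; that substitution is an assumption, as are the helper names `naive_find` and `checked_find`, which are hypothetical and not part of the standard library.

```rust
/// Naive byte-wise substring search, used only as a reference oracle.
/// (Hypothetical helper, not part of the standard library.)
fn naive_find(haystack: &str, needle: &str) -> Option<usize> {
    let h = haystack.as_bytes();
    let n = needle.as_bytes();
    if n.is_empty() {
        // An empty needle matches at offset 0; `windows(0)` would panic.
        return Some(0);
    }
    // Slide a window of the needle's length over the haystack bytes.
    h.windows(n.len()).position(|w| w == n)
}

/// Runs the library search and panics if the reported match is bogus
/// or disagrees with the naive scan. (Hypothetical helper.)
fn checked_find(haystack: &str, needle: &str) -> Option<usize> {
    let found = haystack.find(needle);
    if let Some(pos) = found {
        // A genuine match must reproduce the needle byte for byte.
        assert_eq!(
            &haystack.as_bytes()[pos..pos + needle.len()],
            needle.as_bytes()
        );
    }
    assert_eq!(found, naive_find(haystack, needle));
    found
}
```

With the strings from this report, `checked_find("śśdóahżłjnckpyzjah", "hah")` should return `None`, since the byte sequence 104, 97, 104 does not occur anywhere in the dump of m above.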

Rust version: rustc 0.12.0-pre-nightly (2e92c67 2014-08-28 23:56:20 +0000) 64bit
OS: OS X 10.9.4
