ANSI control characters not treated as zero width #24
Comments
It's unclear to me where the README file says that: the control characters are marked as "neutral" in https://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt, and "neutral" characters are treated as narrow in the spec (see http://www.unicode.org/reports/tr11/#ED7). The purpose of this crate is to follow the spec, not to implement rendered column width in terminals.
@Manishearth: Sorry, I guess it wasn't the README, but instead the docstring for Line 102 in b58e85b
Hmmm, that's interesting. I'm not sure where that came from; I will need to go through the spec a bit more when I have time.
It seems like the issue might be that colors are actually represented by sequences of characters, rather than with any individual character... The example from the description, rendered with: format!("Widths: {:#?}", s.chars().map(|c| format!("{:?}: {:?}", c, c.width())).collect::<Vec<_>>()) ...looks like:
Oh, yeah, this is behaving as expected then (the |
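To make the point about sequences concrete: in a color code such as "\u{1b}[1m", only the leading ESC is a control character, while the `[`, the digit, and the `m` that follow are ordinary printable characters, so any per-character width lookup will count them. The sketch below uses only the standard library (it does not call unicode-width), and the helper name `classify` is made up for illustration.

```rust
/// Hypothetical helper: pairs each char of `s` with whether the standard
/// library classifies it as a control character (general category Cc).
fn classify(s: &str) -> Vec<(char, bool)> {
    s.chars().map(|c| (c, c.is_control())).collect()
}

/// Hypothetical helper: counts the chars that are NOT control characters,
/// i.e. the ones a per-character width lookup would still see as printable.
fn printable_count(s: &str) -> usize {
    classify(s).iter().filter(|(_, is_control)| !is_control).count()
}
```

Calling `classify("\u{1b}[1m========\u{1b}[0m")` yields 16 entries, of which only the two ESC characters are flagged as control characters. The remaining 14 include the six printable characters that belong to the two escape sequences.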
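Ok, thanks: that makes sense. For whoever might come across this issue next, this is what I think counting characters with vte looks like:

    use unicode_width::UnicodeWidthChar;
    use vte::{Parser, Perform};

    fn count_blocks(s: &str) -> usize {
        // Accumulates the total width of everything the parser reports as printable.
        struct BlockCounter(usize);

        impl Perform for BlockCounter {
            // Called for each printable character; escape sequences never reach this.
            fn print(&mut self, c: char) {
                self.0 += c.width().unwrap_or(0);
            }

            // Called for C0/C1 control bytes; here a newline is counted as one block.
            fn execute(&mut self, byte: u8) {
                if byte == b'\n' {
                    self.0 += 1;
                }
            }
        }

        let mut block_counter = BlockCounter(0);
        let mut parser = Parser::new();
        // Feed the input to the parser one byte at a time.
        for b in s.as_bytes() {
            parser.advance(&mut block_counter, *b);
        }
        block_counter.0
    }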
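For strings that contain only CSI sequences (ESC `[`, parameters, then a final byte), such as the color codes in this issue, a dependency-free variant is also possible. The following is a rough sketch under that assumption, not a replacement for a real parser like vte. The names `strip_csi` and `visible_len` are hypothetical, other escape types (OSC and so on) are not handled, and every remaining non-control character is counted as one column, which is wrong for wide characters.

```rust
/// Hypothetical helper: returns `s` with CSI escape sequences removed.
/// Assumption: the only escape sequences present are CSI sequences.
fn strip_csi(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    let mut chars = s.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\u{1b}' && chars.peek() == Some(&'[') {
            chars.next(); // consume the '[' that opens the CSI sequence
            // Skip parameter and intermediate bytes (0x20..=0x3F) up to and
            // including the final byte, which lies in 0x40..=0x7E.
            for n in chars.by_ref() {
                if ('\u{40}'..='\u{7e}').contains(&n) {
                    break;
                }
            }
        } else {
            out.push(c);
        }
    }
    out
}

/// Hypothetical helper: counts the characters left after stripping,
/// ignoring control characters and treating each other char as one column.
fn visible_len(s: &str) -> usize {
    strip_csi(s).chars().filter(|c| !c.is_control()).count()
}
```

Unlike the vte version, this sketch does not count a newline as a block. To account for wide characters, the `count()` could be replaced by a sum over `UnicodeWidthChar::width` from unicode-width.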
Hey folks!
According to the README.md, control characters should be treated as zero width, but it seems like ANSI color sequences currently are not. Code like the following (using strip_ansi_escapes) will fail for strings containing ANSI control characters: ... such as:
"\u{1b}[1m========\u{1b}[0m"
Is this expected?