Handle ANSI color codes #140
Yes! This would be very useful for me too, though in my case the codes are not ANSI color codes but other game-specific formatting. A general way to mark parts of the input text as color/formatting would be nice. Such spans would be ignored for purposes of width estimation, formatting and breaking, but would be emitted in the output in the same place as they were in the input.
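To make the request concrete, here is a minimal sketch of what "marked spans" could mean. Everything in it is an assumption for illustration: the `Span` type, the `spans_width` and `render` helpers, and the `{red}` markup syntax are hypothetical and not part of textwrap. It also assumes every visible `char` occupies one column.

```rust
/// Hypothetical span type: `Text` is visible, `Markup` is zero-width formatting.
enum Span<'a> {
    Text(&'a str),
    Markup(&'a str),
}

/// Width used for wrapping decisions: only visible text counts.
/// Assumption: one column per `char` (no wide characters).
fn spans_width(spans: &[Span]) -> usize {
    spans
        .iter()
        .map(|span| match span {
            Span::Text(t) => t.chars().count(),
            Span::Markup(_) => 0,
        })
        .sum()
}

/// Output keeps every span, markup included, in its original position.
fn render(spans: &[Span]) -> String {
    spans
        .iter()
        .map(|span| match span {
            Span::Text(t) | Span::Markup(t) => *t,
        })
        .collect()
}
```

With the spans `{red}`, `danger`, `{/red}`, ` ahead`, the width seen by a wrapper would be 12 while the rendered string still contains both markup spans.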
Hey @laanwj, thanks for the comment! It hadn't occurred to me that ignoring certain spans could be useful beyond simply ignoring the traditional color codes. So we should be able to have a user-defined function tell us where the spans of text are and where the spans of ignored data are. I just tried to see if it would be possible to word-wrap Markdown text for output in a console, and it's fairly complicated:

extern crate pulldown_cmark;
extern crate textwrap;

use pulldown_cmark::{Event, Parser, Tag};
use std::io::Write;

#[derive(Debug)]
enum Span<'a> {
    Text(String),
    Markup(&'a [u8]),
}

const RESET: &'static [u8] = b"\x1b[m";
const BOLD: &'static [u8] = b"\x1b[1m";
const ITALIC: &'static [u8] = b"\x1b[3m";
const UNDERLINE: &'static [u8] = b"\x1b[4m";

fn main() {
    let s = r#"**Markdown** is a [lightweight *markup* language](https://en.wikipedia.org/wiki/Lightweight_markup_language) with plain text formatting syntax."#;

    let mut text = String::new();
    let mut spans = vec![];
    let mut markup_stack = vec![];

    println!("//// events:");
    for event in Parser::new(s) {
        println!("{:?}", event);
        match event {
            Event::Text(t) => {
                text.push_str(&t);
                spans.push(Span::Text(t.into_owned()));
            }
            Event::Start(Tag::Strong) => {
                markup_stack.push(BOLD);
                spans.push(Span::Markup(BOLD));
            }
            Event::Start(Tag::Emphasis) => {
                markup_stack.push(ITALIC);
                spans.push(Span::Markup(ITALIC));
            }
            Event::Start(Tag::Link(_, _)) => {
                markup_stack.push(UNDERLINE);
                spans.push(Span::Markup(UNDERLINE));
            }
            Event::End(Tag::Strong) | Event::End(Tag::Emphasis) | Event::End(Tag::Link(_, _)) => {
                markup_stack.pop();
                spans.push(Span::Markup(RESET));
                spans.extend(markup_stack.iter().map(|m| Span::Markup(m)));
            }
            _ => {}
        }
    }

    println!();
    println!("//// without markup:");
    let wrapper = textwrap::Wrapper::with_splitter(30, textwrap::NoHyphenation);
    let filled = wrapper.fill(&text);
    assert_eq!(filled.len(), text.len(), "fill should not change length");
    println!("{}", filled);

    println!();
    println!("//// with markup:");
    let mut offset = 0;
    let mut out = std::io::stdout();
    for span in &spans {
        match span {
            Span::Markup(m) => {
                out.write_all(m).expect("write failed!");
            }
            Span::Text(t) => {
                // This uses the fact that `filled` has the same
                // length as `text`, except that some ' ' have been
                // replaced with newlines '\n'. We can thus share
                // offsets between the two strings.
                print!("{}", &filled[offset..offset + t.len()]);
                offset += t.len();
            }
        }
    }
    println!();
}

This shows output like this:

This kind of code only works if there is no hyphenation going on, and it probably also only works if the input text has no repeated spaces. I'm already thinking of changing textwrap to work on a list of
Oh wow, I hadn't thought of that: as long as textwrap doesn't change the text, it doesn't have to know about the formatting at all, because the string is divided in the same places! It's unfortunate that this doesn't extend to hyphenation or multiple spaces. I guess even that could be fixed up after the fact with a reconciliation pass, though a solution that works on tokens, as you say, would be more elegant.
I hadn't thought of it before you gave me the idea :-D I think you could take the above idea pretty far, but it would be much nicer if there were some builtin support. I'll see what I can come up with...
Might be useful to take inspiration from how cursive handles this for their TextViews, as it more or less exactly implements the above; they have a

let mut styled = StyledString::plain("Isn't ");
styled.append(StyledString::styled("that ", Color::Dark(BaseColor::Red)));
styled.append(StyledString::styled(
    "cool?",
    Style::from(Color::Light(BaseColor::Blue)).combine(Effect::Bold),
));

then their word wrapper which operates on them is here: https://github.com/gyscos/Cursive/blob/master/src/utils/lines/spans/lines_iterator.rs#L13
- add bolding for packages so they jump out
- refer to "root" as "this project"
- special case unknown packages

Bolding currently means the line wrapping is messed up. Hopefully mgeisler/textwrap#140 will get sorted out and we can add more liberal use of visual hints without messing up the wrapping! 🤞
@mgeisler Any update on this? Would like to get it fixed on clap. |
ANSI escape sequences are typically used for colored text. The sequences start with a so-called CSI, followed by some "parameter bytes" before ending with a "final byte". We now handle these escape sequences by simply skipping over the bytes. This works well for escape sequences that change colors since they don't take up space and since they continue to work across any line breaks we insert. See https://en.wikipedia.org/wiki/ANSI_escape_code for details. Fixes: #140.
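The skipping described above could look roughly like the following. This is a minimal sketch, not the code from #179: `visible_width` is a hypothetical helper name, it only recognizes sequences that start with the two-character CSI `ESC [`, and it assumes every other `char` takes exactly one column. The final byte range 0x40 to 0x7E follows the description on the linked Wikipedia page.

```rust
/// Counts the characters in `text` that take up space on screen,
/// skipping CSI escape sequences such as "\x1b[1m".
/// Hypothetical helper for illustration, not part of textwrap's API.
fn visible_width(text: &str) -> usize {
    let mut width = 0;
    let mut chars = text.chars().peekable();
    while let Some(ch) = chars.next() {
        if ch == '\x1b' && chars.peek() == Some(&'[') {
            // Consume the '[' that completes the CSI.
            chars.next();
            // Skip the parameter bytes, up to and including the
            // final byte, which lies in the range 0x40..=0x7e.
            for c in chars.by_ref() {
                if ('\x40'..='\x7e').contains(&c) {
                    break;
                }
            }
        } else {
            // Assumption: every other character is one column wide.
            width += 1;
        }
    }
    width
}
```

For example, `visible_width("\x1b[1mbold\x1b[m")` counts only the four letters of `bold`, so a wrapper using this width would break lines as if the escape sequences were not there.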
Hi @pksunkara, thanks for reminding me. I haven't looked more into this since I wrote the Markdown example above. However, I took a look now and found that it was relatively easy to add support for this; please see #179.
Textwrap currently doesn't know about ANSI color codes for colored terminal output. It simply treats these escape codes as any other character and this can mess up the output as shown here: clap-rs/clap#1246.
The custom width functionality of #128 could perhaps be used to solve this problem, but I feel it would be nice if this was a builtin feature (perhaps an optional feature depending on how invasive it will be to handle this).