man syntax doesn't highlight bold functions correctly #652

Open
LunarLambda opened this issue Sep 5, 2019 · 33 comments

Comments

@LunarLambda
Contributor

LunarLambda commented Sep 5, 2019

Terminals tested: alacritty, mate-terminal, urxvt

bat --version: 0.12.0 (Installed via cargo install bat)

$MANPAGER: bat --paging=never -pl man [1] [2]

[1]: I disabled paging to make sure it's not a problem with less(1).
[2]: The documentation suggests setting MANPAGER to sh -c "col -b | bat -pl man"; however, I found that using col garbled the output even more (see the screenshot further down).

Output with MANPAGER='bat -pl man'
image

Output with MANPAGER='' or MANPAGER='less'
image

The issue seems to be with highlighting functions / page references (foo(...)) when bold output is used.

When using col -b as suggested, it becomes even worse:

Output with MANPAGER='sh -c "col -b | bat -pl man"'
image

@sharkdp
Owner

sharkdp commented Sep 6, 2019

Thank you for the detailed bug report!

I'm going to assume that you are using man sprintf in your examples(?).

To figure out what's going on in detail, we can actually use bat -A to show what exactly man outputs:

MANPAGER="bat -A" man sprintf

After finding the corresponding section, we can take a look at how man prints bold text. It is both fascinating and infuriating. Instead of using ANSI escape sequences, it prints

p␈pr␈ri␈in␈nt␈tf␈f

for a bold printf (bat -A shows ␈ instead of the \b backspace character). I believe this is how "bold" was done in the days of typewriters. You would hit backspace and then just retype the same character to give it more weight.

On today's terminal emulators, that doesn't actually work. If you use MANPAGER="" or MANPAGER="cat", no bold text will be shown. To make sure, we can also call

printf "p\bpr\bri\bin\bnt\btf\bf\n"

which will just print printf on the terminal.

Interestingly, less has a special feature that shows such sequences in bold. Quoting from man less: "Also, backspaces which appear between two identical characters are treated specially: the overstruck text is printed using the terminal's hardware boldface capability. Other backspaces are deleted, along with the preceding character". This is why we see a bold face printf, when we call

printf "p\bpr\bri\bin\bnt\btf\bf\n" | less

There is also a similar feature for underlined text:

printf "p\b_r\b_i\b_n\b_t\b_f\b_\n" | less
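To look at the two encodings without a pager, they can be generated for any word and dumped byte by byte. This is a minimal sketch in POSIX sh; the function names overstrike_bold and overstrike_underline are made up for illustration. The underline variant writes the underscore first, which is the order grotty uses in its old output format; less accepts either order.

```shell
# Bold: each character, a backspace, then the same character again.
overstrike_bold() {
    printf '%s\n' "$1" | sed "s/./&$(printf '\b')&/g"
}

# Underline: an underscore, a backspace, then the character.
overstrike_underline() {
    printf '%s\n' "$1" | sed "s/./_$(printf '\b')&/g"
}

# Make the control characters visible (bat -A works here too).
overstrike_bold printf | od -c
overstrike_underline printf | od -c
```

Piping either function into less instead of od should show the word in bold or underlined, as described above.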

Back to bat. When I initially played with this, I noticed that these backspace characters were causing problems when intermixed with bat's syntax highlighting. Imagine we have

int printf(const char* format, ...);

in a man page and the whole line is printed in bold (beginning of man sprintf). The syntax highlighter will try to highlight certain special characters like the opening parenthesis (. However, that breaks the backspace-for-bold-font-trick and actual backspace characters will start appearing in your output.

For this reason, I originally used col -b (col --no-backspaces), which turns something like "p\bpr\bri\bin\bnt\btf\bf" into printf:

▶ printf "p\bpr\bri\bin\bnt\btf\bf\n" | bat -Ap         
p␈pr␈ri␈in␈nt␈tf␈f␊

▶ printf "p\bpr\bri\bin\bnt\btf\bf\n" | col -b | bat -Ap
printf␊

Unfortunately, I missed that col "also replaces any whitespace characters with tabs where possible". This is what breaks the table layout in the above example. Fortunately, we can switch this off via col's -x/--spaces option.

The following works for me:

MANPAGER="sh -c 'col -bx | bat -p -lman'" man sprintf

image

I think we should update the instructions in the README to suggest col -bx.
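To illustrate what col -bx does to these sequences, here is a rough sed equivalent. It is only a sketch (the function name strip_overstrike is hypothetical, and this is not what the README recommends): it drops every "character + backspace" pair and never touches whitespace. It assumes the locale lets sed's `.` match whole characters, so multibyte text may need a UTF-8 locale.

```shell
# Rough equivalent of `col -bx` for simple overstrike: remove each
# "character + backspace" pair so only the last character written to a
# position survives. Spaces are left alone, so tables keep their layout.
strip_overstrike() {
    sed "s/.$(printf '\b')//g"
}

printf 'p\bpr\bri\bin\bnt\btf\bf\n' | strip_overstrike   # prints: printf
```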

Unfortunately, it looks like your col command does things a little differently. I couldn't exactly reproduce your screenshots above. My version is:

▶ col --version 
col from util-linux 2.34

@LunarLambda
Contributor Author

I have col from util-linux 2.33.2.

Unfortunately MANPAGER='sh -c "col -bx | bat -plman"' man sprintf yields the following

image

@sharkdp
Owner

sharkdp commented Sep 6, 2019

In this case, it does not seem like col is the problem. Could you please post the output of alias bat and the output of the following bash script?

set -x

bat --version
bat --config-file
bat --cache-dir
less --version

bat "$(bat --config-file)"
ls "$(bat --cache-dir)"

set +x

echo "BAT_PAGER = '$BAT_PAGER'"
echo "BAT_CONFIG_PATH = '$BAT_CONFIG_PATH'"
echo "BAT_STYLE = '$BAT_STYLE'"
echo "BAT_THEME = '$BAT_THEME'"
echo "BAT_TABS = '$BAT_TABS'"
echo "PAGER = '$PAGER'"
echo "LESS = '$LESS'"

@LunarLambda
Contributor Author

++ alias bat
bash: alias: bat: not found
++ bat --version
bat 0.11.0
++ bat --config-file
/home/luna/.config/bat/config
++ bat --cache-dir
/home/luna/.cache/bat
++ less --version
less 551 (POSIX regular expressions)
Copyright (C) 1984-2019  Mark Nudelman

less comes with NO WARRANTY, to the extent permitted by law.
For information about the terms of redistribution,
see the file named README in the less distribution.
Home page: http://www.greenwoodsoftware.com/less
+++ bat --config-file
++ bat /home/luna/.config/bat/config
[bat error]: '/home/luna/.config/bat/config': No such file or directory (os error 2)
+++ bat --cache-dir
++ ls --color=auto /home/luna/.cache/bat
ls: cannot access '/home/luna/.cache/bat': No such file or directory
++ set +x
BAT_PAGER = ''
BAT_CONFIG_PATH = ''
BAT_STYLE = ''
BAT_THEME = ''
BAT_TABS = ''
PAGER = ''
LESS = ''

@sharkdp
Owner

sharkdp commented Sep 6, 2019

Hm, nothing unusual there.

It would be great if you could show two other screenshots:

One for:

MANPAGER='sh -c "col -bx | bat -plman --color=never"' man sprintf

and one for

MANPAGER='sh -c "col -bx | bat -Ap"' man sprintf

@LunarLambda
Contributor Author

LunarLambda commented Sep 6, 2019

1:
image

2:
image

These are once again using alacritty, but I got the same results with various vte-based terminals (gnome-terminal, etc), and urxvt.

@sharkdp
Owner

sharkdp commented Sep 6, 2019

I've got an idea. What does type man or which man say for you? Is it calling /usr/bin/man or is it some shell function wrapping the real man (and possibly trying to add some colors itself)?

@LunarLambda
Contributor Author

LunarLambda commented Sep 6, 2019

/usr/bin/man, nothing special here.

I'm using Zsh, but little to no configuration (no oh-my-zsh, any aliases replacing commands, etc...)

file $(which man) reports an ELF executable, so no wrapper script there either.

@sharkdp
Owner

sharkdp commented Sep 6, 2019

Okay. So the output is definitely already messed up when it reaches bat (messed up = contains parts of ANSI escape sequences like 1m, 24m etc.). It could be either man itself (does MANPAGER="" man sprintf show colors for you?) or col -bx.

If col is the problem, you could check the output of

MANPAGER="bat -Ap" man sprintf

directly. It should contain plenty of backspace characters, but no ANSI escape sequences.
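The same check can be done without reading a screenshot. Below is a minimal sketch (the function name classify_formatting is made up here) that reads a stream and reports whether it contains ESC[ escape sequences, backspace overstrike, both, or neither.

```shell
# Report which formatting style a stream uses:
#   ansi        ESC[ escape sequences
#   overstrike  backspace-based bold/underline
#   mixed       both at once
#   plain       neither
classify_formatting() {
    esc=$(printf '\033')
    bs=$(printf '\b')
    data=$(cat)
    case $data in
        *"$esc["*"$bs"* | *"$bs"*"$esc["*) echo mixed ;;
        *"$esc["*) echo ansi ;;
        *"$bs"*) echo overstrike ;;
        *) echo plain ;;
    esac
}
```

Something like `man sprintf | classify_formatting` would then give a one-word answer, with the caveat that man may format its output differently when it is not writing to a terminal.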

Thank you very much for following along!

@LunarLambda
Contributor Author

LunarLambda commented Sep 6, 2019

MANPAGER="" man sprintf shows bold and underline text (no pager though)

MANPAGER="bat -Ap" man sprintf shows this...
image

Oh, thank you for taking on the issue. bat has become an indispensable tool for me (so much so that I have an alias b='bat -pn', haha)

@LunarLambda
Contributor Author

LunarLambda commented Sep 6, 2019

I also ran it with MANPAGER="cat -A"

Plenty of ansi sequences, but no backspaces, very weird...

^[[1m -> bold on
^[[0m -> reset all attributes
^[[4m -> underline on
^[[24m -> underline off
^[[22m -> bold off (normal intensity)

image

@sharkdp
Owner

sharkdp commented Sep 6, 2019

Ok. It looks like your version of man actually uses ANSI escape sequences already.

It might be worth going through man man or man --help to see if there is anything to turn this off. It might also be worth checking the values of man-related environment variables (e.g. MANOPT).

@LunarLambda
Contributor Author

LunarLambda commented Sep 6, 2019

man itself has no such option.

Using a very hacky strace oneliner I got the execution chain for a man invocation. One of these programs will probably have an option for it, however I can't actually find anything right now...

image

@LunarLambda
Contributor Author

LunarLambda commented Sep 6, 2019

grotty can use the old format (using backspaces) by passing the -c option or setting GROFF_NO_SGR

grotty -c -b -u would use the old format (no SGR sequences), and suppresses overstriking and underlining for bold/italic respectively. However, I have no clue how to propagate that option through the entire chain short of writing a wrapper script around grotty...

Perhaps just being able to pass -c would be enough.

@sharkdp
Owner

sharkdp commented Sep 6, 2019

Hm. We could try to remove ANSI codes from the output (instead of using col -bx). See this page, for example. It won't be pretty 😄

Might make sense to move this to a separate script that can be used as MANPAGER.

In the future, we could potentially also try to find a proper/better solution by pre-processing within bat.
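As a rough idea of what removing the ANSI codes could look like, here is a sketch using sed. The function name strip_sgr is hypothetical, and it only removes SGR sequences (ESC [ ... m), which are the ones seen in this thread; other escape sequences would pass through untouched.

```shell
# Remove SGR sequences such as ESC[1m, ESC[22m or ESC[0m from stdin.
strip_sgr() {
    esc=$(printf '\033')
    sed "s/${esc}\[[0-9;]*m//g"
}

printf '\033[1mlocate \033[22m[\033[1m-0Scims\033[22m]\n' | strip_sgr
# prints: locate [-0Scims]
```

The function could sit in front of bat in a small wrapper script that is then used as MANPAGER, in the same position where col -bx is used above.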

@LunarLambda
Contributor Author

LunarLambda commented Sep 6, 2019

Well.

MANROFFOPT="-c" MANPAGER="sh -c 'col -bx | bat -plman'" man sprintf Finally worked. No bold or underlined text, but it finally displays correctly :D

While this presents a working solution for now, I'd suggest either keeping this issue open or opening a new one, as this is rather hacky (although it was a fun learning experience about the joys of old Unix tech!).

image
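To make this workaround persistent, both variables can be exported from a shell startup file. This is a sketch for bash or zsh, assuming ~/.bashrc or ~/.zshrc is where such settings live; as noted above, it trades bold and underlined text for correct rendering.

```shell
# Ask grotty for the old overstrike format instead of SGR escape sequences,
# then let col strip the overstrike before bat highlights the page.
export MANROFFOPT="-c"
export MANPAGER="sh -c 'col -bx | bat -plman'"
```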

@sharkdp
Owner

sharkdp commented Oct 15, 2019

I'd like to close this. It is now described in the README, and I currently don't see a better solution.

@sharkdp sharkdp closed this as completed Oct 15, 2019
@LunarLambda
Contributor Author

Understandable ^^

@xeruf
Contributor

xeruf commented Jun 21, 2020

You should mention in the README that bold highlighting is unsupported - I was quite confused, and this issue doesn't really go into that.

@sharkdp
Owner

sharkdp commented Jun 22, 2020

Seriously? This issue "doesn't really go into that"? We have spent hours to debug this and have written extremely detailed comments that document everything.

You should mention in the README that bold highlighting is unsupported

Nobody "should" do anything here, but I agree that it's probably a good idea to add that. Contributions to the documentation are always welcome.

@xeruf
Contributor

xeruf commented Jun 23, 2020

Hey, sorry if that came across as unappreciative. I did read the comments and they were quite informative, but to me they seemed mostly concerned with the problems of the control characters used for boldness messing up the output.

What I was wondering is whether this could actually be changed to interpret boldness. I am writing a man page myself and would like to see it as the end users see it, so I currently have to use less, but much prefer the overall look of bat :)

@sharkdp sharkdp reopened this Jul 25, 2020
@sharkdp
Owner

sharkdp commented Jul 25, 2020

I'm going to reopen this, as there might actually be a way to solve this, if we write a man preprocessor within bat.

@damien

damien commented Nov 18, 2020

I ran into this as well using Windows Terminal with bat as a man pager. The settings recommended by @LunarLambda in #652 (comment) resolved my problem. 👍

@xeruf
Contributor

xeruf commented Jul 1, 2021

Program versions

Arch Linux
man 2.9.4
col from util-linux 2.37
bat 0.18.1

Comparison

MANPAGER='less' man printf

image

MANPAGER='bat -pl man' man printf

image

MANPAGER="sh -c 'col -bx | bat -pl man'" man printf

image

Neither MANROFFOPT nor adding/removing -b for col seem to change anything for me.

Conclusion

Adding colors is nice, but since bat currently does not display the essential highlighting, I am considering switching back to less or finding an interactive man viewer where I can follow links.

@avimehenwal

For me, on Fedora 35, export MANROFFOPT="-c" helped.
Thank you @xeruf @LunarLambda

macedigital added a commit to macedigital/dotfiles that referenced this issue Dec 30, 2021
You need to set an additional option when man already uses ANSI escape sequences. Providing the `-c` option via the environment variable `MANROFFOPT` fixes the wrong rendering of man pages via bat.

See sharkdp/bat#652 (comment).
@leppaott

leppaott commented Feb 4, 2022

for me, working on fedora 35 export MANROFFOPT="-c" helped Thankyou @xeruf @LunarLambda

Same here, maybe could be added to README?

victor-gp added a commit to victor-gp/cmd-help-sublime-syntax that referenced this issue Mar 15, 2022
Backspace formatting for bold characters, done by relics such as man or
less (bless them anyways). Graciously explained in [1].

Presents the same zebra-coloring problem as delta, ditto the previous
commit message.

Maybe I should re-think my approach to these, try to scope everything
with a token themed with the default fg color... But not today.

[1] sharkdp/bat#652 (comment)
@ian-h-chamberlain

ian-h-chamberlain commented Jan 6, 2023

I've done a little more digging into this, as I have one Linux system and macOS where I'm running into this. Ideally, both color and bold/underline would be output as ANSI codes and bat would happily interpret them, but groff appears to still be generating X^HX even when it also uses color output!

It seems that in Debian it might be possible to achieve with GROFF_SGR=1 or editing /etc/groff files:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=750202
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=963490

So far I have not found a working option for macOS or the CentOS system where I'm still seeing the issue, but I'm working on trying an option to "dumb replace" them, something like

# doesn't work quite right...
MANPAGER="sed -r 's/(.)\x08\1/\033[1m\1\033[0m/g' | bat -plman"

This StackExchange also seems to have lots of relevant details here, which makes it seem like lots of the options here are distro-dependent unfortunately... Maybe preprocessing in bat really would make it simpler 😢

Edit: one more resource explaining some different behavior on Arch (where everything seems to work... better? differently? for me at least) and Debian

@xeruf
Contributor

xeruf commented Jan 10, 2023

update from my side: Using nvim/emacs as man viewer now as these can follow links as well ;)

@ian-h-chamberlain

ian-h-chamberlain commented Jan 13, 2023

Okay, phew! I dug in a little more and got a usable sed command, but unfortunately there still seems to be an issue with --language Manpage even using ANSI codes instead of overstrike.

Here's the command I'm using:

sed=gsed # needed on macOS, it seems
# sed=sed # linux

export MANPAGER="$sed -E 's/(.)\x08\1/\x1b[1m\1\x1b[22m/g' |
	$sed -E 's/_\x08(.)/\x1b[4m\1\x1b[24m/g' |
	bat -p"
man sprintf

This displays non-colored but correctly decorated pages, as you might expect! less, cat etc. should also work here.

Screen Shot 2023-01-13 at 09 30 55

However, when using bat --language Manpage, it seems the color of the syntax highlight gets garbled with the bold/underline codes, similar to the OP report:

export MANPAGER="$sed -E 's/(.)\x08\1/\x1b[1m\1\x1b[22m/g' | 
	$sed -E 's/_\x08(.)/\x1b[4m\1\x1b[24m/g' |
	bat -plman"
man sprintf

Screen Shot 2023-01-13 at 09 29 54

Is it expected that bat would correctly handle the syntax highlighting intermingled with the source data having control characters? If so, I'd propose that as the actionable item here, and have it be the user's responsibility to ensure the input manpage data is "normalized" (i.e. using all ANSI or all overstrike decorations). Thoughts?
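The \x08 and \x1b escapes in the sed commands above are GNU extensions, which is why gsed is needed on macOS. Here is a sketch of a variant that should also run under BSD sed, passing the literal bytes in through the shell (the function name overstrike_to_sgr is hypothetical). It only changes how the bytes are written, so the garbling seen with -lman would presumably remain.

```shell
# Convert overstrike to SGR sequences without GNU-only sed escapes:
#   c BS c  ->  ESC[1m c ESC[22m   (bold)
#   _ BS c  ->  ESC[4m c ESC[24m   (underline)
# The bold rule runs first, so "_ BS _" is treated as a bold underscore.
overstrike_to_sgr() {
    bs=$(printf '\b')
    esc=$(printf '\033')
    sed -e "s/\(.\)${bs}\1/${esc}[1m\1${esc}[22m/g" \
        -e "s/_${bs}\(.\)/${esc}[4m\1${esc}[24m/g"
}

# Inspect the result byte by byte.
printf 'p\bp_\bq\n' | overstrike_to_sgr | od -c
```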

@thomcc

thomcc commented Aug 3, 2023

On macOS this happens if you use the man binary provided by brew's man-db package. I don't remember why I added it, so brew uninstall man-db brought me back to using the system man implementation, which is more well-behaved about escape sequences.

Not sure if that's viable for anybody else, but removing it was a huge QoL improvement for me (back to bat's highlighting, and no more broken escapes written in my manpages), so I figured I'd mention it here in case someone else in the same situation hits it.

Example of what the brokenness looked like, since it doesn't quite seem the same as the others, although it's basically the same problem.

(Before)

LOCATE(1)                                BSD General Commands Manual                                LOCATE(1)

1mNAME0m
     1mlocate 22m— find filenames quickly

1mSYNOPSIS0m
     1mlocate 22m[1m-0Scims22m] [1m-l 4m22mlimit24m] [1m-d 4m22mdatabase24m] 4mpattern24m 4m...0m

1mDESCRIPTION0m
     The 1mlocate 22mprogram searches a database for all pathnames which match the specified 4mpattern24m.  The data‐

(After)

LOCATE(1)                                   General Commands Manual                                  LOCATE(1)

NAME
     locate – find filenames quickly

SYNOPSIS
     locate [-0Scims] [-l limit] [-d database] pattern ...

DESCRIPTION
     The locate program searches a database for all pathnames which match the specified pattern.  The database
     is recomputed periodically (usually weekly or daily), and contains the pathnames of all files which are
     publicly accessible.

Both had some amount of bat highlighting, but with the extra text it was just unreadable before.

kverb added a commit to kverb/dotfiles that referenced this issue Oct 19, 2023
See problem and solution on bat's
[github issue #652](sharkdp/bat#652 (comment))
kverb added a commit to kverb/dotfiles that referenced this issue Oct 20, 2023
See problem and solution on bat's
[github issue #652](sharkdp/bat#652 (comment))
@yshui

yshui commented Jan 14, 2024

do we have to use col -b? it removes boldness from text. can't bat parse the backspace characters and make the text bold?

@danbulant

Piping man into bat works for me, unlike setting MANPAGER, so fish users may try the following:

function m --wraps man
  man $argv | bat -pl man
end
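For bash or zsh users, the same wrapper could be sketched as a shell function. This assumes that piping man into bat behaves on your system as it does in the fish example above; the name m is just carried over from that example.

```shell
# Pipe man's output into bat instead of configuring MANPAGER.
m() {
    man "$@" | bat -pl man
}
```

After adding this to a shell startup file, `m sprintf` would show the page through bat.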

In case it helps someone

@Heyian

Heyian commented Apr 12, 2024

I'm using fish too and adding this line to the fish config file worked as well.

set -x MANROFFOPT "-c"

You also need the line below to use bat as stated in the doc, I'm putting it here in case it didn't come with your initial config and you missed it in the docs.

set -x MANPAGER "sh -c 'col -bx | bat -l man -p'"
