Turn ansi escape sequences into html tags #238
Much better! Thanks! |
I just set up a new laptop and applied this patch again. I found another corner case with ^ symbols, so the PR is updated. Is anything missing to apply it? Just check with |
In w3mman bash I spotted this problem:
With your latest version. We'll get there. :-) |
Can you explain how you "translate" a missed character into a w3mman2html.cgi entry? It would be a shame if this fix does not get implemented. |
Let me check your last gotcha:
Looks like I forgot to add [ and ] to the characters allowed inside. It should be fixed now. Basically I use:
and compare with
Basically, the code takes any number of characters from printchar that are surrounded by these ANSI escape sequences and replaces the sequences with HTML tags. Anyway, checking what is different in some Fedora distros to cause this would be a better solution. |
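As a rough illustration of the approach described above (not the PR's actual code, which lives in w3mman2html.cgi and is not shown in this thread), here is a minimal Python sketch. The `PRINTCHAR` whitelist, the function name `sgr_to_html` and the choice of `<b>`/`<u>` tags are assumptions made for the example.

```python
import re

ESC = "\x1b"

# Hypothetical whitelist standing in for the script's "printchar" set.
# Every character that may appear inside a styled span has to be listed here.
PRINTCHAR = r"[A-Za-z0-9 _.,:;()\[\]^-]"

# ESC[1m ... ESC[0m or ESC[22m  -> bold
BOLD = re.compile(rf"{ESC}\[1m({PRINTCHAR}*){ESC}\[(?:0|22)m")
# ESC[4m ... ESC[0m or ESC[24m  -> underline
UNDERLINE = re.compile(rf"{ESC}\[4m({PRINTCHAR}*){ESC}\[(?:0|24)m")


def sgr_to_html(line: str) -> str:
    """Replace whitelisted text wrapped in SGR bold/underline with HTML tags."""
    line = BOLD.sub(r"<b>\1</b>", line)
    line = UNDERLINE.sub(r"<u>\1</u>", line)
    return line


if __name__ == "__main__":
    print(sgr_to_html("\x1b[1mls\x1b[0m \x1b[4mFILE\x1b[24m"))
```

The weakness of a whitelist is visible here: any character missing from `PRINTCHAR` makes the whole span fail to match, so the raw escape sequences stay in the output. That is the kind of "missed character" being reported in this conversation.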
Thanks! I found a few more. So [0m [1m [4m and [22m
|
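For anyone hunting down leftover sequences like the ones listed above, a small diagnostic along these lines may help. It is only a sketch: the names `SGR_MEANING` and `list_sgr_codes` are made up for this example, and code 24 (underline off) is added here as the usual counterpart of 4 even though it is not mentioned in the comment.

```python
import re

# Meanings of the SGR (Select Graphic Rendition) parameters relevant here.
SGR_MEANING = {
    0: "reset all attributes",
    1: "bold on",
    4: "underline on",
    22: "bold off (normal intensity)",
    24: "underline off",
}

# ESC [ <params> m, where params are digits separated by semicolons.
SGR_RE = re.compile(r"\x1b\[([0-9;]*)m")


def list_sgr_codes(text: str) -> list:
    """Return every SGR parameter found in text, in order of appearance."""
    codes = []
    for match in SGR_RE.finditer(text):
        for param in match.group(1).split(";"):
            # An empty parameter (as in ESC[m) means 0, i.e. reset.
            codes.append(int(param) if param else 0)
    return codes


if __name__ == "__main__":
    sample = "\x1b[1mNAME\x1b[22m \x1b[4mfile\x1b[24m\x1b[0m"
    for code in list_sgr_codes(sample):
        print(code, SGR_MEANING.get(code, "not handled in this sketch"))
```

Piping the output of man through something like this shows which codes a given groff setup really emits, which makes it easier to see what a converter still has to cover.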
Symbol I see other paths to fix with:
working on it |
Added more symbols like |
Thanks for the updates, almost there. Since it happens at the ends of lines I suspect it has something to do with the line-breaks. This is with
It also happens if you don't set COLUMNS, but isn't as visible, since it happens in the wrapped line. Setting COLUMNS makes it stand out. |
Ah yes, wrapped text does not include the newline; fixing it |
When the column limit splits a word, man wraps each part with its own start and end sequence: here that pattern is here: so I think split words are correctly covered. I tested it and it works on my side; can you try again: |
The diff hasn't changed, and I still see the same problem.
|
Yes, it works for me. The only thing I can think of is that the missing symbol is that lower dash; as you can see, for me it is -. I added _ previously, but yours looks smaller, so I have to check which Unicode character that is |
A fair point, whilst using The UTF-8 char is: ‐ Here is the hexl output:
So Does that help? |
Cool, I think we now have a solution that works for any char: anything that is not an escape. Let me know if that works now. |
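The "anything that is not an escape" idea can be sketched in Python roughly as follows. This is an illustration under assumptions, not the code from the PR: the pattern, the function name `sgr_to_html` and the tag choice are invented for the example.

```python
import re

# Start sequence (1 = bold, 4 = underline), then any run of characters that
# are not ESC, then an end sequence (0 = reset, 22 = bold off, 24 = underline off).
SPAN = re.compile(r"\x1b\[(1|4)m([^\x1b]*)\x1b\[(?:0|22|24)m")


def sgr_to_html(text: str) -> str:
    """Turn simple, non-nested SGR bold/underline spans into <b>/<u> tags."""
    def repl(match):
        tag = "b" if match.group(1) == "1" else "u"
        return f"<{tag}>{match.group(2)}</{tag}>"
    return SPAN.sub(repl, text)


if __name__ == "__main__":
    # U+2010 HYPHEN, the character identified earlier in this thread.
    print(sgr_to_html("\x1b[1mmaildir\u2010make\x1b[0m"))
```

Because `[^\x1b]` also matches newlines and any UTF-8 character, there is no list of symbols to keep extending. A span that contains another escape sequence still does not match, which is the nested case raised later in this conversation.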
It looks like it should, much appreciated! |
Added option for
|
Yet another one bites the dust. 😊 |
Merged, thanks for your contribution. |
I've found another gem in maildirmake(1) from the maildrop package:
Which results in:
|
This is problematic because currently nested syntax is not allowed:
There's a line for |
Fixed by setting GROFF_NO_SGR. Note that Debian disables the use of SGR escape sequences by default. |
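For context, GROFF_NO_SGR is an environment variable read by groff's terminal output driver (see grotty(1)): when it is set, bold and underline are drawn with the older backspace overstriking instead of SGR escape sequences. A minimal Python sketch of running man that way is below. The helper names `man_env` and `render_man` are hypothetical, and the real change was made inside w3mman2html.cgi itself.

```python
import os
import subprocess


def man_env(base=None):
    """Return a copy of the environment with SGR output disabled for groff."""
    env = dict(os.environ if base is None else base)
    env["GROFF_NO_SGR"] = "1"
    return env


def render_man(page: str) -> str:
    """Run man for one page and return its output (assumes man is installed)."""
    result = subprocess.run(
        ["man", page],
        env=man_env(),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With the variable set, the converter only has to deal with one output format regardless of the distribution's groff defaults.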
Looking much better, thanks!
|
So that probably removes any need for the changes merged into the cgi? |
Reverted this pull request. |
2023-01-21  Tatsuya Kinoshita  <tats@debian.org>

    * NEWS: Update NEWS to 0.5.3+git20230121.

2023-01-15  Tatsuya Kinoshita  <tats@debian.org>

    * scripts/w3mman/w3mman2html.cgi.in: Add GROFF_NO_SGR=1 to w3mman2html.cgi for non-Debian groff.
      Bug-Debian: tats/w3m#238
      Bug-Debian: tats/w3m#201
    * scripts/w3mman/w3mman2html.cgi.in: Revert "Turn ansi escape sequences into html tags".
      This reverts commit 44af9271e0e984544762e2212549f134c86b4418.
      cf. tats/w3m#238

2023-01-12  Tatsuya Kinoshita  <tats@debian.org>

    * fm.h, rc.c: Do not expand config value of tmp_dir.
    * config.h.dist, config.h.in, configure, configure.ac, rc.c: Use faccessat for rc_dir and tmp_dir.
    * local.c: Allow writeLocalCookie even when no_rc_dir.
    * main.c, rc.c: Call wtf_init in sync_with_option.
    * rc.c: Avoid modifying read-only rc_dir.
    * fm.h, main.c, proto.h, rc.c: Make tmp_dir if not found.

2023-01-09  Tatsuya Kinoshita  <tats@debian.org>

    * NEWS: Prepare NEWS for w3m 0.5.3+git202301XX.
    * doc-de/FAQ.html, doc-jp/FAQ.html, doc/FAQ.html: Remove obsolete documents.
    * doc-de/FAQ.html, doc-de/MANUAL.html: Wrap long lines to avoid Lintian warnings.

2023-01-07  Tatsuya Kinoshita  <tats@debian.org>

    * file.c: Only read a first title.
    * file.c, fm.h: Revert "Only read title when in head".
      This reverts commit 0189e8aa5c4c4919a9bbc4dcbe0e521aada51e3c.
      Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1020215

2023-01-06  Tatsuya Kinoshita  <tats@debian.org>

    * file.c: Indentation fix for HTMLtagproc1.

2023-01-06  Robert Alm Nilsson  <robert@robalni.org>

    * file.c, fm.h: Only read title when in head.
      Origin: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1020215

2023-01-06  Tatsuya Kinoshita  <tats@debian.org>

    * libwc/charset.c: Avoid locale sensitive tolower in wc_charset_to_ces.

2023-01-06  Sertaç Ö. Yıldız  <sertacyildiz@gmail.com>

    * libwc/charset.c: Fix charset declaration parser fails with turkish locale.
      Origin: https://bugzilla-attachments.redhat.com/attachment.cgi?id=160014
      Bug-Fedora: https://bugzilla.redhat.com/show_bug.cgi?id=249675
    * history.c: Use st_mtime instead of st_mtim.tv_sec to compile on macos.
      cf. tats/w3m#247

2023-01-06  Rene Kita  <mail@rkta.de>

    * html.c, html.h, tagtable.tab: Recognize link targets in dfn elements. Refactor html.c. Align in html.c.
      Origin: tats/w3m#259
      Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1018696
    * Makefile.in, form.c, main.c, util.c, util.h: Handle failed system calls.
    * display.c, display.h, file.c, form.c, main.c, proto.h, terms.h: Move declarations to appropiate header files.
      Origin: tats/w3m#257
      Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=398989
    * entity.js, etc.c, table.c, tests/allentity.expected:
    * tests/allentity.html: Skip soft hyphen when reading token. Fix generated HTML for entity test.
      Origin: tats/w3m#256
      Bug-Debian: tats/w3m#224
      Bug-Debian: tats/w3m#258
      Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=830173
    * file.c: Check LESSOPEN to avoid undefined behaviour. Refactor lessopen_stream.
      Origin: tats/w3m#254
      Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=991608

2023-01-05  Markus Hiereth  <translation@hiereth.de>

    * po/de.po: Update German message catalogue.
      Origin: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1011945#10

2023-01-05  Rene Kita  <mail@rkta.de>

    * buffer.c: Exit with error if a new buffer can't be allocated.
      Origin: https://git.sr.ht/~rkta/w3m/commit/1f88544c1a009ed2088ff20973bcfffe6cbcb5de
      Bug-Debian: tats/w3m#232
      Bug-Debian: tats/w3m#233
    * history.c, history.h: Merge history file if it was modified after start.
    * history.h, proto.h: Move declarations to the appropriate header file.
    * history.c: Add comment to explain placement of the ifdef.
    * history.c, proto.h: Let loadHistory return an error code.
    * history.c: Use 'goto fail' to remove code duplication.
      Origin: tats/w3m#247
      Bug-Debian: tats/w3m#176

2023-01-05  Alberto Fanjul  <albertofanjul@gmail.com>

    * scripts/w3mman/w3mman2html.cgi.in: Turn ansi escape sequences into html tags.
      Origin: tats/w3m#238
      Bug-Debian: tats/w3m#201

2023-01-04  Tatsuya Kinoshita  <tats@debian.org>

    * po/de.po, po/it.po, po/ja.po, po/sv_SE.po, po/w3m.pot, po/zh_CN.po:
    * po/zh_TW.po: Update PO strings.
    * doc/MANUAL.html, doc/README.img, libwc/wc_types.h, main.c, rc.c: English fixes.
      cf. tats/w3m#241

2023-01-04  Rene Kita  <mail@rkta.de>

    * rc.c: Remove unused variable.
    * table.c: Remove a warning for bzero with GCC 12.
    * file.c: Fix potential null pointer dereference.
    * .github/workflows/build.yml: Don't error out on deprecated declaration warnings.
      Origin: tats/w3m#255
      cf. tats/w3m#252

2023-01-04  nico  <smnicolas@gmail.com>

    * doc/MANUAL.html, doc/w3m.1, fm.h, main.c, rc.c, terms.c: Add high-intensity colors option and cli flag.
      Origin: tats/w3m#251
      cf. tats/w3m#250
      Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=626291

2023-01-04  Trafficone  <trafficone@gmail.com>

    * doc/README.SSL, doc/README.keymap, doc/README.menu: Translate from doc-jp.
    * doc/README.cookie, doc/README.func, doc/README.img, doc/README.m17n:
    * doc/README.passwd: Clarified wording. Minor grammar changes.
      Origin: tats/w3m#241

2022-12-25  Tatsuya Kinoshita  <tats@debian.org>

    * configure: Update configure with acinclude.m4.

2022-12-25  Sam James  <sam@gentoo.org>

    * acinclude.m4: Fix configure tests broken with Clang 16.
      Origin: tats/w3m#248

2022-12-25  Rin Okuyama  <rokuyama.rk@gmail.com>

    * image.c, terms.c: For sixel, no need to round image size to multiple of character size.
      Origin: tats/w3m#246
    * image.c: Display resized image for OSC 5379 (mlterm).
      Origin: tats/w3m#245

2022-12-25  Rene Kita  <mail@rkta.de>

    * doc/README.siteconf: Say what the comment character is. Use the comment character in Examples.
      Origin: tats/w3m#237
    * main.c: Retry if loading of a file fails when argv_is_url.
      Origin: tats/w3m#235
      Bug-Debian: tats/w3m#210
      Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=537761
      Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=946440

2022-12-25  NRK  <nrk@disroot.org>

    * image.c: remove duplicate declaration.
    * cookie.c, entity.c, file.c, frame.c, func.c, image.c, linein.c:
    * mailcap.c, main.c, rc.c, rc.h, table.c, terms.c, terms.h:
    * w3mbookmark.c, w3mhelperpanel.c: fix all -Wmissing-prototypes warnings.
    * file.c, history.c, history.h, indep.c, indep.h, mailcap.c, proto.h:
    * rc.c, terms.c, url.c: fix some -Wstrict-prototypes warnings.
      Origin: tats/w3m#234

2022-12-25  Rene Kita  <mail@rkta.de>

    * .github/workflows/build.yml: Add GitHub Action to build source when pushing.
      Origin: tats/w3m#228

2022-12-21  Tatsuya Kinoshita  <tats@debian.org>

    * po/de.po, po/it.po, po/ja.po, po/sv_SE.po, po/w3m.pot, po/zh_CN.po:
    * po/zh_TW.po: Update PO strings.

2022-12-21  Rene Kita  <mail@rkta.de>

    * etc.c, fm.h, history.c, rc.c: Add option to set directory for temporary files.
      Origin: tats/w3m#219
      cf. tats/w3m#130

2022-12-21  Yash Lala  <yashlala@gmail.com>

    * rc.c: Use `Strnew_charp()` to create `char *` instead of `strdup()`.
    * rc.c: refactor: Substitute some clunky code with a `strdup()`.
    * doc/FAQ.html, doc/MANUAL.html, doc/w3m.1, rc.c: Set `rc_dir` based on `W3M_DIR` environment variable.
      Origin: tats/w3m#207
      cf. tats/w3m#130

2022-12-20  Tatsuya Kinoshita  <tats@debian.org>

    * etc.c: Fix potential overflow in checkType.
    * etc.c: Fix m17n backspace handling causes out-of-bounds write in checkType. [CVE-2022-38223]
      Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1019599
      Bug-Debian: tats/w3m#242
fixes #201