Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Any reason SHORT_STRINGS is undef'd on OpenBSD? #54

Closed
alzwded opened this issue Oct 10, 2022 · 9 comments
Closed

Any reason SHORT_STRINGS is undef'd on OpenBSD? #54

alzwded opened this issue Oct 10, 2022 · 9 comments

Comments

@alzwded
Copy link
Contributor

alzwded commented Oct 10, 2022

I built 6.24.01 on OpenBSD by updating the port, and it built with 7bit ASCII (b7) instead of wide strings (wide). There were also a lot of warnings along the lines of "format string is wchar_t* but argument was Char*".

Other than removing lines 57-58 from config_f.h, I don't see any other way of enabling wide strings on this system. And that would require a patch file in the port.

I've built it with wide strings, and it seems fine after a couple of minutes of testing. The OpenBSD 7.1 port of 6.22.04 is built with wide string support, and that one is definitely fine with unicode.

Is there any good reason for turning wide string support off on OpenBSD, or can that elif defined(__OpenBSD__) undef SHORT_STRINGS block be removed entirely?

git blame says merge patch from pkgsrc which is vague; I could understand building tcsh via pkgsrc to maybe have some strange issues with unicode, and that's why it was set the way it is, but that system wouldn't be my first pick to build tcsh on OpenBSD...

I propose deleting those two lines.

--- config_f.h.old      Mon Oct 10 21:30:53 2022
+++ config_f.h  Mon Oct 10 21:16:39 2022
@@ -54,8 +54,6 @@
 # else
 #  undef SHORT_STRINGS
 # endif
-#elif defined(__OpenBSD__)
-# undef SHORT_STRINGS
 #else
 # define SHORT_STRINGS
 #endif

Or maybe add some extra checks to make sure wide is the default on systems like i386/amd64 etc. (or, in general, systems that tend to have a keyboard attached to them, where I'd want to type into tcsh)

@suominen
Copy link
Member

I thought I'd hunt down the history of the patch in pkgsrc:

http://cvsweb.netbsd.org/bsdweb.cgi/pkgsrc/shells/tcsh/patches/Attic/patch-ab?rev=1.12&content-type=text/x-cvsweb-markup&only_with_tag=MAIN

Apparently this patch had to do with making the Meta key work. So that would be the thing to check, I guess.

@alzwded
Copy link
Contributor Author

alzwded commented Oct 11, 2022

Hmm... that's been around for a long time.

I use combinations like M-Bksp, M-p, M-d (and so on) quite extensively, and those seem to work as expected with Meta realized as an Esc prefix. I don't have either of the wsconsctl .metaesc or the X options turned on, so I wouldn't expect Alt to be treated as Meta in either environment (and it isn't).

I think I'll experiment more tonight with the ASCII build and combinations of wsconsctl and X settings to treat Alt as Meta, maybe something jumps out at me.

The thing is, with SHORT_STRINGS enabled, it builds cleanly; with them turned off, there are warnings around xprintf("%S") statements, which aren't a good sign. Maybe it used to be an issue at some point in the past, but it's no longer the case?

@suominen
Copy link
Member

I also use the Esc prefix for Meta, and I only have one OpenBSD VM solely for very infrequent tcsh testing, and I'm one of those "terminal + screen" people without X11, so I don't really have practical first-hand knowledge about this.

But I'm pretty sure the commit message is referring to using a Meta modifier key (such as Alt) resulting in 8-bit input.

It could very well be that this is ancient history that no longer needs to be kept. Those compiler warnings could also be the reason, although that's not what the commit log says.

Thank you for taking the time to test.

@alzwded
Copy link
Contributor Author

alzwded commented Oct 11, 2022

Okay, then I'm definitely testing with the Alt-as-meta settings.

Offtopic: I also use screen extensively, but I also like having some sort of X/window manager for mouse select and a web browser. And BSD's seem to lack a Romanian keyboard layout for the console, and I've yet to figure out where in the source tree those go, but I'll get there eventually...

@suominen
Copy link
Member

For the off-topic part:

  • OpenBSD: Looks like keyboard types are built-in to the wscons(4) driver
    • kbd(8) (for me kbd -l only works as root)
    • wsconsctl(8)
    • /etc/kbdtype
    • /etc/examples/wsconsctl.conf
    • There appears to be a user type keyboard...
  • NetBSD: Look at /etc/wscons.conf -- it has further references
    • Also look in /usr/share/wscons/keymaps
  • FreeBSD: See kbdcontrol(1) and look in /usr/share/syscons/keymaps

@alzwded
Copy link
Contributor Author

alzwded commented Oct 11, 2022

So I did some testing with:

  • tcsh 6.24.01 compiled with #define SHORT_STRINGS (wide)
  • tcsh 6.24.01 compield with #undef SHORT_STRINGS (7b, i.e. what you get from git without any changes)
  • in both cases I've run ./configure --disable-nls-catalogs, per /usr/ports/shells/tcsh/Makefile
  • on OpenBSD 7.1 i386

Console

wsconsctl keyboard.encoding=hu.metaesc
# picked hu to have some keys produce multibyte unicode

with SHORT_STRINGS defined:

  • Alt-p, Alt-f, Alt-b, Alt-d work as expected
  • Esc+pfbd do the same thing
  • did some echo asd | xxd with those alt-based key presses in between, there were no phantom characters, probably because...
  • tcsh does not deal with unicode characters properly, it's very messed up (like the Ő or Ó or Ö keys; the first one is definitely a two byte sequence)
  • typing into cat rendered some of those unicode characters, but not ideally
  • I've also tried /bin/ksh, which seemed to do a better job at unicode, but still not perfect
  • I don't think this system deals with unicode in the console properly, at all; maybe there's some settings that need to be turned on which I don't know about; I had LC_ALL set to something UTF-8, I tried messing around with set term, I looked through /etc/example/* and I couldn't find anything obvious

tl;dr no difference in meta key processing between the 7b and wide builds; but unicode input is broken in the wide build

X

xterm

I've tried a couple of setxkbmap -option altwin:... options; confirmed I typed them correctly with setxkbmap -print. I've also added xmodmap -e "keysym F1 = Meta_L" ; xmodmap -e "keysym Alt_L = Meta_L Alt_L".

xterm seems completely oblivious. This is regardless of shell. The only way to trigger M-p, M-f etc is with Esc. Obviously I noticed no differences in meta handling between the 7b and wide builds (but, unlike the console, unicode input works perfectly here, in the wide build).

Maybe xterm ignoring setxkbmap/xmodmap to add a Meta_L key is (another) user error; however, my Caps_Lock = Control_L mod is taken into account, so I don't know what to believe... Well, I blamed xterm itself and picked a different program...

qterminal

I've used xmodmap to make F1 Meta_L and Alt_L both Meta_L and Alt_L.

In both 7b and wide builds:

  • F1 works as a meta key modifier (e.g. F1-p works, with both keys pressed at the same time)
  • Likewise, Alt_L works as a meta key
  • Esc works as a meta key
  • No differences between the 7b and wide builds

Conclusion

I don't know how to interpret what I've experienced. I don't think turning SHORT_STRINGS on or off at build time has any impact on Meta key processing on PC. But I am concerned unicode input is more borked in the console compared to ksh. What would be the better default? Who knows.

If you read through this whole ramble, hey, thanks for your time!

Later edit

Ah, I think this is why unicode is kinda broken in the console.

If logging in via the text console, add export LC_CTYPE="en_US.UTF-8" to your ~/.profile.
The text console's UTF-8 support is a work in progress, and some non-ASCII characters may not display properly.

With that out of the way, I believe there is no issue with tcsh itself when built with SHORT_STRINGS enabled on this system. I guess the console unicode support will get better in the future. Although I should test some more. I was setting LC_ALL instead of LC_CTYPE, and there was a mention somewhere in the man pages to only set LC_CTYPE and that LC_ALL is insecure and so on.

@alzwded
Copy link
Contributor Author

alzwded commented Oct 15, 2022

I've written a small test program and also went through the OpenBSD kernel sources, and yes, input gets ASCII-fied by the vt100 emulator in the text console (which would be in line with the comment in their FAQ saying support is partial).

I've also tested over ssh (and this is where the actually-supported UTF-8 support comes in... ...I think) and all is well with tcsh and the meta key.

With that, I am fairly sure that tcsh on modern OpenBSD with SHORT_STRINGS defined interprets the meta key correctly:

  • meta as esc
  • wscons .metaesc option
  • xmodmap defined Meta_L/Meta_R in X

Let me know which you prefer:

  • include the patch to remove the __OpenBSD_ ifdef in the default config_f.h header
  • close this issue and insist porters or tcsh fans patch it out
  • have me do some more super-specific testing based on something I've missed

Cheers

@zoulasc
Copy link
Member

zoulasc commented Oct 15, 2022 via email

@alzwded
Copy link
Contributor Author

alzwded commented Oct 16, 2022

Wow, someone had some free time to fix those warnings 😄 Thanks!

@alzwded alzwded closed this as completed Oct 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants