Experimental option: build and link static ncurses with builtin terminfo (no system dep!) #1271

Closed
wants to merge 1 commit into from

6 participants

@geoff-codes

So I spent some time (too much time, like a week) finally figuring out how to build ncurses such that it is completely independent from external terminfo, and can be linked like any normal static library.

Short version:
This pull adds a configure flag --with-builtin-terminfo which calls a script in build_tools which in turn downloads, configures, and sets this up for fish in one go. There should be no other functionality change. A more lengthy rationale follows.

Background/refresher:
I realize most of you will already know all this, but I thought it might be worth recapping. Fish is super pretty and can do things like fast auto-syntax-highlighting because, unlike other shells, fish draws to the terminal (exclusively?) by way of the [n]curses library, which basically provides an abstraction over "what heinous escape sequence is needed to do some crazy thing like move the cursor arbitrarily, switch colors, etc." Curses is a UNIX (though not POSIX) standard, although the standard is quite limited. There used to be a few different curses implementations, but today I think only ncurses, NetBSD curses, and illumos libxcurses survive. (Does fish even build against these today? There's some code in configure.in, etc., but I'm not sure.)

In any case, ncurses is more or less the standard today. It's a good library, but it's kinda funky since the codebase is so old. It (typically) works by reading a terminfo "database" (a directory structure, usually at /usr/share/terminfo, where a bunch of terminal capability files are kept) and loading the entry that corresponds to what's set in the $TERM variable.

The problem with that approach is that even when a new version of the library is built statically (with a typical install), the location of the terminfo directory is hardcoded. Therefore, if this directory is moved, modified or corrupted, fish built like this will simply die. So the system library/terminfo is almost always used.
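To make the lookup concrete, here is a rough shell sketch of how a terminfo file is located for a given terminal name. The helper name `find_terminfo` and its default search path are assumptions for illustration only; the real ncurses search also consults other locations (for example $TERMINFO_DIRS and whatever directory was compiled in). Entries live in a subdirectory named after the first letter of the terminal name (x/xterm), or after that letter's hex code on systems such as Mac OS X (78/xterm).

```shell
#!/bin/sh
# Hypothetical helper: print the path of the terminfo file for a terminal name.
# Usage: find_terminfo NAME [DIR...]
find_terminfo() {
    name=$1; shift
    # Default search path is an assumption; real ncurses may look elsewhere too.
    [ $# -gt 0 ] || set -- "${TERMINFO:-$HOME/.terminfo}" /usr/share/terminfo
    first=$(printf '%s' "$name" | cut -c1)
    hex=$(printf '%x' "'$first")   # hex code of the first letter, e.g. x -> 78
    for dir in "$@"; do
        for sub in "$first" "$hex"; do
            if [ -f "$dir/$sub/$name" ]; then
                printf '%s\n' "$dir/$sub/$name"
                return 0
            fi
        done
    done
    return 1
}

find_terminfo "${TERM:-dumb}" || echo "no terminfo entry found for ${TERM:-dumb}" >&2
```

If the function finds nothing for the current $TERM, that is roughly the situation in which a curses program built against an external database cannot start.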

As it turns out, ncurses can in fact be configured to include terminfo in the library itself. Why no one does this, I'm not sure... though I expect it has something to do with the fact that when built from source, ncurses has some ~120 custom configure options, most of which change the ABI, and it's not clear at all which ones do what you want. I'm almost positive, in fact, that no one is doing this right now, since in the process I discovered a bug that breaks this configuration that was introduced last August (sent upstream a minute ago). The script includes a small patch for now, and just emits a warning if the patch fails.

Nonetheless, I think there are a number of reasons to consider this:

  • Relying on system ncurses and terminfo sucks, because it's often old or buggy. E.g., Mac OS X 10.9 ships with ncurses 5.7, circa 2008.
  • Relying on system ncurses and terminfo is slow: it has to dlopen the shared library, which in turn opens the terminfo file for the application with each invocation.
  • It's easy to mess up the system config, as mentioned above, or by accidentally installing a bad file to ~/.terminfo.
  • This seems much faster, on my machine at least. And there are fewer jitters and artifacts.
  • When fish goes fully multithreaded/reentrant, it's probably going to need to use a threaded curses lib as well (or no ...?), and it's not clear one can rely on the system lib to have been configured that way.
  • Allows for more filesystem portability and resiliency against system weirdness.
  • Etc., etc. Add your own.

Cons:

  • This current first implementation approximately doubles the linked binary size. If it seems even larger, try stripping your binaries. This is only because I've included basically every usable terminfo entry here; with some investigation, one could probably cut this down to ~10% of that.
  • Etc., etc. Add your own.

So, um, yeah. Let me know how this works for you? Other thoughts/considerations?

@ridiculousfish
The user-friendly shell member

Wow! This was quite an effort!

When would you recommend that users configure with the --with-builtin-terminfo flag? Also, what's the status of the invisible-island.net link, from which ncurses is downloaded? Is that official and/or likely to stick around?

For the record, I ultimately hope to drop the curses dependency entirely, and just parse the termcap files directly, like zsh does - perhaps incorporating zsh's code directly.

@geoff-codes

So, in no real order:

  • zsh curses actually just wraps ncurses itself (although I believe it can also use NetBSD or another XSI curses/terminfo lib). I just double checked that; it doesn't get past configure without one.
  • Yes, http://invisible-island.net is the official site, being the homepage of Thomas E. Dickey. He's been in charge of development since ~1996. He is also the author of xterm, lynx, mawk, and BSD yacc, amongst other notables. My impression is he's something of a maniac/genius — all those programs and another 10 or so get at least weekly patches to this day.
  • Ncurses is quite interesting because, while it is a GNU project, it is one of the few (perhaps the only one?) to be licensed under a BSD-like license. It's an interesting read (search for 'the ncurses license'), if you're interested in that sort of thing. I do believe the site is likely to stick around for some time; the changelog shows it has been regularly maintained there for approx. 18 years.

Do you think dropping the curses dependency is feasible? From my (admittedly rather cursory) knowledge of the code base here, that seems like it would be a daunting task. While I know fish reimplements a few term functions itself, it seems that even these eventually work back around to hooking a (lower-level) [n]curses function. Having another quick peek in term.h and curses.h, it seems many, many functions originate in / depend on these headers, no?

--with-builtin-terminfo is a poor name, but I wasn't sure what to call it. I didn't want to make it ridiculously long, like --build-and-link-static-ncurses-with-builtin-terminfo. Yeech.


So the really neat part here, recalling what you mentioned about zsh doing its own termcap parsing (terminfo just being a "newer, faster" encoding for IO purposes than the older termcap format, although they're both old and weird), is that this actually completely eliminates all of that business: there are no files to read in at all. All the necessary termcap/terminfo data is included in the library, and thus in fish itself. (Hence also why 'HAVE_BROKEN_DEL_CURTERM' is needed here too: it's not broken, it just doesn't exist.) In fact, I've intentionally set it here to completely ignore all external terminfo; that way, if another program really needs to use a different/modified capabilities file for some reason, you can go ahead and change it all you like; it doesn't affect fish. ($TERM still will, mind you, as the different terminal definitions are still selectable; but the data is all built in.)

And finally, as to 'when would one want to do this'... well, I think, possibly... always. I was actually hoping someone might be able to jump in and tell me why one wouldn't want to do this. The only reason I can really think of is that there would be no concept of (strictly speaking) 'user definable terminal capabilities' for fish. But I think that's a very, very outdated idea, since we're not actually talking about terminals really, just terminal emulators, most or all of which just pretend to be xterm anyway. I don't think anyone using fish is really going out to the computer store and buying a new DEC terminal which needs a custom escape sequence programmed in to move the cursor around before they dial into ARPAnet and check their UUCP messages. Maybe, who knows. But I consider myself a halfway decent "power user" of the shell, and personally, I've never hand-edited my own termcap files, ever. Nor would I really want to, I don't think. When something breaks, I just reinstall. So I thought, just do away with that whole mechanism, so it can't break in the first place.

@xfix
The user-friendly shell member
xfix commented Jan 28, 2014

--with-builtin-terminfo is a poor name, but I wasn't sure what to call it. I didn't want to make it ridiculously long, like --build-and-link-static-ncurses-with-builtin-terminfo. Yeech.

My proposal would be to use --static-terminfo.

@ridiculousfish
The user-friendly shell member

Thank you for the correction about zsh, which does appear to use curses under the hood. Based on that I don't think we can drop the curses dependency any time soon.

My two concerns are:
1. Increased memory usage.
2. Complicating the build process. For example, if building ncurses requires having gcc installed or something else huge, then the homebrew recipe could get out of hand, and we're more likely to break the build on niche platforms like the BSDs.

I'm analyzing the memory usage increase and will update this comment.

Activity Monitor reports an increase in RPRVT from 1.9 MB to 4.4 MB. This is entirely due to an increase of 2.6 MB in the __DATA section, which is writable.

allmemory reports:

Process Name [ PID]    Architecture    PrivateRes/NoSpec   Copied    Dirty    Swapped   Shared/NoSpec
===================    ============    ================= ========= ========= =========  =============

 fancy_fish [27845]:       64-bit         1390  /    1319       628      1104         0  20500 / 20239
normal_fish [27913]:       64-bit          714  /     686        47       440         0  20365 / 20034

The key change here is the increase in non-speculatively-read private pages, the second number in the PrivateRes/NoSpec column, which goes from 686 (normal_fish) to 1319 (fancy_fish). This means that we're consuming an extra 633 pages, or about 2.6 MB at 4 KB per page. So it looks like the pages really are resident: we're consuming 2.6 MB more physical RAM.
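For anyone who wants to redo the arithmetic, a minimal sketch follows. The page counts are taken from the allmemory output above; the 4096-byte page size is an assumption about the machine being measured.

```shell
#!/bin/sh
# Non-speculative private page counts from the allmemory table.
fancy_pages=1319
normal_pages=686
page_size=4096   # assumption: 4 KiB pages

extra_pages=$((fancy_pages - normal_pages))
extra_bytes=$((extra_pages * page_size))

echo "extra pages: $extra_pages"   # 633
echo "extra bytes: $extra_bytes"   # 2592768, i.e. roughly 2.6 MB
```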

I wonder if we can track down what's bringing them in.

@geoff-codes

Right. To your concern about the build, no, it doesn't take anything in particular to compile. Any C89 compiler should do it. The one part that does complicate things slightly (I don't really like it myself) is the whole download-and-extract business. If you decide this might be worth doing, it could be worthwhile to just pull in a known good version and update it every so often. Works for now though.

As for memory: that sounds exactly right, and it's entirely because I've included literally every terminfo entry that can be built in like this for thoroughness' sake; this should probably be put in a comment and slimmed down. (I didn't want to risk excluding something, but this is surely overkill.)

This is the part of the script that reads: --with-fallbacks=[BIG LIST].
This might be erring too much to the other side, but then again, maybe not:

--with-fallbacks=ansi-generic,ansi-mini,color_xterm,dtterm,dumb,Eterm,Eterm-256color,Eterm-88color,eterm-color,gnome,gnome-256color,guru,hurd,iTerm.app,konsole,konsole-16color,konsole-256color,konsole-base,konsole-linux,konsole-solaris,konsole-vt100,kterm,kterm-color,linux,linux-16color,linux-basic,mac,mlterm,mlterm-256color,mrxvt,mrxvt-256color,mterm,mterm-ansi,mvterm,nsterm,nsterm-16color,nsterm-256color,pty,putty,putty-256color,putty-vt100,rxvt,rxvt-16color,rxvt-256color,rxvt-88color,rxvt-basic,rxvt-color,screen,screen-16color,screen-256color,simpleterm,st-16color,st-256color,st52,st52-color,stv52,tt,tt52,unknown,vt100,vt102,vte,vte-256color,xterm,xterm-16color,xterm-256color,xterm-88color,xterm-basic,xterm-bold,xterm-color,xterm-utf8,xterm-vt220,xterm-vt52,xterm1,xtermc,xtermm,xterm

I'll squash that in in a second, see if that makes a dent.

@terlar
terlar commented Jan 28, 2014

Would be nice to also have xterm-termite in there. I know it has been quite a pain to have to copy that over to every server that I want to SSH into.

@geoff-codes

Done. Try that one on for size? The binary is 1.5M smaller...

@terlar I don't see that one in the main list... do you have a link to a file? Also, a quick search says maybe just try using xterm, if I'm reading that right?

@terlar
terlar commented Jan 28, 2014

I don't know much about terminfo, but it has one "compiled" version and one "source" version?

I think this is the very source file (looking at the repo):
https://github.com/thestinger/termite/blob/master/termite.terminfo

And this is the compiled (looking in my system dir):
https://db.tt/dXcnDB4T

Think it should be very similar though, if not identical.

Edit:
Also, as mentioned in the issue you linked, it is preferable to use the specific entry. And it seems I was correct about the statement above: you can just run tic on the source file to compile it.
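As a rough illustration of that last step, compiling a source entry into a per-user directory could look like the sketch below. The helper name `compile_terminfo` and the default destination are assumptions; `tic -x -o DIR FILE` is the standard ncurses invocation (-x keeps user-defined capabilities, -o chooses the output directory).

```shell
#!/bin/sh
# Hypothetical helper: compile a terminfo source file into a private directory.
# Usage: compile_terminfo SOURCE [DEST_DIR]
compile_terminfo() {
    src=$1
    dest=${2:-$HOME/.terminfo}   # assumed default; ncurses also searches here
    command -v tic >/dev/null 2>&1 || { echo "tic not found" >&2; return 127; }
    mkdir -p "$dest" && tic -x -o "$dest" "$src"
}

# Example (file name as in the termite repository):
# compile_terminfo termite.terminfo
```

This still has to be repeated on every machine one logs into, which is the inconvenience described above.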

What do you think?

@ridiculousfish
The user-friendly shell member

What I hope to discover is why every terminfo entry is being paged into physical memory.

@geoff-codes

@terlar So I'm gonna have to get back to you on that. When I try to add that entry it comes back with a bunch of multiply defined symbols. I think it might be formatted a bit incorrectly (for this usage at least). Although, does fish need this, or does termite? This is basically one of the upshots I hoped to gain here: separating the terminfo used by applications from what fish uses. What happens if you set TERM to xterm-termite and set -gx fish256 true? Is it broken?

@ridiculousfish Ah. Because it's not bundling a resource. Recall it's not in a TEXT section. It is literally being compiled in. Run infocmp -E and you'll see precisely what I mean.

While I'm sure this could be translated more efficiently (use an array or binary encoding or something), keep in mind with your comparison that you also need to count the external resources being used by "normal_fish" that are not being used by "fancy_fish", i.e. the shared ncurses lib that is being dlopen'd, plus wherever the terminfo file(s) (CUR_TERM) is being mapped. allmemory doesn't show this properly (at least on a Mac), since the dylib is put into the dyld_shared_cache... but I don't think that library can ever really be unloaded while fish is running.

@geoff-codes

@terlar My bad, it was a typo on my part. It's now added in the pull. See the line comments for how one can do this.

@geoff-codes geoff-codes commented on the diff Jan 28, 2014
build_tools/build_new_ncurses.sh
+checkdl(){ sh -c "$@ http://invisible-island.net 1>&2" 2>/dev/null; }
+nodl(){ echo 'Error: cannot find usable wget or curl.'; exit 1 ; }
+DLTOOL="curl -Ls" && checkdl "$DLTOOL" || DLTOOL="wget -qO-" && checkdl "$DLTOOL" || nodl
+
+BASEDIR=$PWD; OURTMP="$(mktemp -d /tmp/fish-ncurses-XXXXXX)" && cd "$OURTMP"
+mkdir -p src; cd src
+
+echo 'Downloading latest ncurses development source...'
+$DLTOOL http://invisible-island.net/datafiles/current/ncurses.tar.gz | gzip -dc |
+ tar --strip-components 1 -x
+
+echo 'Downloading latest terminfo...'
+$DLTOOL http://invisible-island.net/datafiles/current/terminfo.src.gz |
+ gzip -dc > misc/terminfo.src
+
+echo 'Downloading extra terminfo...'
@geoff-codes
geoff-codes added a note Jan 28, 2014

To add additional terminfo entries, download the 'source' terminfo and append it to the end of misc/terminfo.src.

@geoff-codes geoff-codes commented on an outdated diff Jan 28, 2014
build_tools/build_new_ncurses.sh
++ tp->Booleans[i] = FALSE;
++
++ for_each_number(i, tp)
++ tp->Numbers[i] = ABSENT_NUMERIC;
++
++ for_each_string(i, tp)
++ tp->Strings[i] = ABSENT_STRING;
++
+@@ -77,2 +102,0 @@
+-
+- _nc_init_termtype(tp);' | patch -p0 || echo "Patch failed. Not needed any more? ..."
+echo 'Configuring ncurses...'
+mkdir $OURTMP/stage; cd $OURTMP/stage; cp -R $OURTMP/src/* .; ./configure --prefix=$OURTMP/stage1-install --without-manpages --disable-getcap --disable-lp64 --disable-rpath-hack --disable-safe-sprintf --disable-termcap --enable-colorfgbg --enable-ext-colors --enable-ext-funcs --enable-ext-mouse --enable-hashmap --enable-home-terminfo --enable-interop --enable-largefile --enable-no-padding --enable-overwrite --enable-pc-files --enable-pthreads-eintr --enable-scroll-hints --enable-sigwinch --enable-sp-funcs --enable-symlinks --enable-tcap-names --enable-term-driver --enable-tparm-varargs --enable-weak-symbols --enable-wgetch-events --enable-widec --enable-xmc-glitch --with-curses-h --with-cxx --with-cxx-binding --with-manpage-format=formatted,uncompressed --with-manpage-renames --with-manpage-symlinks --with-manpage-tbl --with-pkg-config --with-pthread --with-xterm-new --without-ada --without-debug --without-dlsym --without-getcap --without-getcap-cache --without-gpm --without-libtool --without-rcs-ids --without-tests
+make; make install; cd $OURTMP; export PATH=$OURTMP/stage1-install/bin:$PATH; cp -R src/* .
+echo 'Building standalone ncurses...'
+./configure --disable-database --disable-db-install --disable-getcap --disable-lp64 --disable-rpath-hack --disable-safe-sprintf --disable-termcap --enable-colorfgbg --enable-ext-colors --enable-ext-funcs --enable-ext-mouse --enable-hashmap --enable-home-terminfo --enable-interop --enable-largefile --enable-no-padding --enable-overwrite --enable-pc-files --enable-pthreads-eintr --enable-scroll-hints --enable-sigwinch --enable-sp-funcs --enable-symlinks --enable-tcap-names --enable-term-driver --enable-tparm-varargs --enable-weak-symbols --enable-wgetch-events --enable-widec --enable-xmc-glitch --with-curses-h --with-cxx --with-cxx-binding --with-default-terminfo-dir= --with-manpage-format=formatted,uncompressed --with-manpage-renames --with-manpage-symlinks --with-manpage-tbl --with-pkg-config --with-pthread --with-terminfo-dirs= --with-xterm-new --without-ada --without-debug --without-dlsym --without-getcap --without-getcap-cache --without-gpm --without-libtool --without-progs --without-rcs-ids --without-termpath --without-tests --with-fallbacks=ansi-generic,ansi-mini,color_xterm,dtterm,dumb,Eterm,Eterm-256color,Eterm-88color,eterm-color,gnome,gnome-256color,guru,hurd,iTerm.app,konsole,konsole-16color,konsole-256color,konsole-base,konsole-linux,konsole-solaris,konsole-vt100,kterm,kterm-color,linux,linux-16color,linux-basic,mac,mlterm,mlterm-256color,mrxvt,mrxvt-256color,mterm,mterm-ansi,mvterm,nsterm,nsterm-16color,nsterm-256color,pty,putty,putty-256color,putty-vt100,rxvt,rxvt-16color,rxvt-256color,rxvt-88color,rxvt-basic,rxvt-color,screen,screen-16color,screen-256color,simpleterm,st-16color,st-256color,st52,st52-color,stv52,tt,tt52,unknown,vt100,vt102,vte,vte-256color,xterm,xterm-16color,xterm-256color,xterm-88color,xterm-basic,xterm-bold,xterm-color,xterm-utf8,xterm-vt220,xterm-vt52,xterm1,xtermc,xtermm,xterm-termite &&
@geoff-codes
geoff-codes added a note Jan 28, 2014

... and add the name of the new entry/entries to the comma-delimited list at the end of this line.
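Taken together with the earlier note about misc/terminfo.src, the two steps could be scripted roughly as follows. This is only a sketch: `add_fallback` is a hypothetical helper that is not part of the pull, and the short FALLBACKS value stands in for the full list in the script.

```shell
#!/bin/sh
# Hypothetical helper: append NAME to a comma-delimited fallback list,
# leaving the list unchanged if NAME is already present.
add_fallback() {
    list=$1; name=$2
    case ",$list," in
        *",$name,"*) printf '%s\n' "$list" ;;
        *)           printf '%s\n' "${list:+$list,}$name" ;;
    esac
}

# Step 1 (left as a comment because it needs the ncurses source tree):
#   cat termite.terminfo >> misc/terminfo.src
# Step 2: extend the list handed to --with-fallbacks.
FALLBACKS="xterm,xterm-256color,screen"
FALLBACKS=$(add_fallback "$FALLBACKS" "xterm-termite")
echo "--with-fallbacks=$FALLBACKS"
```

Checking for duplicates matters here because a name listed twice is one plausible way to end up with multiply defined symbols at link time.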

@ridiculousfish
The user-friendly shell member

The data is compiled in, but I'm not sure why it's being paged in...something must be reading or writing those pages.

@geoff-codes

I'm not familiar enough with how memory management works to really speak on that... my assumption was that with a binary of that size, the whole thing would get wired up, and then the OS decides if/when to page out. Does it leak/continue to grow? Or does it just sit there?

@zanchey
The user-friendly shell member
zanchey commented Jan 29, 2014

It would be nice to quantify the speedup. The terminfo database should only be queried during setupterm() or newterm(), and (at least on systems like mine) is probably in the cache as we have a hundred or so curses programs running at once!
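One crude way to get a number would be to time many short-lived invocations of each build. The sketch below is an assumption-laden starting point, not a rigorous benchmark: `time_runs` is a hypothetical helper, the fish binary paths in the comments are placeholders, and `date +%s` only gives whole-second resolution, so the run count needs to be large.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and print elapsed whole seconds.
# Usage: time_runs N command [args...]
time_runs() {
    n=$1; shift
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null 2>&1 || true
        i=$((i + 1))
    done
    end=$(date +%s)
    echo $((end - start))
}

# Example comparison (paths are placeholders for the two builds):
#   time_runs 500 ./fish-system  -c exit
#   time_runs 500 ./fish-builtin -c exit
time_runs 5 true
```

Running each variant a few times and discarding the first run helps separate cache effects from any real difference in startup cost.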

Of course, I have the advantage of running on Debian where these things are well-maintained. Mac OS X is an entirely different kettle of fish.

And finally, as to 'when would one want to do this'... well, I think, possibly... always. I was actually hoping someone might be able to jump in and tell me why one wouldn't want to do this. The only reason I really think of is that there would be no concept of (strictly speaking) 'user definable terminal capabilities' for fish. But I think that's a very, very outdated idea, since we're not actually talking about terminals really, just terminal emulators, most or all of which just pretend to be xterm anyway.

My concern is that this changes the fix for problems like #1060 from 'add a new local terminfo entry' to 'patch and recompile fish'. This is not common but we really want to be encouraging terminal developers to advertise themselves as something other than xterm, even if it is somewhat of a pipedream, and statically-compiling the terminfo is not helpful in that regard.

I am not convinced that downstream (e.g. Debian) would be terribly impressed by us shipping a static build of ncurses, although supporting it would be useful (there are separate packages in Debian for bash and bash-static, zsh and zsh-static, etc.).

@geoff-codes

@ridiculousfish:
I found your missing memory. Those active pages can be eliminated by adding the configure flag, no joke, --disable-leaks. I've squashed that in for now so you can confirm. As it turns out, it has nothing to do with the terminfo entries we've included. It's the library proper you're seeing. It works the other way by design, and yep, it's an optimization. In fact this flag is not meant to be used outside of testing, or so it says in the docs at least.

Testing/development:
  --disable-leaks         test: free permanent memory, analyze leaks

For testing, compile-in code that frees memory that normally would not
be freed, to simplify analysis of memory-leaks.
Any implementation of curses must not free the memory associated with
a screen, since (even after calling endwin()), it must be available
for use in the next call to refresh(). There are also chunks of
memory held for performance reasons. That makes it hard to analyze
curses applications for memory leaks. To work around this, build
a debugging version of the ncurses library which frees those chunks
which it can, and provides the _nc_free_and_exit() function to free
the remainder on exit. The ncurses utility and test programs use this
feature, e.g., via the ExitProgram() macro.

@geoff-codes

@zanchey Having let this sit for a little while, I think the points you raise are enough to table any discussion of making this a user-facing option; at least for the time being. Basically, to break it down:

  • The issue you linked, and the fact that @terlar had a custom terminfo file he needed, indicate that no, it is not appropriate to exclude the use of external terminfo, at least in a large number of cases. Though I don't think using the "fallback" option would do any damage.
  • While it's stable enough for me, it's still not really clear what's up with the memory issue.
  • Pulling in external packages maintained elsewhere via a script is error-prone and raises possible security issues.
  • Frankly, there are just more important, and interesting, issues at this juncture.

However:

  • I would strongly encourage revisiting this around the time of a next release, as something to do on a platform by platform basis when preparing binaries, and perhaps consider documenting as an option for package management and distro maintainers (I'm thinking mostly about homebrew). Because it definitely makes for a better experience on Mac OS X.

I'll leave it up to all of you to decide how/when/if to close this — but I would like to be able to delete the fork/branch at some point. Does anyone know, can I do that without the code disappearing here?

I could maybe push a final commit onto this pull which leaves the script, but comments out the bits in the configure script, adding a link to this issue... or you could pull it into a dead branch? Let me know.

And @ridiculousfish. I wouldn't give up on the idea of eliminating the dependency entirely. You might want to take a look when you have the time at some of the code from my second favorite shell (and I've just recently in fact volunteered to mirror this on Github via fast export, coming soon!), like this here.

@zanchey zanchey modified the milestone: fish-tank, next-minor Mar 11, 2015
@krader1961
The user-friendly shell member

Would someone please close this? It is ancient and cannot be merged in its present state. The idea of reinventing terminfo/termcap, or linking it statically, to avoid a run-time dependency on something like ncurses is silly. If we're going to drop that dependency and stick to a superset of ANSI X3.64, which is itself not a silly idea, then there is no reason to go to the trouble of statically linking legacy libraries. We might as well just hardcode the ANSI X3.64 escape sequences (which has already been done to some extent).

P.S., I started programming in an era when "smart" terminals like a DEC VT100 were extremely rare and IBM punched cards were common. So I am extremely aware of the issues involved in this type of change and why libraries like ncurses exist.

@geoff-codes

While I have to disagree (it's the runtime dependency on the system terminfo "database", not the ncurses library, that is the motivating factor), I have no interest in engaging in a flame war at this time.

So if @owners want to close this, that's fine.
