build process -- Food for thought #42

Closed
marcastel opened this issue Jun 18, 2017 · 79 comments

Comments

@marcastel

marcastel commented Jun 18, 2017

Should we change/adapt the build process?

(with the objective to make it easily maintainable and accessible to a greater number)

Small knowledgeable community

I know nobody among my fellow developers who has inside knowledge of the build process beyond bin/package make, whether to maintain it or simply to customise it. Neither do I.

That said, they know how to maintain and tweak GNU Autotools and CMake toolchains.

Note: Do not jump to conclusions here. I am not trying to promote Autotools or CMake; quite the contrary. I simply want to emphasise the lack of openness of the build process's logic and toolchain.

Scarce documentation

Embedded usage documentation in any C or Korn shell script is a fantastic asset of the AST developments, and certainly immensely underused among users of AST packages.... except for the AST developers who have consistently added usage information to all their utilities.

Nonetheless that documentation does not suffice for a newbie to get his head round the build toolchain and gain enough insight to act alone without calling out for help.

Today I am confronted with a failing build on a platform, which is certainly not exotic (MacOS), and I find myself spending hours trying to understand where the errors occur.

Unless told otherwise, I have no supporting information to help me get through my build failures. And calling out for help won't be of great help because (I presume) only a few have invested significant time in understanding the guts of the toolchain. Questions will take time to be answered, if they are ever answered.

Build tool

When all goes well, the AST build toolchain seems to flat out beat the other tools mentioned above. It has (apparently) no dependencies, allows for all the GNU Autoconf-style probing without the M4 hell, and nicely lays out its build products.

Opinion: GNU Autotools are a fantastic suite. But they have a major inconvenience: M4. Opaque and to a certain extent clumsy. Probably a good compromise for portability 30 to 40 years ago. But no longer the right tool for today; pre-processing could be done the AST way :-)

Could the AST build toolchain be system agnostic and a possible replacement for GNU Autotools or CMake on other projects? A toolchain written in portable POSIX shell targeting any raw (POSIX) UNIX or Linux.

Whilst this was probably a driver in its conception, we see, going through the source files, that it depends on bash here, lynx or wget there, etc. So it is not agnostic and doesn't build on a raw system; it requires GNUish capabilities. Hence it targets UNIX/GNU or Linux/GNU platforms.

Note: for the reasoning let us ignore for now that we probably need gcc to avoid proprietary compilers (where such compilers still exist).

Logically one can ask, why then maintain a distinct build toolchain? Why not use GNU Autotools or CMake?

Preliminary thoughts

The breakup of the AST development team has (luckily) brought the AST developments to the open source community. But the community is small (and probably fragile).

If the AST packages and the Korn shell are here to stay, the community needs to be enlarged.

Enlarging the community means making the build process accessible to many.

Migrating to GNU Autotools or CMake is an enormous effort which would require such time investment that it is almost guaranteed to stall.

Documentation and HOWTOs seem to be the only realistic approach. This also requires time and reverse engineering.

Request for comments

In the 90s, shell portability was a big concern, and scripting had to focus on POSIX shells only (Korn shell wasn't a POSIX shell at the time, it now is).

Today, thanks to AT&T opening up the source code, a Korn shell exists on (almost) every platform. Not PDKSH or old versions, but a ksh93 executable (whatever its release).

Consequently, from 2017 onwards, we can assume that we have a Korn shell executable that supports the 93 syntax and features.

Converting the AST build toolchain scripts from universal shell syntax to Korn shell 93 syntax can:
a) greatly reduce the LOC (e.g. iffe could be reduced by 50%)
b) allow for clean environments with the function keyword, limiting globals
c) break down the code into smaller and more maintainable chunks using FPATH
d) allow usage information to be added to all functions
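
A minimal sketch of point (b), assuming a shell that accepts the function keyword (ksh93, or bash for this particular snippet). The name probe_header and its output are hypothetical and are not taken from iffe. A variable declared with typeset inside the function stays local, so the global of the same name is left untouched:

```shell
# Assumption: runs under ksh93 or bash; the `function` keyword is not POSIX sh.
result=global

# Hypothetical helper, not part of the AST toolchain.
function probe_header {
    typeset result              # local to this function
    result="checked $1"
    printf '%s\n' "$result"
}
```

Calling probe_header stdio.h prints "checked stdio.h" while the global result keeps its value. In ksh93 this local scoping applies to functions defined with the function keyword. For point (c), ksh93 can autoload a function from a file of the same name found in a directory listed in FPATH, so each such function could live in its own file.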

This doesn't require a full reverse engineering effort, nor does it require a full rewrite of the code.

At the same time this allows for a learning curve whose results can be captured in HOWTOs and central documentation.

By doing this we can (re)gain knowledge of the AST build toolchain, document it properly for the community to get involved, and lead the way for a ksh2023 rather than a ksh93+z2023 :-)

@dannyweldon

You could try reaching out to the mailing list, there may be more people subscribed than those that are watching this git repo.

To subscribe, try sending a plain text email to mailman-request@lists.research.att.com with the word "help" in the body and follow the instructions you receive. (Note, the mailman web server is no longer working, but it will respond to email commands.)

What part of the build is failing and what errors are you getting?

I did not think that anything ultimately depended on bash; however, there may be bash-specific workarounds in the generic Bourne-shell compatible scripts, e.g. in bin/package.

As for the use of curl and wget, they aren't 100% necessary either because hurl.sh (src/cmd/INIT/hurl.sh) can fall back to using /dev/tcp/$HOST/$PORT style connections if it's running under bash or ksh and it can't find curl or wget.
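
A rough sketch of that kind of fallback, not the actual logic of hurl.sh. The function names pick_fetcher and fetch_devtcp are hypothetical, and the /dev/tcp redirection only works in shells that implement it (bash or ksh93):

```shell
# Hypothetical helpers illustrating the fallback order; not taken from hurl.sh.

# Print which transport would be used: curl, wget, or the shell's /dev/tcp.
pick_fetcher() {
    if command -v curl >/dev/null 2>&1; then
        echo curl
    elif command -v wget >/dev/null 2>&1; then
        echo wget
    else
        echo devtcp
    fi
}

# fetch_devtcp HOST PORT PATH
# Sends a plain HTTP/1.0 GET over /dev/tcp and prints the raw response,
# headers included. Assumes bash or ksh93; variables here are global.
fetch_devtcp() {
    host=$1
    port=$2
    path=$3
    {
        printf 'GET %s HTTP/1.0\r\nHost: %s\r\n\r\n' "$path" "$host" >&3
        cat <&3
    } 3<>"/dev/tcp/$host/$port"
}
```

On a machine with neither curl nor wget in PATH, pick_fetcher prints devtcp, and something like fetch_devtcp example.com 80 / would then perform the request.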

I recently looked into the build failures for:

cmd/kshlib/dss
cmd/kshlib/cmdtst

only to come to the conclusion that the dss builtin has been broken for a while and needs rewriting because the API version of nvapi has changed. And I'm not yet sure why cmd/kshlib/cmdtst tests are failing, but I doubt that it is needed any more because the grep and xargs builtins are integrated and working fine now in src/lib/libcmd.

I have been looking at getting the ast repo building automatically with travis ci, but currently bin/package test has over 680 errors on a working build on my linux machine! I am thinking that some of those tests may need to be removed or silenced until they pass reliably.

We don't have access anymore to the ast build farm to build on multiple platforms, but at least travis ci can test on linux and macos:

https://docs.travis-ci.com/user/multi-os/

So I think that those platforms should be first-class platforms that have to be able to build and test without errors. We may also be able to get an x86-based Solaris system in a virtual machine somewhere building automatically as well as freebsd, darwin and even cygwin.

We should also start to document how to debug the build system in the github wiki.

@dannyweldon

Siteshwar from Redhat now has commit access to the repo and has added a PR for freebsd #19 that might help, as OSX is based on BSD I believe.

Also, he has added a travis file to the beta branch, but it is currently only targeting fedora and is not yet running any tests.

@saper
Contributor

saper commented Jul 1, 2017

I personally quite like the build system here, although I have yet to fully wrap my head around it (I am currently debugging a problem related to iffe's feature detection on FreeBSD 11).

Manpages are no longer online. I think getting documentation online and some introductory material would help.

I have even made my own little "release" including #19 - I was surprised how quick and easy it was.

@dannyweldon

Man pages are readable here:

https://web.archive.org/web/20151104235435/http://www2.research.att.com/~astopen/download/
This link is the best place to start as it seems to set up the frames properly.

Then visit: AST, nmake, overview
Also: Manual, Commands, iffe + package + nmake + probe (But there are links to these in the above)

@krader1961
Contributor

FWIW, I used ksh88 then ksh93 for more than two decades when I worked for Sequent Computer Systems. Then switched to zsh when I changed jobs. I then abandoned zsh when I realized the zsh architecture was broken beyond repair.

After two years of contributing to the Fish shell project I've abandoned it for several reasons. Primarily because it seems like the big problems with its implementation (e.g., how I/O through pipelines is handled) will never be fixed. Also, because the current developers are expending effort on pointless changes such as changing FISH_VERSION to fish_version without any justification other than that one, very inexperienced, contributor thinks that any variable which isn't exported cannot be all uppercase.

So I was intrigued to see that the Korn shell had been open sourced and hosted on Github. But I can't figure out how to get nmake built and usable on macOS (i.e., OS X). The homebrew command doesn't seem to know how to install nmake. And attempts to sh ./bin/package make fail because nmake is not available. I would be interested in using and contributing to ksh93 development. But there needs to be either

a) better documentation for how to build ksh93 on macOS (google searches didn't yield any answers), or

b) switching to a more modern build system like Cmake or autotools/autoconf.

Admittedly the latter isn't very modern but it is more widely supported and understood than the AT&T nmake mechanism.

@siteshwar
Contributor

I don't have access to an OS X machine, but I have put the build script that I use to compile ksh on fedora in notes here. Latest changes are in the beta branch, so I recommend building from there. Also, did you use clang or gcc to compile on OS X? I suggest you try compiling with gcc, as the build system behaves very strangely sometimes. I agree with your thoughts about using a more modern build system.

ksh is extremely fragile and is very prone to regressions. It suffers due to implementation too. As an example see how build broke last time when there were changes in glibc. So I try to keep changes to it minimal.

@saper
Contributor

saper commented Oct 12, 2017

./bin/package make is generally the way to go. One of the first steps is to build nmake; most probably it fails somewhere along the way.

The build usually proceeds even if the previous steps failed, so it is worth examining the build log from the top rather than from the bottom and finding the issue there. nmake not being present is just a consequence of the previous build failure.
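
One way to read the log from the top, as a small sketch. The function name first_failure is hypothetical, and the pattern simply assumes failure lines of the form "*** ... making ..." as they appear in the logs quoted in this thread:

```shell
# first_failure LOGFILE
# Print the earliest failure line with its line number; return 1 if none is found.
# Hypothetical helper; the pattern is an assumption based on messages such as
# "make: *** termination code 6 making cmd/INIT".
first_failure() {
    awk '/\*\*\* .* making / { print NR ": " $0; found = 1; exit }
         END { exit !found }' "$1"
}
```

For example, first_failure arch/darwin.i386-64/lib/package/gen/make.out would point at the first component that broke, which is usually the one worth investigating.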

@krader1961
Contributor

@siteshwar, Thanks for that script. Just running sh ./bin/package make on macOS produces output that begins like this:

package: make start at Thu Oct 12 20:12:18 PDT 2017 in /Users/krader/projects/3rd-party/ast/arch/darwin.i
CC=cc
SHELL=/usr/local/bin/ksh
HOSTTYPE=darwin.i386
NPROC=12
PACKAGEROOT=/Users/krader/projects/3rd-party/ast
INSTALLROOT=/Users/krader/projects/3rd-party/ast/arch/darwin.i386
PATH=/Users/krader/projects/3rd-party/ast/arch/darwin.i386/bin:/Users/krader/projects/3rd-party/ast/bin:/
cmd/INIT:
ksh[56]: eval: line 6: 41460: Abort
make: *** termination code 6 making cmd/INIT
ksh[68]: wait: 41459: Abort
lib/libast:
ksh[75]: eval: line 6: 41467: Abort
make: *** termination code 6 making lib/libast
ksh[87]: wait: 41466: Abort

Your script actually results in a ./arch/darwin.i386-64/src/cmd/nmake directory being created but not nmake being built. So the build then fails to find nmake:

+ nmake --base --compile '--file=/Users/krader/projects/3rd-party/ast/src/cmd/nmake/Makerules.mk'
/usr/local/bin/ksh: line 4: nmake: not found
mamake [cmd/nmake]: *** exit code 127 making Makerules.mo

I should point out I have a hybrid macOS system since I have installed many GNU tools via Homebrew and have arranged for some of the GNU tools to shadow the macOS/BSD variants. But that has not generally been a problem when working with other open source software.

On Ubuntu 16.10 just running sh ./bin/package make results in a working ksh binary (in ./arch/linux.i386-64/src/cmd/ksh93/ksh). So my problem is clearly unique to macOS (aka OS X).

I'll spend a little more time trying to get ksh to build on macOS but I'm not very motivated to do so since other shells (e.g., Elvish) build and run on macOS without jumping through hoops and are more likely to have a future.

@saper
Contributor

saper commented Oct 13, 2017

Seems like your shell is crashing, what is your /usr/local/bin/ksh and what happens if you move it away (don't use it)?

@dannyweldon

Also, are there any errors in arch/darwin.i386-64/lib/package/gen/make.out ?

@krader1961
Contributor

The first problem I found was an unwanted line wrap from cutting/pasting @siteshwar's script. Once I fixed that I found that symbols like nl_catd were not defined. That's because the ./arch/darwin.i386-64/include/ast/ast_nl_types.h file that is generated by iffe doesn't define it. And that file is included by ./src/lib/libast/std/nl_types.h which shadows the system provided header of the same name where the symbol is defined. I hacked around that problem by moving src/lib/libast/std/nl_types.h out of the way and modifying the #include statements in the affected files to include the system header of that name and the ast_nl_types.h header.

There are several discussion threads about this header problem when building with the AST tools. Such as this one: https://mail-index.netbsd.org/netbsd-bugs/2014/07/14/msg037462.html.

After working around the nl_types problem the build gets a lot farther and does generate a nmake binary. But the build then fails with a lot of lines like these:

ksh[1424]: eval: line 6: 29178: Abort
make: *** termination code 6 making cmd/ncsl
ksh[1436]: wait: 29177: Abort
cmd/pack:
ksh[1443]: eval: line 6: 29184: Abort
make: *** termination code 6 making cmd/pack
ksh[1455]: wait: 29183: Abort
lib/libvdelta:

Forcing /bin/sh to be used by using ./bin/package make -S SHELL=/bin/sh to do the build still results in the same "abort" messages -- they're just formatted differently. Same with SHELL=/bin/bash. So it isn't the shell. It looks like it's /Users/krader/projects/3rd-party/ast/arch/darwin.i386-64/ok/bin/nmake that is dying from receiving SIGABRT. Presumably an assert() is failing. Note that I can successfully invoke it with the -v switch.

Sure enough. I enabled core dumps and what we see is that nmake is invoking strcpy() on overlapping buffers which is undefined behavior. You have to use memmove() in this situation:

* thread #1: tid = 0x0000, 0x000000010ce00d42 libsystem_kernel.dylib`__pthread_kill + 10, stop reason = signal SIGSTOP
  * frame #0: 0x000000010ce00d42 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fffbd00d457 libsystem_pthread.dylib`pthread_kill + 90
    frame #2: 0x00007fffbce854bb libsystem_c.dylib`__abort + 140
    frame #3: 0x00007fffbce8542f libsystem_c.dylib`abort + 144
    frame #4: 0x00007fffbce85592 libsystem_c.dylib`abort_report_np + 181
    frame #5: 0x00007fffbceabf28 libsystem_c.dylib`__chk_fail + 48
    frame #6: 0x00007fffbceabf38 libsystem_c.dylib`__chk_fail_overlap + 16
    frame #7: 0x00007fffbceabf69 libsystem_c.dylib`__chk_overlap + 49
    frame #8: 0x00007fffbceac132 libsystem_c.dylib`__strcpy_chk + 64
    frame #9: 0x000000010cc51eaf nmake`resetvar(p=0x00001e00ac75c590, v="LICENSE=since=2003,author=gsf", append=2048) + 319 at variable.c:580
    frame #10: 0x000000010cc51768 nmake`setvar(s="TA", v="", flags=2048) + 1368 at variable.c:680
    frame #11: 0x000000010cc32798 nmake`assignment + 1032
    frame #12: 0x000000010cc2d58b nmake`parse + 1275
    frame #13: 0x000000010cbefc54 nmake`apply + 660
    frame #14: 0x000000010cc2f7d9 nmake`assertion + 137
    frame #15: 0x000000010cc2d576 nmake`parse + 1254
    frame #16: 0x000000010cc3ab51 nmake`readfp(sp=0x00001e00ac6e9b00, r=0x00001e00ac6f9000, type=18) + 5553 at read.c:407
    frame #17: 0x000000010cc392a4 nmake`readfile(file="/Users/krader/projects/3rd-party/ast/src/cmd/INIT/Makefile", type=18, filter=0x0000000000000000) + 964 at read.c:455
    frame #18: 0x000000010cc0f048 nmake`main(argc=11, argv=0x00007fff53018c90) + 6136 at main.c:662
    frame #19: 0x000000010cdde235 libdyld.dylib`start + 1

@krader1961
Contributor

krader1961 commented Oct 14, 2017

Replacing strcpy(p->value, v); with memmove(p->value, v, n + 1); in src/cmd/nmake/variable.c fixes the first point of failure. Revealing that in another spot it's calling memccpy() with overlapping buffers which is also undefined behavior:

    frame #8: 0x00007fffbceac18a libsystem_c.dylib`__memccpy_chk + 69
    frame #9: 0x0000000105637919 nmake`sfputr(f=0x00001e00a8425b00, s="strings.h", rc=0) + 1097 at sfputr.c:109

@krader1961
Contributor

Okay, I fixed the strcpy() and memccpy() bugs from my previous comments. The build now gets much farther and the number of fatal build errors has dropped from 100 to 41 because nmake is no longer triggering assert()'s 😄 Unfortunately, every single binary (other than those like nmake and iffe used to drive the build) fails to link with these errors:

Undefined symbols for architecture x86_64:
  "__ast_catclose", referenced from:
      _match in libast.a(translate.o)
      __ast_translate in libast.a(translate.o)
  "__ast_catgets", referenced from:
      _match in libast.a(translate.o)
      __ast_translate in libast.a(translate.o)
  "__ast_catopen", referenced from:
      _find in libast.a(translate.o)
ld: symbol(s) not found for architecture x86_64

So we're back to the problem that the iffe tests for the nl_* family of symbols on BSD are broken.

@saper
Contributor

saper commented Oct 14, 2017

@krader1961 not sure I have a fix for this problem but in case you keep on troubleshooting the build I have a small collection of patches to make it build on FreeBSD.

@krader1961
Contributor

@saper, What I don't understand is why your PR is needed or why I am seeing compatibility problems building on macOS. I just booted my FreeBSD 12.0 virtual machine. I did sudo pkg install ksh93 which succeeded. But running ksh93 resulted in this error:

/usr/local/bin/ksh93: Undefined symbol "readdir"

Is it the case that ksh93 on BSD systems has been broken for a long time?

@krader1961
Contributor

My PR #76 to fix the two places that don't handle overlapping buffers on BSD correctly plus @saper's PR #19 lets me build everything but dlls and pax on macOS Sierra (10.12.6). But it's unbelievably slow to do so. The fastest time I've seen is 23 minutes (on a 12 core Mac Pro with 24 GiB of memory). In no small part because of all the seemingly redundant iffe invocations. For example, iffe: test: is -liconv a library ... yes occurs 93 times in the build log. The Travis build for my PR also took 23 minutes. Surely we can find a way to make building just ksh a wee bit faster 😄 I suspect that 99% of users looking at this project are not interested in any of the other commands bundled with it other than ksh.
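
A quick way to quantify those repeated probes, as a sketch. The function name count_probes is hypothetical, and it assumes the build log contains lines starting with "iffe: test:" as quoted above:

```shell
# count_probes LOGFILE
# Print "COUNT<TAB>PROBE LINE" for every distinct iffe probe, most frequent first.
# Hypothetical helper; assumes probe lines begin with "iffe: test: ".
count_probes() {
    awk '/^iffe: test: / { n[$0]++ }
         END { for (k in n) printf "%d\t%s\n", n[k], k }' "$1" | sort -rn
}
```

Piping the result through head shows which probes are repeated most often and are therefore the best candidates for caching.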

Also, when I do ./bin/package make -S it keeps modifying the files in ./bin/ by prepending lines that make the scripts no longer executable and makes git think the changes to them need to be committed. So I did a "git checkout bin/package" then did sh bin/package clean and it erased the entire project tree including the ast directory! Gee, thanks, I guess, since it doesn't look like it erased anything else I care about. While the AST build system met the needs of this project twenty years ago today it seems like more trouble than it's worth.

@krader1961
Contributor

I noticed ./bin/package make -S is testing for cosine math functions:

iffe: test: is cos a library function ... yes
iffe: test: is cosl a library function ... yes
iffe: test: is cosf a library function ... yes
iffe: test: is cosh a library function ... yes
iffe: test: is coshl a library function ... yes
iffe: test: is coshf a library function ... yes

None of the commands or library code in the project use those functions. Obviously I picked those at random as they caught my eye while looking at the build log. There are hundreds of such feature tests that are not relevant for any of the code being compiled.

If the ksh command in this repo is only going to get bug fixes and ports to new environments then changing the build tool chain doesn't make sense. But if this is going to be an active, evolving, project then it definitely needs to be refactored and updated to use Cmake.

@krader1961
Contributor

Okay, I see my previous comment about math functions like acosh() not being used despite being probed for is incorrect. Those functions are table driven so my original search of the code for invocations failed to find them. /me wipes egg off my face.

siteshwar pushed a commit that referenced this issue Oct 15, 2017
Trying to build this project on macOS (a BSD variant) resulted in two
`assert()` failures when running `nmake`. The problem is that there are
at least two places which pass overlapping buffers to functions which
are explicitly defined to have undefined behavior when the buffers overlap.

See issue #42.
@saper
Contributor

saper commented Oct 18, 2017

Symbol visibility is an important issue, which is why this simple patch may fix a lot on non-Linux platforms. Regarding 12.0, I would check whether there was an API/ABI change; that may happen as 12.0 is the unreleased -CURRENT. Maybe some other change broke it.

@siteshwar
Contributor

fwiw i would also like to evaluate meson as a possible option for new build system.

@krader1961
Contributor

krader1961 commented Oct 24, 2017

FWIW, I've been trying to figure out why I can build ksh93 with gcc but not clang on macOS. One reason is the use of -I-. This causes gcc to emit a warning:

make.out.gcc:cc1: note: obsolete option -I- used, please use -iquote instead

Clang treats it as an error:

clang: error: '-I-' not supported, please use -iquote instead

There are other compiler options, such as -mr, that neither gcc nor clang support. And while the -mr flag isn't actually used after the initial probe for flags supported by the build tools, the -I- flag is used (some of the time), which causes part of the build with clang to fail.

This causes the AST build system to use the ppcc wrapper script rather than invoking the cc command directly. That causes problems because that script interprets flags like -fno-strict-aliasing as equivalent to the sequence of short flags -f -n -o .... And ppcc treats -n as meaning not to actually compile the source into an object file.
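
The misparse is easy to reproduce with any parser that clusters short options. The sketch below uses the standard getopts builtin purely as an illustration; it is not the code of ppcc, and the function name parse_short_flags and the option string "fno:" are assumptions chosen to mirror the description above:

```shell
# parse_short_flags ARGS...
# Hypothetical illustration: treats every argument as a cluster of short flags,
# with -n meaning "do not compile" and -o taking an argument.
parse_short_flags() {
    OPTIND=1
    dry_run=0
    while getopts "fno:" opt "$@"; do
        case $opt in
            f) echo "flag -f" ;;
            n) echo "flag -n"; dry_run=1 ;;
            o) echo "flag -o arg=$OPTARG" ;;
            *) return 2 ;;
        esac
    done
    echo "dry_run=$dry_run"
}
```

Running parse_short_flags -fno-strict-aliasing reports -f, then -n, then -o with the argument "-strict-aliasing", and ends with dry_run=1: a single long GCC-style flag has silently turned into "do not compile".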

I've worked around those issues and have managed to build ksh93 with clang on macOS and Ubuntu. What's interesting is that some unit tests that fail when built with gcc pass with clang and vice versa on macOS. On Ubuntu the test results are identical for the two compilers. Furthermore, the Ubuntu failures don't match the failures seen on macOS with either compiler. The fewest errors (154) occurred on Ubuntu. Clearly there are lots of problems with the existing unit tests.

P.S., I've also noticed that the ksh binary this build process produces has commands like cat and chmod implemented as builtins. But that isn't the case for the ksh v93u provided by Ubuntu or macOS (or Homebrew on macOS). It's not obvious to me that having those particular commands as builtins is a good thing.

@krader1961
Contributor

@siteshwar wrote: "fwiw i would also like to evaluate meson as a possible option for new build system."

The fish-shell project discussed using Cmake and Meson here and ultimately chose Cmake. However, a couple of the reasons they rejected Meson don't apply to this project. And I do love that Meson is built on Python. However, as this blog article notes Meson introduces yet another way to build projects without solving any significant problems with Cmake and is much less mature.

If we do go to the trouble of replacing the current build system we should not switch to autotools. While venerable, widely available, and lots of people are familiar with it (unlike nmake) it has almost as many quirks as our current build system and would not be much of an improvement.

@qbarnes

qbarnes commented Oct 24, 2017

Unless their functionality has changed greatly in the last few years, please avoid using GNU autotools. Whenever you have to step off the beaten path, you plummet to the bottom of a ravine. They have way too many implicit and hidden dependencies. And they are a mess whenever trying to migrate software for new, evolving environments (OSes) or when cross-building software.

@siteshwar
Contributor

siteshwar commented Oct 24, 2017

@jhfrontz mentioned the history of -I- flag here.

Regarding the choice of build systems, more than one person agrees that we should not be using autotools.

@krader1961
Contributor

For the record I have found the following to be the minimum set of files and directories needed to build ksh93 and run its unit tests. From src/lib:

Makefile   libardir   libcmd     libcoshell libdll     libexpr    libodelta  libsum
Mamfile    libast     libcodex   libcs      libdss     libmam     libpp

From src/cmd:

INIT     Mamfile  cpp      kshlib   msgcc    probe    tests
Makefile builtin  ksh93    mam      nmake    re

Obviously switching to Meson or Cmake would eliminate several of those. Using this bare minimum reduces the ksh93 build time roughly 25%. It's still obscenely slow because of all the redundant invocations of iffe and the fact that some programs are built twice.

@krader1961
Contributor

Haha! The person who wrote Meson wrote a blog article about the transition to Cmake at Canonical (the company that produces Ubuntu): https://blog.kitware.com/use-of-cmake-at-canonical/. Which makes me inclined to vote for switching to Meson for this project. 😄

@krader1961
Contributor

I just spent a couple of hours reading various Reddit threads, Stackoverflow questions, and blog posts about the merits of Cmake versus Meson. For example, this article from July of this year is a strong thumbs up for Meson over Cmake. However, the article does end on this note:

For now I found only one thing that would have to let me go back to CMake once in a while: meson requires Python 3.4 and newer. This is not the case on a few machines I still have to work on, but time will let these phase out too.

Given that ksh93 is still trying to support ancient K&R compilers I'm wondering how much of a deal breaker the dependency on Python 3 is. Obviously we no longer need to support K&R (pre ANSI C) compilers. But what about old OS's like Solaris which may not have Python 3 available?

@krader1961
Contributor

@jelmd, I have no idea what I am supposed to do with that script you posted in your comment. You don't seem to understand that running bin/package make builds the nmake command. Is that not true on your system? It is certainly true for me on macOS, Ubuntu 16.10 (Linux), and FreeBSD 11.

@marcastel
Author

marcastel commented Nov 8, 2017

I have been watching this thread and others for a while. Let me say THANK YOU for all the digging and testing. Though I still can't get a proper compile on macOS, I see progress thanks to your think tank actions.

If I may, I would like to share my 2c on a recurring topic in this thread: replacing the build system.

Getting the build process to work on modern (or current) platforms, is obviously a primary objective.

However, already planning a build system revolution (i.e. change), besides not being respectful of all the effort put in over decades by the folks at ATT, would not solve the build objective: understanding the dependencies of our code. All a build system does is enact a set of rules provided by the source package. Those rules poke the underlying system to understand its capabilities and available features. What we need, in the community takeover of previously proprietary code, is to identify and document those rules.

A blunt analogy: imagine you migrated to GNU autotools, but had no idea of what you had to write in configure.ac or Makefile.am. I did say blunt, no offense :-)

Additionally, why would I replace an inventive build system that relies exclusively on the shell with one that pulls in other dependencies (m4, Python, ...)? Knowing that those systems also rely on the underlying shell. Simplicity? Clarity? Extensibility? Programming language? ...

The only real argument I would not be able to defend is opaqueness. Current utilities like package or iffe, among others, are monstrous single-flow scripts with global variables galore. In whatever programming language(a), that coding logic would be opaque too.

Further, the portable shell constraint has also considerably contributed to the opaqueness of the code(d)... which compares here to the opaqueness of GNU autotools m4 macros for casual users.

Reverse engineering seems to be (at least to me) the only way to go. I can understand that many may prefer starting from a new blank page, using either tools they know or the latest hype toolchain(c). So, as a POC of my meandering thoughts, I started a lab experiment with one of those opaque utilities which is often mentioned: iffe.

My objectives are:

a) Reverse engineer the iffe utility
b) Get rid of universal portable shell code constructs and migrate to ksh93u syntax(b).

I am only a couple of hours down the line. So no tangible results yet. However, first impressions are that, once restructured and simplified, iffe won't be such a complex beast.

That done, subsequent objectives would be:

c) Build a modular and extensible system where user-contributed functions can simply be added to FPATH and made available to iffe (à la git(1))
d) Provide a caching mechanism so that previously answered questions (e.g. iffe: test: is stdio.h a header ...) are made available in a global registry (à la GNU autoconf).

More on this as I progress on iffe modernisation.
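
A minimal sketch of what objective (d) could look like, written in portable shell. Everything here is hypothetical: the function name cached_probe, the PROBE_CACHE directory and the one-file-per-question layout are assumptions for illustration, not how iffe stores its results:

```shell
# Hypothetical cache directory; not a location used by the real iffe.
PROBE_CACHE=${PROBE_CACHE:-${TMPDIR:-/tmp}/probe-cache}

# cached_probe KEY COMMAND [ARGS...]
# Prints "yes" or "no". COMMAND is run only when KEY has no cached answer yet.
cached_probe() {
    key=$1
    shift
    mkdir -p "$PROBE_CACHE" || return 1
    # Flatten the question into a safe file name (distinct keys could collide).
    file=$PROBE_CACHE/$(printf '%s' "$key" | tr -c 'A-Za-z0-9._' '_')
    if [ ! -f "$file" ]; then
        if "$@" >/dev/null 2>&1; then
            echo yes > "$file"
        else
            echo no > "$file"
        fi
    fi
    cat "$file"
}
```

A call such as cached_probe "is stdio.h a header" test -f /usr/include/stdio.h (the test command is only a stand-in for a real compile check) answers from the cache on every invocation after the first. A real registry would also need invalidation when the compiler or its flags change.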
Looking forward to your feedback.


(a) And yes, Korn shell is a capable programming language. David Korn's motivation was that a shell is "about string processing", and had ATT followed Richard Stallman's movement at the time, the Korn shell would probably have stalled Larry Wall's efforts and we would have enjoyed a CKAN rather than a CPAN :-)

(b) Making the assumption that most current platforms have either a default ksh93u executable (e.g. macOS) or can compile out-of-the-box the current AST packages. Bootstrapping the build on ancient or yet undiscovered systems will be addressed subsequently.

(c) A little sarcasm here triggered by the self-promoted fullstack community re-inventing the wheel with tools like Grunt, Gulp and the like, taking the headlines all over the Web, despite those tools being more complex, less flexible, and less maintainable than grandpa's make(1) files.

(d) I would be tempted to believe that these opaque tools are just the output of other tools, not made publicly available by ATT. Glenn Fowler or Phong Vo could possibly answer this.

@jelmd

jelmd commented Nov 8, 2017

@krader1961: 1) Reading the full comment and context of the link would make sense. IMHO nobody needs to wonder why you do not understand things when you continue to just pick up some keywords or links without taking their context into account. 2) If you still do not understand its purpose, as an experienced ksh programmer you should be able to find out what the script does, so reading the script would be another option. 3) Simply executing it and exploring what it does would be an option, too.

Wrt. nmake: because you obviously don't read the documentation of the software you want to change, you still have not understood that nmake is actually NOT required to build ksh93, and it is also not needed to run the related tests. Big hint: please read at least the READMEs before postulating wrong assumptions.

@jelmd

jelmd commented Nov 8, 2017

@marcastel: I 100% agree with all your points including footnotes. IMHO it is a good idea to do what all other established languages do, i.e. use the language itself to build it, because "real" bootstrapping from scratch is not needed anymore: all related systems seem to have a ksh93 package now.

@krader1961
Contributor

@marcastel, You said

... not being respectuous of all the effort put in over decades by the folks at ATT...

That is not an accurate reflection of what I have said. I have repeatedly stated that AST Nmake is better than GNU autotools and other options available 10+ years ago. If my only choice is GNU autotools or AST Nmake I would choose the latter.

Keep in mind that my arguments are focused solely on building ksh. If you are interested in building every portion of the AST project, individually or collectively, you could reasonably decide that the current Nmake based build system is optimal. But that is not the optimal choice today for building only ksh.

There is absolutely no chance that this project will continue to depend on nmake and iffe if I have any say in the matter.

The current build rules are broken. I have experienced many instances where bin/package make ast-ksh or bin/package test ast-ksh has failed in an unexpected manner solely because some of the source has been changed, whereas doing a rm -rf arch followed by a subsequent build and test succeeds. Yes, someone could fix the Makefile/Mamake file contents to fix those problems. But why should they, given that the current build system is extremely inefficient?

I'm willing to bet that I can convert building ksh to the CMake or Meson build systems before you can implement your improvements to Nmake. Which, according to @jelmd, isn't even needed to build ksh.

@marcastel
Author

@krader1961, I was not pinpointing anybody. And I am indeed happy to hear that you feel sufficiently confident to transform, in a breeze from one system to another. Hoping that you will share that significant insight :-)

While I do not support your urge to change the build system, I can understand it.

In the meantime, I am pursuing my RE effort. I needed a linter, so here is an iffe parser which can parse iffe(5) syntax. It can output an abstract syntax language representation (see sample below or full output, which represents parsing of src/lib/libast/features/common; this is currently a Korn compound variable, but JSON or YAML outputs would be trivial).

This abstract notation allows extreme flexibility with existing code -- BTW irrespective of the build system :-)

iffe-parser-ast

It is still rough around the edges and I need to set up regression testing. But I am fairly confident this can get polished rapidly with more logging and debugging features.

A by-product of this parser is that it could be used to semi-automatically document the feature probing done for AST components.

@marcastel
Author

PS: I also tried my hand at an iffe(5) BNF syntax

@krader1961
Contributor

@marcastel, Can you provide links to any open source projects that are using the AST Nmake build system? I could not find any despite several hours of research. There may be private projects at companies or universities which are using it, but those are not interesting since we can't examine them. Furthermore, they are unlikely to incorporate any improvements from this project to the Nmake build system. Also, neither I nor @siteshwar is

... confident to transform, in a breeze from one system to another...

That is why we are working to remove source code not needed to build ksh. The result of that work will make it easier to understand what needs to be implemented in a new build system to support building ksh.

@krader1961
Contributor

PR #123 has been merged. That makes it possible to build libast using Meson. It is the first step in switching from Nmake to Meson for building and testing the ksh command.

@marcastel
Author

marcastel commented Nov 16, 2017

Congratulations @krader1961 et al. I have been seeing all the traffic, impressive. I'm still walking my way down the other line nonetheless, and have made good progress too.

@krader1961
Contributor

PR #124 has been merged. That makes it possible to build ksh using Meson. The next step is to teach Meson how to run the ksh unit tests.

krader1961 added this to the next-minor milestone Nov 20, 2017
@siteshwar
Contributor

On my system ksh now gets built in around 1 minute which is a significant improvement over the legacy build system that took around 3 minutes. Most of this time is still taken by iffe tests. Build timings should improve further when we start cleaning up these tests.

siteshwar pushed a commit that referenced this issue Nov 30, 2017
Trying to build this project on macOS (a BSD variant) resulted in two
`assert()` failures when running `nmake`. The problem is that there are
at least two places which pass overlapping buffers to functions which
are explicitly defined to have undefined behavior when the buffers overlap.

See issue #42.
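
The commit message above does not name the functions involved. As a general illustration only, assuming the common case of `memcpy()` or `strcpy()` (both undefined for overlapping buffers in standard C), the usual remedy is `memmove()`, which is specified to handle overlap. The helper name `drop_prefix` is hypothetical and is not taken from the AST sources.

```c
#include <assert.h>
#include <string.h>

/* Remove the first `n` bytes of the NUL-terminated string `s` in place.
 * The source (s + n) and destination (s) overlap, so memcpy() or
 * strcpy() would be undefined behavior here; memmove() is specified
 * to copy correctly even when the regions overlap. */
void drop_prefix(char *s, size_t n)
{
    size_t len = strlen(s);

    assert(n <= len);               /* cannot drop more than we have */
    memmove(s, s + n, len - n + 1); /* +1 also moves the terminating NUL */
}
```

Some C libraries, macOS's among them, check for overlap in the fortified copy routines, which is one way an otherwise silent bug of this kind can surface as a hard failure at run time.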
@krader1961
Contributor

I'm closing this since we are now using Meson to build/test ksh93 including on Travis CI environments. There is more work to be done to remove dependencies on the legacy iffe command used by Nmake but that is orthogonal to this issue.

@dannyweldon

For the record, when I got nmake to build in parallel and got my build down to 2m30, that was without Kurt's optimisation to skip testing for all the math library functions because my system must be missing some dependency. With that fixed, it would have been a lot faster, probably comparable to meson. iffe tests could probably have been split into separate files to enable more parallelism, though I never tested that.

@siteshwar
Contributor

We have managed to remove all the iffe tests from the build process. On my system ksh builds in around 20 seconds now, which is almost 10 times faster than the legacy build system.

@marcastel Thanks for opening this issue and your efforts to document iffe. This would certainly help if we have to make any fixes in the legacy code. It would be awesome to see more contributions from you in future.

@cesss

cesss commented Jul 4, 2018

I know I'm late to this discussion, as a decision was already taken. Anyway, if someday in the future anybody suggests changing the build system, let me say that I find the requirement to build Python before the shell inconvenient (I'm talking about installing an OS from source).

I'm currently working on a project that involves installing an OS from scratch, and we discarded ksh93 just because of the Python dependency (not that we hate Python, but we want the shell up and running before Python). Either standard make or CMake would fit better in the process of building an OS from scratch.

@saper
Contributor

saper commented Jul 4, 2018

That crossed my mind, too, so I continue to keep the old way and just cherry pick improvements if possible.

@siteshwar
Contributor

@cesss Thanks for taking the time to write. I can suggest a couple of solutions if you want to use ksh:

  1. You can build ksh from the legacy branch. It has no dependency on Python. The build script can be found in this wiki page.

  2. Use a prebuilt Python binary to build the current development version of ksh, or just use a prebuilt ksh binary. I am not sure if you have a hard requirement to build the shell from source, but even when building an OS from scratch you have to depend on some prebuilt binaries.

@cesss

cesss commented Jul 10, 2018

Thanks a lot. Yes, of course you usually need to employ binaries built in a previous stage when you are bootstrapping a new system, but we are following a simple/purist approach. In fact, if we reconsider it and adopt ksh, it will be by adding to it a build procedure that only uses tools that are available during the early stages of building a new OS from scratch. In that case (i.e. if we reconsider and adopt ksh), we can do a pull request with the alternative build procedure if you accept pull requests.

citrus-it pushed a commit to citrus-it/ast that referenced this issue Apr 15, 2021
Somewhat notable changes in this commit:
- The 'set +r' bugfix (re: 74b4162) is now documented in the
  changelog.
- Missing options have been added to the synopsis section of the
  ksh man page.
- The minor formatting fix from ksh-community/ksh#5
  has been applied to the ksh man page.
- A few fixes from att@5e747cfb
  have been applied to the ksh man page.
- The man page fixes from att#353
  have been applied, being:
  - An addition to document the behavior of 'set -H'.
  - A fix for the cd section appending rksh93.
  - A fix for some options being indented too far.
  - Removal of a duplicate section documenting '-D'.
  - Reordering the options for 'set' in alphabetical order.
  - A minor fix for the documentation of 'ksh -i'.