build: add GOOS=illumos #20603

Open
sean- opened this Issue Jun 7, 2017 · 28 comments

@sean-

sean- commented Jun 7, 2017

Disambiguating solaris vs illumos

SunOS/Solaris has a storied and complicated history. In [2010] Illumos forked from OpenSolaris and has continued its life in the open, and it has now spent the majority of its life as Illumos (not Solaris).

The solaris build tag is "mostly" compatible with Illumos and Illumos-based distributions (e.g. SmartOS, Nexenta, OpenIndiana, Delphix, etc.); however, Illumos has diverged significantly from Solaris. In order to detect and support Illumos-native functionality, I propose:

  1. illumos be added as a new GOOS build tag
  2. The illumos build tag be distinct from solaris.

We considered extending the life of the solaris build target to include the illumos target for the period of one release but decided against this because it would taint community code with:

// +build !solaris
// +build illumos

that would need to be cleaned up at the end of the transition period. Backwards compatibility for the sake of backwards compatibility isn't something we're interested in maintaining.


Semi-related: it would be nice if there was a way of specifying and targeting distributions at build time. cgo on Linux and alpine vs glibc comes to mind as another area that would benefit from a distribution-specific build target.

@bradfitz

Member

bradfitz commented Jun 7, 2017

/cc @4ad @binarycrusader @jtsylve for opinions too.

@ianlancetaylor

Contributor

ianlancetaylor commented Jun 8, 2017

Can you outline the differences relevant to the Go standard library between solaris and illumos?

@binarycrusader

Contributor

binarycrusader commented Jun 8, 2017

Illumos didn't fork in 2007, which would be especially hard since OpenSolaris didn't have its first release until 2008. I suspect the submitter meant 2010.

However, many of the other points made are true -- for reasons that are beyond the control of engineering, the OpenSolaris program was discontinued in 2010. There are roughly seven years of divergence at this point. Solaris has changed significantly in that time for any new interfaces that were added or updated after the initial fork, etc.

Both Solaris and OpenSolaris-based derivatives (such as Illumos) added or updated existing interfaces after the fork; these interfaces, even though they have the same names, may not be compatible, as they sometimes use different constant values in system headers, accept a different number of parameters, or one version accepts values the other does not.

With that said, any interface that existed before the fork will generally be compatible because Solaris has strong guarantees for backwards compatibility at the binary level (although this is not strictly guaranteed at the source level). The only catch would be new parameter values that functions may not have supported in an older release.

For the vast majority of the Go standard library, there are no appreciable differences -- they can continue to be treated as equivalent platforms. This is part of why I have not yet approached the Go development team about trying to resolve this. However, there are some important differences, most of which affect packages such as golang.org/x/sys/unix or anything that uses native system interfaces.

I've mainly been waiting for the right time to approach the golang-dev mailing list about how we should account for the differences since they are starting to matter. For example, if you run the mkall.sh script in syscall, you'll find a number of differences: https://gist.github.com/binarycrusader/1c57088d65c8f4071853af5efa37271e

Looking over those differences, you'll see what I said earlier matches up -- some of the error constants are different, Solaris has many that Illumos does not, and Illumos has a few options Solaris intentionally doesn't support (such as MAP_32BIT for mmap()), etc. The next release of Solaris also supports the xpg7 standard, which Illumos doesn't support yet as far as I know, so there are some new interfaces there as well.

More important differences in constants are things such as TCP_KEEPCNT, etc. that both Illumos and Solaris support but have different values for.

The small remaining differences are things such as different text for particular errnos, or the Bits in FdSet being uint64 instead of int64 on Solaris (bug fix) and don't typically matter in practice.
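To make the point about same-named constants concrete, here is a minimal sketch of how a value such as TCP_KEEPCNT ends up needing a per-target definition. The numbers below are placeholders chosen for illustration, not the real header values, and the `tcpKeepCnt` helper and target names are hypothetical.

```go
package main

import "fmt"

// keepCnt maps a target name to the value its system headers would assign
// to TCP_KEEPCNT. The numbers are PLACEHOLDERS for illustration only; they
// are not the real values from either platform's headers.
var keepCnt = map[string]int{
	"illumos": 1, // placeholder
	"solaris": 2, // placeholder
}

// tcpKeepCnt returns the constant for the given target, or an error if no
// value has been recorded for it.
func tcpKeepCnt(target string) (int, error) {
	v, ok := keepCnt[target]
	if !ok {
		return 0, fmt.Errorf("TCP_KEEPCNT: no value recorded for target %q", target)
	}
	return v, nil
}

func main() {
	for _, target := range []string{"illumos", "solaris", "linux"} {
		v, err := tcpKeepCnt(target)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s: TCP_KEEPCNT = %d\n", target, v)
	}
}
```

In real Go code the lookup would not happen at run time; each value would live in its own generated source file selected at build time, which is why a single tag covering both systems cannot carry one correct value.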

Now as for the pull request that was referenced, yes, there are big differences there. Solaris does actually have all of those statistics available, but using a completely different set of interfaces. Although both Solaris and OpenSolaris-based derivatives have the kstat general interface, the set of stats available is different. Additionally, the next release of Solaris has a completely new statistics subsystem, so being able to differentiate between Solaris and OpenSolaris-based derivatives is likely important.

There are other considerations as well when adding new interfaces to packages such as golang.org/x/sys/unix; by a quick and hacky estimation, Solaris libc has easily hundreds of additional private/public interfaces not present in OpenSolaris-based derivatives, and OpenSolaris-based derivatives have a few that Solaris does not. Generally speaking, those differences won't affect the Go standard library; the primary points of contention are around networking-related interfaces and some memory-related interfaces (such as mmap). There are other differences as well when it comes to the system linker (/usr/bin/ld) and what it supports, etc., but I don't think those are particularly relevant (yet).

In short, I've only recently become concerned that this is a problem worth solving somehow, and the amount of divergence that affects Go itself is fairly small.

For the record, Go is important to Solaris, and we've tried hard to remain compatible with OpenSolaris-based derivatives. I've spent roughly two years now working on porting Go to Solaris and on a port of Go to sparcv9 with Aram, which should be ready to integrate in the Go 1.10 timeframe. As such, we're very interested in working out an acceptable solution with all parties involved.

@sean-

sean- commented Jun 8, 2017

@ianlancetaylor There are a number of new syscalls that only exist in Illumos. A quick list of changes includes:

  • brand(2)
  • getrandom(2)
  • epoll(2)
  • preadv(2) and pwritev(2)

[/me deletes most of his reply, thank you @binarycrusader for that excellent reply - my history with Go on Illumos is only a handful of months. I used 2007 as the date based on the head of a series of reflog entries and my memory of the acquisition, but 2010 is the correct date. ]

Internally we were concerned about maintaining portability, but we know the two are going to continue to diverge, and given our use of Go, it's clear that this will be problematic going forward as additional syscalls show up in x/sys/unix. I'm going to hand-wave past any already present divergence that I suspect is being papered over by the fact that we rebuild Go internally and there isn't much crossover between binaries created on Illumos and run on Solaris (though I have heard several anecdotal reports of some binaries not working, I have not verified this myself). I'm sure some of the kstats referenced work, though others don't (yet?/ever?).

I think there is a lot of history and shared work that could be leveraged going forward, too, but fundamentally the two OSes have diverged to the point that there is a need for a discrete build tag in order to make appropriate decisions at compile time.

@binarycrusader

Contributor

binarycrusader commented Jun 8, 2017

  • getrandom(2)

Solaris 11.3 added this:
https://docs.oracle.com/cd/E86824_01/html/E54765/getrandom-2.html

As for the other functions, they are not yet available in Solaris (intentionally or because they haven't been implemented yet). But yes, those are good examples.

@4ad

Member

4ad commented Jun 8, 2017

Having a way to distinguish illumos and Oracle Solaris targets based on build tags seems fine. I don't think that having incompatible solaris and illumos build tags is a good idea. It breaks a lot of existing code for no benefit.

Let's add two new build tags, solaris11x and illumos, and keep the solaris build tag around. By default, the solaris build tag will match both illumos and Oracle Solaris targets, which is what you want in 99% of cases anyway, and it doesn't break any existing code.

When one wants specific support for a particular Solaris variant, either one of solaris11x or illumos build tags is to be used.

(Don't get bogged down on the specific names I chose for this example; the names don't matter. We can use other names like oraclesolaris and illumos, or whatever.)
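As a rough sketch of how the layered scheme described in this comment would behave, the standard `go/build/constraint` package can evaluate constraint lines against a hand-built tag set. The `tagsFor` and `matches` helpers and the target names are hypothetical; they model the suggestion made here, not anything the Go toolchain does today.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// tagsFor returns the build tags a hypothetical target would satisfy under
// the scheme sketched in this comment: the shared "solaris" tag plus one
// variant-specific tag. The tag names are illustrative only.
func tagsFor(target string) map[string]bool {
	switch target {
	case "illumos":
		return map[string]bool{"solaris": true, "illumos": true}
	case "oraclesolaris":
		return map[string]bool{"solaris": true, "oraclesolaris": true}
	}
	return map[string]bool{target: true}
}

// matches reports whether a single build-constraint line is satisfied for
// the given hypothetical target.
func matches(line, target string) (bool, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return false, err
	}
	tags := tagsFor(target)
	return expr.Eval(func(tag string) bool { return tags[tag] }), nil
}

func main() {
	lines := []string{
		"// +build solaris",          // shared code, both variants
		"// +build illumos",          // illumos-only code
		"// +build solaris,!illumos", // Oracle Solaris-only code
	}
	for _, target := range []string{"illumos", "oraclesolaris", "linux"} {
		for _, line := range lines {
			ok, err := matches(line, target)
			if err != nil {
				fmt.Println("parse error:", err)
				continue
			}
			fmt.Printf("%-14s %-30q %v\n", target, line, ok)
		}
	}
}
```

Under these assumptions, existing files tagged only with solaris keep building for both variants, and only files that need variant-specific behavior have to name a narrower tag.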

@sean-

sean- commented Jun 8, 2017

At the end of the day something needs to change, as it is disingenuous to users if we keep with the status quo and expect a GOOS="solaris" binary to run on either Oracle Solaris or Illumos.

If Oracle Solaris is moving to its own build tag (i.e. oraclesolaris), then sharing the solaris build tag with illumos seems fine, but I don't want to obligate the Oracle community to this.

If Oracle Solaris is not going to use a new build tag, then I would object to having the solaris build tag be inclusive of illumos.

That said, we're decidedly 👍 to having Oracle Solaris get its own, new, and dedicated build tag to resolve any ambiguity going forward.

@binarycrusader

Contributor

binarycrusader commented Jun 8, 2017

I dislike bike-shedding, but for the record, I would discourage any build tag containing something that implies a "version" of some sort such as "11x", etc. While that may seem tempting, the public version is used for marketing purposes and isn't reliable as a way to discern differences in technical interfaces.

In particular, Solaris often "backports" functionality to the previous release while the next release is in development. As an example, Solaris 11.3 sometimes receives new functionality from what Solaris calls "SRUs" (the monthly updates that contain security fixes and other improvements) that are from the current in-development release.

As such, on Solaris, feature-test based builds are ideal. Unfortunately, Go's architecture (as far as I can tell) assumes runtime-based feature tests instead of build-time feature tests (unlike rust's build.rs) so that makes this difficult.

In an ideal world, I feel like the original build tag for solaris probably should have been "sunos", with the Sun/Oracle specific variant being "solaris" and "opensolaris" for the community derivatives. This makes sense too because "Solaris 11.3" is technically the marketing name and marketing version for the current release of SunOS 5.11.

However, since we don't live in an ideal world, I'd suggest "sunos" and "illumos" as the two new tags going forward; they're version-agnostic and reasonably accurate.

@jtsylve

Contributor

jtsylve commented Jun 9, 2017

I'd suggest "sunos" and "illumos" as the two new tags going forward; they're version-agnostic and reasonably accurate.

This seems like a reasonable suggestion to me for all of the reasons that @binarycrusader mentioned in his last comment.

@jen20

jen20 commented Jun 9, 2017

I am very much opposed to a system which tries to maintain compatibility between Oracle Solaris and Illumos distributions in a manner which would differ from every other GOOS. In my opinion we should keep the existing solaris build tag as referring to Oracle Solaris, and introduce illumos.

This reflects the way that the various BSDs operate, and therefore imposes no additional cognitive overhead on people who don't care about operating systems other than Linux and Darwin (which is, let's face it, most people!)

The short-term pain which will be experienced by the Illumos community to go through and add build tags to things outside the standard library is not to be minimised - but on the other hand nor is the cognitive overhead of a three-tier GOOS system on the rest of the world. I think we in the Illumos community should just deal with this short-term problem.

@sean-

sean- commented Jun 9, 2017

Given there appears to be no debate surrounding the illumos build tag, can we move forward with that?

@binarycrusader , if there is a desire to have Oracle Solaris be tagged with its own build tag, can that be submitted and addressed independently?

@bradfitz , short of a PR, what would you like the next steps to be?

@binarycrusader

Contributor

binarycrusader commented Jun 9, 2017

@sean- I'm not requesting a separate build tag for Solaris, I had only suggested it as a way to ease the transition for OpenSolaris-based derivatives. My current employer has sponsored much of the work done on Solaris at this point for Go, and I've pushed hard to maintain compatibility with OpenSolaris-based derivatives.

As such, I leave the decision on build tags up to the Go maintainers.

@bradfitz

Member

bradfitz commented Jun 9, 2017

As such, I leave the decision on build tags up to the Go maintainers.

We (the Go maintainers) are not active Solaris or Illumos users. Ideally we'd prefer if the Solaris & Illumos communities would agree on a solution that solves the problems at hand. Maybe that's a new GOOS value (a full fork, e.g. GOOS=freebsd vs GOOS=dragonfly). Maybe that's a build tag only for now.

Is it a goal (or non-goal) for binaries built for Solaris to run on Illumos, or vice versa?

@binarycrusader

Contributor

binarycrusader commented Jun 9, 2017

Solaris only guarantees binary compatibility for binaries built on an older version of the operating system so that they can run on a newer version. Because the standard libraries (libc) and the linkers are now significantly different, it's highly unlikely that a binary built on one would work on the other. With that in mind, I would assert that it is not a goal for binaries built for Solaris to run on OpenSolaris-based derivatives such as Illumos, or vice versa.

Both share a common set of system interfaces (many decades worth), but we're also nearing one decade of divergence. It's unclear to me what the intent is with Go's build tags vs. GOOS, so I can't say which should be the answer.

@jen20

jen20 commented Jun 9, 2017

@bradfitz I would treat it as a non goal for binaries built with (say) GOOS=solaris to work on Illumos and vice versa, though if they do it would be a happy accident.

IMO things would be best served moving forwards by treating Solaris and Illumos in the same manner as FreeBSD vs DragonflyBSD. Assuming @binarycrusader et al have no problem with that, it seems to be the path of least confusion, though admittedly with some short term pain.
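As a small illustration of what the FreeBSD/DragonflyBSD model means for calling code, the sketch below treats the two systems as separate GOOS values that must each be named explicitly. It assumes the proposal lands with the value spelled "illumos"; the `sunOSDerived` helper is hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
)

// sunOSDerived reports whether goos names one of the two targets discussed
// in this thread. "illumos" is the GOOS value PROPOSED here; that it ends up
// spelled this way is an assumption of this sketch.
func sunOSDerived(goos string) bool {
	switch goos {
	case "solaris", "illumos":
		return true
	}
	return false
}

func main() {
	fmt.Printf("%s: SunOS-derived = %v\n", runtime.GOOS, sunOSDerived(runtime.GOOS))
}
```

This is the short-term pain mentioned above: code that today checks only for solaris would have to list both values once the targets are split.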

@sean-

sean- commented Jun 9, 2017

@bradfitz Are you good with that agreement? It sounds like we have a consensus that everyone is happy with.

@bradfitz

Member

bradfitz commented Jun 9, 2017

We'd like to do Go 1.9beta1 next week, and I don't think we can really do this before Go 1.10.

If we do GOOS=illumos, we'd need a GOOS=solaris builder first (#15072, which has been open for some time).

@jen20

jen20 commented Jun 9, 2017

@bradfitz I'm not sure on how a Solaris builder can be supplied beyond VirtualBox (perhaps @binarycrusader can help there), but I'm sure we at Joyent can run any necessary Illumos builders without issue. (cc @bcantrill).

@bradfitz

Member

bradfitz commented Jun 9, 2017

We currently run Illumos builders on Joyent already. (but would love help improving our setup, if somebody has some time) It's the Solaris ones where we lack coverage.

@binarycrusader

Contributor

binarycrusader commented Jun 9, 2017

Solaris amd64 currently only supports Xen-based virtualization, Solaris-based virtualization (kernel zones), full virtualization (VirtualBox/VMWare), or bare metal provisioning. It does not have the virtio/virtnet drivers required by GCE (?).

@bradfitz

Member

bradfitz commented Jun 9, 2017

We have 10 VMWare nodes. But I'd rather discuss on #15072.

@rsc

Contributor

rsc commented Jun 19, 2017

Based on discussion above, switching to GOOS=illumos for Go 1.10 is fine. There's no need to have Solaris builders before then, although of course if we get further into the cycle with no Solaris builders we might reconsider GOOS=solaris entirely (golang.org/wiki/PortingPolicy). But that's just exposing a current problem (no Solaris builders, only Illumos ones), not introducing a new problem.

-rsc for @golang/proposal-review

@rsc rsc modified the milestones: Go1.10, Proposal Jun 19, 2017

@rsc rsc changed the title from "Proposal: New build tag for illumos" to "build: add GOOS=illumos" Jun 19, 2017

@binarycrusader

Contributor

binarycrusader commented Jun 30, 2017

There's an Oracle Solaris builder in place now and there will be more soon, so this change can be made when appropriate.

@ikozhukhov

ikozhukhov commented Jul 21, 2017

How do you plan to specify how the 'solaris' and 'illumos' build targets should differ?
I work on DilOS (based on illumos), and I'm interested in the details. I am trying to move to more of a Debian userland, but it is not clear to me how you want to split the 'solaris' and 'illumos' targets. Who will be responsible for feature requests? We have golang-1.8.x, and I'm interested in the next updates.
Also, I have Intel & SPARC platforms, and I have problems with golang on SPARC: I tried to prepare gcc6-sparc-cross build tools and produce golang builds. I'm able to produce binaries, but only with a few small changes. It is not a golang issue (it is probably related to the gcc team), but I am still interested in a golang port to DilOS SPARC.
I can provide build zones on Intel & SPARC if you are interested in some builds.

@bradfitz

Member

bradfitz commented Nov 15, 2017

Moving to Go 1.11, as this apparently didn't happen while I was away on leave.

@bradfitz bradfitz modified the milestones: Go1.10, Unplanned Nov 15, 2017

@affixalex

affixalex commented Dec 12, 2017

I'm generally hesitant to chime in on this sort of thing for fear of bikeshedding, but I didn't see this mentioned here.

I think it would be reasonable to target the Solaris 10 brand. The syscall ABI isn't guaranteed to be backwards compatible in the master branch of Illumos (or Solaris, to my knowledge). The libc is generally considered the compatibility boundary.

https://github.com/joyent/illumos-joyent/blob/master/usr/src/uts/common/brand/solaris10/s10_brand.c

The Solaris 10 brand, however, does have some implicit guarantees about the kernel ABI. These may be spelled out explicitly somewhere, I'm not sure.

Off the top of my head, I think this approach would allow binaries to run on both Solaris and Illumos without any loss of functionality.

(As a parenthetical footnote, I think it'd be nice if Illumos had a distinct OSABI and attendant host triples etc in compiler toolchains along with a stable kernel ABI, but the general consensus is that the whole issue is a spectacular troll.)

@4ad

Member

4ad commented Dec 12, 2017

We're not doing Solaris 10, it lacks APIs we use. Even if we were doing Solaris 10, we'd want to use different APIs on Solaris 11, so Solaris 10 can never be a base for everything.

@4ad

Member

4ad commented Dec 12, 2017

As for the syscall ABI compatibility, that is a non-concern since Go only uses libc on Solaris variants.
