Raku lacks a generic way of handling backend constants (signals, protocol families, etc.) #243

Open
Kaiepi opened this issue Oct 31, 2020 · 1 comment
Labels
language Changes to the Raku Programming Language

Comments

Kaiepi commented Oct 31, 2020

Before I can make a PR for my solution for #111, there's another problem that needs to be solved: ProtocolFamily, SocketType, and ProtocolType don't correspond to the real values the backend uses with sockets. This is due to a change I made early on in my work on the IP6NS grant. At the time, they corresponded to Linux's values for PF_*/SOCK_*/IPPROTO_*, but these aren't always the same as those used by other platforms, such as FreeBSD. I couldn't figure out how to let the JVM expose values for these that NativeCall could use, since I hadn't worked with Java much, and there are no values like these exposed for use with NIO. While making them correspond to nqp constants instead allows sockets to be used on more platforms, this hides the problem more than it solves anything, and they become an issue again once an API for DNS resolvers gets involved.

IPPROTO_* values are unlike PF_* or SOCK_* ones in that they're typically numbers assigned by IANA, in which case they will not differ from platform to platform when the protocols they correspond to are supported. These numbers can appear in some types of DNS responses (WKS, for instance), so should people attempt to use ProtocolType to represent them, they're in for a rude surprise.
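To make the mismatch concrete, here is a minimal sketch. Both enums are hypothetical: `WireProtocol` holds IANA-assigned protocol numbers (1, 6, and 17 are IANA's numbers for ICMP, TCP, and UDP), while `OpaqueProtocol` stands in for an enum backed by backend-internal constants, with values invented purely for illustration.

```raku
# IANA-assigned protocol numbers: the same on every platform.
my enum WireProtocol (WIRE_ICMP => 1, WIRE_TCP => 6, WIRE_UDP => 17);

# Hypothetical stand-in for an enum backed by backend-internal constants.
# These values are invented for illustration only.
my enum OpaqueProtocol (OPAQUE_TCP => 1, OPAQUE_UDP => 2);

# The protocol number a WKS record would carry for TCP services.
my $wks-protocol = 6;

# Coercing through the IANA-valued enum finds the TCP entry.
say WireProtocol($wks-protocol);

# Coercing through the opaque enum finds nothing: no entry has value 6,
# so the coercer hands back an undefined type object.
say OpaqueProtocol($wks-protocol).defined;
```

An enum whose values are not the real protocol numbers cannot round-trip a number read off the wire, which is the surprise described above.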

There is a lot of code involved in exposing signals from the backend, which is rather inefficient in the JVM in particular. This doesn't translate or scale well for protocol families, socket types, and protocol types, as well as any other type of value like these that could exist in the future, such as socket options. I think there's a more general problem to be solved here:

  • There are constant integers that carry meaning in the backend.
  • These should be exposed to the runtime for use elsewhere, such as with NativeCall.
  • The host may or may not support any number of these values...
  • ...which the backend may or may not be capable of supporting as well...
  • ...but the runtime should still have a representation for them anyway, and should be able to detect what level of support for them exists.
  • These may have specified values that the runtime should always know about...
  • ...or they may be platform-dependent, in which case they should remain unknown if unsupported.
@Kaiepi Kaiepi added the language Changes to the Raku Programming Language label Oct 31, 2020

Kaiepi commented Nov 18, 2020

I have a solution I'd like to propose for this, which comes with some good news and some bad news. The good news is it solves this problem in a way that improves the performance of both signals and sockets on MoarVM with minimal breakage, with &signal becoming around 25% faster and IO::Socket::INET.listen becoming around 15% faster; the bad news can't be given without explaining how it works first.

So far, I've been calling the values this issue pertains to "constants". This term is already overloaded and already carries a meaning in a backend context, so I call magical constants like signals "runes" instead.

On the Rakudo side of things, a common API for runes can be defined with the following types:

my enum Rune::Kind ( #`[...] );

my enum Rune::Support ( #`[...] );

my role Rune[Rune::Kind:D] {
    method kind(::?CLASS:_: --> Rune::Kind:D) { ... }

    method support(::?CLASS:_: --> Rune::Support:D) { ... }
}

my role Rune::WithDefault[Int:D] {
    method default(::?CLASS:_: --> Int:D) { ... }

    multi method CALL-ME(::?CLASS:U: Int()) { ... }
}

A rune has a kind and a level of support associated with it, alongside the key, value, and index it has as a result of being defined as an enum value. Kinds differentiate between different lists of runes in the backend, and are relevant when generating rune enums or getting a level of support for an individual rune. Separating the level of support for a rune from its value (like signals do now) makes it possible to eliminate Rakudo::Internals.VM-SIGNALS, since there's no longer a need to ask the backend for a list of runes more than once to get their support levels.

Runes may or may not have a defined value when the host doesn't support them (e.g. 0 for Signal). When they do, Rune::WithDefault allows a default, out-of-range value for these to be provided, which its CALL-ME candidate invalidates. Signal::Signally no longer has any behaviour unique to it because of this type, which can now be shared with ProtocolFamily and SocketType without introducing more types.
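As a rough illustration of keeping support separate from value, the sketch below uses a plain class rather than the enum-plus-role design proposed above. Everything in it is hypothetical: the `SupportLevel` members (the real `Rune::Support` members are elided in the proposal), the `RuneSketch` class, the `SIGFAKE` entry, and the `lookup` helper, which only imitates how a default value could be rejected.

```raku
# Hypothetical support levels; the proposal leaves Rune::Support's members elided.
my enum SupportLevel <Unsupported Supported>;

# Hypothetical default meaning "no real value", like 0 for Signal.
my constant DEFAULT-VALUE = 0;

# A plain-class stand-in for one rune, with support tracked apart from value.
my class RuneSketch {
    has Str          $.key;
    has Int          $.value;
    has Int          $.index;
    has SupportLevel $.support;
}

my @signals =
    RuneSketch.new(:key<SIGHUP>,  :value(1),             :index(0), :support(Supported)),
    RuneSketch.new(:key<SIGFAKE>, :value(DEFAULT-VALUE), :index(1), :support(Unsupported));

# Imitates the invalidating coercer: the default value never resolves to a rune.
sub lookup(Int $value) {
    return Nil if $value == DEFAULT-VALUE;
    @signals.first(*.value == $value)
}

# Support can be read directly, without inspecting the value.
for @signals -> $rune {
    say "{$rune.key}: {$rune.support}";
}
```

Because each entry carries its own support level, nothing has to be inferred from a sentinel value after the list has been built.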

nqp gains two new ops:

nqp::getrunes(int $kind)
nqp::getrunesupport(int $kind, int $idx --> int)

getrunes returns a list of key/value pairs for a given $kind of rune (a Rune::Kind:D), similarly to getsignals.

getrunesupport gets the level of support for an individual rune (corresponding to a Rune::Support:D) given its kind and index.

The nqp protocol for runes is based around their "canonical indices" rather than their real values, which makes it possible for the backend to continue to validate runes given as arguments to ops in constant time. The signal, connect, and bindsock ops now accept indices of runes instead of values, like getrunesupport does.
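The constant-time claim can be sketched as follows, assuming a rune's canonical index is simply its position in a per-kind list. The list contents and both helper names are invented for illustration, and the value check shown is only a naive comparison point, not how any backend actually validates values.

```raku
# Hypothetical canonical list for one kind of rune; position is the index.
# The values are invented and already in canonical order.
my @family-values = 0, 2, 10;

# Validating an index is a single bounds check, whatever the list's size.
sub valid-index(Int $idx --> Bool) {
    0 <= $idx < @family-values.elems
}

# A naive check on a raw value has to search the list instead.
sub valid-value(Int $value --> Bool) {
    @family-values.first(* == $value).defined
}

say valid-index(2);
say valid-index(3);
say valid-value(10);
```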

On MoarVM, besides using rune indices instead of values, current performance levels of signal and socket ops can be maintained by doing more of the work involved in generating lists of runes at compile time. All lists of runes can be generated like this, but some aren't guaranteed to have a predefined order that can be known at that point. In this case, when accessed for the first time, a list of runes is canonicalized with a merge sort by value. The result is cached in the VM instance alongside its boxing for use during lookups and to allow getrunes to be called more than once for a kind of rune more efficiently (should there come a time when that becomes necessary).
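The canonicalize-on-first-access step can be sketched in Raku, although the real work would happen inside the VM. The `%raw` data, the `%cache` hash, and the `canonical-runes` helper are all hypothetical, and Raku's built-in `sort` stands in for the merge sort described above.

```raku
# Hypothetical key/value pairs, in whatever order the platform reported them.
my %raw = socket-type => [SOCK_B => 2, SOCK_A => 1, SOCK_C => 5];

# Stand-in for the per-VM-instance cache.
my %cache;

# On first access, order the pairs by value and remember the result, so that
# a rune's canonical index is simply its position in the cached list.
sub canonical-runes(Str $kind) {
    %cache{$kind} //= %raw{$kind}.sort(*.value).List;
}

# The first call sorts and caches; later calls reuse the cached list.
say canonical-runes('socket-type').map(*.key).join(' ');
say canonical-runes('socket-type').map(*.value).join(' ');
```

Sorting by value gives every kind of rune a deterministic order even when none is known ahead of time, which is what lets indices be used in place of values.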

The bad news is to do with backends other than MoarVM. A similar strategy to how I implemented this API for MoarVM could be used for the JVM and JS backends, but in order for them to be capable of obtaining real values for socket-related runes, the JVM backend would need a C compiler, and the JS backend would need a C++ compiler on hand when built. In theory, a C or C++ preprocessor would be enough to obtain rune values, but in practice I found this to be fragile, and builds would break as soon as it gets used with values that aren't C literals anyway. If this is the way to go, I don't think I have the insight needed to teach the build system to work with C/C++ compilers to complete implementations of this for the JVM and JS backends (at least, not when time's a concern).
