When to release Elvish 1.0 #815

Open
xiaq opened this issue Apr 6, 2019 · 5 comments

Comments

@xiaq
Member

@xiaq xiaq commented Apr 6, 2019

Here are my thoughts on what Elvish 1.0 means and when to release it.

Background

Elvish has, up until now, been designed and implemented predominantly by one person, namely me. I certainly don't mean to belittle the contributions from and discussions with the community, nor do I reject a future where Elvish becomes more like a community project. Nonetheless, I am the only person who can push directly to the repo, and I have never merged any code that I don't personally like.

Having a small team of designers for a language is a better idea than it might sound (and 1 is the smallest you can get). For instance, C was designed by Dennis Ritchie; Go by Robert Griesemer, Rob Pike and Ken Thompson; Clojure by Rich Hickey. None of those languages is perfect, but they are more coherent and tend to be smaller than languages designed by committees. (Examples abound; Scheme is a notable, perhaps the only, counter-example.)

To conclude, before the Elvish community evolves significantly, Elvish will remain designed (and "controlled", if you will) predominantly by me. This is both a constraint of reality and a good thing. As a result, this design note, and potentially many others to come, will be colored by my honest personal opinions.

With that out of the way let's get to the actual topic...

What Elvish 1.0 means

I treat the significance of 1.0 in pretty much the same way as Go treated its 1.0: namely, there is a compatibility promise. What this means is the following:

  • There will be a more or less rigorous language specification (Go's language specification is an example);

  • Any code written against that language specification will continue to work in future releases of Elvish 1.x.

In other words, once a language feature is in Elvish 1.0, it will be in all Elvish 1.x releases. One can work around that by explicitly marking some features as unstable and not documenting them in the language specification, but I want to avoid that wherever possible.

I would also think that Elvish 1 should have a lifespan of at least 10 years from the 1.0 release. Combined with the compatibility promise, this means that the Elvish 1.0 language needs to get things as "right" as possible, because once it is released, I will stick to such features for 10 years (this is not a promise, but rather a goal).

Also note that the UI experience is not put under the compatibility promise. My hope is that by Elvish 1.0, the UI will have become modular enough that it can have its own versioning, and breaking changes in the UI are more acceptable than in the language.

What is still not right

So the reason we haven't had an Elvish 1.0 release yet is, of course, that some parts of the language are still not "right". This is a strictly subjective matter, and this is where my personal taste kicks in.

Here is a preliminary list of things I don't find "right" yet. As anyone familiar with programming languages will notice, all of those problems are already understood within the PL theory community, and many have multiple solutions. However, the real problem here is a design one: which solutions to pick and mix to make Elvish feel simple and coherent.

  • State sharing between concurrent threads. Shells are inherently concurrent languages due to the existence of pipelines. However, it is not clear how to share state between threads in a safe way: Elvish today is actually a concurrency-unsafe (or concurrency-dangerous, if you will :) language, and you can actually crash Elvish by concurrently accessing variables from different goroutines.

    • Should all variables be guarded by mutexes (in other words, make all variables atoms)?

      • As far as I know no language does this, but Elvish can probably do it if we are willing to permanently give up performance in favor of safety and ease of use.

      • If we do this, it makes atomic access to single variables easier, but might end up making atomic access to multiple variables harder too.

    • Should concurrent access of variables simply be disallowed, and CSP-style concurrency enforced?

    • Should we keep the core language concurrency-dangerous, and require explicit use of concurrency primitives? Most languages actually do this, but it requires programmers to be careful.

  • Static type system. A language without any kind of static type system is not likely to be future-proof. I have this vague idea that Elvish should have an optional structural type system much in the style of TypeScript, but there are many details to be figured out.

  • Should numbers be strings? This might sound like a trivial question, but it's hard to answer.

    • Answer: they both are and aren't! Design is #816.

    • If numbers are strings (which is the current case), it means that you cannot convert JSON (or any other serialization format, really) to an Elvish value and back and get the same value, because JSON numbers are converted to strings.

    • If numbers are not strings, should a literal like 42 be a number or a string? This is also hard to answer, because in vim 42, the 42 is definitely a filename and thus a string, but in + 42 2, the 42 is definitely a number.

    • Maybe we should adopt some kind of sloppiness in this subject, treating numbers like strings (and vice versa) sometimes, but not always. But sloppiness is very hard to get right.
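The JSON round-trip problem from the "numbers are strings" case can be sketched in Go. This is only an illustration under stated assumptions: stringifyNumbers and roundTrip are hypothetical helpers that mimic a value model where every number is a string; they are not Elvish's actual conversion code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// stringifyNumbers mimics a value model in which numbers are strings: every
// JSON number inside v is replaced by its decimal string form.
// (Hypothetical helper, not part of Elvish.)
func stringifyNumbers(v any) any {
	switch v := v.(type) {
	case float64:
		return strconv.FormatFloat(v, 'f', -1, 64)
	case []any:
		out := make([]any, len(v))
		for i, e := range v {
			out[i] = stringifyNumbers(e)
		}
		return out
	case map[string]any:
		out := make(map[string]any, len(v))
		for k, e := range v {
			out[k] = stringifyNumbers(e)
		}
		return out
	default:
		return v
	}
}

// roundTrip decodes JSON, turns numbers into strings, and encodes the
// result again.
func roundTrip(in string) (string, error) {
	var v any
	if err := json.Unmarshal([]byte(in), &v); err != nil {
		return "", err
	}
	out, err := json.Marshal(stringifyNumbers(v))
	return string(out), err
}

func main() {
	out, err := roundTrip(`{"port":8080,"tags":["a",1]}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"port":"8080","tags":["a","1"]}
}
```

The output is valid JSON, but the type information is gone: a consumer expecting the number 8080 now receives the string "8080".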

That's it for now; the list will likely grow and shrink over time as I update this post to reflect the status.
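To make the first option in the state-sharing item concrete (guarding every variable with a mutex), here is a minimal Go sketch. Atom and countTo are hypothetical names chosen for illustration; this is not how Elvish variables are implemented today.

```go
package main

import (
	"fmt"
	"sync"
)

// Atom is a hypothetical mutex-guarded variable. It is an illustration of
// the "all variables are atoms" option, not Elvish's actual implementation.
type Atom struct {
	mu  sync.Mutex
	val int
}

// Get returns the current value while holding the lock.
func (a *Atom) Get() int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.val
}

// Update applies f to the value atomically.
func (a *Atom) Update(f func(int) int) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.val = f(a.val)
}

// countTo increments a single atom from n goroutines and returns the final
// value.
func countTo(n int) int {
	var counter Atom
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			counter.Update(func(v int) int { return v + 1 })
		}()
	}
	wg.Wait()
	return counter.Get()
}

func main() {
	fmt.Println(countTo(1000)) // 1000
}
```

Each Atom protects only itself, so an invariant that spans two atoms (say, moving a value from one to the other) would still need an extra lock. That is the difficulty the sub-point about atomic access to multiple variables raises.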

@xiaq
Member Author

@xiaq xiaq commented Apr 18, 2019

Addendum to what's still not right:

  • A system for classifying errors. Perhaps not so much as a "system", but a coherent guideline. I am mainly familiar with those of Python and Go and neither is exactly satisfying:

    • Python provides a few "standard" exception types like ValueError and TypeError. However, they are way too vague and don't provide nearly enough signal when it comes to fine-grained error handling. Programmers can also define their own exception types, but there is not much guideline on how to use them.

    • In Go there are basically two ways you can do errors: 1) build ad hoc errors that are basically just glorified messages, via fmt.Errorf or errors.New; 2) implement a dedicated type for a class of errors, an example of which is os.PathError. Sometimes, functions are provided to distinguish certain classes of errors from others; an example of this is os.IsPermission. All of these techniques in Go seem to work fine for their specific use cases, but then there isn't much uniformity; programmers are left on their own to decide how to report and classify errors.

@xiaq
Member Author

@xiaq xiaq commented Apr 19, 2019

Another, more interesting question that I haven't "got right" is the ability to define new types from Elvish code. (There currently isn't any.)

  • On the one hand, Elvish has maps and lists, which can pretty much carry most shapes of data (never mind efficiency); this is indeed how data in languages like Clojure and JavaScript are typically represented. Maps and lists are easy to understand, inspect and serialize. Behaviors cannot be attached to them; instead, you can write functions that manipulate them.

  • On the other hand, Elvish has a few builtin custom types, like styled and styled-segment (the type of keys in binding tables are also custom types, but they are not exposed to Elvish code). Those custom types are really just Go structs that implement some methods, which makes sense as this is indeed how you define new data types in Go. They also have some custom behaviors; notably for styled, it knows how to turn itself into ANSI sequences when printed onto the terminal. This is something that maps and lists cannot do unless we introduce something along the lines of metatables in Lua.
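The contrast between the two bullets can be sketched in Go. Styled below is a hypothetical, much simplified stand-in written for this illustration; it is not Elvish's actual styled implementation. The point is only that a struct can carry behavior (here, a String method that emits ANSI sequences) while a map holding the same data cannot.

```go
package main

import "fmt"

// Styled is a hypothetical stand-in for a builtin custom type.
type Styled struct {
	Text string
	Bold bool
}

// String renders the text with ANSI escape sequences. fmt calls it
// automatically when the value is printed; a plain map has no such hook.
func (s Styled) String() string {
	if s.Bold {
		return "\033[1m" + s.Text + "\033[m"
	}
	return s.Text
}

func main() {
	asStruct := Styled{Text: "hi", Bold: true}
	asMap := map[string]any{"text": "hi", "bold": true}

	// %q makes the escape sequences visible instead of applying them.
	fmt.Printf("%q\n", fmt.Sprint(asStruct)) // "\x1b[1mhi\x1b[m"
	fmt.Printf("%q\n", fmt.Sprint(asMap))    // "map[bold:true text:hi]"
}
```

The map is easy to inspect and serialize, but printing it only dumps its contents; the struct knows how to present itself.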

Another way to summarize this is that there are three different requirements that pull the design in different directions:

  • Interoperability with Go code -- including, notably, the implementation of builtin facilities. This encourages defining custom types, because Go code typically defines new shapes of data using struct, and new behaviors are implemented as methods of those structs.

  • Uniformity and simplicity within Elvish. This favors having as few custom types as possible -- use lists and maps for everything.

  • Go/Elvish code parity. Go code can define custom behavior (styled being the arch-example), Elvish code cannot. This is not a good scenario to be in, as builtin facilities (which are Go code) have access to magic that is inaccessible to Elvish code.

@xiaq xiaq removed the comp:misc label Oct 18, 2019
@xiaq xiaq changed the title What Elvish 1.0 means What's blocking Elvish 1.0 Dec 31, 2019
@xiaq xiaq changed the title What's blocking Elvish 1.0 When to release Elvish 1.0 Dec 31, 2019
@dumblob

@dumblob dumblob commented Jul 14, 2021

  1. WRT state sharing between concurrent threads - I'd say first implement something simple (an MVP) and later maybe something more complex.

    I'd first go for CSP-style channels as that should be quite easy to expose from Go. But ideally I'd probably go for something like V's shared & (r)lock blocks (they can be nested and guarantee safety!) which is presumably more time-consuming to implement.

  2. I'm a bit skeptical about defining new types. I'd prefer the direction "Uniformity and simplicity within Elvish." out of the three choices. I'm too new to grasp everything in Elvish and all the reasons why things in Elvish are as they are but intuitively I'd say such a high-level shell doesn't need any dedicated support of "working with types" in syntax (not even lists/maps/integers/...) as it's anyway futile imagining those thousands of types utilities produce in pipes.

    My idea always was to add its own "method" (i.e. new functionality) to an existing shell command (thus leveraging its existing implementation details!) which would have a standardized name/API and which would ensure compatibility on demand. One of the potential ways to implement it could be as follows:

    1. each command has to accept pipe data in the most generic format (I'm deliberately not saying what format it is, but I'm sure it has to be extensible at least in compile time as newly defined commands can require new data types which won't fit the existing set of data types the generic format will support) - imagine something like TLV or any other stream-oriented extensible (unlike msgpack/cbor/...) efficient (in terms of native binary data support unlike JSON/XML/...) format like uSX or ston

    2. each command is in addition to (1) encouraged to accept also data in other formats - especially the one format which the command itself finds most efficient to work with (targeting zero-copy)

    3. each command is required to produce pipe data in the most generic format (the same one as in (1) )

    4. each command is in addition to (3) encouraged to produce also data in other formats - especially the one format which the command itself finds most efficient to work with (targeting zero-copy)

    5. at the beginning of the pipe plumbing both producer and consumer will try to find the most efficient way to exchange data

    This is to avoid any need for conversion commands in between while at the same time offer run-time detection of the most efficient way to produce & consume data.
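    The negotiation in step (5) could look roughly like the following Go sketch. Everything here is an assumption made for illustration: the Command type, the negotiate function, and the format names ("generic", "columnar") are hypothetical and do not correspond to anything that exists in Elvish.

```go
package main

import "fmt"

// Generic is a placeholder name for the mandatory fallback format that
// every command must accept and produce.
const Generic = "generic"

// Command lists the formats a command can produce and accept, most
// efficient first. (Hypothetical type for illustration.)
type Command struct {
	Name     string
	Produces []string
	Accepts  []string
}

// negotiate returns the first format the producer prefers that the consumer
// also accepts, falling back to Generic when there is no better match.
func negotiate(producer, consumer Command) string {
	accepted := make(map[string]bool, len(consumer.Accepts))
	for _, f := range consumer.Accepts {
		accepted[f] = true
	}
	for _, f := range producer.Produces {
		if accepted[f] {
			return f
		}
	}
	return Generic
}

func main() {
	lister := Command{Name: "lister", Produces: []string{"columnar", Generic}}
	table := Command{Name: "table", Accepts: []string{"columnar", Generic}}
	plain := Command{Name: "plain", Accepts: []string{Generic}}

	fmt.Println(negotiate(lister, table)) // columnar
	fmt.Println(negotiate(lister, plain)) // generic
}
```

    Because the generic format is mandatory on both sides, the negotiation always succeeds; the specialized formats are purely an optimization.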

    Btw. having separate filtering commands as the primary recommended measure to filter data in pipelines might sound very tempting but it actually makes things less composable in practice considering that non-filtering commands will be extended over time which will result in new filtering needs and these will at some point be impossible to satisfy in the generic context of a "filtering command" and will be satisfiable only in the contexts of non-filtering commands.

    Another reason is that filtering commands will never be able to efficiently support the special data types and thus everything would in practice just collapse to using the generic format significantly harming performance. Actually this whole idea might be implemented as an extension subcommand for "generic" filtering commands but the "main" implementation would be rooted in the non-filtering command.

@iandol
Contributor

@iandol iandol commented Nov 17, 2021

which would have a standardized name/API and which would ensure compatibility on demand ... "each command has to accept pipe data in the most generic format"

I think this would be great. Although the concept of a 'pipeline' is the first one greeting us on the elvish home page, most elvish commands are not pipeline-aware (which is why there is a rather cryptic all (one) to feed an each): all and nested each are required to get pipelines working across commands. Being more pipeline-friendly would make elvish more expressive; see for example nushell, which has really expressive and semantically obvious pipelines: df | detect columns | drop column | into filesize 1K-blocks Used Available. @xiaq, you don't mention it above, so it seems you prefer the elvish design to take a pipeline-optional approach rather than a pipeline-first approach?

@ilius

@ilius ilius commented Jan 8, 2022

Python provides a few "standard" exception types like ValueError and TypeError. However, they are way too vague and don't provide nearly enough signal when it comes to fine-grained error handling.

I was surprised and disappointed when I learned that ValueError doesn't even have an attribute/property (passed as optional argument) for the value that caused the ValueError!

@xiaq xiaq removed the note label Mar 1, 2022

4 participants