Project Goals

andychu edited this page Apr 1, 2017 · 39 revisions

Goals

  • Immediate goal: Implement a bash-compatible shell called OSH.
  • Long-term goal: Design a modern Unix shell language called Oil that can do everything bash/zsh/etc. can do, and more.

Oil treats shell seriously as a programming language, in terms of both its implementation and the definition of its semantics.

For a more immediate view of the project, see the Oil blog. In particular, this blog entry was written at the same time as this page.

Use Cases

  • System Administration
    • Building Linux distributions (e.g. Arch Linux uses bash for PKGBUILD).
    • Startup scripts
    • Configure and build scripts. Reproducible and distributed builds.
  • Distributed Computing
    • Building containers
    • Specifying remote jobs
    • Feedback and Monitoring: performance measurement, security testing.
  • Data Science / Scientific Computing
    • Heterogeneous "big data" and small data pipelines. The language should scale down as well as scale up, i.e. low startup latency for small jobs.
    • Incorporate features of "workflow languages" and systems in the MapReduce family.
    • Concise data cleaning, transformation, and summarization.
    • Reproducible Research.
    • Non-goal: mathematical modeling. That should be left to specialized languages like R, Julia, and Matlab. Communicate with those languages through coprocesses (to avoid startup overhead and to gain concurrency).
  • Interactive Computing
    • A general purpose REPL (terminal and probably a Jupyter kernel).
  • Document Publishing
    • http://oilshell.org/ and many programming books are built and orchestrated with shell scripts / Makefiles

Oil Language Design Goals

  • Easy upgrade path from bash, the most popular shell in the world.
    • To do this, I've written a very compatible bash parser, which will allow automatic conversion of bash (osh) to oil. As a result, the language has a different syntax from bash but a superset of bash semantics.
  • Consistent syntax.
  • Fix sh and bash semantics to be more developer-friendly (in a backward compatible way).
    • Proper Arrays
    • Strict mode for developer productivity (enhanced set -o errexit, nounset, pipefail)
  • Enhance the shell language; treat it as a real programming language.
    • Fill in obvious gaps, like abspath, etc.
    • Compound data structures
    • Example: Completion functions in bash have a bad API involving globals and are difficult to write. It should feel more like writing completion functions in Python or JavaScript.
    • Selected influences: Python, R, Ruby, Perl 6, Lua (API), ML, C and C++, and PowerShell.
  • Reduce language cacophony in shell programming by reimplementing tools closely related to the shell.
    • Example: combine shell, awk, and make.
    • Also combine tools like find (which has its own expression parser and starts processes), and xargs/GNU parallel, which start processes in parallel. GNU parallel is actually mentioned in the bash manual.
  • Richer constructs for concurrency and parallelism.
    • Folding in make -j and xargs -P goes a long way.
  • Allow secure programs to be written.
    • In emitting strings: escaping
    • In reading strings: error checking should be easy, better control over "read" delimiters, etc.
    • Fix issues with globs and flags, i.e. untrusted file system and untrusted variables
  • C and C++ bindings
    • Provide access to advanced Linux kernel features such as namespaces, cgroups, seccomp, tracing, and /proc (but remain portable to other Unices).
    • It should be possible to write a busybox in oil.
  • Should be the best language for writing quick command line tools.
    • In particular, replace the getopt interface in bash with something much better.
  • Expand the range of things that can be done with the "polyglot" model.
    • Coprocesses
    • Built-in serialization formats like CSV, JSON, maybe HTML
    • Maybe some binary formats as libraries
  • No extra "macro processing" on top of the parser. History substitution will be built in, but disabled in batch mode. procs can be used instead of aliases.
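Two of the goals above, strict mode and safe handling of untrusted file names, build on things bash can already do today. The sketch below is plain bash, not Oil syntax: it shows the three existing options named in the strict-mode bullet, then one common way to keep a file name that starts with a dash from being parsed as a flag. The variable names (without_pipefail, with_pipefail, scratch, names) are chosen here for illustration only.

```shell
#!/usr/bin/env bash
# The three existing bash options that the strict mode above would enhance.
set -o errexit   # exit if a command fails
set -o nounset   # expanding an unset variable is an error
set -o pipefail  # a pipeline fails if any stage fails, not only the last

# pipefail in isolation: each child shell prints its pipeline's exit status.
without_pipefail=$(bash -c 'false | true; echo "$?"')
with_pipefail=$(bash -c 'set -o pipefail; false | true; echo "$?"')
echo "without pipefail: $without_pipefail, with pipefail: $with_pipefail"

# Untrusted file names: a file called "-rf" looks like a flag after globbing.
scratch=$(mktemp -d)
(
  cd "$scratch"
  touch -- -rf notes.txt   # "--" ends option parsing, so "-rf" is a file name
  # A bare `rm *` here would expand to `rm -rf notes.txt`.
  # Prefixing the glob with ./ means no expanded word can start with a dash.
  names=(./*)
  echo "entries: ${#names[@]}"
  rm -- ./*
)
rmdir "$scratch"
```

Both workarounds depend on the script author remembering them every time, which is the kind of burden the design goals above aim to remove.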

Language Design Style

  • Imperative on the scale of code, but declarative/functional/concurrent on the scale of architecture, not unlike sh itself.

Implementation Goals

  • Proper error messages like Clang/Swift. Static Parsing.
  • Provide end-to-end tracing and profiling tools (e.g. for pipelines that run for hours)
  • Library-based design like LLVM. Example: the same parser is used in batch mode as well as completion mode, which is not true of all shell implementations. The parser can be used for auto-formatting and linting, which is also not true of other implementations.
  • Few dependencies so it can be used in bootstrapping Unix systems and clusters. (e.g. distributed as a C++ file and optional oil source.)
  • Much of oil should be written in oil (which means the VM needs to be fast enough for this).

Longer Term Goals

  • Expose our toolkit for little languages (lexing, parsing, AST representation, etc.) so that other languages can be built in the same way.
  • Metaprogramming with ASTs as first class data structures.
  • FastCGI Scripts on shared hosting (using strict input validation and hygienic text generation).