Architect Expressions #16
UPDATE:
**Artifact Expression User Stories**

**Writing Artifact Expressions**

- As an operator, I should not need to specify the uid/gid of the entrypoint program. It is assumed that the entrypoint program will run as root:root.
- As an operator, I should not hard-code the exposed ports and addresses of a container. This information should be given to the application at runtime via environment variables, as it may change dynamically through the lifetime of an automaton.
- As an operator, I must be able to supply a list of environment variables in the format
- As an operator, I must be able to supply a list of arguments to execute when the container starts (the entrypoint). These values act as defaults and MAY be replaced by an entrypoint specified when creating a container (see the sketch after these lists).
- As an operator, I must be able to supply default arguments to the entrypoint (e.g. the CMD Dockerfile instruction), which will be executed when the container starts if the entrypoint is not specified.
- As an operator, I must be able to supply a current working directory for the entrypoint in the container. This value SHOULD act as a default and MAY be replaced by a working directory specified when creating the container.
- As an operator, I must be able to supply labels for the container using the annotation rules.
- As an operator, I must be able to specify a system call signal that will be sent to the container to exit. The signal can be a signal name such as
- As an operator, I should not specify persistent storage mount points (similar to Docker volumes) in the Artifact spec; instead, this information should be present in the State spec.
- As an operator, I must be able to build an artifact from another artifact fetched from a remote or local registry transparently via a content address in the artifact spec. The content address of an artifact must be unique and deterministic (any change to the content should generate another pseudorandom content address), and the content address MUST be supplied through a trusted source.
- As an operator, I must be able to specify imperative instructions to add layers to an artifact. These imperative instructions should be translated into deterministic output (i.e. an OCI image) before the artifact is fetched and used by another operator.
- As an operator, I want matrix artifacts to comply with the OCI image standards so
- As an operator, I want to be able to delete references to created artifact specs using imperative commands such as:
  This command should delete the references to the top artifact layer, manifests, and configuration. However, the underlying layers may still be utilised by other artifacts, so they should remain until the garbage collector collects them.
- As an operator, I would like to enable liveness and readiness probes on an artifact. If an automaton is not responding to valid periodic requests, it should be killed and restarted. If an automaton is not ready, no traffic should be directed to it.

**Sharing Artifact Expressions**

- As an operator, I want to be able to push an artifact to a shared registry so all other
- As an operator, I want to have copies of the actual artifact (layer tar archive, manifest, configuration, etc.) when I want to test the artifact locally.
- As an operator, I want the read-only components of an artifact to be shared across multiple artifacts (for example, a NixOS artifact shares some base layers with an Alpine artifact); this should be done through nix store graphs to reduce storage space.
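The "writing" stories above map fairly directly onto the defaults of an OCI image configuration. Below is a minimal, hedged sketch assuming an artifact expression is lowered to something like nixpkgs' `dockerTools.buildImage`; the attribute names follow the OCI image config, and the actual matrix artifact spec may name things differently.

```nix
# Minimal sketch only: assumes an artifact expression lowers to something like
# nixpkgs' dockerTools.buildImage. The `config` attribute follows the OCI image
# configuration; the real matrix artifact spec may use different names.
{ pkgs ? import <nixpkgs> {} }:

pkgs.dockerTools.buildImage {
  name = "example-automaton";
  tag  = "latest";

  # Defaults that MAY be overridden when the container is created.
  config = {
    Entrypoint = [ "/bin/server" ];                       # entrypoint program, runs as root:root
    Cmd        = [ "--verbose" ];                         # default arguments (CMD-like)
    WorkingDir = "/srv";                                  # default working directory
    Env        = [ "LOG_LEVEL=info" ];                    # static env; addresses are injected at runtime
    Labels     = { "org.example.team" = "architect"; };   # annotations/labels
    StopSignal = "SIGTERM";                               # signal sent to ask the container to exit
  };
}
```

Note that nothing address-related is hard-coded in this sketch; exposed ports and addresses would be injected by the orchestrator at runtime, per the stories above.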
Just regarding uid and gid, I think these should be automatically specified. The operator should not have to worry about what user the container is running as. In fact, I would think each instance of an Automaton can run as its own user; it just ensures more isolation. So maybe we can ignore anything inside /etc/passwd?
A single container can bind to several network interfaces/ports. The main designation for what a container speaks is the protocol spec, but right now the protocol spec doesn't specify address details; an HTTP protocol spec, for example, has no mention of the HTTP address. This is by design: the operator should not need to create an artificial address, since all of these can be automatically and deterministically derived. But how? Via "pushing the config down". Basically, the apps inside the container need to be given parameters (addresses) to bind to. This is better than the internal apps binding to a fixed address and us having to remap it. We can use environment variables to achieve this just like anything else. So in effect both my own address and the addresses of my deps are handed down to me. @ramwan can you weigh in here?

The internal address, however, doesn't matter because we can always remap. But because we don't know (from the orchestrator's point of view) what port or interface they bound to, we would either need to discover that info (via container metadata) or ask the operator in the artifact config. Also, do containers have 127.0.0.1 by default? Note that this means the EXPOSE option would be overwritten if configurable, or it would be used as discoverable metadata to know what ports to remap.
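A purely illustrative sketch of what "pushing the config down" could look like: the orchestrator hands both the automaton's own bind address and its dependencies' addresses down as environment variables at start time. None of the variable names below are from an existing spec.

```nix
# Hypothetical only: the environment an orchestrator might hand down to an
# automaton at start time. All variable names are illustrative.
{
  env = [
    "SELF_HTTP_LISTEN_ADDRESS=0.0.0.0:8080"  # where this automaton's app should bind
    "DEP_DB_ADDRESS=10.4.0.12:5432"          # address of a dependency, resolved by the orchestrator
    "DEP_CACHE_ADDRESS=10.4.0.13:6379"       # another dependency's address
  ];
}
```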
Regarding uids/gids, processes inside containers should just run as root:root. This may become a security issue later, but for now it should be fine.
@CMCDragonkai |
I think we talked about this before, but DNS is not a good idea due to cache invalidation. We need deeper control over name resolution, so yes, it does need to be dynamic, but the problem is with synchronising dynamic changes atomically or in an eventually consistent manner across the Matrix network.

Some additional thoughts. It is important not to think of DNS in terms of its implementation, but in terms of its core abstraction: a key-value database adding a level of indirection to pointer dereferencing. DNS, through its hierarchical structure and TTLs, is an eventually consistent database with a single source of truth: 1 writer, many readers. The main issue is with change: as we change name resolution due to migration, scaling, redeployment or other things, these changes must propagate to how we resolve names (especially to avoid "recompiling" the entire Matrix network). If our network is small, changing names in a single database is adequate; however, as we scale up, changing names can involve global locks, significant downtime, or significant name-resolution overhead. How to have consistent systems that are also highly available is a classic distributed systems problem. One solution is to consider both time and space partitioning for an eventually consistent system.
Sharing artifacts should be based on Nix graph-based sharing. Container layer sharing is a flawed version of this, but I believe layers should be mappable to the nix store graph. To do this appropriately, we need to consider a content storage system for our artifacts. Let's not reinvent the wheel: the Nix system is a great way of building artifacts. However, this means integrating the

In the Nix system there is a concept of multistage evaluation, due mainly to the requirements of determinism. At the top level we have the Nix expressions, which are a human-readable, Turing-complete language that allows areas of side effects to occur specifically for the convenience of working within a filesystem context. That is, Nix expressions can refer to other entities on the filesystem and even have limited abilities to perform IO, like reading environment variables. The next stage is the derivation, where the expressions are compiled to a limited configuration language that is not Turing complete (ATerm), but ultimately is a graph-like data structure. All reducible expressions should be reduced at this point; there's no further evaluation at the ATerm stage. The key point is that everything is fully specified (any IO that the Nix expression language knows about is fully read and completed). There is a relationship between an in-memory interpretation of a Nix derivation expression and the derivation itself that exists on disk (a kind of hybrid memory-and-disk model of state). Thus the existence of the derivation file seems to be a side effect of evaluating the Nix expression, yet the side effect is transparent to the interpretation, so it's a hidden side effect and everything is still pure. Finally, there is the execution of the derivation that builds a final artifact. While all legitimate derivation expressions will always be compilable to a derivation file, not all derivation files will produce an output store path; the derivation might not work for any reason. So our Artifacts are the actual outputs of any derivation, while the Artifact specification is some sort of translation of the derivation expression or the derivation itself. Note that the transformations are one-way: if you have the derivation expression, you can transform to the derivation, but not the other way around.

Now for the sharing of artifacts. By flattening the hierarchy of key values here, from fixed-output derivations to derivations and outputs, everything from source to remote deps to outputs is put under global content addresses. Note we can easily support other artifact specification formats by piggybacking off Nix; Docker formats are understood via just a conversion like Docker2Nix.
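For reference, here is a minimal sketch of the three stages described above (expression, derivation, built output), using a trivial derivation; the file name and the commands in the comments are illustrative only.

```nix
# Minimal sketch of the multistage evaluation described above.
# Assuming this lives in a file such as artifact.nix (name illustrative):
#
#   nix-instantiate artifact.nix     -> /nix/store/<hash>-hello-artifact.drv   (derivation, ATerm)
#   nix-store --realise <drv path>   -> /nix/store/<hash>-hello-artifact       (built output, the artifact)
#
{ pkgs ? import <nixpkgs> {} }:

# runCommand builds a derivation whose output is whatever the script writes to $out.
pkgs.runCommand "hello-artifact" { } ''
  mkdir -p $out
  echo "hello" > $out/message
''
```

The expression can always be instantiated into a derivation, but realising the derivation may still fail, which matches the one-way, fallible transformations described above.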
We also need readiness/liveness probes included. In my HTTP applications, I generally add an
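This is not an existing API, just a hypothetical sketch of how probes against such an endpoint could be declared in an artifact or automaton spec; all attribute and path names here are made up for illustration.

```nix
# Hypothetical sketch only: probe declarations for an automaton. The attribute
# names and the /healthz and /readyz paths are illustrative, not an existing spec.
{
  probes = {
    liveness = {
      httpGet = { path = "/healthz"; };   # kill and restart the automaton if this stops responding
      periodSeconds    = 10;
      failureThreshold = 3;
    };
    readiness = {
      httpGet = { path = "/readyz"; };    # withhold traffic until this responds successfully
      initialDelaySeconds = 5;
    };
  };
}
```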
So for sharing artifacts using Nix, will the operators be writing Nix expressions to bring in the Docker images using a function created by us, which would be similar to something like
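One existing analogue of such a function is nixpkgs' `dockerTools.pullImage`, which fetches an image by digest into the Nix store. A hedged sketch follows; the digest and hash values are placeholders, not real ones.

```nix
# Hedged sketch: nixpkgs' dockerTools.pullImage as an analogue of the function
# described above. The digest and hash values below are placeholders.
{ pkgs ? import <nixpkgs> {} }:

pkgs.dockerTools.pullImage {
  imageName      = "library/alpine";
  imageDigest    = "sha256:<digest-of-the-image-manifest>";   # placeholder
  sha256         = "<hash-of-the-fetched-image-tarball>";     # placeholder
  finalImageName = "alpine";
  finalImageTag  = "latest";
}
```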
As for whether a container has 127.0.0.1 by default or not, the OCI runtime spec does not mention this topic. If we create a new network namespace with
Content addressing, interface vs instance, class-based OOP, IoC