## Configuration file formats

* INI
* XML
* JSON
* YAML
* TOML
* ...


### Common issues

Any reasonable configuration file format and structure is manageable at a relatively small scale.
Issues become apparent, however, as soon as one has to govern __many config files__ for many applications across __many environments__.

Usually, the bottom line is: a common structure (schema) of the config file exists, but it needs to be populated with differing values.
The schema is usually tough to enforce explicitly, and it is even harder to tell whether a value may safely be changed to another one.

Manual maintenance is no longer feasible.


A great, relatively small-scale example is Kubernetes deployment manifests. Generating them for one environment is troublesome; doing that for many environments quickly becomes a difficult task.
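
The situation described in this section can be sketched in a few lines of Python. This is only an illustration under assumed names: `BASE`, `ENVIRONMENTS`, `render`, and the field names are hypothetical and not taken from any real deployment. It shows a shared structure populated with per-environment values, and how a misspelled key slips through because nothing enforces the schema.

```python
import json

# Hypothetical shared structure for one service (field names are illustrative).
BASE = {"service": "api", "replicas": 1, "log_level": "info"}

# Per-environment values that populate the shared structure.
ENVIRONMENTS = {
    "dev": {"log_level": "debug"},
    "staging": {"replicas": 2},
    "prod": {"replcas": 5},  # typo: nothing enforces the schema
}


def render(env: str) -> dict:
    """Shallow-merge the environment values over the shared base."""
    return {**BASE, **ENVIRONMENTS[env]}


configs = {env: render(env) for env in ENVIRONMENTS}
print(json.dumps(configs["prod"], indent=2))
```

The `prod` config is produced without any complaint: it keeps the default `replicas` value and carries an extra, meaningless `replcas` key. With three environments this is easy to spot by eye; with dozens of applications and environments it is not.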

### Common solutions

1. Templating
   * Jinja2, Helm charts, ...
2. Schemas
   * JSON Schema, Python marshmallow, ...
3. Generation from {some_format} to {config_format}
   * Jsonnet, ksonnet, Skycfg
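
To make the first two categories concrete, here is a minimal sketch that uses only the Python standard library. `string.Template` stands in for a Jinja2 or Helm-style template and a plain dictionary of expected types stands in for a JSON Schema or marshmallow schema; neither is the real tool, and all names (`TEMPLATE`, `SCHEMA`, `render`, `violations`) are assumptions made for illustration.

```python
import json
from string import Template

# Stand-in for a Jinja2/Helm-style template: plain text substitution.
TEMPLATE = Template('{"service": "$name", "replicas": $replicas}')

# Stand-in for a JSON Schema / marshmallow schema: field name -> expected type.
SCHEMA = {"service": str, "replicas": int}


def render(values: dict) -> dict:
    """Fill the template and parse the resulting JSON text."""
    return json.loads(TEMPLATE.substitute(values))


def violations(config: dict) -> list:
    """Return human-readable schema violations (empty list if valid)."""
    problems = []
    for field, expected in SCHEMA.items():
        if field not in config:
            problems.append(f"missing field: {field}")
        elif not isinstance(config[field], expected):
            problems.append(f"{field}: expected {expected.__name__}")
    return problems


good = render({"name": "api", "replicas": 3})
# The template happily emits a string where a number is expected.
bad = render({"name": "api", "replicas": '"three"'})

print(violations(good))
print(violations(bad))
```

The template knows nothing about types, so the mistake only surfaces if a separate schema exists and is actually applied. The two artifacts live apart and have to be kept in sync by hand.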


## Problem statement

Broadly speaking, the solutions above come bundled with issues that tend to overshadow any gains:

* Frequently, a solution "extends" an existing standard, which comes at the cost of carrying over its drawbacks
* Refactoring is ridden with fear (no type safety)
* Synchronizing schemas and generators/templates is hard and error-prone
* Inheritance order matters - why?
* Turing-completeness - why?
* Undefined behaviours
* Undefined, null, error ...
* Side-effects
* High cognitive load
* ...
* and anyone can add more examples :)
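
The "inheritance order matters" point from the list above can be illustrated with a small, hedged sketch. The `overlay` function below is a generic "last writer wins" recursive merge written for this example; it is not the merge algorithm of any particular tool, and the `base`, `team`, and `prod` layers are hypothetical.

```python
from functools import reduce


def overlay(base: dict, patch: dict) -> dict:
    """Recursively merge patch over base; on conflict the later value wins."""
    merged = dict(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = overlay(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical layers: shared defaults, a team-level overlay, a prod overlay.
base = {"resources": {"cpu": "100m", "memory": "128Mi"}}
team = {"resources": {"cpu": "500m"}}
prod = {"resources": {"cpu": "2", "memory": "1Gi"}}

team_then_prod = reduce(overlay, [base, team, prod])
prod_then_team = reduce(overlay, [base, prod, team])

print(team_then_prod)
print(prod_then_team)
```

The same three layers yield a different `cpu` value depending on the order in which they are applied, and neither result signals that two layers disagreed. A reader has to replay the whole chain to find out where a value came from.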


### Cherry-picked example

Does this \[__Jsonnet__\]:
```jsonnet
{
  person1: {
    name: "Alice",
    welcome: "Hello " + self.name + "!",
  },
  person2: self.person1 { name: "Bob" },
}
```

really improve on this \[__JSON__\]:
```json
{
  "person1": {
    "name": "Alice",
    "welcome": "Hello Alice!"
  },
  "person2": {
    "name": "Bob",
    "welcome": "Hello Bob!"
  }
}
```

* Same syntax issues
* Still typo-prone
* Writing a schema is manual labor
* "_Embedding_" Python-esque statements does not make the config any more readable

## Data constraint languages

* (sort-of) Nix expression language
* Dhall
* CUE

### Why DCL/CUE?

"Jsonnet is based on BCL, an internal language at Google. It fixes a few things relative to BCL, but is mostly the same. This means it copies the biggest mistakes of BCL. Even though BCL is still widely used at Google, its issues are clear. It was just that the alternatives weren't that much better.

There are a myriad of issues with BCL (and Jsonnet and pretty much all of its descendants), but I will mention a couple:

* Most notably, the basic operation of composition of BCL/Jsonnet, inheritance, is not commutative and idempotent in the general case. In other words, order matters. This makes it, for humans, hard to track where values are coming from. But also, it makes it very complicated, if not impossible, to do any kind of automation. The complexity of inheritance is compounded by the fact that values can enter an object from one of several directions (super, overlay, etc.), and the order in which this happens matters. The basic operation of CUE is commutative, associative and idempotent. This order independence helps both humans and machines. The resulting model is much less complex.

* Typing: most of the BCL offshoots do not allow for schema definitions. This makes it hard to detect any kind of typos or user errors. For a large code bases, no one will question a requirement to have a compiled/typed language. Why should we not require the same kind of rigor for data? Some offshoots of BCL internal to Google and also external have tried to address this a bit, but none quite satisfactory. In CUE types and values are the same thing. This makes things both easier than schema-based languages (less concepts to learn), but also more powerful. It allows for intuitive but also precise typing.

* There are many other issues, like handling cycles, unprincipled workarounds for hermeticity, poor tooling and so forth that make BCL and offsprings often awkward.

So why CUE? Configuration is still largely an unsolved problem. We have tried using code to generate configs, or hybrid languages, but that often results in a mess. Using generators on databases doesn't allow keeping it sync with revision control. Simpler approaches like HCL and Kustomize recognize the complexity issue by removing a lot of it, but then sometimes become too weak, and actually also reintroduce some of this complexity with overlays (a poor man's inheritance, if you will, but with some of the same negative consequences). Other forms of removing complexity, for instance by just introducing simpler forms/ abstraction layers of configuration, may work within certain context but are domain-specific and relatively hard to maintain.

So inheritance-based languages, for all its flaws, were the best we had. The idea behind CUE is to recognize that a declarative language is the best approach for many (not all) configuration problems, but to tackle the fundamental issues of these languages." \[1\]

---
\[1\] - https://github.com/cuelang/cue/issues/33#issuecomment-483615374

## Top CUE advantages

* Typing
* Types are values
* Value lattice (the order in which values are combined does not matter)
* Configuration, validation, generation ... all in one package
* Can both integrate with external tools and import from them
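
The "types are values" and order-independence points can be sketched with a toy model in Python. This is not CUE and not how CUE is implemented: the `unify` function below is a hypothetical, heavily simplified stand-in that treats Python types (`str`, `int`) as constraints and concrete values as their most specific form. Real CUE has a much richer lattice (bounds, disjunctions, defaults, and more).

```python
def unify(a, b):
    """Toy unification: combine two constraints, or fail on conflict."""
    # Structs are unified field by field over the union of their keys.
    if isinstance(a, dict) and isinstance(b, dict):
        result = {}
        for key in a.keys() | b.keys():
            if key in a and key in b:
                result[key] = unify(a[key], b[key])
            else:
                result[key] = a[key] if key in a else b[key]
        return result
    # Identical constraints unify to themselves (idempotence).
    if a == b:
        return a
    # A type unified with a concrete value of that type yields the value.
    if isinstance(a, type) and not isinstance(b, (type, dict)) and isinstance(b, a):
        return b
    if isinstance(b, type) and not isinstance(a, (type, dict)) and isinstance(a, b):
        return a
    # Anything else is a conflict; no side silently wins.
    raise ValueError(f"conflicting values: {a!r} and {b!r}")


# Hypothetical schema, defaults, and application config.
schema = {"name": str, "replicas": int}
defaults = {"replicas": 3}
app = {"name": "api"}

one_order = unify(unify(schema, defaults), app)
other_order = unify(app, unify(defaults, schema))
print(one_order == other_order)
```

In this toy model the schema and the data are combined by the same operation, the result does not depend on the order of combination, and a conflicting value such as `{"replicas": "three"}` raises an error instead of being silently overridden.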
