YAML Enhancement Proposal (YEP) - "Table Style" for expressing a Sequence of Similarly Keyed Mappings #317

Open
TaylorSMarks opened this issue Nov 20, 2023 · 3 comments

@TaylorSMarks

It's fairly common for YAML files to include sequences of several mappings where all the mappings have the same or very similar keys.

Such lists of mappings are tedious to write and a fairly common source of errors.

On projects I work on, I've introduced an extension to the YAML format that adds in a "Table Style" for expressing such sequences.

Before I stumble through trying to explain the syntax, maybe I'll start with some examples:

Without Table Style:

parameters:
  - name: NAME
    value: the-service
  - name: NAMESPACE
    value: the-ns
  - name: OCP_ENV
    required: true
  - name: VERSION
    required: true
  - name: AWS_REGION
    required: true
  - name: AWS_ACCESS_KEY
    required: true
  - name: AWS_SECRET_KEY
    required: true
  - name: DDB_TABLE_PREFIX
    required: true
  - name: VAULT_IS_ENABLED
    value: "true"

Could instead be expressed using "Table Style" as this:

parameters:
  | name             | value       | required |
  | NAME             | the-service |
  | NAMESPACE        | the-ns      |
  | OCP_ENV          |             | true     |
  | VERSION          |             | true     |
  | AWS_REGION       |             | true     |
  | AWS_ACCESS_KEY   |             | true     |
  | AWS_SECRET_KEY   |             | true     |
  | DDB_TABLE_PREFIX |             | true     |
  | VAULT_IS_ENABLED | "true"      |

Obviously, it's a lot more compact, but I think it's also more legible. In the version without Table Style, how quickly did you notice that VAULT_IS_ENABLED receives the string "true" as its value rather than having the boolean required set? This is a common issue I run into right now: not noticing that the key (and maybe the type) has changed.
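To make the correspondence between the two forms concrete, here is a minimal Python sketch that renders a list of flat mappings as Table Style lines. It is only an illustration, not the extension's implementation: the function name `to_table_style` is hypothetical, cell values are assumed to already be YAML source text (so the string is passed as `'"true"'` and the boolean as `'true'`), and nested or compound keys are not handled.

```python
def to_table_style(rows, keys, indent="  "):
    """Render flat mappings as Table Style lines (header line first).

    Assumption: every value is already YAML source text, e.g. '"true"'
    for the string and 'true' for the boolean. Missing keys become
    empty cells.
    """
    body = [[str(row.get(key, "")) for key in keys] for row in rows]
    # Each column is as wide as its longest cell, header included.
    widths = [
        max([len(key)] + [len(cells[i]) for cells in body])
        for i, key in enumerate(keys)
    ]

    def render(cells):
        # Drop trailing empty cells, as the examples do for short rows.
        while len(cells) > 1 and cells[-1] == "":
            cells = cells[:-1]
        padded = [cell.ljust(width) for cell, width in zip(cells, widths)]
        return indent + "| " + " | ".join(padded) + " |"

    return [render(list(keys))] + [render(cells) for cells in body]


parameters = [
    {"name": "NAME", "value": "the-service"},
    {"name": "OCP_ENV", "required": "true"},
    {"name": "VAULT_IS_ENABLED", "value": '"true"'},
]
print("\n".join(to_table_style(parameters, ["name", "value", "required"])))
```

Because every value for a given key lands in the same column, a value that sits under the wrong key or carries unexpected quoting is visible at a glance.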

Here's another example - first without:

env:
  - name: AWS_REGION
    value: ${AWS_REGION}
  - name: AWS_ACCESS_KEY
    value: ${AWS_ACCESS_KEY}
  - name: AWS_SECRET_KEY
    valueFrom:
      secretKeyRef:
        name: ${NAME}
        key: AWS_SECRET_KEY
  - name: DDB_TABLE_PREFIX
    value: ${DDB_TABLE_PREFIX}
  - name: VAULT_IS_ENABLED
    value: ${VAULT_IS_ENABLED}

And now with Table Style:

env:
  | name             | value               | valueFrom.secretKeyRef |
  | AWS_REGION       | ${AWS_REGION}       |
  | AWS_ACCESS_KEY   | ${AWS_ACCESS_KEY}   |
  | AWS_SECRET_KEY   |                     | {'name': "${NAME}", 'key': "AWS_SECRET_KEY" } |
  | DDB_TABLE_PREFIX | ${DDB_TABLE_PREFIX} |
  | VAULT_IS_ENABLED | ${VAULT_IS_ENABLED} |

I suppose I'll try formally specifying the rules about this now:

  1. Anywhere you can start a sequence with an indented -, it's also valid to begin a table with an indented |.
  2. All lines of a table have the same indentation and begin and end with |. The table ends when the next line is dedented. Comments and blank lines within tables are acceptable.
  3. Lines within a table are broken down into cells by the character |. These cells are indexed from left to right.
  4. The first line of a table (the header line) defines the keys. Each remaining line of the table is a mapping within a sequence.
  5. Each cell of the header line contains either a simple key (name, value, and required in the examples above) or a compound key (valueFrom.secretKeyRef in the second example above).
  6. Cells within a mapping line contain either nothing (whitespace), a Scalar, or a Flow Sequence or Flow Mapping.
  7. Mapping lines do not need to contain as many cells as the header line, but they may not contain more.
  8. If a mapping line has fewer cells than the header line, the mapping lacks key/values for the trailing keys in the header line.
  9. If a cell within a mapping line contains nothing but whitespace, the mapping lacks a key/value for the key in the header line at the same index as that cell.
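
The rules above can be sketched in a few lines of Python. This is a hedged illustration rather than the extension's actual parser: the names `split_cells`, `set_path`, and `parse_table` are hypothetical, the caller is assumed to pass only the lines that belong to one table (so indentation and the dedent that ends the table are not checked), cells are split naively on every `|` (a `|` inside a quoted scalar or flow collection would break it), and cell contents are kept as raw text unless a `parse_cell` callback is supplied.

```python
def split_cells(line):
    """Split '| a | b |' into ['a', 'b'] (rules 2 and 3)."""
    text = line.strip()
    if len(text) < 2 or not (text.startswith("|") and text.endswith("|")):
        raise ValueError(f"table line must begin and end with '|': {line!r}")
    return [cell.strip() for cell in text[1:-1].split("|")]


def set_path(mapping, key, value):
    """Store value under a simple key or a dotted compound key (rule 5)."""
    *parents, leaf = key.split(".")
    for part in parents:
        mapping = mapping.setdefault(part, {})
    mapping[leaf] = value


def parse_table(lines, parse_cell=lambda text: text):
    """Turn the lines of one table into a list of mappings."""
    keys = None
    rows = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # rule 2: blank lines and comments are acceptable
        cells = split_cells(line)
        if keys is None:
            keys = cells  # rule 4: the first line defines the keys
            continue
        if len(cells) > len(keys):
            raise ValueError(f"more cells than header keys: {line!r}")  # rule 7
        row = {}
        # zip stops at the shorter list, so trailing keys are absent (rule 8).
        for key, cell in zip(keys, cells):
            if cell:  # rule 9: a whitespace-only cell means no key/value
                set_path(row, key, parse_cell(cell))
        rows.append(row)
    return rows


table = [
    "  | name           | value         | valueFrom.secretKeyRef |",
    "  | AWS_REGION     | ${AWS_REGION} |",
    "  | AWS_SECRET_KEY |               | {'name': \"${NAME}\"} |",
]
for row in parse_table(table):
    print(row)
```

A real implementation would hand each non-empty cell to the YAML scalar and flow-collection parser instead of keeping the raw text; the `parse_cell` parameter marks where that would plug in.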

How was that for rules? I wasn't sure I could express them clearly enough in English. Do people understand? What ambiguity is there? Is there an existing complete BNF for YAML? I searched through the existing spec, and although it mentions "bnf" a few times, I'm not sure it fully lays it out anywhere. If there is one, I can update it to include "Table Style".

It might be worth looking at the goals of YAML:

✅ "YAML should be easily readable by humans." I think that was established above already.
✅ "YAML data should be portable between programming languages." That can be done by updating the spec...
✅ "YAML should match the native data structures of dynamic languages." - I've seen this table syntax used quite a bit in Gherkin feature files and the Karate test framework. It's also in Markdown.
✅ "YAML should have a consistent model to support generic tools." I don't think my proposal touches this...
✅ "YAML should support one-pass processing." Later lines do not change how earlier lines are interpreted, so I believe we're good here.
✅ "YAML should be expressive and extensible." I don't think this proposal removes much potential for future extensions...
✅ "YAML should be easy to implement and use." Adding Table Style shouldn't be any more difficult than existing features of YAML.

@Thom1729
Collaborator

> Is there an existing complete BNF for YAML? I searched through the existing spec, and although it mentions "bnf" a few times, I'm not sure it fully lays it out anywhere.

In the YAML 1.2 spec, the grammar is not consolidated in one place but distributed across various sections. You can copy the combined text of all production definitions by running the following in the browser console with the spec open: copy($$('pre.rule').map(x => x.textContent).join('\n\n')).
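
For anyone who would rather do this outside the browser, here is a rough Python equivalent using only the standard library. It assumes you have the spec's HTML as a string (for example, from a saved copy of the page) and that the productions live in `pre` elements with the class `rule`, as the console selector above implies. The names `RuleCollector` and `combined_grammar` are made up for this sketch.

```python
from html.parser import HTMLParser


class RuleCollector(HTMLParser):
    """Collect the text content of every <pre class="rule"> element."""

    def __init__(self):
        super().__init__()
        self.rules = []
        self._in_rule = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "pre" and "rule" in classes:
            self._in_rule = True
            self._parts = []

    def handle_endtag(self, tag):
        if tag == "pre" and self._in_rule:
            self.rules.append("".join(self._parts))
            self._in_rule = False

    def handle_data(self, data):
        # Text inside nested tags (links, spans) is collected as well.
        if self._in_rule:
            self._parts.append(data)


def combined_grammar(html):
    """Join all production texts with blank lines, like the console snippet."""
    collector = RuleCollector()
    collector.feed(html)
    collector.close()
    return "\n\n".join(collector.rules)


sample = (
    '<p>intro</p>'
    '<pre class="rule">[1] a ::= <a href="#b">b</a></pre>'
    '<pre>not a production</pre>'
    '<pre class="rule">[2] b ::= "x"</pre>'
)
print(combined_grammar(sample))
```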

@UnePierre

I like the idea.

So far, I've formatted table-like data like this:

  env:
    - { name: AWS_REGION,       value: "${AWS_REGION}" }
    - { name: AWS_ACCESS_KEY,   value: "${AWS_ACCESS_KEY}" }
    - { name: AWS_SECRET_KEY,                                valueFrom: { secretKeyRef: { name: "${NAME}", key: AWS_SECRET_KEY } } }
    - { name: DDB_TABLE_PREFIX, value: "${DDB_TABLE_PREFIX}" }
    - { name: VAULT_IS_ENABLED, value: "${VAULT_IS_ENABLED}" }

... valid syntax and also kind of readable, yet redundant in the keys.

And I agree: some humans are well-experienced with spreadsheets.

Additional feature

Maybe make the header line somewhat more distinguishable from the other rows?
I really like the AsciiDoc approach to tables.
Applied to your example, that would be like:

env:
  | name             | value               | valueFrom.secretKeyRef 
  
  | AWS_REGION       | ${AWS_REGION}       |
  | AWS_ACCESS_KEY   | ${AWS_ACCESS_KEY}   |
  | AWS_SECRET_KEY   |                     | {'name': "${NAME}", 'key': "AWS_SECRET_KEY" } 
  | DDB_TABLE_PREFIX | ${DDB_TABLE_PREFIX} |

  | VAULT_IS_ENABLED 
  | ${VAULT_IS_ENABLED} 
  |

... and this is also an example of allowing "one line = one row" and "multiple lines separated by blank lines = one row" side by side.

@ashemedai

A similar idea was raised in another of YAML's repos and a solution was offered to do this with existing syntax.


4 participants