Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Document why we don't use maps in the API in api-conventions.md #2004

Closed
bgrant0607 opened this Issue Oct 27, 2014 · 7 comments

Comments

Projects
None yet
2 participants
Owner

bgrant0607 commented Oct 27, 2014

Forked from #1980.

The API has a number of lists containing names embedded in objects, such as Volumes, Containers, Ports, Env, and VolumeMounts. Both configuration and field references (e.g., in filter expressions, events) are uglier and/or more verbose when using lists rather than maps. This puts us at a disadvantage compared to other systems with more elegant API and/or configuration schemas (e.g., Fig).

Automatically translating maps into lists of named objects appears to be hard. In JSON and YAML, structures and maps cannot be distinguished without a schema. It looks like we'd need either duplicate fields, duplicate schemas, or a custom parser in order to support both formats in the same API version. Duplicate schemas have proven hard to maintain, both in Kubernetes and internally. I don't think we want to maintain parallel API versions forever, either. The go-yaml parser is around 9k lines of code -- a custom parser is not something we want to own, IMO.

The specific proposal here is to:

  • change these lists to maps
  • make these name fields optional and auto-populate them from map keys in apiserver, so that names are available in subobjects even without the maps to which they belong

Port name is currently optional. It would effectively be required. A convention of "p" would be straightforward for users/tools, however, such as in the case of ports auto-populated from Docker images (e.g., by podex).

This would be a breaking change. We could do it in v1beta3.

The names of top-level objects, in ObjectMeta in v1beta3, would not be changed. Those can be auto-populated in v1beta3 by clients in a straightforward manner.

/cc @smarterclayton @erictune @proppy @thockin @jbeda

Contributor

jbeda commented Oct 27, 2014

We've gone back an forth on this in the past in various config discussions. Here is the last time I remember on k8s: GoogleCloudPlatform#853 (comment)

The crux of this problem is that it isn't clear to the user what "left hand side strings" are "magic keywords" in the config system/API vs. which are user data.

Hoisting the example from that other bug -- compare these two:

Example A:

ports:
  - name: www
    hostPort: 80
    containerPort: 80
    protocol: tcp

Example B:

ports:
  www:
    hostPort: 80
    containerPort: 80
    protocol: tcp

While B is obviously shorter than A, I think it is more confusing for the novice user. When copy/pasting examples or reading unfamiliar configs, the novice user won't know what www is. Is this a magic value that they aren't supposed to change (like ports) or is it an input/naming thing that they should change?

If we follow your suggestion and a name into the sub object It'll be even more confusing:

ports:
  www:
    name: www
    hostPort: 80
    containerPort: 80
    protocol: tcp

Questions users'll be asking: why is www there twice? What happens if I change one but not the other?

Owner

bgrant0607 commented Oct 27, 2014

@jbeda Thanks for pointing me at the previous discussion. I remembered that it had come up before but couldn't remember where.

I see your point regarding distinguishing schema keys and user names. However, we don't have enough user data to be able to really know which they'd prefer.

Also, maps are used in competing solutions, such as Fig:

Admittedly, Fig does this only for their top-level objects -- containers in their case, and I do the same in my configuration generators. Although, they do accept both maps and lists for environment variables. I'll look at how they do this (note: Fig is python, not Go).

Another alternative could be to make all subobject names optional. There are no subobject names in Docker's API, nor in Fig. However, one reason we have names for subobjects is that we've added a parent object around containers and volumes.

With respect to the example above: Mainly subobject names would be needed internally rather than in serialized form. We could potentially omit them from serialized form by specifying "-" so that the fields are ignored during marshaling/unmarshaling.

Contributor

jbeda commented Oct 27, 2014

We should work to reduce the amount of boiler plate, for sure, but I'm not sure that the API is the place to do it. If we want to have a more concise config language/schema -- go for it. There is room in the world to allow for different trade offs.

There are other things we want to avoid here -- significantly, each key should have one and only one form for what it accepts -- this let's us have a strongly typed schema instead of forcing us to interpret the yaml parse tree with custom code.

Owner

bgrant0607 commented Oct 27, 2014

@bgrant0607 bgrant0607 changed the title from Consider converting all subobject lists to maps in API to Document why we don't use maps in the API in api-conventions.md Oct 31, 2014

Owner

bgrant0607 commented Oct 31, 2014

Abandoning this idea. Converting it to a doc bug to document the reason for the way the API is.

@bgrant0607 bgrant0607 self-assigned this Nov 6, 2014

@brendandburns brendandburns added a commit that referenced this issue Nov 17, 2014

@brendandburns brendandburns Merge pull request #2423 from bgrant0607/docfix
Documentation improvements. Fixes #2004, #2115, #2171.
6fa798c

@bgrant0607 bgrant0607 reopened this Feb 26, 2015

Owner

bgrant0607 commented Feb 26, 2015

@ghodss has pointed out that lists do not allow generic merging for configuration updates.

Owner

bgrant0607 commented Feb 27, 2015

Reclosing in favor of #4889.

@bgrant0607 bgrant0607 closed this Feb 27, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment