Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attribute API 2 #77

Closed
wants to merge 5 commits into from
Closed
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
176 changes: 176 additions & 0 deletions new/rfcXXX-new-attributes-api.md
@@ -0,0 +1,176 @@
---
RFC: unassigned
Author: Daniel DeLeo <dan@chef.io>
Status: Draft
Type: Standards Track
---

# New Attributes API

This RFC proposes a new API for setting and accessing attributes in
cookbooks. The goals of this new API are:

* Separate desired and observed state.
* Structure the syntax in a way that "trainwreck" errors
(`NoMethodError` for `[]` on `nil`) can be avoided.
* Structure the syntax in a way that attribute tracing can be
implemented simply.
* Separate node-specific persistent storage from
cookbook/role/environment attributes.
* Structure the syntax in a way that can be implemented efficiently.

The new API is added alongside the existing API, so existing code will
work without modification.

## Motivation

There are several motivations for this change.

### Separate Desired and Observed State

In the current attributes system, system "facts" determined by Ohai are
intermixed with user attributes, which are generally used to define the
desired state of the system. For example, `node[:fqdn]` gives the
system's FQDN, as determined by Ohai at the beginning of the Chef run,
and identical syntax is used to reference a user-defined attribute, such
as `node[:mysql][:version]`. As a result, it's not possible to know
which attributes are "facts" except by committing to memory the list of
attributes provided by Ohai. Defining a new syntax where Ohai and
user-defined attributes are accessed by different methods makes cookbook
code clearer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there going to be some way of reading the merged attributes? Like merged_attributes[:fqdn]? Otherwise in my recipe I still have to remember which attributes are set in the system hash and which are set in the attributes hash.

EG, if I want to have some branching logic in my cookbook based on the state of a user, I have to remember whether I query that information from the system attributes or my user cookbook attributes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doing node.override[:fqdn] = 'blah' doesn't actually work today since automatic supersedes everything. pulling the system attributes out so that they don't get union'd with the other node attributes makes sense and will avoid people trying to go down that road.

i implemented a little CMDB on top of my node attrs at Rhapsody and had a one-on-one with adam about 4 years ago now where we talked about that and he directed me to using something like node.default[:cmdb][:serial_num] to be an attribute which I would treat as being higher precedence over the dmi serial_num in the automatic attributes since some of them were '0000000000' and others were '1234567890' (companies that either didn't take the time to burn their serial numbers into DMI or on RMA'd boards). that's the correct design pattern to use, and this makes it a bit more explicit that automatic attrs come from ohai and are consumed and you don't think about mutating them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the argument we are making is that you should have to know whether the data you're using is from Ohai (which implies it was observed and true at the beginning of the chef run), or from user-defined attributes (which are aspirational, and reflect a thing that chef may have made true or might make true in the future). In addition to semantic differences, there are the behavior differences Lamont points out, e.g., the Ohai data is intended to be read-only, which is currently implemented by giving it the highest precedence.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I didn't know that we were making the user explicitly know which attributes come from Ohai. In that case, I totally understand keeping it separate for writing and reading.

Additionally, separating Ohai data from user-defined attributes
eliminates the possibility of attribute name collisions. For example, a
user might wish to set the FQDN of a host with Chef and use the `fqdn`
attribute to define the desired FQDN, or create an Ohai plugin to detect
the currently installed version of MySQL, storing this as
`node[:mysql][:version]`. With the current attributes system, the user
can't use these attribute names because they conflict with existing Ohai
or cookbook attributes.

### Eliminate Trainwreck Errors

The current attribute system presents node data as a nested collection
of Hash-like objects. There are some unfortunate limitations of this
model that lead to frustrating behavior for users. In particular, code
like this:

```ruby
node[:foo][:bar][:baz]
```

May raise a `NoMethodError` with the message:

```
undefined method `[]' for nil:NilClass
```

Note that in this example, it's ambiguous as to which part of the
attribute structure is missing, as this error will occur in an identical
way if either `node[:foo]` or `node[:foo][:bar]` are `nil`.

The UX of this error case cannot be significantly improved without
abandoning the nested Hash model. If an object other than `nil` was
returned for accessing a missing key, then code like
`do_something if node[:some_attribute_present]` would break (Ruby does
not allow custom false-y types), and monkey patching `nil` is
error-prone because `nil` is an immediate object (there is only one
instance).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another major issue with the method-missing syntax:

node.foo   # returns node['foo']
node.class # returns the Chef::Node symbol

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now addressed in the "Method Missing Method Name Collisions" section.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to call this out as the reason why method chaining syntax MUST NOT be used. I actually misunderstood the problem on first reading, and I don't think its obvious to the casual reader exactly what the implications are to the API design. Any kind of node['a']['b']['c'] or node.a.b.c kind of method chaining will recreate this problem and therefore CANNOT be part of the new design.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't want to clutter this section too much, but I guess that kind of thing is what the rationale section is for.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, added a discussion in the rationale section.


### Enable Attribute Tracing

A common problem when developing cookbook code is attributes containing
unexpected values, which can be frustrating to debug. The current
implementation of attributes makes implementation of a tracing feature
difficult; in particular, it requires that attribute objects have
circular references to their parent objects so that the full attribute
path is known when setting a new attribute value. It is preferable to
avoid circular references because code that blindly walks the object
graph (such as a string inspection method) may enter an infinite loop if
it does not correctly guard against this case. Additionally, edge cases
such as "reparenting" a part of the node tree may be difficult to
implement correctly.

Providing an attributes API where the full attribute path is known when
new attributes are set would greatly ease the implementation of the
tracing feature.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So does this mean I can no longer do something like:

mysql = attr('mysql')
mysql['version'] = '1.2'

What would attr('mysql', 'version') return then? I only set '1.2' on the hash returned from the node, so my node object wasn't actually updated, correct?

If so I think an example like this could be added. Or a line along the lines of

The object returned from the attr call will not be connected to the node. IE, updating them will not update the node's persistent state.

Or if attr is going to return a Hash/Mash, that could just be detailed when the syntax is detailed below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that kind of hash-like syntax would be something we'd do, and I think also we'll try to make set and get go through different method calls. Again, the options I managed to think of for the syntax are here: https://gist.github.com/danielsdeleo/b942599fffbc9c9df407 but I'm sure there's a few options that didn't occur to me yet.

In addition, the details of the underlying storage could change; for example, we could implement a "flat" key value store, but have the keys be arrays, so under the covers you have:

["a", "b"] => "foo",
["a", "c"] => "bar"

That said I haven't put much thought yet into whether we should do that or just map things back to the nested hashes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth keeping RFC 18 in mind too, for the simple cases. node['foo.bar'] could be handled nicely (or node.attr('foo.bar')) but you can still drop down to a lower API that could give you back nils if you want it.

### Separate Node-Specific Persistent Storage from Other Attributes

So-called "normal attributes" in the current attributes API have a
unique behavior compared to other attributes in that they are persisted
between Chef Client runs. While this behavior is useful and necessary
for some use cases, it causes some frustrations for users:

* learning chef attributes is more complicated, since documentation and
recommendations are more complicated to understand.
* "Phantom" attributes. Users remove an attribute, or modify a
complicated data structure, but their changes are mysteriously not
applied because the normal attributes from a previous run are not
cleared.
* Recipes that generate node data that needs to be persisted must call
`node.save` manually, to ensure that persisted data is not lost in the
event that the chef-client run subsequently fails. This causes issues
with search data consistency, increased server load, and other
problems.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should probably be clear here that the proposal is to include a new storage 'thing' on the server that will be used for persistence which is not the node object.

For back-compat we'll support normal attributes in the existing node object, but should emit a deprecation warning on those -- people that have their mysql passwords stored as normal attributes (which they should NOT of course) will still continue to be able to do that.

The run_list and tags would presumably be moved over to the new storage location, which could conceivably be a breaking change (I don't immediately see how you keep the two storage locations in sync in order to mirror the normal attributes).

What happens to tools like 'knife node edit' where people expect be able to edit their run_list in their editor?

Also should probably specify if normal attributes will be supported in the new API and emit a deprecation or if people will be forced to not use them in the new API?

Since this storage location stores the run_list then it pretty much has to be indexed and be 'public' since people (and nodes) want to search on that and see it.

It would be nice to have another storage location which was similarly persistent and node-local that had default perms of r/w only for the individual node and for admins which was also NOT indexed. If we're going to add a new storage location, I'd really like to add this feature since it'll solve the problem of "i want to write creds to this machine and not have an attacker on a different box have access to those creds AND I trust my Chef Server isn't broken into and can do AAA". I think it would be easier to for users to deploy than Chef Vault and hits a reasonable security bar since most sites will just assume they're toast if their Chef Server is busted into.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the replacement for "normal" attributes might be an RFC of its own. For the first step of this RFC, I'd be happy just to have a get/set API for attributes that doesn't read from normal attributes, and provide a facade over the current normal attributes that uses the existing node object storage. From there, you can add a new node-specific storage system, which ideally should enable orchestration primitives (compare and swap, etc.) and address your authorization use cases.

As for compatibility concerns, I think it's okay for things that used to work, like knife node edit to no longer function as they did before, as long as this only happens after you've opted in to the new thing, though of course we should try to minimize the disruption as much as we can.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay sounds good to me.


### Enable More Efficient Implementation

The current implementation has poor performance when attributes are
mutated and then read frequently. Chef 12 introduces a partial cache to
mitigate the problem, but this doesn't completely solve the performance
issues in certain degenerate cases. Due to the limitations of the
existing API, it is difficult to improve the performance further. The
proposed API should allow for straightforward implementation of a
granular merged attribute cache.

### Method Missing Method Name Collisions

The current attributes system allows attributes to be accessed via a
`method_missing` API, e.g., `node["fqdn"]` can also be accessed as
`node.fqdn`. While this makes the code appear more readable,
implementing it directly on the `node` object leads to method name
collisions which can be confusing for users. For example:

```
node.foo # returns node['foo']

node.chef_environment # returns the environment, which is an ivar on the
# node object

node.class # returns the Chef::Node symbol
```

Providing a new API where attribute access occurs via arguments to
statically defined methods eliminates this source of ambiguity.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also any new public method on the node object either needs to be named something silly or is potentially a breaking change if someone happened to have a top level node attribute named whatever the method is. Which is really stating that we're likely to release breaking changes to the API in minor revisions (or else have to do completely convoluted design to try to avoid introducing breaking changes).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feature flags could cope with this for pre-release-y stuff:

def attr
  if Chef::Config[:new_attributes]
    awesome_stuff
  else
    method_missing(:attr)
  end
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feature flags will be good in the beginning if we have to experiment with method names or just want to limit the initial release to the most adventurous users, but eventually we'd like to have both APIs on by default. I think we'll have to just accept the small compatibility break. The scope will be quite small since method missing attribute users are a minority, and those using the same method names we're going to use should be a very small fraction. That said, we can do:

def attr
  if Chef::Config[:new_attributes]
    awesome_stuff
  else
    Chef::Log.warn("This might break in the future, use node['attr'] instead")
    method_missing(:attr)
  end
end

And when we flip the default for Chef::Config[:new_attributes] to true, we still give you the option to set it to false.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point feature flags have to be turned on by default so you take a flag day where you break people. As the node object accrues more methods you break more possible attributes, and then you always have this warning and deprecation period in order to roll out any new method. And the likelihoood, particularly with many newer contributors coming on board is that breaking change to the node object get accepted and are shipped, then cause a regression then need to be rolled back and deprecations and feature flags added, then 6-12 months later we get to turn those off, then finally we get to turn the feature on in the next major release. That sucks to just be able to add a method to an object. The solution is to not have magical method_missing DSLs on objects where users can introduce arbitrary methods. The method_missing trick should be restricted to mostly wrapping other objects that have defined interfaces themselves but where you don't know the methods and can't do restricted delegation. The resource DSL and the resources attributes themselves are an exception to that since that is the way they need to behave, but that really needs to be done in a better cleanroom where the minimum amount of collision occurs and those cleanroom objects are never extended. We need to stop hanging arbitrary method chains off of general purpose objects like the node object, its horrible software design and feature flags don't do anything to make the developer's life any better, its just adding process in the way of fixing bugs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to stop hanging arbitrary method chains off of general purpose objects like the node object ...

Yep. But we are where we are right now, so how do we get where we want to go? All the syntaxes I've proposed either have no method missing magic or restrict it to a clearly defined scope (the chef sugar option). However, to implement them, we have to add some methods to Chef::Node, which means we either break people outright or we give them warning and/or an escape hatch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree in general, node attributes seem like they should be connected to the node object. Trying to pull that into a more isolated concept seems like more than just an API change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attributes still are "connected" to the node object depending on how the syntax discussion works out (though even the delegation options just imply that attribute_access_method is a shortcut for node.attribute_access_method). I'm just saying the new api won't have the problems that are caused by implementing attribute access via method_missing directly on Chef::Node because you always call node.attribute_access_method and not node.some_method_that_might_be_an_attribute_or_might_not.


## Specification

### Ohai Data Access

TBD

### Default and Override Attribute Set and Access

TBD

### Node-Specific Data Storage Set and Access

Syntax is TBD.

Node-specific storage will initially be implemented as a facade on top
of the existing "normal" attributes structure. In the future we may
propose additional enhancements to this API to provide pluggability,
orchestration support, and/or improved authorization behaviors.

## Rationale

This section will be updated once the specification is more complete.

## Copyright

This work is in the public domain. In jurisdictions that do not allow for this,
this work is available under CC0. To the extent possible under law, the person
who associated CC0 with this work has waived all copyright and related or
neighboring rights to this work.