Knox has been in development at Pinterest since early 2015. It is our solution for storing secrets and keys across all of our infrastructure. After we began developing and deploying Knox, several open source solutions were introduced. They all have different trade-offs and use cases, but we continued using Knox because we felt it met our needs. Below we discuss the differences and trade-offs of several similar solutions. All of them are API-based, and some also provide a client.
Keywhiz is a key management solution open sourced by Square in early 2015. Keywhiz has many different components written in a variety of languages.
For machine authentication, Keywhiz uses mTLS, which is not configurable and requires a PKI. User authentication is supported through LDAP or bcrypt-hashed passwords. This is limiting when an alternative authentication method is required.
For authorization, all users can access all keys, and machines can read any key in any group they belong to. Machines can also be configured to have write access.
For storage, Keywhiz requires PostgreSQL or MySQL. The storage layer does not appear to be easily pluggable and appears to support only SQL databases.
The Keywhiz client creates an in-memory FUSE file system so that programs in any language can read secrets easily. Because the file system lives in memory, secrets are protected from threats that involve compromise of the disk, and keys are removed on reboot. From a security perspective this is desirable in most cases. From an availability and usability perspective, FUSE file systems aren’t as well supported as the native file system, and this may cause more server calls than would otherwise be necessary (making failure more likely). The secrets are cached in memory so that they can be used efficiently, as in Knox. Unlike Knox, Keywhiz provides an interface to OS-level permissions on keys to further harden access.
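Because secrets are exposed as files, application code can read one with ordinary file I/O. The following is a minimal sketch under stated assumptions: the mount point `/secrets` and the helper `read_secret` are hypothetical and are not taken from the Keywhiz documentation.

```python
from pathlib import Path

# Hypothetical mount point for the in-memory file system; the real
# location depends on how the client is deployed.
SECRETS_DIR = Path("/secrets")


def read_secret(name: str, secrets_dir: Path = SECRETS_DIR) -> bytes:
    """Read a secret exposed as a file and return its raw bytes."""
    # Reject names that could escape the secrets directory.
    if name in ("", ".", "..") or Path(name).name != name:
        raise ValueError(f"invalid secret name: {name!r}")
    return (secrets_dir / name).read_bytes()
```

Every call goes through the file system, so how often the application reads a secret directly affects how often the FUSE layer is exercised.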
For rotation, Keywhiz implements key versioning by appending a version number to the name of the key. Its documentation does not make clear how to use this feature, and it doesn’t support version statuses.
Vault is a key management solution open sourced by HashiCorp in early 2015. Vault is written in Go and designed to fit into generic data center environments.
For machine authentication, Vault supports appID (shared secret) and mTLS. For user authentication, Vault supports GitHub, LDAP, and username/password. For anything else, it provides a way to add authentication backends by adding them to the commands.go file. This is similar to Knox’s interface, and Vault provides more options out of the box.
For authorization, Vault has the idea of policies that grant access to resources. Policies can be applied to individual machines, individual users, groups of machines, or groups of users, as determined by the authentication mechanism. This type of access control is very expressive, and the path at which a key is placed can automatically determine its access control. This makes adding keys easier in certain situations (such as when many keys are added for the same service at the same time), but it can be confusing for new users who need to figure out which access control rules will apply to their secret. Assuming users understand this, it provides security equivalent to Knox.
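To illustrate how a key's path can determine its access control, here is a simplified model of path-based rules. It is a sketch, not Vault's actual policy engine: the `Rule` type, the `allowed` function, the trailing-`*` prefix convention, and the "most specific rule wins" tie-break are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    path: str                # exact path, or a prefix ending in "*"
    capabilities: frozenset  # e.g. frozenset({"read"})


def matches(rule: Rule, path: str) -> bool:
    """True if the rule covers the path (exact match or prefix match)."""
    if rule.path.endswith("*"):
        return path.startswith(rule.path[:-1])
    return path == rule.path


def allowed(rules, path: str, capability: str) -> bool:
    """Apply the most specific matching rule; deny when nothing matches."""
    candidates = [r for r in rules if matches(r, path)]
    if not candidates:
        return False
    # Assumption: exact rules beat prefix rules, then the longest path wins.
    best = max(candidates, key=lambda r: (not r.path.endswith("*"), len(r.path)))
    return capability in best.capabilities
```

With rules for `secret/payments/*` (read) and `secret/payments/admin/*` (no capabilities), a key written under `secret/payments/` is readable without any further configuration, while one under `secret/payments/admin/` is not. This is the convenience, and also the source of confusion, described above.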
For storage, Vault provides many options out of the box and allows for a lot of flexibility. It has more options than Knox and also provides a way to build other backends, similar to Knox’s key database interface but slightly more complex.
The Vault client implements the basic get and write calls and handles user authentication correctly. The client does not do any caching and assumes the developer’s code respects expiration times before calling the client (or performing another API call). Developers will therefore likely need to implement caching in their own code.
For rotation, Vault leverages client leases and key versioning to force updates. Vault’s form of rotation leaves it unclear what to do if Vault ever has downtime, and developers may want to build a client wrapper that caches keys for availability and speed. Unlike Knox, Vault does not support version statuses for rotation. Developers could obtain similar results by building a wrapper on top of Vault, but this would add complexity and room for error.
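The kind of client wrapper described above might look like the following sketch. It is not part of Vault or any Vault client library: `CachingSecretClient` and the `fetch` callable (assumed to return a value and a lease duration in seconds) are hypothetical stand-ins for whatever API call the real client makes.

```python
import time
from typing import Callable, Dict, Tuple


class CachingSecretClient:
    """Caches secrets for their lease duration and serves the last known
    value if the secret server cannot be reached."""

    def __init__(self, fetch: Callable[[str], Tuple[bytes, float]],
                 clock: Callable[[], float] = time.monotonic):
        # fetch(path) -> (secret_value, lease_duration_seconds); hypothetical.
        self._fetch = fetch
        self._clock = clock
        self._cache: Dict[str, Tuple[bytes, float]] = {}  # path -> (value, expiry)

    def get(self, path: str) -> bytes:
        now = self._clock()
        cached = self._cache.get(path)
        if cached is not None and now < cached[1]:
            return cached[0]  # lease still valid, no server call
        try:
            value, lease_seconds = self._fetch(path)
        except OSError:
            if cached is not None:
                return cached[0]  # server unreachable: serve the stale value
            raise
        self._cache[path] = (value, now + lease_seconds)
        return value
```

Serving a value past its lease is a design choice made here for availability. It is only appropriate when the secret remains valid after the lease expires, which is one example of the complexity and room for error such a wrapper introduces.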
Confidant is a key management solution open sourced by Lyft in late 2015. Confidant is written entirely in Python and built for exclusive use in an AWS environment.
For machine authentication, Confidant leverages KMS and IAM roles to provide signed tokens from AWS that can only be created by a certain IAM role. For user authentication, Confidant leverages Google authentication through a web UI. It does not provide interfaces for plugging in alternatives, but the code is organized such that modifying it to switch them out would not be too difficult. These methods are comparable to Knox’s, although it would likely be slightly more work to plug in a custom authentication scheme.
For machine authorization, each service has a list of key IDs it is able to access. All authenticated users are authorized to access all keys. In Knox, access control occurs at the key level, which allows for finer-grained control. Knox also has different permission sets that allow for more flexibility (machines could be given permission to rotate keys; users could be given permission to only read keys and not modify them). Lastly, IAM roles are a different way of grouping machines than hostname prefix. Assuming IAM roles are logically partitioned, this could make a lot of sense for code run in AWS.
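Key-level access control with separate permission sets and hostname-prefix grouping can be sketched as follows. The names here (`Access`, `Entry`, `access_for`, and the principal kinds) are illustrative assumptions and do not reproduce Knox's actual types or API.

```python
from dataclasses import dataclass
from enum import IntEnum


class Access(IntEnum):
    NONE = 0
    READ = 1
    WRITE = 2  # assumed to include READ; needed to rotate a key in this sketch


@dataclass(frozen=True)
class Entry:
    kind: str      # "user", "machine", or "machine_prefix"
    id: str
    access: Access


def access_for(acl, kind: str, principal: str) -> Access:
    """Return the highest access level that any entry on this key's ACL
    grants to the given principal."""
    level = Access.NONE
    for e in acl:
        if e.kind == kind and e.id == principal:
            hit = True
        elif e.kind == "machine_prefix" and kind == "machine":
            hit = principal.startswith(e.id)  # group machines by hostname prefix
        else:
            hit = False
        if hit:
            level = max(level, e.access)
    return level
```

Each key carries its own list of entries, so a single key could grant read access to every machine whose hostname starts with `web-`, write access to one rotation host, and read-only access to a specific user.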
Confidant requires DynamoDB for storing credentials and Redis for a session store. Knox has no such requirements; it leaves the database interface open for you to implement as you wish. The currently exposed interfaces let you use MySQL, PostgreSQL, or SQLite out of the box.
Confidant currently does not provide a client that caches keys locally for speed or availability. Knox has a pre-built client for interacting with the Knox server that is tested and running on thousands of machines. Without such a client, Confidant does not address how developers consume secrets from within their code.
Confidant does provide key versioning for rotation. Each secret consists of a set of string key-value pairs, which could allow for multiple active versions at a time. It does not appear that this was the intent of the feature, and according to the documentation, keys must be unique across all credential pairs in a service, so building a naming convention on top of this to support active and primary versions for signing/verifying might be difficult. Rotation policy itself would largely be determined by the client implementation, since once the key revision is updated, the server will start returning the new value.
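To make the distinction between primary and active versions concrete, the sketch below signs with the single primary version and verifies against the primary and all active versions. The `Version` type and the status strings (including `"inactive"`, which is an assumption not named in the text above) are hypothetical and do not represent the API of Knox or Confidant.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass
class Version:
    id: int
    secret: bytes
    status: str  # "primary", "active", or "inactive" (assumed status names)


def sign(versions, message: bytes) -> bytes:
    """Sign with the single primary version."""
    primary = next(v for v in versions if v.status == "primary")
    return hmac.new(primary.secret, message, hashlib.sha256).digest()


def verify(versions, message: bytes, tag: bytes) -> bool:
    """Accept a tag produced by the primary or by any active version."""
    for v in versions:
        if v.status in ("primary", "active"):
            expected = hmac.new(v.secret, message, hashlib.sha256).digest()
            if hmac.compare_digest(expected, tag):
                return True
    return False
```

During a rotation, the old version is demoted to active while a new primary is added, so tags signed before the rotation keep verifying until the old version is finally marked inactive.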
Confidant does support logging and metrics through Graphite and statsd. These do not appear to be easily pluggable without changing Confidant’s code.