Skip to content
This repository was archived by the owner on Nov 24, 2025. It is now read-only.

Conversation

@rawlinp
Copy link
Contributor

@rawlinp rawlinp commented Mar 16, 2021

What does this PR (Pull Request) do?

It provides a blueprint for a new Traffic Vault Interface, paving the way for us to replace Riak w/ PostgreSQL.

@rawlinp rawlinp added the blueprint feature requirements / implementation details label Mar 16, 2021
@rob05c
Copy link
Member

rob05c commented Mar 18, 2021

  • use the Traffic Ops API plugin system

So this wouldn't be a Plugin?

If we're defining an Interface, it'd be really nice if people could drop in their own implementations.

The main purpose of plugins is just to avoid merge conflicts, so people can extend the app without code changesets that get conflicts and have to be rebased all the time. If it's not a "plugin," someone could do their own implementation of the Interface, but they'll have to modify lines in other files and get all that merge conflict pain.

  • rather than focusing on just the data store integration, overriding
    plugins are also responsible for everything else, including business
    logic, HTTP handling, validation, etc.

That's true of the existing hooks. But is there any reason we can't add a new hook to plugin.Funcs like getSecureDB GetSecureDBFunc that's just called once on startup?

@rawlinp
Copy link
Contributor Author

rawlinp commented Mar 18, 2021

  • use the Traffic Ops API plugin system

So this wouldn't be a Plugin?

Correct, it would not be a TO API plugin, as in it wouldn't totally override 22 different routes. Instead, it will be a type TrafficVault interface, of which a concrete struct (e.g. RiakImpl or PostgresImpl) will be assigned at startup time. I imagine this interface being hooked into the APIInfo object, similar to how the DB transaction is hooked into that as well. The interface methods (e.g. Ping, DeleteDNSSECKeys, etc.) will then be called within the route handlers we have today.

The main purpose of plugins is just to avoid merge conflicts

Really the main purpose of plugins is to allow proprietary implementations to override existing behavior while avoiding merge conflicts. With this interface design, the implementations could all be in their own file in order to avoid merge conflicts still. However, they wouldn't also have to be responsible for everything else I mention in the blueprint, just primarily the read/write integration with their chosen backend.

So while it's not a "TO API plugin" per se, it's still a "pluggable" interface. We could call it a "Traffic Vault plugin" interface if you want, but it won't be a "Traffic Ops API plugin."

@rob05c
Copy link
Member

rob05c commented Mar 18, 2021

So while it's not a "TO API plugin" per se, it's still a "pluggable" interface. We could call it a "Traffic Vault plugin" interface if you want, but it won't be a "Traffic Ops API plugin."

the implementations could all be in their own file in order to avoid merge conflicts

Ah, ok. Sure, I have no attachment to the current plugin/plugins.go file. If this is given a different package or system, it seems like it'll end up duplicating a lot of the code. But, as long as it's possible for users to safely extend, it doesn't matter nearly so much what it looks like.

it would not be a TO API plugin, as in it wouldn't totally override 22 different routes

I'm not sure I understand. With the existing system, all hooks are optional. If this were added as a hook, e.g.

type Funcs struct {
	load      LoadFunc
	onStartup StartupFunc
	onRequest OnRequestFunc
	getSecureDB GetSecureDBFunc
}

Plugins would look like

type SQLiteSecureDB struct {} // implements ISecureDB
func init() {
	AddPlugin(10000, Funcs{getSecureDB: sqliteSecureDB}, "SecureDB SQLite backend", "0.1")
}
func sqliteSecureDB(d StartupData) ISecureDB {
	return SQLiteSecureDB{}

They wouldn't have to implement onRequest or any other hooks. There's no reason for a "SecureDB" plugin to have to be responsible for overriding routes, unless it wanted to.

@alficles
Copy link
Contributor

I mostly like the interface change here. One use case I think we'll see a few folks want is segregated systems by kind. That is, they may not want to use the same secrets storage for SSL keys and URI Signing keys. In fact, they may want to split up SSL keys into multiple backends by category as well. I think a plugin system would allow folks to meet really specific goals like that without undue effort. I think the proposal here is most of a plugin system itself, though. It kinda depends exactly how you implement it.

Swapping in a security hat, I'm not seeing a whole lot of details about how we'll use Postgres and pgcrypto to create an improved backend. It's pretty easy to make something with the exact same security properties as we have in Riak, but I think a major part of this initiative is to actually improve on that. Is it possible to get a little more detail on how we're planning to organize the new backend? Specifically, are we using asymmetric or symmetric encryption (or is it configurable)? Where are we putting the encryption keys themselves and how are we securing them?

Here's my wishlist of stuff that would be really nice to be able to have in a secrets storage system:

  • TLS Mutual Auth
  • Audit trail for access and modification
  • Encryption for secrets at rest
  • Strong key for at-rest encryption with a decryption key that cannot be accessed from the secrets storage system

I think postgres offers all that, depending on how we design it. I'm not quite sure, reading the blueprint, what the design plan for the blueprint will be. Any chance we could flesh that out some?

@dneuman64
Copy link
Contributor

@alficles While I think it is important to understand all of the security details, I largely think we will leave that up to each private implementation of the backend storage system. I think the official requirement is "at least as secure as Risk is today" which, to be honest, is a pretty low bar. For this blueprint we are providing an alternate to Riak but we aren't going to go so far as to prescribe how secure that backend needs to be.

I also think it would be cool to have the ability to split out the types more granularly, but I also don't think that should be a requirement of this blueprint.

@rawlinp
Copy link
Contributor Author

rawlinp commented Mar 18, 2021

@alficles I haven't fully worked out all the details yet, but I was thinking these for sure:

  • Encryption for secrets at rest
  • Strong key for at-rest encryption with a decryption key that cannot be accessed from the secrets storage system

It would use the pgp_pub_encrypt and pgp_pub_decrypt functions from pgcrypto, with the private and public keys most likely stored on the TO hosts initially. I don't think postgres would need to store either of the keys.

I hadn't considered TLS Mutual Auth, but if that's easy enough for the initial implementation, maybe we should just do it. For "Audit trail for access and modification", I think that could be handled on the postgres side without requiring anything specific in TO, but I might be wrong.

The nice thing about not using Riak is that we can make this as secure as we want, but we don't necessarily need to implement all the security features at once. Aside from at-rest encryption which should probably be provided from the start, I think everything else could be added on as options later. If that sounds agreeable, I can update the blueprint to reflect what I've just stated.

@alficles
Copy link
Contributor

I largely think we will leave that up to each private implementation of the backend storage system. I think the official requirement is "at least as secure as Risk is today" which, to be honest, is a pretty low bar. For this blueprint we are providing an alternate to Riak but we aren't going to go so far as to prescribe how secure that backend needs to be.

@dneuman64 The blueprint proposes both an interface and a postgres replacement. It's reasonable to talk about how that replacement will advance security goals. And there's no "official requirement", Rawlin is writing the requirements. :D We can set the bar wherever we want as a team. I advocate for raising it slightly (but not excessively).

@alficles
Copy link
Contributor

@rawlinp

It would use the pgp_pub_encrypt and pgp_pub_decrypt functions from pgcrypto, with the private and public keys most likely stored on the TO hosts initially. I don't think postgres would need to store either of the keys.

If you're using pgp_*_*crypt functions, all of those require providing the key to postgres. You can do it transiently, which is better than stored, but at that point, why not just have the plugin handle the encryption and never transmit the keys?

I hadn't considered TLS Mutual Auth, but if that's easy enough for the initial implementation, maybe we should just do it. For "Audit trail for access and modification", I think that could be handled on the postgres side without requiring anything specific in TO, but I might be wrong.

TLS Mutual Auth should be a separate endeavour, imo. It probably doesn't amount to much more than providing a place in our config for a client cert and then configuring our DB connections to use it if provided.

The nice thing about not using Riak is that we can make this as secure as we want, but we don't necessarily need to implement all the security features at once. Aside from at-rest encryption which should probably be provided from the start, I think everything else could be added on as options later. If that sounds agreeable, I can update the blueprint to reflect what I've just stated.

Precisely, the goal is to create the flexibility to have all those things, not to have them implemented in one go. For the access trail, though, we want to know which users are doing things with secrets. The DB won't know who is doing secrets things because the design uses a single common credential for all DB access.

Per-user DB credentials was actually originally in my wishlist, but I removed it because it's actually really hard, both in implementation and operationally. But having users provide a delegated cert signed from a root that the DB trusts would significantly improve overall security. I think that's solidly above where I think we should raise the bar, though. I would happily review any implementation if somebody opens a PR with the enhancement. :D

But that does mean that only TO knows who is accessing and modifying secrets, so TO has to bear the responsibility for logging that out. (It should log out access and modification for all the other things, too, of course, but secrets are particularly important.)

@rawlinp
Copy link
Contributor Author

rawlinp commented Mar 22, 2021

If you're using pgp_*_*crypt functions, all of those require providing the key to postgres. You can do it transiently, which is better than stored, but at that point, why not just have the plugin handle the encryption and never transmit the keys?

That is a good point. It seemed simpler to just have postgres handle the encryption/decryption, but it doesn't seem much more difficult to handle that w/ the Go standard library. Plus, it might be better to use that CPU on the TO servers rather than the DB, since the TO servers are horizontally scalable. I'll try that out, and if the performance seems acceptable, I'll probably move forward w/ that approach.

Copy link
Contributor

@ocket8888 ocket8888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks fine to me. Originally I was going to ask for clarification on the "shape" of the TV interface, but honestly it doesn't matter. It's just gonna be whatever it needs to be for the endpoints to work, and that's something the implementer can figure out.

Copy link
Contributor

@srijeet0406 srijeet0406 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. This will be a welcome change!

Copy link
Member

@zrhoffman zrhoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks solid! Traffic Vault can be provided by PostgreSQL if this blueprint is implemented correctly.

Copy link
Member

@mitchell852 mitchell852 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@zrhoffman
Copy link
Member

Looks like everyone who reviewed has approved. Merging...

@zrhoffman zrhoffman merged commit 152550f into apache:master Mar 30, 2021
@rawlinp rawlinp deleted the add-trafficvault-interface-blueprint branch April 1, 2021 15:41
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

blueprint feature requirements / implementation details

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants