admin: Secure socket for remote management #3994

Merged
merged 6 commits into from Jan 27, 2021

mholt
Member

@mholt mholt commented Jan 25, 2021

This PR adds 3 separate, but very related features:

  1. Automated server identity management
  2. Remote administration over secure connection
  3. Dynamic config loading at startup

1. Automated server identity management

How do you know you're connecting to the server you think you are? How do you know the server connecting to you is the server instance you think it is? Mutually-authenticated TLS (mTLS) answers both of these questions. Using TLS to authenticate requires a public/private key pair (and the peer must trust the certificate you present to it).

Fortunately, Caddy is really good at managing certificates by now. We tap into that power to make it possible for Caddy to obtain and renew its own identity credentials, or in other words, a certificate that can be used for both server verification when clients connect to it, and client verification when it connects to other servers. Its associated private key is essentially its identity, and TLS takes care of possession proofs.

This configuration is simply a list of identifiers and an optional list of custom certificate issuers. Identifiers are things like IP addresses or DNS names that can be used to access the Caddy instance. The default issuers are ZeroSSL and Let's Encrypt, but these are public CAs, so they won't issue certs for private identifiers. Caddy will simply manage credentials for these, which other parts of Caddy can use, for example: remote administration or dynamic config loading (described below).

A bare-bones config might look like this:

{
	"admin": {
		"identity": {
			"identifiers": [
				"123.123.123.123",
				"example.com",
				"127.0.0.1",
				"localhost"
			],
			"issuers": [
				{
					"module": "acme",
					"ca": "https://my-acme-server.example.com/",
					"trusted_roots_pem_files": ["my-acme-root.crt"]
				}
			]
		}
	}
}

Here, Caddy is told that its identities are those IP addresses and DNS names. It then will use your custom ACME server with a custom root certificate (to trust when connecting to it) to get certificates for those identifiers. Note that in this case, your CA would have to issue certs for localhost and 127.0.0.1, which most CAs don't do, since they can't be verified if they are remote.
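
To illustrate the point about localhost and 127.0.0.1, here is a rough Python sketch (not Caddy's code) that separates identifiers a public CA could plausibly validate from ones that need a private or internal issuer. The helper name `classify_identifier` is made up for this example, and treating only non-global IPs and `localhost` names as private is a simplifying assumption.

```python
import ipaddress


def classify_identifier(identifier: str) -> str:
    """Return 'private' or 'public' for an identity identifier.

    Hypothetical helper for illustration only; Caddy does not expose this.
    """
    try:
        ip = ipaddress.ip_address(identifier)
    except ValueError:
        # Not an IP literal, so treat it as a DNS name.
        name = identifier.lower().rstrip(".")
        if name == "localhost" or name.endswith(".localhost"):
            return "private"
        return "public"
    # Loopback, RFC 1918, link-local, etc. are not globally reachable.
    return "public" if ip.is_global else "private"
```

Run over the identifiers in the config above, this would flag `127.0.0.1` and `localhost` as the ones that the default public issuers cannot cover.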

2. Remote administration over secure connection

This feature adds generic remote admin functionality that is safe to expose on a public interface.

  • The "remote" (or "secure") endpoint is optional. It does not affect the standard/local/plaintext endpoint.
  • It's the same as the API endpoint on localhost:2019, but over TLS.
  • TLS cannot be disabled on this endpoint.
  • TLS mutual auth is required, and cannot be disabled.
  • The server's certificate must be obtained and renewed via automated means, such as ACME. It cannot be manually loaded.
  • The TLS server takes care of verifying the client.
  • The admin handler takes care of application-layer permissions (methods and paths that each client is allowed to use).
  • Sensible defaults are still WIP.
  • Config fields subject to change/renaming.

Here's a basic example config, explained below:

{
	"admin": {
		"identity": {
			"identifiers": ["example.com"]
		},
		"remote": {
			"access_control": [
				{
					"public_keys": ["base64-encoded DER certificate"]
				}
			]
		}
	}
}

Explanation:

  • First we configure identity management. We tell Caddy that its identifier is example.com, so it will try to obtain and renew a certificate for that domain. By default, it will use publicly-trusted CAs. This is OK for DNS names that are properly configured. Identity management is required when enabling remote administration, otherwise the server cannot present a TLS certificate to the client and secure the connection.
  • We've enabled a secure admin endpoint at its default address :2021 (you can customize it with "listen": "..." just like the regular admin endpoint) - note that the default address is not bound to a local interface, so it can be accessed remotely.
  • A single public key is then added to the ACL. Only the sole bearer of the associated private key is allowed unrestricted access of the API.
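
The `public_keys` entries in the example are described as base64-encoded DER certificates. Assuming the client certificate is on hand as a PEM file, one way to produce such a string is sketched below using only the Python standard library. The function name `pem_to_base64_der` is an invention for this example, and the exact encoding Caddy expects should be checked against its documentation, since the config fields were still subject to change.

```python
import base64
import ssl


def pem_to_base64_der(pem_text: str) -> str:
    """Convert one PEM-encoded certificate to a single-line base64 DER string."""
    # Strips the BEGIN/END CERTIFICATE armor and decodes the body to raw DER.
    der = ssl.PEM_cert_to_DER_cert(pem_text)
    # Re-encode without line breaks so it fits in a JSON string.
    return base64.b64encode(der).decode("ascii")
```

The result could then be pasted into the `public_keys` list of an `access_control` entry.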

We can also restrict which methods and paths each client/user is allowed to access:

{
	"public_keys": ["base64-encoded DER certificate"],
	"permissions": [{
		"paths": ["/id/foo/"],
		"methods": ["GET"]
	}]
}

All the users specified in public_keys will be allowed to access all paths in the API starting with /id/foo/ using only the GET method. As you can see, you can specify multiple paths and methods, and multiple groups of them, per group of public keys.
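
The access rules described above can be sketched in Python as follows. This is an illustration of the described behavior, not Caddy's actual implementation: the class and function names are made up, keys are compared as plain strings, and an entry with no `permissions` is treated as unrestricted, as in the first example. An empty `paths` or `methods` list matches nothing in this sketch; how Caddy treats omitted fields is not covered here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Permission:
    paths: List[str]
    methods: List[str]


@dataclass
class AccessControl:
    public_keys: List[str]
    permissions: List[Permission] = field(default_factory=list)


def is_allowed(acl: List[AccessControl], client_key: str, method: str, path: str) -> bool:
    """Return True if the client key may use this method on this path."""
    for entry in acl:
        if client_key not in entry.public_keys:
            continue
        if not entry.permissions:
            # No permissions listed: unrestricted access for these keys.
            return True
        for perm in entry.permissions:
            path_ok = any(path.startswith(prefix) for prefix in perm.paths)
            if path_ok and method in perm.methods:
                return True
    return False
```

With an entry limited to `GET` on `/id/foo/`, a `GET /id/foo/bar` request from that key passes, while a `POST` to the same path or a `GET` to `/config/` is rejected.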

Other advanced functionality is a bit limited because we cannot import any Caddy modules: they all import this package instead! So, we cannot import the caddyhttp or caddytls packages and take advantage of their advanced routing or security logic. The admin controls are relatively simple, but I imagine this should be more than enough...?

Caddyfile config can probably be added later.

3. Dynamic config loading at startup

Since this feature was planned in tandem with remote admin, and depends on its changes, I am combining them into one PR.

Dynamic config loading is where you tell Caddy how to load its config, and then it loads and runs that. First, it will load the config you give it (and persist that so it can be optionally resumed later). Then, it will try pulling its actual config using the module you've specified (dynamically loaded configs are not persisted to storage, since resuming them doesn't make sense).

This PR comes with a standard config loader module called caddy.config_loaders.http.

Here's how it looks:

{
	"admin": {
		"config": {
			"load": {
				"module": "http",
				"url": "https://example.com/my_caddy_config.json"
			}
		}
	}
}

Caddy will download the config at the given URL and run it.

You can also configure authentication -- both client and server -- to ensure you get only trusted configs. If you add this to your config:

"tls": {
	"use_server_identity": true
}

then Caddy will use the configured identity (explained above) as a client certificate to present to the server it is connecting to. In this case, identity management must also be configured.
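
For anyone generating these configs programmatically, here is a small Python sketch that assembles the pieces above into one document. The function name is hypothetical, and placing `tls` inside the `load` object is an assumption about where the fragment belongs. The check mirrors the rule that identity management must be configured when `use_server_identity` is enabled.

```python
from typing import List, Optional


def build_loader_config(url: str,
                        identifiers: Optional[List[str]] = None,
                        use_server_identity: bool = False) -> dict:
    """Build an admin config that loads the real config from a URL at startup."""
    load = {"module": "http", "url": url}
    admin = {"config": {"load": load}}
    if use_server_identity:
        if not identifiers:
            # The client certificate comes from identity management,
            # so at least one identifier has to be configured.
            raise ValueError("use_server_identity requires identity identifiers")
        # Assumption: the tls block is nested in the loader module's config.
        load["tls"] = {"use_server_identity": True}
    if identifiers:
        admin["identity"] = {"identifiers": list(identifiers)}
    return {"admin": admin}
```

Passing the result through `json.dumps` gives a document in the same shape as the examples in this description.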

Functional, but still WIP.

Optional secure socket for the admin endpoint is designed
for remote management, i.e. to be exposed on a public
port. It enforces TLS mutual authentication which cannot
be disabled. The default port for this is :2021. The server
certificate cannot be specified manually, it MUST be
obtained from a certificate issuer (i.e. ACME).

More polish and sensible defaults are still in development.

Also cleaned up and consolidated the code related to
quitting the process.
@mholt mholt added in progress 🏃‍♂️ Being actively worked on under review 🧐 Review is pending before merging labels Jan 25, 2021
@mholt mholt added this to the v2.4.0 milestone Jan 25, 2021
@mholt mholt self-assigned this Jan 25, 2021
@SvenDowideit
Contributor

Automated certs are a problem for one of my intended use cases - in a fully airgapped system

I'm mentioning this, in the hopes that you can think of a solution - or even better, that someone has documented how to make this all "just work" :)

At some point, it will make some really awesome docs

@mholt
Member Author

mholt commented Jan 26, 2021

Automated certs are a problem for one of my intended use cases - in a fully airgapped system

I'm mentioning this, in the hopes that you can think of a solution - or even better, that someone has documented how to make this all "just work" :)

Caddy comes with an embedded ACME server, so it is not constrained in airgapped environments. 👍

This allows Caddy to load a dynamic config when it starts.

Dynamically-loaded configs are intentionally not persisted to storage.

Includes an implementation of the standard config loader, HTTPLoader.
Can be used to download configs over HTTP(S).
@SvenDowideit
Contributor

That makes "The server's certificate must be obtained and renewed via automated means, such as ACME. It cannot be manually loaded." interesting - and makes me wonder if it's possible to get an ACME server to hand apps an already created cert, and... clearly, I need to do more reading and learning!

@mholt
Member Author

mholt commented Jan 26, 2021

@SvenDowideit

that makes "The server's certificate must be obtained and renewed via automated means, such as ACME. It cannot be manually loaded." interesting - and makes me wonder if it's possible to get an ACME server to hand apps an already created cert, and... clearly, I need to do more reading and learning!

That is the definition of what an ACME server does. :) It gives clients certificates. Since it's your own infrastructure, they don't even need to be verified/validated. Caddy also has an internal issuer you can customize that performs no validation whatsoever and doesn't even use ACME, it just gives you a certificate.

Identity management is now separated from remote administration.

There is no need to enable remote administration if all you want is identity
management, but you will need to configure identity management
if you want remote administration.
@mholt mholt removed in progress 🏃‍♂️ Being actively worked on under review 🧐 Review is pending before merging labels Jan 27, 2021
@mholt mholt merged commit ab80ff4 into master Jan 27, 2021
@mholt mholt deleted the remote-admin branch January 27, 2021 23:16
@HSPDev

HSPDev commented Mar 25, 2021

Not to ping anybody on an old PR, but I recently discovered this feature.

Dynamic config load during startup is SO cool!
But what would make it even cooler, is if we could time the reload interval, and Caddy would pull it by itself?

Scenario:

We are currently running a bunch of Caddy servers in an autoscaling group (e-commerce with lots of dynamic domains, on-demand TLS, common backing store for certs).

Config management was a pain to solve with Caddy. We would have to dynamically get the ASG's instance IP's and update the configs through the API, on the fly, and ensure they are all in sync.

Instead we've solved it (currently) with a cronjob, that grabs the current valid config from S3, every minute, and checks if it's different than the one loaded by caddy (API to localhost). If it's different, we'll update the config in Caddy.
We've also added it, so the boot script for caddy loads this on fresh instance startup.
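
A minimal Python sketch of the sync step described here might look like the following. It is not the actual cron job: the three callables are hypothetical stand-ins for "download from S3", "read the running config from the local admin API", and "push a new config", and comparing parsed JSON rather than raw bytes is a design choice made for this example so that whitespace differences are ignored.

```python
import json
from typing import Callable


def configs_differ(a_text: str, b_text: str) -> bool:
    """Compare two JSON documents by content, ignoring formatting."""
    return json.loads(a_text) != json.loads(b_text)


def sync_once(fetch_desired: Callable[[], str],
              fetch_running: Callable[[], str],
              push: Callable[[str], None]) -> bool:
    """Push the desired config if it differs from the running one.

    Returns True when a push happened, False when nothing changed.
    """
    desired = fetch_desired()
    if configs_differ(desired, fetch_running()):
        push(desired)
        return True
    return False
```

A scheduler such as cron would call `sync_once` at each interval; the same function could also run once from the boot script on a fresh instance.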

This has worked flawlessly, but if Caddy could just pull from a URL periodically by itself, it would be MUCH easier!
I would probably update our upstream servers to serve the config dynamically. :-)

Precise config management is hard/impossible no matter what for these kinds of deployments. We would likely configure Caddy to update every 30 seconds, so within a few minutes, all servers should always be in sync no matter what.

@francislavoie
Member

francislavoie commented Mar 25, 2021

Instead we've solved it (currently) with a cronjob, that grabs the current valid config from S3, every minute, and checks if it's different than the one loaded by caddy (API to localhost). If it's different, we'll update the config in Caddy.

FYI, Caddy won't reload the config when you push an identical config to what it's running already, so you could get rid of that check in your script. See the API doc https://caddyserver.com/docs/api#post-load

If the new config is the same as the current one, no reload will occur. To force a reload, set Cache-Control: must-revalidate in the request headers.
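
As a sketch of what that means for a script, the Python below builds (but does not send) a request to the `/load` endpoint, adding the `Cache-Control` header only when a forced reload is wanted. The function name and the `force` flag are made up for this example, and the admin address defaults to the local endpoint on `localhost:2019` mentioned earlier in this PR.

```python
import json
import urllib.request


def build_load_request(config: dict,
                       admin_url: str = "http://localhost:2019",
                       force: bool = False) -> urllib.request.Request:
    """Prepare a POST /load request carrying the given config as JSON."""
    headers = {"Content-Type": "application/json"}
    if force:
        # Ask Caddy to reload even if the config is unchanged.
        headers["Cache-Control"] = "must-revalidate"
    return urllib.request.Request(
        admin_url + "/load",
        data=json.dumps(config).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Sending it is then a matter of passing the request to `urllib.request.urlopen`; without `force`, pushing an unchanged config should be a no-op on the Caddy side.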

I'll let @mholt follow up on the rest of your questions when he can

@mholt
Member Author

mholt commented Mar 30, 2021

@HSPDev

But what would make it even cooler, is if we could time the reload interval, and Caddy would pull it by itself?

I suppose; although push is almost always better than pull IMO.

and checks if it's different than the one loaded by caddy (API to localhost)

FYI, as Francis already said, Caddy will already do this (by default, if the new config is the same as what it is already running, it will no-op).

Could you open a new issue to request the periodic config pulling, if one does not already exist?

@YourTechBud
Contributor

YourTechBud commented Apr 5, 2021

But what would make it even cooler, is if we could time the reload interval, and Caddy would pull it by itself?

@mholt I agree with this, especially in a dynamic environment where the number of Caddy instances is constantly scaling up and down. Pushing config to individual instances on change can put a lot of strain on the config management system, since it would need to track the number of Caddy instances running at any point in time. The way we solve this is by injecting a sidecar proxy which periodically pulls the config. I guess it would make our lives a bit easier if Caddy could pull this all by itself.

I'll raise an issue for this since there doesn't seem to be one already. @HSPDev
