--insecure flag will be removed and replaced by easier-to-use always-secure options
Instead, we aim to ensure that all clusters are secure, but offer new secure options where the user can choose the combination of security features that best matches their needs and their environment.
(Additionally, the word "insecure" gives the wrong impression that CockroachDB is not a secure database. The "insecure mode" only really exists for internal testing by the CockroachDB team and should not have been exposed to users from the start.)
What you need to know in a post-insecure world
CockroachDB aims to remain easy to use when all clusters are secured!
Simplified decision making:
Cluster is secure in all cases.
Note: the vulnerabilities in the 2nd and 3rd columns are amplified by sharing the same TCP listener (address/port) between SQL and RPC listeners.
Partial mitigation possible via separate
Note that the first column in the table above assumes that clients also validate server
Existing clusters previously run with "insecure mode"
Advanced decision flowchart under the fold
We know from experience that customers who advocate for "insecure mode" really only care about:
These users certainly do not want to disable authentication and authorization, yet we also know that users do not realize that
Some users also care about point 4 because setting up TLS certs in a web browser is a pain. However, since v20.1 we have a solution for those users:
Jira issue: CRDB-3870
53405: cli flags clean-up r=tbg a=knz

Informs #53404
Informs #49918

Release note (cli change): The command-line flag `--socket` has been removed. It was deprecated since v20.1. Use `--socket-dir` instead.

Release note (cli change): The command-line flag `--insecure` has been marked as deprecated. See #53404 for details. The flag will be removed in a later version in a staged fashion: first, additional security mechanisms will be added to enable more flexible deployments which were previously done using `--insecure`; then the flag will be removed from server commands; then finally, in a later version, also from client commands.

Co-authored-by: Raphael 'kena' Poss <firstname.lastname@example.org>
Is the intent here to move all connections to the crdb process to be secure? And by secure, I mean secure in the way that CockroachDB expects? We manage TLS as well as cert authN/authZ outside of crdb as a separate process on the database instances, and thus run crdb in
Perhaps I can take another look at our golang support for our authN/authZ as well as the cockroach code to see if the integration could work via some extension points/plugins.
Hi Jason, thank you for the feedback.
Meanwhile, we encourage you to keep node-to-node connections secure using TLS; if you prefer not to set up TLS certs manually, we'll have a new experience (see #51991) which will simplify the setup.
Does that sound good to you?
@jasobrown a question for you though: does your setup use a single SQL account for all operations, or do you use separate SQL accounts with different privileges? Can you outline a little bit what your authorization looks like?
edit: discussed this with jason offline
53991: pgwire: accept non-TLS client conns safely in secure mode r=aaron-crl,irfansharif,bdarnell a=knz

Fixes #44842.
Informs #49532.
Informs #53404.

This change makes it possible for a DBA / system administrator to reconfigure individual nodes *in a secure cluster* to accept SQL client sessions over TCP without mandating a TLS handshake. Authentication remains mandatory as per the HBA rules.

Motivation: we have at least two high-profile customers who keep their nodes and client apps in a private secure network (with network-level encryption / privacy) and who experience client-side TLS as unnecessary and expensive friction.

Additionally, **this feature is a prerequisite to upgrade an insecure cluster to secure mode without downtime.**

Why this does not impair security:

- authentication remains mandatory (as per the HBA rules; see the two links below).
- the feature is opt-in: the operator must set a command-line flag (`--accept-sql-without-tls`), which is not enabled by default.
- there is an interlock: the user must both set up the flag and set log-in passwords for their SQL users (by default, users get created without a password and thus cannot log in without client certs).
- for now, node-node connections still require TLS.

Links:

- https://www.postgresql.org/docs/12/auth-pg-hba-conf.html
- https://dr-knz.net/authentication-in-postgresql-and-cockroachdb.html

For context, the default HBA configuration is the following:

```
host  all root all cert-password # fixed rule
host  all all  all cert-password # built-in CockroachDB default
local all all      password      # built-in CockroachDB default
```

The directive `host` covers both TLS and non-TLS incoming TCP connections (`local` is for the unix socket). The method `cert-password` means "client cert or password": without a cert, the password is mandatory.
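To make the default HBA configuration above more concrete, here is a minimal Python sketch of how such rules could be evaluated. It is an illustration only, not CockroachDB's implementation: the names `HbaRule`, `parse_rule`, `select_method` and `needs_password` are hypothetical, and the sketch assumes first-match-wins evaluation as documented for PostgreSQL's `pg_hba.conf`. It only handles the `all` keyword and CIDR addresses.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class HbaRule:
    """One HBA line (hypothetical model, not CockroachDB's own type)."""
    conn_type: str            # "host" (TCP, with or without TLS) or "local" (unix socket)
    database: str             # "all" or a database name
    user: str                 # "all" or a user name
    address: Optional[str]    # "all", a CIDR block, or None for "local" rules
    method: str               # e.g. "cert-password", "password", "reject"


def parse_rule(line: str) -> HbaRule:
    """Parse one HBA line, ignoring a trailing '#' comment."""
    fields = line.split("#", 1)[0].split()
    if fields[0] == "local":
        conn_type, database, user, method = fields
        return HbaRule(conn_type, database, user, None, method)
    conn_type, database, user, address, method = fields
    return HbaRule(conn_type, database, user, address, method)


def _matches(rule: HbaRule, conn_type: str, database: str, user: str,
             client_ip: Optional[str]) -> bool:
    if rule.conn_type != conn_type:
        return False
    if rule.database not in ("all", database):
        return False
    if rule.user not in ("all", user):
        return False
    if rule.address in (None, "all"):
        return True
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(rule.address)


def select_method(rules: List[HbaRule], conn_type: str, database: str,
                  user: str, client_ip: Optional[str] = None) -> str:
    """Return the method of the first matching rule (assumed first-match-wins)."""
    for rule in rules:
        if _matches(rule, conn_type, database, user, client_ip):
            return rule.method
    return "reject"  # assumption: no matching rule means the connection is refused


def needs_password(method: str, has_client_cert: bool) -> bool:
    """'cert-password' means client cert or password: no cert, password required."""
    if method == "cert-password":
        return not has_client_cert
    return method == "password"


DEFAULT_HBA = """
host all root all cert-password # fixed rule
host all all all cert-password # built-in CockroachDB default
local all all password # built-in CockroachDB default
"""
rules = [parse_rule(line) for line in DEFAULT_HBA.strip().splitlines()]

# A non-TLS TCP client cannot present a client cert, so a password is required.
method = select_method(rules, "host", "defaultdb", "alice", "192.168.1.5")
print(method, needs_password(method, has_client_cert=False))
```

Under this model, a client connecting over TCP without TLS falls under a `host` rule with `cert-password` and must supply a password, which matches the interlock described in the bullet list: a SQL user without a password cannot log in over a non-TLS connection.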
As previously, the user can further secure the configuration by restricting non-TLS connections to just a subnetwork, for example:

```
host  all all 10.0.0.0/8 password # accept conns on the 10/8 network.
host  all all all        reject   # refuse conns from other nets.
local all all            password
```

Note that this change is limited to the server side: CockroachDB's own `cockroach` CLI commands do not yet know how to connect to a CockroachDB server without TLS; such connections are only supported from `psql` or SQL client drivers in apps. See #53994 for a follow-up.

Release justification: fixes for high-priority or high-severity bugs in existing functionality

54019: roachtest: de-flake 'inconsistent' r=knz a=tbg

This test sets up an intentionally corrupted replica and wants its node to shut down as a result of its detection. When only two of the three nodes were included in the consistency check, either one of them could end up terminating (as no obvious majority of healthy replicas can be determined). Change the test so that we wait for the cluster to come fully together before setting a low consistency check interval.

Closes #54005.

Release justification: testing

Release note: None

Co-authored-by: Raphael 'kena' Poss <email@example.com>
Co-authored-by: Tobias Grieger <firstname.lastname@example.org>
Citation needed. Are people really getting confused by this? I think the real issue with
Security is not binary. The question is always "secure against what threats?" Some of these cases are secure against threats that others are not, so it's misleading to say that they're all equally secure (of course they may be used in systems that are secure against those threats if security is provided at other layers)
And that's really what we're talking about here - traditionally we've just had all-or-nothing, with the no-protection-at-all
That flowchart scares me - there's a lot of decision points there. A lot of them are new. The goal of #51991 and eliminating insecure mode is to reduce the number of options that most users experience, by steering nearly all users to the "internally-managed TLS" path. I think we need to reconsider the amount of flexibility we're providing here, and what decisions the user really needs to make.
For example, I don't think we ever want to give the "non-TLS SQL" option this much emphasis. CockroachCloud will always need to use TLS, so we must have a good user experience for TLS (perhaps by working with driver implementations on things like #32932).
Are there enough of these that we need to provide a no-downtime upgrade path? This seems like it could add quite a bit of complexity (can you upgrade from any security mode to any other, or only certain ones?)
This is definitely not true. There have been some recent examples of this but I think the most common reason people resist setting their clusters up securely is about all the TLS setup, especially for node-to-node.
This seems like a requirement, not just a "possibly".
This came up internally multiple times.
This is pointed out prominently at the top of the issue description. But I'll take your point, as well as this one:
I have adjusted the title of the issue to emphasize the new things / improvements and pulled that notion as first sentences/paragraphs in the issue description.
You and I know that security is not binary but the point here is that all the proposed options keep all internal security controls active, which
Anyway, you are right that we need to talk about threats. I am adding a table to the issue description instead of the flowchart. Bram helped me understand that a list of threats is easier to use than a flowchart anyway.
"say they're all equally secure" - that's a strawman. Nobody wrote this anywhere.
Point granted. I've reduced the scope accordingly.
It seems clear that we need to emphasize the CC use case first and foremost, and thus preserve TLS options as the main recommendation (and main recommended scenario). However we do have serious $$$ at risk if we don't offer other options.
There are at least a few customers asking (those big accounts who didn't like our TLS and went to prod with
Point taken. Adjusted the text accordingly.
Apologies for the late response but there are three things that strike me here:
Regarding point 1: there is now a clearer warning than before. The warning reads as follows:
Regarding point 2: yes, this work will be in an RFC of course.
Regarding point 3: agreed