Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support encrypted passwords in interserver_http_credentials section #48291

Closed
rvasin opened this issue Mar 31, 2023 · 18 comments · Fixed by #50986
Closed

Support encrypted passwords in interserver_http_credentials section #48291

rvasin opened this issue Mar 31, 2023 · 18 comments · Fixed by #50986
Labels

Comments

@rvasin
Copy link
Contributor

rvasin commented Mar 31, 2023

Use case

Right now the passwords in interserver_http_credentials section are in plain text. For example:

<interserver_http_credentials>
    <user>ineteruser</user>
    <password>test</password>
</interserver_http_credentials>

It maybe considered as unsafe by security analysts.

Describe the solution you'd like

Allow encrypted password like:

<interserver_http_credentials>
    <user>ineteruser</user>
    <encrypted_password>jexjwpixk</encrypted_password>
</interserver_http_credentials>

Normally a symmetric encryption should be used. But it’s not clear where to save the encryption key then.

Describe alternatives you've considered

As alternative solution we may use of environment variable this way:

$ INTERPASSWORD=test

<interserver_http_credentials>
    <user>ineteruser</user>
    <password from_env="INTERPASSWORD"/>
</interserver_http_credentials>

But this also may be considered as unsafe by security analysts.

@rvasin rvasin added the feature label Mar 31, 2023
@rvasin
Copy link
Contributor Author

rvasin commented Mar 31, 2023

For me it’s not clear from design/architectural point of view what should be the correct solution to avoid the plain text passwords in interserver_http_credentials. It’s not the same as we have SHA hash for client credentials.
Could someone please advice how should it normally work? How it’s better to implement it.
If everything about the design is clear you may assign me to this task I will implement it.

@den-crane
Copy link
Contributor

den-crane commented Mar 31, 2023

What vector of an attack you want to prevent?

If you want the trust among replicas listed in the Zookeper (and you consider that Zookeeper is secured and create table is secured), you can generate a random(session) token during replica attach and put this token into Zookeeper near the replica's hostname, then a source replica can validate this token (Zookeeper).
Also a token is excessive here you can check using hostname/source ip (dns query), though it depends on a network configuration.

Zookeeper must be secured anyway, otherwise an intruder can harm a database by other ways.

@rvasin
Copy link
Contributor Author

rvasin commented Mar 31, 2023

@den-crane Thank you for advice about the random token. I did not know about it.

Depending on customers the requirements may be:

  1. No plain text passwords in configs (the plain text passwords in interserver_http_credentials may fail to satisfy this requirement).
  2. All communications even between replicas must be secured (it means auth must be used for replicas)
  3. All communications between servers (i.e. replicas) must be encrypted (probably HTTPS between replicas is OK).

@den-crane
Copy link
Contributor

den-crane commented Mar 31, 2023

BTW you can store parts of the config in the ZK

like <postgresql_port from_zk="/clickhouse/ports/postgresql" replace="replace" />

So you can do

<interserver_http_credentials>
    <user>ineteruser</user>
    <encrypted_password from_zk="/clickhouse/passwords/interserver_http_pass" replace="replace" />
</interserver_http_credentials>

@rvasin
Copy link
Contributor Author

rvasin commented Mar 31, 2023

I did not know about from_zk. Thank you. As I said for some customers (their security analysts) consider using the environment variables via from_env as insecure. I don't know why. Probably usage of from_zk instead of from_env is more secure.

@filimonov
Copy link
Contributor

filimonov commented Apr 3, 2023

I doubt that storing passwords in ZooKeeper is more secure (clickhouse users can read it via system.zookeeper).

Soring that on filesystem is secure (as secure as when you store keys in ~/.ssh). If you will add some encryption there it will anyway will be reversible, or the keys will be stored somewhere around. So it will not make system more secure than it is now, it can only be considered as obfuscation. Obfuscated password can not be considered as a strong / important security improvement.

@ilejn
Copy link
Contributor

ilejn commented Apr 3, 2023

To address this issue we do need an extra level of indirection ;)
(Leaving development complexity out of the scope) keyring https://en.wikipedia.org/wiki/GNOME_Keyring suits better than zookeeper.

@rvasin
Copy link
Contributor Author

rvasin commented Apr 3, 2023

@filimonov OK. How to we solve the particular question using the files?

Yes, private keys in ./ssh have chmod 600.

I think we may implement new parameter: from_file (similar to from_env and from_zk).
Then use it this way:

<interserver_http_credentials>
    <user>ineteruser</user>
    <password from_file="/home/clickhouse/passwords/interserver_http_pass"/>
</interserver_http_credentials>

So for file /home/clickhouse/passwords/interserver_http_pass
file's body is the password.

The main idea that chmod is 600. And the user is clickhouse. Thus only clichouse user and root may read it.

The idea is that for /etc/clickhouse-server/config.xml I have chmod 666 on my Centos. Any user can read it. But for /home/clickhouse/passwords/interserver_http_pass
we will have chmod 600.
The good point as you mentioned that for /var/lib/clickhouse/preprocessed we should also have 600 (only clickhouse user can read it).

And I think we cannot use include_from:

<include_from>/etc/metrica.xml</include_from>
for our task.

@rvasin
Copy link
Contributor Author

rvasin commented Apr 3, 2023

To address this issue we do need an extra level of indirection ;) (Leaving development complexity out of the scope) keyring https://en.wikipedia.org/wiki/GNOME_Keyring suits better than zookeeper.

@ilejn Could you please explain your approach? How it would be possible to use GNOME Keyring or similar in ClickHouse for the particular task (storing password for interserver_http_credentials)?

@filimonov
Copy link
Contributor

You can just create one more file

/etc/clickhouse-server/config.d/interserver_credentials.xml

Or use include_from (what is wrong with that?).

@ilejn
Copy link
Contributor

ilejn commented Apr 3, 2023

To address this issue we do need an extra level of indirection ;) (Leaving development complexity out of the scope) keyring https://en.wikipedia.org/wiki/GNOME_Keyring suits better than zookeeper.

@ilejn Could you please explain your approach? How it would be possible to use GNOME Keyring or similar in ClickHouse for the particular task (storing password for interserver_http_credentials)?

There is an API, a wrapper (https://gnome.pages.gitlab.gnome.org/libsecret) and a cross-platform library (e.g. https://github.com/hrantzsch/keychain).
Disclaimers:

  1. I am not aware of how all these things work and if they are robust and reliable.
  2. It is natural that clients' passwords or private keys are readable. ClickHouse instance acts as a client for another database, messaging system, etc. It seems difficult to explain what is wrong with this practice and define a vector for a possible attack. That is why I would try to avoid wasting time on this issue.
  3. I am not a security expert ;)

@rschu1ze
Copy link
Member

@rvasin's implementation provides a new XML tag `encryption_codec' which indicates some symmetric encryption (AES).

Before we go ahead with the implementation, I would like to check back first if everyone is okay with the approach. When it comes to security, it is too easy to shoot oneself into the foot. And yes, Roman did the hard work already (appreciated) but let's discuss first.

If I read this ticket right, the core problem is that the configuration file that contains the credentials is not necessarily only readable by the user who owns it. E.g. permissions can be 666 and user "Bob" can read user "Alice"s passwords, and we need some way to hide the password in the file. Note that to solve the problem, "hiding" is enough - the password does not necessarily need to be encrypted.

Discussed approaches:

  • some third-party mechanism (Gnome Keyring, etc.) --> too complex, thus overkill imho.
  • use existing attribute from_zk --> solves the problem, but has the new problem that database users with sufficient permissions will be able to read the password via system table (possibly also via other means to introspect ZK)
  • use existing attribute from_env --> solves the problem, but some security analysts don't seem to like it. In fact, this technique is quite common (e.g. in Docker). Environment variables are scoped to the user, so I would consider them secure.
  • use existing include_from syntax --> also solves the problem. Just point to a file with permissions 600.

To be honest, the latter two options seem preferable to me due to their elegance and simplicity. Maybe the docs should just make a note how cleartext passwords can be hidden.

And: Does the issue only affect credentials of other servers, or does ZK also need credentials? What about the existing mechanism to connect securely to ZK? (I am not an expert with that)

And: Are there specific customer / compliance / certification requirements that mandate actual password encryption (over just hiding it)?

@rvasin
Copy link
Contributor Author

rvasin commented Jun 15, 2023

@rschu1ze I will try to keep it short quite the topic is quite wide.

As we discussed earlier in the ticket: setting 600 permission on:
/etc/clickhouse-server/config.xml
/var/lib/clickhouse/preprocessed/config.xml
(and corresponding users.xml, files in config.d/ etc)
is an essential step and thanks to engineers of Altinity to recommending it.
Based on these recommendations we started to use 600 and other permission turnings in our ClickHouse installations.
But it's not enough for our existing clients and prospects (potential clients).

Yes, it's in their requirements "no plain text passwords" in configuration files. For prospects the requirement is quite critical: they cannot start using ClickHouse without it.

And solutions like from_env, from_zk, symlinks in config.d/ (which is also recommended ClickHouse documentation) - these are the weak solutions.

In fact I wrote an internal article about usage of from_env, from_zk, include_from, incl and symlinks considering all possible combinations. The short outcomes: in any case, finally, in /var/lib/clickhouse/preprocessed/config.xml file all passwords are written in plain text. ZooKeeper is not a secure storage.

Therefore in my current PR the passwords are encrypted in /etc/clickhouse-server/config.xml and they are still encrypted in /var/lib/clickhouse/preprocessed/config.xml.

We discussed it with our company's security analysts, we showed it to our clients and prospects (their security analysts). This approach OK for them.

Also note: it is not just about passwords in <interserver_http_credentials>. The same way the passwords/secrets are in plain text in several places of config,xml:

  1. named collections. Do you know that we have in Roadmap 2023 (discussion).
    "Secure storage for named collections".

  2. Secrets like secret_access_key in S3 storages.

and so on.

Therefore the proposed solution in my PR is universal and it may be used to hide any sensitive information. Furthermore it maybe used in combination with from_env and others. For example, we may store keys of <encryption_codecs> using from_env.

@ilejn
Copy link
Contributor

ilejn commented Jun 15, 2023

Am I right that the plan is to keep the encrypted entity (password) and the key to decrypt it (encryption_codecs/../key_hex) together in one file? Really?

@rvasin
Copy link
Contributor Author

rvasin commented Jun 15, 2023

As I said in real usage the different combinations are possible.
For example, <encryption_codecs> section maybe in a separate file on "secure" drive and there is symlink to this file in config.d/ and in <encryption_codecs> section itself the keys are loaded using from_env.

The main idea: there is no ideal solution. But it's a really the next step to security improvement considering that now in /var/lib/clickhouse/preprocessed/config.xml all secrets are in plain text.

@ilejn
Copy link
Contributor

ilejn commented Jun 15, 2023

As I said in real usage the different combinations are possible. For example, <encryption_codecs> section maybe in a separate file on "secure" drive and there is symlink to this file in config.d/ and in <encryption_codecs> section itself the keys are loaded using from_env.

The main idea: there is no ideal solution. But it's a really the next step to security improvement considering that now in /var/lib/clickhouse/preprocessed/config.xml all secrets are in plain text.

And the benefit of this PR is that key and password are together in preprocessed/config.xml ?

If an intruder can read this file (and other configs) - no improvement (she knows the secret)
If an intruder cannot read this file (and other configs) - no improvement (she does not know the secret and did not know before the PR).

@rvasin
Copy link
Contributor Author

rvasin commented Jun 15, 2023

As I said on all config files we set 600 permission only clickhouse user and root can read it. If the intruder can get root access somehow in this case he will get the plain text key and encrypted passwords from preprocessed config. But decryption is not trivial. Because the encrypted text has headers etc. Thus this PR is the next step in security improvement. But this is not the final or definitive step :)

@rvasin
Copy link
Contributor Author

rvasin commented Jun 15, 2023

Again I need to repeat: my today"s post is not my opinion as developer (for me as developer 600 permission is probably enough) but these are combined opinions / even requirements of security analysts of very high profile companies. They consider many factors to make their final decisions.

As one of future steps in the security level improvements could be making a new option to eliminate preprocessing file generation. But implementation of this option requires a prior careful analysis.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants