Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Configuration files should be more templatable in the standard images #11489

Open
michaelkaye opened this issue Dec 2, 2021 · 13 comments
Open
Labels
T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.

Comments

@michaelkaye
Copy link
Contributor

Description:

In container driven worlds, when simple templating of a config file is desired, there are a few ways to go about it, but they're not 100% easy to use, for example cases like these require extending the synapse container with scripts or even extra docker containers:

A while ago we were able to provide templating via some fixed environment variables but removed that due to maintenance complexity as everything needed to be a environment variable.

We have the debian system that templates from debconf: https://github.com/matrix-org/synapse/blob/develop/debian/manage_debconf.pl

We currently have the --generate-config option which templates out synapse configuration based on environment variables, but only a limited number of values are supported https://github.com/matrix-org/synapse/blob/develop/docker/start.py#L130 and it relies on being able to write the generated result back out and have it persisted.

We can also currently use a conf.d/ type configuration, and be responsible for writing parts of the config out from different sources, but that gets complicated if the configuration is nested, eg sharing the database password from one source and the database hostname from another requires merging two trees of configuration.

We could create some specific syntax for use in the config file, eg hostname: "${ENV['POSTGRES_HOST']}" to allow values to be read from environment variables, which might deal with a lot of the cases - provide a static config file and specify which two or three secrets are to be provided.

Or maybe we could get value from loosening up the existing --generate-config templating so it could run before every start (so doesn't run and exit) but only templates the configuration files; not the other files it currently makes.

We find ourselves repeating these slight configurations in a lot of tiny built images, and I wonder if we could replace them with a single standard templating option.

I'm not sure of the right answer here; but it's currently painful to have to reimplement the templating of synapse config files again and again in each usecase, so creating this issue to see if we can make something simpler here.

@reivilibre
Copy link
Contributor

reivilibre commented Dec 2, 2021

See also: #11203 (configuration fragments should merge recursively) — though doesn't really address all the points here.

@benbz
Copy link
Contributor

benbz commented Dec 2, 2021

See also: #11203 (configuration fragments should merge recursively) — though doesn't really address all the points here.

If the solution to this issue is arbitrary templating (e.g. env vars for any value), in my mind that would also solve the problems the #11203 is trying to solve - e.g. there could be a single config file and the DB connection pool size / worker name / etc settings could be templated in

@reivilibre reivilibre added the T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. label Dec 9, 2021
@woojoo666
Copy link

woojoo666 commented Dec 25, 2021

one possible method may be to use the same method that mautrix bridges uses, which is just to accept an environment variable for any configuration parameter

class Config(BaseBridgeConfig):
    def __getitem__(self, key: str) -> Any:
        try:
            return os.environ[f"MAUTRIX_FACEBOOK_{key.replace('.', '_').upper()}"]
        except KeyError:
            return super().__getitem__(key)

So for some configuration like database.args.password, you would just provide an environment variable like MAUTRIX_FACEBOOK_DATABASE_ARGS_PASSWORD.

I think this would be pretty quick to implement, and would make it vastly easier to use docker compose with Synapse, as well as making it viable to version-control the config files without worrying about leaking sensitive secrets.

However I suspect this sort of extreme flexibility may lead to many more bugs and user error. For example, what happens if Synapse adds a config that happens to match a common environment variable, like path or home?

One method is to just add a good prefix like what mautrix does. We could also maintain a whitelist of accepted environment variables, so that it is clear to both developers and users which environment variables can be configured. Though we would have to update the whitelist any time a whitelisted configuration gets renamed.

@woojoo666

This comment has been minimized.

@richvdh
Copy link
Member

richvdh commented Jan 4, 2022

@benbz, @woojoo666: regarding the ability to specify values for any setting via an env var, see #5518 (comment). In particular, how would you represent a list of listeners, each with their own set of sub-options? At the moment this looks like:

listeners:
  - port: 8448
    type: http
    tls: true
    resources:
      - names: [client, federation]
        compress: false

  - port: 8008
    tls: false
    bind_address: ''
    type: http
    x_forwarded: true

    resources:
      - names: [client, federation, consent]
        compress: false

    additional_resources:
      "/_matrix/saml2/pick_username":
        module: matrix_synapse_saml.pick_username_resource

  - port: 9093
    type: http
    resources:
      - names: [replication]

  - port: 9000
    bind_address: 127.0.0.1
    type: manhole

  - port: 9111
    bind_address: ''
    type: replication

  - port: 9050
    type: metrics

And let's take the removal of migrate_config elsewhere. That is very much meant as a migration tool for people with old setups, not something to be used day-to-day.

@benbz
Copy link
Contributor

benbz commented Mar 23, 2022

@benbz, @woojoo666: regarding the ability to specify values for any setting via an env var, see #5518 (comment). In particular, how would you represent a list of listeners, each with their own set of sub-options?

Sorry I missed this question. For my purposes I don't need that level of functionality (though I'm sure if it was available I'd make use of it), I just need simple value substitution.

@richvdh
Copy link
Member

richvdh commented Jun 6, 2022

@benbz:

I just need simple value substitution.

an example of what you're envisaging here would be super helpful.

@benbz
Copy link
Contributor

benbz commented Jun 6, 2022

worker_app: synapse.app.generic_worker
worker_name: <WORKER_IDENTIFIER>
database:
  name: psycopg2
  args:
    application_name: <WORKER_IDENTIFIER>
    ....

Where <WORKER_IDENTIFIER> will be passed in (e.g. via env vars) as a distinct value for every process that uses this config fragment. For my use-case in Kubernetes I then only need to create different config fragments per worker-type rather than one for every worker instance

@richvdh
Copy link
Member

richvdh commented Jun 6, 2022

Where <WORKER_IDENTIFIER> will be passed in (e.g. via env vars)

I'm not really following - or rather, I'd like a more explicit example. Which env vars (plural?) would be used to pass the value to be substitiuted for <WORKER_IDENTIFIER>? Would we support other <...> patterns?

@benbz
Copy link
Contributor

benbz commented Jun 6, 2022

<WORKER_IDENTIFIER> coming from the $HOSTNAME env var would be good enough for my purposes as Kubernetes StatefulSets mean that I have a unique $HOSTNAME for each replica in the StatefulSet. I could imagine others needing arbitrary env vars/other sources but I don't believe I need anything more than $HOSTNAME

Here is a more fleshed out example using federation senders. Let's say I have a StatefulSet that's named federation-sender and 4 replicas. Currently I have to do the following :
homeserver.yaml

...
federation_sender_instances:
- federation-sender-0
- federation-sender-1
- federation-sender-2
- federation-sender-3
...

federation-sender-0.yaml:

worker_name: federation-sender-0
...

federation-sender-1.yaml:

worker_name: federation-sender-1
...

federation-sender-2.yaml:

worker_name: federation-sender-2
...

federation-sender-3.yaml:

worker_name: federation-sender-3
...

However given I know that the hostnames for a 4 replicas would be federation-sender-0, federation-sender-1, federation-sender-2 & federation-sender-3 and if I've got some sort of templating/substitution for <WORKER_IDENTIFIER> for the $HOSTNAME env var, then I can just do:
homeserver.yaml

...
federation_sender_instances:
- federation-sender-0
- federation-sender-1
- federation-sender-2
- federation-sender-3
...

federation-sender.yaml:

worker_name: <WORKER_IDENTIFIER>
...

Scaling up/down is a question of rewriting the federation_sender_instances list rather than also writing out a new config file (and for worker types that don't need registration in homeserver.yaml I could even set them to auto-scale without any human involvement)

@benbz
Copy link
Contributor

benbz commented Jun 6, 2022

An alternative that would negate my particular need for any of this would be:

@JohnStarich
Copy link

I agree, environment variables aren't the best when good, managed configuration is available.

In my case, Docker Swarm doesn't offer a very good managed configuration solution (it can't update existing configs in-place) so I wrote a tool to wrap the synapse image at boot: env2config. It's pretty good at most kinds of env-to-config mappings, but it's probably not perfect. Maybe the mapping function would be a helpful discussion starter, if you agree there's still a case to do something like this.

One big issue I can foresee is arbitrary keys in yaml files, so a . key separator could not always be used.

@max06
Copy link

max06 commented Aug 19, 2022

I just got pointed here - after reading, https://github.com/traefik/traefik jumps right into my mind.

They've managed to accept every configuration option in a yaml-file, as command line parameter or as env-var. See their docs - it might need some adaptions here, but it might also be worth a shot.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.
Projects
None yet
Development

No branches or pull requests

7 participants