Configuration files should be more templatable in the standard images #11489

michaelkaye · 2021-12-02T12:34:07Z

Description:

In container driven worlds, when simple templating of a config file is desired, there are a few ways to go about it, but they're not 100% easy to use, for example cases like these require extending the synapse container with scripts or even extra docker containers:

a live deployment where the database credentials cycle automatically and are configured based on environment variables (possibly sourced from a kubernetes or other secret)
complement needing to bundle it's own templating of configuration: https://github.com/matrix-org/complement/blob/master/dockerfiles/synapse/homeserver.yaml & https://github.com/matrix-org/complement/blob/master/dockerfiles/synapse/start.sh
a stateless container that doesn't provide the ability to write config file changes to disk
test servers that are spun up on demand for non-technical users, wanting minimal configuration options.

A while ago we were able to provide templating via some fixed environment variables but removed that due to maintenance complexity as everything needed to be a environment variable.

We have the debian system that templates from debconf: https://github.com/matrix-org/synapse/blob/develop/debian/manage_debconf.pl

We currently have the --generate-config option which templates out synapse configuration based on environment variables, but only a limited number of values are supported https://github.com/matrix-org/synapse/blob/develop/docker/start.py#L130 and it relies on being able to write the generated result back out and have it persisted.

We can also currently use a conf.d/ type configuration, and be responsible for writing parts of the config out from different sources, but that gets complicated if the configuration is nested, eg sharing the database password from one source and the database hostname from another requires merging two trees of configuration.

We could create some specific syntax for use in the config file, eg hostname: "${ENV['POSTGRES_HOST']}" to allow values to be read from environment variables, which might deal with a lot of the cases - provide a static config file and specify which two or three secrets are to be provided.

Or maybe we could get value from loosening up the existing --generate-config templating so it could run before every start (so doesn't run and exit) but only templates the configuration files; not the other files it currently makes.

We find ourselves repeating these slight configurations in a lot of tiny built images, and I wonder if we could replace them with a single standard templating option.

I'm not sure of the right answer here; but it's currently painful to have to reimplement the templating of synapse config files again and again in each usecase, so creating this issue to see if we can make something simpler here.

The text was updated successfully, but these errors were encountered:

reivilibre · 2021-12-02T15:09:32Z

See also: #11203 (configuration fragments should merge recursively) — though doesn't really address all the points here.

benbz · 2021-12-02T15:39:32Z

See also: #11203 (configuration fragments should merge recursively) — though doesn't really address all the points here.

If the solution to this issue is arbitrary templating (e.g. env vars for any value), in my mind that would also solve the problems the #11203 is trying to solve - e.g. there could be a single config file and the DB connection pool size / worker name / etc settings could be templated in

woojoo666 · 2021-12-25T22:51:05Z

one possible method may be to use the same method that mautrix bridges uses, which is just to accept an environment variable for any configuration parameter

class Config(BaseBridgeConfig):
    def __getitem__(self, key: str) -> Any:
        try:
            return os.environ[f"MAUTRIX_FACEBOOK_{key.replace('.', '_').upper()}"]
        except KeyError:
            return super().__getitem__(key)

So for some configuration like database.args.password, you would just provide an environment variable like MAUTRIX_FACEBOOK_DATABASE_ARGS_PASSWORD.

I think this would be pretty quick to implement, and would make it vastly easier to use docker compose with Synapse, as well as making it viable to version-control the config files without worrying about leaking sensitive secrets.

However I suspect this sort of extreme flexibility may lead to many more bugs and user error. For example, what happens if Synapse adds a config that happens to match a common environment variable, like path or home?

One method is to just add a good prefix like what mautrix does. We could also maintain a whitelist of accepted environment variables, so that it is clear to both developers and users which environment variables can be configured. Though we would have to update the whitelist any time a whitelisted configuration gets renamed.

richvdh · 2022-01-04T14:38:21Z

@benbz, @woojoo666: regarding the ability to specify values for any setting via an env var, see #5518 (comment). In particular, how would you represent a list of listeners, each with their own set of sub-options? At the moment this looks like:

listeners:
  - port: 8448
    type: http
    tls: true
    resources:
      - names: [client, federation]
        compress: false

  - port: 8008
    tls: false
    bind_address: ''
    type: http
    x_forwarded: true

    resources:
      - names: [client, federation, consent]
        compress: false

    additional_resources:
      "/_matrix/saml2/pick_username":
        module: matrix_synapse_saml.pick_username_resource

  - port: 9093
    type: http
    resources:
      - names: [replication]

  - port: 9000
    bind_address: 127.0.0.1
    type: manhole

  - port: 9111
    bind_address: ''
    type: replication

  - port: 9050
    type: metrics

And let's take the removal of migrate_config elsewhere. That is very much meant as a migration tool for people with old setups, not something to be used day-to-day.

benbz · 2022-03-23T13:22:15Z

@benbz, @woojoo666: regarding the ability to specify values for any setting via an env var, see #5518 (comment). In particular, how would you represent a list of listeners, each with their own set of sub-options?

Sorry I missed this question. For my purposes I don't need that level of functionality (though I'm sure if it was available I'd make use of it), I just need simple value substitution.

richvdh · 2022-06-06T12:59:59Z

@benbz:

I just need simple value substitution.

an example of what you're envisaging here would be super helpful.

benbz · 2022-06-06T13:09:32Z

worker_app: synapse.app.generic_worker
worker_name: <WORKER_IDENTIFIER>
database:
  name: psycopg2
  args:
    application_name: <WORKER_IDENTIFIER>
    ....

Where <WORKER_IDENTIFIER> will be passed in (e.g. via env vars) as a distinct value for every process that uses this config fragment. For my use-case in Kubernetes I then only need to create different config fragments per worker-type rather than one for every worker instance

richvdh · 2022-06-06T14:01:10Z

Where <WORKER_IDENTIFIER> will be passed in (e.g. via env vars)

I'm not really following - or rather, I'd like a more explicit example. Which env vars (plural?) would be used to pass the value to be substitiuted for <WORKER_IDENTIFIER>? Would we support other <...> patterns?

benbz · 2022-06-06T14:36:12Z

<WORKER_IDENTIFIER> coming from the $HOSTNAME env var would be good enough for my purposes as Kubernetes StatefulSets mean that I have a unique $HOSTNAME for each replica in the StatefulSet. I could imagine others needing arbitrary env vars/other sources but I don't believe I need anything more than $HOSTNAME

Here is a more fleshed out example using federation senders. Let's say I have a StatefulSet that's named federation-sender and 4 replicas. Currently I have to do the following :
homeserver.yaml

...
federation_sender_instances:
- federation-sender-0
- federation-sender-1
- federation-sender-2
- federation-sender-3
...

federation-sender-0.yaml:

worker_name: federation-sender-0
...

federation-sender-1.yaml:

worker_name: federation-sender-1
...

federation-sender-2.yaml:

worker_name: federation-sender-2
...

federation-sender-3.yaml:

worker_name: federation-sender-3
...

However given I know that the hostnames for a 4 replicas would be federation-sender-0, federation-sender-1, federation-sender-2 & federation-sender-3 and if I've got some sort of templating/substitution for <WORKER_IDENTIFIER> for the $HOSTNAME env var, then I can just do:
homeserver.yaml

...
federation_sender_instances:
- federation-sender-0
- federation-sender-1
- federation-sender-2
- federation-sender-3
...

federation-sender.yaml:

worker_name: <WORKER_IDENTIFIER>
...

Scaling up/down is a question of rewriting the federation_sender_instances list rather than also writing out a new config file (and for worker types that don't need registration in homeserver.yaml I could even set them to auto-scale without any human involvement)

benbz · 2022-06-06T14:50:27Z

An alternative that would negate my particular need for any of this would be:

Online dynamic scaling of shards for federation sender and event persistence #12604 but for all worker types that need to be referenced in homeserver.yaml
Having Synapse generate a worker_name and database.args.application_name on startup rather than them having to be provided by config.
- These are the only 2 fields I have that vary per worker-process rather than per worker-type

JohnStarich · 2022-07-24T05:10:20Z

I agree, environment variables aren't the best when good, managed configuration is available.

In my case, Docker Swarm doesn't offer a very good managed configuration solution (it can't update existing configs in-place) so I wrote a tool to wrap the synapse image at boot: env2config. It's pretty good at most kinds of env-to-config mappings, but it's probably not perfect. Maybe the mapping function would be a helpful discussion starter, if you agree there's still a case to do something like this.

One big issue I can foresee is arbitrary keys in yaml files, so a . key separator could not always be used.

max06 · 2022-08-19T20:12:18Z

I just got pointed here - after reading, https://github.com/traefik/traefik jumps right into my mind.

They've managed to accept every configuration option in a yaml-file, as command line parameter or as env-var. See their docs - it might need some adaptions here, but it might also be worth a shot.

reivilibre added the T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. label Dec 9, 2021

woojoo666 mentioned this issue Dec 25, 2021

allow passing postgres password as environment variable #11646

Closed

This comment has been minimized.

Sign in to view

clokep mentioned this issue Jun 6, 2022

Support generic settings using environment variables #12961

Closed

anoadragon453 mentioned this issue Aug 19, 2022

Docker and scalability #13555

Open

richvdh mentioned this issue Aug 22, 2022

Deprecate the env-var way of running under docker #5518

Closed

DMRobertson mentioned this issue Nov 22, 2022

Read config variables from files (e.g. docker secrets) and environment variables. #14512

Open

matrixbot mentioned this issue Dec 21, 2023

Configuration files should be more templatable in the standard images element-hq/synapse#11489

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Configuration files should be more templatable in the standard images #11489

Configuration files should be more templatable in the standard images #11489

michaelkaye commented Dec 2, 2021

reivilibre commented Dec 2, 2021 •

edited

benbz commented Dec 2, 2021

woojoo666 commented Dec 25, 2021 •

edited

This comment has been minimized.

richvdh commented Jan 4, 2022

benbz commented Mar 23, 2022

richvdh commented Jun 6, 2022

benbz commented Jun 6, 2022

richvdh commented Jun 6, 2022

benbz commented Jun 6, 2022 •

edited

benbz commented Jun 6, 2022

JohnStarich commented Jul 24, 2022

max06 commented Aug 19, 2022

Configuration files should be more templatable in the standard images #11489

Configuration files should be more templatable in the standard images #11489

Comments

michaelkaye commented Dec 2, 2021

reivilibre commented Dec 2, 2021 • edited

benbz commented Dec 2, 2021

woojoo666 commented Dec 25, 2021 • edited

This comment has been minimized.

richvdh commented Jan 4, 2022

benbz commented Mar 23, 2022

richvdh commented Jun 6, 2022

benbz commented Jun 6, 2022

richvdh commented Jun 6, 2022

benbz commented Jun 6, 2022 • edited

benbz commented Jun 6, 2022

JohnStarich commented Jul 24, 2022

max06 commented Aug 19, 2022

reivilibre commented Dec 2, 2021 •

edited

woojoo666 commented Dec 25, 2021 •

edited

benbz commented Jun 6, 2022 •

edited