Automated web application configuration migration #1966
My two cents, at least on question #2, is that there should be a system-wide configuration file. (On #1 and #3, you might consider putting it under VCS on the server, as a tool such as etckeeper does, and representing your config updates as patches/diffs that are applied and committed by the .deb in the postinst script. Not sure if the variance in config.py is enough that you'd have to generate new patches locally on the fly for each instance, but you should be able to ship patches that apply cleanly for everybody. This of course touches upon static/functional changes only; a different process leveraging Ansible would be necessary to update the dynamically generated/random/custom values.)

Admittedly, working with patches would be sticky, especially because of indeterminism throughout the install base, and since one cannot expect the lineinfile module to make substitutions and insertions in consistent locations, so this is only a brainstorm.

More seriously worth considering is that you can store some variables persistently in debconf or dconf with ease. If it happens to work for your security model to put the keys and peppers in the debconf settings database, then the app code package is able to rewrite the config anew on every update instead of, or perhaps in addition to, backing it up in version control. This way, both Ansible and the app code .deb are able to query debconf for the canonical values. There should be a field type (string, I think) which can handle multi-line GPG keys without issue. The idea is that you can sort of duplicate some logic between Ansible deployments and dpkg updates in a two-pronged approach to ensuring a usable yet fluidly changeable config.
That is, you set custom values during the initial setup, subsequently write them out via a template whenever the package is installed or updated, and hope that you don't have to release a version featuring a new custom configuration variable (as opposed to a random secret; there's no problem with using the postinst to generate those if absent), which seems like it'd always require an Ansible run to either include or alter. It goes without saying that in order to convert existing installs to any of these scenarios, you necessarily need a script in the postinst, or an automation role, or some tags you tell admins to run, whatever the preferred migration strategy is. To me it makes a lot of sense to tie the configuration strongly to the application code package, for compatibility among other reasons, even though that is not what happens in practice, as it's managed by the app role. In fact, dpkg has explicit features which make it convenient to ship patches with your program, as well as features for conflict resolution, managing configs, and rendering templates. I seem to recall some resistance to relying on the control scripts for stuff, but a lot of software does this quite extensively, and so long as you stick with this distribution method, it's the most direct approach that doesn't always require admin interaction.
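The "rewrite the config anew on every update, generating random secrets only if absent" idea above can be sketched in Python. This is a minimal illustration, not SecureDrop code: it uses a hypothetical JSON file as the canonical store in place of debconf, and the key names are only examples.

```python
import base64
import json
import os
import secrets


def load_or_create_secrets(path):
    """Return the instance's canonical secret values, generating any
    that are missing. Existing values are never overwritten, which is
    what makes it safe to run this from a postinst on every update."""
    store = {}
    if os.path.exists(path):
        with open(path) as f:
            store = json.load(f)
    for key in ("secret_key", "scrypt_id_pepper", "scrypt_gpg_pepper"):
        if key not in store:
            # 32 bytes from the OS CSPRNG, base64-encoded for storage.
            store[key] = base64.b64encode(secrets.token_bytes(32)).decode()
    with open(path, "w") as f:
        json.dump(store, f, indent=2)
    return store
```

Because missing values are generated but existing ones are left alone, running this twice yields identical results, which is exactly the idempotence the defensive Ansible logic is trying to guarantee today.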
I have the following idea; please comment on what you all think.

New configuration file

We will have a static configuration file at a fixed location.

To create a config file for new installations

This step is straightforward: using the same Ansible style we have, we can create the new configuration file.

How to create the new file on existing deployments/installs

Have a post-install Python script which will read (by standard Python import) the current configuration file (the existing config.py) and write out the new file.

What are the possible changes in the application code (to handle the new configuration)?

Only the code that reads the configuration will need to change. We will also have to update our Ansible logic to push out any changes to that new configuration file.

What are the benefits?

The first two comments already talk about the standard benefits. I will talk about a new point of view for the developers of the project. This change will enable us to remove all dynamic attribute access from the code (remember the most popular #3033). It means we will be able to have a stable definition of the configuration.

I will also update this ticket with an example configuration file by Monday.
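The migration step described above (import the legacy config.py, write the values out in the new format) could be sketched like this. The attribute and section names are illustrative assumptions, not SecureDrop's actual settings, and a real postinst would cover every value:

```python
import configparser
import importlib.util


def migrate(old_config_path, new_config_path):
    """One-shot migration: import the legacy config.py as a module and
    write its values out to an INI file via configparser."""
    spec = importlib.util.spec_from_file_location("old_config", old_config_path)
    old = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(old)  # executes the legacy config.py

    parser = configparser.ConfigParser()
    parser["securedrop"] = {
        # Illustrative subset of settings carried over from the module.
        "secret_key": old.SECRET_KEY,
        "scrypt_id_pepper": old.SCRYPT_ID_PEPPER,
    }
    with open(new_config_path, "w") as f:
        parser.write(f)
```

Because the old file is imported rather than parsed textually, any value Python can evaluate (including ones generated at install time) is carried over exactly.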
@kushaldas you mean INI format, right? Just talking out loud here for everyone's benefit, as it seems that's what you mean. Ansible also has good support for that, so that's another benefit 👍
Correct, the standard INI file (not the Windows style though) 🚌
Here is an example configuration file created by code from my branch. If any required configuration value is missing, an error is raised.
Yo, tossing this out here: maybe we should use YAML and not INI. Sometimes it's nice to have multiple layers of nesting and things like lists. I would strongly prefer that approach. This would also integrate more nicely with Ansible.
A long while ago I talked with @squeed about formats, and he said either JSON or INI on the grounds that other things (esp. YAML) have caused problems for him in real-world scenarios. I still think we want data structures that are more complex than what INI provides. Is there a compelling reason to use INI over JSON?
@heartsucker this was a POC with only the stdlib; the general feedback I received was to use YAML.
If the feedback is YAML, I'd propose we use JSON then, given his recommendations from other deployments, yeah?
This is a total bikeshed, but JSON is a bit nicer than YAML, especially when being written and read programmatically. YAML has some parsing surprises (e.g. number vs. string). My suggestion of INI files was that they enable systemd-style drop-ins, which are amazingly useful. They allow overriding arbitrary entries without the need for sed.
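To make the "parsing surprises" point concrete: JSON keeps the quoted/unquoted distinction unambiguous, whereas YAML 1.1 parsers are known to coerce values like no or 1.10 depending on context. A stdlib-only sketch of the JSON side (the keys here are made up for illustration):

```python
import json

# Quoted values stay strings; bare numbers stay numbers. There is no
# context-dependent coercion as with YAML 1.1's `no` -> False or a
# version string like 1.10 becoming the float 1.1.
doc = json.loads('{"version": "1.10", "workers": 4, "onion": "no"}')

assert doc["version"] == "1.10"  # string, not the float 1.1
assert doc["workers"] == 4       # int, as written
assert doc["onion"] == "no"      # string, not the boolean False

# Programmatic read-modify-write round-trips deterministically:
assert json.loads(json.dumps(doc)) == doc
```

This is the property that makes JSON comfortable to rewrite from scripts, which matters if a postinst or Ansible task is going to regenerate the file.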
I don't think we need systemd drop-ins so much. I think the master config should contain things that are hardcoded and will never change (secrets), and as many other config options as possible should be set from the web interface and stored in the SQL DB. We are generally moving toward SD being more of a plug-and-play appliance than a configurable web app, so tinkering with something like WSGI worker settings or other such things will eventually not be an option.
Additional notes: we have a lot of config values that should never be controlled by the user and only need to be controlled during testing/dev. These should not be read from a config file, but should have default values in the application code.
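The pattern described here, hard defaults in code with test-only overrides rather than config-file entries, might look like the following sketch. The class and field names are illustrative, not SecureDrop's actual module:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AppConfig:
    # Dev/test knobs deliberately live in code with fixed defaults;
    # user-facing values would come from the config file instead.
    session_expiration_minutes: int = 120
    testing: bool = False


PROD_DEFAULTS = AppConfig()


def test_config():
    # Tests override via replace() instead of editing any file on disk,
    # so production defaults can never be changed accidentally.
    return replace(PROD_DEFAULTS, testing=True, session_expiration_minutes=1)
```

A frozen dataclass also gives the "stable definition of the configuration" mentioned earlier: attribute access fails loudly for unknown names instead of silently returning None.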
Proposed format:

```json
{
  "source_interface": {
    "secret_key": "j2IcwRt6A2IsUeNEChZqp66cn7cpf/oP6rxO9d8M2E8=",
    "scrypt_id_pepper": "6KTTMPB+0RNV6TREDVD/7r8m/Pm5otK+7eyfaYWTm6s=",
    "scrypt_gpg_pepper": "GlAWMF9rk6eH671IPiu9pBu06PmvKR+LnC+wQI/jqkM=",
    "i18n": {
      "default_locale": "en_US",
      "supported_locales": ["en_US"]
    }
  },
  "journalist_interface": {
    "secret_key": "SGOuAsFRqs145PTueeMyCnMtLotjDldNL08MlzEHWdg=",
    "scrypt_id_pepper": "6KTTMPB+0RNV6TREDVD/7r8m/Pm5otK+7eyfaYWTm6s=",
    "scrypt_gpg_pepper": "GlAWMF9rk6eH671IPiu9pBu06PmvKR+LnC+wQI/jqkM=",
    "i18n": {
      "default_locale": "en_US",
      "supported_locales": ["en_US"]
    }
  }
}
```

Note that some values are duplicated so that in the future we can split the configs into two files, so that the JI won't have access to the SI secrets and vice versa. These are actually all the values that should be configurable from the admin side. The rest should be hardcoded and unchangeable in prod.
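Assuming the two-top-level-sections structure proposed here, the eventual split into per-interface files (so the JI can't read SI secrets and vice versa) could be sketched as:

```python
import json
import os


def split_config(combined_path, out_dir):
    """Write one config file per top-level section of the combined
    JSON, so each interface only needs read access to its own file."""
    with open(combined_path) as f:
        combined = json.load(f)
    paths = {}
    for interface, values in combined.items():
        path = os.path.join(out_dir, interface + ".json")
        with open(path, "w") as f:
            json.dump(values, f, indent=2)
        os.chmod(path, 0o600)  # secrets: owner read/write only
        paths[interface] = path
    return paths
```

The duplication of the pepper values in the combined format is what makes this split lossless: each output file is self-contained, with no cross-references between interfaces.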
While working on this, I have hit annoying errors related to the fact that not all tests use pytest fixtures. I'm going to call #2877 a soft blocker on this, because I haven't been able to figure out a workaround, though one may be possible.
question: are site-specific but theoretically changeable things, like domain, configured in this manner? |
Per #3850 (comment) our bias is to incrementally move more options into the DB so they can be managed through the web interface. We may still ultimately want to switch to a more maintainable format for other configuration values, so leaving this issue open for discussion (though we may close in favor of more narrowly scoped improvements). |
The SecureDrop web application's configuration is read from config.py, which is generated from a template, config.py.example, by the initialize_securedrop_app task during the initial run of the Ansible playbooks.

During the initial development of the SecureDrop auto-upgrade strategy, we faced a dilemma in managing config.py. The dilemma is mostly captured in comments (1, 2) in the relevant Ansible role, but I will summarize it here for clarity.

At the time of development, the most straightforward way to provide an auto-upgrade capability for SecureDrop was to use Debian packages. The SecureDrop application code is provided by the securedrop-app-code package, and production instances check https://apt.freedom.press periodically for updates to this package. Files that contain instance-specific configuration cannot be included in the Debian package because they would be overwritten on update.

Additionally, the instance-specific configuration depends on configuration values that must be provided by the SecureDrop administrator during installation, so the task of generating and maintaining the web application configuration fell to the Ansible playbooks, which are configured with vars by the administrator during installation and run from the Admin Workstation.

Finally, there are some configuration values which are automatically generated by the Ansible playbooks. For security, these values must be unique, random, and generated from a secure CSPRNG (e.g. /dev/urandom). These values are:

- source secret key: the Flask SECRET_KEY for the Source Interface, used as an HMAC key to secure Flask's sessions from tampering.
- journalist secret key: the same, but for the Journalist Interface.
- scrypt id pepper: a random value used as an instance-wide salt (aka "pepper") for deriving sources' filesystem IDs from their codenames.
- scrypt gpg pepper: a random value used as an instance-wide salt (aka "pepper") for deriving sources' reply keypair passphrases from their codenames.

Once set, it is critical that these values not be modified during the lifetime of the instance. To be specific:

- The source and journalist secret keys could be modified with only minor disruption to the operation of the instance. Changing these values would invalidate all current sessions, which means any currently logged-in users would encounter errors and be required to log in again.
- The scrypt id and gpg peppers cannot be modified without severe disruption to the operation of the instance. Since these values are required to derive source IDs and reply keypair passphrases from their codenames, modifying them would make it impossible for sources who had created their accounts prior to the modification to access their accounts.

The Ansible playbooks do not copy these values back to the Admin Workstation as part of the installation process, although they are included in backups made by the Ansible backup role. Because it is so important to avoid accidentally overwriting these automatically generated values, the Ansible playbooks currently take a defensive approach and avoid interacting with config.py at all if it is found to already exist on the system.

Automated migration of the SecureDrop web application configuration is prevented by the confluence of all of these factors. In order to make it possible, we need to decide:

- What constitutes the configuration: it is not just config.py, but also any files that may be referenced by config.py, e.g. the GPG key whose fingerprint is referenced in JOURNALIST_KEY.
- Whether to allow the securedrop-app-code package to automatically migrate the web application configuration. On the other hand, this creates a potential need for conflict resolution between disparate representations of an instance's configuration.

Clearly, this is a complex issue and further discussion and design effort is warranted. I think it should be possible to break this issue up into discrete sub-issues and resolve it incrementally, but we will need to carefully devise a plan to do so.
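Why the peppers must never change can be shown with a toy derivation. This uses hashlib.scrypt directly; the parameters and encoding here are illustrative, not SecureDrop's actual scheme:

```python
import hashlib


def filesystem_id(codename: str, pepper: bytes) -> str:
    # The instance-wide pepper acts as the scrypt salt; the codename is
    # the only per-source input, so the pepper fully determines the
    # mapping from codenames to filesystem IDs.
    return hashlib.scrypt(codename.encode(), salt=pepper,
                          n=2**14, r=8, p=1, dklen=32).hex()


id_v1 = filesystem_id("correct horse battery staple", b"pepper-1")
id_v2 = filesystem_id("correct horse battery staple", b"pepper-2")
# Same codename, different pepper: the derived ID changes, so the
# source's existing submissions can no longer be located.
assert id_v1 != id_v2
```

The same argument applies to the GPG pepper: a changed pepper yields a different derived passphrase, permanently locking the reply keypair.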