feat: Add the ability to add extra settings sources #2107

kozlek · 2020-11-09T15:45:45Z

Change Summary

refactor BaseSettings internal logic
expose filter_relevant_env_vars and load_env_vars_from_source functions to ease creation of external sources plugins
add extra_settings_sources ModelConfig key for BaseSettings

Related issue number

#2106

Checklist

Unit tests for the changes exist
Tests pass on CI and coverage remains at 100%
Documentation reflects the changes where applicable
changes/<pull request or issue id>-<github username>.md file added describing change
(see changes/README.md for details)

codecov · 2020-11-09T15:47:33Z

Codecov Report

Merging #2107 (25f0767) into master (13a5c7d) will decrease coverage by 0.11%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##            master    #2107      +/-   ##
===========================================
- Coverage   100.00%   99.88%   -0.12%     
===========================================
  Files           21       22       +1     
  Lines         4199     4351     +152     
  Branches       854      875      +21     
===========================================
+ Hits          4199     4346     +147     
- Misses           0        5       +5

Impacted Files	Coverage Δ
pydantic/env_settings.py	`100.00% <100.00%> (ø)`
pydantic/types.py	`100.00% <0.00%> (ø)`
pydantic/fields.py	`100.00% <0.00%> (ø)`
pydantic/schema.py	`100.00% <0.00%> (ø)`
pydantic/typing.py	`100.00% <0.00%> (ø)`
pydantic/_hypothesis_plugin.py	`95.68% <0.00%> (ø)`

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

samuelcolvin

I think this could be useful, but I think it needs to be rethought. It will also need tests to pass and lots of docs.

samuelcolvin · 2020-11-29T18:57:52Z

pydantic/env_settings.py

        )

+    def filter_relevant_env_vars(self, env_vars: Mapping[str, Optional[str]]) -> Dict[str, Optional[str]]:


this method can be private. This also makes sure that it can't conflict with settings.

same with most of the methods below.

can you move this down to where the code was before? If so hopefully we can reduce the number of changes and keep history easy to track.

I'm going to revert this modification as the new implementation will only use load_env_vars_from_source.
Also, load_env_vars_from_source will be made private to avoid conflicts as you requested.

samuelcolvin · 2020-11-29T23:45:34Z

pydantic/env_settings.py

+        for field in self.__fields__.values():
+            for env_name in field.field_info.extra['env_names']:
+                value = loader(env_name)
+                if value != undefined:


surely we should use None or KeyError here?

I would prefer to keep undefined, as it allows a source to support any type, including None.
As for the KeyError check, I think it's too specific to dict-based sources: with a REST API based source, a call to loader can trigger an API call that simply returns a flat value.
What do you think?
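
To illustrate the argument for a sentinel, here is a minimal sketch. The names UNDEFINED, rest_like_loader and collect are hypothetical and do not come from the PR; the sketch only shows why a dedicated sentinel lets a source return None as a real value without relying on KeyError.

```python
from typing import Any, Callable, Dict, Iterable

# Hypothetical sentinel: a unique object that no source can return by accident.
UNDEFINED = object()


def rest_like_loader(name: str) -> Any:
    """Stand-in for a source that returns one flat value per call."""
    remote = {"TIMEOUT": None, "HOST": "example.org"}  # None is a legitimate value
    return remote.get(name, UNDEFINED)


def collect(names: Iterable[str], loader: Callable[[str], Any]) -> Dict[str, Any]:
    """Keep every value the loader knows about, including None."""
    found: Dict[str, Any] = {}
    for name in names:
        value = loader(name)
        if value is not UNDEFINED:  # identity check, so None passes through
            found[name] = value
    return found


values = collect(["TIMEOUT", "HOST", "PORT"], rest_like_loader)
```

The sketch compares with `is not` rather than `!=`, which is the usual choice for sentinel objects because it cannot be affected by a value's own equality method.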

samuelcolvin · 2020-11-29T23:54:03Z

pydantic/env_settings.py

@@ -44,19 +45,58 @@ def _build_values(
        _env_file_encoding: Optional[str] = None,
        _secrets_dir: Union[Path, str, None] = None,
    ) -> Dict[str, Any]:
+        extra_settings = [source(self) for source in reversed(self.__config__.extra_settings_sources)]


why the reverse?

Also shouldn't this be

Suggested change

extra_settings = [source(self) for source in reversed(self.__config__.extra_settings_sources)]

extra_settings = [self.load_env_vars_from_source(source) for source in reversed(self.__config__.extra_settings_sources)]

?

I think this would reduce the logic and complexity in your custom loaders/sources.

As explained in my latest comment, the settings source sequence is expressed from the highest priority to the lowest one.
We need to reverse it to send the settings dicts to deep_update from the lowest priority to the highest one.

In the next revision, I'll move to your implementation as all sources can be treated as case 2 sources.
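
The ordering argument can be sketched as follows. This is an illustration only: the deep_update below is a simplified local stand-in written for this example, not pydantic's own helper, and the source dicts are made up.

```python
from typing import Any, Dict, List


def deep_update(base: Dict[str, Any], *updates: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge `updates` into a copy of `base`; later dicts win."""
    merged = dict(base)
    for update in updates:
        for key, value in update.items():
            if isinstance(merged.get(key), dict) and isinstance(value, dict):
                merged[key] = deep_update(merged[key], value)
            else:
                merged[key] = value
    return merged


# Sources are declared from highest to lowest priority.
sources: List[Dict[str, Any]] = [
    {"db": {"host": "from-init"}},                                 # highest
    {"db": {"host": "from-env", "port": 5432}},
    {"db": {"port": 1111, "user": "from-file"}, "debug": False},   # lowest
]

# Reverse so the highest-priority dict is applied last and overrides the rest.
settings = deep_update({}, *reversed(sources))
```

Without the reversal, the lowest-priority source would be applied last and would overwrite values from the sources that are supposed to win.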

samuelcolvin · 2020-11-30T18:55:22Z

please see pydantic/pydantic-settings#32 and its PR #2154.

We need a way to make customising the priority of settings sources easier as well as adding new ones. I would propose something like this:

from typing import Tuple, Any, Callable
from pydantic import BaseSettings

SettingsSourceCallable = Callable[[str], Any]

class Settings(BaseSettings):
    foo: str
    bar: str

    class Config:
        @classmethod
        def customise_sources(
            cls,
            init_settings: SettingsSourceCallable,
            env_settings: SettingsSourceCallable,
            file_secret_settings: SettingsSourceCallable,
        ) -> Tuple[SettingsSourceCallable, ...]:
            return init_settings, env_settings, file_secret_settings

Here customise_sources performs the default behaviour, but you could change the behaviour by altering the order of the functions returned, or adding your own. Each function would then be passed to load_env_vars_from_source to build dicts which are then merged using deep_update like currently in _build_values.

Advantages of this approach:

customise_sources (or whatever we end up calling it) is a public function without the risk of clashing with fields
this allows (almost) complete flexibility with a reasonably simple interface

Disadvantages:

it's more verbose and complex (though arguably more expressive) than the env_has_prio option suggested in pydantic-settings#32 (BaseSettings: Customization of field value priority via Config)

kozlek · 2020-11-30T22:32:46Z

First of all, thanks for your feedback 🙏

I agree with your solution involving a customise_sources classmethod. It gives the user full control, and the added complexity is not a problem because most people will stick with the standard env priority and sources.

SettingsSourceCallable implementation

About the implementation of SettingsSourceCallable, I thought we could split sources into two main categories:

Sources that are able to provide a full dict with all of the source's variables. This is the case for the .env file or a JSON config file: it is more efficient to load the full file and parse it once than to read it for each settings field.
Sources that load variables one by one. This is the case for Docker secrets or any proprietary API that doesn't provide a list endpoint: we may want to load a variable only if it is absolutely necessary, that is, if the variable cannot be found in higher-priority sources.
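
The two categories can be sketched like this. Everything here is hypothetical (MISSING, bulk_source_factory, per_key_source, resolve); it is not the PR's code, only an illustration of why a per-key source should be consulted only when higher-priority sources have no value.

```python
from typing import Any, Callable, Dict, List

MISSING = object()  # hypothetical sentinel for "this source has no value"

calls: List[str] = []  # records which names reached the expensive source


def bulk_source_factory(data: Dict[str, Any]) -> Callable[[str], Any]:
    """Category 1: everything is parsed once up front, then lookups are cheap."""
    return lambda name: data.get(name, MISSING)


def per_key_source(name: str) -> Any:
    """Category 2: each lookup is a separate, potentially expensive call."""
    calls.append(name)
    remote = {"API_KEY": "s3cr3t"}
    return remote.get(name, MISSING)


def resolve(name: str, sources: List[Callable[[str], Any]]) -> Any:
    """Try sources from highest to lowest priority, stopping at the first hit."""
    for source in sources:
        value = source(name)
        if value is not MISSING:
            return value
    return MISSING


sources = [bulk_source_factory({"HOST": "localhost"}), per_key_source]
host = resolve("HOST", sources)        # answered by the bulk source
api_key = resolve("API_KEY", sources)  # falls through to the per-key source
```

Because resolve stops at the first hit, the per-key source is never asked for HOST, which is the saving the second category is meant to allow.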

In the first revision of the PR, I left the filter_relevant_env_vars and load_env_vars_from_source methods public, as utilities to help users build custom sources.
filter_relevant_env_vars separates useful variables from irrelevant ones thanks to its access to self.__fields__. If we don't filter out the extra variables, we are forced to use extra='ignore'. It also gives access to the field instance and its is_complex property, which is useful for loading nested settings.
load_env_vars_from_source uses the same access to load only the necessary fields.

However, I see now we can simplify the workflow if the custom source handles the "bulk loading" itself (case 1).

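import json
from functools import cached_property
from pathlib import Path

from pydantic import BaseSettings
from pydantic.fields import ModelField

class JSONConfigSource:
    def __init__(self, path: Path = Path('config.json')):
        self.config_path = path

    @cached_property
    def json_config(self):
        # parse the file once and reuse the result for every field
        return json.loads(self.config_path.read_text())

    def __call__(self, env_name: str, field: ModelField, settings: BaseSettings):
        # `undefined` is the sentinel from this PR's draft meaning "no value in this source"
        return self.json_config.get(env_name, undefined)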

In this case, all sources can be treated as case 2 and we can simplify the workflow as suggested. We still have to slightly edit the load_env_vars_from_source method to handle the is_complex case, but that's not complicated.
Even if most source implementations can work with only the env_name, I think we should pass both the field and settings instances to the SettingsSourceCallable as they provide useful context. I probably won't use them in my own implementations but others may.

Settings priority

As you mentioned in your comments, settings priority seems to be important for some people.
I think your solution fits their needs perfectly and goes even further, as it makes it possible to disable some of the built-in sources!
As for how the desired order is expressed, I think we agree that settings sources are declared from the highest to the lowest priority.
That's why we have to reverse the sequence when passing the resulting dicts to deep_update.

Custom source config

Built-in sources like .env or Docker secrets can be configured directly using BaseSettings(_env_file=".env", _secrets_dir="/secrets/").
For now, custom sources can be configured using a class-based callable:

from pathlib import Path
from typing import Any, Callable, Tuple

from pydantic import BaseSettings
from .ext_sources import JSONConfigSource  # defined earlier

SettingsSourceCallable = Callable[[str], Any]

class Settings(BaseSettings):
    foo: str
    bar: str

    class Config:
        @classmethod
        def customise_sources(
            cls,
            init_settings: SettingsSourceCallable,
            env_settings: SettingsSourceCallable,
            file_secret_settings: SettingsSourceCallable,
        ) -> Tuple[SettingsSourceCallable, ...]:
            json_config_settings = JSONConfigSource(path=Path("config.json"))
            # here we disable both env_settings and file_secret_settings
            return init_settings, json_config_settings

settings = Settings()

That's not ideal but it works so we can start like this.

Based on your comments and the related issues, the whole feature appears clearer to me. Let me know if you want to add / change something else. I'll try to do the necessary changes by the end of the week, including docs & tests 🙂

kozlek · 2020-12-04T19:36:22Z

By defining a common type for all settings sources (SettingsSourceCallable = Callable[[str], Any]), we treat them all as equal.
But as of today, this is not true.

Init settings

init_settings has several particularities that make it different from the other sources:

all init variables must be loaded even if they are not defined in the model fields: depending on the extra setting, the model will raise a ValidationError (default, Extra.forbid) or ignore the fields.
init variables are loaded using field.alias while other sources use env_names.

More than simply breaking unit tests, ignoring these differences would introduce breaking changes, which we might want to avoid.
Multiple solutions are possible:

1. Accept the following breaking changes:

   Extra init kwargs will always be ignored (the extra setting will be useless)
   Init kwargs will have to be named using the env_name rather than the field.name (not really ideal 😕)

2. Treat InitSettingsSource as a special case: either force it to be the first settings source (not acceptable for pydantic-settings#32, "BaseSettings: Customization of field value priority via Config") or always load all variables for this source. This source would also use a custom load_env_from_source method that uses field.name instead of env_names.
3. Add field.name to the env_names, so both env_names and field.name will be used: this could lead to variables being loaded unexpectedly after a pydantic upgrade.
4. Move the loading strategy inside the SettingsSourceCallable: this will work well and increase the , but it will break the DRY pattern offered by the load_env_from_source method.

For now, I'll start my implementation with solution 4), as it is the most complete and the closest to the current source code.
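
As a rough sketch of what "moving the loading strategy inside the source" could look like: each source receives a description of the fields and returns a complete dict, so the init source and the env source are free to use different lookup rules. All names below (Fields, Source, make_init_source, make_env_source, build_values) are hypothetical and are not the PR's actual API.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical field description: (field name, env names to look up).
Fields = List[Tuple[str, List[str]]]
Source = Callable[[Fields], Dict[str, Any]]


def make_init_source(init_kwargs: Dict[str, Any]) -> Source:
    """Init kwargs are keyed by field name and passed through whole,
    so extra keys survive and can still be rejected or ignored later."""
    return lambda fields: dict(init_kwargs)


def make_env_source(environ: Dict[str, str]) -> Source:
    """Environment values are looked up by env name, only for known fields."""
    def source(fields: Fields) -> Dict[str, Any]:
        found: Dict[str, Any] = {}
        for field_name, env_names in fields:
            for env_name in env_names:
                if env_name in environ:
                    found[field_name] = environ[env_name]
                    break
        return found
    return source


def build_values(fields: Fields, sources: List[Source]) -> Dict[str, Any]:
    """`sources` runs from highest to lowest priority; apply lowest first."""
    values: Dict[str, Any] = {}
    for source in reversed(sources):
        values.update(source(fields))
    return values


fields: Fields = [("foo", ["APP_FOO"]), ("bar", ["APP_BAR"])]
values = build_values(
    fields,
    [
        make_init_source({"foo": "init", "unexpected": 1}),
        make_env_source({"APP_FOO": "env", "APP_BAR": "env", "OTHER": "x"}),
    ],
)
```

The cost mentioned above is visible here: the per-field loop lives inside the env source, so every source that needs it has to repeat it instead of sharing one loader.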

kozlek · 2020-12-05T01:31:32Z

I've modified my implementation to include a way of customising the priority of settings sources.
I had to make some tradeoffs to avoid breaking changes, as explained in my previous post; let me know if you want me to change something.
I also added some docs with examples to explain how to reorder the settings and how to add or remove settings sources.

samuelcolvin

this is looking great, just a few things to fix.

samuelcolvin · 2020-12-31T10:57:24Z

docs/usage/settings.md

+If the default order of priority doesn't match your needs, it's possible to change it by overriding a config method:
+
+```py
+class Settings(BaseSettings):


can you move this code (and the rest below) into python files in examples.

you might need to add one or two files instead of ...

Yep, it's cleaner this way 🙂

docs/usage/settings.md

pydantic/env_settings.py

tests/test_settings.py

michaeloliverx · 2021-01-16T12:31:02Z

This is a really useful feature. I would love to be able to load configuration via a local .toml config file while allowing overriding via environment variables.

PrettyWood

This customization will be very handy and is very explicit! LGTM! 👍
Samuel may still have some remarks though

PrettyWood

You just forgot to test your custom __repr__ for ...SettingsSource and to add a change file

kozlek · 2021-01-20T09:28:01Z

> This customization will be very handy and is very explicit! LGTM! 👍
> Samuel may still have some remarks though

Thanks 🙏

> You just forgot to test your custom __repr__ for ...SettingsSource and to add a change file

You're right, I added the missing test and a change file to my latest commit. Tell me if we are missing anything else 🙂

changes/2107-kozlek.md

Co-authored-by: Eric Jolibois <em.jolibois@gmail.com>

samuelcolvin · 2021-02-11T16:56:05Z

thanks so much. I've extended the docs slightly as the original docs on this were pretty minimal.

kozlek · 2021-02-11T16:59:15Z

No problem, thanks for taking the time to merge this one 👌
Don't hesitate to ping me about settings-related issues & feature requests, it will be a pleasure to help when I have some time 😉

DomWeldon · 2021-02-13T15:18:24Z

As a thank-you to everyone on this thread, I spent yesterday browsing the master branch and saw this hook before I saw this issue. I was just testing why it didn't work now and saw that it was because this feature isn't released yet!

It's a really nice implementation and fits my needs exactly - thanks. Looking forward to this hitting a new release.

In the meantime, my draft implementation - to pull some secrets from SSM in prod - is at the gist below.

https://gist.github.com/DomWeldon/ce7e070283d97368cd9abc5be71b247d

kozlek mentioned this pull request Nov 9, 2020

Add extra settings sources to BaseSettings #2106

Closed

4 tasks

samuelcolvin reviewed Nov 29, 2020

samuelcolvin mentioned this pull request Nov 30, 2020

Add config to prioritize environment variables in BaseSettings #2154

Closed

4 tasks

samuelcolvin mentioned this pull request Nov 30, 2020

BaseSettings: Customization of field value priority via Config pydantic/pydantic-settings#32

Closed

4 tasks

kozlek force-pushed the extra-settings-sources branch from 3d1be86 to 7765fc0 Compare December 3, 2020 20:54

kozlek force-pushed the extra-settings-sources branch 2 times, most recently from c21de41 to e8d65f6 Compare December 5, 2020 01:25

samuelcolvin reviewed Dec 31, 2020

feat: Add the ability to add extra settings sources

d800fe5

kozlek force-pushed the extra-settings-sources branch from e8d65f6 to 8025750 Compare January 10, 2021 15:12

doc: Document "customise settings sources" feature

75b3faf

kozlek force-pushed the extra-settings-sources branch from 8025750 to 75b3faf Compare January 10, 2021 15:24

PrettyWood added the ready for review label Jan 19, 2021

PrettyWood approved these changes Jan 19, 2021

PrettyWood requested a review from samuelcolvin January 19, 2021 17:18

PrettyWood requested changes Jan 19, 2021

PrettyWood added awaiting author revision and removed ready for review labels Jan 19, 2021

PrettyWood removed the request for review from samuelcolvin January 19, 2021 17:21

tests: Add missing test and add change file

9bd413f

PrettyWood reviewed Jan 20, 2021

Update changes/2107-kozlek.md

266c0d8

Co-authored-by: Eric Jolibois <em.jolibois@gmail.com>

PrettyWood approved these changes Jan 20, 2021

PrettyWood added ready for review and removed awaiting author revision labels Jan 20, 2021

samuelcolvin added 3 commits February 11, 2021 14:03

improve docs for settings customise_sources

b102c38

fix docs building

28e709a

fix test :-(

25f0767

samuelcolvin merged commit 1155de8 into pydantic:master Feb 11, 2021

frederikaalund mentioned this pull request Mar 13, 2021

Command-line argument parsing #756

Closed

gerazenobi mentioned this pull request Mar 27, 2021

Allow BaseSettings to accept multiple .env files in Config as a list #1497

Closed
