Add extra settings sources to BaseSettings #2106

Closed
4 tasks done
kozlek opened this issue Nov 9, 2020 · 2 comments

@kozlek
Contributor

kozlek commented Nov 9, 2020

Checks

  • I added a descriptive title to this issue
  • I have searched (google, github) for similar issues and couldn't find anything
  • I have read and followed the docs and still think this feature/change is needed
  • After submitting this, I commit to one of:
    • Look through open issues and helped at least one other person
    • Hit the "watch" button on this repo to receive notifications and I commit to help at least 2 people that ask questions in the future
    • Implement a Pull Request for a confirmed bug

Feature Request

As of today, Pydantic supports loading settings from multiple sources using BaseSettings:

  • passed directly to the Settings instance
  • environment variables
  • .env file
  • docker secrets

This feature is great and saves me a lot of time during development (using .env) and in some production environments using the docker secrets feature 🙂🙏
However, some companies manage env variables / secrets differently. Some store them in a HashiCorp instance or in key-value stores like etcd or Consul, and others use a dedicated API to retrieve them at startup.

Out of the box solution

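from typing import Dict, Optional

from pydantic import BaseSettings


class Settings(BaseSettings):
    foo: str
    bar: str

    class Config:
        # Ignore keys from the external source that are not declared as fields
        extra = 'ignore'


def retrieve_env_vars_from_external_source() -> Dict[str, Optional[str]]:
    # Fetch the values from the external store or API
    ...


env_vars = retrieve_env_vars_from_external_source()
settings = Settings(**env_vars)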

Here we simply retrieve our env variables and pass them to the Settings instance. We have to set extra = 'ignore' in the model's Config to prevent a ValidationError if the external source holds extra env variables.

However, we lose the precedence feature of Pydantic BaseSettings: by injecting the env variables in the first position of the BaseSettings precedence chain, we can't override a variable using the environment.
Also, it's very complicated to handle multiple external sources as we have to merge them first.
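
To illustrate the merging problem, here is a rough sketch of what the manual workaround tends to look like. It uses only the standard library; `merge_external_sources` and the sample dicts are hypothetical names made up for this example, and the upper-case environment lookup is a simplification of what BaseSettings really does.

```python
import os
from typing import Dict, Iterable, Mapping, Optional


def merge_external_sources(
    field_names: Iterable[str],
    sources: Iterable[Mapping[str, Optional[str]]],
    environ: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Merge sources (lowest priority first), then let the environment win."""
    wanted = {name.lower() for name in field_names}
    merged: Dict[str, str] = {}
    for source in sources:
        for key, value in source.items():
            # Keep only declared fields and skip missing values
            if key.lower() in wanted and value is not None:
                merged[key.lower()] = value
    # Re-apply the environment by hand: values passed as Settings(**merged)
    # would otherwise take priority over it.
    for name in wanted:
        env_value = environ.get(name.upper())
        if env_value is not None:
            merged[name] = env_value
    return merged


# Hypothetical sample data standing in for real external stores
etcd_values = {"FOO": "from-etcd", "BAR": "from-etcd", "UNRELATED": "x"}
json_values = {"foo": "from-json"}
fake_environ = {"BAR": "from-env"}

merged = merge_external_sources(["foo", "bar"], [json_values, etcd_values], fake_environ)
print(merged)
```

Every project ends up re-implementing field filtering and the environment override this way, which is exactly the logic BaseSettings already has internally.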

Extending Pydantic BaseSettings

Pydantic already has a full precedence mechanism in the _build_values function:

class BaseSettings(BaseModel):
    """
    Base class for settings, allowing values to be overridden by environment variables.

    This is useful in production for secrets you do not wish to save in code, it plays nicely with docker(-compose),
    Heroku and any 12 factor app design.
    """

    def __init__(
        __pydantic_self__,
        _env_file: Union[Path, str, None] = env_file_sentinel,
        _env_file_encoding: Optional[str] = None,
        _secrets_dir: Union[Path, str, None] = None,
        **values: Any,
    ) -> None:
        # Uses something other than `self` the first arg to allow "self" as a settable attribute
        super().__init__(
            **__pydantic_self__._build_values(
                values, _env_file=_env_file, _env_file_encoding=_env_file_encoding, _secrets_dir=_secrets_dir
            )
        )

    def _build_values(
        self,
        init_kwargs: Dict[str, Any],
        _env_file: Union[Path, str, None] = None,
        _env_file_encoding: Optional[str] = None,
        _secrets_dir: Union[Path, str, None] = None,
    ) -> Dict[str, Any]:
        return deep_update(
            self._build_secrets_files(_secrets_dir), self._build_environ(_env_file, _env_file_encoding), init_kwargs
        )

The solution would be to allow more arguments to be passed to the deep_update call, in the desired order (from lowest to highest priority).
These arguments would be Callable[[BaseSettings], Dict[str, Optional[str]]]. It's important to pass a reference to the BaseSettings instance in order to extract the list of relevant fields and the fields configuration (env_names, is_complex, ...).
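
For readers unfamiliar with the precedence rule, the sketch below shows the idea behind the deep_update call: later mappings override earlier ones, and nested dicts are merged key by key. `deep_merge` is a simplified stand-in written for this example, not Pydantic's own implementation, and the sample values are made up.

```python
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], *overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `base` updated by each override in turn (last one wins)."""
    merged = dict(base)
    for override in overrides:
        for key, value in override.items():
            if isinstance(merged.get(key), dict) and isinstance(value, dict):
                # Both sides are dicts: merge them instead of replacing wholesale
                merged[key] = deep_merge(merged[key], value)
            else:
                merged[key] = value
    return merged


# Lowest priority first, mirroring (secrets, environ, init_kwargs)
secrets = {"foo": "secret", "db": {"host": "secret-host", "port": 5432}}
environ = {"db": {"host": "env-host"}}
init_kwargs = {"foo": "init"}

values = deep_merge(secrets, environ, init_kwargs)
print(values)
```

An extra source would simply become one more positional argument, placed according to the priority it should have.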

An example usage would be the following:

import json
from typing import Dict, Optional

import etcd3
from pydantic import BaseSettings


def load_env_vars_from_json_config_file(settings: BaseSettings) -> Dict[str, Optional[str]]:
    with open("config.json") as f:
        env_vars = json.load(f)
    # filter_relevant_env_vars is part of this proposal, not an existing method
    return settings.filter_relevant_env_vars(env_vars)


class ETCDSettingsSource:
    def __init__(self, host: str = 'localhost', port: int = 2379):
        self.host = host
        self.port = port

    def __call__(self, settings: BaseSettings) -> Dict[str, Optional[str]]:
        client = etcd3.client(host=self.host, port=self.port)
        key_prefix = '/foo/bar/'
        # Map '/foo/bar/<NAME>' keys to lower-case field names
        env_vars = {
            metadata.key.decode().removeprefix(key_prefix).lower(): value.decode()
            for value, metadata in client.get_prefix(key_prefix)
        }
        return settings.filter_relevant_env_vars(env_vars)


class Settings(BaseSettings):
    foo: str
    bar: str

    class Config:
        extra_settings_sources = [ETCDSettingsSource(host='etcd-server.lan'), load_env_vars_from_json_config_file]

Here the env variables are loaded with the following precedence, highest first:

  1. passed directly to the Settings instance
  2. environment variables
  3. ETCD key-value store
  4. config.json

This allows us to re-use the precedence mechanism and opens the door to BaseSettings plugins. In my opinion, plugins would be a real benefit as Pydantic cannot support all secrets storage internally.
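
As a rough illustration of that ordering, the sketch below combines extra sources with the built-in ones. It is standard-library only and deliberately simplified: `build_values` and the sample sources are hypothetical, and the sources take no arguments here, whereas the proposal passes them the BaseSettings instance.

```python
from typing import Any, Callable, Dict, List

# Simplified source type: a callable returning a dict of values
Source = Callable[[], Dict[str, Any]]


def build_values(
    init_kwargs: Dict[str, Any],
    environ_values: Dict[str, Any],
    extra_sources: List[Source],
) -> Dict[str, Any]:
    """Apply sources from lowest to highest priority so later ones win."""
    values: Dict[str, Any] = {}
    # Extra sources are listed highest-priority first, so walk them in reverse
    for source in reversed(extra_sources):
        values.update(source())
    values.update(environ_values)
    values.update(init_kwargs)
    return values


# Hypothetical sources standing in for the etcd and config.json loaders
def etcd_source() -> Dict[str, Any]:
    return {"foo": "etcd", "bar": "etcd"}


def json_source() -> Dict[str, Any]:
    return {"foo": "json", "bar": "json", "baz": "json"}


result = build_values(
    init_kwargs={},
    environ_values={"bar": "env"},
    extra_sources=[etcd_source, json_source],
)
print(result)
```

Here `bar` comes from the environment, `foo` from the etcd stand-in and `baz` from the JSON stand-in, matching the order listed above.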

PR #2107 implements this example.

I will be happy to answer comments, add missing documentation and discuss this feature proposal further 🤗

@nymous

nymous commented Feb 24, 2021

Thank you so much for this issue (and the PR that just got merged! 🎉)

I wanted to ask the same almost a year ago, when I started working on a POC to extend Pydantic settings with HashiCorp Vault (see https://github.com/nymous/pydantic-vault). I ended up overriding the _build_values method, thinking it would be relatively safe as this was the only method I changed, but it broke with Pydantic 1.7 and the support for Docker secrets.
I have found another project that went the same route (https://github.com/kewtree1408/pydantic-azure-secrets); I should point them here and tell them about the upcoming new way ^^

I still have a few questions: now that we can extend how Pydantic loads variables, what would be the best way to write a reusable provider (to be published on PyPI)? Should I just export a vault_config_settings_source() function that users can insert in their own BaseSettings class? Should I also provide a VaultBaseSettings class that they can inherit from, and that already has this custom source configured? For the properties, should I keep using the kwargs from the Field() function as I have written in my readme, or subclass it as VaultField?
Finally, how safe will this source customization be with regard to Pydantic updates? From the example in the PR (https://github.com/samuelcolvin/pydantic/pull/2107/files#diff-401519a4e7dfa8678034b93ab8841dc323994f8e4457151118a02814d53aaaaa) I can see that the customise_sources method takes init_settings, env_settings and file_secret_settings arguments. What will happen if Pydantic ever supports another way of loading values?
(All these questions might warrant a new issue, tell me and I will move it ^^')

Anyway, thanks for opening the issue that I never took the time for, thanks for working on the PR, and thanks for Pydantic for being awesome ❤️

@PrettyWood
Member

closed by #2107
