# Multi Environment Config Management

This article introduces the multi-environment config management pattern and shows how to use it to manage config values in your application.

## Why Config Management

Before diving into the multi-environment topic, let's talk about config management itself and the basic requirements it addresses. Imagine you have numerous variables or values scattered throughout your application code, and some of them are reused in multiple places. Instead of duplicating them, declare them in one central location and reference them wherever needed. When a value changes, you only have to update it in a single place, which saves time and effort.

Another crucial aspect is the use of variables instead of hard-coded strings. Declaring variables reduces the chance of typing mistakes: if you misspell a variable name, Python raises a ``NameError`` as soon as that line runs (and most linters and IDEs flag it even earlier), whereas a misspelled string can go unnoticed. It's far better to catch these errors early than to discover them in production.
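To make the difference concrete, here is a minimal sketch. The names ``MAX_RETRIES`` and ``settings`` are hypothetical and exist only for this illustration. A misspelled string key silently falls back to a default, while a misspelled variable name fails loudly.

```python
# Hypothetical names, for illustration only.
MAX_RETRIES = 3

settings = {"max_retries": 3}


def retries_from_dict() -> int:
    # Typo in the string key ("max_retires"): .get() silently returns the default.
    return settings.get("max_retires", 1)


def retries_from_variable() -> int:
    # Typo in the variable name: Python raises NameError when this line runs.
    return MAX_RETIRES  # noqa: F821 (intentional typo)


print(retries_from_dict())  # the typo goes unnoticed and the default is used

try:
    retries_from_variable()
except NameError as e:
    print(f"caught: {e}")
```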

Now, let's talk about managing calculations and logical operations based on configuration values. You've probably encountered code snippets scattered all over your codebase, performing small calculations using these configuration values. It becomes a hassle to track and maintain them in this scattered manner. That's why it's beneficial to centralize these calculations in a dedicated place. By doing so, you can reference these logic snippets in your application code, making it more manageable and maintainable.

In summary, the purpose of config management is to:

1. Improve reusability: Consolidate variables and values to avoid updating multiple code locations.
2. Minimize typing errors: Use variables instead of hard-coded strings so that typos are caught early.
3. Enhance maintainability: Centralize logic and calculations based on configuration values for easier management.

Here are some guidelines to help you determine whether to move a code snippet to config management:

1. Consider declaring a field in the config object if you come across a generic value like int, str, list, or dict, and if:
    - There is a possibility that this value may change in the future. For instance, you may want to set the limit of the number of retries to 3 today but anticipate the need to change it to 5 tomorrow.
    - This value is used in multiple places throughout your application code.
2. If you encounter a Python object that is derived from other values declared in the config object, consider declaring a method or property method within the config object.
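As a rough illustration of the two guidelines above, the sketch below uses a plain dataclass. The class ``AppConfig`` and its fields are hypothetical examples, not part of any library.

```python
import dataclasses
import typing as T


@dataclasses.dataclass
class AppConfig:
    # Guideline 1: plain values that may change or are used in many places.
    max_retries: int = 3
    base_delay_seconds: float = 0.5

    # Guideline 2: a value derived from other config values lives next to them.
    @property
    def retry_delays(self) -> T.List[float]:
        """Exponential backoff delays, one entry per retry."""
        return [self.base_delay_seconds * (2 ** i) for i in range(self.max_retries)]


config = AppConfig()
print(config.retry_delays)  # [0.5, 1.0, 2.0]
```

Changing ``max_retries`` from 3 to 5 then updates every caller of ``retry_delays`` without touching application code.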

## What is Multi Environment Config Management

Usually, the config object is a singleton within a project: you declare it once and use it everywhere. However, enterprise projects often have multiple environments, such as development (dev), integration (int), and production (prod). For example, the database connection string differs between the dev and prod environments. In such cases, you need multiple config objects, each representing a specific environment but all sharing the same data schema. From an object-oriented programming perspective, you can think of each environment's config object as an instance of the same config class.
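The idea of "one class, one instance per environment" can be sketched with nothing but the standard library. The class ``EnvConfig`` and the connection strings below are hypothetical placeholders.

```python
import dataclasses


@dataclasses.dataclass
class EnvConfig:
    """One schema shared by every environment."""
    env_name: str
    db_connection_string: str


# One instance per environment; the values are made up for illustration.
configs = {
    "dev": EnvConfig(
        env_name="dev",
        db_connection_string="postgresql://dev-host:5432/app",
    ),
    "prod": EnvConfig(
        env_name="prod",
        db_connection_string="postgresql://prod-host:5432/app",
    ),
}

# Same attributes everywhere, different values per environment.
for name, cfg in configs.items():
    print(name, cfg.db_connection_string)
```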

## Declare a Data Model for Config Management

The [config_patterns](https://github.com/MacHu-GWU/config_patterns-project) Python library ships with a module for declaring a multi-environment config class. Let's take a look at an example with two environments, ``dev`` and ``prod``, where the config object has two attributes, ``username`` and ``password``.

First, we import [dataclasses](https://docs.python.org/3/library/dataclasses.html) to decorate the config class. Then we import the following classes from the ``config_patterns.patterns.multi_env_json.api`` module (functions and classes defined in the ``api`` module are considered stable public API):

- ``BaseEnvEnum``: the base class of the environment name enum class. It helps you avoid typos when you use environment names in your code.
- ``BaseEnv``: the base class of the per-environment object. This is where you declare constant config values as class attributes and derived config values as methods or properties.
- ``BaseConfig``: the base class of the all-in-one config object. It is a namespace object that holds all the per-environment config objects.

In [14]:
# content of config_define.py
# -*- coding: utf-8 -*-

import typing as T
import os
import dataclasses

from config_patterns.patterns.multi_env_json.api import (
    BaseEnvEnum, # the base class of the environment name enum class
    BaseEnv, # the base class of the per environment object
    BaseConfig, # the base class of the all-in-one config object
)


class EnvEnum(BaseEnvEnum):
    dev = "dev" # development
    prod = "prod" # production


@dataclasses.dataclass
class Env(BaseEnv):
    username: T.Optional[str] = dataclasses.field(default=None)
    password: T.Optional[str] = dataclasses.field(default=None)

    @classmethod
    def from_dict(cls, data: dict):
        """
        This method defines how to create an instance of this class from a dict.

        Example:

            >>> Env.from_dict({"username": "user1", "password": "pass1"})
        """
        return cls(**data)

    @property
    def login_info(self) -> str:
        """
        This is a sample derived attribute.
        """
        return f"Hello {self.username}, please enter your password: "


@dataclasses.dataclass
class Config(BaseConfig):
    @property
    def dev(self) -> Env:
        """
        A shortcut to get the dev environment config object.
        """
        return self.get_env(EnvEnum.dev)

    @property
    def prod(self) -> Env:
        """
        A shortcut to get the prod environment config object.
        """
        return self.get_env(EnvEnum.prod)

    @classmethod
    def get_current_env(cls) -> str:
        """
        You may want a smarter way to determine the current environment.
        For example, you may define the local laptop is ``dev``, and the
        virtual machine is ``prod``.
        """
        if "IS_VM" in os.environ:
            return EnvEnum.prod.value
        else:
            return EnvEnum.dev.value

    @property
    def env(self) -> Env:
        """
        This is a shortcut to get the current environment object.
        """
        return self.get_env(self.get_current_env())
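The ``get_current_env()`` docstring above hints that you may want a smarter detection rule. One possible variation is sketched below as a standalone function. The ``APP_ENV`` variable name, the ``detect_current_env`` function, and the stand-in ``EnvEnum`` (a plain ``enum.Enum``, not ``BaseEnvEnum``) are all assumptions made for this example.

```python
import enum
import os
import typing as T


class EnvEnum(enum.Enum):
    # Stand-in for the article's EnvEnum, built on the standard library only.
    dev = "dev"
    prod = "prod"


def detect_current_env(
    environ: T.Optional[T.Mapping[str, str]] = None,
) -> str:
    """
    Resolve the current environment name, in this order:

    1. an explicit APP_ENV value (hypothetical name), validated by the enum
    2. the IS_VM flag used in the article, which means prod
    3. dev as the fallback
    """
    environ = os.environ if environ is None else environ
    explicit = environ.get("APP_ENV")
    if explicit is not None:
        # EnvEnum(...) raises ValueError for an unknown name such as "prd".
        return EnvEnum(explicit).value
    if "IS_VM" in environ:
        return EnvEnum.prod.value
    return EnvEnum.dev.value


print(detect_current_env({}))                   # dev
print(detect_current_env({"IS_VM": "1"}))       # prod
print(detect_current_env({"APP_ENV": "prod"}))  # prod
```

Passing the mapping as an argument keeps the function easy to test without mutating ``os.environ``.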

## Load Config Data from JSON File

The ``config_patterns`` library supports several backend options for storing config data. This article only covers the local JSON file backend. To learn about other options, such as the AWS S3 backend and the AWS SSM Parameter Store backend, see the documents linked in the Summary section.

In [15]:
# The code below declares a helper function to pretty-print data.
import json
from rich import print as rprint

def jprint(data: dict):
    rprint(json.dumps(data, indent=4))

We have two JSON files: ``config.json`` for storing non-sensitive configuration data and ``config_secret.json`` for storing sensitive configuration data. Below, you can find the contents of these two JSON files. To gain a better understanding of the hierarchical JSON pattern for multi-environment configuration management, I recommend reading the following documents:

- [Hierarchy Json Pattern for Config Management](https://github.com/MacHu-GWU/config_patterns-project/blob/main/example/hierarchy_json_example.ipynb)
- [Separate and Merge Non-Sensitive Data and Secret Data](https://github.com/MacHu-GWU/config_patterns-project/blob/main/example/separate_and_merge_non_sesitive_and_sensitive_data_example.ipynb)


In [16]:
from python_lib.config_paths import (
    path_config_v1,
    path_config_secret_v1,
)

rprint(path_config_v1.read_text())

In [17]:
rprint(path_config_secret_v1.read_text())
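As a rough, self-contained illustration of the separate-and-merge idea, the sketch below writes two small JSON files to a temporary directory and merges them per environment. The file contents and the ``merge_env_data`` helper are hypothetical and simplified. They do not reproduce the library's actual file format or its merge implementation.

```python
import json
import tempfile
from pathlib import Path


def merge_env_data(data: dict, secret_data: dict) -> dict:
    """Merge two {env_name: {key: value}} dicts, environment by environment."""
    merged = {}
    for env_name in sorted(set(data) | set(secret_data)):
        env_data = dict(data.get(env_name, {}))
        env_data.update(secret_data.get(env_name, {}))
        merged[env_name] = env_data
    return merged


with tempfile.TemporaryDirectory() as tmp:
    path_config = Path(tmp) / "config.json"
    path_secret_config = Path(tmp) / "config_secret.json"

    # Hypothetical contents, for illustration only.
    path_config.write_text(json.dumps({
        "dev": {"username": "dev.user"},
        "prod": {"username": "prod.user"},
    }))
    path_secret_config.write_text(json.dumps({
        "dev": {"password": "dev.password"},
        "prod": {"password": "prod.password"},
    }))

    merged = merge_env_data(
        json.loads(path_config.read_text()),
        json.loads(path_secret_config.read_text()),
    )

print(json.dumps(merged, indent=4))
```

Keeping secrets in a separate file means the non-sensitive file can be committed to Git while the secret file stays out of version control.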

The ``config_patterns.patterns.multi_env_json.api.BaseConfig.read()`` method creates an all-environments-in-one config object that serves as a namespace for accessing the per-environment config objects. It takes two mandatory arguments:

- ``env_class``: the class (not an instance) of the per-environment config object, so the namespace knows what object to create for each environment.
- ``env_enum_class``: the environment name enum class (not an instance), so the namespace knows the list of valid environment names.

It takes additional arguments depending on the backend. For the local JSON file backend, it takes two additional arguments:

- ``path_config``: the path to the JSON file where you store non-sensitive data
- ``path_secret_config``: the path to the JSON file where you store sensitive data


In [18]:
config = Config.read(
    env_class=Env,
    env_enum_class=EnvEnum,
    path_config=path_config_v1,
    path_secret_config=path_config_secret_v1,
)
rprint(config)

## Access Your Config Values in Application Code

In your application code, you create the config object by reading from the config storage, and then use that Python object to access the config values.

The ``BaseConfig`` class has a ``get_env()`` method that returns the per-environment config object. It takes one mandatory argument, ``env_name``, which can be a string or a ``BaseEnvEnum`` instance. In this example, we also create some property shortcuts to access the per-environment config objects.

```python
@dataclasses.dataclass
class Config(BaseConfig):
    @property
    def dev(self) -> Env:
        """
        A shortcut to get the dev environment config object.
        """
        return self.get_env(EnvEnum.dev)

    @property
    def prod(self) -> Env:
        """
        A shortcut to get the prod environment config object.
        """
        return self.get_env(EnvEnum.prod)

    @classmethod
    def get_current_env(cls) -> str:
        """
        You may want a smarter way to determine the current environment.
        For example, you may define the local laptop is ``dev``, and the
        virtual machine is ``prod``.
        """
        if "IS_VM" in os.environ:
            return EnvEnum.prod.value
        else:
            return EnvEnum.dev.value

    @property
    def env(self) -> Env:
        """
        This is a shortcut to get the current environment object.
        """
        return self.get_env(self.get_current_env())
```

In [19]:
config.dev

Env(project_name='my_project', env_name='dev', username='dev.user.v1', password='dev.password')

In [20]:
config.prod

Env(project_name='my_project', env_name='prod', username='prod.user.v1', password='prod.password')

In [21]:
config.env

Env(project_name='my_project', env_name='dev', username='dev.user.v1', password='dev.password')

In [22]:
config.get_env(EnvEnum.prod)

Env(project_name='my_project', env_name='prod', username='prod.user.v1', password='prod.password')

In [23]:
config.env.username

'dev.user.v1'

In [24]:
config.env.password

'dev.password'

In [25]:
config.env.login_info

'Hello dev.user.v1, please enter your password: '
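To make the string-or-enum lookup concrete, here is a minimal standalone sketch. ``MiniConfig`` is a hypothetical stand-in, not the library's ``BaseConfig``, and normalizing the name through ``EnvEnum(...)`` is an assumption about one reasonable way to accept both forms.

```python
import dataclasses
import enum
import typing as T


class EnvEnum(enum.Enum):
    dev = "dev"
    prod = "prod"


@dataclasses.dataclass
class Env:
    username: T.Optional[str] = None
    password: T.Optional[str] = None


@dataclasses.dataclass
class MiniConfig:
    # Merged per-environment data, keyed by environment name.
    merged: T.Dict[str, dict]

    def get_env(self, env_name: T.Union[str, EnvEnum]) -> Env:
        # EnvEnum(...) accepts both "dev" and EnvEnum.dev, and raises
        # ValueError for names that are not declared in the enum.
        name = EnvEnum(env_name).value
        return Env(**self.merged[name])


mini = MiniConfig(
    merged={
        "dev": {"username": "dev.user", "password": "dev.password"},
        "prod": {"username": "prod.user", "password": "prod.password"},
    }
)

print(mini.get_env("dev"))
print(mini.get_env(EnvEnum.prod))
```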

## What Actually Happens Under the Hood?

Now, let's look at how the ``Config`` object parses the config data and generates the per-environment config objects.

1. The ``Config.read()`` method reads the non-sensitive data and the sensitive data from the backend. It then uses the ``config_patterns.patterns.hierarchy.apply_shared_value`` and ``config_patterns.patterns.merge_key_value.merge_key_value`` functions to create an in-memory copy of the merged config data. The merged config data is a dict object.

In [26]:
jprint(config._merged)

2. When ``Config.get_env(env_name=...)`` is called, it looks up the ``env_name`` key in the merged dictionary and calls the ``Config.Env.from_dict(...)`` method to create the ``Env`` per-environment config object. You are responsible for implementing ``Env.from_dict(...)`` to define how the config data is deserialized into the per-environment config object.

```python
@dataclasses.dataclass
class Env(BaseEnv):
    username: T.Optional[str] = dataclasses.field(default=None)
    password: T.Optional[str] = dataclasses.field(default=None)

    @classmethod
    def from_dict(cls, data: dict):
        """
        This method defines how to create an instance of this class from a dict.

        Example:

            >>> Env.from_dict({"username": "user1", "password": "pass1"})
        """
        return cls(**data)
```
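Because ``from_dict`` is yours to define, you can choose how strict it should be. With ``cls(**data)``, a dataclass raises ``TypeError`` when the data contains a key that is not a declared field. One possible alternative is sketched below with a standalone dataclass (it does not subclass ``BaseEnv``, so that it runs without the library). It ignores unknown keys instead of failing.

```python
import dataclasses
import typing as T


@dataclasses.dataclass
class Env:
    username: T.Optional[str] = None
    password: T.Optional[str] = None

    @classmethod
    def from_dict(cls, data: dict) -> "Env":
        # Keep only the keys that match declared dataclass fields,
        # so extra keys in the config data do not raise TypeError.
        field_names = {f.name for f in dataclasses.fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in field_names})


env = Env.from_dict({"username": "user1", "password": "pass1", "unused_key": 123})
print(env)  # Env(username='user1', password='pass1')
```

Whether to ignore or reject unknown keys is a design choice: the strict version surfaces typos in the JSON file early, while the tolerant version lets the data evolve ahead of the code.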

## Summary

The local JSON file backend is great for development. It should be the source of truth for your config data, and you should check the non-sensitive config data into Git to follow the "Configuration as Code" principle. However, you should not deploy the config JSON files alongside your application code. Instead, deploy the config data to a dedicated storage and let your application code read from it.

For cloud-based storage, two excellent backend choices are [AWS SSM Parameter Store](https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-parameter-store.html) and [AWS S3](https://aws.amazon.com/s3/). These options provide built-in access management, encryption at rest and in transit by default, and versioning. They offer enhanced security and reliability for storing your config data. To learn more about these backend options, please refer to the following documents:

- [Multi Environment Config Management - SSM Backend](https://github.com/MacHu-GWU/config_patterns-project/blob/main/example/multi_env_json/multi_environment_config_with_ssm_backend.ipynb)
- [Multi Environment Config Management - S3 Backend](https://github.com/MacHu-GWU/config_patterns-project/blob/main/example/multi_env_json/multi_environment_config_with_s3_backend.ipynb)
