Environment Management

Pete Tollestrup edited this page Dec 7, 2022 · 6 revisions

Purpose

An integration product like FAM is unusual in that its "customers" are teams developing digital products, and each customer needs a permanent FAM integration for each of its DEV, TEST, and PROD environments. This document describes the FAM approach to managing these permanent environments while supporting ongoing FAM development.

Customer Requirements

Each environment of a customer product (DEV, TEST, and PROD) requires a stable OIDC integration

Each customer needs to be able to configure three versions of their digital product (DEV, TEST, and PROD). Ideally, these three versions behave the same so that the customer can develop and test their product and be confident that their integration code will work when they deploy to production. The only changes between their environments should be configuration items like URLs or access keys.

In particular, the DEV and TEST integrations that are exposed to a customer are considered permanent production environments. Outages in these integrations may affect customer activities. The service levels to support non-production environments do not necessarily need to match the production environment, but outages and releases must still be communicated appropriately.

Each environment of a customer product (DEV, TEST, and PROD) must use the corresponding version of the identity provider

For example, the DEV integration for a product that uses BCeID Business must be authenticated with the DEV version of the BCeID login domain. In practice, this is not an issue for IDIR, since there is no "dev" or "test" IDIR domain (all IDIR integrations must use real IDIR accounts for testing). For BCeID, there is a "test" domain that is generally used for DEV and TEST environments. Nevertheless, FAM may in the future support an IDP that segregates DEV, TEST, and PROD domains and has different user sets in each.

The way to solve this is to have separate IDPs for DEV, TEST, and PROD for each identity provider, and to set up corresponding DEV, TEST and PROD clients for each digital product that integrate with the appropriate "level" of the IDP.
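
A minimal sketch of this layout, assuming a hypothetical naming convention: the `idp_name` format, the `Client` type, and the `build_clients` helper are illustrative only and are not FAM's actual identifiers.

```python
from dataclasses import dataclass

LEVELS = ("DEV", "TEST", "PROD")


@dataclass(frozen=True)
class Client:
    product: str
    level: str
    idps: tuple  # names of the IDPs this client may use for login


def idp_name(provider: str, level: str) -> str:
    # Hypothetical convention: one IDP per identity provider per level.
    return f"{level}-{provider}"


def build_clients(product: str, providers: list) -> dict:
    """Create one client per level, wired only to that level's IDPs."""
    return {
        level: Client(product, level, tuple(idp_name(p, level) for p in providers))
        for level in LEVELS
    }


clients = build_clients("FOM", ["IDIR", "BCEID-BUSINESS"])
print(clients["TEST"].idps)
```

Because every client is derived from the same list of levels, a new identity provider with fully segregated DEV, TEST, and PROD user sets would fit the same pattern without special cases.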

Each environment of a customer product (DEV, TEST, and PROD) must be able to have distinct authorization configuration data

For example, a digital product team may configure their DEV environment by assigning certain application roles to developers or configure their TEST environment by assigning certain application roles to QA resources. In production this may be inappropriate. Additionally, the users that have access control permissions for an application in PROD will almost certainly be restricted differently than in DEV and TEST.
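
One way to picture this requirement is to key every role assignment by the application environment it belongs to. The sketch below is an assumption for illustration: the application names (`FOM_DEV`, `FOM_TEST`, `FOM_PROD`), the `reviewer` role, the user names, and the `grant`/`has_role` helpers are all hypothetical and do not describe the FAM data model.

```python
# Role assignments keyed by (application, role). Each environment of a
# product is a separate application key, so a grant made in one environment
# is never visible in another.
assignments = {}


def grant(application: str, role: str, user: str) -> None:
    assignments.setdefault((application, role), set()).add(user)


def has_role(application: str, role: str, user: str) -> bool:
    return user in assignments.get((application, role), set())


grant("FOM_DEV", "reviewer", "dev.user")   # a developer, in DEV only
grant("FOM_TEST", "reviewer", "qa.user")   # a QA resource, in TEST only

print(has_role("FOM_DEV", "reviewer", "dev.user"))   # True
print(has_role("FOM_PROD", "reviewer", "dev.user"))  # False
```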

FAM Operational Requirements

Cost Management

Each FAM environment costs a certain amount of money to maintain in AWS. Replicating environments unnecessarily is to be avoided. In particular, the RDS Proxy component has a significant minimum hourly charge that will be multiplied by the number of instances.
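
The multiplication effect can be illustrated with a small calculation. The hourly rate below is a placeholder chosen for the example and is NOT an actual AWS price; `monthly_cost` and the 730-hour month are also assumptions.

```python
HOURS_PER_MONTH = 730          # assumption: average hours in a month
PLACEHOLDER_HOURLY_RATE = 0.10  # placeholder value, not a real RDS Proxy price


def monthly_cost(instances: int, hourly_rate: float) -> float:
    """Fixed monthly charge for a component billed per instance per hour."""
    return instances * hourly_rate * HOURS_PER_MONTH


one_env = monthly_cost(1, PLACEHOLDER_HOURLY_RATE)
three_envs = monthly_cost(3, PLACEHOLDER_HOURLY_RATE)
print(f"1 environment:  {one_env:.2f} per month")
print(f"3 environments: {three_envs:.2f} per month")
```

Whatever the real rate is, the fixed charge scales linearly with the number of replicated environments, which is the reason for keeping that number low.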

Configuration Management

The FAM operational process makes extensive use of "Infrastructure as Code" (IaC) and "Configuration as Code" (CaC) techniques to automate system management. To test the infrastructure and configuration code before applying changes to FAM production, the infrastructure and configuration code for the lower environments must be as similar as possible to that of production. Ideally, the FAM DEV and TEST environments are identical to PROD.

Maintaining a separate code base for each environment is risky and takes more effort. Testing in production would be required for each release if the code to deploy and configure PROD diverges from the code for the lower environments.
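
The idea of a single code base with per-environment configuration items can be sketched as follows. This is an illustration under stated assumptions, not FAM's actual deployment code: the `example.com` URLs, the component names, and the `render_deployment` and `structure` helpers are hypothetical.

```python
# Only configuration items differ per environment (hypothetical values).
ENV_CONFIG = {
    "DEV": {"base_url": "https://fam-dev.example.com"},
    "TEST": {"base_url": "https://fam-test.example.com"},
    "PROD": {"base_url": "https://fam.example.com"},
}

# Values that are allowed to differ between environments.
ENV_SPECIFIC_KEYS = ("environment", "base_url")


def render_deployment(env: str) -> dict:
    """Build the deployment description for one environment from shared code."""
    return {
        "environment": env,
        # Illustrative component list; identical in every environment.
        "components": ["user_pool", "api", "database"],
        "base_url": ENV_CONFIG[env]["base_url"],
    }


def structure(deployment: dict) -> dict:
    """Drop environment-specific values so that only the shape remains."""
    return {k: v for k, v in deployment.items() if k not in ENV_SPECIFIC_KEYS}


# The lower environments exercise exactly the structure that PROD will get.
assert structure(render_deployment("DEV")) == structure(render_deployment("PROD"))
assert structure(render_deployment("TEST")) == structure(render_deployment("PROD"))
```

If the final assertions ever fail, the code for the lower environments has diverged from PROD, and a release would effectively be tested for the first time in production.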

FAM Environment Management Strategy

Overview

The FAM team maintains a single instance of the architecture that is "Permanent" and supports customer DEV, TEST, and PROD environments. The FAM DEV and TEST environments have an identical architecture and configuration to that of PROD. All the environments share a set of OIDC configurations provided by the Pathfinder SSO Keycloak server.

Diagram

Description

As the diagram demonstrates, the FAM permanent environment includes configuration to support a DEV, TEST, and PROD version of each IDP. It also supports a permanent DEV, TEST, and PROD OIDC integration for each digital product (FOM being the example).

Note that there is a single Cognito User Pool and a single database. Not shown on the diagram is the fact that there is a single version of the FAM access management user interface.

Each version of FOM (DEV, "demo" and PROD) is considered a different Application in FAM and is managed separately. The DEV application client uses the DEV versions of the IDPs for login. "Demo" uses the TEST versions of the IDPs, and PROD uses the PROD versions of the IDPs.

Notes

  • The data that describes authorization access rules for all versions of a digital product (like FOM) lives in the same database and is managed by the same user interface. Governance and operations procedures for FAM must be strictly controlled to prevent users from mistakenly gaining access to functions in FAM or in other applications.
  • In theory, only one IDIR IDP is required. The reason is that the user domain for IDIR is actually the same across all environments (i.e., there is no "test" IDIR domain with "test" IDIR accounts -- you have to use your own "real" IDIR for testing). This could be pointed to the FAM PROD IDIR integration on PROD Pathfinder SSO Gold. In practice, it is simpler to maintain separate IDPs for each of DEV, TEST, and PROD in order to be consistent with future IDPs that would support separate user domains for separate environments.
  • Similarly, BCeID Business only has two user domains: the actual production domain and a "test" domain. Effectively, the DEV and TEST IDPs at Keycloak behave exactly the same and we could have a single IDP in Cognito that supports both. Again, in practice it is simpler to be consistent and separate out an IDP for each environment.
  • Pathfinder SSO maintains a separate Keycloak server for each environment. The main reason is that the service levels for PROD are much higher and require more resources, including storage and CPU. The availability requirements for Pathfinder SSO PROD are higher as well. With FAM, we are using Cognito, so the service levels are the same across all the User Pools and we don't have to manage or monitor the User Pools themselves (a benefit of serverless architecture). For the Pathfinder SSO team to mimic their "permanent" environment, each non-permanent environment must also have three instances. So a full set of environments for Pathfinder SSO actually runs 9(!) Keycloak clusters in theory. That is too many side-by-side AWS architectures for FAM to manage cost-effectively if we don't need to.
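
The count of nine in the last note follows from simple enumeration: every copy of the stack (one per level) has to host its own DEV, TEST, and PROD instance. A small sketch, with the labels chosen for illustration:

```python
from itertools import product

LEVELS = ("DEV", "TEST", "PROD")

# (copy of the stack, instance hosted inside that copy)
clusters = list(product(LEVELS, LEVELS))

print(len(clusters))  # 9
```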