Dave R edited this page Oct 3, 2022 · 26 revisions

lab-dev

A project to set up all the development tooling for a company/organization

It's opinionated about how a company works; however, the opinions are simple (compared with others that could be implemented).

The project relies on GitOps and Kubernetes operators (built with KubeOps). By applying the provided CRDs, it will set up users, tenancies, services, and more.

Core Resources

The Core Resources will allow you to describe your software organisation

For example:

A Cloud / Cluster view

(diagram)

Access to the Clusters

(diagram)

As you look at the core resources below, consider how they map back to the diagrams above, or to your own setup.

NOTE: Although the following can be used to set up a number of different infrastructures, there is one stand-out opinion:

The team structure is flat, there are no teams that report to other teams, and the focus is delivery not reporting lines.

Organisation

The organization is where we create the top-level structure

(diagram)

  • Account - The engineers in the organization (will invite and remove engineers from the organization)
  • Tenancy - A group of engineers (accounts), services, and libraries.
  • Environment - The environments used for the development and running of services (develop, production, etc) - note that one environment must represent production
  • Zone - Groups of hosting technologies that are associated with an environment and location.
  • Platform-Services - Special services that are reused across multiple environments and zones. Unlike the others, this is not a single resource (the icon represents several resources)
# /lab [REPO] <- manually created, but then this will assume ownership
#   dbones-labs.yaml
#
#   /platform-services
#     github.yaml
#     rancher.yaml
#     discord.yaml
#      
#   /users
#     dbones.yaml
#     bob.yaml
#     sammi.yaml
#
#   /tenancies
#     platform.yaml
#     galaxy.yaml
#     open-sourcerors.yaml
#
#   /environments
#     production.yaml
#     develop.yaml
#
#   /zones
#     apex.yaml
#     frontier.yaml
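To make the layout above concrete, here is a sketch of what one of these files might contain. The apiVersion, kind, and field names are illustrative assumptions, not the actual CRD schema:

```yaml
# users/dbones.yaml - hypothetical Account resource (schema assumed)
apiVersion: lab.dev/v1
kind: Account
metadata:
  name: dbones
spec:
  # the identity used by the IdP (Github) - illustrative field
  externalId: dbones
```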

Platform Service

These are special services that reside outside of the environments we develop in; they are shared across environments, and are now more commonly cloud services.

(diagram)

  • Github - CORE service - beyond the usual source control, this provides a critical access-control mechanism (IdP).
  • Rancher - CORE service - this is how we implement GitOps (via Fleet), and also control all compute clusters (adding the required RBAC controls)
  • Discord - (replaceable - vNext scope) communications tool which will be configured to support the Tenancy Model.
  • Vault - (vNext scope) a central secrets management component; as we set up services, teams, and databases, it will ensure all keys are rotated and safe.
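As a sketch, a platform-service file such as github.yaml might register the component and how it is configured. Every field name here is an assumption:

```yaml
# platform-services/github.yaml - hypothetical resource (schema assumed)
apiVersion: lab.dev/v1
kind: Github
metadata:
  name: github
spec:
  # the Github organization that lab-dev will manage - illustrative
  organization: dbones-labs
```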

Tenancy

This is a team of engineers, with the services and libraries that they develop.

Special conditions:

  • A Tenancy can be marked as a Platform Tenancy - these tenancies are allowed to modify the Org and Zones
  • Zone filter - this concept manages a tenancy's access to certain zones (e.g. zones located only in the UK, or only in the US)

(diagram)

  • Member - which Accounts are members; note an Account can be a member of many Tenancies, and members can have different types of membership
  • FireFighter - denotes that a member has been provided special access to fix something in an emergency situation (this is not a Resource we create via GitOps, but via KubeOps and Github issues)
  • Library - software packages; these only require CI and an artifact store (not deployment)
  • Service - deployable software; requires CD, and possibly logins and credentials
# /tenancy-galaxy [REPO] <- this is created from the above org repo
#   /members
#     dbones.yaml
#     sammi.yaml
#   /services
#     billing.yaml
#   /libraries
#     core.yaml
#   /cd
#     deployment yamls for the above components
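A member file in this layout could plausibly look like the following; the kind, the role values, and the account reference are assumptions based on the membership levels described on this page:

```yaml
# members/dbones.yaml - hypothetical Member resource (schema assumed)
apiVersion: lab.dev/v1
kind: Member
metadata:
  name: dbones
spec:
  account: dbones
  role: owner   # illustrative: owner | member | guest
```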

Zone

A collection of clusters (Kubernetes, Postgres, etc) which are associated with an Environment and a location.

A Zone can only be modified by a Platform Tenancy with the correct filter.

(diagram)

  • Kubernetes - a compute cluster; used to set up RBAC and deployments
  • Postgres - a database cluster; will set up databases, logins, and roles
  • RabbitMq - (vNext) - a message broker; will set up logins

The three resource types represent a configured component (the configuration of the component can happen in this repo too, or externally).

# /zone-frontier this is production [REPO] <- this is created from the above org repo
#   /clusters
#     kubernetes-aqua.yaml
#     postgres-spike.yaml
#     postgres-goku.yaml
#     rabbitmq-asuna.yaml
#   /cd <- optional (vNext)
#     deployment yamls for the above components
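As an illustration of a cluster registration, a file like postgres-spike.yaml might look as follows. All field names are assumptions, and credentials would be referenced rather than inlined (see the Credentials section):

```yaml
# clusters/postgres-spike.yaml - hypothetical Postgres registration (schema assumed)
apiVersion: lab.dev/v1
kind: Postgres
metadata:
  name: spike
spec:
  host: spike.frontier.example.com   # illustrative endpoint
  port: 5432
  credentialsSecret: postgres-spike  # reference to a Kubernetes secret
```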

Core Resource Interactions

Resources are not islands; when one is created, it will update the state of several components around it.

User

The user is key to most components, so when a user is created an account is added in each component, but without any access to non-public resources.

(diagram)

Tenancy

Tenancies are an RBAC mechanism in most components

It works slightly differently with Components in a Zone vs Components that are Platform Services.

With zoned components, access is only set up if the Tenancy Filter is met.

(diagram)

Member

Each component will use the roles which are set up via the Tenancy; this part assigns access.

Here is an oversimplified view of what to consider:

(diagram)

Note that membership has different levels:

  • member - full member
  • guest - someone that can contribute to a tenancy, but is not a member
  • owner - still just a member; denotes someone that owns this tenancy
| | Data | Compute | Code | Issues |
| --- | --- | --- | --- | --- |
| Member | prod: no access / non-prod: CRUD | prod: read-only / non-prod: CRUD | Push; can view private repos | Raise / Close / Request FF / Approve FF |
| Guest | prod: no access / non-prod: CRUD | prod: read-only / non-prod: CRUD | Pull; can view private repos | Raise |
| Account (not a member) | no access | no access | Pull on all internal repos | Raise |

Access to secrets is not allowed; they should be linked up via the Service's yaml.

Firefighter

A process to provide an account with the admin role in production for a limited amount of time.

Note this is the assignment of a role, not the creation of a burner account; any changes should be audited against the user, so the role is assigned and then removed.

(diagram)
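Although a FireFighter is granted via KubeOps and Github issues rather than GitOps, the internal resource the operator might create could look something like this sketch (entirely illustrative, including the expiry field):

```yaml
# hypothetical internal FireFighter grant (schema assumed)
apiVersion: lab.dev/v1
kind: FireFighter
metadata:
  name: dbones-production
spec:
  account: dbones
  tenancy: galaxy
  environment: production
  expires: "2022-10-03T18:00:00Z"  # the role is removed once this time passes
```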

Service

Service is interesting, as it will have multiple optional components that may act on it.

Also, a service is configured in a Tenancy (repo); however, this design needs to support services being transitioned from one team to another (without losing data).

(diagram)

  • Rancher - core - will set up the compute and deployment for the service
  • Github - core - sets up the repo and access; note that the Tenancy's repo will be used for the deployment yaml
  • Postgres (rabbitmq etc) - optional; needs to set up the service in the component, and allow the correct tenancy access

Ideally, Vault will be used; without it, ensuring the secret is available to the service in the cluster it will be running in is a challenge (under dev).

Credentials

Tech User

Each component that needs to be configured by this operator requires credentials to be provided in the same namespace as the Organisation resource.

There are two patterns we could employ (there may be more):

  • Primer setup - the component is set up, either manually or by another automation process, and the credentials are copied into a Kubernetes secret
  • Fleet setup - within a Zone repository, it is possible to configure the component and save its credentials into a Kubernetes secret

(diagram)

  • Github - Primer setup - API key
  • Rancher - Primer setup - API key
  • Postgres - Primer setup / Fleet setup - Postgres login
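For the Primer setup pattern, the manually created API key would typically land in a standard Kubernetes Secret next to the Organisation resource. The Secret kind and stringData field are standard Kubernetes; the name, namespace, and key are assumptions:

```yaml
# hypothetical primer-setup secret for the Github tech user
apiVersion: v1
kind: Secret
metadata:
  name: github-credentials
  namespace: lab            # assumed: same namespace as the Organisation resource
type: Opaque
stringData:
  token: "<api key created manually in Github>"
```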

Note that access to Zone components is tricky.

The initial design is that the control plane (the Rancher local cluster) will have access to the components via certain secure means:

  • Kubernetes Clusters - this is using the out-of-the-box Rancher control mechanism, the cluster has HTTPS access to Rancher
  • Postgres / Rabbit - these may require a jump-box/bastion to allow an encrypted SSH tunnel to the component, where the bastion will implement further security measures such as IP whitelisting and PEM file auth (in addition to the credentials login at the component).

User

To be confirmed - vNext will look to use HashiCorp Vault, which will change the management/access of these secrets.

Even though the main aim of the game is to provide a closed/sealed architecture, we still need to provide engineers login access to components (in order to debug/operate).

In this case, there are two types of credential management to consider:

  • Global - the account is on a public component, where the user will create their own credentials. the company may want to enforce a policy of MFA in these cases. (Github, Discord, etc)
  • Managed - This is where the lab.dev will create the account, and therefore create the credentials. In this case, the credentials need to be made available to the user and also fully managed. (Postgres, RabbitMq, etc)

(diagram)

In an initial implementation, the goal is to allow users access to only their own credentials.

Fleet deployment (wip)

Using Fleet is the power of this design; it handles the secrets issues for us, while providing a robust way to deploy.

(diagram)

There are a few parts to consider:

  • the Fleet agent needs access to assets in the default namespace of the downstream cluster
  • the service account must have permissions to set up assets in the correct namespaces

Fleet

In order to set up the permissions for a gitRepo, a central fleet repo is automatically set up:

/fleet
    /clusters
        /aqua
            service123.yaml <- sets up the ns and role binding
            fleet.yaml <- sets the criteria to select the cluster
        /spike
            service123.yaml
            fleet.yaml
    /tenancies
        tenancy1.yaml <- service account to deploy and bindings to the default lab roles + gitRepo (which will reference this sa and gitrepo)
    /lab
        lab.yaml <- clusterRole and Role to access the default NS and create Ns's
        tenancy.yaml <- clusterRole access for the deployment role in the service123xcy NS's

Clusters are the only targeted part; the rest is applied uniformly to all clusters.

Tenancy

The tenancy repo is used to define what to deploy and where. master is the TRUE version of any deployment state (make use of pull requests and fleet overlays).

# /tenancy-galaxy [REPO] <- this is created from the above org repo
#   /members
#     ...
#   /services
#     billing.yaml
#   /cd
#     /billing
#       fleet.yaml - setup overlays
#       /overlays
#         /target-zone
#     /service2
#       fleet.yaml - setup overlays
#     
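A fleet.yaml in one of the cd folders could use Fleet's targetCustomizations to overlay per-zone settings. The structure below follows Fleet's bundle format, but the labels and values are assumptions:

```yaml
# cd/billing/fleet.yaml - sketch of per-zone overlays (labels/values assumed)
defaultNamespace: billing
targetCustomizations:
  - name: frontier                 # the production zone
    clusterSelector:
      matchLabels:
        zone: frontier             # assumed cluster label
    helm:
      values:
        replicas: 3
```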

Architecture

lab-dev's architecture endorses the idea that Resources are things to be reacted on; the existence of a resource means several components will react.

This is KEY to its extensibility.

Let's take the following, which shows the concepts at play:

(diagram)

Flow

Setting up

  1. An engineer will set up each technology by adding the resource of the same name (orange); not fully shown in the diagram.
  • this may run against each technology with a corresponding Controller.

Adding a core element - such as adding a Tenancy

  1. An engineer adds a Tenancy
  2. a number of technologies will act on this "event", which involves creating several resources per technology (we are only showing 1 per technology)
  • (labdev) Postgres watches this and creates a Role Resource
  • (labdev) Rancher watches this and creates a GitRole Resource (which is a Rancher Resource)
  • (external) Teams watches this and creates a Team Resource
  3. The "doer" controllers act, doing the correct CRUD against one thing
  • (labdev) Postgres runs SQL to create the role
  • (Rancher) Rancher Fleet creates the gitRepo to attach rancher to the correct git repo
  • (Teams extension) Teams create a Team for the Tenancy, using the Teams API

From one Tenancy Resource we update an ecosystem, using:

  1. choreography when a resource is created, multiple controllers react to a resource
  2. orchestration when creating downstream resources, a controller creates a number of resources (i.e. Postgres Tenancy Resource, would create several Role Resources)
  3. doing - reconciling technical components from a technology resource

Core Resource

A set of resources, describing the world of a software development company.

Examples: Tenancy, Account, Environment, Service

The core Resource should be the only interaction point between the end-user (engineer) and all downstream components

These will be reacted to by internal Technology providers in lab-dev, or by Extension Operators developed by anyone.

Technology

How each technology reacts to the Core Resources is separated by technology, encapsulating its behavior and integration for maintainability.

A technology is made of two parts (both optional, depending on context):

  • Resource (gray area, rounded edges)
  • Controllers (green area, rounded edges)

Resource

Resources at this level are to be treated as internal to this technology and any downstream components.

These represent resources of that technology, e.g.:

  • For Github: user, team, repository, etc.
  • For Discord: user, category, channel, etc.

Resources are categorized into groups (as who owns the resource will differ):

  • Technology (blue) - these are developed inside the component (this is true for lab.dev, Rancher, and external operators) and represent a thing/concept that needs to be controlled (CRUD)
  • External (purple) - these are resources owned by a 3rd party, which we can manipulate for the 3rd party to manage accordingly.
  • Configuration (orange) - contains the setup of this technology in your situation, i.e. which accounts to use, secrets, and any defaults. For Github there are several options we can configure. These settings are then used within that provider

NOTE: Engineers should not edit directly or be aware of the internal.lab.dev resources (blue)
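For example, the Github technology's Core Resource controller reacting to a Tenancy might emit an internal resource along these lines; the group github.internal.lab.dev is taken from this page, but the kind and fields are assumptions:

```yaml
# hypothetical internal resource - engineers should not edit this directly
apiVersion: github.internal.lab.dev/v1
kind: Team
metadata:
  name: galaxy
spec:
  members:
    - dbones
    - sammi
```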

Controllers

For the lab.dev architecture, there are two levels of controllers:

  • Core Resource controllers - these react to Core Resources (i.e. Tenancy) and create Technology Resources (they should not contain component interaction code)
  • Technology Resource controller - placed in an Internal Folder, these react to the Technology Resource and manipulate the Technology Component to meet the desired State.

Dependency between Technology

Most Technologies should be isolated modules (like Services).

HOWEVER, there may be cases where technologies integrate; an example of this is Rancher, whose RBAC is driven from the Teams and Users in Github.

(diagram)

In these cases, the Resources from Github may be required to set up Resources in Rancher (Rancher Controllers make use of Github Resources).

Use with Caution and keep to a minimum

Ids (isolating multiple ids which represent the same object)

There will be many occasions where one resource is represented in multiple systems.

A great example is user ids. This id will be different in each of these systems, but all represent the same user.

(diagram)

Ids fall into two categories:

  • External - accounts created outside the control of lab.dev
  • Company - resources which are in the control of lab.dev

In both cases, we need to map the Technology Resource back to the Core, and try to isolate the Technology ids at the edges.

(diagram)

In the example, you can see that both Technologies can store particulars about the integration, which do not leak back into the main code.

EXCEPTION: purely for UX, we allow the main Account to use ExternalReference to help with setup (that is all). In theory, an engineer could create the github.internal.lab.dev/v1/User Resource directly; the ExternalReference exists because we did not want people to have to understand all the *.internal.lab.dev resources.