What is a build pipeline

Brian Oxley (binkley) edited this page May 17, 2024 · 17 revisions

What is a build pipeline?

The 10k-foot view

(Jump to the ground view.)

The point of this project is to show you ways to improve your build pipeline. But what is a build pipeline? Essentially it is:

The software steps that take you from an idea to something reliable and useful for your users.

But that definition raises further questions:

  • What is a "user"?
  • What are "steps"?
  • What is "reliable"?
  • What is an "idea"?

Rather than overload you with a bunch of text, let me show you some diagrams to explain the questions (imagine you and I are together at a whiteboard):

flowchart TB

idea("Idea for improvement")
card("Turn idea into potential work")
analysis("Flesh out work needed")
process("Discussion and moving work to #quot;ready#quot;")
dev("Programmers pick up ready work")
progress("Programmers turn work into reality")
push("Work goes to the CI pipeline")
ci("CI pipeline builds work")
ready("Work ready to deploy")
deploy("Work goes into production")
users("Users benefit from work")

subgraph idea_steps [Idea]
  idea-->card
  card-->analysis
  analysis-->process
end

subgraph work_steps [Work]
  process-->dev
  dev-->progress
  progress-->push
  push-->ci
  ci-->ready
end

style work_steps stroke:black,stroke-width:4px

work_steps-->|New ideas|idea_steps

subgraph deploy_steps [Users]
  ready-->deploy
  deploy-->users
  users-->|Start again|idea
end

deploy_steps-->|New ideas|idea_steps

OK, this is complicated, but where do I fit in, and how do I improve this process? How can my teammates and I be happy about the work that we do?

Note

The above diagram shows the "value stream map" for your work at a high level. A more detailed analysis breaks this down into individual areas; this page focuses on the work of developers.

Your ideas are a key part of this! They can be anything. Some examples (not an exhaustive list):

  • UI changes
  • API changes
  • Deployment changes
  • Monitoring & alerting changes
  • Workflow process changes ("ways of working")
  • Business process changes
  • Tech debt

Now let me draw a second diagram that drills down on the "work" part:

flowchart TB

red("Work on programming")
green("Local tests pass (unit, et al)")
refactor("Clean up code before sharing")

coverage("Local coverage passes")
local_checks("Local checks pass")
push("Push to shared CI")

pushed("New changes in CI")
ci("Run full build")
ci_checks("Same checks as local + CI-specific checks")
ready("Result ready to deploy")

subgraph work_steps [Red-green-refactor]
  refactor-->red
  green-->refactor
  red-->green
end

work_steps-->push_steps

subgraph push_steps [Prepare Git push]
  coverage-->local_checks
  local_checks-->push
end

push_steps-->ci_steps

subgraph ci_steps [CI]
  pushed-->ci
  ci-->ci_checks
  ci_checks-->ready
end

Now there is too much detail!

So in this project I want to walk you through your "build pipeline", and break down the parts. The goal is to make your project awesome for yourself, for other developers, and for your users.

Really answer the questions

Enough teasing, let's answer the questions. Notice that each answer to a question gets more open-ended:

What is a "user"?
This is anyone or any other system that benefits from your changes to your software or system. For a web frontend, this is obvious: people using your web application or site! For integration with other systems, this may be indirect (the remote system), but ultimately people are enjoying your work even when they do not know it. And it can be a manager (an actual person) in operations wanting to monitor your system and compare it to other systems.
What are "steps"?
This is a tough question. "Steps" are anything in your process that can be discussed and evaluated separately from other steps. Yep, this is vague.

At an immediate level, you may consider compiling your code as a separate step from automated code quality checks even though you have compiler plugins that are checking for quality (confusing, yes, more on this in another chapter). At a higher level, you may talk about local work on your machine as a separate step from what your CI system does to build your code.

So "step" is context-dependent.

What is "reliable"?
This is really **2 questions**:

  1. Is my _build_ reliable? Can I trust it from local through production?
  2. Is my _code_ reliable? Do users and stakeholders have confidence?

This project talks about both questions.

A key point is to containerize your build, that is, be able to run your build via Docker in some form (or potentially in a virtual machine, VM) that is identical between local and CI environments. This is what a reliable build looks like. It behaves the same whether you run it on a laptop during a disconnected airplane flight or in a multi-machine parallel CI environment.
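A minimal sketch of what "the same build everywhere" can look like, assuming Docker is installed. The image name `example/build-env:1.0`, the `/workspace` mount point, and the `./build.sh` command are hypothetical placeholders and not part of this project; the `docker run` options used (`--rm`, `--volume`, `--workdir`, `--network`) are standard Docker CLI flags. The idea is that a laptop and a CI agent both end up running the exact same command line:

```python
import os
import subprocess
from typing import List, Sequence


def containerized_build_command(
    project_dir: str,
    image: str = "example/build-env:1.0",  # hypothetical pinned build image
    build_command: Sequence[str] = ("./build.sh",),  # hypothetical build entry point
    offline: bool = False,
) -> List[str]:
    """Return the `docker run` argv used identically on a laptop and in CI."""
    argv = [
        "docker", "run", "--rm",
        # Mount the project sources into the container at a fixed path.
        "--volume", f"{os.path.abspath(project_dir)}:/workspace",
        "--workdir", "/workspace",
    ]
    if offline:
        # No network inside the container: approximates the disconnected-flight case.
        argv += ["--network", "none"]
    return argv + [image, *build_command]


def run_containerized_build(project_dir: str = ".") -> int:
    """Run the build inside the container and return its exit code (needs Docker)."""
    return subprocess.run(containerized_build_command(project_dir)).returncode
```

An offline run like this only works if the image already contains every dependency the build needs, which is itself a useful test of how self-contained the build is.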

Reliable code is much tougher to ensure. Now we are getting into a truly open-ended area, one where answers raise more questions:

  • If someone wants to measure code quality, can they do so (metrics)? What if they compare apples to oranges, and match your numbers against other projects?
  • There's a new dependency vulnerability problem in the press. How is your software affected (security)?
  • Can your system be down more than 0.01% of the year (SLA)?
  • When there are bugs (and all software has bugs), how fast do fixes deploy to production (cycle time)?
This project does not answer these open questions, but points you to tools to help you with answers.
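To make the SLA question above concrete, here is a small worked calculation. It assumes a 365-day year and reads 0.01% as the maximum allowed downtime; the function name is illustrative only.

```python
def allowed_downtime_minutes(max_down_percent: float, days_in_period: float = 365.0) -> float:
    """Convert a maximum downtime percentage into minutes over the period."""
    minutes_in_period = days_in_period * 24 * 60
    return minutes_in_period * (max_down_percent / 100.0)


if __name__ == "__main__":
    # 0.01% of a 365-day year is a little under an hour of downtime.
    print(f"{allowed_downtime_minutes(0.01):.2f} minutes per year")
```

Under those assumptions the budget is about 52.56 minutes per year, which is why the time it takes a fix to reach production (cycle time) matters so much.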
What is an "idea"?
There are lots of great ideas to improve your software or system; they come from all sorts of places: users, other developers, product owners, managers, other systems, a cool thing you recently read about. How do you sort and manage these, and decide on what is important to work on now or later? _This is outside the scope of this project._

For this writing, assume you've had discussions, and relevant stakeholders have agreed on the ordering of software changes: assume, as if by magic, that each piece of work (card) has all the information it needs. When you have a fast, reliable build pipeline, you can tell others in confidence that "yes, we can do that, and show you the result". This project shows you ways to reach that happy place.

The ground view

A pipeline from a developer's local machine, through CI, to ready-to-deploy looks like:

flowchart LR

local("Local all green")
ci("CI all green")
ready("Ready to show users")

local-->|Push code|ci
ci-->|Green build|ready

Let's break that up into more detail:

flowchart TB

headerMain["MAIN CODE"]
generateMain("Possibly generate code<br>(such as with annotations)")
preCompileMain("Pre-compile linting<br>of production code<br>such as code style")
compileMain("Compiler builds production code<br>or fails with suggestions")
staticMain("Static analysis of built code<br>such as security and bugs")

headerTest["UNIT TEST"]
generateTest("Possibly generate code")
preCompileTest("Pre-compile linting<br>of unit tests")
compileTest("Compiler builds unit test code")
staticTest("Static analysis of built code<br>(uncommon for tests but helpful)")
runTest("Run unit tests<br>THESE are where you spend most time")

headerIntegration["INTEGRATION TEST"]
generateIntegration("Possibly generate code")
preCompileIntegration("Pre-compile linting<br>of integration tests")
compileIntegration("Compiler builds integration test code")
staticIntegration("Static analysis of built code")
runIntegration("Run integration tests<br>THESE are often a pain point")

headerSystem["SYSTEM TEST"]
generateSystem("Possibly generate code")
preCompileSystem("Pre-compile linting<br>of system tests")
compileSystem("Compiler builds system test code")
staticSystem("Static analysis of built code")
runSystem("Run system tests<br>THESE need a lot of setup")

headerMain~~~generateMain-->preCompileMain-->compileMain-->staticMain
headerTest~~~generateTest-->preCompileTest-->compileTest-->staticTest-->runTest
headerIntegration~~~generateIntegration-->preCompileIntegration-->compileIntegration-->staticIntegration-->runIntegration
headerSystem~~~generateSystem-->preCompileSystem-->compileSystem-->staticSystem-->runSystem

And all of this is local before pushing to CI! The good news is that your build does this for you: you write the code and tests, and adjust the build to meet your needs.
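The ordering in the diagram above can be sketched as data plus a fail-fast loop. This is an illustration only: the stage names mirror the diagram, while `planned_steps`, `run_pipeline`, and the `execute` callback are hypothetical stand-ins for whatever your real build tool does. Running the four columns one after another is also an assumption of this sketch; the diagram only fixes the order within each column.

```python
from typing import Callable, List, Tuple

SOURCE_SETS = ["main", "unit test", "integration test", "system test"]
STAGES = ["generate", "lint", "compile", "static analysis", "run tests"]


def planned_steps() -> List[Tuple[str, str]]:
    """List (source set, stage) pairs, column by column as in the diagram."""
    steps = []
    for source_set in SOURCE_SETS:
        for stage in STAGES:
            # The main-code column in the diagram has no "run tests" stage.
            if source_set == "main" and stage == "run tests":
                continue
            steps.append((source_set, stage))
    return steps


def run_pipeline(execute: Callable[[str, str], bool]) -> List[Tuple[str, str]]:
    """Run steps in order, stop at the first failure, return the steps that passed."""
    passed = []
    for source_set, stage in planned_steps():
        if not execute(source_set, stage):
            break
        passed.append((source_set, stage))
    return passed
```

Passing a callback that always returns `True` walks through every step; a callback that fails on `"compile"` stops after the main code's generate and lint stages, which is the kind of early feedback a local build is meant to give.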

Local-versus-CI setups range from "everything in CI can also be done locally", including setting up other databases and remote systems, to "only CI can talk to other systems". In the latter case you need to babysit code pushes in CI to ensure they worked (a build monitor is helpful for this).
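One way to picture "same checks as local + CI-specific checks" is a single list of checks with a few marked as CI-only. The sketch below is hypothetical: the check names are invented, and it assumes your CI system sets a `CI` environment variable (many hosted CI services do, but verify this for yours).

```python
import os
from typing import List, Mapping, Optional

LOCAL_CHECKS = ["unit tests", "coverage", "lint"]  # hypothetical check names
CI_ONLY_CHECKS = ["integration tests against remote systems"]  # hypothetical


def running_in_ci(env: Optional[Mapping[str, str]] = None) -> bool:
    """Guess whether we are in CI from the (assumed) `CI` environment variable."""
    env = os.environ if env is None else env
    return env.get("CI", "").lower() in ("true", "1")


def checks_to_run(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Local checks always run; CI adds its own checks on top of them."""
    checks = list(LOCAL_CHECKS)
    if running_in_ci(env):
        checks += CI_ONLY_CHECKS
    return checks
```

Keeping the CI list a strict superset of the local list means a green local build rules out most CI failures, leaving only the CI-only checks to watch after a push.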