New Post: Positive Signposting #63

Open
wants to merge 26 commits into
base: main
46 changes: 46 additions & 0 deletions content/blog/experiments-in-stateless-terraform.md
@@ -0,0 +1,46 @@
---
title: "Experiments in Stateless Terraform"
date: 2022-05-31T18:10:32+01:00
slug: ""
description: ""
keywords: []
draft: false
tags: []
math: false
toc: false
---

[This article about the history of statefulness in Terraform](https://www.bejarano.io/terraform-stateless/) really resonated with me. Over the last few months I have been playing with setting up Terraform in such a way that the state file is no longer required.

This post shares my observations from those experiments. What I found is that, while it is possible to use Terraform without state, the way AWS currently works imposes limitations that make it more difficult than you would expect.

The key issue you encounter when you no longer know the state of your infrastructure is identifying which resources are no longer needed. You either need to enumerate them, or to discard them without being aware that they exist. The first solution I tried took the second route: use the AWS Account as an ephemeral container for everything that is built and destroyed. This is guaranteed to catch everything and is reasonably quick to achieve.

I do recognise that what I'm attempting to do is a hack; Terraform and AWS are not designed to be used this way. Terraform did not make the decision to use state in isolation. They will have encountered the problems I describe further on in this article, and the solution they went for was adding complexity through state. I don't see that as unreasonable: they made those decisions based upon the limitations of AWS at the time, and AWS made its own decisions based on assumptions about how its infrastructure was intended to be used.

## Becoming stateless

The concept is that everything you need is first built into one AWS Account. When you want to release a new instance, you re-create everything in a second AWS Account and fail over to it. The first AWS Account can then be removed. A separate AWS Account with a load balancer performs the switching once the new infrastructure is up and running. This is a form of [Green/Blue Deployment](https://www.redhat.com/en/topics/devops/what-is-blue-green-deployment).

## MEMBER_ACCOUNT_PAYMENT_INSTRUMENT_REQUIRED

There are some limitations with this, and this is where I think it gets interesting.

My first thought for clearing out the old resources was to delete the AWS Account. However, deleting an AWS Account from an AWS Organization requires the sub-account to have credit card details associated with it. That makes it a manual step. A manual step means no automated Green/Blue deployment.

<!--alex ignore black hole-->
There is an alternative. I call it the "black hole". This is an OU in your AWS Organization that has a policy on it that prevents all roles from being assumed in that Account. That way your resources cannot run and your cost will reach zero.
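
As a rough illustration of the "black hole" idea, the sketch below builds a policy document in Python. It assumes the policy on the OU is a Service Control Policy, and it uses a blanket deny of every action as a broader stand-in for blocking role assumption specifically. The function name `build_black_hole_scp` and the `Sid` value are hypothetical, and this is not necessarily the exact policy used in these experiments.

```python
import json


def build_black_hole_scp(sid: str = "BlackHoleDenyAll") -> str:
    """Return a deny-all policy document as a JSON string.

    Assumption: the "black hole" OU is enforced with a Service Control
    Policy. The Sid is a hypothetical label chosen for illustration.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": sid,
                "Effect": "Deny",   # explicit deny overrides any allow
                "Action": "*",      # every action in every service
                "Resource": "*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # Print the document so it can be reviewed before being attached to an OU.
    print(build_black_hole_scp())
```

Attaching the resulting JSON to the OU is left out here, as is any check that already-running resources actually stop; both would need verifying against a real Organization.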

This then runs into another issue. There are soft limits on the number of sub-Accounts that an AWS Organization can have; by default it is 10. When you're doing continuous deployments, you can do the math on the number of Accounts you will need per day. It _is_ a _soft_ limit, so you can ask Amazon nicely for it to be increased, but I bet that before long you will have an unhappy Amazon asking what you are up to and whether you could please stop.
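
To make that math concrete, here is a small sketch. The default limit of 10 comes from above; the number of Accounts already in use and the deployment rate are made-up figures, and the function name `days_until_quota_exhausted` is hypothetical. It assumes every deployment consumes one new Account and that parked Accounts are never removed.

```python
def days_until_quota_exhausted(
    quota: int, accounts_in_use: int, deployments_per_day: int
) -> int:
    """Whole days of deployments that fit before the Account limit is hit.

    Assumes each deployment creates exactly one new Account and that
    Accounts moved to the "black hole" OU still count against the limit.
    """
    if deployments_per_day <= 0:
        raise ValueError("deployments_per_day must be positive")
    remaining = max(quota - accounts_in_use, 0)
    return remaining // deployments_per_day


if __name__ == "__main__":
    # Illustrative figures only: 2 Accounts already in use, 4 deployments a day.
    print(days_until_quota_exhausted(quota=10, accounts_in_use=2, deployments_per_day=4))
```

With those illustrative figures the default limit lasts two days, which is why the limit increase request becomes a recurring conversation rather than a one-off.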

## Better Solutions

The only other way of doing this is to follow the enumeration path. Fortunately, tooling such as [cloud-nuke](https://github.com/gruntwork-io/cloud-nuke) exists to empty an AWS Account of resources. You can even be selective by allow-listing certain resources, but it is a very slow process and it [may not cover everything](https://github.com/gruntwork-io/cloud-nuke/issues/281).

However, the interesting part here was finding the limits of what was possible in AWS. Not something I encounter often. I don't have a lot of experience in other PaaS services so I would love to hear if doing this is possible in Azure or GCP.

I do wonder if it's something Amazon will fix, but I also think it is a fundamental limitation in how AWS assumed Accounts would be used. Otherwise they would no doubt have made it easier.

A different but unattainable solution I've started to see with services like [Fastly](https://docs.fastly.com/en/guides/working-with-services#editing-and-activating-versions-of-services) and [Doppler](https://docs.doppler.com/docs/versioning) is configuration versioning built right into their web UI.

I am hopeful that this is a trend towards a [ClickOps model](https://www.lastweekinaws.com/blog/clickops/) which has the ability to make Infrastructure as Code, and therefore managing state, redundant. The first PaaS service to do infrastructure versioning this way is going to get a lot of attention from me. However, I don't see this happening in AWS for a long time.
41 changes: 41 additions & 0 deletions content/blog/positive-signposting.md
@@ -0,0 +1,41 @@
---
title: "Positive Signposting"
date: 2022-07-08T18:10:32+01:00
slug: ""
description: "Whenever possible, one of the rules I follow when providing IT Security guidance is to start from a position of positive messaging."
keywords: []
draft: false
tags: []
math: false
toc: false
---

As a society we have a tendency to tell people to stop doing something. It's more instinctive to say "Stop doing that!" than "Could you do this instead?". There are many reasons for this. Telling people to stop is the lowest effort solution. It's easier to know what not to do than to be creative about what the alternatives are.

There are real world examples of this all over the place.

In my office there are paper-towels provided in the kitchens to be used for drying cutlery, crockery and our hands. We are also provided with a choice of bins to discard our waste items, a recycling bin, a food waste bin, and a general bin. Being environmentally conscious, I, and seemingly others, were discarding our paper-towels in the recycling bin. Apparently however, this is incorrect.

The solution the people who manage the office came up with was to put a note on the recycling bin in each kitchen in the building saying "No paper towels".

So where _should_ the paper-towels go? I'd rather they weren't destined for general waste. Food waste? Is paper made from food-like material? It isn't clear.

Another example of this is a very common sign you see around the suburbs where I live.

![A sign on the side of a building saying "NO BALL GAMES"](/img/blog-post/positive-signposting/no-ball-games.png)

Presumably this was put up because people playing football were bouncing the ball off the wall, and the person living on the other side of that wall wasn't happy about it. Perhaps there were some broken windows, but we don't know.

That means cricket is out too. Badminton? That would be ok. Frisbee should be absolutely fine according to the sign.

The issue I see is that when presented with multiple options, negative signposting only removes one of them. It doesn't give you any information on finding the right solution. This makes for a poor customer experience and unwanted outcomes.

## The security angle

What I'm trying to illustrate here is that using ["common sense" is not the answer]({{< ref "./common-sense-as-bad-practice.md" >}}) and neither is blaming your customers. We all own the problem.

If you want to have an impact on a problem, you need to first analyse it and come up with solutions that work in most if not all cases, because our initial instincts often [lead to undesirable outcomes]({{< ref "./behavioural-economics-second-order-thinking.md" >}}).

Or we could think about the problem and provide the tools our customers want, built into services that allow them to make good, informed decisions.

If something has gone wrong, don't mark the action as wrong and control for it. Identify what went wrong and identify how to make it go right.