
Should we have some automated build testing? #384

Closed
huwd opened this issue Aug 6, 2020 · 12 comments

@huwd
Member

huwd commented Aug 6, 2020

Was chatting with @injms and thinking: perhaps once a week a bot / GitHub Action pulls this repo and tries to run make all-apps, then reports back in some way if any build fails?

Strikes me that there are a few images in here that are lesser used, but might be returned to.
Especially whilst we're in a world of pausing projects to jump onto pressing national concerns, it would be good if there was a way to monitor which of the images have rotted away whilst we weren't looking.

Is this something we could do with a GitHub Action?

@huwd huwd self-assigned this Aug 6, 2020
@benthorner
Contributor

I've thought about this myself in the past. It's a hard problem to solve, especially on a local machine (where these apps may be in all manner of strange under-development states). Since all the error state is inside containers, it could be horrendous to try and debug a problem happening inside an action. The sheer number of steps to run make all-apps to completion means that the task is also highly likely to fail for transient reasons.

  • Suggestion: remove make all-apps, as it's not something we need/recommend to run

Perhaps it's worth breaking this down:

  • What problems could we prevent with simpler, unit-style tests?
  • What problems will never happen again?
  • What problems arise externally to this repo?

One thing to note: it's not necessary for make <app> to pass in order for development to happen. make <app> does a "full" build to cover any API dependencies; that's not necessary for tests.

@huwd
Member Author

huwd commented Aug 6, 2020

Sure, good headings, so my starting point would be something like this:

What problems could we prevent with simpler, unit-style tests?

Where might those unit tests live? The problem I'd be hoping to address with a change is:

I return to an app after an absence and find I cannot build the govuk-docker image because updates have been made to the app that were not reflected in the docker-compose.yml or Makefile.

I wonder if it's a similar problem to Smokey? I forget it exists, and I can deploy my changes because all tests pass within the repo... but I forget to check if the Docker image still builds, and don't see Smokey fail until 2ndline gives me a tap on the shoulder.

🤔 I wonder if there's something like: after a successful deploy, trigger an attempt to build the deployed app if it is on an allowlist that govuk-docker supports. If the build fails, then report it somewhere?

What problems will never happen again?

I suspect you have some in mind, but I guess this will be the ones where we're doing initial setup - right? Things I'd expect to change are what an app depends_on:, and perhaps things like versions of databases or images that get pulled. Those could introduce incompatibilities, right?

What problems arise externally to this repo?

That's likely nearly all of them - as we're making changes to the apps and not these files. @injms, you had some examples where changes to Whitehall or some of its dependencies had caused it to stop building - could you give us a case study here?

@injms
Contributor

injms commented Aug 7, 2020

One example is a naming change for the database exports on S3 caused the local Whitehall database replication to fail - fixed in #277.

Reinstalling Whitehall is tricky - so I'm guessing that most people won't try it unless they really have to (and I count myself among those people!). Because of this, Whitehall can remain unbuildable without anyone noticing. It's this kind of small change in a far-off dependency that I'd like to catch before it becomes a problem.

Another thing last week was that to get Whitehall (frontend) running locally I needed to bump Rails to 5.2, and then change Whitehall's docker-compose.yml to use bin/rails server rather than rails server - or it would have a "can't find rails" error.

I think that it'd be really great to have some way of checking that a clean build works, from git clone to make {app} to ./replicate-{database-type}.sh {app} to govuk-docker-up (and anything inbetween that I've missed).
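
That sequence could be sketched as a small shell function - illustrative only: the clone URL is the obvious one for this repo, the replicate script name follows the pattern in the comment above, and real apps may well need extra steps in between.

```shell
#!/bin/sh
# Sketch of the clean-build check described above, end to end.
# Assumptions: the replicate script lives at the repo root and is named
# replicate-<database-type>.sh, per the comment; adjust per app.
clean_build_check() {
  app=$1
  db=$2   # database type, e.g. mysql

  git clone https://github.com/alphagov/govuk-docker &&
    cd govuk-docker &&
    make "$app" &&
    "./replicate-${db}.sh" "$app" &&
    govuk-docker-up
}

# Hypothetical usage: clean_build_check whitehall mysql
```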

Whilst I thought about GitHub Actions, it's the thing I'm most comfortable with - so likely there's an "I have a hammer and every problem looks like a nail" thought process here.

@benthorner
Contributor

Let's see what we can do. First, we need to make this issue closable: so far, if we achieved the stated goals - every app totally works in a throw-away environment - we would have re-implemented GOV.UK and 2ndline.

To clarify: when I was talking about "unit" tests, I actually meant these "integration" ones 🙄

Second, we need to make this issue worthwhile: the effort required to close it should be less than the effort required to fix the individual problems we are experiencing. So let's focus on those specific problems first:

I'm pleased to find a lot of the issues we're experiencing are easily preventable. Ultimately, I'm trying to avoid this repo ending up like this one - which, of course, we all love.

What problems arise externally to this repo?

I think we may find: relatively few. The reason for this is that, at least in theory, anyone making a change to a repo will be doing so with GOV.UK Docker as their dev environment, finding the problem, and fixing it.

Should we have some automated build testing?

Perhaps. I'm seeing it as a high-effort endeavour. On the other hand, I appreciate some of the examples above have come out of doing this manually. However, I suspect the problems and potential fixes above will tide us over for quite some time, so it may be imperfect, but good enough, to assume someone will run make all-apps every so often.

Next steps

  • We could try and implement some or all of the proposals above. Maybe put them into a card to work on.

  • We should continue this discussion to agree what is worthwhile about this issue, and when it is closable.

  • Make sure we can run make all-apps to completion at the moment, on a few different machines.

Thanks for raising this, by the way. I've been putting off thinking about that last Mongo DB problem, and this has been a good prompt to actually think about it.

@huwd
Member Author

huwd commented Aug 17, 2020

I think those are good principles, and I like how focused and simple govuk-docker is. I can appreciate where steps have been taken in bits like the Docker basefile to prevent complexity creeping in.

I'm interested that you see this as a high-effort thing. I guess, to go back to the first comment, you said:

The sheer number of steps to run make all-apps to completion means that the task is also highly likely to fail for transient reasons.

Could I get a better sense of what you were thinking about there? Is that perhaps like Docker Hub has gone down for 5 minutes or something?

I'm still wondering if the lowest complexity thing would be to:

  • Make sure we can run make all-apps to completion at the moment, on a few different machines. (<- so I'll keep going through and trying to fix issues where the build falls over, then ask a few others to try and build off master.)
  • Once we're happy declare the repo "clean".
  • Set up a github action that says:
    1. Once a week, run make all-apps. If it errors, report the repo it errored on (and 🤷‍♂️ at the idea you'd need to find others after it).
    2. Perhaps also set up something that says: if there are any changes within a project folder, rebuild just that project. (I'd know how to do this on Concourse; would need to check for a GitHub Action, but I'm sure there's a way to do monitoring?)
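
Step 1 could look something like the following sketch (assumptions: a Makefile per directory under projects/, as in this repo, and a purely illustrative exclusion list). The point is to keep going past failures and report them all at the end, rather than stopping at the first one:

```shell
#!/bin/sh
# Sketch: build every project with `make <app>` from the repo root,
# continuing past failures and reporting all of them at the end.
build_all_apps() {
  failed=""
  for makefile in projects/*/Makefile; do
    [ -f "$makefile" ] || continue          # no projects/ dir: nothing to do
    app=$(basename "$(dirname "$makefile")")
    case "$app" in
      generic-ruby-library|bouncer) continue ;;   # illustrative exclusions
    esac
    make "$app" || failed="$failed $app"
  done
  if [ -n "$failed" ]; then
    echo "Failed builds:$failed"
    return 1
  fi
  echo "All apps built"
}
```

The non-zero exit code is what a scheduled CI job would use to decide whether to report.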

The bit I'm really unsure of is where we could do this. I guess building everything will take a lot of minutes / resources, so I'm wondering if Concourse and PaaS would be a better place?

So my suggestion would be:
Either we all agree that the list @benthorner gave above is a set of "one off" teething issues, and that once we've got make all-apps working this eventuality will be far rarer? (@injms, do you think that is true of the instance you had?)

Or, having got make all-apps to work, we try some kind of conditional building with either Ian's GitHub Action hammer, or my Concourse one?

@benthorner
Contributor

I'm interested that you see this as a high-effort thing.
...
Could I get a better sense of what you were thinking about there?

Sure, it's worth thinking about how such an approach would work:

  • We could make this a prerequisite check for PRs. It would be weird to do this on individual app repos, as it would take some time to set up and look odd next to existing GitHub Actions. Having make all-apps as a PR check on this repo would slow down PRs a lot, and is mostly unnecessary (we seldom make changes that affect every app).

  • We could make this a scheduled job. This leads to the question: who will care when the job fails - enough to actually fix the failures? I'm basing my scepticism on the perpetual non-green state of the running GOV.UK platforms.

The reason for the transient failures is that running make for an app involves connecting to the Internet: rubygems.org, github.com, various package sites (if we're rebuilding an image). So "lots of apps" * "lots of sites" = "doom".
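
As an aside, transient network failures are sometimes absorbed with a simple retry wrapper - a sketch only, with arbitrary attempt count and back-off, and it obviously doesn't help with genuine breakage:

```shell
#!/bin/sh
# Sketch: retry a flaky command a fixed number of times before giving
# up, to absorb transient failures (e.g. a rubygems.org blip).
retry() {
  attempts=$1; shift
  n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1      # still failing after the final attempt
    fi
    n=$((n + 1))
    sleep 1         # brief, arbitrary back-off between attempts
  done
}

# e.g. retry 3 make whitehall
```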

If we're playing this agile, a simpler starting point could be to plan another make-all-apps-and-fix-everything piece of work, e.g. in 6 months. If we find it's useful, we could schedule another; at some point, we may think it's worth doing more frequently (e.g. if we're not catching things quickly enough), up to the point where we automate it.

We can also do some of the above suggestions to help prevent recurring problems.


How about:

  • A card to "Run make all-apps to completion and raise/fix all issues"

  • A card to "Add tests for preventable config issues":

    • Investigate forcing all Rails apps to run in "privileged" mode (to prevent this)
    • Investigate forcing all Ruby commands in Makefiles run with bundle / bin (to prevent this)
    • Investigate using a generic step for Postgres / Mongo create/migrate (to prevent this)
  • A card to "Add a PR template to ensure individual apps actually work":

    • Does make <app> run to completion, without errors?
    • Can you run the tests without any failures?
    • Can you start the app (if applicable) and visit key parts of it?

Once we've done these things, we could consider this issue closed. How does that sound? Would you be willing to write up some cards (on the "GOV.UK Developer Tools" board) to work on this?

@huwd
Member Author

huwd commented Sep 30, 2020

Humm - I was trying to go through these all, and generated a bunch of issues, but realised I was tripping myself up by trying to exclude "broken" ones that would then be called upon by others and break. That was a bad testing methodology on my part.

I've deleted those issues, apologies if folks got email alerts.

Nonetheless, here's what I've got so far - these are real, I believe:

@huwd
Member Author

huwd commented Oct 7, 2020

Good news, I can get everything to build except:

After that they all build 🎉

So I'll track down the last of these, ensure we get good tickets, then tick them off.
After which I'll just periodically run a make on everything to check it's still working, and we can review whether anything more complicated is needed to ensure everything remains well.

If others want to try, here's the command I ran.
Start in your govuk-docker repo:

ls projects/*/Makefile | xargs -L 1 dirname | xargs -L 1 basename | grep -v generic-ruby-library | grep -v bouncer | grep -v specialist-publisher | xargs make

@benthorner
Contributor

Whoop, thorough job @huwd 🏅.

specialist-publisher 🤔 humm, thought we'd solved this one... but perhaps not. Will try to reproduce some more and open a new issue.

I tried this myself and make specialist-publisher runs to completion. Have you pulled recently?

So i'll track down the last of these, ensure we get good tickets then tick them off.

Is there anything we want to carry forward from my comment?

@huwd
Member Author

huwd commented Oct 15, 2020

Humm yeah just confirmed make specialist-publisher works on mine too.
Ok so as of today everything builds...

I guess yeah, what do we want to do now?

@huwd
Member Author

huwd commented Oct 15, 2020

How about:

  • A card to "Run make all-apps to completion and raise/fix all issues"

  • A card to "Add tests for preventable config issues":

    • Investigate forcing all Rails apps to run in "privileged" mode (to prevent this)
    • Investigate forcing all Ruby commands in Makefiles run with bundle / bin (to prevent this)
    • Investigate using a generic step for Postgres / Mongo create/migrate (to prevent this)

  • A card to "Add a PR template to ensure individual apps actually work":

    • Does make <app> run to completion, without errors?
    • Can you run the tests without any failures?
    • Can you start the app (if applicable) and visit key parts of it?

Once we've done these things, we could consider this issue closed. How does that sound? Would you be willing to write up some cards (on the "GOV.UK Developer Tools" board) to work on this?

This sounds like a good plan to me - will write up some cards, and count this as closed.

@huwd huwd closed this as completed Oct 15, 2020
@benthorner
Contributor

Thanks @huwd!

Investigate forcing all Rails apps to run in "privileged" mode (to prevent this)

This one shouldn't be necessary for much longer, once this is merged in.
