Should we have some automated build testing? #384
I've thought about this myself in the past. It's a hard problem to solve, especially on a local machine (where these apps may be in all manner of strange under-development states). Since all the error state is inside containers, it could be horrendous to try and debug a problem happening inside an action. The sheer number of steps to run everything is also daunting.
Perhaps it's worth breaking this down:
One thing to note: it's not necessary for
Sure, those are good headings, so my starting point would be something like this:

- What problems could we prevent with simpler, unit-style tests?
- Where might those unit tests live?

The problem I'd be hoping to address with a change is:
I wonder if it's a similar problem to smokey? I forget it exists, and I can deploy my changes because all tests pass within the repo... but I forget to check if the Docker image still builds, and don't see smokey fail until 2nd line gives me a tap on the shoulder. 🤔 I wonder if there's something like: after a successful deploy, trigger an attempt to build the deployed app.

**What problems will never happen again?**

I suspect you have some in mind, but I guess this will be the ones where we're doing initial setup - right? Things I'd expect to change are what an app depends on.

**What problems arise externally to this repo?**

That's likely nearly all of them - as we're making changes to the apps and not these files. @injms you had some examples where changes to whitehall or some of its dependencies had caused it to stop building - could you give us a case study here?
One example: a naming change for the database exports on S3 caused the local Whitehall database replication to fail - fixed in #277. Reinstalling Whitehall is tricky - so I'm guessing that most people won't try it unless they really have to (and I count myself among those people!). Because of this, Whitehall can remain un-buildable without anyone noticing. It's this kind of small change in a far-off dependency that I'd like to catch before it becomes a problem.

Another thing last week was that to get Whitehall (frontend) running locally I needed to bump Rails to 5.2, and then change Whitehall's configuration to match.

I think that it'd be really great to have some way of checking that a clean build works, from scratch.

Whilst I thought about GitHub Actions, it's the thing I'm most comfortable with - so likely there's an 'I have a hammer and every problem looks like a nail' thought process here.
Let's see what we can do. First, we need to make this issue closable: so far, if we achieved the stated goals - every app totally works in a throw-away environment - we would have re-implemented GOV.UK and 2ndline.
Second, we need to make this issue worthwhile: the effort required to close it should be less than the effort required to fix the individual problems we are experiencing. So let's focus on those specific problems first:
I'm pleased to find a lot of the issues we're experiencing are easily preventable. Ultimately, I'm trying to avoid this repo ending up like this one, which, of course, we all love.
I think we may find: relatively few. The reason for this is that, at least in theory, anyone making a change to a repo will be doing so with GOV.UK Docker as their dev environment, finding the problem, and fixing it.
Perhaps. I'm seeing it as a high-effort endeavour. On the other hand, I appreciate some of the examples above have come out of doing this manually. However, I suspect the problems and potential fixes above will tide us over for quite some time, so it may be imperfect, but good enough, to assume someone will run `make all-apps` by hand every so often.

**Next steps**
Thanks for raising this, by the way. I've been putting off thinking about that last MongoDB problem, and this has been a good prompt to actually think about it.
I think those are good principles and I agree. I like how focused and simple this keeps things.

I'm interested that you see this as a high-effort thing. I guess, to go back to the first comment, you said:
Could I get a better sense of what you were thinking about there? Is that perhaps something like Docker Hub going down for 5 minutes? I'm still wondering if the lowest-complexity thing would be to:
The bit I'm really unsure of is where we could do this. I guess building everything will take a lot of minutes / resources, so I'm wondering if Concourse and PaaS would be a better place?

So my suggestion would be:

Or, having got
Sure, it's worth thinking about how such an approach would work:
The reason for the transient failures is that

If we're playing this agile, a simpler starting point could be to plan another make-all-apps-and-fix-everything piece of work, e.g. in 6 months. If we find it's useful, we could schedule another; at some point, we may think it's worth doing more frequently (e.g. if we're not catching things quickly enough), up to the point where we automate it. We can also do some of the above suggestions to help prevent recurring problems. How about:
Once we've done these things, we could consider this issue closed. How does that sound? Would you be willing to write up some cards (on the "GOV.UK Developer Tools" board) to work on this?
Hmm, I was trying to go through these all and generated a bunch of issues, but realised I was tripping myself up by trying to exclude "broken" ones that would then be called upon by others and break. That was a bad testing methodology on my part. I've deleted those issues; apologies if folks got email alerts. Nonetheless, these are real, I believe. So far here's what I've got:
Good news, I can get everything to build except:
After that they all build 🎉 So I'll track down the last of these, ensure we get good tickets, then tick them off. If others want to try, here's the command I ran.
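The exact command isn't captured above; as a hypothetical sketch (not the author's actual command), building every app and collecting the failures, rather than aborting on the first one, could look like this:

```shell
# Hypothetical helper: run a build command for each app target and
# report every failure at the end, instead of stopping at the first.
build_all() {
  build_cmd="$1"
  shift
  failed=""
  for app in "$@"; do
    # Build this app; record its name if the build fails.
    if ! "$build_cmd" "$app"; then
      failed="$failed $app"
    fi
  done
  if [ -n "$failed" ]; then
    echo "failed:$failed"
    return 1
  fi
  echo "all builds succeeded"
}
```

With this repo's per-app `make` targets, the call might look like `build_all make whitehall content-publisher` - the target names here are only examples.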
Whoop, thorough job @huwd 🏅.
I tried this myself and
Is there anything we want to carry forward from my comment?
Hmm, yeah, just confirmed. I guess, yeah - what do we want to do now?
This sounds like a good plan to me, will write up some cards.
Was chatting with @injms and thinking: perhaps once a week we pull this repo and a bot / GitHub Action tries to run `make all-apps`, and then reports back in some way if any build fails?

It strikes me that there are a few images in here that are lesser used, but might be returned to.
Especially whilst we're in a world of pausing projects to jump onto pressing national concerns, it would be good if there was a way to monitor which of the images have rotted away whilst we weren't looking?
Is this something we could do with a GitHub Action?
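The weekly-build idea above could be sketched as a scheduled GitHub Actions workflow. Everything here (file name, cron schedule, runner choice) is illustrative, and it assumes `make all-apps` can be run on a stock runner:

```yaml
# .github/workflows/weekly-build.yml - illustrative sketch only
name: Weekly image build check
on:
  schedule:
    - cron: "0 6 * * 1"  # every Monday, 06:00 UTC
  workflow_dispatch: {}  # also allow manual runs
jobs:
  build-all-apps:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build every app image
        run: make all-apps
```

Reporting back could be as simple as letting the scheduled run show up red in the Actions tab, with anything fancier (opening an issue, posting to Slack) layered on later.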