-
Notifications
You must be signed in to change notification settings - Fork 6
Restrict merge access to most GOV.UK repos #52
Conversation
887a366
to
6cf8ae5
Compare
This restriction is necessary for apps with Continuous Deployment enabled i.e. where "merge" is the same as "deploy". Currently, the set of developers without production access may include people without security clearance, for whom we have to assume a worst case scenario. As part of moving to Continuous Deployment, we will incrementally override each repo to add this restriction. Rather than adding lots of overrides, we should try to make a clear policy decision about whether non-production developers should be able to merge PRs. Exceptions to this: - govuk-puppet - some changes only apply to Integration, such as initial PRs for someone joining GOV.UK [1] Reasons for this change: - We avoid building up a set of overrides for Continuous Deployment. - We more clearly align deploying and change with merging it (since the person merging it is always able to). - People deploying changes can have more visibility of them, having merged the changes they are deploying themselves. - A higher barrier to merge for new joiners should promote the good habit of doing branch deploys for any manual testing. Reasons against this change: - Some loss of satisfaction of merging your own PR, even though you couldn't actually deploy it to Production. - Some additional workload on developers with production access, who will now be asked to merge PRs, as well as deploy them. [1]: https://docs.publishing.service.gov.uk/manual/get-started.html#6-get-ssh-access-to-integration
6cf8ae5
to
e0d593b
Compare
This isn't going to cause a problem for my team, but for now we're working outside the usual apps, on stuff which is currently only deployed to the PaaS (and all of us have production access anyway). So I'll leave it to people with more normal teams to approve. Though I'm not too sure about
Does that mean only people with production access will be able to approve PRs for puppet, so they can verify that the change isn't going to affect staging or production? |
@barrucadu perhaps I need to clarify what I mean by an "exclusion". The affect of this change will be that only developers with production access will be able to click the "merge" button; anyone can still approve. I'm suggesting we exclude Puppet (i.e. keep the repo as it is now) because it's sort of OK to merge a PR without deploying further if it only applies to Integration (that, and we're not likely to enable CD for it!). Does that answer your question? |
This will restrict merging Dependabot's many PRs (although maybe less soon!) to people with (permanent?) production access. Currently, there are significantly fewer production-cleared developers compared to the stupid number of repos, and a stupid number of Dependabot PRs. Is that sustainable? Should we relate "learning about how to keep software up to date, effectively test it and be confident in its operation" with "being able to deploy to Production"? Partially related: we're potentially going to hire more juniors, how do we ensure that they are getting the rounded education they need in how to be software engineers if they can't self-merge their own PRs and sometimes see the pitfalls and brokenness of doing that too early, and debug broken stuff in integration? I know that times have changed, but I'm not sure I'd have got where I am today - having diligently waited years for production access - with these restrictions if they'd been around six years ago. Y'know? How do we ensure that people take care and pride in the changes that they make, if they don't get the satisfaction of merging them themselves? It's about trust at the end of the day, and I think that this erodes some of that. I have more thoughts, but I can't form them in my brain right now and at the risk of this turning into an essay, I'm going to stop here. This should give more than enough to think about! |
Ok, that clears things up for me. Thanks. |
@issyl0 thanks for your thoughts. I'll try and respond here.
I think so. Anyone can still review and approve a PR, which is the real time consuming part. Having to ask someone to click a green button before deploying isn't a significant change.
It sounds like there's a need to let people break Integration on the master branch, so they learn to not break Integration on the master branch. How do they learn this at the moment?
Are you saying we need to be able to press a green button in order to motivate us to do a good job?
Our current process means that, of the people without prod access, we can't tell who is security cleared and not. The argument of this PR is really that we're a little too trusting of this group at the moment. How trusting should we be of people without production access? |
Seems like the main downside is more "process" / "friction" for teams (e.g. checking if changes merged, finding some to merge, making sure that they also deploy it etc. etc). Although maybe this isn't that much more compared to the overhead of just needing someone else to deploy. The end goal is enabling CD, which should overall reduce the "process". For me it's a question of "does the extra work of slowly adding manual overrides as we rollout CD worth the extra "process" of merge restrictions without CD?" And if we do want add this blanket rule, should we wait closer until we are rolling out CD across all apps to minimise extra "process"? |
Thanks for explaining your thoughts @theseanything. If we think of this PR in isolation, then I agree there's little motivation to make a change. The reason I've raised this is because I can see it being a thorny issue in the upcoming RFC and subsequent rollout for Continuous Deployment. This would be a shame, because what we're talking about here isn't related to CD - it's just a tangential compatibility change. I'll try and list the alternatives I see:
So to answer your questions: I think it is worth making this change now: partly for the advantages ☝️, which support CD but may also be beneficial in general; and partly for the learning we get from applying it on its own. How do you see it playing out? |
Definitely agree that we shouldn't trying to change our security processes now. (but possibly worth revisiting in future)
To be honest, I really don't have a strong view for either way.
Slightly lean towards the incremental approach - as it allows us to rollout slowly and tackle issues as they come up, and avoids getting into an all or nothing situation. Accompanying it with CD will potentially lessen any "process" for teams and helps people realise why the change. From what we've noticed on Smart Answers, it's really been a non-issue. There was slight confusion as to why someone couldn't merge - but it was easy to explain and didn't result in any consequences. |
Thanks for raising this Ben. Sorry for not getting involved in the discussion sooner. My main concern is that this will be disempowering for developers without "production access". Admittedly they're mostly disempowered already by not being able to deploy their own changes, so this may not be making things that much worse. When we're moving to continuous deployment I think it's a small price to pay to introduce this restriction, but I'm not sure about rolling it out everywhere in preparation - that feels like it will be disempowering with little benefit. I guess that makes me in favour of an incremental approach (i.e. add this restriction as we roll CD out to new apps). I know you said you wanted to avoid rewriting our whole security process, but (without wanting to derail the conversation entirely) I think it's worth considering it here. I think the main issue is that our levels of permission aren't granular enough to represent the jobs that we want people to do. Currently, when someone starts on GOV.UK they get access to push code, deploy to integration, ssh onto integration machines etc., but there's no way for them to affect production (other than merging a PR and waiting for some patsy to deploy it for them). The next level of access is full production access, which basically means being an administrator of everything - ssh onto any machine, super admin in signon, AWS admin, etc. Because production access means potentially using a bunch of very sharp tools, we want to make sure people aren't going to cut themselves (or the rest of us) before giving them that level of access. Consequently, that means there's a long, arduous process people have to go through before getting production access. Gating being able to merge PRs on having done two shadow 2ndline shifts and one ordinary 2ndline shift feels overly burdensome. Personally, I think it would be better to have something more along the lines of:
... but that probably wants a separate RFC. What I'd suggest regarding this PR is:
|
Disempowering! That's the word that sums up my long, emotive, meandering comment above. |
@richardTowers thanks for all your thoughts on this. I like the sound of the revised process for granting access - CD should make it easier to implement, since we won't actually need to grant access beyond Integration. Looking over the comments, I'm reading the conclusion as: close this PR and apply it incrementally. Before I do this, I'd like us to agree (in principle) what the future will look like, which is really why I raised this PR:
@issyl0 @richardTowers @theseanything @barrucadu have I got that right? (You can just 👍). I'd much rather found out any lingering concerns we have now than have a new version of this discussion on the RFC. |
https://trello.com/c/TUB9OGWD/152-prepare-and-run-a-beta-trial-of-continuous-deployment
Deadline for discussion: 2020-06-30
Follow-up tasks: