On-call check

Part of the Multi-team Software Delivery Assessment (README)

Copyright © 2018-2021 Conflux Digital Ltd

Licenced under CC BY-SA 4.0

Permalink: SoftwareDeliveryAssessment.com

Based on selected criteria from the following books:

Definition of on-call: for this assessment, "on-call" means being available and responsible for diagnosing and fixing (through workarounds or updated code) any problems in the live/production systems that relate to software that you and your team create and evolve. You might be available during working hours or outside of working hours.

NOTE: The subject of on-call is very emotive and there is significant context and nuance behind the assessment criteria here. We recommend that you read at least these two articles:

  1. On Call Shouldn’t Suck: A Guide For Managers
  2. On-call doesn’t have to suck

Try to understand the social context in which the criteria for Tired and Inspired would make sense. At one extreme, paying people 3x or 4x normal salary to be on-call could incentivize more bugs reaching the live systems (because the more problems that occur in live, the more money they get paid for being on-call); conversely, having on-call open only to those people with compatible home lives could exclude many people with home care responsibilities, depriving them of valuable experience.

Purpose: Assess the approach to on-call support within the software system. 

Method: Use the Spotify Squad Health Check approach to assess the team's answers to the following questions, and also capture the answers:

| Question | Tired (1) | Inspired (5) |
| --- | --- | --- |
| 1. Purpose of on-call - How would you define "on-call"? | On-call is a way to get developers to fix problems that people in Support or Live Services don't know how to fix. | On-call is a sensing mechanism to help teams build better software. |
| 2. Benefits of on-call - What are some ways in which the software benefits by having developers on-call? | Bugs are fixed quickly. | The needs of all kinds of users can be better understood by having team members on-call. We can better empathise with primary/secondary/tertiary users by seeing the problems for ourselves. |
| 3. Reward - How are you rewarded for being on-call out of working hours? | Significant compensation/money - 3x or 4x normal salary - plus additional time off. The more bugs in the software that reach live/production, the more money we make. | We are recognized for our increasing skills as engineers: experience from on-call counts towards our performance reviews. We may also get some time off to recover from out-of-hours on-call and/or some additional money for out-of-hours on-call. Overall, on-call feels valuable for our careers. |
| 4. On-call UX - What is the User Experience (UX) / Developer Experience (DevEx) of being on-call at the moment? | It is painful and slow to diagnose problems. | The tools and access rights make diagnosis exciting and an opportunity to learn. |
| 5. Learning from on-call - What happens to knowledge gained during on-call? How is the software improved based on on-call experiences? | Little time is allocated to fix problems after they are discovered when on-call. | Learning from on-call is used to prioritise key aspects of the team's work. |
| 6. Attitude to on-call - Under what circumstances would on-call not be a burden? | On-call would not be a burden if we never had to do it. | On-call is not a burden - it's a privilege to be able to learn how the software actually works. |
| 7. Future on-call - What would be needed for this team/squad to be happy to be on-call? | We would want significant additional money/compensation. | We would want a great UX/DevEx and opportunity to learn when on-call. |
| 8. Tooling for on-call - What tooling or process is missing, ineffective, or insufficient at the moment in relation to on-call? | All aspects of the on-call experience are ineffective. | Only very small things feel like a problem. |
| 9. Improving on-call - How much time do you spend as a team improving the on-call experience? How often do you work on improvements to on-call? | We don't have time to improve the on-call experience. | We make improvements and tweaks to on-call every week - it's continuous and part of our remit. |
| 10. Flexibility of on-call - How flexible is the on-call rota or schedule? In what ways does the schedule meet the different needs of team members? | Everyone must follow the same on-call schedule, including out-of-hours work. | Team members have flexibility to fit on-call work around their personal commitments and/or can opt to do on-call work solely during working hours (office hours). |
| 11. Accessibility of on-call - How accessible is on-call? Specifically, what proportion of your team members are actually on-call regularly? | Only one or two people from our team typically go on-call. Other people find on-call too difficult or confusing. | Everyone on our team takes part regularly in the on-call rota, whether during office hours or out-of-hours. We all share our on-call experiences and learning. |
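
One possible way to capture the answers is sketched below in Python. This is an illustration only, not part of the assessment itself: the short question labels, the `summarise` and `lowest` helpers, and the choice of the median as the summary statistic are all assumptions. The only thing taken from the questions above is the 1 (Tired) to 5 (Inspired) scale.

```python
from statistics import median

# Hypothetical short labels for the eleven questions, in order.
QUESTIONS = [
    "Purpose", "Benefits", "Reward", "On-call UX", "Learning", "Attitude",
    "Future", "Tooling", "Improving", "Flexibility", "Accessibility",
]


def summarise(scores):
    """Return the median score per question.

    `scores` maps a question label to the list of individual scores,
    each between 1 (Tired) and 5 (Inspired).
    """
    summary = {}
    for question, values in scores.items():
        if question not in QUESTIONS:
            raise ValueError(f"unknown question: {question}")
        if not values or any(not 1 <= v <= 5 for v in values):
            raise ValueError(f"scores for {question} must be between 1 and 5")
        summary[question] = median(values)
    return summary


def lowest(summary, n=3):
    """Return the n lowest-scoring questions, lowest first.

    Ties are broken by the order of the questions in the assessment.
    """
    ranked = sorted(summary, key=lambda q: (summary[q], QUESTIONS.index(q)))
    return ranked[:n]
```

For example, `lowest(summarise({"Reward": [1, 2, 5], "On-call UX": [4, 4]}), n=1)` would surface "Reward" as the area to discuss first. A median is used here so that one strongly held view does not dominate the result; a team may prefer to look at the full spread of scores instead.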