Adopt triage Service-Level Objectives (SLOs) #9820

Open
jyasskin opened this issue Jan 19, 2024 · 9 comments

@jyasskin
Member

jyasskin commented Jan 19, 2024

<edit-for-tldr>
Some basic goals:

  • Make sure someone goes over every new issue and PR in a reasonable amount of time and thinks about how important they are.
  • Provide a way to set deadlines for the more important issues.
  • Notice when we miss those deadlines, so we know to allocate more time to triaging/fixing issues.

</edit-for-tldr>

I've set up a system at https://speced.github.io/spec-maintenance/ that attempts to describe whether groups are triaging their new issues quickly, and then keeping on top of the issues they describe as urgent or important. This is designed to make spec maintenance fit into the ways Google management prioritizes code maintenance, but I'm hoping that it'll also 1) fit into the ways other organizations prioritize work, 2) help external observers see what's going on, and 3) help WGs focus on what they've decided is important.

I'd like to talk to the CSSWG about whether the overall idea sounds like it'll help you, and then what changes you'll need in order to adopt it. After talking to @tabatkins a bit, our rough proposal is to mark every open issue with the Priority: Eventually label (meaning, no SLO for closing the issues), so you start with no SLO violations, and then start triaging from there.

A sketch of the current design:

  • The SLOs of 1 week to triage issues (which includes PRs), 2 weeks to resolve urgent issues, and 3 months to resolve important issues are based on data about how long it takes to discuss and close existing issues on web specifications, summarized at https://github.com/speced/spec-maintenance?tab=readme-ov-file#triage. These could change if you think particular other times would work better.
  • It doesn't count time while the issue was closed, while a PR was a draft, or while the issue was marked "needs reporter feedback" and the reporter hasn't responded yet.
  • It currently includes time before an issue was marked as urgent or important, but Tab pointed out I should change that; see "Count SLOs from when their label was added" (speced/spec-maintenance#4).
  • SLO compliance is available in JSON files like https://speced.github.io/spec-maintenance/w3c/csswg-drafts.json. Currently the history of SLO compliance (for pretty graphs) isn't anywhere public, but it's straightforward to collect, given a place to store the data.
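
The timing rules in this list can be illustrated with a minimal sketch. It is not the tool's actual code: the function names, the category keys, and the choice of 90 days for "3 months" are assumptions made for illustration. It counts from when the issue was opened, matching the current behavior described above, and subtracts intervals during which the clock is paused.

```python
from datetime import datetime, timedelta

# Deadlines from the sketch above. "3 months" is approximated as 90 days,
# which is an assumption of this example.
SLO = {
    "triage": timedelta(weeks=1),
    "urgent": timedelta(weeks=2),
    "important": timedelta(days=90),
}

def counted_time(opened, now, paused=()):
    """Time between `opened` and `now`, minus paused intervals.

    `paused` holds (start, end) pairs for periods when the issue was closed,
    a PR was a draft, or reporter feedback was pending. The pairs are
    assumed not to overlap each other.
    """
    total = now - opened
    for start, end in paused:
        # Clip each pause to the [opened, now] window before subtracting it.
        start, end = max(start, opened), min(end, now)
        if end > start:
            total -= end - start
    return total

def out_of_slo(category, opened, now, paused=()):
    """True if the counted time exceeds the deadline for `category`."""
    return counted_time(opened, now, paused) > SLO[category]

opened = datetime(2024, 1, 1)
now = datetime(2024, 1, 19)
week_closed = [(datetime(2024, 1, 5), datetime(2024, 1, 12))]
print(out_of_slo("urgent", opened, now))               # True: 18 days > 14
print(out_of_slo("urgent", opened, now, week_closed))  # False: 11 days counted
```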

What do you think?

@svgeesus
Contributor

Firstly, yes, we need something like this: there are issues and PRs that are severely overdue. Now on to the details:

I see that CSS WG is listed as having "no triage labels", which I presume means we don't have all of the ones listed. We do have, and use, the triage label Agenda+ (and also others, which are not listed):

image

Notice that in addition to Agenda+ we have Agenda+ F2F, Agenda+ Whiteboard, and Agenda+ TPAC (which means we need face-to-face time and a whiteboard, not just a phone conference).

The TPAC one, and also Agenda+ I18n, also imply the need for cross-working-group discussion. Which is one special case of something which seems to be missing from the proposed organizational scheme: multi-stakeholder consensus.

One could easily add Agenda+ A11y or Agenda+ Privacy, for example, for common pairwise groupings. But, and I am thinking specifically of one long-running example here, there are cases where an issue is blocked for years because (in this case) I18n wants one thing and Privacy wants the exact opposite, and we can't move forward because all the proposed solutions fundamentally disenfranchise some group of constituents.

Which brings me to another example: Needs Reporter Feedback. I agree by the way that this label is needed. But again it implies a simple two-party stakeholder situation: the reporter, and the WG.

More commonly, in my experience, an issue (and even a potential solution) is raised by a reporter from Implementer A, and what we need is feedback from Implementers B, C and D. That is what takes the time, because issue prioritization is not the same among all implementers. Often an issue has been around for a while, but internally A suddenly promotes it to super urgent and needs a solution, like, yesterday so they can ship it, while B is working on it on a 3 to 6 month timescale, and C doesn't want to do it at all but certainly doesn't want the WG to settle on the solution proposed by A.

@svgeesus
Contributor

svgeesus commented Jan 23, 2024

Data point: CSS WG has 71 open and 1817 closed PRs; the oldest open one is from Feb 2017.

@svgeesus
Contributor

svgeesus commented Jan 23, 2024

Assuming we don't want to impose one-month or even 3-month SLOs on all the horizontal review groups, then the Horizontal Review issues such as this one would need to be excluded. They are started when horizontal review is first triggered, and typically will be open for a bare minimum of 3 to 6 months (often much longer) until all reviews are completed and the spec has transitioned to CR Snapshot. These all have the Administrative Tracker label (another of the triage labels that CSS WG has been using for a while now).

Although, that label also gets used for "hey folks, this should really be FPWD right" and "time for Proposed Recommendation, surely" issues which I do want to be part of the SLO.

@svgeesus
Contributor

@jyasskin ideally, "needs reporter feedback" ahead of this week's call.

@jyasskin
Member Author

jyasskin commented Jan 23, 2024

Yes, the "no triage labels" mark really means "no labels this tool understands", and it isn't exactly that either since I took Tab's suggestion of treating "agenda+" as having a 90-day deadline. The practical effect of that mark is to expand the set of issues that get treated as "triaged": With it, any issue with a non-author reply is "triaged", and without the mark, only issues with a known label count as "triaged".
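
As a rough illustration of the rule in the previous paragraph, the sketch below decides whether an issue counts as "triaged". The function name, its parameters, and the contents of `KNOWN_LABELS` are hypothetical; only `Priority: Eventually` and `agenda+` are label names taken from this thread, and the other spellings are placeholders. It also assumes that a known label still counts as triage when the "no triage labels" mark is set, since the mark is described as expanding the set.

```python
# Hypothetical set of labels the tool understands. Only "Priority: Eventually"
# and "agenda+" appear in this thread; the other spellings are placeholders.
KNOWN_LABELS = {
    "Priority: Urgent",
    "Priority: Important",
    "Priority: Eventually",
    "agenda+",
}

def is_triaged(labels, has_non_author_reply, no_triage_labels_mark,
               known_labels=KNOWN_LABELS):
    """Return True if the issue counts as triaged under the rule above."""
    # A label the tool understands always counts as triage (assumption).
    if any(label in known_labels for label in labels):
        return True
    # With the "no triage labels" mark, a reply from someone other than
    # the issue's author is enough.
    return no_triage_labels_mark and has_non_author_reply

print(is_triaged(["css-color-4"], True, no_triage_labels_mark=True))   # True
print(is_triaged(["css-color-4"], True, no_triage_labels_mark=False))  # False
```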

I'm also intentionally not giving /agenda\+.*/ a deadline, since "Agenda+ TPAC" could legitimately be marked that way for most of a year. But I could teach the tool to understand any of your existing labels and give them custom deadlines if y'all want. The countervailing pressures are that 1) it'd be good to align on some common labels that all web groups can share, and 2) my summary dashboard is going to have to simplify the structure to fit all groups into a common display. But making it useful to you can override those.
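
The difference between matching the bare label and matching `/agenda\+.*/` can be shown with a small sketch. The function name is hypothetical, and case-insensitive matching is an assumption; the 90-day figure is the one mentioned above for "agenda+".

```python
import re
from datetime import timedelta

# Matches only the bare label, not variants such as "Agenda+ TPAC".
# Ignoring case is an assumption of this sketch.
AGENDA_EXACT = re.compile(r"agenda\+", re.IGNORECASE)

def agenda_deadline(label):
    """Return a 90-day deadline for the bare agenda+ label, else None."""
    if AGENDA_EXACT.fullmatch(label.strip()):
        return timedelta(days=90)
    return None

for label in ["Agenda+", "Agenda+ TPAC", "Agenda+ F2F"]:
    print(label, agenda_deadline(label))
```

Using `fullmatch` rather than a prefix match is what keeps "Agenda+ TPAC" from inheriting the deadline.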

I can easily expand the set of "Needs * Feedback" labels that pause the SLOs. Or we could decide that if an issue needs feedback from non-WG folks, and the WG doesn't think it's important enough to actively hound those folks to finish the feedback, then the issue probably shouldn't have a WG-side SLO. Labeling the issue with Priority: Eventually (however that gets spelled) will make it not matter whether the SLO is paused. Issues about getting wide review, especially, should not have an SLO from the CSS side. They should ideally be paired with an issue for the wide review group, which does have an SLO.
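
One way the expanded set of pausing labels might be recognized is with a wildcard pattern, sketched below. The function names and the example label "Needs Implementer Feedback" are hypothetical. The sketch only checks labels; it does not model the rule that the pause ends once the requested feedback arrives.

```python
from fnmatch import fnmatchcase

def pauses_slo(label):
    """True for labels of the form "Needs <something> Feedback"."""
    # Lowercase first so the comparison ignores case (an assumption).
    return fnmatchcase(label.lower(), "needs * feedback")

def closing_slo_paused(labels):
    """True if any label on the issue pauses its closing SLO."""
    return any(pauses_slo(label) for label in labels)

print(pauses_slo("Needs Reporter Feedback"))     # True
print(pauses_slo("Needs Implementer Feedback"))  # True (hypothetical label)
print(pauses_slo("Needs Edits"))                 # False
```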

I also want to make sure we're clear on how many issues should have an SLO/deadline. I'd expect that most issues would not have an SLO, and you'd intentionally mark an issue as urgent or important when the right stakeholders agree to work on it more intensely. Another possible pattern would be for an implementer to mark an issue as urgent/important when they want an answer quickly, but then for the WG as a whole to remove the SLO again if the other implementers aren't willing to focus on that issue quickly. But maybe I'm wrong to expect those patterns! I'd like this call to tell me what patterns to expect.

@svgeesus
Contributor

The countervailing pressures are that 1) it'd be good to align on some common labels that all web groups can share, and 2) my summary dashboard is going to have to simplify the structure to fit all groups into a common display.

Oh, absolutely. And my strong preference is that all groups use the exact same spelling and the same colors for those labels, too. People in multiple groups will be glad of the usability boost that provides.

Or we could decide that if an issue needs feedback from non-WG folks, and the WG doesn't think it's important enough to actively hound those folks to finish the feedback, then the issue probably shouldn't have a WG-side SLO.

Right, in some cases an issue is open because "it would be nice to let this implementer know" or "that survey of web technologies might usefully gather stats on feature foo, next year". I don't want that to negatively impact an SLO rating score, but I also don't want such issues prematurely closed to boost the metric.

I also want to make sure we're clear on how many issues should have an SLO/deadline. I'd expect that most issues would not have an SLO, and you'd intentionally mark an issue as urgent or important when the right stakeholders agree to work on it more intensely.

That wasn't clear, so thanks for saying so.

Also in the interests of being clear: I think the overall idea is great, and love how it is data driven, based on analyzing GitHub issues across all of W3C (and beyond). But I am also aware that once a thing is important and gets an automated metric, people mostly focus on improving the metric, not improving the thing, which can lead to antipatterns like premature issue closure, not using tags like "Needs more tests" that would delay closure, and so on. So I want to help get it right.

@smfr changed the title from "Adopt triage SLOs" to "Adopt triage Service-Level Objectives (SLOs)" on Jan 24, 2024
@fantasai
Collaborator

Fwiw, SLOs I'd like the CSSWG to maintain include:

  • Fewer than 20 Agenda+ items in the backlog.
  • Issues not remaining Needs Edits for an extended period of time.
  • Maximum 3 weeks between WG-approved edits and publishing to /TR, and ideally only 1-2 days unless additional review is needed for those edits.
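
Two of these targets are concrete enough to check mechanically. The sketch below is only an illustration with hypothetical function names; the "Needs Edits" target is left out because the list does not say how long "an extended period" is.

```python
from datetime import date, timedelta

MAX_AGENDA_BACKLOG = 20                 # "less than 20 Agenda+ items"
MAX_PUBLISH_DELAY = timedelta(weeks=3)  # "maximum 3 weeks" to reach /TR

def agenda_backlog_ok(open_agenda_items):
    """True if the Agenda+ backlog is under the target size."""
    return open_agenda_items < MAX_AGENDA_BACKLOG

def publish_delay_ok(approved_on, published_on):
    """True if WG-approved edits reached /TR within the maximum delay."""
    return published_on - approved_on <= MAX_PUBLISH_DELAY

print(agenda_backlog_ok(50))                                  # False
print(publish_delay_ok(date(2024, 1, 3), date(2024, 1, 24)))  # True: exactly 3 weeks
```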

@fantasai added the meta label on Jan 31, 2024
@css-meeting-bot
Member

The CSS Working Group just discussed Adopt triage Service-Level Objectives (SLOs), and agreed to the following:

  • RESOLUTION: adopt this triaging tool
  • RESOLVED: Adopt triage Service-Level Objectives (SLOs)
The full IRC log of that discussion
<Frances> Jeffrey: system in place to look at every issue when it comes in and take to deal with or if it is an open ended issue and force the deadlines if we miss them
<Frances> Jeffrey: would like feedback on changes to the design from the working group, add to Elik'a suggestion to phrase them in terms of GitHub labels and an automated system. Everything in discussion looks implementable.
<chrishtr> I'm a big fan of this, it will help us avoid accidental mistakes
<Frances> Alan: Sounds great, we can try a few things out. Some might be useful, some might not.
<astearns> ack fantasai
<Zakim> fantasai, you wanted to ask about F2F? and to
<florian> q+
<jyasskin> q+ to extra agenda+ labels
<Frances> fantasai: previously had agenda+ in a previous upcoming agenda. Need prioritization especially with 50+ items. Possible agenda+ urgent. Could be a good change to the labels.
<Frances> fantasai: prioritization is not obvious currently
<fantasai> s/had agenda/had only agenda/
<fantasai> s/prioritization/Wrt prioritizing all issues, priortization/
<Frances> jeffrey: extra agenda+ labels, have a system the other working groups can adopt, consistent meaning across working groups. Endorse agenda+ later or low priority.
<fantasai> s/is not obvious/is not always obvious/
<jyasskin> q-
<Frances> Jeffrey: Expect that the default label does not have a deadline and default of no deadline
<astearns> ack florian
<Frances> Florian: agenda+ may mean we have discussed synchronously enough, put it in the call. Or possibly that we need to ship now. ready for a decision vs we need a decision now.
<fantasai> s/synchronously/asynchronously/
<fantasai> s/ship now/ship, need a decision now/
<Frances> Florian: Has to be different than a company such as in x weeks. Need to give enough room from some that are late. Focusing on triage is more important.
<Frances> Jeffrey: Can't assign work, this is volunteer. Labels are under the control of the working group, companies can't assign labels.
<Frances> Jeffrey: Create a blocksshipping label for a higher priority label.
<astearns> ack fantasai
<Frances> fantasai: Rather than blocks shipping, possibly urgent instead.
<Frances> jeffrey: There is already an urgent label. agenda+ and urgent labels are reasonable set of labels
<fantasai> s/instead./instead, since things can be urgent for different reasons, not only blocking shipping/
<Frances> Alan: Other comments?
<Frances> Jeffrey: Adopt labels and label things. Label everything with priority eventually label and label them as they are triaged.
<Frances> fantasai: Something can be urgent but not important and vice versa. One axis.
<florian> q+
<fantasai> S/One axis/Two axes. But I think you want one axis/
<astearns> ack florian
<Frances> Jeffrey: Something better than numbers, can possibly go by numbers.
<fantasai> s/Something/Want to something/
<Frances> Florian: If something is urgent, need it soon. Might not be important, need to answer it quickly.
<chrishtr> q+
<fantasai> s/important/urgent to everybody, but is important to at least one person in the WG/
<Frances> jeffrey: Might make sense to use eventually label.
<astearns> ack chrishtr
<Frances> fantasai: Agree, makes sense
<Frances> Chris: urgent and important are names for p0 and p1. urgent means it needs to happen soon. It is on the same axis.
<Frances> Chris H: concept of soon, concept of right away, and pick names.
<Frances> Jeffrey: Possibly rename priorities to soon, eventually, and right now
<schenney> ?
<schenney> q+
<jyasskin> q+ to say that Google's practice isn't gospel
<Frances> chris H: asap one is used sparingly, could cause a compat risk. Need to possibly indicate when getting done soon. Vs eventually does not need to get done as soon.
<astearns> ack schenney
<fantasai> s/makes sense/makes sense. But importance is a different axis, so if need a level between Urgent and Eventually that means Soon, just call it Soon?
<Frances> florian: plenty information to triage things accordingly.
<bradk> “Eventually” means could be years?
<jyasskin> bradk: Yes.
<Frances> fantasai: Soon could be good, if we have a context for the discussion. Not so many issues for agenda+, outside of agenda+ in issues and triage, editor needs to handle things as separate. Priority just sits on issues outside of it.
<fantasai> s/bradk:/bradk,
<astearns> ack jyasskin
<Zakim> jyasskin, you wanted to say that Google's practice isn't gospel
<Frances> jeffrey: Could use agenda+ instead of soon.
<schenney> Sorry, there are two distinct concerns in my mind. One is "what do we do now" and one is "how do we have data to inform future decisions". We need to consider the latter even if we don't act on the data.
<fantasai> s/Soon could be good, if we have a context for the discussion/For all Agenda+, handling soon is good, because otherwise lose context and discussion is weak/
<fantasai> s/Not so many issues/Should not have so many issues/
<Frances> alan: any objections?
<fantasai> s/, outside/. Outside/
<fantasai> s/issues outside of it/issues outside Agenda+ also, would be expected to affect triage I suppose
<astearns> RESOLUTION: adopt this triaging tool
<Frances> PROPOSAL: Adopt triage Service-Level Objectives (SLOs)
<Frances> RESOLVED: Adopt triage Service-Level Objectives (SLOs)
<Frances> Github-bot take up https://github.com//issues/9850

@schenney-chromium
Contributor

During the meeting there was disagreement about adding labels for both "priority" and "severity". In the meeting I think these were more often referred to as urgency and importance.

I wish to make the point that adding the labels does not mean we need to use the labels, but it does mean we would have historical data to inform any future changes to the triage policies. For that reason I would support separating the concepts of priority (aka urgency) and severity (aka importance) in labeling issues.
