GMD: issue resolution efficiency - What are abandoned issues? #5

aswanipranjal · 2018-05-17T09:21:06Z

Following #1, What exactly are abandoned issues and how can they be calculated?

Are there any use cases available?
Is there a formula to calculate it?
A solid definition?

GeorgLink · 2018-05-23T12:21:06Z

I'll throw this out there:

I consider issues abandoned if there has been no activity within the last XX days.

Activity can be new comments, new reactions, new pull requests, changed labels, changed milestones, or changed assignments.

The threshold of XX days can be a parameter chosen by the user, since each community works differently, but I think 30 days might be a good start.
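A minimal sketch of this definition, assuming we already have the timestamps of all activity on an issue (comments, reactions, label changes, and so on). The function names `latest_activity` and `is_inactive` are hypothetical, and treating "exactly XX days ago" as inactive is an arbitrary choice of this sketch:

```python
from datetime import datetime, timedelta, timezone


def latest_activity(event_times):
    """Return the most recent timestamp among all activity events of an issue."""
    return max(event_times)


def is_inactive(last_activity, now, threshold_days=30):
    """True if the last activity happened at least `threshold_days` before `now`."""
    return now - last_activity >= timedelta(days=threshold_days)


# Example: an issue whose events are a comment and a later label change.
events = [
    datetime(2018, 5, 17, tzinfo=timezone.utc),
    datetime(2018, 5, 23, tzinfo=timezone.utc),
]
now = datetime(2018, 6, 30, tzinfo=timezone.utc)
abandoned = is_inactive(latest_activity(events), now)  # 38 days without activity
```

The threshold stays a parameter, so each community can pick the value that matches how it works.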

jgbarah · 2018-05-23T13:02:36Z

I agree with @GeorgLink. However, maybe we should also take into account issues that are closed without fixing (e.g., "won't fix"), but that's a bit more tricky (see below).

I think we can, for now, go with the approach @GeorgLink suggests, but in that case, I would change the name to "unattended issues", which is maybe more precise for that case. And it is a useful metric, indeed.

We can also open a more general discussion about what exactly we want to track. I think this comes from an extension of the concept of "abandoned reviews" in Gerrit. There, it has a very precise meaning: reviews that were closed as "abandoned", meaning that the proposer of the patch is no longer working on it (thus, abandoning the patch, and signalling reviewers that no review is needed).

But for issues, that meaning is maybe difficult to capture, because it would apply when somebody opens a ticket and, for whatever reason, closes it before it is fixed. I think this is unusual, and I'm not sure to what extent it is useful... If anything, it could be a measure of engagement: I care so much that I come back to the issue tracking system to close this bug report, which was not addressed, because I found it was addressed somewhere else (for example).

So, in short, the "unattended" concept seems more appropriate to me.

jgbarah · 2018-05-23T13:15:45Z

Re-reading the metric, maybe the ratio for efficiency would be better defined based on issues open / issues closed (instead of issues abandoned) during a certain period. In the end, it is related to the throughput, I think.

aswanipranjal · 2018-05-25T05:12:41Z

@jgbarah: shouldn't it be: [number of issues closed (with a fix)] / [Number of total issues created]?

jgbarah · 2018-05-25T09:29:20Z

> @jgbarah: shouldn't it be: [number of issues closed (with a fix)] / [Number of total issues created]?

There are two issues with that:

- If you're interested in how you cope with the issues being opened, it is the ratio closed/opened that matters. In that case, it is not relevant why the issue was closed.
- If you're interested in how much you are fixing problems, yes, your definition is more accurate. But AFAIK, there is no way, at least in some issue trackers, such as GitHub's, to determine whether an issue was closed with a fix or not. Besides, there are issues that you really don't need to fix (such as questions).

So, I would stick to the definition I proposed above.

GeorgLink · 2018-05-25T17:16:24Z

I like @aswanipranjal's proposal of putting the total number of issues in the denominator, because otherwise 4/2 will show up the same as 20/10, and 1/0 is undefined.

Total number of issues will also be time-boxed, of course.

aswanipranjal · 2018-05-27T17:40:07Z

@GeorgLink, @jgbarah: to summarize, we can have 2 outputs for this metric:

- Abandoned issues, i.e., the ones in which there was no comment, reaction, or activity of any sort for a certain duration of XX days (initially 30 days).
- [total number of issues that were closed with a fix] / [total number of issues in the repo] (here, a fix means a PR corresponding to that issue).

jgbarah · 2018-06-05T21:11:32Z

@aswanipranjal, the absence of a PR does not always mean the bug was not fixed (it depends on whether the project always uses PRs, and how it uses them). Besides, if the ticket was not about a real bug (but a question, for example), you really don't need to close it by fixing anything.

So, I would stick to issues closed / issues open per month. That has a clear interpretation in terms of throughput (if < 1, the project is not coping with all the opening activity), and is easy to compute.

If you want, we can later explore some metric related to actually fixing bugs, but I think that would require at least some discipline by the project (such as consistently using pull requests, and using a "bug" tag in issues, for example). But I'd leave this to a later step.

@GeorgLink I completely agree with you and @aswanipranjal on having closed in the numerator.

For abandoned issues, given that we really don't know if they are abandoned or just inactive, I would use inactive. In fact, to be more precise, it could be inactive-period, for example `inactive-30d` (meaning inactive for at least 30 days, as of now).

So, if you agree, it would be:

- inactive-period issues (we could for now settle on `inactive-30d`). Issues that, as of now, have been inactive (no comment) for the period.
- inactive-period pull requests (we could for now settle on `inactive-30d`). Pull requests that, as of now, have been inactive (no comment) for the period.
- efficiency-period for issues (we could for now settle on `efficiency-30d`). Issues closed during the period / issues opened during the period.
- efficiency-period for pull requests (we could for now settle on `efficiency-30d`). Pull requests closed (merged or closed) during the period / pull requests opened during the period.
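The efficiency ratio in the list above could be sketched as follows. This assumes each issue (or pull request) is reduced to a hypothetical `(opened, closed)` pair of dates, that the period is the half-open interval from `start` up to but not including `end`, and that closures are counted regardless of when the item was opened. Returning `None` when nothing was opened is a choice of this sketch, since the ratio is undefined in that case:

```python
from datetime import date


def in_period(day, start, end):
    """True if `day` is set and falls in the half-open interval [start, end)."""
    return day is not None and start <= day < end


def efficiency(items, start, end):
    """Items closed during the period / items opened during the period.

    `items` is an iterable of (opened, closed) dates; `closed` is None
    while the item is still open. Returns None if nothing was opened.
    """
    opened = sum(1 for o, _ in items if in_period(o, start, end))
    closed = sum(1 for _, c in items if in_period(c, start, end))
    return closed / opened if opened else None


items = [
    (date(2018, 5, 1), date(2018, 5, 10)),  # opened and closed in May
    (date(2018, 5, 5), None),               # opened in May, still open
    (date(2018, 4, 1), date(2018, 5, 20)),  # opened earlier, closed in May
    (date(2018, 3, 1), None),               # old and still open
]
may = efficiency(items, date(2018, 5, 1), date(2018, 6, 1))
```

Here two items were opened and two were closed in May, so the ratio is 1.0: the project kept pace with incoming work in that month.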

Please, let us know any feedback, and if positive, we can start proposing a pull request to the text defining the metrics.

GeorgLink · 2018-06-06T00:45:15Z

These are all sensible. +1

I would like one clarification on "issues closed during period / issues opened during period". Issues opened is clear. Issues closed: are those the same issues that were opened during the period, counted if they were closed at any time after their creation? Or are issues closed the ones that were closed during that time period but could have been created before? The problem with the latter is that when we receive no new issues but close 1, then we have an undefined fraction of 1/0 ...

aswanipranjal · 2018-06-06T16:20:26Z

> So, I would stick to issues closed / issues open per month. That has a clear interpretation in terms of throughput (if < 1, the project is not coping with all the opening activity), and is easy to compute.

Okay, for now we stick to this.

> If you want, we can later explore some metric related to actually fixing bugs, but I think that would require at least some discipline by the project (such as consistently using pull requests, and using a "bug" tag in issues, for example). But I'd leave this to a later step.

Yeah! I was thinking that generally there is a PR to an issue which closes it, so we could get relations and derive some useful data from that. We can look into it later.

> - inactive-period issues (we could for now settle on `inactive-30d`). Issues that, as of now, have been inactive (no comment) for the period.
> - inactive-period pull requests (we could for now settle on `inactive-30d`). Pull requests that, as of now, have been inactive (no comment) for the period.

This works too!

Apart from this, I have the same question as Georg. @jgbarah can you elaborate a little more on issues closed during period/ issues open during period?

jgbarah · 2018-06-06T22:03:33Z

@GeorgLink said:

> Issues opened is clear. Issues closed: are those the same issues that were opened during the period, counted if they were closed at any time after their creation? Or are issues closed the ones that were closed during that time period but could have been created before? The problem with the latter is that when we receive no new issues but close 1, then we have an undefined fraction of 1/0 ...

In my opinion, the useful metric is "how much you close during this period relative to how much is opened". If you're closing less than is opened (whenever the closed stuff was opened), you're accumulating work for the future. If you're closing more than is opened (whenever the closed stuff was opened), you're shortening the queue of pending work. For the metric to work this way, you need to consider all tickets closed during the period.

Why not "closed bugs that were opened during the period"? Because usually it doesn't matter for this when the bug was opened. A closed bug is a closed bug: less technical debt, work done. If we only consider issues opened during the current period, we're somehow favoring issues opened during some "artificial" period when computing throughput. Besides, we also have metrics on how long it takes to close issues, to know whether bugs are being closed quickly or not...

I see the problem you mention. If we want to avoid it, we need a more elaborate metric, whose denominator is never zero. A simple way of doing that is "issues opened - issues closed", but that's not relative. To make it relative, you need to divide by something which gives you an idea of the total amount of work... What about using "bugs still open at the beginning of the period plus bugs opened during the period"? That would be zero only if there were no bugs opened during the period and no pending bugs when the period starts, but in that case there will be no closed bugs either, so it would be 0/0, which we could assume to be 0 (not fully correct, but maybe fair enough). And it would be a weird case, anyway.

If this is the case, the metric would be "total number of bugs closed during the period / (total number of bugs opened during the period + total number of bugs open at the beginning of the period)". In this case, the interpretation would be: the closer to 1, the fewer bugs remain open at the end of the period (the project is coping well with pending work); the closer to 0, the more bugs remain open (the project is not coping well with pending work). If the number remains stable over time, the project is closing at about the same pace (relative to the pending work); if it decreases, the project is closing less stuff than it should be closing...
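As a rough sketch of that ratio, assuming the three counts are already available (the function name `relative_efficiency` is hypothetical, and the 0/0 case is mapped to 0 as suggested above):

```python
def relative_efficiency(closed_during, opened_during, open_at_start):
    """Bugs closed during the period / (bugs opened during it + bugs open at its start)."""
    workload = opened_during + open_at_start
    if workload == 0:
        # No pending and no new bugs: treat 0/0 as 0, as discussed above.
        return 0.0
    return closed_during / workload


# The case raised earlier: no new issues, one pending issue, and it gets closed.
no_new_issues = relative_efficiency(closed_during=1, opened_during=0, open_at_start=1)

# Five closed out of five pending plus five new.
half = relative_efficiency(closed_during=5, opened_during=5, open_at_start=5)
```

With this denominator, the "close 1, receive 0" case gives 1.0 instead of an undefined 1/0.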

What do you think?

jgbarah · 2018-06-06T22:14:28Z

BTW, a complementary metric would be the backlog: how many bugs remain open at the end of the period. This captures the effect that maybe throughput is sort of good, because the project is focusing on closing bugs which are recent, but is ignoring for longer and longer periods bugs that are old (and maybe difficult to close). The backlog can also be absolute or relative, and usually (in my opinion) relative is more useful. Following the spirit of the discussion above, it would be something like "total number of bugs still open at the end of the period / (total number of bugs opened during the period + total number of bugs open at the beginning of the period)". If the number is larger than 1, the backlog is increasing. If it is lower than 1, the backlog is decreasing.
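A sketch of how the three counts in that backlog ratio could be derived from per-bug dates. It assumes each bug is a hypothetical `(opened, closed)` pair, that the period is the half-open interval from `start` up to but not including `end`, that bugs are never reopened, and that 0/0 is again treated as 0:

```python
from datetime import date


def relative_backlog(bugs, start, end):
    """Bugs still open at the end / (bugs opened during + bugs open at the start).

    `bugs` is an iterable of (opened, closed) dates; `closed` is None
    while the bug is still open.
    """
    open_at_start = sum(
        1 for o, c in bugs if o < start and (c is None or c >= start)
    )
    opened_during = sum(1 for o, _ in bugs if start <= o < end)
    open_at_end = sum(
        1 for o, c in bugs if o < end and (c is None or c >= end)
    )
    workload = opened_during + open_at_start
    return open_at_end / workload if workload else 0.0


bugs = [
    (date(2018, 3, 1), None),               # old bug, still open
    (date(2018, 4, 1), date(2018, 5, 20)),  # pending at start, closed in May
    (date(2018, 5, 1), date(2018, 5, 10)),  # opened and closed in May
    (date(2018, 5, 5), None),               # opened in May, still open
]
may_backlog = relative_backlog(bugs, date(2018, 5, 1), date(2018, 6, 1))
```

In this example two bugs were pending at the start of May, two were opened during May, and two remain open at the end, so the ratio is 0.5.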

GeorgLink · 2018-06-06T22:17:54Z

How about, instead of building a ratio, displaying the two numbers for the same timeframe:

jgbarah · 2018-06-07T21:52:41Z

That's a nice presentation, thanks for the suggestion. But still, in addition to that (which I like to have), we need a number, that we can compare from project to project... Your presentation is a good complementary view of issues opened and issues closed in the same chart, but doesn't show to which extent the project is coping with pending work, I think...

GeorgLink · 2018-06-07T21:55:09Z

Agreed

aswanipranjal · 2018-06-11T10:43:43Z

@jgbarah, @GeorgLink: We can create the presentation that Georg mentioned above for

- total number of bugs still open at the end of the period / (total number of bugs opened during the period + total number of bugs open at the beginning of the period), and
- total number of bugs closed during the period / (total number of bugs opened during the period + total number of bugs open at the beginning of the period),

showing the contrast between the two.

Do you have any other suggestions regarding how these fractions (also issues closed / issues open per month) should be visualised?


jgbarah · 2018-06-12T22:55:13Z

I've tried to condense the results of this discussion in pull request #12.


aswanipranjal mentioned this issue Jun 4, 2018

Better enrichment of Git and GitHub raw indices to calculate metrics for Manuscripts chaoss/grimoirelab-elk#364

Closed

jgbarah added a commit that referenced this issue Jun 12, 2018

Detailed definition of Issue Resolution Efficiency

df277ff

After the discussion in #5, this tries to apply the results and consensus of that discussion, to start defining this metric. Closes: #5.

jgbarah mentioned this issue Jun 12, 2018

Add detailed definition of Issue Resolution Efficiency #12

Merged

sgoggins closed this as completed in #12 Aug 15, 2018

GeorgLink mentioned this issue Jan 23, 2019

First version of a roadmap #74

Merged

jgbarah pushed a commit that referenced this issue Apr 10, 2019

Merge pull request #5 from chaoss/master

a10dc47

Patching README
