feat: add community experience dashboard #2636

Merged
Startrekzky merged 3 commits into apache:main from merico-ai:hez-add-community-exp-dashboard
Aug 4, 2022

@hezyin hezyin commented Jul 29, 2022

Add a new dashboard that computes 8 key metrics for dev experience in the community. Here's the metric list:

Issue Metrics:

  1. Time to Initial Issue Response
  2. Issue Resolution Time
  3. Issue Response Rate within SLA (user can customize their SLA in the dashboard)
  4. Number of Good First Issues (user can customize the issue label that represents good first issues in their repo)

PR Metrics:

  1. Time to Initial PR Review
  2. PR Resolution Time
  3. PR Resolution Rate within SLA (user can customize their SLA in the dashboard)
  4. PR Closed W/O Merging Rate

Future work includes differentiating between community issues/PRs vs core team issues/PRs.

Example Screenshot:

Screen Shot 2022-07-29 at 1 59 36 PM

@hezyin hezyin requested a review from Startrekzky July 29, 2022 21:00
]
}
],
"title": "Issue Response Rate within SLA [Last Month]",
Contributor

Should we add `or comment_rank is null` to the last line? Otherwise, issues that have not received a response won't be counted.

with issue_comment_list as(
  select
    i.id as issue_id,
    i.url,
    i.title,
    i.created_date as issue_created_date,
    ic.id as comment_id,
    ic.created_date as comment_date,
    ic.body,
    case when ic.id is not null then rank() over (partition by i.id order by ic.created_date asc) else null end as comment_rank
  from
    lake.issues i
    join lake.board_issues bi on i.id = bi.issue_id
    join lake.boards b on bi.board_id = b.id
    left join lake.issue_comments ic on i.id = ic.issue_id
  where
    date(i.created_date) BETWEEN
      curdate() - INTERVAL DAYOFMONTH(curdate())-1 DAY - INTERVAL 1 month and
      curdate() - INTERVAL DAYOFMONTH(curdate()) DAY
    and b.id in ($repo_id)
)

select
  100 * sum(case when (TIMESTAMPDIFF(MINUTE, issue_created_date,comment_date))/60 < $iir_sla then 1 else null end) / count(*)
from issue_comment_list
where comment_rank = 1 or comment_rank is null
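The effect of the extra `or comment_rank is null` filter can be checked on a small data set. The following is a minimal sketch, not the dashboard query itself: it runs on an in-memory SQLite database instead of MySQL, drops the board joins and the date filter, replaces `TIMESTAMPDIFF` with SQLite's `julianday`, and uses hypothetical sample rows and an assumed SLA of 8 hours.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table issues (id integer primary key, created_date text);
    create table issue_comments (id integer primary key, issue_id integer, created_date text);
    -- Hypothetical data: issues 3 and 4 never receive a comment.
    insert into issues values
        (1, '2022-07-01 00:00:00'),
        (2, '2022-07-02 00:00:00'),
        (3, '2022-07-03 00:00:00'),
        (4, '2022-07-04 00:00:00');
    insert into issue_comments values
        (10, 1, '2022-07-01 02:00:00'),
        (11, 1, '2022-07-01 09:00:00'),
        (12, 2, '2022-07-04 00:00:00');
""")

QUERY = """
with issue_comment_list as (
  select
    i.id as issue_id,
    i.created_date as issue_created_date,
    ic.created_date as comment_date,
    case when ic.id is not null
      then rank() over (partition by i.id order by ic.created_date asc)
      else null end as comment_rank
  from issues i
    left join issue_comments ic on i.id = ic.issue_id
)
select
  100.0 * sum(case when (julianday(comment_date) - julianday(issue_created_date)) * 24 < :sla_hours
                   then 1 else null end) / count(*)
from issue_comment_list
where {filter}
"""

def response_rate(where_filter, sla_hours=8):
    """Run the simplified query with the given final WHERE filter."""
    sql = QUERY.format(filter=where_filter)
    return conn.execute(sql, {"sla_hours": sla_hours}).fetchone()[0]

# Only issues with at least one comment end up in the denominator.
rate_without_null = response_rate("comment_rank = 1")
# Issues with no comment are kept, so the denominator covers all four issues.
rate_with_null = response_rate("comment_rank = 1 or comment_rank is null")
print(rate_without_null, rate_with_null)
```

In this sample only one of four issues is answered within the SLA, so the first filter overstates the rate because the two unanswered issues are dropped from `count(*)`.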

Contributor Author

Updated

]
}
],
"title": "PR Resolution Rate within SLA [Last Month]",
Contributor

Similar to the issue response rate query, shall we remove `and status = 'closed'`? Otherwise, open PRs will not be counted.

select
  100 * sum(case when TIMESTAMPDIFF(Minute, created_date, closed_date) / 1440 < $prrt_sla then 1 else 0 end) / count(*)
from
  pull_requests pr
where
  date(created_date) BETWEEN
    curdate() - INTERVAL DAYOFMONTH(curdate())-1 DAY - INTERVAL 1 month and
    curdate() - INTERVAL DAYOFMONTH(curdate()) DAY
  and pr.base_repo_id in ($repo_id)
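
The `BETWEEN` bounds in these queries appear to select the previous calendar month: the lower bound steps back to the first day of the current month and then subtracts one month, and the upper bound steps back `DAYOFMONTH(curdate())` days to the last day of the previous month. A minimal Python sketch of the same arithmetic follows; the function name `last_month_window` is hypothetical and only serves the illustration.

```python
from datetime import date, timedelta

def last_month_window(today):
    """Return (first_day, last_day) of the calendar month before `today`."""
    # curdate() - INTERVAL DAYOFMONTH(curdate()) DAY
    last_day = today - timedelta(days=today.day)
    # curdate() - INTERVAL DAYOFMONTH(curdate())-1 DAY - INTERVAL 1 month
    first_day = last_day.replace(day=1)
    return first_day, last_day

window = last_month_window(date(2022, 8, 4))
print(window)
```

For the merge date of this PR, August 4, 2022, the window is July 1 through July 31, 2022.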

Contributor Author

Updated. Even though it's not necessary, I added closed_date to the condition to make the query easier to understand.

select
  100 * sum(case when closed_date and TIMESTAMPDIFF(Minute, created_date, closed_date) / 1440 < $prrt_sla then 1 else 0 end) / count(*)
from
  pull_requests pr
where
  date(created_date) BETWEEN
    curdate() - INTERVAL DAYOFMONTH(curdate())-1 DAY - INTERVAL 1 month and
    curdate() - INTERVAL DAYOFMONTH(curdate()) DAY
  and pr.base_repo_id in ($repo_id)

Contributor

Adding `closed_date` to the condition will affect the result of `count(*)`, and therefore the final result.
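
Whether a `closed_date` check changes `count(*)` depends on where it is placed, and the thread does not say which placement the reviewer has in mind. The sketch below compares three variants on hypothetical data, using an in-memory SQLite database as a stand-in for MySQL, `julianday` in place of `TIMESTAMPDIFF`, an explicit `closed_date is not null` test, and an assumed SLA of 5 days.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table pull_requests (id integer primary key, created_date text, closed_date text);
    insert into pull_requests values
        (1, '2022-07-01 00:00:00', '2022-07-02 00:00:00'),  -- closed after 1 day
        (2, '2022-07-01 00:00:00', '2022-07-11 00:00:00'),  -- closed after 10 days
        (3, '2022-07-05 00:00:00', null),                   -- still open
        (4, '2022-07-06 00:00:00', null);                   -- still open
""")

DAYS_OPEN = "julianday(closed_date) - julianday(created_date)"

def rate(case_condition, where_clause=""):
    """Percentage of PRs matching case_condition among the rows kept by where_clause."""
    sql = f"""
        select 100.0 * sum(case when {case_condition} then 1 else 0 end) / count(*)
        from pull_requests
        {where_clause}
    """
    return conn.execute(sql, {"sla_days": 5}).fetchone()[0]

# No closed_date check: open PRs yield NULL in the comparison and fall into the else branch.
rate_plain = rate(f"{DAYS_OPEN} < :sla_days")
# Check inside the CASE expression: the numerator and count(*) are unchanged.
rate_in_case = rate(f"closed_date is not null and {DAYS_OPEN} < :sla_days")
# Check in the WHERE clause: open PRs are removed, so count(*) shrinks.
rate_in_where = rate(f"{DAYS_OPEN} < :sla_days", "where closed_date is not null")
print(rate_plain, rate_in_case, rate_in_where)
```

On this sample, the first two variants give the same rate, while filtering in the `WHERE` clause doubles it because open PRs no longer count toward the denominator.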

"metricColumn": "none",
"queryType": "randomWalk",
"rawQuery": true,
"rawSql": "select\n count(*)\nfrom\n lake.issues i\n join lake.board_issues bi on i.id = bi.issue_id\n join lake.boards b on bi.board_id = b.id\n join lake.issue_labels il on il.issue_id = i.id\nwhere\n il.label_name = \"$label_gfi\" and\n i.status != 'DONE' and\n b.id in ($repo_id)",
Contributor

It's safer to use `count(distinct i.id)` than `count(*)`.

Contributor Author

`count(distinct i.id)` is indeed more error tolerant, but it's also more prone to hiding data duplication bugs. Maybe we should go with `count(*)` here?
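
The trade-off between the two counts can be made concrete with a duplicated label row. This is a minimal sketch on hypothetical data, run against an in-memory SQLite database rather than MySQL, with the board joins omitted and `good first issue` assumed as the value of `$label_gfi`. Computing both counts side by side is one possible way to surface duplication instead of hiding it; it is not something the PR itself does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table issues (id integer primary key, status text);
    create table issue_labels (issue_id integer, label_name text);
    insert into issues values (1, 'TODO'), (2, 'TODO'), (3, 'TODO'), (4, 'DONE');
    insert into issue_labels values
        (1, 'good first issue'),
        (2, 'good first issue'),
        (2, 'good first issue'),  -- hypothetical duplicated row
        (3, 'bug'),
        (4, 'good first issue');
""")

row_count, distinct_count = conn.execute("""
    select count(*), count(distinct i.id)
    from issues i
      join issue_labels il on il.issue_id = i.id
    where il.label_name = :label and i.status != 'DONE'
""", {"label": "good first issue"}).fetchone()

# A mismatch between the two counts signals duplicated join rows.
has_duplicates = row_count != distinct_count
print(row_count, distinct_count, has_duplicates)
```

Here `count(*)` counts the duplicated row twice, which inflates the metric but makes the duplication visible, while `count(distinct i.id)` reports the number of distinct open issues and masks it.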

@hezyin hezyin force-pushed the hez-add-community-exp-dashboard branch from e421e95 to 3b2ebef Compare August 4, 2022 02:35

@Startrekzky Startrekzky left a comment

LGTM

@Startrekzky Startrekzky merged commit 67105cd into apache:main Aug 4, 2022