FIX: PR Size is getting duplicated when using query from the document…#6661

Merged: Startrekzky merged 3 commits into apache:main from naveenreddymanukonda:fix#6651 on Dec 20, 2023

Conversation

@naveenreddymanukonda
Contributor

…ation given #6651

⚠️ Pre Checklist

Please complete ALL items in this checklist, and remove before submitting

  • I have read through the Contributing Documentation.
  • I have added relevant tests.
  • I have added relevant documentation.
  • I will add labels to the PR, such as pr-type/bug-fix, pr-type/feature-development, etc.

Summary

What does this PR do?

Does this close any open issues?

Closes xx

Screenshots

Include any relevant screenshots here.

Other Information

Any other information that is important to this PR.

"metricColumn": "none",
"rawQuery": true,
"rawSql": "with _pr_commits_data as(\n SELECT\n DATE_ADD(date(pr.created_date), INTERVAL -$interval(date(pr.created_date))+1 DAY) as time,\n pr.id as pr_id,\n prc.commit_sha,\n sum(c.additions)+sum(c.deletions) as loc\n FROM \n pull_requests pr\n left join pull_request_commits prc on pr.id = prc.pull_request_id\n left join commits c on prc.commit_sha = c.sha\n join project_mapping pm on pr.base_repo_id = pm.row_id and pm.table = 'repos' \n WHERE\n $__timeFilter(pr.created_date)\n and pm.project_name in ($project)\n group by 1,2,3\n)\n\nSELECT \n time,\n sum(loc)/count(distinct pr_id) as 'PR Size'\nFROM _pr_commits_data\nGROUP BY 1",
"rawSql": "with _pr_commits_data as(\n SELECT\n DATE_ADD(date(pr.created_date), INTERVAL -$interval(date(pr.created_date))+1 DAY) as time,\n pr.id as pr_id,\n prc.commit_sha,\n sum(c.additions)+sum(c.deletions) as loc\n FROM \n pull_requests pr\n left join pull_request_commits prc on pr.id = prc.pull_request_id\n and pr.merge_commit_sha = prc.commit_sha\n left join commits c on prc.commit_sha = c.sha\n join project_mapping pm on pr.base_repo_id = pm.row_id and pm.table = 'repos' \n WHERE\n $__timeFilter(pr.created_date)\n and pm.project_name in ($project)\n and pr.original_status = 'MERGED'\n group by 1,2,3\n)\n\nSELECT \n time,\n sum(loc)/count(distinct pr_id) as 'PR Size'\nFROM _pr_commits_data\nGROUP BY 1",
Contributor

Hi @naveenreddymanukonda, can you use the SQL below? There are two major differences:

  1. It does not join the 'pull_request_commits' table, which is unnecessary here and slows the query down.
  2. It has to join the 'project_mapping' table to get the project_name, so that the project filter in this dashboard works.
with _pr_commits_data as(
  SELECT
    DATE_ADD(date(pr.created_date), INTERVAL -MONTH(date(pr.created_date))+1 DAY) as time,
    pr.id as pr_id,
    pr.merge_commit_sha,
    sum(c.additions)+sum(c.deletions) as loc
  FROM 
    pull_requests pr
    left join commits c on pr.merge_commit_sha = c.sha
    join project_mapping pm on pr.base_repo_id = pm.row_id and pm.table = 'repos' 
  WHERE
    $__timeFilter(pr.created_date)
    and pm.project_name in ($project)
  group by 1,2,3
)

SELECT 
  time,
  sum(loc)/count(distinct pr_id) as 'PR Size'
FROM _pr_commits_data
GROUP BY 1
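
To make the first difference concrete, here is a minimal sketch in Python with an in-memory SQLite database. It assumes a toy schema that keeps only the columns the query touches; the rows (`pr1`, `a1`, `a2`, `m1`) are invented, and the Grafana macros and the `project_mapping` join are left out. It shows one plausible way the join through `pull_request_commits` inflates the metric: every commit listed for the PR contributes its own additions and deletions, whereas the join on `merge_commit_sha` yields one row per PR. Whether the merge commit's counts equal the PR's true size depends on how the data was collected, which this sketch does not model.

```python
import sqlite3

# Toy schema: only the columns the PR Size query touches.
# All rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pull_requests (id TEXT PRIMARY KEY, merge_commit_sha TEXT);
CREATE TABLE pull_request_commits (pull_request_id TEXT, commit_sha TEXT);
CREATE TABLE commits (sha TEXT PRIMARY KEY, additions INTEGER, deletions INTEGER);

-- One PR with two branch commits (a1, a2) and one merge commit (m1).
INSERT INTO pull_requests VALUES ('pr1', 'm1');
INSERT INTO pull_request_commits VALUES ('pr1', 'a1'), ('pr1', 'a2');
INSERT INTO commits VALUES ('a1', 10, 0), ('a2', 5, 5), ('m1', 12, 3);
""")

# Simplified form of the documented query: one row per commit in the PR.
OLD_JOIN = """
SELECT SUM(c.additions + c.deletions) * 1.0 / COUNT(DISTINCT pr.id)
FROM pull_requests pr
LEFT JOIN pull_request_commits prc ON pr.id = prc.pull_request_id
LEFT JOIN commits c ON prc.commit_sha = c.sha
"""

# Simplified form of the suggested query: one row per PR (its merge commit).
NEW_JOIN = """
SELECT SUM(c.additions + c.deletions) * 1.0 / COUNT(DISTINCT pr.id)
FROM pull_requests pr
LEFT JOIN commits c ON pr.merge_commit_sha = c.sha
"""

old_size = conn.execute(OLD_JOIN).fetchone()[0]
new_size = conn.execute(NEW_JOIN).fetchone()[0]
print(old_size, new_size)  # 20.0 15.0
```

In this toy data the two branch commits add up to 20 changed lines, while the merge commit records 15, so the per-commit join reports a larger PR Size for the same pull request.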

Contributor Author

with _pr_commits_data as(
  SELECT
    DATE_ADD(date(pr.created_date), INTERVAL -MONTH(date(pr.created_date))+1 DAY) as time,
    pr.id as pr_id,
    pr.merge_commit_sha,
    sum(c.additions)+sum(c.deletions) as loc
  FROM
    pull_requests pr
    left join commits c on pr.merge_commit_sha = c.sha
    join project_mapping pm on pr.base_repo_id = pm.row_id and pm.table = 'repos'
  WHERE
    $__timeFilter(pr.created_date)
    and pm.project_name in ($project)
    and pr.original_status = 'MERGED'
  group by 1,2,3
)

SELECT
  time,
  sum(loc)/count(distinct pr_id) as 'PR Size'
FROM _pr_commits_data
GROUP BY 1
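
The only change from the suggested query is the `pr.original_status = 'MERGED'` filter. A minimal sketch of why it can matter, again in Python with in-memory SQLite: it assumes an unmerged PR has no matching row in `commits`, and the status value `'OPEN'` and all rows are invented for illustration. With the left join, such a PR adds nothing to `sum(loc)` but is still counted by `count(distinct pr_id)`, which pulls the average down.

```python
import sqlite3

# Toy schema and invented rows; 'OPEN' is a hypothetical status value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pull_requests (
    id TEXT PRIMARY KEY, original_status TEXT, merge_commit_sha TEXT);
CREATE TABLE commits (sha TEXT PRIMARY KEY, additions INTEGER, deletions INTEGER);

-- pr1 is merged and has a merge commit; pr2 is still open and has none.
INSERT INTO pull_requests VALUES ('pr1', 'MERGED', 'm1'), ('pr2', 'OPEN', NULL);
INSERT INTO commits VALUES ('m1', 12, 3);
""")

# Simplified PR Size query: macros and project_mapping are omitted.
BASE_QUERY = """
SELECT SUM(c.additions + c.deletions) * 1.0 / COUNT(DISTINCT pr.id)
FROM pull_requests pr
LEFT JOIN commits c ON pr.merge_commit_sha = c.sha
"""

# Without the filter, pr2 contributes NULL lines but still counts as a PR.
size_all_prs = conn.execute(BASE_QUERY).fetchone()[0]

# With the filter, only merged PRs enter both the sum and the count.
size_merged_only = conn.execute(
    BASE_QUERY + "WHERE pr.original_status = 'MERGED'"
).fetchone()[0]

print(size_all_prs, size_merged_only)  # 7.5 15.0
```

Restricting to merged PRs keeps the numerator and the denominator over the same set of pull requests.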

"metricColumn": "none",
"rawQuery": true,
"rawSql": "with _prs as(\n SELECT\n pr.id,\n pr.url,\n pr.created_date,\n pr.merged_date,\n pr.author_id,\n prc.commit_sha,\n c.additions + c.deletions as loc,\n u.id as user_id,\n u.name as user_name,\n t.id as team_id,\n t.name as team\n FROM pull_requests pr\n join project_mapping pm on pr.base_repo_id = pm.row_id and pm.table = 'repos' \n left join pull_request_commits prc on pr.id = prc.pull_request_id\n left join commits c on prc.commit_sha = c.sha\n join user_accounts ua on pr.author_id = ua.account_id\n join users u on ua.user_id = u.id\n join team_users tu on u.id = tu.user_id\n join teams t on tu.team_id = t.id\n WHERE\n $__timeFilter(pr.created_date)\n and pm.project_name in ($project)\n)\n\nselect\n DATE_ADD(date(created_date), INTERVAL -$interval(date(created_date))+1 DAY) as time,\n sum(case when team_id in ($team1) then loc else null end)/(select count(distinct user_id) from team_users where team_id in ($team1)) as \"Team1: PR Size\",\n sum(case when team_id in ($team2) then loc else null end)/(select count(distinct user_id) from team_users where team_id in ($team2)) as \"Team2: PR Size\",\n sum(loc)/(select count(*) FROM users) as \"Org: PR Size\"\nFROM _prs\nGROUP BY 1\nORDER BY 1",
"rawSql": "with _prs as(\n SELECT\n pr.id,\n pr.url,\n pr.created_date,\n pr.merged_date,\n pr.author_id,\n prc.commit_sha,\n c.additions + c.deletions as loc,\n u.id as user_id,\n u.name as user_name,\n t.id as team_id,\n t.name as team\n FROM pull_requests pr\n join project_mapping pm on pr.base_repo_id = pm.row_id and pm.table = 'repos' \n left join pull_request_commits prc on pr.id = prc.pull_request_id\n and pr.merge_commit_sha = prc.commit_sha\n left join commits c on prc.commit_sha = c.sha\n join user_accounts ua on pr.author_id = ua.account_id\n join users u on ua.user_id = u.id\n join team_users tu on u.id = tu.user_id\n join teams t on tu.team_id = t.id\n WHERE\n $__timeFilter(pr.created_date)\n and pm.project_name in ($project)\n and pr.original_status = 'MERGED'\n)\n\nselect\n DATE_ADD(date(created_date), INTERVAL -$interval(date(created_date))+1 DAY) as time,\n sum(case when team_id in ($team1) then loc else null end)/(select count(distinct user_id) from team_users where team_id in ($team1)) as \"Team1: PR Size\",\n sum(case when team_id in ($team2) then loc else null end)/(select count(distinct user_id) from team_users where team_id in ($team2)) as \"Team2: PR Size\",\n sum(loc)/(select count(*) FROM users) as \"Org: PR Size\"\nFROM _prs\nGROUP BY 1\nORDER BY 1",
Contributor

The same goes for this query.

Contributor

@Startrekzky Startrekzky left a comment

LGTM

@Startrekzky Startrekzky merged commit d5276df into apache:main Dec 20, 2023
abeizn pushed a commit that referenced this pull request Dec 21, 2023
#6661)

* FIX: PR Size is getting duplicated when using query from the documentation given #6651

* FIX: PR Size is getting duplicated when using query from the documentation given #6651

* FIX: PR Size is getting duplicated when using query from the documentation given #6651

---------

Co-authored-by: Naveen Manukonda <naveen.manukonda@phenompeople.com>
@Startrekzky Startrekzky added the cherrypick-completed Use this alongside needs-cherrypick-* labels after the PR has been cherrypicked. label Dec 21, 2023
@Startrekzky
Contributor

Hi @naveenreddymanukonda, this has been cherry-picked to the v0.20 release and will be included in the next beta version.

Labels

cherrypick-completed Use this alongside needs-cherrypick-* labels after the PR has been cherrypicked. needs-cherrypick-v0.20
