Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adding pull_request_id index to pull_request_commits/comments tables #7559

Merged
merged 8 commits into from
Jun 14, 2024

Conversation

sstojak1
Copy link
Contributor

@sstojak1 sstojak1 commented Jun 2, 2024

⚠️ Pre Checklist

Please complete ALL items in this checklist, and remove before submitting

  • I have read through the Contributing Documentation.
  • I have added relevant tests.
  • I have added relevant documentation.
  • I will add labels to the PR, such as pr-type/bug-fix, pr-type/feature-development, etc.

Summary

What does this PR do?
Adding pull_request_id index to pull_request_commits and pull_request_comments tables.

Does this close any open issues?

Closes 7557

Screenshots

Include any relevant screenshots here.

Other Information

Any other information that is important to this PR.

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. component/framework This issue or PR relates to the framework type/refactor This issue is to refactor existing code labels Jun 2, 2024
type addPullRequestIdIndexToPullRequestCommits struct{}

type pullRequestCommits20240602 struct {
PullRequestId string `gorm:"index"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/apache/incubator-devlake/blob/main/backend/core/models/domainlayer/code/pull_request_commit.go#L28, PullRequestId and CommitSha are the primary keys, and it works as expected in db tables.

So I think there is no need to add a new index in this field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true in case you are filtering by CommitSha or CommitSha + PullRequestId.
If you are filtering only by PullRequestId the composite index won't be used - docs

image
image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with PullRequestId index:
image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh.. The order of the primary key columns seems off to me, the PullRequestId should be placed before the CommitSha column.
However, adding a secondary index is acceptable IMO if adjusting the order is too hard to do so.

Copy link
Contributor Author

@sstojak1 sstojak1 Jun 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could work I think:
ALTER TABLE pull_request_commits DROP PRIMARY KEY;
ALTER TABLE pull_request_commits ADD PRIMARY KEY (pull_request_id, commit_sha);
However, this change might impact the rest of the application, especially if there are custom queries that end users have that expect commit_sha to be first in the index sequence.

What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sstojak1 True, we should consider which one is more likely to happen. It is easier to predict someone might need to group records by pull_request_id while it is harder to think of how commit_sha being prioritized could be beneficial for any case, maybe "Counting how many PRs contain the same commit?" which doesn't make too much sense.

Copy link
Contributor Author

@sstojak1 sstojak1 Jun 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I changed the pull_request_commits script. Now it's changing the primary key columns order.
image
image

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Jun 3, 2024
@sstojak1 sstojak1 requested a review from klesh June 3, 2024 19:13
Copy link
Contributor

@klesh klesh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are two tables handled differently?
Besides, can we support postgresql as well? Apache DevLake supports pg as a storage option.

@sstojak1
Copy link
Contributor Author

sstojak1 commented Jun 4, 2024

Why are two tables handled differently? Besides, can we support postgresql as well? Apache DevLake supports pg as a storage option.

Those two tables don't have the same primary key definition. That's the reason why migration scripts are different.

Is the PostgreSQL supported? There are also a few scripts where this kind of schema check is added...
Also, check this out:
https://devlake-io.slack.com/archives/C03APJ20VM4/p1713164942109869
https://devlake-io.slack.com/archives/C03APJ20VM4/p1716386100745249

@sstojak1 sstojak1 requested a review from klesh June 4, 2024 13:43
@sstojak1
Copy link
Contributor Author

sstojak1 commented Jun 6, 2024

Hi @klesh any updates here? Did you manage to check my last comment?

@klesh
Copy link
Contributor

klesh commented Jun 6, 2024

@sstojak1 Hi, sorry for the late reply.
We support using PostgreSQL as a storage option for advanced users who prefer writing their own dashboards without any preset dashboards.
In other words: the devlake implementation written in Golang supports both PG and Mysql while the embedded dashboards for grafana only available for Mysql

@klesh
Copy link
Contributor

klesh commented Jun 6, 2024

@sstojak1 And yes, we have some scripts designated MySQL, sometimes because they are MySQL specific nuances, which were unnecessary for PG from what I know.

@sstojak1
Copy link
Contributor Author

sstojak1 commented Jun 6, 2024

@klesh makes sense, thanks for the info. I have added support for PostgreSQL. Please take a look.
Result when locally running Devlake with PostgreSQL:
image

klesh
klesh previously approved these changes Jun 7, 2024
Copy link
Contributor

@klesh klesh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jun 7, 2024
@klesh
Copy link
Contributor

klesh commented Jun 7, 2024

Hi, @sstojak1 we are so close, there is one conflict left before I can merge the PR. Would you like to resolve it? Thanks.

# Conflicts:
#	backend/core/models/migrationscripts/register.go
@sstojak1
Copy link
Contributor Author

sstojak1 commented Jun 7, 2024

@klesh done!

@sstojak1
Copy link
Contributor Author

@klesh can we merge it? :)

@klesh klesh merged commit 13b0b43 into apache:main Jun 14, 2024
9 of 10 checks passed
@sstojak1 sstojak1 deleted the feat-add-pull-request-id-index branch June 14, 2024 10:33
klesh pushed a commit that referenced this pull request Jul 16, 2024
…7559)

* adding pull_request_id index to  pull_request_commits/comments tables

* change the pull_request_commits primary key columns order

* adding Apache license header

* only run for mysql

* adding support for postgres

---------

Co-authored-by: Josip Stojak <Josip.Stojak@infobip.com>
klesh pushed a commit that referenced this pull request Jul 16, 2024
…7559)

* adding pull_request_id index to  pull_request_commits/comments tables

* change the pull_request_commits primary key columns order

* adding Apache license header

* only run for mysql

* adding support for postgres

---------

Co-authored-by: Josip Stojak <Josip.Stojak@infobip.com>
abeizn pushed a commit that referenced this pull request Jul 16, 2024
…7559) (#7744)

* adding pull_request_id index to  pull_request_commits/comments tables

* change the pull_request_commits primary key columns order

* adding Apache license header

* only run for mysql

* adding support for postgres

---------

Co-authored-by: sstojak1 <18380216+sstojak1@users.noreply.github.com>
Co-authored-by: Josip Stojak <Josip.Stojak@infobip.com>
@abeizn
Copy link
Contributor

abeizn commented Jul 16, 2024

@sstojak1 Already released in v1.0.1-beta3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/framework This issue or PR relates to the framework lgtm This PR has been approved by a maintainer size:L This PR changes 100-499 lines, ignoring generated files. type/refactor This issue is to refactor existing code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Refactor][DORA] Refactor CalculateChangeLeadTimeMeta subtask
4 participants