Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle branch name with prefix in GitHub migration #20357

Merged
merged 14 commits into from
Nov 3, 2022

Conversation

harryzcy
Copy link
Contributor

@harryzcy harryzcy commented Jul 13, 2022

GitHub allows releases with target commitish refs/heads/BRANCH, which then causes issues in Gitea after migration. This fix handles cases that a branch already has a prefix.

Fixes #20317

wxiaoguang
wxiaoguang previously approved these changes Jul 14, 2022
Copy link
Contributor

@wxiaoguang wxiaoguang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's not bad, since the ref inside Gitea was not clear before.

Update: as discussed below, there should be a clear solution for it, instead of making it more unclear.

@GiteaBot GiteaBot added the lgtm/need 1 This PR needs approval from one additional maintainer to be merged. label Jul 14, 2022
@zeripath
Copy link
Contributor

zeripath commented Jul 14, 2022

Have you considered what happens if you actually want to refer to refs/heads/branch? (i.e. refs/heads/refs/heads/branch)

@zeripath
Copy link
Contributor

I have been intermittently going through the code to double check that this actually still works - have you considered how your code will or will not break that?

I think we need to be very very very very very careful with just switching to allow this kind of thing. When Gitea wants to check if there is a branch it should be ensuring that the ref head prefix is there - it should not be removing things.

It will cause problems.

@wxiaoguang
Copy link
Contributor

Yup, I agree with your point. I have thought about the possibility that this change might change the behavior, that's what I mean "not bad" (indeed not ideal enough), anyway the ref messy problem is a longstanding problem.

If we want to strict the ref usages inside Gitea and strict the behaviors, I think there should be some comments and tests to make sure the logic is clear and make sure future developers can know what is the right thing to do.

@zeripath
Copy link
Contributor

I should add: does this code affect protected branches ? Is there a way of abusing this arbitrary removal of prefix to break through protected branches?

@wxiaoguang wxiaoguang dismissed their stale review July 14, 2022 12:00

need a more clear solution

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Jul 14, 2022
@wxiaoguang wxiaoguang removed the lgtm/need 1 This PR needs approval from one additional maintainer to be merged. label Jul 14, 2022
@wxiaoguang
Copy link
Contributor

wxiaoguang commented Jul 14, 2022

My thought was this change only causes problem if a user has the branch refs/heads/refs/heads/BRANCH, which means they use refs/heads/BRANCH for their branch name, which is already somewhat confusing, so it could be considered as the user's decision.

And yes, this change is not ideal (and as your option, not correct, and now I also agree that it's wrong), so it's better to make GetBranchCommitID etc covered by tests and add more comments for it, to avoid abuse or incorrect change.

@harryzcy
Copy link
Contributor Author

I agree that we must be cautious about this, and the current change may not be correct (will cause issues in other parts).

For how Gitea should be handling tags, I think we could follow how git rev-parse handles <refname>

<refname>, e.g. master, heads/master, refs/heads/master
A symbolic ref name. E.g. master typically means the commit object referenced by refs/heads/master. If you happen to have both heads/master and tags/master, you can explicitly say heads/master to tell Git which one you mean. When ambiguous, a <refname> is disambiguated by taking the first match in the following rules:

  1. If $GIT_DIR/<refname> exists, that is what you mean (this is usually useful only for HEAD, FETCH_HEAD, ORIG_HEAD, MERGE_HEAD and CHERRY_PICK_HEAD);
  2. otherwise, refs/<refname> if it exists;
  3. otherwise, refs/tags/<refname> if it exists;
  4. otherwise, refs/heads/<refname> if it exists;
  5. otherwise, refs/remotes/<refname> if it exists;
  6. otherwise, refs/remotes/<refname>/HEAD if it exists.

Then, when both refs/heads/<refname> and refs/heads/refs/heads/<refname> exists, the latter should be explicitly expressed by heads/refs/heads/<refname> or refs/heads/refs/heads/<refname>. I think this behavior should be consistent throughout.

@zeripath
Copy link
Contributor

I think therefore we agree that this PR is not the correct thing to do then.

We should be trying to always pass a ref when we can instead of arbitrarily removing the prefix.

By always adding the ref prefix we avoid the ambiguity within git's refname resolution.

The problem is that git's refname behaviour is sensible as a cli tool but for a server we have to be more explicit.

@harryzcy
Copy link
Contributor Author

The Releases page should allow both refs/heads/<refname> and <refname>.

I think now the changes only apply to the Releases page and won't break any other parts.

@codecov-commenter
Copy link

codecov-commenter commented Jul 14, 2022

Codecov Report

Merging #20357 (c3d4c98) into main (498352c) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main   #20357   +/-   ##
=======================================
  Coverage   46.90%   46.91%           
=======================================
  Files         981      981           
  Lines      136262   136266    +4     
=======================================
+ Hits        63915    63924    +9     
+ Misses      64501    64498    -3     
+ Partials     7846     7844    -2     
Impacted Files Coverage Δ
services/migrations/github.go 65.55% <100.00%> (+0.20%) ⬆️
modules/queue/queue_channel.go 78.70% <0.00%> (-1.86%) ⬇️
modules/queue/queue_disk_channel.go 68.33% <0.00%> (-1.25%) ⬇️
services/pull/check.go 27.95% <0.00%> (-1.19%) ⬇️
services/pull/pull.go 40.18% <0.00%> (-0.62%) ⬇️
routers/api/v1/repo/pull.go 47.00% <0.00%> (-0.18%) ⬇️
modules/queue/queue_bytefifo.go 49.09% <0.00%> (+1.80%) ⬆️
modules/git/repo_attribute.go 72.63% <0.00%> (+2.10%) ⬆️
modules/charset/charset.go 75.36% <0.00%> (+3.62%) ⬆️
modules/process/manager_exec.go 93.02% <0.00%> (+6.97%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@harryzcy harryzcy requested a review from wxiaoguang July 22, 2022 08:22
@wxiaoguang
Copy link
Contributor

If I understand correctly, the new code is still "guessing" (ResolveBranch) whether the target is a branch name or a reference. To make the logic stable and clear, could the target be clarified to be a reference?

And cc @zeripath , how do you think?

@harryzcy
Copy link
Contributor Author

If I understand correctly, the new code is still "guessing" (ResolveBranch) whether the target is a branch name or a reference. To make the logic stable and clear, could the target be clarified to be a reference?

A branch name and a reference appeared to have the same effect in git. I think at best we can detect if a reference has the BranchPrefix and try to resolve it as a reference as is.

@zeripath
Copy link
Contributor

I think the correct place to fix this is in the migration - where we should remove/add the refs/heads/ prefix as necessary.

Once things are in Gitea itself we should really be assuming that a reference to refs/heads/main is a reference to /refs/heads/refs/heads/main.

@harryzcy harryzcy changed the title Handle branch name with prefix already Handle branch name with prefix in migration Jul 23, 2022
@harryzcy
Copy link
Contributor Author

I think now the branch names are resolved during migration. Could you please help to review the changes? @zeripath @wxiaoguang

@wxiaoguang
Copy link
Contributor

Thank you for asking me for review, I am quite busy recently and haven't thought about how to make the branch/tag/commit-id more consistent in this case. Maybe other maintainers could help out.

@wxiaoguang
Copy link
Contributor

@zeripath how do you think about current approach?

@lunny lunny added the type/bug label Oct 30, 2022
Copy link
Contributor

@zeripath zeripath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we currently don't prefix things with refs/heads/ I think this makes things consistent and therefore it can be merged. We should however, consider doing the opposite of this PR and instead always add the prefix in other contexts but I don't have time right now to go through all of the migration types to get this correct.

@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Nov 2, 2022
Copy link
Contributor

@wxiaoguang wxiaoguang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Current approach only affects the GithubDownloaderV3, I think it's fine to take it as a quick fix (patch).

To make the commitish clear, it still needs a lot of work in the future.

@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Nov 2, 2022
@wxiaoguang wxiaoguang merged commit 81ea4f9 into go-gitea:main Nov 3, 2022
@wxiaoguang wxiaoguang changed the title Handle branch name with prefix in migration Handle branch name with prefix in GitHub migration Nov 3, 2022
@harryzcy harryzcy deleted the branch-with-prefix branch November 3, 2022 06:48
zjjhot added a commit to zjjhot/gitea that referenced this pull request Nov 4, 2022
* giteaofficial/main:
  Remove Gusted as Gitea maintainer  (go-gitea#21676)
  Fix token generation when using INTERNAL_TOKEN_URI (go-gitea#21669)
  Clean up formatting on install page (go-gitea#21668)
  Add Webhook authorization header (go-gitea#20926)
  feat: notify doers of a merge when automerging (go-gitea#21553)
  Remove deprecated DSA host key from Docker Container (go-gitea#21522)
  Alter package_version.metadata_json to LONGTEXT (go-gitea#21667)
  Handle branch name with prefix in GitHub migration (go-gitea#20357)
  [skip ci] Updated translations via Crowdin
  Split migrations folder (go-gitea#21549)
@harryzcy
Copy link
Contributor Author

GitHub has a blog post today about ambiguous branch names. I'm commenting here for references, and it maybe something to watch to make the commitish clear.

@GiteaBot GiteaBot added the lgtm/need 1 This PR needs approval from one additional maintainer to be merged. label Mar 23, 2023
@go-gitea go-gitea locked and limited conversation to collaborators May 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. lgtm/need 1 This PR needs approval from one additional maintainer to be merged. type/bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Commit diff count is off for repos migrated from GitHub
6 participants