Incompatibility in gitbucket migration #30316

Closed
jam7 opened this issue Apr 7, 2024 · 9 comments
Labels
  • topic/repo-migration: Migrate repos from other platforms to Gitea, or from Gitea to them
  • type/bug
  • type/upstream: This is an issue in one of Gitea's dependencies and should be reported there

Comments

@jam7
Contributor

jam7 commented Apr 7, 2024

Description

Thank you for developing Gitea, and for implementing migration from gitbucket (#16767). It helps me a lot. However, it doesn't work if a gitbucket repository has many issues or PRs. I investigated and found the source of the problem, so I'm posting this bug report.

While checking the migration log, I noticed several entries like Request get issues 49/1, but in fact get 39. This is the source of the gitbucket migration problem, which is caused by the following incompatibility:

  • Gitea treats a page with fewer issues than perPage as the mark of the end of the data.
  • Gitbucket doesn't return a full 49 issues if some numbers in the range are not issues, as in #1 issue, #2 PR, #3 issue.

The easiest way to patch this is to change the migration mechanism. For example, the migration could continue until the GetIssues() routine returns an empty array of issues.

+++ b/services/migrations/migrate.go
@@ -331,6 +331,10 @@ func migrateRepository(ctx context.Context, doer *user_model.User, downloader ba

                for i := 1; ; i++ {
                        issues, isEnd, err := downloader.GetIssues(i, issueBatchSize)
+                       isEnd = false
+                       if len(issues) == 0 {
+                               break;
+                       }
                        if err != nil {
                                if !base.IsErrNotSupported(err) {
                                        return err

This works fine while the number of missing issues is small. If more than 98 consecutive numbers are missing from the issues, as in #1 issue, #2 PR, ..., #99 PR, #100 issue, it doesn't work well, because a page in the middle comes back empty. I have no idea how to improve the mechanism, so I'm just posting this bug report.

Gitea Version

1.21.10

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

No response

Screenshots

No response

Git Version

No response

Operating System

No response

How are you running Gitea?

Gitea in docker, gitea/gitea:1.21.10.

Database

None

@jam7 jam7 added the type/bug label Apr 7, 2024
@lunny
Member

lunny commented Apr 7, 2024

But how would you know it's the end if a page in the middle contains only pull requests?

@jam7
Contributor Author

jam7 commented Apr 7, 2024

Yes. That is the problem and I have no idea how to improve the mechanism, so I'm just posting this bug report.

@jam7
Contributor Author

jam7 commented Apr 7, 2024

Maybe it's something that gitbucket and other GitHub clones must take care of. But it's still good to know the reason for this kind of problem... You know. ;-)

@lunny
Member

lunny commented Apr 7, 2024

image

Maybe we can rewrite the function GithubDownloaderV3.GetIssues to use these fields (Next Page, Last Page) to confirm whether it's the end.
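Next Page and Last Page are presumably the pagination values that a GitHub client derives from the Link response header. A minimal standard-library sketch of that end check follows; `nextPage` is a hypothetical helper, not an existing Gitea or go-github function, and it assumes URLs that contain no commas or semicolons.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"strings"
)

// nextPage extracts the page number of the rel="next" entry from a Link
// header value. It returns 0 when there is no next page, which covers both
// the last page and servers that send no Link header at all.
// Simplification: entries are split on "," and ";" without full parsing.
func nextPage(link string) int {
	for _, part := range strings.Split(link, ",") {
		segs := strings.Split(part, ";")
		if len(segs) < 2 {
			continue
		}
		isNext := false
		for _, attr := range segs[1:] {
			if strings.TrimSpace(attr) == `rel="next"` {
				isNext = true
			}
		}
		if !isNext {
			continue
		}
		raw := strings.Trim(strings.TrimSpace(segs[0]), "<>")
		u, err := url.Parse(raw)
		if err != nil {
			return 0
		}
		n, err := strconv.Atoi(u.Query().Get("page"))
		if err != nil {
			return 0
		}
		return n
	}
	return 0
}

func main() {
	header := `<https://api.example.com/issues?page=2&per_page=49>; rel="next", ` +
		`<https://api.example.com/issues?page=615&per_page=49>; rel="last"`
	fmt.Println(nextPage(header)) // page number of the rel="next" link
	fmt.Println(nextPage(""))     // no header: treated as "no next page"
}
```

A caller could keep paging while `nextPage` returns a non-zero value. This only works if the server actually sends the header.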

@techknowlogick techknowlogick added the topic/repo-migration Migrate repos from other platforms to Gitea, or from Gitea to them label Apr 7, 2024
jam7 added a commit to jam7/gitea that referenced this issue Apr 7, 2024
Change the migration mechanism for gitbucket as mentioned in
go-gitea#30316.  This doesn't work
correctly for all cases, so I'm not opening a PR for it.  I leave this
patch as is for people who need it.
@jam7
Contributor Author

jam7 commented Apr 8, 2024

Thank you for the suggestion. Unfortunately, gitbucket doesn't support those fields (Next Page, Last Page).

$ http_proxy= HTTP_PROXY= curl -L -H "Accept: application/vnd.github+json" -H "X-GitHub-Api-Version: 2022-11-28" "http://localhost:50856/api/v3/repos/CWB-LIB/Library/pulls?state=all&sort=created&direction=asc&page=6&per_page=49" -u USER:PASSWD -I
HTTP/1.1 200 OK
Date: Mon, 08 Apr 2024 02:48:31 GMT
Set-Cookie: JSESSIONID=node06o594avhud4rgctaefj2zi1z102.node0; Path=/; HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: application/json
Transfer-Encoding: chunked

github.com

$ curl -L -H "Accept: application/vnd.github+json" -H "X-GitHub-Api-Version: 2022-11-28" "http://api.github.com/repos/go-gitea/gitea/issues?state=all&sort=created&direction=asc&page=1&per_page=49" -I
HTTP/1.1 301 Moved Permanently
Content-Length: 0
Location: https://api.github.com/repos/go-gitea/gitea/issues?state=all&sort=created&direction=asc&page=1&per_page=49

HTTP/2 200
server: GitHub.com
date: Mon, 08 Apr 2024 03:03:52 GMT
content-type: application/json; charset=utf-8
cache-control: public, max-age=60, s-maxage=60
vary: Accept, Accept-Encoding, Accept, X-Requested-With
etag: W/"db7bdfe2b0d8646e5d376ddaef072ae2cd1d0160dfdf82de1f39758b3eb8f264"
x-github-media-type: github.v3; format=json
link: <https://api.github.com/repositories/72495579/issues?state=all&sort=created&direction=asc&page=2&per_page=49>; rel="next", <https://api.github.com/repositories/72495579/issues?state=all&sort=created&direction=asc&page=615&per_page=49>; rel="last"
x-github-api-version-selected: 2022-11-28
access-control-expose-headers: ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Resource, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, X-GitHub-SSO, X-GitHub-Request-Id, Deprecation, Sunset
access-control-allow-origin: *
strict-transport-security: max-age=31536000; includeSubdomains; preload
x-frame-options: deny
x-content-type-options: nosniff
x-xss-protection: 0
referrer-policy: origin-when-cross-origin, strict-origin-when-cross-origin
content-security-policy: default-src 'none'
x-ratelimit-limit: 60
x-ratelimit-remaining: 58
x-ratelimit-reset: 1712548983
x-ratelimit-resource: core
x-ratelimit-used: 2
accept-ranges: bytes
x-github-request-id: D4D4:167ECD:1A8817B:1AA9EFD:66135E98

@lunny lunny added the type/upstream This is an issue in one of Gitea's dependencies and should be reported there label Apr 8, 2024
@lunny
Member

lunny commented Apr 8, 2024

So maybe gitbucket needs to improve its API; otherwise, we cannot know whether the pagination has ended.

@jam7
Contributor Author

jam7 commented Apr 9, 2024

Just an update. I've read the gitbucket source code and noticed it doesn't support per_page at all. I checked the actual message in the gitea log again. It is 2024/04/09 07:40:41 ...migrations/github.go:434:GetIssues() [T] Request get issues 49/1, but in fact get 25. Gitbucket's internal per_page is simply 25. This is the real source of this problem. Sorry for the wrong description in my previous posts.

So, it may be possible to hard-code 25 as the per_page value for gitbucket migration only, until gitbucket changes it. Is that kind of patch acceptable for gitea? If so, I'll try to create a PR.

@lunny
Member

lunny commented Apr 9, 2024

Thank you for the investigation. I think it's acceptable. Please send a PR.

jam7 added a commit to jam7/gitea that referenced this issue Apr 10, 2024
Change to use the internal perPage of gitbucket as the maxPerPage number.
Description is available on go-gitea#30316.
techknowlogick pushed a commit that referenced this issue Apr 12, 2024
This patch improves the migration from gitbucket to gitea.

Gitbucket uses its own internal perPage value (= 25) for paging and
ignores the per_page argument in the requested URL. This causes gitea to
migrate only 25 issues and 25 PRs from a gitbucket repository. This may
not happen on old gitbucket, but recent gitbucket 4.40 and 4.38.4 have
this problem.

This patch changes gitea to use gitbucket's internally hardcoded perPage
as its maxPerPage number when migrating from gitbucket. There are
several perPage values in gitbucket, such as 25 for Issues/PRs and 10 for
Releases. Some of those APIs don't support paging yet. It sounded
difficult to implement, but using the minimum number among them worked
out very well. So, I use 10 in this patch.

Brief descriptions of the problems and this patch are also available in
#30316.

In addition, I'm not sure what kind of test cases are possible to write
here. It's a test for migration, so it requires a test gitbucket server
and a gitea server, I guess. Please let me know if it is possible to write
such test cases here. Thanks!
GiteaBot pushed a commit to GiteaBot/gitea that referenced this issue Apr 12, 2024
silverwind pushed a commit that referenced this issue Apr 12, 2024
Backport #30392 by @jam7

Co-authored-by: Kazushi (Jam) Marukawa <jam@pobox.com>
@jam7
Contributor Author

jam7 commented Apr 14, 2024

A patch is merged and the problem is fixed, so I'm closing this issue. Thank you.

@jam7 jam7 closed this as completed Apr 14, 2024
lunny pushed a commit to lunny/gitea that referenced this issue Apr 14, 2024
silverwind pushed a commit that referenced this issue Apr 14, 2024
Backport #30392 

Co-authored-by: Kazushi (Jam) Marukawa <jam@pobox.com>
DennisRasey pushed a commit to DennisRasey/forgejo that referenced this issue Apr 16, 2024
(cherry picked from commit 7af074dbeebc3c863618992b43f84ec9e5ab9657)
DennisRasey pushed a commit to DennisRasey/forgejo that referenced this issue Apr 16, 2024
Backport #30392 by @jam7

Co-authored-by: Kazushi (Jam) Marukawa <jam@pobox.com>
(cherry picked from commit b941d7485b53e5dd093a1cce3c9ff47c91d4fc58)
DennisRasey pushed a commit to DennisRasey/forgejo that referenced this issue Apr 17, 2024
Backport #30392

Co-authored-by: Kazushi (Jam) Marukawa <jam@pobox.com>
(cherry picked from commit b6379d2f167551560c870d2d705269c9ba6fc3bc)
@go-gitea go-gitea locked as resolved and limited conversation to collaborators Jul 14, 2024