Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Commit build status not updating (intermittent) #14059

Closed
1 of 4 tasks
bobemoe opened this issue Dec 19, 2020 · 12 comments · Fixed by #17061
Closed
1 of 4 tasks

Commit build status not updating (intermittent) #14059

bobemoe opened this issue Dec 19, 2020 · 12 comments · Fixed by #17061
Labels
Milestone

Comments

@bobemoe
Copy link
Contributor

bobemoe commented Dec 19, 2020

  • Gitea version (or commit ref): 1.14.0+dev-414-g4ef751215
  • Git version: 2.20.1
  • Operating system: Debian based
  • Database (use [x]):
    • PostgreSQL
    • MySQL
    • MSSQL
    • SQLite

Description

Occasionally one of my Drone builds will not update it's status properly, leaving the commit as "Build in progress" with the yellow circle, when actually checking Drone it shows "Successful", or "Failed".

This only happens occasionally, the only pattern I can see is its only been for PR's so far, and also when the server is under heavy load. I first noticed when I ran "update branch" on 4 PR's at the same time, each calls 2 builds, so I have many builds queued.

From the logs, the error seems to be a SQL duplicate value in the unique index.

[17/Dec/2020:09:27:30 +0000] "GET /api/v1/repos/xxxx/xxxx/raw/a12437fc1f5b4153a3e25f9b8d82a39d20b7e7e8/.drone.yml HTTP/1.1" 200 2553 "-" "Go-http-client/1.1"
[17/Dec/2020:09:27:30 +0000] "POST /api/v1/repos/xxxx/xxxx/statuses/a12437fc1f5b4153a3e25f9b8d82a39d20b7e7e8 HTTP/1.1" 201 638 "-" "Go-http-client/1.1"
[17/Dec/2020:09:27:31 +0000] "POST /api/v1/repos/xxxx/xxxx/statuses/a12437fc1f5b4153a3e25f9b8d82a39d20b7e7e8 HTTP/1.1" 201 636 "-" "Go-http-client/1.1"
[17/Dec/2020:09:45:09 +0000] "POST /api/v1/repos/xxxx/xxxx/statuses/a12437fc1f5b4153a3e25f9b8d82a39d20b7e7e8 HTTP/1.1" 201 638 "-" "Go-http-client/1.1"
[17/Dec/2020:09:45:22 +0000] "POST /api/v1/repos/xxxx/xxxx/statuses/a12437fc1f5b4153a3e25f9b8d82a39d20b7e7e8 HTTP/1.1" 201 636 "-" "Go-http-client/1.1"
[17/Dec/2020:09:50:59 +0000] "POST /api/v1/repos/xxxx/xxxx/statuses/a12437fc1f5b4153a3e25f9b8d82a39d20b7e7e8 HTTP/1.1" 201 636 "-" "Go-http-client/1.1"
[17/Dec/2020:09:50:59 +0000] "POST /api/v1/repos/xxxx/xxxx/statuses/a12437fc1f5b4153a3e25f9b8d82a39d20b7e7e8 HTTP/1.1" 500 375 "-" "Go-http-client/1.1"
2020/12/17 09:50:59 ...pi/v1/repo/status.go:64:NewCommitStatus() [E] CreateCommitStatus: NewCommitStatus[repo_id: 1, user_id: 1, sha: a12437fc1f5b4153a3e25f9b8d82a39d
20b7e7e8]: Insert CommitStatus[/home/git/gitea-repositories/xxxx/xxxx.git, a12437fc1f5b4153a3e25f9b8d82a39d20b7e7e8]: Error 1062: Duplicate entry '5-1-a12437fc1f5b
4153a3e25f9b8d82a39d20b7e7e8' for key 'UQE_commit_status_repo_sha_index'

Looking at how the index is generated, I'm guessing two status updates came in simultaneously and both generated the same value? Maybe a lock on the db table is needed during index increment?

// Get the next Status Index
var nextIndex int64
lastCommitStatus := &CommitStatus{
SHA: opts.SHA,
RepoID: opts.Repo.ID,
}
has, err := sess.Desc("index").Limit(1).Get(lastCommitStatus)
if err != nil {
if err := sess.Rollback(); err != nil {
log.Error("NewCommitStatus: sess.Rollback: %v", err)
}
return fmt.Errorf("NewCommitStatus[%s, %s]: %v", repoPath, opts.SHA, err)
}
if has {
log.Debug("NewCommitStatus[%s, %s]: found", repoPath, opts.SHA)
nextIndex = lastCommitStatus.Index
}
opts.CommitStatus.Index = nextIndex + 1
log.Debug("NewCommitStatus[%s, %s]: %d", repoPath, opts.SHA, opts.CommitStatus.Index)
opts.CommitStatus.ContextHash = hashCommitStatusContext(opts.CommitStatus.Context)

I notice that file has been updated yesterday, but looking at the code it looks like the index is still generated the same way. I will update my gitea anyway and keep an eye on things. Just happened again on latest master. I've updated Gitea version above.

@lunny lunny added the type/bug label Dec 19, 2020
@bobemoe
Copy link
Contributor Author

bobemoe commented Dec 24, 2020

I've managed to recreate this on my dev instance using a script to fire simultaneous commit updates at the API.

Interestingly it didn't trigger the error when using sqlite3, however when I switch to mariadb10.3 it happens.

I used bash script:

n="test" c=0 ; while [ "$c" -lt 10 ] ; do ((c=c+1)) ; (curl -X POST "http://localhost:3000/api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b" -H  "accept: application/json" -H  "Content-Type: application/json" -d "{  \"context\": \"test\",  \"description\": \"$n $c\",  \"state\": \"pending\",  \"target_url\": \"http://localhost\"}" &) ; done

Which gives:

gitea_1         | 2020/12/24 17:53:33 Started POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b for 172.18.0.1:33530
gitea_1         | 2020/12/24 17:53:33 Started POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b for 172.18.0.1:33540
gitea_1         | 2020/12/24 17:53:33 Started POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b for 172.18.0.1:33536
gitea_1         | 2020/12/24 17:53:33 Started POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b for 172.18.0.1:33550
gitea_1         | 2020/12/24 17:53:33 Started POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b for 172.18.0.1:33546
gitea_1         | 2020/12/24 17:53:33 Started POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b for 172.18.0.1:33570
gitea_1         | 2020/12/24 17:53:33 Started POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b for 172.18.0.1:33554
gitea_1         | 2020/12/24 17:53:33 Started POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b for 172.18.0.1:33558
gitea_1         | 2020/12/24 17:53:33 Started POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b for 172.18.0.1:33562
gitea_1         | 2020/12/24 17:53:33 Started POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b for 172.18.0.1:33566
gitea_1         | 2020/12/24 17:53:34 ...pi/v1/repo/status.go:64:NewCommitStatus() [E] CreateCommitStatus: NewCommitStatus[repo_id: 1, user_id: 1, sha: 11616565f5b0d4491a0e06140b070215f53643e7]: Insert CommitStatus[/src/data/git/gitea-repositories/test/test3.git, 11616565f5b0d4491a0e06140b070215f53643e7]: Error 1062: Duplicate entry '62-1-11616565f5b0d4491a0e06140b070215f53643e7' for key 'UQE_commit_status_repo_sha_index'
gitea_1         | 2020/12/24 17:53:34 Completed POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b 0  in 342.516088ms
gitea_1         | 2020/12/24 17:53:34 ...pi/v1/repo/status.go:64:NewCommitStatus() [E] CreateCommitStatus: NewCommitStatus[repo_id: 1, user_id: 1, sha: 11616565f5b0d4491a0e06140b070215f53643e7]: Insert CommitStatus[/src/data/git/gitea-repositories/test/test3.git, 11616565f5b0d4491a0e06140b070215f53643e7]: Error 1062: Duplicate entry '62-1-11616565f5b0d4491a0e06140b070215f53643e7' for key 'UQE_commit_status_repo_sha_index'
gitea_1         | 2020/12/24 17:53:34 Completed POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b 0  in 447.751879ms
gitea_1         | 2020/12/24 17:53:34 Completed POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b 0  in 323.652796ms
gitea_1         | 2020/12/24 17:53:34 Completed POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b 0  in 351.012977ms
gitea_1         | 2020/12/24 17:53:34 Completed POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b 0  in 370.496669ms
gitea_1         | 2020/12/24 17:53:34 Completed POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b 0  in 383.056226ms
gitea_1         | 2020/12/24 17:53:34 Completed POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b 0  in 498.687752ms
gitea_1         | 2020/12/24 17:53:34 ...pi/v1/repo/status.go:64:NewCommitStatus() [E] CreateCommitStatus: NewCommitStatus[repo_id: 1, user_id: 1, sha: 11616565f5b0d4491a0e06140b070215f53643e7]: Insert CommitStatus[/src/data/git/gitea-repositories/test/test3.git, 11616565f5b0d4491a0e06140b070215f53643e7]: Error 1062: Duplicate entry '66-1-11616565f5b0d4491a0e06140b070215f53643e7' for key 'UQE_commit_status_repo_sha_index'
gitea_1         | 2020/12/24 17:53:34 Completed POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b 0  in 501.800019ms
gitea_1         | 2020/12/24 17:53:34 Completed POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b 0  in 415.291119ms
gitea_1         | 2020/12/24 17:53:34 Completed POST /api/v1/repos/test/test3/statuses/11616565f5b0d4491a0e06140b070215f53643e7?access_token=513b2ed758ca2ba520dd0209beb6ab594317880b 0  in 425.528454ms

I'm still thinking a lock on the db table would solve this.

Or maybe the index field could be auto increment? I'm not sure why there is an index field?

Looking at the xorm docs it looks like sess.ForUpdate() https://gitea.com/xorm/xorm/src/branch/master/session.go#L243 may be the solution but I'm not sure how I specify the table to lock or would it lock the whole db? I will have a play with this now and report back.

@bobemoe
Copy link
Contributor Author

bobemoe commented Dec 24, 2020

So with sess.forUpdate() called soon after sess is created, it changes the error from Error 1062: Duplicate entry to Error 1213: Deadlock found

gitea_1         | 2020/12/24 18:15:06 ...pi/v1/repo/status.go:64:NewCommitStatus() [E] CreateCommitStatus: NewCommitStatus[repo_id: 1, user_id: 1, sha: 11616565f5b0d4491a0e06140b070215f53643e7]: NewCommitStatus[/src/data/git/gitea-repositories/test/test3.git, 11616565f5b0d4491a0e06140b070215f53643e7]: Error 1213: Deadlock found when trying to get lock; try restarting transaction

I guess I could catch that error and recall the NewCommitStatus function? Though why not just catch duplicate entry error and retry.

It doens't look like x.ForUpdate() is uses anywhere else in Gitea codebase? I fear using it here might make deadlock errors start appearing elsewhere!?

Not sure I know what I'm doing enough to solve this one.

Maybe using an autoincrement index may be better?

@bobemoe
Copy link
Contributor Author

bobemoe commented Dec 24, 2020

It looks like the index field is just used to get the first/last or order the commit statuses for a specific commit? But I'm not sure.

If this is the case, then we could do this using the id field which is already autoincrement. Doesn't matter if its not contiguous or starts from 1.

We could remove the index field then. Probably break all sorts! :/

@lunny
Copy link
Member

lunny commented Dec 26, 2020

sess.ForUpdate() only support MySQL but not all databases Gitea supported.

@bobemoe
Copy link
Contributor Author

bobemoe commented Dec 26, 2020

OK, so sess.ForUpdate() is out of the question then.

So I think maybe the solution is to check for Error 1062: Duplicate entry and if so retry?

How would I test for that error, err.code==1062 or err=='Error 1062: Duplicate entry' or something? Sorry excuse my poor Go.

@lunny
Copy link
Member

lunny commented Dec 27, 2020

But this is only for mysql, for other database it will return different error. Maybe we could just retry serval times one returned error as a temporary solution.

@bobemoe
Copy link
Contributor Author

bobemoe commented Dec 27, 2020

OK. I think I would be able to make a retry loop, say 3 to 5 times perhaps? maybe with a couple hundred miliseconds delay between retries?

Though as you say it feels like a temporary workaround.

I'm tempted to remove that index column, and refactor the sorting to use the id column.

@lunny
Copy link
Member

lunny commented Dec 27, 2020

The index column could be used to know the sequence for commit statuses of one commit. Maybe we should not remove it.

@bobemoe
Copy link
Contributor Author

bobemoe commented Dec 27, 2020

Yes. But the same sequence can also be known by ordering by the id field.

MariaDB [git]> select id,`index`,sha,state from commit_status where sha='f0e325d9a81450c3f5a81249d19149a74d5e85c9' order by `index`; 
+------+-------+------------------------------------------+---------+
| id   | index | sha                                      | state   |
+------+-------+------------------------------------------+---------+
| 1905 |     1 | f0e325d9a81450c3f5a81249d19149a74d5e85c9 | pending |
| 1906 |     2 | f0e325d9a81450c3f5a81249d19149a74d5e85c9 | pending |
| 1907 |     3 | f0e325d9a81450c3f5a81249d19149a74d5e85c9 | failure |
| 1912 |     4 | f0e325d9a81450c3f5a81249d19149a74d5e85c9 | pending |
| 1913 |     5 | f0e325d9a81450c3f5a81249d19149a74d5e85c9 | pending |
| 1914 |     6 | f0e325d9a81450c3f5a81249d19149a74d5e85c9 | error   |
+------+-------+------------------------------------------+---------+
6 rows in set (0.000 sec)

MariaDB [git]> select id,`index`,sha,state from commit_status where sha='f0e325d9a81450c3f5a81249d19149a74d5e85c9' order by `id`; 
+------+-------+------------------------------------------+---------+
| id   | index | sha                                      | state   |
+------+-------+------------------------------------------+---------+
| 1905 |     1 | f0e325d9a81450c3f5a81249d19149a74d5e85c9 | pending |
| 1906 |     2 | f0e325d9a81450c3f5a81249d19149a74d5e85c9 | pending |
| 1907 |     3 | f0e325d9a81450c3f5a81249d19149a74d5e85c9 | failure |
| 1912 |     4 | f0e325d9a81450c3f5a81249d19149a74d5e85c9 | pending |
| 1913 |     5 | f0e325d9a81450c3f5a81249d19149a74d5e85c9 | pending |
| 1914 |     6 | f0e325d9a81450c3f5a81249d19149a74d5e85c9 | error   |
+------+-------+------------------------------------------+---------+
6 rows in set (0.001 sec)

If this is its only use then I could refactor to use id. If there needs to be a contiguous counter starting from 1 then this could be calculated when looping through the results.

@bobemoe
Copy link
Contributor Author

bobemoe commented Dec 27, 2020

If there is another reason for the index field then this of course is not feasible.

It gives me another idea though. We could insert the record with a null index, then select all records for that sha order by id, loop through counting and update any with null index to the correct index count.

Again seems a bit of a kludge. There is a possibility the table will be queried from elsewhere while a row has a null index.

@bobemoe
Copy link
Contributor Author

bobemoe commented Dec 27, 2020

I've removed index and refactored accordingly, everything builds and drone integration still works fine :) I will create PR so we can review there.

@parnic-sks
Copy link

We are running into this issue as well on our postgres-backed Gitea. Thanks for working on a solution, @bobemoe! For now, our calling script has baked in delayed retries to try and avoid this.

Sorting by ID does seem like a good workaround. There's also already timestamps for created/updated which could be guaranteed to be sortable, regardless of the backing database server type.

@lunny lunny added this to the 1.16.0 milestone Sep 16, 2021
@go-gitea go-gitea locked and limited conversation to collaborators Oct 19, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants