Discard unread data of git cat-file #29297

Merged
merged 3 commits on Feb 22, 2024

Conversation

@KN4CK3R KN4CK3R commented Feb 21, 2024

Fixes #29101
Related #29298

Discard all read data to prevent misinterpreting existing data. Some discard calls were missing in error cases.
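
To illustrate why leftover bytes matter, here is a minimal sketch, not the code from this PR. It assumes the `git cat-file --batch` output layout of a header line `<oid> <type> <size>`, then the content, then a trailing LF. The names `discardFull`, `nextLine`, and `stream`, as well as the placeholder object ids, are hypothetical. If a caller reads only part of an object and does not discard the rest, the next "header" it reads is really leftover content.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// stream imitates `git cat-file --batch` output for two objects:
// "<oid> <type> <size>\n<content>\n" each. The oids are placeholders.
const stream = "aaaa blob 11\nhello world\nbbbb blob 3\nfoo\n"

// discardFull (hypothetical helper) skips exactly n bytes. It loops in
// bounded steps because bufio.Reader.Discard takes an int, not an int64.
func discardFull(rd *bufio.Reader, n int64) error {
	const maxStep = 1 << 20
	for n > 0 {
		step := n
		if step > maxStep {
			step = maxStep
		}
		skipped, err := rd.Discard(int(step))
		n -= int64(skipped)
		if err != nil {
			return err
		}
	}
	return nil
}

// nextLine reads the first header, consumes only `consumed` content bytes,
// optionally discards the remainder plus the trailing LF, and returns the
// line a caller would then try to parse as the next header.
func nextLine(consumed int64, discard bool) (string, error) {
	rd := bufio.NewReader(strings.NewReader(stream))
	header, err := rd.ReadString('\n')
	if err != nil {
		return "", err
	}
	fields := strings.Fields(header)
	if len(fields) != 3 {
		return "", fmt.Errorf("unexpected header %q", header)
	}
	size, err := strconv.ParseInt(fields[2], 10, 64)
	if err != nil {
		return "", err
	}
	// Simulate a caller that stops reading the content early.
	if _, err := io.ReadFull(rd, make([]byte, consumed)); err != nil {
		return "", err
	}
	if discard {
		if err := discardFull(rd, size-consumed+1); err != nil {
			return "", err
		}
	}
	return rd.ReadString('\n')
}

func main() {
	bad, _ := nextLine(5, false)
	good, _ := nextLine(5, true)
	fmt.Printf("without discard: %q\n", bad)  // leftover content
	fmt.Printf("with discard:    %q\n", good) // the real next header
}
```

In this sketch the first call returns `" world\n"` and the second returns `"bbbb blob 3\n"`, which is the kind of misinterpretation the discard calls are meant to prevent.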

@KN4CK3R KN4CK3R added type/bug backport/v1.21 This PR should be backported to Gitea 1.21 labels Feb 21, 2024
@KN4CK3R KN4CK3R added this to the 1.22.0 milestone Feb 21, 2024
@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Feb 21, 2024
@pull-request-size pull-request-size bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 21, 2024
@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Feb 21, 2024
}
return nil

return DiscardFull(b.rd, b.n+1)

Note: behaviour changed here. On a double close, the old code just called b.cancel twice, but the new code may return an error. Since the `io.Closer` documentation leaves the behaviour of Close after the first call undefined, this should be fine.
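
As a rough illustration of the `b.n+1` in the snippet above, the following sketch shows a reader whose `Close` skips the unread content plus the LF that follows each object in batch output. It is an assumption-laden stand-in, not the PR's implementation: the `blobReader` type here, its fields, and `demoClose` are hypothetical, and unlike the behaviour described in the note, this variant deliberately makes a second `Close` a no-op.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// blobReader is a hypothetical stand-in: it reads one object's content
// from a shared batch stream and tracks how many bytes are still unread.
type blobReader struct {
	rd     *bufio.Reader
	n      int64 // unread content bytes
	cancel func()
}

func (b *blobReader) Read(p []byte) (int, error) {
	if b.n <= 0 {
		return 0, io.EOF
	}
	if int64(len(p)) > b.n {
		p = p[:b.n]
	}
	n, err := b.rd.Read(p)
	b.n -= int64(n)
	return n, err
}

// Close skips the unread content plus the trailing LF so the shared
// stream is positioned at the next header. A second call does nothing.
func (b *blobReader) Close() error {
	if b.rd == nil {
		return nil // already closed
	}
	defer b.cancel()
	rd := b.rd
	b.rd = nil
	// Assumption for brevity: b.n+1 fits in an int.
	_, err := rd.Discard(int(b.n + 1))
	return err
}

// demoClose reads 5 of 11 content bytes, closes twice, and returns the
// next line on the shared stream and how often cancel was called.
func demoClose() (next string, cancels int, err error) {
	rd := bufio.NewReader(strings.NewReader("hello world\nbbbb blob 3\n"))
	b := &blobReader{rd: rd, n: 11, cancel: func() { cancels++ }}
	if _, err = io.ReadFull(b, make([]byte, 5)); err != nil {
		return
	}
	if err = b.Close(); err != nil {
		return
	}
	if err = b.Close(); err != nil { // no-op in this sketch
		return
	}
	next, err = rd.ReadString('\n')
	return
}

func main() {
	next, cancels, err := demoClose()
	fmt.Printf("next=%q cancels=%d err=%v\n", next, cancels, err)
}
```

Clearing `rd` before discarding is one possible design choice. It trades reporting a repeated error for a `Close` that is safe to call more than once.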

KN4CK3R commented Feb 21, 2024

The reason why the code sometimes works and sometimes doesn't is explained in #29298.

KN4CK3R added a commit that referenced this pull request Feb 21, 2024
Fixes the reason why #29101 is hard to replicate.
Related #29297

Create a repo with a file with minimum size 4097 bytes (I use 10000) and
execute the following code:
```go
gitRepo, err := gitrepo.OpenRepository(db.DefaultContext, <repo>)
assert.NoError(t, err)

commit, err := gitRepo.GetCommit(<sha>)
assert.NoError(t, err)

entry, err := commit.GetTreeEntryByPath(<file>)
assert.NoError(t, err)

b := entry.Blob()

// Create a reader
r, err := b.DataAsync()
assert.NoError(t, err)
defer r.Close()

// Create a second reader
r2, err := b.DataAsync()
assert.NoError(t, err) // Should be no error but is ErrNotExist
defer r2.Close()
```

The problem is the check in `CatFileBatch`:

https://github.com/go-gitea/gitea/blob/79217ea63c1f77de7ca79813ae45950724e63d02/modules/git/repo_base_nogogit.go#L81-L87
`Buffered() > 0` is used to check whether an operation is currently in
progress. This is a problem because we can't control the internal
buffer of the `bufio.Reader`. The code above demonstrates a sequence
that starts an operation while the check still reports no
active processing. The second call to `DataAsync()` therefore reuses the
existing instances instead of creating a new batch reader.
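
The following sketch shows one way `Buffered()` can report 0 while content is still pending. It assumes the underlying pipe delivers the header and the content in separate reads, which depends on timing and is not guaranteed. `chunkReader` and `bufferedAfterHeader` are hypothetical names used only for this illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// chunkReader returns at most one chunk per Read call, imitating a pipe
// where the header and the content arrive in separate writes.
type chunkReader struct{ chunks [][]byte }

func (c *chunkReader) Read(p []byte) (int, error) {
	if len(c.chunks) == 0 {
		return 0, io.EOF
	}
	n := copy(p, c.chunks[0])
	if n < len(c.chunks[0]) {
		c.chunks[0] = c.chunks[0][n:]
	} else {
		c.chunks = c.chunks[1:]
	}
	return n, nil
}

// bufferedAfterHeader reads only the header line and reports how many
// bytes the bufio.Reader holds afterwards.
func bufferedAfterHeader() (header string, buffered int, err error) {
	content := make([]byte, 10000) // the blob is still unread
	src := &chunkReader{chunks: [][]byte{
		[]byte("aaaa blob 10000\n"), // placeholder oid
		content,
		[]byte("\n"),
	}}
	rd := bufio.NewReader(src)
	header, err = rd.ReadString('\n')
	return header, rd.Buffered(), err
}

func main() {
	header, buffered, err := bufferedAfterHeader()
	// buffered is 0 here although 10001 bytes are still pending upstream.
	fmt.Printf("header=%q buffered=%d err=%v\n", header, buffered, err)
}
```

Because the reader's buffer only reflects what has already been pulled from the source, an empty buffer says nothing about whether an object is still being read.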
KN4CK3R added a commit to KN4CK3R/gitea that referenced this pull request Feb 21, 2024
@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Feb 22, 2024
silverwind pushed a commit that referenced this pull request Feb 22, 2024
Backport #29298
@lunny lunny enabled auto-merge (squash) February 22, 2024 03:22
@lunny lunny added the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Feb 22, 2024
@lunny lunny merged commit d6811ba into go-gitea:main Feb 22, 2024
26 checks passed
@GiteaBot GiteaBot removed the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Feb 22, 2024
GiteaBot pushed a commit to GiteaBot/gitea that referenced this pull request Feb 22, 2024
Fixes go-gitea#29101
Related go-gitea#29298

Discard all read data to prevent misinterpreting existing data. Some
discard calls were missing in error cases.

---------

Co-authored-by: yp05327 <576951401@qq.com>
@GiteaBot GiteaBot added the backport/done All backports for this PR have been created label Feb 22, 2024
lunny pushed a commit that referenced this pull request Feb 22, 2024
Backport #29297 by @KN4CK3R

Co-authored-by: KN4CK3R <admin@oldschoolhack.me>
Co-authored-by: yp05327 <576951401@qq.com>
zjjhot added a commit to zjjhot/gitea that referenced this pull request Feb 23, 2024
* giteaofficial/main:
  Start to migrate from `util.OptionalBool` to `optional.Option[bool]` (go-gitea#29329)
  Add slow SQL query warning (go-gitea#27545)
  Unify organizations header (go-gitea#29248)
  Frontport changelogs of minor releases (go-gitea#29337)
  Support SAML authentication (go-gitea#25165)
  Upgrade to fabric 6 (go-gitea#29334)
  Don't show third-party JS errors in production builds (go-gitea#29303)
  Remove bountysource (go-gitea#29330)
  Remove unnecessary "Str2html" modifier from templates (go-gitea#29319)
  Ignore the linux anchor point to avoid linux migrate failure (go-gitea#29295)
  Remove jQuery from the repo commit functions (go-gitea#29230)
  Remove unnecessary "Safe" modifier from templates (go-gitea#29318)
  Remove jQuery from the image pasting functionality (go-gitea#29324)
  Improve the `issue_comment` workflow trigger event (go-gitea#29277)
  Properly migrate automatic merge GitLab comments (go-gitea#27873)
  Refactor cmd setup and remove deadcode (go-gitea#29313)
  small cache when get user id on interation (go-gitea#29296)
  Discard unread data of `git cat-file` (go-gitea#29297)
  Don't install playwright twice (go-gitea#29302)

# Conflicts:
#	templates/home.tmpl
brechtvl added a commit to blender/gitea that referenced this pull request Feb 23, 2024
…tea#29310)"

This reverts commit ed5e0c8.

This causes Edit File on larger files to time out.
@KN4CK3R KN4CK3R deleted the fix-nogogit-discard branch February 24, 2024 22:52
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 3, 2024
Labels
backport/done All backports for this PR have been created backport/v1.21 This PR should be backported to Gitea 1.21 lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. type/bug
Development

Successfully merging this pull request may close these issues.

Invalid output in batch_reader.go
5 participants