Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix artifact v4 upload above 8MB #31664

Merged
merged 18 commits into from
Sep 22, 2024

Conversation

ChristopherHX
Copy link
Contributor

@ChristopherHX ChristopherHX commented Jul 20, 2024

Multiple chunks are uploaded with type "block" without using "appendBlock" and eventually out of order for bigger uploads.
8MB seems to be the chunk size

This change parses the blockList uploaded after all blocks to get the final artifact size and order them correctly before calculating the sha256 checksum over all blocks

Fixes #31354

@pull-request-size pull-request-size bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Jul 20, 2024
@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Jul 20, 2024
@github-actions github-actions bot added modifies/api This PR adds API routes or modifies them modifies/go Pull requests that update Go code labels Jul 20, 2024
@ChristopherHX
Copy link
Contributor Author

ChristopherHX commented Jul 20, 2024

Doesn't work yet.

40MB are fixed now but ca. 800MB still cause checksum errors, are there some asynchronious things going on...

Parallel Chunk updates has been seen, 6 of 9 missing chunks found in log as duplicates

@ChristopherHX
Copy link
Contributor Author

ChristopherHX commented Jul 20, 2024

After some tests the chunks are still out of order, investigating to implement blockList and store chunks using their name specified per query

During merging read the block list to build the reader over every chunk

@pull-request-size pull-request-size bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jul 20, 2024
Copy link
Contributor Author

@ChristopherHX ChristopherHX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the issue is fixed like that, but we need to look at the security of storing a blockid in blob storage or find alternatives

Uploaded bytes 746586112
Uploaded bytes 754974720
Uploaded bytes 763363328
Uploaded bytes 771751936
Uploaded bytes 780140544
Uploaded bytes 788529152
Uploaded bytes 796917760
Uploaded bytes 805306368
Uploaded bytes 807753807
Finished uploading artifact content to blob storage!
SHA256 hash of uploaded artifact zip is 6feeaf6b88049a4c6ced1240ed3911afaa819229cd082999be5c81eff09c33f1
Finalizing artifact upload
Artifact artifact.zip successfully finalized. Artifact ID 22
Artifact artifact has been successfully uploaded! Final size is 807753807 bytes. Artifact ID is 22
Artifact download URL: http://localhost:3000/test/artifact-upload-big/actions/runs/39/artifacts/22

Comment on lines 330 to 334
_, err := r.fs.Save(fmt.Sprintf("tmp%d/block-%d-%d-%s", task.Job.RunID, task.Job.RunID, ctx.Req.ContentLength, blockid), ctx.Req.Body, -1)
if err != nil {
log.Error("Error runner api getting task: task is not running")
ctx.Error(http.StatusInternalServerError, "Error runner api getting task: task is not running")
return
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have a blockid, delay ordering chunks to the end and use it's blockid to form the name

I notice that this might be a security issue, because an attacker could control the filesystem name, but what alternatives do we have?
Santizing is probably the way to go here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I switched to base64url encoding of the blockid that doesn't allow to traverse the filesystem or do other bad things

Comment on lines 338 to 339
case "blocklist":
_, err := r.fs.Save(fmt.Sprintf("tmp%d/%d-blocklist", task.Job.RunID, task.Job.RunID), ctx.Req.Body, -1)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we got the final block order by blockid in xml, save it for the merge step

Comment on lines 380 to 392
log.Warn("Error merge chunks: %v", err)
chunkMap, err := listChunksByRunID(r.fs, runID)
if err != nil {
log.Error("Error merge chunks: %v", err)
ctx.Error(http.StatusInternalServerError, "Error merge chunks")
return
}
chunks, ok = chunkMap[artifact.ID]
if !ok {
log.Error("Error merge chunks")
ctx.Error(http.StatusInternalServerError, "Error merge chunks")
return
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My tests don't upload a blockmap yet, fallback try old broken mode

@@ -123,6 +123,49 @@ func listChunksByRunID(st storage.ObjectStorage, runID int64) (map[int64][]*chun
return chunksMap, nil
}

func listChunksByRunIDV4(st storage.ObjectStorage, runID int64, artifactID int64, blist *BlockList) ([]*chunkFileItem, error) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Syntetise a chunk list with the correct start/end and artifact entries, based on the xml Blocklist

@ChristopherHX ChristopherHX marked this pull request as ready for review July 21, 2024 17:06
@VAllens
Copy link

VAllens commented Jul 23, 2024

I also encountered the same problem.
When can this PR be merged?
Has there been any progress?

@silverwind
Copy link
Member

look at the security of storing a blockid in blob storage

Isn't the storage considered "private" anyways?

@ChristopherHX
Copy link
Contributor Author

look at the security of storing a blockid in blob storage

Isn't the storage considered "private" anyways?

Idk if the gitea object storage is protected against this pattern tmpv4267/block-267-9-/../some/custom/path

All runs of (all users?) use the same artifact storage container

I double fixed that now so, it's no potential problem anymore

@silverwind
Copy link
Member

Ah I see where you are getting at. Yes, cross-repo contamination should be avoided.

@silverwind
Copy link
Member

Is a test easily possible here? Might be valuable.

@ChristopherHX
Copy link
Contributor Author

Is a test easily possible here?

A test that uploads a block with a blockid like ../some/path and verify that it worked without failure?
Optionally check that all files created in object storage are in the correct folder

Yes I think this should be possible to do for me tomorrow

@ChristopherHX
Copy link
Contributor Author

ChristopherHX commented Jul 24, 2024

My PR seem to have issues with minio and azurite in backend tests...

Could it be that the blob storage that has been written by storage.Actions.Save(...) is not immediately available via storage.Actions.Open

e.g. for minio I get an error like The specified key does not exist, not shure if it comes from encoding/xml or minio

e.g. for azureite file does not exist

local backend has a 100% success rate

I'm pretty shure the file has been created a few cycles before the open call

Do you have any advice here?

EDIT
Is this something else? And the key is never stored on disk for minio/azureite, 5s retries didn't work

EDIT 2
The minio error is from encoding/xml...

EDIT 3
The minio error comes from minio, when reading the stream after the chunk has been deleted between open and read

EDIT 4
All known issues are gone and tests are working fine

@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Sep 7, 2024
@lunny lunny added this to the 1.23.0 milestone Sep 7, 2024
@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Sep 21, 2024
@lunny lunny added the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Sep 21, 2024
@lafriks lafriks merged commit b594cec into go-gitea:main Sep 22, 2024
26 checks passed
@GiteaBot GiteaBot removed the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Sep 22, 2024
@VAllens
Copy link

VAllens commented Sep 23, 2024

goooooooooooooooooood !!!

zjjhot added a commit to zjjhot/gitea that referenced this pull request Sep 23, 2024
* giteaofficial/main:
  [skip ci] Updated licenses and gitignores
  Fix rename branch permission bug (go-gitea#32066)
  Fix artifact v4 upload above 8MB (go-gitea#31664)
  [skip ci] Updated translations via Crowdin
  Add bin to Composer Metadata (go-gitea#32099)
  Fix wrong last modify time (go-gitea#32102)
  Fix upload maven pacakge parallelly (go-gitea#31851)
  Repo Activity: count new issues that were closed (go-gitea#31776)
  Count typescript files as frontend for labeling (go-gitea#32088)
  Use camo.Always instead of camo.Allways (go-gitea#32097)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. modifies/api This PR adds API routes or modifies them modifies/go Pull requests that update Go code size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Large Actions artifacts (> ~8MB) will fail checksum when merging their chunks
7 participants