
Change to parallel uploading for Azure blob storage #4291

Merged: 6 commits, Nov 9, 2022

Conversation

wwwjn
Contributor

@wwwjn wwwjn commented Oct 26, 2022

Reasons for making this change

Related issues

#4286

Screenshots

Checklist

  • I've added a screenshot of the changes, if this is a frontend change
  • I've added and/or updated tests, if this is a backend change
  • I've run the pre-commit.sh script
  • I've updated docs, if needed

@wwwjn wwwjn requested a review from epicfaace October 26, 2022 08:35
@wwwjn wwwjn self-assigned this Oct 26, 2022
@wwwjn
Contributor Author

wwwjn commented Oct 31, 2022

Test results:
TL;DR: Uploading in parallel reduces the time spent waiting while a block is being committed.

In PR #4275, we reduced MIN_WRITE_SIZE and MAX_WRITE_SIZE. As a result, uploading a file now requires more put() operations and more commit() operations, which slows down the upload. Parallelizing the upload makes up for that slowdown.
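To illustrate the idea (this is only a rough sketch, not the code in this PR), the snippet below splits a payload into small chunks, issues the put() calls concurrently from a thread pool, and then commits the block list once in the original order. `FakeBlockStore`, `upload_parallel`, and `CHUNK_SIZE` are hypothetical names made up for this example; the store is an in-memory stand-in, not the Azure SDK or the uploader in `blobstorageuploader.py`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4  # bytes; deliberately tiny for the demo (hypothetical value)


class FakeBlockStore:
    """In-memory stand-in for a block blob. NOT the Azure SDK."""

    def __init__(self):
        self._staged = {}
        self._lock = threading.Lock()
        self.committed = b""

    def put(self, block_id, data):
        # A real put() would be a network round trip; here we only store bytes.
        with self._lock:
            self._staged[block_id] = data

    def commit(self, block_ids):
        # Assemble the blob in the order given, regardless of upload order.
        self.committed = b"".join(self._staged[b] for b in block_ids)


def upload_parallel(store, payload, max_workers=4):
    """Upload chunks concurrently, then commit the ordered block list once."""
    chunks = [payload[i:i + CHUNK_SIZE] for i in range(0, len(payload), CHUNK_SIZE)]
    block_ids = [f"{i:08d}" for i in range(len(chunks))]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() waits for every put() and re-raises any worker exception.
        list(pool.map(store.put, block_ids, chunks))
    store.commit(block_ids)
    return block_ids


if __name__ == "__main__":
    store = FakeBlockStore()
    data = b"parallel block upload demo"
    upload_parallel(store, data)
    print(store.committed == data)
```

Because the commit takes an explicit ordered list of block IDs, the chunks can finish uploading in any order without corrupting the final contents.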

Measured upload time for the contents of one file:

  1. Without parallel uploading: 17625s

Image

  2. With parallel uploading: 2690s

Image
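For a quick sense of scale, the two timings above work out to roughly a 6.5x speedup. A small sketch of the arithmetic, using only the two numbers reported in this comment (the variable names are just for illustration):

```python
# Timings reported above, in seconds.
without_parallel_s = 17625
with_parallel_s = 2690

speedup = without_parallel_s / with_parallel_s
reduction_pct = 100 * (1 - with_parallel_s / without_parallel_s)

print(f"speedup: {speedup:.2f}x")            # speedup: 6.55x
print(f"time saved: {reduction_pct:.1f}%")   # time saved: 84.7%
```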

Two review threads on codalab/lib/beam/blobstorageuploader.py (outdated, resolved)
@wwwjn wwwjn merged commit 6108f13 into master Nov 9, 2022
@wwwjn wwwjn deleted the parallel-chunk branch November 9, 2022 06:43
@leilenah leilenah mentioned this pull request Nov 16, 2022