-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: Introduce threading to improve upload speed #46
Conversation
else: | ||
self.zip_root_path = root_path | ||
|
||
max_workers = self.xblock_settings.get("THREADPOOLEXECUTOR_MAX_WORKERS", 10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is unzipping each file one by one the part that is causing very slow response times?
I'd like to avoid resorting to multithreading, if possible. It's a little too complex for my taste. Could we unzip locally to a temporary directory before uploading the folder in one go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that issue reporter mentioned slow upload speed over S3. While uploading scorm package we extract it and then upload/save each file one by one. It appears extracting(unzipping) the files is not taking that much time because reporter mentioned speed on local storage is fine which points to upload to S3 being cause of slowness. These changes introduce parallel upload to S3 to reduce upload time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you manage to reproduce the issue? I understand it would be a bit tricky to setup an S3 bucket just for that purpose. But I would like to be sure that we're addressing the right thing here. @dyudyunov are you able to test this fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi and thanks for mentioning me!
I'll try the fix ASAP (most probably on the next week)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dyudyunov have you been able to test it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
upload time was close to 30sec according to logs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay! Now I'm wondering whether we can find a solution that does not involve multithreading. Is there a native storage API that would support copying an entire directory, instead of copying data file by file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It appears that in django-storages>=1.14 which is being used in Quince release it is using Threading to optimize upload of files on S3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, do we still have to implement threading ourselves or is it handled natively by django-storages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discuss plan is to implement it ourselves for pre quince version and In quince we should remove our implementation and rely on multi threading support of django-storages.
Introduce threading to concurrently handle file extract and upload chore: added TODO comment
baced35
to
a8f6b6d
Compare
@regisb could you please take look at this one? |
I recently uploaded a SCORM package, sized at 6.2 MB, and it took approximately 67 seconds to complete. My network speeds at the time were as follows: download speed at 14.5 Mbps and upload speed at 18.5 Mbps. |
@Abdul-Muqadim-Arbisoft thanks for the feedback. Which version of Tutor did you use? Did you also try this PR? |
I used tutor quince and the above stats I provided are for this pr |
@Abdul-Muqadim-Arbisoft this issue was report with tutor 16 or prior version so you have test it with tutor 16 or less. |
@ziafazal I will test it today with tutor 16 |
I tested it on tutor 16.1.8, it uploads a SCORM package with s3 storage configured, sized at 6.2 MB, and it takes approximately 7 minutes 30 secs to complete. My network speeds at the time were as follows: download speed at 30 Mbps and upload speed at 19.3 Mbps. |
Closing this PR as 1) the issue is not present on quince, and 2) the changes/fix do not work entirely as expected. |
This PR introduce threading to concurrently handle file extract and upload, which would improve upload speed of SCORM package and fix #42