Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multipart: Avoid reading all parts at once before upload #703

Merged
merged 1 commit into from
Oct 2, 2018

Conversation

vadmeste
Copy link
Member

The multipart upload was reading all parts and putting them
in memory before calling for thread manager to upload parts
in parallel.

This commit changes the behavior to read a part only when a
worker becomes ready to upload.

fixes #626

fast_quit = False
while not self.tasks_queue.empty():
func, args, kargs = self.tasks_queue.get()
while True:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a task throws an exception, fast_quit is set to True. IIUC, the loop proceeds to the next iteration, 'pops' the next task out of the queue but exits the loop without executing it. So, the 'popped' task will not be run by any of the workers in the thread pool? Is this behaviour acceptable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a task throws an exception, fast_quit is set to True. IIUC, the loop proceeds to the next iteration, 'pops' the next task out of the queue but exits the loop without executing it. So, the 'popped' task will not be run by any of the workers in the thread pool? Is this behaviour acceptable?

Yes, if one thread has an exception, it is not needed for other threads to continue uploads of parts

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if one thread has an exception, it is not needed for other threads to continue uploads of parts

I will modify the code a little bit, only the current thread is quitting, not other threads as I thought

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will modify the code a little bit, only the current thread is quitting, not other threads as I thought

I removed fast_quit variable, now we will test directly on exceptions_queue and see if it is empty before continuing to run tasks

minio/thread_pool.py Show resolved Hide resolved
for _ in range(self.num_threads):
Worker(self.tasks_queue, self.results_queue, self.exceptions_queue)

def result(self):
""" Return the result of all called tasks """
# Send nil to all threads to cleanly stop them
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor) nit: s/nil/None

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor) nit: s/nil/None

Done

The multipart upload was reading all parts and putting them
in memory before calling for thread manager to upload parts
in parallel.

This commit changes the behavior to read a part only when a
worker becomes ready to upload.
Copy link
Collaborator

@Praveenrajmani Praveenrajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kannappanr kannappanr merged commit b0f2b71 into minio:master Oct 2, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Stream multipart PUT operation
4 participants