Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make it possible to change # of retries #102

Merged
merged 1 commit into from Mar 24, 2017
Merged

Conversation

@shaform
Copy link
Contributor

@shaform shaform commented Dec 3, 2016

No description provided.

@shaform shaform force-pushed the shaform:retry branch 6 times, most recently from 94d1167 to 85258ed Dec 3, 2016
@@ -744,6 +744,10 @@ def __exit__(self, type, value, traceback):
self.close()


def s3_iter_bucket_process_key_with_args(args):

This comment has been minimized.

@piskvorky

piskvorky Dec 3, 2016
Member

I'd prefer kwargs -- too easy to make a mistake in the order of arguments otherwise. Better use named arguments (beyond the mandatory key, which can be positional).

This comment has been minimized.

@shaform

shaform Dec 3, 2016
Author Contributor

Sure, I've made the change. In fact, I think functools.partial should be a better solution. Unfortunately, Python 2.6 does not support it.
ref. http://stackoverflow.com/questions/4463275/python-multiprocessing-a-function-with-several-inputs

@shaform shaform force-pushed the shaform:retry branch from 85258ed to d506a25 Dec 3, 2016
@tmylk
Copy link
Contributor

@tmylk tmylk commented Dec 4, 2016

Thanks for the PR!
May I ask to add a test of this new feature, similar to https://github.com/RaRe-Technologies/smart_open/pull/96/files#diff-36c1f33ff2d12844dc387270ca11d040R694

@shaform shaform force-pushed the shaform:retry branch 5 times, most recently from 05143ce to db3374a Dec 4, 2016
@shaform
Copy link
Contributor Author

@shaform shaform commented Dec 4, 2016

Thanks for the reply. I've added the test cases.

@shaform
Copy link
Contributor Author

@shaform shaform commented Dec 4, 2016

It appears the test sometimes hangs (#62) possibly because mock_s3 does not go well with multiprocessing. So I've fixed it as well with a new commit.

@tmylk
Copy link
Contributor

@tmylk tmylk commented Dec 22, 2016

@shaform Not sure about the last commit. It lowers our test coverage signifcantly. Will merge only the first one.

@shaform
Copy link
Contributor Author

@shaform shaform commented Dec 22, 2016

@shaform
Copy link
Contributor Author

@shaform shaform commented Dec 22, 2016

@tmylk All right, now I look at the code, I find that there are actually other places that have smart_open.smart_open("", 'wb') being tested. But then I couldn't find out why the coverage would be decreased. Could you share the coverage report?

@shaform shaform force-pushed the shaform:retry branch from bf0f8b7 to dab7292 Mar 23, 2017
@shaform
Copy link
Contributor Author

@shaform shaform commented Mar 23, 2017

@tmylk Ok. I've decided to revert the second commit. This makes the test more likely to fail due to #62 though.


if MULTIPROCESSING:
logger.info("iterating over keys from %s with %i workers" % (bucket, workers))
pool = multiprocessing.pool.Pool(processes=workers)
iterator = pool.imap_unordered(s3_iter_bucket_process_key, keys)
iterator = pool.imap_unordered(s3_iter_bucket_process_key_with_kwargs, keys)

This comment has been minimized.

@tmylk

tmylk Mar 23, 2017
Contributor

Please use partial instead of defining a new function s3_iter_bucket_process_key_with_kwargs. The number of retries is the same for each worker.

This comment has been minimized.

@shaform

shaform Mar 23, 2017
Author Contributor

@tmylk
Surely, but as mentioned earlier, this would cause errors in Python 2.6. Specifically, it's because partial conflicts with multiprocessing in Python 2.6: https://bugs.python.org/issue5228.

In order to make it work on Python 2.6, I've made some changes according to this thread:
https://mail.python.org/pipermail/python-dev/2009-February/086039.html

@tmylk
Copy link
Contributor

@tmylk tmylk commented Mar 23, 2017

Thanks. Just a decorative change requested.

@shaform shaform force-pushed the shaform:retry branch from dab7292 to 69bac6b Mar 23, 2017
@tmylk tmylk merged commit 5c2e0a6 into RaRe-Technologies:master Mar 24, 2017
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@tmylk
Copy link
Contributor

@tmylk tmylk commented Mar 24, 2017

Apologies, didn't notice the discussion above. Thanks for this new feature!


IS_PY2 = (sys.version_info[0] == 2)

if IS_PY2:
import httplib
if sys.version_info[1] == 6:
import copy_reg

This comment has been minimized.

@piskvorky

piskvorky Mar 24, 2017
Member

@shaform What is this cleverness?

We try not to write code that is too clever, and if we must, we comment it heavily.

Please include detailed comments about the motivation and behaviour of these blocks of code.

This comment has been minimized.

@shaform

shaform Mar 24, 2017
Author Contributor

@piskvorky
How about this: shaform@3154b06.

This comment has been minimized.

@piskvorky

piskvorky Mar 25, 2017
Member

Isn't it easier and more readable to just create a named function?

Named functions (at the module level) are pickleable without a problem. We don't have to monkey-patch pickling and use obscure functional currying.

I'm really -1 on this solution, unless there's a very good reason it cannot be done in a more straightforward manner.

This comment has been minimized.

@shaform

shaform Mar 25, 2017
Author Contributor

@piskvorky @tmylk I've put the earlier implementation here https://github.com/shaform/smart_open/tree/retry_alt

I could send another pull request if you like.

This comment has been minimized.

@tmylk

tmylk Mar 27, 2017
Contributor

Thanks @shaform. Simplified version merged in #118

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants
You can’t perform that action at this time.