New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OPT+ENH: determine config to query for procedure once, add more "log(2," to eval_func #3576
Conversation
…2," to eval_func Original OPT issue: datalad#3575 Those debug statements seems to have no cost but could help determining unexpected slow downs
The next line adds `proc_cfg` and expects `spec` to be a plain list.
Codecov Report
@@ Coverage Diff @@
## 0.11.x #3576 +/- ##
===========================================
- Coverage 77.29% 59.08% -18.21%
===========================================
Files 255 255
Lines 33641 33645 +4
===========================================
- Hits 26002 19879 -6123
- Misses 7639 13766 +6127
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## 0.11.x #3576 +/- ##
===========================================
- Coverage 77.29% 59.08% -18.21%
===========================================
Files 255 255
Lines 33641 33645 +4
===========================================
- Hits 26002 19879 -6123
- Misses 7639 13766 +6127
Continue to review full report at Codecov.
|
@@ -363,33 +364,39 @@ def eval_func(wrapped, instance, args, kwargs): | |||
def _result_filter(res): | |||
return result_filter(res, **allkwargs) | |||
|
|||
def _get_procedure_specs(param_key=None, cfg_key=None, ds=None): | |||
def _get_procedure_specs(param_key=None, cfg_key=None, ds=None, proc_cfg=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The additional parameter is fine by me, though I have a slight preference for just defining proc_cfg
above _get_procedure_specs
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is how I did it first, but then it should be not a pure variable since you can't rebind it inside, but some kind of a container (I had list). Just looks ugly etc, so decided to make it explicit
no travis report here, but it is there and I seems screwed up
so will fix it now |
Sorry, ignore me. I fixed another error that the initial commit caused in test_run_procedure, but it looks like this is a different source. |
then I guess there more! I should have indeed went the "outside variable" way to make it less fragile :-/ $> git show
commit 4601749db64e9d0e45b81b68f8086f9feb422d4c (HEAD -> opt-doitonce)
Author: Yaroslav Halchenko <debian@onerussian.com>
Date: Fri Aug 2 09:42:04 2019 -0400
BF: eval_func: One more fix to return value
diff --git a/datalad/interface/utils.py b/datalad/interface/utils.py
index ea510f598..0a169aaae 100644
--- a/datalad/interface/utils.py
+++ b/datalad/interface/utils.py
@@ -369,7 +369,7 @@ def eval_results(func):
spec = common_params.get(param_key, None)
if spec is not None:
# this is already a list of lists
- return spec
+ return spec, proc_cfg
if not proc_cfg:
# .is_installed and .config can be costly, so ensure we do
changes on filesystem:
.asv | 0
(dev) 1 15626.....................................:Fri 02 Aug 2019 09:50:14 AM EDT:.
(git)hopa:~datalad/datalad[opt-doitonce]git
$> git range-diff gh-yarikoptic/opt-doitonce...HEAD
-: --------- > 1: 4601749db BF: eval_func: One more fix to return value
why range-diff doesn't show the diff? here is no rewrite but a new commit pushing |
range-diff won't show the diff for added or dropped commits, only those for matched patches, which means it's not particularly useful (or needed) when you're not rewriting a series. |
Looks good. I missed that in my fixup commit. |
0.11.7 (Sep 02, 2019) -- python2-we-still-love-you-but-... Primarily bugfixes with some optimizations and refactorings. Fixes - [addurls][] - now provides better handling when the URL file isn't in the expected format. ([#3579][]) - always considered a relative file for the URL file argument as relative to the current working directory, which goes against the convention used by other commands of taking relative paths as relative to the dataset argument. ([#3582][]) - [run-procedure][] - hard coded "python" when formatting the command for non-executable procedures ending with ".py". `sys.executable` is now used. ([#3624][]) - failed if arguments needed more complicated quoting than simply surrounding the value with double quotes. This has been resolved for systems that support `shlex.quote`, but note that on Windows values are left unquoted. ([#3626][]) - [siblings][] now displays an informative error message if a local path is given to `--url` but `--name` isn't specified. ([#3555][]) - [sshrun][], the command DataLad uses for `GIT_SSH_COMMAND`, didn't support all the parameters that Git expects it to. ([#3616][]) - Fixed a number of Unicode py2-compatibility issues. ([#3597][]) Enhancements and new features - The [annotate-paths][] helper now caches subdatasets it has seen to avoid unnecessary calls. ([#3570][]) - A repeated configuration query has been dropped from the handling of `--proc-pre` and `--proc-post`. ([#3576][]) - Calls to `git annex find` now use `--in=.` instead of the alias `--in=here` to take advantage of an optimization that git-annex (as of the current release, 7.20190730) applies only to the former. ([#3574][]) - [addurls][] now suggests close matches when the URL or file format contains an unknown field. ([#3594][]) - Shared logic used in the setup.py files of Datalad and its extensions has been moved to modules in the _datalad_build_support/ directory. ([#3600][]) - Get ready for upcoming git-annex dropping support for direct mode ([#3631][]) * tag '0.11.7': (87 commits) DOC: Added an entry to changelogn on merged 3631 ENH: finalizing changelog for 0.11.7 TST: Update tests for a git-annex without direct mode TST: utils: Add decorator that skips when direct mode is unsupported ENH: annexrepo: Refuse to initialize in direct mode if unsupported ENH: annexrepo: Add check_direct_mode_support method BF+TST: Avoid leaking patched git-annex version TST+RF: test_annexrepo: Split up a test CHANGELOG.md: Second batch for 0.11.7 TST: run_procedure: Mark test_spaces() as known Windows failure TST: run_procedure: Mark test_quoting as known windows failure TST: run_procedure: Test more arguments that need quoting BF(py2): run_procedure: Avoid encoding error in log message TST: add run_procedure test with spaces in file name TST/RF: non-hardcoded Python executable RF: newline at end of file RF: helper instead of conditional RF: remove superfluous imports BF/TST: remove quoting ENH: replace conditionals with helper function ...
0.11.7 (Sep 02, 2019) -- python2-we-still-love-you-but-... Primarily bugfixes with some optimizations and refactorings. Fixes - [addurls][] - now provides better handling when the URL file isn't in the expected format. ([#3579][]) - always considered a relative file for the URL file argument as relative to the current working directory, which goes against the convention used by other commands of taking relative paths as relative to the dataset argument. ([#3582][]) - [run-procedure][] - hard coded "python" when formatting the command for non-executable procedures ending with ".py". `sys.executable` is now used. ([#3624][]) - failed if arguments needed more complicated quoting than simply surrounding the value with double quotes. This has been resolved for systems that support `shlex.quote`, but note that on Windows values are left unquoted. ([#3626][]) - [siblings][] now displays an informative error message if a local path is given to `--url` but `--name` isn't specified. ([#3555][]) - [sshrun][], the command DataLad uses for `GIT_SSH_COMMAND`, didn't support all the parameters that Git expects it to. ([#3616][]) - Fixed a number of Unicode py2-compatibility issues. ([#3597][]) - [download-url][] now will create leading directories of the output path if they do not exist ([#3646][]) Enhancements and new features - The [annotate-paths][] helper now caches subdatasets it has seen to avoid unnecessary calls. ([#3570][]) - A repeated configuration query has been dropped from the handling of `--proc-pre` and `--proc-post`. ([#3576][]) - Calls to `git annex find` now use `--in=.` instead of the alias `--in=here` to take advantage of an optimization that git-annex (as of the current release, 7.20190730) applies only to the former. ([#3574][]) - [addurls][] now suggests close matches when the URL or file format contains an unknown field. ([#3594][]) - Shared logic used in the setup.py files of Datalad and its extensions has been moved to modules in the _datalad_build_support/ directory. ([#3600][]) - Get ready for upcoming git-annex dropping support for direct mode ([#3631][]) * tag '0.11.7': Changelog entry for download-url paths handling ENH: downloaders: Ensure directories for target exist
Original OPT issue: #3575 . This PR addresses only one aspect, and doesn't touch upon relatively slow
.config
and.is_installed
which are yet to be figured out and probably should be properly "lazied" at the dataset level since we are doing this over and over for every decorated function in the same dataset!Those added debug statements seems to have no cost but could help determining unexpected
slow downs
For me it is cutting down time for a dummy
datalad run
from 2.9 to 2.2 sec (so ~20% boost! ;))and I do not expect negative effect on other usecases.