Our approach of providing lots of methods that wrap different Git
subcommands leads to us adding one-off methods, many of which do the
bare minimum for the use case we have at that time. The minimal
implementation is a good thing if the code is meant to be used
internally because there's no point in trying to support and test
knobs we don't need, but it doesn't make for a nice external
interface, and it leads to churn and incompatibilities when we later
need to extend the method for another use case.

Let's instead expose a small set of methods that wrap simple Git
calls and differ by the kind of value they return. Introducing these
will hopefully have the following benefits:

  * It is a set of commands that we can offer to third-party callers
    if they need a command that we don't have but still want to call
    git through us.

  * Likewise, it is a set of commands we can use to avoid adding
    one-off methods.

  * We already use _git_custom_command() in many spots outside of
    gitrepo.py to avoid adding more methods to GitRepo. These new
    methods would simplify these calls.

  * If we decide to deprecate a method or change it in an incompatible
    way, we can point to these methods as an alternative.

The arguments of these new methods are pretty minimal compared to what
_git_custom_command() accepts. The idea is to keep them simple until
we need to expose more options, given we can extend the keyword
arguments in a backward compatible way. Between the new methods, the
arguments are largely consistent, but call_git() has `expect_fail`
while call_git_{oneline,items_} do not because we don't have a spot
where that'd be used yet in our code base.
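
To make the "one method per kind of return value" idea concrete,
here is a minimal sketch. It is not the GitRepo implementation:
`GitCallSketch`, `run_git()`, the `runner` argument, and the `sep`
keyword are hypothetical names chosen for illustration, and
`run_git()` merely stands in for _git_custom_command().

```python
import subprocess
from typing import Callable, Iterator, Optional, Sequence


def run_git(args: Sequence[str], cwd: str = ".") -> str:
    """Hypothetical stand-in for _git_custom_command(); returns stdout."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout


class GitCallSketch:
    """Illustrative only; not the real GitRepo class."""

    def __init__(self, path: str, runner: Callable[..., str] = run_git):
        self.path = path
        self._runner = runner  # injectable so the sketch can run without git

    def call_git(self, args: Sequence[str]) -> None:
        """Run a command for its side effects and discard its output."""
        self._runner(list(args), cwd=self.path)

    def call_git_oneline(self, args: Sequence[str]) -> str:
        """Return the single line of output; complain about anything else."""
        lines = self._runner(list(args), cwd=self.path).splitlines()
        if len(lines) != 1:
            raise ValueError(f"expected one line of output, got {len(lines)}")
        return lines[0]

    def call_git_items_(
        self, args: Sequence[str], sep: Optional[str] = None
    ) -> Iterator[str]:
        """Yield output items, split on `sep` or on line boundaries."""
        out = self._runner(list(args), cwd=self.path)
        items = out.split(sep) if sep else out.splitlines()
        for item in items:
            if item:  # skip the empty item left by a trailing separator
                yield item
```

In this sketch, a caller that needs a one-off command would write
something like `GitCallSketch(".").call_git_oneline(["rev-parse",
"HEAD"])` rather than asking for a new dedicated method.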
Note that call_git_items_() is a generator. While this isn't
particularly useful at the moment because str.split{,lines} already
put the entire list into memory, it keeps open the possibility of
changing the internal implementation to an approach that doesn't load
the entire output into memory.
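
The difference between the two approaches can be sketched as
follows. Both function names are hypothetical, and the streaming
variant is only one way such a change could look, assuming the
output were available as a text stream rather than a string.

```python
import io
from typing import Iterator, TextIO


def items_eager_(output: str) -> Iterator[str]:
    """Generator interface, but the full list is built up front."""
    # splitlines() materializes every line before the first one is yielded.
    for line in output.splitlines():
        yield line


def items_streaming_(stream: TextIO) -> Iterator[str]:
    """Same interface, reading one line at a time from a text stream."""
    for line in stream:
        yield line.rstrip("\n")


# Callers iterate the same way in both cases.
eager = list(items_eager_("a.txt\nb.txt\n"))
streamed = list(items_streaming_(io.StringIO("a.txt\nb.txt\n")))
```

Because callers only ever iterate over the result, swapping one
implementation for the other would not require changes on their side.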
All the methods call _git_custom_command() with check_fake_dates=True.
This will be unnecessary for nearly all repositories, but not doing so
risks leaking dates in repositories configured to use fake dates. The
other option would be to add a check_fake_dates parameter to these
methods, which is ugly because this is an obscure parameter that most
callers should not have to worry about. Unconditionally using
check_fake_dates=True costs an attribute lookup and a
.config.getbool() call on the first use and then a _fake_dates_enabled
attribute lookup on all remaining uses for that instance. Here are
timings of an example command, run in the datalad repo:

  % python -m timeit -n100 \
      -s "from datalad.support.gitrepo import GitRepo" \
      -s "repo = GitRepo('.')" -- "repo.call_git(['ls-files'])"

Results of two runs with `check_fake_dates=False`:

  1) 100 loops, best of 5: 3.87 msec per loop
  2) 100 loops, best of 5: 3.87 msec per loop

Results of two runs with `check_fake_dates=True`:

  1) 100 loops, best of 5: 3.97 msec per loop
  2) 100 loops, best of 5: 3.96 msec per loop

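
The small overhead is consistent with the check being cached per
instance, as described above. A rough sketch of that pattern follows;
`FakeConfig`, `FakeDatesSketch`, and the configuration key are
assumptions made for illustration and are not taken from the datalad
code base.

```python
class FakeConfig:
    """Hypothetical config object that counts getbool() calls."""

    def __init__(self, values):
        self._values = values
        self.getbool_calls = 0

    def getbool(self, section, option, default=False):
        self.getbool_calls += 1
        return bool(self._values.get(f"{section}.{option}", default))


class FakeDatesSketch:
    """Illustrative only; caches the fake-dates setting on first use."""

    def __init__(self, config):
        self.config = config
        self._fake_dates_enabled = None  # not looked up yet

    @property
    def fake_dates_enabled(self):
        if self._fake_dates_enabled is None:
            # First use: one config lookup (the key name is a guess).
            self._fake_dates_enabled = self.config.getbool(
                "datalad", "fake-dates", default=False
            )
        # Later uses: a plain attribute lookup.
        return self._fake_dates_enabled
```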
Closes datalad#3789.