-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Type-annotate datalad/support/gitrepo.py
#7341
Conversation
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## maint #7341 +/- ##
==========================================
+ Coverage 88.30% 88.38% +0.08%
==========================================
Files 327 327
Lines 44629 44869 +240
Branches 0 5916 +5916
==========================================
+ Hits 39409 39658 +249
+ Misses 5220 5196 -24
- Partials 0 15 +15
... and 51 files with indirect coverage changes Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few possible suggestions, otherwise looks great, thank you @jwodder !
@@ -214,8 +253,12 @@ def _normalize_path(base_dir, path): | |||
return relpath(path, start=base_dir) | |||
|
|||
|
|||
class _WithPath(Protocol): | |||
path: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not to move it to datalad/typing.py
as plain WithPath
and may be even WithPathT
since it seems you use those whenever construct is not clearly a type, but could be some class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there functions in other files that operate on any type that has a path
attribute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would guess so, in particular in annexrepo.py
since that one is close companion of this one. But also lots of files listed in hit for git grep -l '\.path\>'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- There are only two top-level functions in
annexrepo.py
, and they both operate on IO handles rather thanWithPath
. - A large number of hits for that regex doesn't mean that
WithPath
will be useful; most of those can be expected to operate on specific classes.
@@ -397,21 +440,23 @@ class GitProgress(WitlessProtocol): | |||
CHECKING_OUT: ("Check out", "Things"), | |||
} | |||
|
|||
__slots__ = ('_seen_ops', '_pbars', '_encoding') | |||
__slots__ = ('_unprocessed', '_seen_ops', '_pbars') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am surprised that it didn't error out before somehow since I was expecting that if __slots__
specified then assignment of unknown ones to lead to an error. oh well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GitProgress
inherits from WitlessProtocol
, which does not use __slots__
, and so GitProgress
will still have a __dict__
attribute and support assignment of arbitrary attributes.
commitishes = commitishes + [self.get_active_branch()] | ||
branch = self.get_active_branch() | ||
if branch is None: | ||
raise ValueError("Single commitish provided and no active branch") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Note: in prior cases we would have got TypeError
though but since it is indeed illegit and unlikely invocation , should be ok. Just FTR
❯ python -c 'from datalad.support.gitrepo import GitRepo; print(GitRepo(".").call_git_oneline(["merge-base", "master", None]))'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/yoh/proj/datalad/datalad-maint/datalad/dataset/gitrepo.py", line 550, in call_git_oneline
lines = list(self.call_git_items_(args, files=files,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/yoh/proj/datalad/datalad-maint/datalad/dataset/gitrepo.py", line 517, in call_git_items_
for file_no, line in self._generator_call_git(
File "/home/yoh/proj/datalad/datalad-maint/datalad/dataset/gitrepo.py", line 345, in _generator_call_git
generator = self._git_runner.run(
^^^^^^^^^^^^^^^^^^^^^
File "/home/yoh/proj/datalad/datalad-maint/datalad/runner/runner.py", line 206, in run
results_or_iterator = threaded_runner.run()
^^^^^^^^^^^^^^^^^^^^^
File "/home/yoh/proj/datalad/datalad-maint/datalad/runner/nonasyncrunner.py", line 343, in run
return self._locked_run()
^^^^^^^^^^^^^^^^^^
File "/home/yoh/proj/datalad/datalad-maint/datalad/runner/nonasyncrunner.py", line 403, in _locked_run
self.process = Popen(self.cmd, **kwargs) # nosec
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/subprocess.py", line 1024, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "/usr/lib/python3.11/subprocess.py", line 1834, in _execute_child
self.pid = _fork_exec(
^^^^^^^^^^^
TypeError: expected str, bytes or os.PathLike object, not NoneType
if state in ('clean', 'modified', 'deleted', None): | ||
# assign previous gitsha to any record | ||
# state==None can only happen for subdatasets that | ||
# already existed, so also assign a sha for them | ||
assert from_sha is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not 100% sure it must be not a None
, but ok - let's hope it is so
|
||
class GitAddOutput(TypedDict): | ||
file: str | ||
success: bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note to myself and other @datalad/developers -- TypedDict
is what would save our sanity in formalizing our return record dictionaries, which we produce using https://github.com/datalad/datalad/blob/HEAD/datalad/interface/results.py#L51 . TypedDict
allows for NotRequired
(not sure why not Optional
) annotation for keys which do not have to be present.
if 'error-messages' in r else None, | ||
#message='\n'.join(r['error-messages']) | ||
#if 'error-messages' in r else None, | ||
message=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: indeed r
cannot have error-messages
in its return since it is
@staticmethod
def _process_git_get_output(stdout, stderr=None):
"""Given both outputs (stderr is ignored atm) of git add - process it
Primarily to centralize handling in both indirect annex and direct
modes when ran through proxy
"""
return [{u'file': f, u'success': True}
for f in re.findall("'(.*)'[\n$]", ensure_unicode(stdout))]
so functionally this change would be correct and without side-effect.
ok, let's proceed! I had another look and although there are some changes to code, I think overall it just would make it more robust and prevent some loose/buggy behavior possible (but unlikely to be triggered) before. |
PR released in |
Part of #6884.