Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BF/ENH: get_size_from_key to match specification #4081

Merged
merged 4 commits into from Feb 13, 2020

Conversation

bpoldrack
Copy link
Member

@bpoldrack bpoldrack commented Jan 24, 2020

Suck in implementation from git-annex-ria-remote.
Former implementation couldn't correctly deal with chunks for example.
See https://git-annex.branchable.com/internals/key_format/ for specification.

Suck in implementation from git-annex-ria-remote.
Former implementation couldn't correctly deal with chunks for example.
See https://git-annex.branchable.com/internals/key_format/ for specification.
@codecov
Copy link

@codecov codecov bot commented Jan 24, 2020

Codecov Report

Merging #4081 into master will increase coverage by 0.05%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4081      +/-   ##
==========================================
+ Coverage   89.71%   89.77%   +0.05%     
==========================================
  Files         274      275       +1     
  Lines       36711    37014     +303     
==========================================
+ Hits        32934    33228     +294     
- Misses       3777     3786       +9     
Impacted Files Coverage Δ
datalad/support/tests/test_due_utils.py 78.04% <0.00%> (-9.76%) ⬇️
datalad/core/distributed/clone.py 92.24% <0.00%> (-0.81%) ⬇️
datalad/cmdline/helpers.py 67.44% <0.00%> (-0.50%) ⬇️
datalad/support/annexrepo.py 86.07% <0.00%> (+0.11%) ⬆️
datalad/log.py 89.90% <0.00%> (ø) ⬆️
datalad/cmdline/main.py 77.40% <0.00%> (ø) ⬆️
datalad/core/local/run.py 98.70% <0.00%> (ø) ⬆️
datalad/interface/unlock.py 100.00% <0.00%> (ø) ⬆️
datalad/distribution/drop.py 98.79% <0.00%> (ø) ⬆️
datalad/support/exceptions.py 82.45% <0.00%> (ø) ⬆️
... and 22 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0b709d2...bb2c122. Read the comment docs.

Copy link
Contributor

@kyleam kyleam left a comment

When looking at datalad/git-annex-ria-remote#19, I recall working through this pretty closely. I'll lazily transfer my approval rather than closely looking at it again :]

My main comment is that this sort of method is very amenable to comprehensive unit tests. It'd be great to include some.

# has no 2nd field in the key

# TODO: this method can be more compact. we don't need particularly
# elaborated error distinction
Copy link
Contributor

@kyleam kyleam Jan 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this is referring to. I spot only one exception that's raised.

Copy link
Member Author

@bpoldrack bpoldrack Jan 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I don't remember. Might be a leftover from an earlier version. Will have a closer look when I write the tests for it and delete, if I can't decode that comment again (otherwise clarify or resolve it, of course) ;-)

Copy link
Member Author

@bpoldrack bpoldrack Feb 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See below. @yarikoptic kind of found what this was about. ;-)

S = int(field[1:]) if field[1:].isdigit() else None
elif field.startswith('C'):
# we have a chunk, this is it's number:
C = int(field[1:]) if field[1:].isdigit() else None
Copy link
Member

@yarikoptic yarikoptic Jan 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

am I missing some difference or all 3 lines are identical and just assign to different variables? if there is no difference, then I would have used something like

if field and field[0] in 'sSC':
    value = int(field[1:]) if field[1:].isdigit() else None
    if field[0] == 's':
        s = value
    elif ...

to address code duplication. or might have even got brave for something like

if field and field[0] in 'sSC':
    locals().update({field[0]:  int(field[1:]) if field[1:].isdigit() else None})

;-)

Copy link
Member Author

@bpoldrack bpoldrack Feb 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! And that's exactly what the mysterious comment was about. The case distinction is needed only if we want to be more specific in "parsing" errors. Since we don't (I think) in can be "compacted" this way. Doing it now.

Copy link
Member Author

@bpoldrack bpoldrack Feb 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FTR: locals() can't be updated. So, changed to a dict comprehension but updating local variables "manually".

Copy link
Member

@yarikoptic yarikoptic Feb 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FTR: locals() can't be updated.

d'oh. indeed. why did I think they could?

demo
$> cat /tmp/11.py

def func():
    c = 1
    locals().update({'c': 2})
    print("c=%s" % c)

func()

$> python /tmp/11.py
c=1

if C <= int(s / S):
return S
else:
return s % S
Copy link
Member

@yarikoptic yarikoptic Jan 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a nice little function to unittest ;)

Copy link
Member Author

@bpoldrack bpoldrack Feb 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

bpoldrack added 2 commits Feb 12, 2020
Plus: Fix a minor bug, where an invalid key would result in None instead of ValueError
@bpoldrack bpoldrack added the merge-if-ok OP considers this work done, and requests review/merge label Feb 12, 2020
# don't lookup the dict for the same things several times;
# Is there a faster (and more compact) way of doing this? Note, that
# locals() can't be updated.
s = parsed['s'] if 's' in parsed.keys() else None
Copy link
Contributor

@kyleam kyleam Feb 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parsed.get('s')?


I know this was in response to a review comment from Yarik, but I must admit that I don't find the newer version to be an improvement in terms of readability. Perhaps a compromise between this and the previous form would be an internal function that encapsulates int(field[1:]) if field[1:].isdigit() else None while keeping the expanded ifelif branches.

Anyway, it's a minor thing either way, and I'm fine with how it is. (I'm just glad you didn't use locals() :] )

Copy link
Member Author

@bpoldrack bpoldrack Feb 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parsed.get('s')

WTH? Why didn't I think of it? :-D

Copy link
Member Author

@bpoldrack bpoldrack Feb 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to get, but otherwise will leave as is. Kind of agree with the readability aspect, but it's not too bad and TBH I don't really like additional function calls for that.

@bpoldrack bpoldrack merged commit e1c3bf0 into datalad:master Feb 13, 2020
17 checks passed
@bpoldrack bpoldrack deleted the bf-size-from-key branch Sep 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge-if-ok OP considers this work done, and requests review/merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants