DM-42306: Performance optimizations #942
Codecov Report
Coverage diff (main vs. #942):
              main      #942     +/-
Coverage    88.37%    88.39%   +0.01%
Files          303       303
Lines        38979     39070     +91
Branches      8221      8238     +17
Hits         34448     34535     +87
Misses        3339      3341      +2
Partials      1192      1194      +2
Optimizations look great.
I'm not sure I understand how the transfer-from dry-run option works (maybe because I'm not sure precisely how the checking for existing datasets in the target butler works even without dry-run).
I do think we should drop the ThreadPoolExecutor commit for now.
If the relative path is coming directly from a datastore record we know it can't be something that is trying to ../ out of the datastore. The safety check in the Location constructor has significant overhead. Although with client/server this might be less true.
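As an illustration of the trade-off described above, here is a minimal sketch of a constructor that skips the path check when the caller vouches for the path. The class name `SketchLocation` and the `trusted` keyword are hypothetical and are not taken from the actual `Location` API; the check itself is only an assumed stand-in for whatever the real constructor does.

```python
import posixpath


class SketchLocation:
    """Hypothetical stand-in for a datastore location (not the real class)."""

    def __init__(self, root: str, relative: str, *, trusted: bool = False):
        if not trusted:
            # Normalizing and validating the path is the comparatively
            # costly step that a trusted caller can skip.
            normalized = posixpath.normpath(relative)
            if (
                posixpath.isabs(normalized)
                or normalized == ".."
                or normalized.startswith("../")
            ):
                raise ValueError(f"Path {relative!r} escapes the datastore root")
            relative = normalized
        self.root = root
        self.relative = relative

    @property
    def path(self) -> str:
        # Join the root and the relative part into the full path.
        return posixpath.join(self.root, self.relative)


# Paths read back from a datastore record can be marked as trusted;
# anything supplied by a user should go through the check.
from_record = SketchLocation("/datastore", "run/a.fits", trusted=True)
```

With this shape the cost of validation is paid only for paths of unknown origin, which is also why any code path that sets the flag needs to be sure where the path came from.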
This can lead to some small performance improvement when ResourcePath does not need to check itself whether something is a file or not.
This makes it easier for code to copy a Location without having to know if deepcopy should be used.
The .fields method was slow and also being called four times. Now call a grouped fields method once that returns all the answers.
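The idea of parsing once and returning every answer together can be sketched as follows. The function name `grouped_fields`, the group names, and the rule that a `?` in the format spec marks an optional field are assumptions made for this example, not the actual implementation in this pull request.

```python
from string import Formatter


def grouped_fields(template: str) -> dict[str, set[str]]:
    """Parse the template a single time and sort field names into groups."""
    groups: dict[str, set[str]] = {"standard": set(), "optional": set()}
    for _literal, field_name, format_spec, _conversion in Formatter().parse(template):
        if field_name is None:
            # Trailing literal text with no replacement field.
            continue
        # Assumption for this sketch: "?" in the format spec means optional.
        key = "optional" if format_spec and "?" in format_spec else "standard"
        groups[key].add(field_name)
    return groups


fields = grouped_fields("{run}/{visit:?}/{detector:03d}")
```

A caller that previously asked four separate questions about the template can now index into the one returned mapping, so the template string is walked only once.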
Add __deepcopy__ and __copy__ special methods that use clone()
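A minimal sketch of that pattern, using a hypothetical `CopyableLocation` class rather than the real one: both special methods delegate to `clone()`, so `copy.copy` and `copy.deepcopy` give the same independent object.

```python
import copy


class CopyableLocation:
    """Hypothetical class showing copy hooks that delegate to clone()."""

    def __init__(self, root: str, relative: str):
        self.root = root
        self.relative = relative

    def clone(self) -> "CopyableLocation":
        # Build a new, independent instance from the same components.
        return CopyableLocation(self.root, self.relative)

    def __copy__(self) -> "CopyableLocation":
        return self.clone()

    def __deepcopy__(self, memo: dict) -> "CopyableLocation":
        # The attributes here are immutable strings, so clone() is
        # already a full copy and the memo is not needed.
        return self.clone()


original = CopyableLocation("/datastore", "run/a.fits")
shallow = copy.copy(original)
deep = copy.deepcopy(original)
```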
Important now that we disable check for this in Location constructor.
This reverts commit f212bf5. We have decided that we are not ready for the thread pool speedup.
It does all the work that it would have to do for the transfer, but it doesn't copy any files or write to the target butler. I just modified it so that it reports that N datasets have already been transferred, rather than the dry run assuming that it always has to transfer everything.
Dry run is more useful if it really does report what it is going to do rather than reporting that it might transfer all the files to the target butler.
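The behavior described here can be sketched with plain dictionaries standing in for the source and target butlers. The function below is hypothetical and much simpler than the real transfer code; it only shows a dry run doing the same existence check as a real run and skipping the write.

```python
def sketch_transfer(
    source: dict[str, bytes],
    target: dict[str, bytes],
    *,
    dry_run: bool = False,
) -> tuple[int, int]:
    """Return (number to transfer, number already present in target)."""
    # The existence check runs in both modes, so the dry-run report
    # reflects what a real transfer would actually do.
    missing = [ref for ref in source if ref not in target]
    n_existing = len(source) - len(missing)
    if not dry_run:
        for ref in missing:
            target[ref] = source[ref]
    return len(missing), n_existing


source = {"a": b"1", "b": b"2", "c": b"3"}
target = {"a": b"1"}
report = sketch_transfer(source, target, dry_run=True)
```

In this sketch the dry run reports two datasets to transfer and one already present, and leaves `target` unchanged.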
No longer need to check for None.
Depends on lsst/resources#76
Checklist
doc/changes