Add --jobs (parallel git fetch) #539
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -47,7 +47,38 @@ module Shards | |
end | ||
end | ||
|
||
private def prefetch_local_caches(lock_index, deps) | ||
count = 0 | ||
active = 0 | ||
ch = Channel(Exception?).new(deps.size + 1) | ||
deps.each do |dep| | ||
next unless lock = lock_index[dep.name]? | ||
next unless dep.matches?(lock.version) | ||
count += 1 | ||
active += 1 | ||
while active > Shards.parallel_fetch | ||
sleep 0.1 | ||
end | ||
spawn do | ||
begin | ||
dep.resolver.update_local_cache | ||
ch.send(nil) | ||
active -= 1 | ||
I'm afraid that this is not safe from races. Probably OK if
Hmmm. Thanks for catching this!
Ah yes! Atomic is what we need here. Do you want to do the change? Let me know otherwise and we'll take care.
I think this is fine. Shards is currently not intended to be built with multithreading. And I wouldn't even see a good reason to even go for that.
Well, the change seems easy enough, so I've added it here.
And in fact, another fix right here.
||
rescue ex : Exception | ||
ch.send(ex) | ||
end | ||
end | ||
end | ||
|
||
count.times do | ||
obj = ch.receive | ||
raise obj if obj.is_a? Exception | ||
m-o-e marked this conversation as resolved.
|
||
end | ||
end | ||
|
||
private def add_lock(base, lock_index, deps : Array(Dependency)) | ||
prefetch_local_caches(lock_index, deps) if Shards.parallel_fetch > 1 | ||
|
||
deps.each do |dep| | ||
if lock = lock_index[dep.name]? | ||
next unless dep.matches?(lock.version) | ||
|
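
For readers following the race discussion on `active -= 1`: a minimal sketch of the atomic-counter approach the reviewers mention, under stated assumptions. It is not the code that landed in the PR. `MAX_JOBS` and `run_jobs` are hypothetical stand-ins for `Shards.parallel_fetch` and `prefetch_local_caches`, and the job block stands in for `dep.resolver.update_local_cache`. Note that the diff as shown decrements `active` only on the success path; the sketch releases the slot in `ensure` so a failed job frees it as well.

```crystal
# Hypothetical stand-in for `Shards.parallel_fetch`.
MAX_JOBS = 2

# Runs `job` once per name, with at most MAX_JOBS fibers in flight.
def run_jobs(names : Array(String), &job : String ->) : Nil
  active = Atomic(Int32).new(0)
  # Buffered so that finished fibers never block when reporting their result.
  results = Channel(Exception?).new(names.size)

  names.each do |name|
    # Wait for a free slot; `sleep` yields so running fibers can finish.
    while active.get >= MAX_JOBS
      sleep 10.milliseconds
    end
    active.add(1)

    spawn do
      begin
        job.call(name)
        results.send(nil)
      rescue ex
        results.send(ex)
      ensure
        active.sub(1) # release the slot on success and on failure
      end
    end
  end

  # Collect one result per job and re-raise the first failure seen.
  names.size.times do
    if failure = results.receive
      raise failure
    end
  end
end
```

Only the main fiber increments the counter here, so the check-then-increment sequence is not itself contended; the atomic operations matter for the decrements performed by worker fibers if the program is ever built with multithreading.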

IIUC this works for every resolver (implementing `update_local_cache`), not just git?
If so, I'd rename it to `parallel-jobs` or sth like that.

I thought `fetch` would make the most sense for all resolvers where this applies
since it's about the data-fetching and nothing else - each VCS might call that step
something different but ultimately it's a "fetch".
It's only implemented for git atm, that's why I mentioned that one in the description.
(would replace "git" with "VCS" later, when other impls are added)
We could change the arg-name but to me `parallel-jobs` feels less to the point than `parallel-fetch`? 🤔

I don't follow. You're calling `Resolver#update_local_cache` for every resolver regardless, so how come you state it's implemented only for git?

Heck... you are right.
I got confused because my first iteration was git-specific,
I totally forgot that it now indeed applies to all of them. 🙈
Ouch, that means this PR probably breaks hg.
The hg-side will need a change similar to this.
Sigh. This makes things more complicated.
I can port the change from git.cr to hg.cr - but I haven't wrapped my head
around the spec-suite enough to see how to make a test to validate it.
It looks like the current integration test only uses git URLs (no `hg_url()` in there).
That may be the reason why it didn't catch the breakage - or maybe
it just doesn't exercise the parallelism at all.
(I thought it does because it uses multiple deps in some places -
but the fact that it didn't break for hg... confuses me 🤔)

Hm. The best I could offer for now would be to actually exclude hg from
parallelism (like I thought it already was), by doing something like:
But I feel like if the specs didn't catch the hg issue then they
might also not sufficiently cover the git-side. So even though it works
splendidly for me under real usage, spec-coverage should probably
be addressed before merging this.
If someone more familiar with the spec-suite and/or hg
has an idea on how to best do that, please don't hesitate. 😬

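The snippet referred to in the comment above did not survive on this page. As an illustration only, one way such an exclusion could look is a type check on the resolver before prefetching. Everything below is a hypothetical sketch: `Resolver`, `GitResolver`, `HgResolver`, `Dep` and `prefetchable` are simplified stand-ins defined here, not the Shards classes or the author's actual change.

```crystal
# Hypothetical, simplified stand-ins for the resolver hierarchy.
abstract class Resolver
  abstract def update_local_cache
end

class GitResolver < Resolver
  def update_local_cache : Nil
  end
end

class HgResolver < Resolver
  def update_local_cache : Nil
  end
end

# Hypothetical stand-in for a dependency entry.
record Dep, name : String, resolver : Resolver

# Keeps only the dependencies whose resolver is known to be safe to
# prefetch concurrently (git only, in this sketch).
def prefetchable(deps : Array(Dep)) : Array(Dep)
  deps.select { |dep| dep.resolver.is_a?(GitResolver) }
end
```

Dependencies filtered out this way would still be fetched by the existing sequential path, so the exclusion affects speed only, not the result.
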
Agree, but that will need test-coverage first.
For git I have some confidence that it works fine since I've been using it daily for a while, but
I've never encountered a hg-shard in the wild, so don't even have a URL to test that side manually.
Not sure when I'll have time to continue on this, so excluding Hg for now.

That's totally fine. Concurrent fetches are just an optimization without functional differences. There's no need to have this implemented for all resolvers immediately.
Wording should be more generic, though.

Resolving this convo as I believe all points are addressed.
`--jobs` to keep it generic (for consistency with `bundler` and for when we add hg in the future)
confuse users into thinking it would apply to all resolvers.
Please re-open or reply on main thread if any of these need further discussion.

I'd prefer to keep the wording more general, but mention that it only works for git currently. For example:

Ok, updated in 670cfb7
Dropped "function" and "resolver" to make it shorter.