Add maxConcurrentShards setting#30
Conversation
27338a7 to
813cfda
Compare
e59b03e to
4a60a59
Compare
Ian Streeter (istreeter)
left a comment
There was a problem hiding this comment.
As already discussed, I think this is a nice idea and I think it should work.
| } | ||
|
|
||
| // Helper function for safe semaphore release | ||
| releaseSemaphore := func() { |
There was a problem hiding this comment.
It's not my place to advise on how to arrange go code. But I think if it was me, I'd have explored a way to break everything from line 197 downwards into a separate function. So that I could do:
k.shardSemaphore <- struct{}{}
defer <-k.shardSemaphoreAnd then I no longer need to worry whether all the branches of my if statements have remembered to do releaseSemaphore().
There was a problem hiding this comment.
defer would've worked if we were in a simple function of some sort, but here, unless we refactor, it won't work for, from my understanding, defer works when the scope collapses, ie function returns/panics or for-loop completes in full (not just a single iteration); so here we need to manually release it
(alas line 222-226 would be fine with defer, just not the rest)
There was a problem hiding this comment.
Yeah agreed, what I was trying to say is... I think that small refactor is worth exploring.
There was a problem hiding this comment.
Actually, I started with that idea, and I do agree that it's better. But in implementing the refactor, I realised that it's not as simple as it looks, because we have flow control statements that are fairly fundamental to the logical flow we need to preserve.
That's not a big challenge to overcome, but I lost confidence in this approach - I felt it introduces enough scope of change to be uncomfortable doing it under time pressure. Or at least, enough in the weeds detail to be careful of, if that makes sense?
(Mostly thinking of the testing required for this feature - there are already enough variables without a possible hidden issue).
However, I'm not completely wedded to that decision. I'll take a look at it again, perhaps it's not as scary as I thought - I'll give it a look but I might still land on playing it safer :)
There was a problem hiding this comment.
Ian Streeter (@istreeter) as you suspected, after approaching it fresh, it's not as scary as I had initially seen it to be. I put the refactor into a separate PR, but only to avoid conflicts with the metrics implementation, which is separate.
It's much cleaner this way. PR is in draft atm because I used claude and want to give it a more close re-review myself, but broadly I think it's good.
There was a problem hiding this comment.
In fact, it turned out to be even better your way. I discovered that I missed releasing the semaohore when we exit on a checkpointer error - tests only sometimes fail as it depends on hitting that condition. The refactor removes the issue.
|
Merging this one, but I do plan on also adding the refactor ian mentioned - just doing so separately :) |
Ian Streeter (@istreeter) and Josh (@jbeemster) put you on the PR as an FYI, since we each discussed it