Add maxConcurrentShards setting #30

Merged
colmsnowplow merged 1 commit into release/1.7.0 from feat/maxConcurrentShards on Sep 5, 2025

@colmsnowplow colmsnowplow commented Sep 2, 2025

Ian Streeter (@istreeter) and Josh (@jbeemster), I've put you on the PR as an FYI, since we each discussed it.

@colmsnowplow colmsnowplow force-pushed the feat/maxConcurrentShards branch from e59b03e to 4a60a59 Compare September 3, 2025 16:35
@colmsnowplow colmsnowplow changed the base branch from feat/getrecordslimit to release/1.7.0 September 3, 2025 16:36
@colmsnowplow colmsnowplow marked this pull request as ready for review September 3, 2025 16:36

@oo-sleeper Mikhail (oo-sleeper) left a comment

LGTM

@istreeter Ian Streeter (istreeter) left a comment

As already discussed, I think this is a nice idea and I think it should work.

}

// Helper function for safe semaphore release
releaseSemaphore := func() {

It's not my place to advise on how to arrange go code. But I think if it was me, I'd have explored a way to break everything from line 197 downwards into a separate function. So that I could do:

k.shardSemaphore <- struct{}{}
defer func() { <-k.shardSemaphore }()

And then I no longer need to worry whether all the branches of my if statements have remembered to do releaseSemaphore().

defer would've worked if we were in a simple function of some sort, but here it won't work unless we refactor. From my understanding, deferred calls run only when the enclosing function returns or panics, not at the end of each loop iteration, so here we need to release the semaphore manually.

(Lines 222-226 would be fine with defer, just not the rest.)

Yeah agreed, what I was trying to say is... I think that small refactor is worth exploring.

Author

Actually, I started with that idea, and I do agree that it's better. But in implementing the refactor, I realised that it's not as simple as it looks, because we have flow control statements that are fairly fundamental to the logical flow we need to preserve.

That's not a big challenge to overcome, but I lost confidence in this approach: I felt it introduced enough scope of change that I was uncomfortable doing it under time pressure. Or at least, enough in-the-weeds detail to be careful of, if that makes sense?

(Mostly thinking of the testing required for this feature - there are already enough variables without a possible hidden issue).

However, I'm not completely wedded to that decision. I'll take another look at it; perhaps it's not as scary as I thought, though I might still land on playing it safer :)

Author

@colmsnowplow colmsnowplow Sep 4, 2025

Ian Streeter (@istreeter), as you suspected, after approaching it fresh, it's not as scary as I had initially thought. I put the refactor into a separate PR, but only to avoid conflicts with the metrics implementation, which is separate.

It's much cleaner this way. The PR is in draft at the moment because I used Claude and want to give it a closer re-review myself, but broadly I think it's good.

Author

@colmsnowplow colmsnowplow Sep 5, 2025

In fact, it turned out to be even better your way. I discovered that I had missed releasing the semaphore when we exit on a checkpointer error. The tests only sometimes fail, because the failure depends on hitting that condition. The refactor removes the issue.

@colmsnowplow
Author

Merging this one, but I do plan on also adding the refactor Ian mentioned, just doing so separately :)

@colmsnowplow colmsnowplow merged commit 7f01690 into release/1.7.0 Sep 5, 2025
1 of 2 checks passed
@colmsnowplow colmsnowplow deleted the feat/maxConcurrentShards branch September 5, 2025 13:13
colmsnowplow added a commit that referenced this pull request Sep 9, 2025
colmsnowplow added a commit that referenced this pull request Sep 12, 2025

3 participants