Only batch fetch when there is a sufficiently large number requested #316

Merged
4 commits merged into master from bypass-batched-fetching on Jul 27, 2018

Conversation

@dturn commented Jul 12, 2018

Polling can take a long time for small resource groups when the cluster contains a large number of resources of that type. Don't batch fetch when the number of resources being tracked is small.

fixes: #314
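
In outline, the change gates the cache-warming list fetches behind a size check. A minimal sketch of the idea (the clear_cache and fetch_by_kind helper names and the exact guard shape are assumptions, not the merged implementation):

    LARGE_BATCH_THRESHOLD = 5

    def sync(resources)
      clear_cache
      if resources.count > LARGE_BATCH_THRESHOLD
        # One list request per resource kind warms the cache up front.
        resources.map(&:type).uniq.each { |kind| fetch_by_kind(kind) }
      end
      # With a cold cache, each resource falls back to fetching itself individually.
      resources.each { |r| r.sync(self) }
    end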

dturn commented Jul 12, 2018

The PR that added batch fetching was #251. It doesn't look like the tests had major changes.

When LARGE_BATCH_THRESHOLD is set to 5, all tests pass (5 was a number out of a hat).

When set to -1:

  • 5 unit tests fail with expected errors, e.g.:

    Minitest::Assertion: unexpected invocation: #<AnyInstance:KubernetesDeploy::Kubectl>.run("get", "BadCitizen", "foo", "-a", "--output=json")
    unsatisfied expectations:
    - expected exactly once, not yet invoked: #<AnyInstance:KubernetesDeploy::Kubectl>.run("get", "BadCitizen", "-a", "--output=json")

dturn commented Jul 13, 2018

@Shopify/cloudx this is ready for 👀

@@ -1,6 +1,8 @@
 # frozen_string_literal: true
 module KubernetesDeploy
   class SyncMediator
+    LARGE_BATCH_THRESHOLD = 5
Contributor

What makes you choose 5? I would expect a much higher number. We fetch 8-way parallel, so I'd be inclined to define the threshold in relation to how many batches of concurrent requests it will take... maybe 3?

On the other hand, what do you think about applying the threshold per class rather than globally? E.g. maybe you have 15 things, but they're each of a different kind, so the batched fetching doesn't actually help. I'm thinking it would be more accurate in terms of using the batch strategy when it actually helps, but it would need to take sync dependencies into account and definitely be more complicated.
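
A rough illustration of the per-kind alternative being floated here (hypothetical; it also ignores the sync dependencies mentioned above):

    # Hypothetical per-kind check: only warm the cache for kinds that actually
    # have enough instances to make a list request worthwhile.
    resources.group_by(&:type).each do |kind, group|
      fetch_by_kind(kind) if group.count > LARGE_BATCH_THRESHOLD
    end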

Contributor Author

5 was a number out of a hat; I didn't want to get bogged down in picking it before seeing how the code looked. I think 3 x parallelism makes sense.

I agree we could get a lot more sophisticated here, but it seems like either you're a small app that won't batch, or you're giant and the extra logic won't make a difference in most cases.
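
If the threshold were tied to the request parallelism as suggested, it could be expressed along these lines (the MAX_CONCURRENCY name is an assumption; the 8 comes from the "8-way parallel" figure above):

    MAX_CONCURRENCY = 8                           # individual GETs issued in parallel
    LARGE_BATCH_THRESHOLD = 3 * MAX_CONCURRENCY   # roughly three full rounds of parallel fetches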


def test_sync_does_not_batch_with_few_resources
  stub_kubectl_response('get', 'FakeConfigMap', @fake_cm.name, *@params,
    resp: { "items" => [@fake_cm.kubectl_response] }, times: 0)
Contributor

times: 0 seems to still expect a request, but not return a valid response from it... why? I'd think we'd want a .expects.never here on a list call (the one here is an instance call for @fake_cm), and have an actual stub on the instance API call we expect rather than the expects below, to show what happened to the cache (it wasn't populated).
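
A sketch of the shape being suggested (using Mocha's expects/never and this test file's stub_kubectl_response helper; the list-call arguments are inferred from the error output earlier in the thread, not confirmed):

    def test_sync_does_not_warm_cache_with_few_resources
      # The batched list call (no resource name) must never be issued...
      KubernetesDeploy::Kubectl.any_instance.expects(:run)
        .with('get', 'FakeConfigMap', *@params).never
      # ...while the per-instance fetch is stubbed with a real response, showing the
      # resource syncs itself because the cache was not populated.
      stub_kubectl_response('get', 'FakeConfigMap', @fake_cm.name, *@params,
        resp: @fake_cm.kubectl_response)
      mediator.sync([@fake_cm])
    end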

  mediator.sync(test_resources)
end

def test_sync_does_batch_with_enough_resources
Contributor

What value does this add beyond test_sync_caches_the_types_of_the_dependencies_of_the_instances_given and test_sync_caches_the_types_of_the_instances_given (which should probably have their names amended)? Because this test is stubbing the instance syncs, it isn't actually proving anything about the cache, just that the list api call was made (which those existing tests also do).

  mediator.sync(test_resources)
end

def test_sync_does_not_batch_with_few_resources
Contributor

nit: I know I talked about this in terms of batching, but the real behaviour difference might be better described in terms of whether or not sync warms the type cache.

Contributor Author

That makes a lot more sense.

@KnVerey left a comment

I have some suggestions re: tests, but the change looks good!

end

def test_sync_does_not_warm_cache_with_few_resources
  KubernetesDeploy::Kubectl.any_instance.expects(:run).with('get', 'FakeConfigMap', @fake_cm.name, *@params).never
Contributor

I guess it doesn't really matter because an exception will be raised if either request is attempted, but technically the cache warming would be requesting the list, not the instance.


-    test_resources = [@fake_pod, @fake_cm, @fake_cm2, @fake_deployment]
+    test_resources = [@fake_cm]
Contributor

Let's make this test the relevant edge by multiplying it by KubernetesDeploy::SyncMediator::LARGE_BATCH_THRESHOLD.
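
For example (assuming the guard only batches above the threshold, so a list of exactly LARGE_BATCH_THRESHOLD resources exercises the boundary):

    # Repeats the same fake resource LARGE_BATCH_THRESHOLD times.
    test_resources = [@fake_cm] * KubernetesDeploy::SyncMediator::LARGE_BATCH_THRESHOLD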


-    test_resources = [@fake_pod, @fake_cm, @fake_cm2, @fake_deployment]
+    test_resources = [@fake_cm]
     test_resources.each { |r| r.expects(:sync).once }
     mediator.sync(test_resources)
Contributor

For additional proof, we could add a request stub after this and do mediator.get_instance('FakeConfigMap', @fake_cm.name) to show the cache is not warm (kinda the opposite of what the cache-is-warm tests above do).
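
Roughly (a sketch; it assumes a cold cache makes get_instance issue its own kubectl request, which the stub below would then serve):

    mediator.sync(test_resources)
    # If sync had warmed the cache, no request would be needed here; stubbing one
    # and seeing it consumed shows the cache was left cold.
    stub_kubectl_response('get', 'FakeConfigMap', @fake_cm.name, *@params,
      resp: @fake_cm.kubectl_response, times: 1)
    mediator.get_instance('FakeConfigMap', @fake_cm.name)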

dturn commented Jul 26, 2018

ping

@karanthukral left a comment

LGTM

@dturn merged commit 7b47ed8 into master on Jul 27, 2018
@dturn deleted the bypass-batched-fetching branch on July 27, 2018 15:32