Only batch fetch when there is a sufficiently large number requested #316

Merged
4 commits merged into master from bypass-batched-fetching on Jul 27, 2018

Conversation

@dturn commented Jul 12, 2018

Polling can take a long time for small resource groups when the cluster contains a large number of resources of that type. Don't batch fetch when the number of resources being tracked is small.

fixes: #314
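
In outline, the change gates the cache-warming list fetches behind a size check. A minimal sketch of the idea (the clear_cache and fetch_by_kind helper names and the exact guard shape are assumptions, not the merged implementation):

    LARGE_BATCH_THRESHOLD = 5

    def sync(resources)
      clear_cache
      if resources.count > LARGE_BATCH_THRESHOLD
        # One list request per resource kind warms the cache up front.
        resources.map(&:type).uniq.each { |kind| fetch_by_kind(kind) }
      end
      # With a cold cache, each resource falls back to fetching itself individually.
      resources.each { |r| r.sync(self) }
    end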

dturn commented Jul 12, 2018

The PR that added batch fetching was #251. It doesn't look like the tests had major changes.

When LARGE_BATCH_THRESHOLD is set to 5, all tests pass (5 was a number out of a hat).

When set to -1:

  • 5 unit tests fail with expected errors, e.g.:

    Minitest::Assertion: unexpected invocation: #<AnyInstance:KubernetesDeploy::Kubectl>.run("get", "BadCitizen", "foo", "-a", "--output=json")
    unsatisfied expectations:
    - expected exactly once, not yet invoked: #<AnyInstance:KubernetesDeploy::Kubectl>.run("get", "BadCitizen", "-a", "--output=json")

dturn commented Jul 13, 2018

@Shopify/cloudx this is ready for 👀

@@ -1,6 +1,8 @@
 # frozen_string_literal: true
 module KubernetesDeploy
   class SyncMediator
+    LARGE_BATCH_THRESHOLD = 5
Contributor

What makes you choose 5? I would expect a much higher number. We fetch 8-way parallel, so I'd be inclined to define the threshold in relation to how many batches of concurrent requests it will take... maybe 3?

On the other hand, what do you think about applying the threshold per class rather than globally? E.g. maybe you have 15 things, but they're each of a different kind, so the batched fetching doesn't actually help. I'm thinking it would be more accurate in terms of using the batch strategy when it actually helps, but it would need to take sync dependencies into account and definitely be more complicated.
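
A rough illustration of the per-kind alternative being floated here (hypothetical; it also ignores the sync dependencies mentioned above):

    # Hypothetical per-kind check: only warm the cache for kinds that actually
    # have enough instances to make a list request worthwhile.
    resources.group_by(&:type).each do |kind, group|
      fetch_by_kind(kind) if group.count > LARGE_BATCH_THRESHOLD
    end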

Contributor Author

5 was a number out of a hat; I didn't want to get bogged down in picking it before seeing how the code looked. I think 3 x parallelism makes sense.

I agree we could get a lot more sophisticated here, but it seems like either you're a small app that won't batch, or you're giant and the extra logic won't make a difference in most cases.
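
If the threshold were tied to the request parallelism as suggested, it could be expressed along these lines (the MAX_CONCURRENCY name is an assumption; the 8 comes from the "8-way parallel" figure above):

    MAX_CONCURRENCY = 8                           # individual GETs issued in parallel
    LARGE_BATCH_THRESHOLD = 3 * MAX_CONCURRENCY   # roughly three full rounds of parallel fetches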


def test_sync_does_not_batch_with_few_resources
  stub_kubectl_response('get', 'FakeConfigMap', @fake_cm.name, *@params,
    resp: { "items" => [@fake_cm.kubectl_response] }, times: 0)
Contributor

times: 0 seems to still expect a request, but not return a valid response from it... why? I'd think we'd want a .expects.never here on a list call (the one here is an instance call for @fake_cm), and have an actual stub on the instance API call we expect rather than the expects below, to show what happened to the cache (it wasn't populated).
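
A sketch of the shape being suggested (using Mocha's expects/never and this test file's stub_kubectl_response helper; the list-call arguments are inferred from the error output earlier in the thread, not confirmed):

    def test_sync_does_not_warm_cache_with_few_resources
      # The batched list call (no resource name) must never be issued...
      KubernetesDeploy::Kubectl.any_instance.expects(:run)
        .with('get', 'FakeConfigMap', *@params).never
      # ...while the per-instance fetch is stubbed with a real response, showing the
      # resource syncs itself because the cache was not populated.
      stub_kubectl_response('get', 'FakeConfigMap', @fake_cm.name, *@params,
        resp: @fake_cm.kubectl_response)
      mediator.sync([@fake_cm])
    end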

  mediator.sync(test_resources)
end

def test_sync_does_batch_with_enough_resources
Contributor

What value does this add beyond test_sync_caches_the_types_of_the_dependencies_of_the_instances_given and test_sync_caches_the_types_of_the_instances_given (which should probably have their names amended)? Because this test is stubbing the instance syncs, it isn't actually proving anything about the cache, just that the list api call was made (which those existing tests also do).

  mediator.sync(test_resources)
end

def test_sync_does_not_batch_with_few_resources
Contributor

nit: I know I talked about this in terms of batching, but the real behaviour difference might be better described in terms of whether or not sync warms the type cache.

Contributor Author

That makes a lot more sense.

@KnVerey left a comment

I have some suggestions re: tests, but the change looks good!

end

def test_sync_does_not_warm_cache_with_few_resources
  KubernetesDeploy::Kubectl.any_instance.expects(:run).with('get', 'FakeConfigMap', @fake_cm.name, *@params).never
Contributor

I guess it doesn't really matter because an exception will be raised if either request is attempted, but technically the cache warming would be requesting the list, not the instance.


-    test_resources = [@fake_pod, @fake_cm, @fake_cm2, @fake_deployment]
+    test_resources = [@fake_cm]
Contributor

Let's make this test the relevant edge by multiplying it by KubernetesDeploy::SyncMediator::LARGE_BATCH_THRESHOLD.
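
For example (assuming the guard only batches above the threshold, so a list of exactly LARGE_BATCH_THRESHOLD resources exercises the boundary):

    # Repeats the same fake resource LARGE_BATCH_THRESHOLD times.
    test_resources = [@fake_cm] * KubernetesDeploy::SyncMediator::LARGE_BATCH_THRESHOLD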


-    test_resources = [@fake_pod, @fake_cm, @fake_cm2, @fake_deployment]
+    test_resources = [@fake_cm]
     test_resources.each { |r| r.expects(:sync).once }
     mediator.sync(test_resources)
Contributor

For additional proof, we could add a request stub after this and do mediator.get_instance('FakeConfigMap', @fake_cm.name) to show the cache is not warm (kinda the opposite of what the cache-is-warm tests above do).
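
Roughly (a sketch; it assumes a cold cache makes get_instance issue its own kubectl request, which the stub below would then serve):

    mediator.sync(test_resources)
    # If sync had warmed the cache, no request would be needed here; stubbing one
    # and seeing it consumed shows the cache was left cold.
    stub_kubectl_response('get', 'FakeConfigMap', @fake_cm.name, *@params,
      resp: @fake_cm.kubectl_response, times: 1)
    mediator.get_instance('FakeConfigMap', @fake_cm.name)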

dturn commented Jul 26, 2018

ping

@karanthukral left a comment

LGTM

@dturn merged commit 7b47ed8 into master on Jul 27, 2018
@dturn deleted the bypass-batched-fetching branch on July 27, 2018 15:32