Thread based concurrency #315
Conversation
Concurrently runs a number of `Racecar::Runner` instances in a fixed-size thread pool. Each thread starts a single `Racecar::Runner` with a `Racecar::Consumer` class instance. All threads run the same consumer class, share the same config, and consume partitions from the same topic(s).

`ThreadPoolRunnerProxy` can be combined with `ParallelRunner` to run both forks and threads. `ParallelRunner` is not used or battle-tested (at Zendesk), so combining the two is still not recommended.

Racecar does not yet implement a health-check mechanism, so an uncaught error in a single worker thread will cause the process to start a graceful shutdown of all threads before exiting with an error.

Other inclusions:

- Signal handling has been moved up one level, to the CLI
- The runner-like object interface has been standardized to `#run`, `#stop`, and `#running?`
- Tests can be run locally with Docker: `export LOCAL=1`
- Some test bugs have been fixed; connections now always close, and orphaned processes raise an exception
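To make the shape of this concrete, here is a minimal sketch of a proxy like the one described, assuming a runner factory and the `#run`/`#stop`/`#running?` interface mentioned above; names and details beyond `Racecar::Runner` are illustrative, not the PR's actual code:

```ruby
# Hypothetical sketch of the thread-pool proxy described above;
# structure and names are assumptions, not the code in this PR.
class ThreadPoolRunnerProxy
  def initialize(runner_factory, threads:)
    @runner_factory = runner_factory # builds one Racecar::Runner per thread
    @threads = threads
    @runners = []
    @pool = []
  end

  def run
    @threads.times do
      runner = @runner_factory.call
      @runners << runner
      @pool << Thread.new { runner.run } # each thread runs its own consumer
    end
    @pool.each(&:join) # block until all runner threads finish
  end

  def stop
    @runners.each(&:stop) # ask every runner to shut down gracefully
  end

  def running?
    @pool.any?(&:alive?)
  end
end
```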
```diff
@@ -223,7 +236,8 @@ def load_consumer_class(consumer_class)
       consumer_class.name.gsub(/[a-z][A-Z]/) { |str| "#{str[0]}-#{str[1]}" }.downcase,
     ].compact.join

-    self.parallel_workers = consumer_class.parallel_workers
+    self.forks = consumer_class.forks
```
I think we should avoid breaking changes unless there's a really good reason.
Agreed, it's actually aliased, which you can see further down the PR 👇
```diff
@@ -223,7 +236,8 @@ def load_consumer_class(consumer_class)
       consumer_class.name.gsub(/[a-z][A-Z]/) { |str| "#{str[0]}-#{str[1]}" }.downcase,
     ].compact.join

-    self.parallel_workers = consumer_class.parallel_workers
+    self.forks = consumer_class.forks
+    self.threads = consumer_class.threads
```
I think this should be configurable with the config itself as well; could we add a new config key and have this be e.g. `self.threads = consumer_class.threads || self.threads`?
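For illustration, a sketch of what that suggestion might look like, assuming Racecar's KingKonf-based `Config` class; the `threads` key here is the reviewer's proposal, not merged code:

```ruby
require "king_konf"

module Racecar
  class Config < KingKonf::Config
    # Hypothetical new key, per the suggestion above.
    desc "Number of consumer threads to run per process"
    integer :threads, default: 1
  end
end

# Then in load_consumer_class, prefer the class-level setting but
# fall back to whatever the config already holds:
self.threads = consumer_class.threads || self.threads
```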
```ruby
# For deprecation
def parallel_workers=(forks)
  self.forks = forks
end
```
Ah, I see there's backwards compatibility; all good!
Thanks for the PR!
I love the idea of having multi-thread support in Racecar, but I think the design needs to be changed a bit.
Specifically, I'm not so sure about having one runner per thread; I think the real value would be to have a single runner per process, but having the runner be multithreaded. This would mean a shared message fetch loop, with the loop fanning out message batches to worker threads, perhaps using bounded queues.
This would mean that each worker process would be a consumer in the consumer group, rather than each thread. And we would increase our processing concurrency without reducing the efficiency of the fetches.
My pleasure! That sounds a little more complex but totally doable. This is very much MVP concurrency, changing as little of the existing code as possible. Unless someone already has that work in progress, I can give it a try. Is this enough of an improvement to consider merging? If so, I think we could iterate towards your proposed design without breaking the API. It should also be marked experimental anyway.
I'm not sure it provides enough value as it stands – with copy-on-write, the forking approach seems better if we're using separate consumer instances anyway. I think the shared-main-loop approach should be pretty doable; if you look into Runner, you can see how batches are being processed sequentially – we would change that to an async model, with a fixed partition number -> worker thread mapping probably.
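As a rough illustration of that proposal, here is a hedged sketch of a single fetch loop fanning batches out to worker threads over bounded queues, with a fixed partition-to-thread mapping; `fetch_next_batch`, `process_batch`, and `shutting_down?` are hypothetical stand-ins, not Racecar APIs:

```ruby
# Hypothetical sketch of the proposed design: one fetch loop per process
# (so the process stays a single consumer in the group), with worker
# threads doing the processing. Not code from this PR.
NUM_WORKERS = 4
QUEUE_BOUND = 10 # bounded queues apply back-pressure to the fetch loop

queues = Array.new(NUM_WORKERS) { SizedQueue.new(QUEUE_BOUND) }

workers = queues.map do |queue|
  Thread.new do
    while (batch = queue.pop) # nil signals shutdown
      process_batch(batch)    # the consumer's own processing code
    end
  end
end

# Single shared fetch loop keeps fetches efficient.
until shutting_down?
  batch = fetch_next_batch # e.g. a poll against the underlying Kafka client
  # The same partition always maps to the same thread, preserving
  # per-partition processing order.
  queues[batch.partition % NUM_WORKERS] << batch
end

queues.each { |q| q << nil } # drain and stop workers
workers.each(&:join)
```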