Notes on my response
I've attempted to demonstrate several things in this response.
- Firstly, I've written a few straightforward tests (of the spec type, not the unit test type, mostly for simplicity and clarity). These tests are obviously not complete but are provided to demonstrate at least some thoughtful conditions I might break out from the provided specifications. If I were to do this in a business context, I would put a little more effort into how I modularized the tests, allowing us to test each of the pluggable modules against a common core test suite, decouple the tests a bit more from the implementation, and of course consider a broader array of possible error conditions, both within the code and in the tests.
- I've also made an attempt to show that optimization is a series of tradeoffs - in this case, I've provided two optimization
examples, one where a rudimentary index provides a speedup of about one order of magnitude over the original naiive
effort, and one which provides yet another order of magnitude or so over that. The tradeoff here is memory usage and
precalculation time - along with the need to more carefully re-initialize objects when changing wordlists.
- I should note here also that there is always more optimization that can be done - and that optimization is always a matter of what you are optimizing in exchange for what you are trading - are you trying to optimize for memory efficiency? You may have to trade computational efficiency. Are you trying to optimize for both? You may need to precalculate and trade off dynamic flexibility. And so forth.
- The important thing is that all of the tradeoffs and desired attributes are carefully weighed against each other to come up with a suitable strategy - and this often means benchmarking, testing, and occasionally realizing later on that new business needs require a different choice than was originally made.
- My code is not entirely refactored - deliberately so in some cases, and for expediency in responding to your challenge in others. I view refactoring as an ongoing process - a balance between completion time/total effort and the obsessive desire to optimize further.
- Another important thing I wished to demonstrate is that optimized code does not have to be unreadable or even "less readable" than unoptimized code - the why of the optimized code may not be immediately clear, but the how should still come through with little more than a careful read and some thought. I personally do not value easy readability over performance in situations where performance matters - but I believe that all code should be as concise yet clear as possible provided one does not compromise meaningfully on the performance aspects that matter. To put this point perhaps more clearly: code should be as obvious and readable as it can be and still be performant, but no more obvious than that.
- Finally, I've made a simple attempt at showing one of many methods that can be used for code modularity - having a simple switcher for which combination finder is used by the wordlist parent object. This is a very trivial example of pluggable modularity, but it is at least dynamic. Writing extensible objects is a very deep subject, often delving into creating DSLs for the extension specification, and can get messy. I prefer to use tools like the Decorator pattern rather than the more naiive configuration mechanisms where they make sense, as they can be more contextually dynamic and also layer a bit more elegantly, but I'm aware there are many other methods of providing extensible or configurable behaviour.
To run the example benchmark I've provided, which makes use of the full set of features I've written in, just type
A core installation of MRI 2.4 or later should run this just fine.
To run the provided example specs for the finder class, you'll need minitest gem installed. Then just type
ruby -Ilib:test test/wordlist/hashed_combination_finder_spec.rb (or combination_finder_spec.rb if you wish to use the slower class).
If I were to integrate this into a larger project I would need to consider the project's namespaces, the role that these objects play, and might refactor the objects differently depending on if they are in a Rails environment acting as POROs or Active Record objects. For instance, the wordlist object is rather redundant - I've created it mostly to demonstrate modularity and to wrap the text handling components. This could be rolled into the main object representing the wordlist, for example, or the combination finders could be written as or wrapped within service objects which provide useful functionality not directly related to word lists but perhaps to any object which uses compositional components.