Made parallel APIs for SetOperations in Theta, and Tuple (generics).#348
Conversation
davecromberge
left a comment
There was a problem hiding this comment.
These changes look good 👍
I have a question about the use of reset to reset the state within the stateless intersection and union operations. Specifically, was this related to the bug that you found in the release candidate?
How likely is it that same union/intersection instance could be used in some concurrent context, like an RDD in Spark? This question seems to have come up in the mailing list - is it worth putting one or two lines of java doc in the SetOperation base class? Otherwise, resets may interleave.
|
"Stateless" does not imply concurrent. It just means everything is completed in one action. The term "stateless" is also used to contrast it with "stateful" where the sketch object holds state between subsequent calls to the same object. The user must be aware of this stateful property and respect it. "Stateless" also does not imply that "stateless" and "stateful" operations can be interleaved. All of our sketches are non-thread-safe, with the exception of the "Concurrent*ThetaSketch" family, which was specifically designed to be thread-safe. Perhaps this comment should be in the documentation somewhere? |
|
We already have a clear statement of "not-thread-safe" in our documentation ... do we need to say more? |
I agree with you - you raise a good point that "stateless" does not imply concurrent. This is just something that I thought of in passing, and admittedly it's a contrived scenario, based on one or two that may have come up on the mailing list. Personally, I don't see any reason to reuse a set operation builder, and in our use case we construct a virgin instance for every union or intersection operation - despite serving many concurrent requests. The documentation you have linked to is clear and explicit about the concerns raised in my comment. Thanks for. your response and additional detail. |
Made parallel APIs for SetOperations in Theta, and Tuple (generics): We now have both stateful and stateless operations for union, intersection and AnotB. Added comprehensive examples (and live tests) for these added stateless and stateful operations.
Made consistent naming of union and intersection operations in ArrayOfDoubles package. We don't have stateless operations for union and intersection in ArrayOfDoubles. This can wait for when we do major refactoring of that package.
Also added more complete testing for the inequality search methods for long arrays, which were missing.