ExtendableCoordinateSequence #271
and unit test
Completed unit test Added performance test Signed-off-by: Felix Obermaier <felix.obermaier@netcologne.de>
Added add and insertAt functions to ExtendableCoordinateSequence Completed unit test Added performance test Signed-off-by: Felix Obermaier <felix.obermaier@netcologne.de>
…Obermaier/jts into enhancement/ExtendableSequence
Added test to testConstructor, made other test use PackedCoordinateSequence.Float Signed-off-by: Felix Obermaier <felix.obermaier@netcologne.de>
Added comparison to current behavior, ArrayList storing Coordinates and creating a sequence from the resulting Coordinate array. Fixed performance number initialization, added overtime in percent Sort methods in PerformanceTestRunner.findMethods
Another change to the ensureCapacity logic Tweaked output of performance test Changed PerformanceTestRunner to use System.nanoTime() instead of StopWatch Signed-off-by: Felix Obermaier <felix.obermaier@netcologne.de>
 * ensures that a sequence is large enough to store ordinate
 * values.
 */
public class ExtendableCoordinateSequence implements CoordinateSequence, Serializable {
This class name kind of conflicts with ExtendedCoordinateSequence.java which we have noticed is in common use by downstream projects.
Is there any other kind of naming convention we could use rather than "Extendable"? I know for collections I often use CopyOnWriteArrayList (not the best naming convention).
Perhaps we could use the word delegate since it is delegating storage to another csFactory?
I have no problem with a different name at all. Perhaps something that includes Builder (as in StringBuilder) or Buffer would describe it better, but CoordinateSequenceBuilderSequence is weird. How about BufferedCoordinateSequence?
DynamicResizingCoordinateSequence? AutoResizingCoordinateSequence? ResizingCoordinateSequence? ExpandableCoordinateSequence? ExpandingCoordinateSequence? VariableSizeCoordinateSequence?
|
This is really interesting functionality, see comment for question about the naming of this class. |
|
Is there an interest in pursuing this any further? |
|
@FObermaier I think one of the philosophical questions at play is whether CoordinateSequences should be mutable or immutable. I know this has come up in conversations with @jodygarnett and @dr-jts, but my memory is faulty and I don't remember which position they'd take or whether we've reached a conclusion. I do know that Jody worked with Martin to add the interface so that downstream users of JTS could do things which make sense. That said, I'd like to see the JTS library have some 'collections-like API' support for common operations with CoordinateSequences. For the case of immutable CoordinateSequences, some obvious operations like concatenating and reversing sequences could be implemented. I think the next step is Jody and Martin weighing in on mutability. If that goes ahead, then I think this could make sense. |
|
Generally speaking I think it is less error-prone to have Geometry be immutable. That said, it seems a bit draconian to enforce this 100%. And there are other uses for mutable CoordinateSequences. A prime example being when Geometry is being constructed from a stream of incoming data (as occurs for example in some of the Readers). Perhaps it might have been better to split the read and update accessors for CoordinateSequence into two separate interfaces, so that immutable usage was more obvious (and enforceable). |
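To make the idea of splitting the read and update accessors concrete, here is a minimal sketch. It is not JTS API: the names ReadableOrdinates, WritableOrdinates, ArrayOrdinates and readOnlyView are hypothetical and chosen only for illustration. Code that should not mutate a sequence is handed the read-only type, so the compiler enforces the immutable usage.

```java
// Hypothetical names throughout; none of these types exist in JTS.

/** Read accessors only. */
interface ReadableOrdinates {
    int size();
    double getOrdinate(int index, int ordinateIndex);
}

/** Update accessors, layered on top of the read accessors. */
interface WritableOrdinates extends ReadableOrdinates {
    void setOrdinate(int index, int ordinateIndex, double value);
}

/** Simple packed-array implementation of the writable interface. */
final class ArrayOrdinates implements WritableOrdinates {
    private final double[] data;
    private final int dimension;

    ArrayOrdinates(int size, int dimension) {
        this.dimension = dimension;
        this.data = new double[size * dimension];
    }

    public int size() { return data.length / dimension; }

    public double getOrdinate(int index, int ordinateIndex) {
        return data[index * dimension + ordinateIndex];
    }

    public void setOrdinate(int index, int ordinateIndex, double value) {
        data[index * dimension + ordinateIndex] = value;
    }

    /** Returns a view that cannot be cast back to WritableOrdinates. */
    ReadableOrdinates readOnlyView() {
        final ArrayOrdinates self = this;
        return new ReadableOrdinates() {
            public int size() { return self.size(); }
            public double getOrdinate(int index, int ordinateIndex) {
                return self.getOrdinate(index, ordinateIndex);
            }
        };
    }
}
```

A reader could fill an ArrayOrdinates while parsing and then pass only readOnlyView() onwards; whether such a split would be practical in JTS itself is a separate question.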
|
As far as I recall, the This PR only addresses that currently adding |
|
Is this really the most efficient way to support streaming construction of |
Before I noticed that Felix had started work on this, I had started work on an alternative way to build immutable (well, fixed-length) coordinate sequences using a stream of coordinate data. My solution looked like this (C#). I have a few reasons to believe that that solution is going to be more efficient, in the .NET version, than creating and expanding coordinate sequences using a factory, but I admittedly haven't run any numbers to compare the two ideas, and I don't know how many of my reasons would be equally valid on the JVM. I don't really want to volunteer to test the Java side of things, but if it helps, I could try to port this to .NET and compare the performance of the two. |
|
@airbreather That seems along the lines of what I was thinking about. Is it more efficient to create the separate ordinate lists, rather than a list of Coordinate objects? I guess it would be if they are optimized to store doubles as native values. |
|
The design that @airbreather referenced seems like a more efficient and safer way of creating an arbitrary The only downside might be that the CoordinateBuffer implementation is limited to a max of 4 dimensions. But this should be adequate to support WKT and WKB reading, which are primary use cases. |
The main reason I did it this way was so that users don't have to pay for Z or M values that aren't actually used. Here, if all you need is XY, then you only pay 16 bytes per coordinate (plus the extra room in the array list).
There's no fundamental reason why it couldn't contain an arbitrary number of dimensions, it just seemed like the extra flexibility wasn't worth the cost when I wrote this up. |
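The "only pay for what you use" point can be illustrated with a small sketch. This is not the CoordinateBuffer code referenced above (that is C# and is not reproduced here); the class XyzColumns and all of its members are hypothetical. It keeps X and Y in primitive double arrays (16 bytes per coordinate plus spare capacity) and allocates the Z column only once a real Z value arrives.

```java
// Hypothetical sketch; not the CoordinateBuffer implementation discussed above.
final class XyzColumns {
    private double[] xs = new double[8];
    private double[] ys = new double[8];
    private double[] zs;   // stays null until a Z value is actually supplied
    private int count;

    void add(double x, double y) { add(x, y, Double.NaN); }

    void add(double x, double y, double z) {
        if (count == xs.length) grow();
        if (zs == null && !Double.isNaN(z)) {
            // First real Z: allocate the column and mark earlier coordinates as "no Z".
            zs = new double[xs.length];
            for (int i = 0; i < count; i++) zs[i] = Double.NaN;
        }
        xs[count] = x;
        ys[count] = y;
        if (zs != null) zs[count] = z;
        count++;
    }

    private void grow() {
        int newLength = xs.length + (xs.length >> 1);
        xs = copy(xs, newLength);
        ys = copy(ys, newLength);
        if (zs != null) zs = copy(zs, newLength);
    }

    private static double[] copy(double[] src, int newLength) {
        double[] dst = new double[newLength];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    int size() { return count; }
    boolean hasZ() { return zs != null; }
    double getZ(int i) { return zs == null ? Double.NaN : zs[i]; }

    /** Packs the collected values into one fixed-length x,y[,z] array. */
    double[] toPackedArray() {
        int dim = hasZ() ? 3 : 2;
        double[] packed = new double[count * dim];
        for (int i = 0; i < count; i++) {
            packed[i * dim] = xs[i];
            packed[i * dim + 1] = ys[i];
            if (dim == 3) packed[i * dim + 2] = zs[i];
        }
        return packed;
    }
}
```

The packed array produced at the end has a fixed length, which matches the "fixed-length sequence built from a stream" use case; an M column could be handled the same way as Z.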
* Added GrowableCoordinateSequence * Tweaked ExtendableCoordinateSequence's add methods and the ensureCapacity functionality Signed-off-by: Felix Obermaier <felix.obermaier@netcologne.de>
Signed-off-by: Felix Obermaier <felix.obermaier@netcologne.de>
|
I admit that You can read the figures of a comparison test I created in this file: ExpandableVsGrowable.txt Annotations:
|
* newCapacity = oldCapacity + (oldCapacity >> 1) * implement add(2d) and add(3d) Signed-off-by: Felix Obermaier <felix.obermaier@netcologne.de>
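For readers unfamiliar with the growth rule named in this commit, the following is a minimal sketch of how newCapacity = oldCapacity + (oldCapacity >> 1) is typically applied. It is not the PR's code: the class GrowableOrdinates and its packed 2D layout are assumptions made for illustration, and integer overflow for very large arrays is ignored.

```java
// Hypothetical sketch of the 1.5x growth rule; not taken from the PR.
final class GrowableOrdinates {
    private double[] data;
    private int size;   // number of ordinate values currently in use

    GrowableOrdinates(int initialCapacity) {
        data = new double[Math.max(1, initialCapacity)];
    }

    int capacity() { return data.length; }
    int size() { return size; }

    /** Grows the backing array by about 50% unless minCapacity needs more. */
    void ensureCapacity(int minCapacity) {
        int oldCapacity = data.length;
        if (minCapacity <= oldCapacity) return;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity < minCapacity) newCapacity = minCapacity;
        double[] grown = new double[newCapacity];
        System.arraycopy(data, 0, grown, 0, size);
        data = grown;
    }

    /** Appends one 2D coordinate as two packed ordinates. */
    void add(double x, double y) {
        ensureCapacity(size + 2);
        data[size++] = x;
        data[size++] = y;
    }
}
```

Growing by a constant factor keeps the amortized cost of add constant, while a factor of 1.5 wastes less spare room than doubling.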
|
@FObermaier what is your view of the status of this? The @airbreather design of CoordinateBuffer seems like a better alternative - do you agree? Would be nice to close this PR if no longer active. |
|
IMHO there is no need for a Plain |
This is a wrapper around the CoordinateSequence interface.

Internally it has access to a CoordinateSequenceFactory that it uses to create a buffer sequence in which it stores the ordinate values. If the capacity of the sequence is used up, a new sequence with extended capacity is created and the present values are stored in it.

Extensions to the public interface
* add methods
* insertAt methods
* getCapacity function
* truncated function

This PR comes with a unit test and a performance test comparing ExtendableCoordinateSequence to creating several 1-coordinate sequences and merging those into one.

This class is convenient wherever the final size of a CoordinateSequence is not known, e.g. when parsing WKT (WKTReader) or when building buffers. It might help with issue #188.

I am not sure if the wording is correct, maybe GrowableCoordinateSequence or BufferedCoordinateSequence is a better name.
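
The mechanism described above can be sketched roughly as follows. This is not the code from the PR: FixedSeq and FixedSeqFactory are hypothetical 2D stand-ins for CoordinateSequence and CoordinateSequenceFactory so that the example compiles without JTS, and truncated is assumed here to mean "return a sequence trimmed to the number of coordinates actually added".

```java
// Hypothetical stand-ins for the JTS interfaces, reduced to 2D.
interface FixedSeq {
    int size();
    double getX(int i);
    double getY(int i);
    void set(int i, double x, double y);
}

interface FixedSeqFactory {
    FixedSeq create(int size);
}

final class ArraySeq implements FixedSeq {
    private final double[] xy;
    ArraySeq(int size) { xy = new double[2 * size]; }
    public int size() { return xy.length / 2; }
    public double getX(int i) { return xy[2 * i]; }
    public double getY(int i) { return xy[2 * i + 1]; }
    public void set(int i, double x, double y) { xy[2 * i] = x; xy[2 * i + 1] = y; }
}

/** Wrapper that replaces its buffer sequence with a larger one when it is full. */
final class ExtendableSketch {
    private final FixedSeqFactory factory;
    private FixedSeq buffer;
    private int size;   // coordinates in use; buffer.size() is the capacity

    ExtendableSketch(FixedSeqFactory factory, int initialCapacity) {
        this.factory = factory;
        this.buffer = factory.create(Math.max(1, initialCapacity));
    }

    int size() { return size; }
    int getCapacity() { return buffer.size(); }

    void add(double x, double y) { insertAt(size, x, y); }

    void insertAt(int index, double x, double y) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        if (size == buffer.size()) {
            // Capacity used up: create a larger sequence and copy the present values.
            int newCapacity = Math.max(size + 1, size + (size >> 1));
            buffer = copyInto(factory.create(newCapacity), size);
        }
        // Shift the tail one slot to the right to make room at index.
        for (int i = size; i > index; i--) {
            buffer.set(i, buffer.getX(i - 1), buffer.getY(i - 1));
        }
        buffer.set(index, x, y);
        size++;
    }

    /** Returns a new sequence holding exactly the coordinates added so far. */
    FixedSeq truncated() { return copyInto(factory.create(size), size); }

    private FixedSeq copyInto(FixedSeq target, int count) {
        for (int i = 0; i < count; i++) {
            target.set(i, buffer.getX(i), buffer.getY(i));
        }
        return target;
    }
}
```

A caller such as a WKT parser would call add for each coordinate it reads and truncated() once at the end, for example with new ExtendableSketch(ArraySeq::new, 16).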