New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Read SQL Servers binary format directly #174
Conversation
Read SQL Server binary directly into JTS Geometries, instead of going via WKB. While geotools will use a little longer to decode (than WKB), SQL Server won't have to encode at all, resulting in a performance improvement.
Everything that becomes part of GeoTools has to be licenced under LGPL. I'll have a look at the patches as spare time allows (or maybe Justin wants to?). Wondering, are V1 and V2 specs compatible when it comes to the same geometry type? Like, Point, Line and Polygon are encoded the same way? |
About the license: that's ok.
To support points, lines, polygons in V2, you would at least have to support the new definition of "figure attribute". Otherwise the formats are the same. However, I downloaded the latest SQL Server 2012 release, and when I created a new geometry, I got version 1. I've also tested with SQL Server 2008 R2. So the fact that I wasn't able to produce test-data with SQL Server stopped me from implementing partial support for V2. If you could point me in the right direction of how to make the choice configurable, I can implement it. How would the user specify it - in the GUI? |
Just add a new configuration parameter in the factory, that gets eventually set in the dialect, like the one for supporting native paging. The GUI in GeoServer is automatically generated from the parameter list (you'll have to also add it to the list of params advertised by the factory, and its jndi equivalent). |
Added a parameter to switch between WKB and SQL Server binary as serialization format
I've added a new configuration parameter. It defaults to using WKB. If the new code proves to be stable, I suggest switching default settings later. |
@@ -0,0 +1,25 @@ | |||
package org.geotools.data.sqlserver.Reader; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We try to stick to the convention of keeping package names lowercase.
First off great work! This is something i have wanted to see for a while since as you note it can have a huge performance impact. I think the patch looks great for the most part, except for the optional configuration parameter, which as implemented will still change the default for people upgrading. I think instead of explicitly having to enable wkb serialization we should have to explicitly enable native serializastion. So i would change the parameter to "NATIVE_SERIALIZATION" and have it by default set to false. As you say if it shows to be stable we can make it the default in a future version. |
Additionally the parameter is moved up, which groups it together with two other boolean params in Geoserver's GUI. Corrected package name to lowercase.
You are correct - changing default for people upgrading was unintended. I've corrected it, and I also took the liberty to move the parameter up by one position in the list of advertised parameters. I just thought it looked better in GeoServer. The casing of the parameters (as they appear in GeoServer) is inconsistent, so I'm not sure if there is a correct way. Another thing: I believe all the tests now run with the default setting (WKB), so the native implementation is not tested (other than the unit tests which test serialization directly). Not sure if this is a problem. |
Great, thanks for changing that. And yeah, the casing is inconsistent so i wouldn't worry about that. Indeed it would be nice to have at least one of the tests run with the native serialization, ideally one of the tests that involves reading geometries, like SQLServerFeatureSourceTest. It should just be a matter of extending the existing class and overriding the setupDataStore method to set the native serialization. But I will leave that up to you, i don't see it as a blocker for this change since it already has a good test that exercise the changes. So as far as i am concerned the patch is good but i'll give Andrea a chance to weigh in as well before merge. |
Thanks a lot, looking good. I've manually merged it and added a few of changes:
The full result is here: Given that the behavior is optional and I've found no issues in the code or tests, I've also backported the patch to the stable series (9.x). |
That's great! Thank you both for reviewing it so quickly! |
Read SQL Server binary directly into JTS Geometries, instead of going
via WKB. While geotools will use a little longer to decode (about 50% more
compared to WKB), SQL Server won't have to encode at all, resulting in a performance
improvement.
A simple benchmark showed a performance increase around 50% on a
layer with many small geometries (a grid), while another layer only showed
7% performance increase.
The binary reader only support v1 of the specification. V2 adds support for CircularString, CompoundCurve, CurvePolygon, FullGlobe (http://go.microsoft.com/fwlink/?LinkId=226407), and is not supported by today's WKB implementation either (as far as I can see).
It would be nice if the added classes (for parsing the sql server binary) could be licenced freely (eg. apache/bsd), but if you would like it to be LGPL like the rest, that's OK too.
It's important to review this thoroughly so that everything that today's implementation supports still works :)