Commit
Replace geowave subproject with GeoTrellis/GeoWave data adapter (#3364)
* Replace geowave subproject with GeoTrellis / GeoWave data adapter
* Update geowave module description
* Add geowave to cassandra tests
* Downgrade JTS to 1.16 to avoid bin-compat problem with GeoWave
* Downgrade GeoTools version to 23.2
* Fix GeoWave builds
* Fix GeoTools deps
* Move GeoWave into a separate executor, rm BlockingThreadPool implementation from benchmarks, generate missing headers
* Update SBT plugins
* Upd GeoWave syntax to match Scala 2.13

Co-authored-by: Grigory Pomadchin <gr.pomadchin@gmail.com>
Showing 154 changed files with 6,840 additions and 1,739 deletions.
@@ -0,0 +1,6 @@
#!/bin/bash

.circleci/unzip-rasters.sh

./sbt -Dsbt.supershell=false "++$SCALA_VERSION" \
  "project geowave" test || { exit 1; }
@@ -0,0 +1,2 @@
cqlsh:
	docker exec -it $(FOLDER)_cassandra_1 cqlsh
@@ -0,0 +1,55 @@
# GeoTrellis/GeoWave Connector

GeoTrellis/GeoWave connector for storing raster and volumetric data.

- [GeoTrellis/GeoWave Connector](#geotrellisgeowave-connector)
  - [Requirements](#requirements)
  - [Project Inventory](#project-inventory)
  - [Development](#development)
    - [!Important](#important)
    - [Executing Tests](#executing-tests)
## Requirements

- Docker Engine 17.12+
- Docker Compose 1.21+
- OpenJDK 8

## Project Inventory

- `src` - Main project with `GeoTrellisDataAdapter` enabling storing GeoTrellis types with GeoWave
- `benchmark` - Skeleton for microbenchmarks on GeoWave queries
- `docs` - Overview of GeoWave concepts relevant to index and data adapter usage
## Development

### !Important

After merging PRs or fetching changes from master and other branches, be sure to _recreate_ your
dev environment. Any change to interfaces that are present in the `Persistable Registry`
and have `fromBinary` and `toBinary` methods can cause serialization / deserialization issues
in tests; as a consequence, tests may fail with a variety of unpredictable runtime exceptions.
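The binary contract behind this warning can be sketched with a toy `Persistable`-style round trip. The trait and the `LayerRef` class below are illustrative stand-ins, not GeoWave's actual API: the point is that any change to the binary layout invalidates bytes serialized by an older version of the class.

```scala
import java.nio.ByteBuffer

// Illustrative sketch of a Persistable-style contract (hypothetical names).
// If the layout written by toBinary changes, previously stored bytes can no
// longer be read back, which is why the dev env must be recreated.
trait BinaryPersistable {
  def toBinary: Array[Byte]
  def fromBinary(bytes: Array[Byte]): Unit
}

// Hypothetical payload: a named layer with a zoom level.
class LayerRef(var name: String = "", var zoom: Int = 0) extends BinaryPersistable {
  def toBinary: Array[Byte] = {
    val nameBytes = name.getBytes("UTF-8")
    val buf = ByteBuffer.allocate(4 + nameBytes.length + 4)
    buf.putInt(nameBytes.length).put(nameBytes).putInt(zoom)
    buf.array()
  }

  def fromBinary(bytes: Array[Byte]): Unit = {
    val buf = ByteBuffer.wrap(bytes)
    val nameBytes = new Array[Byte](buf.getInt())
    buf.get(nameBytes)
    name = new String(nameBytes, "UTF-8")
    zoom = buf.getInt()
  }
}

val original = new LayerRef("nlcd-2011", 12)
val restored = new LayerRef()
restored.fromBinary(original.toBinary)
println(s"${restored.name} @ zoom ${restored.zoom}") // prints "nlcd-2011 @ zoom 12"
```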
### Executing Tests

Tests are dependent on Apache Cassandra, Kafka, ZooKeeper, and Graphite with Grafana. First, ensure
these dependencies are running:

```bash
docker-compose up -d cassandra
```

Now, you can execute tests from the project root:

```bash
$ ./sbt "project geowave" test
...
[info] All tests passed.
[success] Total time: 72 s, completed Nov 22, 2019 11:48:25 AM
```

When you're done, ensure that the services and networks created by Docker
Compose are torn down:

```bash
docker-compose down
```
@@ -0,0 +1,99 @@
# JMH Benchmarks

## Instructions

1. Make the following Cassandra changes:
   ```yaml
   cassandra:
     image: cassandra:3.11
     environment:
       - MAX_HEAP_SIZE=4G
       - HEAP_NEWSIZE=800M
       - CASSANDRA_LISTEN_ADDRESS=127.0.0.1
     mem_limit: 8G
     memswap_limit: -1
   ```
2. Ingest data into Cassandra via `sbt "project geowave-benchmark" run`
3. Run benchmarks via `jmh:run -i 5 -wi 5 -f1 -t1 .*QueryBenchmark.*`
   It is recommended to run benchmarks via `jmh:run -i 20 -wi 10 -f1 -t1 .*QueryBenchmark.*`
   (at least 10 warm-up iterations and 20 measurement iterations, to get a bit more consistent results).
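What the `-wi` and `-i` flags do can be approximated with a stdlib-only timing sketch. This is an illustration of warm-up vs. measurement iterations, not JMH itself; the `benchmark` helper and the workload are made up for the example:

```scala
// Toy harness mimicking JMH's warmup (-wi) and measurement (-i) iterations
// in avgt mode. Warm-up runs are discarded so JIT compilation and cache
// effects do not skew the measured average.
def benchmark(warmups: Int, iterations: Int)(body: => Unit): Double = {
  (1 to warmups).foreach(_ => body) // discarded: lets the JIT settle
  val samples = (1 to iterations).map { _ =>
    val start = System.nanoTime()
    body
    (System.nanoTime() - start) / 1e9 // seconds per operation
  }
  samples.sum / samples.size // average, roughly JMH's "avgt" score
}

// Stand-in workload instead of a real GeoWave query.
val avg = benchmark(warmups = 10, iterations = 20) {
  (1 to 100000).map(math.sqrt(_)).sum
}
println(f"avgt $avg%.6f s/op")
```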
## Results

<pre><code>
jmh:run -i 20 -wi 10 -f 1 -t 1 .*QueryBenchmark.*

88 Entries
Benchmark                                                      Mode  Cnt   Score   Error  Units
<b>entireSpatialGeometryQuery                                     avgt   20   5.278 ± 0.643   s/op</b>
entireSpatialQuery                                             avgt   20   1.155 ± 0.057   s/op
entireSpatialTemporalElevationElevationQuery                   avgt   20   1.145 ± 0.069   s/op
entireSpatialTemporalElevationGeometryQuery                    avgt   20   1.089 ± 0.030   s/op
<b>entireSpatialTemporalElevationGeometryTemporalElevationQuery   avgt   20   5.963 ± 0.358   s/op</b>
entireSpatialTemporalElevationGeometryTemporalQuery            avgt   20   1.093 ± 0.042   s/op
entireSpatialTemporalElevationQuery                            avgt   20   1.117 ± 0.033   s/op
entireSpatialTemporalElevationTemporalQuery                    avgt   20   1.080 ± 0.029   s/op
entireSpatialTemporalGeometryQuery                             avgt   20   1.117 ± 0.039   s/op
<b>entireSpatialTemporalGeometryTemporalQuery                     avgt   20   4.223 ± 0.213   s/op</b>
entireSpatialTemporalQuery                                     avgt   20   1.072 ± 0.036   s/op
entireSpatialTemporalTemporalQuery                             avgt   20   1.110 ± 0.039   s/op

328 Entries
Benchmark                                                      Mode  Cnt   Score   Error  Units
<b>entireSpatialGeometryQuery                                     avgt   20   4.705 ± 0.146   s/op</b>
entireSpatialQuery                                             avgt   20   5.249 ± 0.503   s/op
entireSpatialTemporalElevationElevationQuery                   avgt   20   4.919 ± 0.310   s/op
entireSpatialTemporalElevationGeometryQuery                    avgt   20   4.688 ± 0.251   s/op
<b>entireSpatialTemporalElevationGeometryTemporalElevationQuery   avgt   20  15.801 ± 6.629   s/op</b>
entireSpatialTemporalElevationGeometryTemporalQuery            avgt   20   5.212 ± 0.467   s/op
entireSpatialTemporalElevationQuery                            avgt   20   5.256 ± 1.107   s/op
entireSpatialTemporalElevationTemporalQuery                    avgt   20   4.878 ± 0.324   s/op
entireSpatialTemporalGeometryQuery                             avgt   20   4.760 ± 0.498   s/op
<b>entireSpatialTemporalGeometryTemporalQuery                     avgt   20   4.272 ± 0.126   s/op</b>
entireSpatialTemporalQuery                                     avgt   20   4.553 ± 0.275   s/op
entireSpatialTemporalTemporalQuery                             avgt   20   4.736 ± 0.290   s/op
</code></pre>
## Interpretation

The index type does affect query performance.
The more dimensions are defined for the index, the more ranges
are generated for the SFC and the more range requests are sent to Cassandra.
All ranged queries are marked in bold in the benchmark results; all other benchmarks generate
full scan queries.

A full scan over a three-dimensional index is more expensive than over a single-
or two-dimensional index. The more dimensions the SFC has, the more ranges are generated.

These benchmarks are not representative, since they were done against a local instance of Cassandra,
and demonstrate only the local relative performance: how query performance
depends on the index type and the amount of data. In fact it is a benchmark of the Cassandra
instance, though it can give some general sense of how index and query types affect performance.

In fact, this benchmark measures only full table scans (done via multiple ranged select queries or
via a single select).

In the `entireSpatialTemporalElevationGeometryTemporalElevationQuery` case the results
are a bit high: too many range queries are generated, and it is hard for a single Cassandra instance
to handle them.
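The claim that more index dimensions produce more ranges can be illustrated with a toy row-major linearization (deliberately much simpler than GeoWave's actual space-filling curve): a box query decomposes into one contiguous key run per combination of the non-innermost coordinates.

```scala
// Toy illustration: linearize a d-dimensional grid in row-major order and
// count how many contiguous key ranges a box query decomposes into.
// Only the innermost dimension stays contiguous, so for per-dimension
// extents e1..ed the range count is e1 * ... * e(d-1).
def contiguousRanges(extents: Seq[Int]): Int =
  if (extents.size <= 1) 1
  else extents.dropRight(1).product

// A box 10 cells wide in every dimension:
println(contiguousRanges(Seq(10)))         // 1D: 1 range
println(contiguousRanges(Seq(10, 10)))     // 2D: 10 ranges
println(contiguousRanges(Seq(10, 10, 10))) // 3D: 100 ranges
```

Real SFCs (Z-order, Hilbert) merge some adjacent runs, but the trend is the same: each extra dimension multiplies the number of ranges sent to the store.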
### Legend

- `entireSpatial({Temporal|TemporalElevation})` performs a full table scan:
  ```genericsql
  SELECT * FROM QueryBench.indexName;
  ```
- In all cases where the query does not contain all the index dimensions
  (for instance, a spatial-only query against a spatial-temporal indexed table),
  GeoWave performs a full table scan:
  ```genericsql
  SELECT * FROM QueryBench.indexName;
  ```
- In all cases where the query contains all the index dimensions defined for the table,
  GeoWave performs multiple ranged queries (the number of SFC splits depends on the index dimensionality);
  **benchmarks that generate such queries are marked in bold in the JMH report**:
  ```genericsql
  SELECT * FROM QueryBench.indexName
  WHERE partition=:partition_val
    AND adapter_id IN :adapter_id_val
    AND sort>=:sort_min AND sort<:sort_max;
  ```
@@ -0,0 +1,12 @@
geotrellis.geowave.connection.store {
  data-store-type = "cassandra"
  options = {
    "contactPoints": "localhost",
    "contactPoints": ${?CASSANDRA_HOST},
    "gwNamespace" : "geotrellis"
  }
}

geotrellis.blocking-thread-pool {
  threads = default
}
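The duplicate `contactPoints` key in the config above is intentional: in HOCON, `${?CASSANDRA_HOST}` is an optional substitution, so it overrides the `"localhost"` default only when that environment variable is actually set. The same fallback can be sketched in plain Scala (the `resolve` helper is illustrative, not part of the project):

```scala
// Mirrors the HOCON pattern:
//   "contactPoints": "localhost",
//   "contactPoints": ${?CASSANDRA_HOST}
// The optional substitution wins only when CASSANDRA_HOST is set;
// otherwise the earlier literal value survives.
def resolve(env: Map[String, String]): String =
  env.getOrElse("CASSANDRA_HOST", "localhost")

println(resolve(sys.env)) // "localhost" unless CASSANDRA_HOST is exported
println(resolve(Map("CASSANDRA_HOST" -> "10.0.0.5"))) // "10.0.0.5"
```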
@@ -0,0 +1,31 @@
<configuration debug="true">
  <variable name="LEVEL" value="${LOG_LEVEL:-INFO}"/>

  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%white(%d{HH:mm:ss.SSS}) %highlight(%-5level) %cyan(%logger{50}) - %msg %n</pattern>
    </encoder>
  </appender>

  <root level="${LEVEL}">
    <appender-ref ref="STDOUT" />
  </root>

  <logger name="org.apache.kafka" level="${LEVEL}"/>
  <logger name="mil.navsea.geoindex" level="DEBUG"/>

  <!-- In order to enable this logging you have to register QueryLogger with the Cassandra session -->
  <!-- https://docs.datastax.com/en/developer/java-driver/2.1/manual/logging/#logging-query-latencies -->

  <!--
  <logger name="com.datastax.driver.core.QueryLogger.NORMAL">
    <level value="TRACE"/>
  </logger>
  <logger name="com.datastax.driver.core.QueryLogger.SLOW">
    <level value="TRACE"/>
  </logger>
  <logger name="com.datastax.driver.core.QueryLogger.ERROR">
    <level value="TRACE"/>
  </logger>
  -->
</configuration>