Hbase store Can't get the locations #1268

Closed
mcammisa75 opened this issue Dec 15, 2017 · 11 comments

Comments

@mcammisa75

Hi all,
I'm trying to ingest a shapefile into my store created in HBase. I used the following commands:

geowave config addstore meteo -t hbase -z XXX.XXX.XXX.XXX:2181
geowave config addindex -t spatial_temporal meteo-spatial_temporal

but when I try to ingest with the following command

geowave ingest localToGW ./marcoTemp meteo meteo-spatial_temporal -f geotools-vector

I receive the following error.

org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the locations

Could you help me?
Thanks in advance

@rfecher
Contributor

rfecher commented Dec 15, 2017

it is having difficulty communicating with zookeeper.

one question I'd have is how hbase was installed - is it within a vendor's distribution such as cloudera or hortonworks or is it strictly the open source apache project that you installed?

Without knowing much more, one thing to try off the cuff is to explicitly put hbase-site.xml inside your geowave commandline tools jar. Under most deployments this file should be found on the classpath, and it contains important environment information needed to connect to the backend. If it is not found, this step makes it explicit and ensures the file is always included no matter what context geowave is running in (i.e. spark, mapreduce, etc.).

cd /etc/hbase/conf (usually hbase-site.xml should be found here, but if for some reason it is somewhere else, change to that directory)
zip -g /usr/local/geowave/tools/geowave-tools-${version}-${vendor}.jar hbase-site.xml (the actual name of the jar depends on the version and the vendor distribution that you used to install geowave, there will be one jar in that directory and make sure to use it)

Also, there is a geowave --debug ... option on the commandline which basically opens up the floodgates of logging, this may be a lot of log spam, or it may give you some additional useful info.
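
The two steps above can be scripted so that the file is only added when it is missing. This is a minimal sketch and not part of GeoWave: the function names are made up for illustration, and it assumes `zip` and `unzip` are installed and that the tools directory holds exactly one `geowave-tools-*.jar`, as described above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of GeoWave) for adding hbase-site.xml to the tools jar.

# Print the single geowave-tools-*.jar found in the given directory.
# Fails if there is no match or more than one match.
find_tools_jar() {
    set -- "$1"/geowave-tools-*.jar
    if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
        echo "expected exactly one geowave-tools jar" >&2
        return 1
    fi
    printf '%s\n' "$1"
}

# Add hbase-site.xml from <conf_dir> to <jar> unless it is already there.
# <jar> should be an absolute path because the function changes directory.
add_hbase_site() {
    conf_dir=$1
    jar=$2
    # unzip -l exits non-zero when the named entry is not in the archive
    if unzip -l "$jar" hbase-site.xml >/dev/null 2>&1; then
        echo "hbase-site.xml already in $jar"
        return 0
    fi
    # cd first so the file is stored at the root of the jar, as with the zip command above
    (cd "$conf_dir" && zip -g "$jar" hbase-site.xml)
}
```

With the default locations mentioned above, this could be invoked as `add_hbase_site /etc/hbase/conf "$(find_tools_jar /usr/local/geowave/tools)"`.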

@mcammisa75
Author

Hi rfecher, the cluster is a Hortonworks distribution. I'll try to add the hbase conf file inside the jar and retry.
I'll keep you informed.
Thanks

@mcammisa75
Author

Hi rfecher, I've added hbase-site to the jar and now the error is different.

18 Dec 10:30:00 ERROR [persist.PersistableFactory] - '600' already registered for class 'mil.nga.giat.geowave.adapter.raster.Resolution'. Cannot register 'class mil.nga.giat.geowave.format.twitter.TwitterIngestPlugin' with id '600'
18 Dec 10:30:00 ERROR [persist.PersistableFactory] - '601' already registered for class 'mil.nga.giat.geowave.adapter.raster.adapter.CompoundHierarchicalIndexStrategyWrapper'. Cannot register 'class mil.nga.giat.geowave.format.twitter.TwitterIngestPlugin$IngestTwitterFromHdfs' with id '601'
18 Dec 10:30:00 ERROR [local.IngestTask] - Fatal error occured while trying to get an index writer.
java.lang.NullPointerException
at mil.nga.giat.geowave.core.geotime.index.dimension.TemporalBinningStrategy.getNormalizedRanges(TemporalBinningStrategy.java:378)

Why is the system referring to Twitter?
I've started the ingest with:

geowave ingest localToGW /marcoTemp meteo meteo-spatial_temporal -f geotools-vector

Thanks

@rfecher
Contributor

rfecher commented Dec 18, 2017

I think you have a fix with putting hbase-site.xml in the jar. If you are using geoserver as well, I suggest copying /usr/local/geowave/tools/geowave-tools-${version}-hdp2.jar into and replacing /usr/local/geowave/{geoserver or tomcat8 depending on which version of geowave you're using}/webapps/geoserver/WEB-INF/lib/geowave-geoserver-${version}-hdp2.jar
and then restart the service (for example sudo service geowave restart or sudo service gwtomcat restart depending on version of geowave)
That way you will be sure not to run into similar connectivity issues within geoserver.
The Twitter error messages are actually common in older versions; I think they only appear in 0.9.5 (I'm guessing you're not using the latest geowave version?) and can be ignored in your case. There is a workaround to get the twitter ingest format to work, and all other ingest formats are unaffected.

Do you have an attribute in your feature data that is a temporal data type (such as java.util.Date)? My guess is that it is unable to find a time field for your spatial-temporal index. If there is not a temporal attribute, you should ingest with an index type of "spatial" instead of type "spatial_temporal."

If this is in fact the case, and we are not preempting ingest with an earlier warning message that's more clear about not finding the temporal attribute, that can be a relatively easy issue to polish so please let us know.
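
To try the spatial-only route suggested above, the index and ingest commands from the original post can be rerun with the index type swapped. The sketch below only prints the commands so they can be reviewed before running. The function name and the `<store>-<type>` index naming are assumptions for illustration (they mirror `meteo-spatial_temporal`); the flags are the ones already used earlier in this thread.

```shell
#!/bin/sh
# Print the addindex and ingest commands for a store, picking the index type
# from whether the data has a temporal attribute ("yes" or anything else).
# Usage: print_ingest_commands <store> <input_path> <has_time>
print_ingest_commands() {
    store=$1
    input=$2
    if [ "$3" = "yes" ]; then
        index_type=spatial_temporal
    else
        index_type=spatial
    fi
    # Hypothetical naming convention, mirroring meteo-spatial_temporal
    index_name="$store-$index_type"
    echo "geowave config addindex -t $index_type $index_name"
    echo "geowave ingest localToGW $input $store $index_name -f geotools-vector"
}
```

For a shapefile without a date attribute, `print_ingest_commands meteo ./marcoTemp no` prints an `addindex -t spatial` command followed by an ingest command that uses the `meteo-spatial` index.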

@mcammisa75
Author

Hi rfecher, I've updated the jar inside geoserver that you mentioned. With my code I've indexed generated data as you posted. For the moment I want to avoid indexing real data and just verify that geoserver is able to use the store (HBase). Currently I receive "Error creating data store, check the parameters. Error message: null" while creating the store in geoserver. Have I done something wrong? Is there any way to check whether the data indexed inside HBase is correct?

Thanks

@rfecher
Contributor

rfecher commented Dec 19, 2017

did you use geowave gs addlayer or one of the geowave commandline tools to create the datastore in geoserver or are you trying to do it manually in the geoserver UI? I primarily use the commandline and am confident that works, so that would be my suggestion. If that doesn't work, does geowave datastores show up in geoserver's UI? I'd also look at the geoserver log for any more clues.

You can run things like geowave remote liststats <storename> to see stats about the data you ingested. You should see a COUNT statistic for example that represents the number of rows you ingested. You can also do something like geowave analytic sql 'select * from <storename>' and add a where clause if you'd like to show the actual data and not just statistics. By default it just shows the first 20 results to keep you from killing your console with millions of results but you can increase that limit with the -s <# of results> or --show <# of results> option.
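
Since 0.9.5 prints the Twitter registration errors on every command, the useful part of the output is easier to read with those lines filtered out. A minimal sketch, assuming the errors look like the ones quoted earlier in this thread; the function name is made up for illustration and nothing here is a GeoWave feature.

```shell
#!/bin/sh
# Drop the PersistableFactory errors about TwitterIngestPlugin, which were
# described above as ignorable, and pass every other line through unchanged.
strip_twitter_noise() {
    grep -v 'persist\.PersistableFactory.*TwitterIngestPlugin'
}
```

It reads from standard input, so it can be used as `geowave remote liststats meteo 2>&1 | strip_twitter_noise`. Real errors from other loggers are left in place.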

@mcammisa75
Author

Hi rfecher, I've added a limited number of elements. If I run

geowave remote liststats meteo
this is the output:
22 Dec 10:09:10 WARN [util.NativeCodeLoader] - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Dec 22, 2017 10:09:12 AM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'GeoServerResourceLoader', but ApplicationContext is unset.
Dec 22, 2017 10:09:12 AM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'GeoServerResourceLoader', but ApplicationContext is unset.
Dec 22, 2017 10:09:12 AM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'ExtensionFilter', but ApplicationContext is unset.
Dec 22, 2017 10:09:12 AM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'ExtensionProvider', but ApplicationContext is unset.
Dec 22, 2017 10:09:12 AM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'ExtensionFilter', but ApplicationContext is unset.
Dec 22, 2017 10:09:12 AM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'GeoServerResourceLoader', but ApplicationContext is unset.
Dec 22, 2017 10:09:12 AM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'GeoServerResourceLoader', but ApplicationContext is unset.
Dec 22, 2017 10:09:12 AM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'ExtensionFilter', but ApplicationContext is unset.
Dec 22, 2017 10:09:12 AM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'ExtensionProvider', but ApplicationContext is unset.
Dec 22, 2017 10:09:12 AM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'ExtensionFilter', but ApplicationContext is unset.
22 Dec 10:09:12 ERROR [persist.PersistableFactory] - '600' already registered for class 'mil.nga.giat.geowave.adapter.raster.Resolution'. Cannot register 'class mil.nga.giat.geowave.format.twitter.TwitterIngestPlugin' with id '600'
22 Dec 10:09:12 ERROR [persist.PersistableFactory] - '601' already registered for class 'mil.nga.giat.geowave.adapter.raster.adapter.CompoundHierarchicalIndexStrategyWrapper'. Cannot register 'class mil.nga.giat.geowave.format.twitter.TwitterIngestPlugin$IngestTwitterFromHdfs' with id '601'

If I run

geowave analytic sql 'select count(*) from meteo'
or
geowave analytic sql 'select * from meteo'

22 Dec 10:09:49 WARN [util.NativeCodeLoader] - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22 Dec 10:09:52 WARN [geoserver.platform] - Extension lookup 'GeoServerResourceLoader', but ApplicationContext is unset.
22 Dec 10:09:52 WARN [geoserver.platform] - Extension lookup 'GeoServerResourceLoader', but ApplicationContext is unset.
22 Dec 10:09:52 WARN [geoserver.platform] - Extension lookup 'ExtensionFilter', but ApplicationContext is unset.
22 Dec 10:09:52 WARN [geoserver.platform] - Extension lookup 'ExtensionProvider', but ApplicationContext is unset.
22 Dec 10:09:52 WARN [geoserver.platform] - Extension lookup 'ExtensionFilter', but ApplicationContext is unset.
22 Dec 10:09:52 WARN [geoserver.platform] - Extension lookup 'GeoServerResourceLoader', but ApplicationContext is unset.
22 Dec 10:09:52 WARN [geoserver.platform] - Extension lookup 'GeoServerResourceLoader', but ApplicationContext is unset.
22 Dec 10:09:52 WARN [geoserver.platform] - Extension lookup 'ExtensionFilter', but ApplicationContext is unset.
22 Dec 10:09:52 WARN [geoserver.platform] - Extension lookup 'ExtensionProvider', but ApplicationContext is unset.
22 Dec 10:09:52 WARN [geoserver.platform] - Extension lookup 'ExtensionFilter', but ApplicationContext is unset.
22 Dec 10:09:52 ERROR [persist.PersistableFactory] - '600' already registered for class 'mil.nga.giat.geowave.adapter.raster.Resolution'. Cannot register 'class mil.nga.giat.geowave.format.twitter.TwitterIngestPlugin' with id '600'
22 Dec 10:09:52 ERROR [persist.PersistableFactory] - '601' already registered for class 'mil.nga.giat.geowave.adapter.raster.adapter.CompoundHierarchicalIndexStrategyWrapper'. Cannot register 'class mil.nga.giat.geowave.format.twitter.TwitterIngestPlugin$IngestTwitterFromHdfs' with id '601'
22 Dec 10:09:53 WARN [metadata.AbstractHBasePersistence] - Object 'ROW_RANGE_SPATIAL_TEMPORAL_IDX_BALANCED_YEAR_SPATIAL_TEMPORAL_IDX_BALANCED_YEAR' not found
22 Dec 10:09:53 WARN [splits.SplitsProvider] - Could not determine range of data from 'RowRangeDataStatistics'. Range will not be clipped. This may result in some splits being empty.
22 Dec 10:09:53 WARN [cli.GeoWaveMain] - Unable to execute operation
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
Exchange SinglePartition
+- *HashAggregate(keys=[], functions=[], output=[])
+- *Project
+- Scan ExistingRDD[geom#6,title#7,link#8,tags#9,eventType#10,date#11]

    at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56)
    at org.apache.spark.sql.execution.exchange.ShuffleExchange.doExecute(ShuffleExchange.scala:115)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
    at org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:252)
    at org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:141)
    at org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:141)
    at org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:141)
    at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:386)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)

.......

@mcammisa75
Author

For geoserver I've run:

geowave gs addlayer meteo

22 Dec 10:15:18 ERROR [persist.PersistableFactory] - '600' already registered for class 'mil.nga.giat.geowave.adapter.raster.Resolution'. Cannot register 'class mil.nga.giat.geowave.format.twitter.TwitterIngestPlugin' with id '600'
22 Dec 10:15:18 ERROR [persist.PersistableFactory] - '601' already registered for class 'mil.nga.giat.geowave.adapter.raster.adapter.CompoundHierarchicalIndexStrategyWrapper'. Cannot register 'class mil.nga.giat.geowave.format.twitter.TwitterIngestPlugin$IngestTwitterFromHdfs' with id '601'
22 Dec 10:15:18 INFO [util.FeatureDataUtils] - Generating patched classloader
22 Dec 10:15:18 DEBUG [geoserver.GeoServerRestClient] - getAdapterInfo for id = null
22 Dec 10:15:18 DEBUG [geoserver.GeoServerRestClient] - > Adapter ID: app
22 Dec 10:15:18 DEBUG [geoserver.GeoServerRestClient] - > Adapter Type: FeatureDataAdapter
22 Dec 10:15:18 DEBUG [geoserver.GeoServerRestClient] - id is null or all
22 Dec 10:15:18 DEBUG [geoserver.GeoServerRestClient] - > 'app' adapter passed filter
22 Dec 10:15:18 DEBUG [geoserver.GeoServerRestClient] - getStoreAdapterInfo(meteo) got 1 ids
22 Dec 10:15:18 DEBUG [geoserver.GeoServerRestClient] - Finished retrieving adapter list
22 Dec 10:15:19 ERROR [geoserver.GeoServerRestClient] - Error retrieving GeoServer workspace list
22 Dec 10:15:19 DEBUG [geoserver.GeoServerRestClient] - addlayer needs to create the geowave workspace
Error adding GeoServer layer for store 'meteo; code = 404
22 Dec 10:15:19 DEBUG [ipc.Client] - stopping client from cache: org.apache.hadoop.ipc.Client@5c45d770
22 Dec 10:15:19 DEBUG [ipc.Client] - removing client from cache: org.apache.hadoop.ipc.Client@5c45d770
22 Dec 10:15:19 DEBUG [ipc.Client] - stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@5c45d770
22 Dec 10:15:19 DEBUG [ipc.Client] - Stopping client
22 Dec 10:15:19 DEBUG [util.ShutdownHookManager] - ShutdownHookManger complete shutdown.

I suppose 404 is not good news. No layers were added in the GeoServer interface.

Thanks

@rfecher
Contributor

rfecher commented Dec 22, 2017

yep the 404 (not found) is a problem. You configure where geoserver is on your commandline, using geowave config geoserver <host:port>

You must be using 0.9.5 because those twitter error messages were only in that release (and don't worry about them, they're just log spam in your case).

You need to know what port you're running geoserver on. If it's through our installer it'd be port 8080 by default (although advanced users can modify that within puppet, and in fact our EMR bootstraps would use port 8000). So my best guess is you have it running on port 8080, but I really don't know what IP/host it's running on. If you don't configure it yourself, the default commandline config tries localhost:8080, so it's seemingly not running there because you got a 404. Or maybe it's installed and just not running? In 0.9.5 the command to make sure it's running would be sudo service geowave restart
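
Before re-running addlayer, it can help to probe the candidate host and port directly. The sketch below is an assumption-laden illustration, not a GeoWave tool: the function names are hypothetical, it requires `curl`, and the `/geoserver/rest/workspaces` path assumes GeoServer is deployed under the default `geoserver` webapp context.

```shell
#!/bin/sh
# Build the REST workspace-list URL for a GeoServer at <host:port>.
# The path is an assumption based on a default deployment.
geoserver_workspaces_url() {
    printf 'http://%s/geoserver/rest/workspaces\n' "$1"
}

# Turn an HTTP status code into a short hint.
explain_status() {
    case $1 in
        200) echo "reachable" ;;
        401|403) echo "reachable, but credentials are required or wrong" ;;
        404) echo "something answered, but GeoServer is not at this URL" ;;
        000) echo "no response: wrong host/port or service not running" ;;
        *) echo "unexpected status $1" ;;
    esac
}

# Probe <host:port> and print the hint. curl reports 000 when it cannot connect.
check_geoserver() {
    code=$(curl -s -o /dev/null -w '%{http_code}' "$(geoserver_workspaces_url "$1")")
    explain_status "$code"
}
```

For example, `check_geoserver localhost:8080` tests the default location, and `check_geoserver localhost:8000` tests the EMR bootstrap port mentioned above. Once the right address is known, it can be passed to `geowave config geoserver <host:port>`.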

@mcammisa75
Author

Thanks rfecher, solved the problem of geoserver. Now I can see a new store in it.

geowave meteo-vector GeoWave Datastore - HBASE

but I'm not able to add a new layer: when I try, I see no resources related to the store. I think it is related to the other problem, i.e. the errors I get after indexing when running:

geowave analytic sql 'select count(*) from meteo'
or
geowave analytic sql 'select * from meteo'

Thanks

@rfecher
Contributor

rfecher commented Dec 22, 2017

OK, yeah, it seems you don't have any data/tables. You can check on :16010 (or navigate to the HBase master web console through Ambari). You should see tables prefixed by your gwNamespace; at the very least there should be a _GEOWAVE_METADATA table. From above, did you ever resolve the issue where it seemed you were attempting spatial_temporal indexing on data that did not include an attribute with a temporal type (commonly java.util.Date, but there are a couple of other acceptable types, I think java.sql.Date being another)? For that I would suggest simply starting with a spatial index type for now, and later making sure you get data with Dates for the temporal indexing.
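
The same table check can be done from a terminal instead of the web console. This is a rough sketch under assumptions: the function name is hypothetical, and it expects a list of table names, one per line, such as the output of the `list` command in the HBase shell.

```shell
#!/bin/sh
# Read table names (one per line) on stdin and succeed if any of them
# ends with GEOWAVE_METADATA, i.e. looks like a GeoWave metadata table.
has_geowave_metadata() {
    grep -q 'GEOWAVE_METADATA[[:space:]]*$'
}
```

One possible use is `echo "list" | hbase shell 2>/dev/null | has_geowave_metadata && echo "metadata table found"`. If nothing is found, the ingest most likely never wrote any rows, which matches the empty resource list in geoserver.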

@rfecher closed this as completed Sep 14, 2018