Need help with pyspark #408
Can you check a few things?
Hi Imbriced,
Referenced commits:
* Fix issue, unread block data (#408)
* Add GeoSpark core Python API, version beta.
* Fix issue with additional else statement.
* Add WkbReader to direct imports, fix issue with version tests.
* Add geo_pyspark version 0.3.0.
* Update wheel file for geo_pyspark version 0.3.0.
* Improve serialization process for GeoSpark Python.
* Fix issue with Adapter import.
* Create example notebook for GeoPysparkSQL and GeoPysparkCore.
* Delete ShowCase Notebook.ipynb
* Update GeoSparkCore example notebook.
* Update code for Databricks platform support.
* Add support for collect SpatialPartitionedRDD.
* Add persist possibility to indexedRDD.
* Add support for serializing rawSpatialRDD.
Additional referenced commits:
* Add geo-pyspark on PyPi.
* Change the package name from geo_pyspark to geospark.
* Add CI script for Python.
* Update documentation for geospark python.
* Update CI script, removing the -DskipTests attribute. Bring back mvn clean install instead of mvn -q clean install -DskipTests, which was used to speed up tests.
* Fix issue with CI script. The missing -q flag was causing too much verbosity.
* Reduce testing time. Remove testing Spark 2.3 with Python; there are tests only for Python 3.7 and Spark 2.4.
* Update jar files for previous GeoSpark SQL releases. The update was caused by the package name change.
Hi, did you solve the problem? I have the same issue.
More referenced commits:
* [New version release] Set GeoSpark version to 1.3.1.
* Add functions object for GeoSpark functions.
* Replace GeometrySerializer with the WKB API instead of ShapeSerde, which contains bugs (added a test case with a buggy multipolygon).
* Fix a test that passed by mistake before the WKB update (the intersection of non-intersecting polygons returned a multipolygon, which makes no sense).
* Change deserialization methodology to WKB.
* Update osgeo repo to use the new repository.
* Remove unused test "test serializing with user Data".
* Remove duplicate test case, failure test, and unused testWkb file; St_GeomFromWKT passed.
* Remove unused copy jar in .travis.yml.
* Remove old dependencies from the travis script.
* Remove temporary files.

Co-authored-by: Pawel <pawel93kocinski@gmail.com>
I recommend putting the jar geospark_2.11-1.3.1.jar somewhere on HDFS on the cluster. The following example assumes it is in the /jars folder on HDFS (you can put it anywhere on HDFS and adapt the URL accordingly). Please replace "myhdfshost" with the HDFS URL of your cluster in the following fragment:
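The fragment referenced above was not captured in this thread. A minimal sketch of what it likely looked like, assuming a spark-submit deployment (the script name my_geospark_job.py is a placeholder; the jar path and "myhdfshost" come from the comment above):

```shell
# Submit the PySpark job with the GeoSpark jar fetched from HDFS.
# Replace "myhdfshost" with your cluster's HDFS namenode URL.
spark-submit \
  --jars hdfs://myhdfshost/jars/geospark_2.11-1.3.1.jar \
  my_geospark_job.py
```

Distributing the jar via an HDFS URL avoids shipping it from the driver machine on every submit and keeps all executors on the same jar version.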
Additional referenced commit:
* Add support for partition number in spatialPartitioning.
Expected behavior
GeoSparkRegistrator.registerAll(spark)
return 0

Actual behavior
Got an error.
Steps to reproduce the problem
This is my code:
Settings
GeoSpark version = 1.2
Apache Spark version = 2.4.0-cdh6.2.1
JRE version = 1.8
API type = Python
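For context, a minimal sketch of the registration step described under "Expected behavior". This is illustrative, not the reporter's original code (which was lost in the capture); it assumes the geospark Python package is installed and that the matching GeoSpark jar is on the Spark classpath, e.g. via the --jars flag:

```python
from pyspark.sql import SparkSession
from geospark.register import GeoSparkRegistrator

# Build (or reuse) a Spark session. The GeoSpark jar must already be on
# the classpath for registration to succeed.
spark = SparkSession.builder \
    .appName("geospark-repro") \
    .getOrCreate()

# Registers the GeoSpark SQL functions (ST_GeomFromWKT, etc.) with the
# session. A mismatch between the jar version and the geospark Python
# package version is one known cause of "unread block data" errors.
GeoSparkRegistrator.registerAll(spark)
```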