Open
Labels
bug: When it really isn't a "feature".
doozie: A hard issue/bug dealing with deep Spark internals.
Description
Various logical local algebra operations, such as rf_local_equal, return a tile with a bool cell type.
from pyrasterframes.rasterfunctions import rf_local_equal, rf_convert_cell_type
from pyspark.sql.functions import lit
df = spark.read.raster('/data/raster/example.tif')
df = df.withColumn('trouble', rf_local_equal('proj_raster', 42))
df.limit(10).toPandas()
Expected result
Tiles with a bool cell type should be returnable to the Python driver via DataFrame operations such as collect, head, and toPandas.
Actual result
A deserialization error is raised, as shown below:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/pyrasterframes/rf_types.py in deserialize(self, datum)
437 as_numpy = np.frombuffer(cell_data_bytes, dtype=cell_type.to_numpy_dtype())
--> 438 reshaped = as_numpy.reshape((rows, cols))
439 t = Tile(reshaped, cell_type)
ValueError: cannot reshape array of size 8192 into shape (256,256)
.
.
.
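The sizes in the error are suggestive: 256 × 256 = 65536 cells, and 65536 / 8 = 8192. That is consistent with the bool cells arriving as one bit per cell while the deserializer reads one byte per cell. The following is a minimal NumPy-only sketch of that mismatch, not the actual pyrasterframes code path. The packed buffer is a hypothetical stand-in for the serialized cells, and the bit order (`bitorder="little"`) is an assumption that would need to be checked against the real serialization.

```python
import numpy as np

rows, cols = 256, 256

# Hypothetical stand-in for the serialized tile: one bit per cell.
# The "little" bit order is an assumption, not confirmed by the issue.
mask = np.zeros((rows, cols), dtype=bool)
mask[0, :3] = True
cell_data_bytes = np.packbits(mask.ravel(), bitorder="little").tobytes()
print(len(cell_data_bytes))  # 8192 bytes for 65536 cells

# Reading the buffer one byte per cell yields only 8192 elements,
# so reshaping to (256, 256) fails as in the traceback above.
as_numpy = np.frombuffer(cell_data_bytes, dtype=np.bool_)
try:
    as_numpy.reshape((rows, cols))
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print(err)

# Unpacking the bits first recovers one element per cell.
bits = np.unpackbits(
    np.frombuffer(cell_data_bytes, dtype=np.uint8), bitorder="little"
)
restored = bits[: rows * cols].reshape((rows, cols)).astype(bool)
assert np.array_equal(restored, mask)
```

If this reading is right, a fix on the Python side would need to unpack the bits before reshaping, or the JVM side would need to serialize bool tiles one byte per cell.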
Workaround
Explicitly convert the cell type to int8.
from pyrasterframes.rf_types import CellType
ct = CellType.int8()
df = df.drop('trouble') \
    .withColumn('work_around',
                rf_convert_cell_type(
                    rf_local_equal('proj_raster', lit(42)),
                    ct))
df.limit(10).toPandas()