GDAL errors when reading repeatedly from one GDALRasterSource #3184

Closed
metasim opened this issue Feb 7, 2020 · 86 comments

@metasim
Member

metasim commented Feb 7, 2020

This error originated in some RasterFrames work. We have a table in which one column points predominantly at the same file, and the analysis fails with one of several errors from GDALDataset, such as:

geotrellis.raster.gdal.GDALIOException: Unable to read in data. GDAL Error Code: 3
	at geotrellis.raster.gdal.GDALDataset$.readTile$extension(GDALDataset.scala:324)
...

or

geotrellis.raster.gdal.MalformedDataTypeException: Unable to determine NoData value. GDAL Exception Code: 3
	at geotrellis.raster.gdal.GDALDataset$.noDataValue$extension1(GDALDataset.scala:247)
...

(See below for extended output)

I removed RasterFrames from the mix, resulting in the test case below. (At this point I have not reduced it further to take Spark out of the mix by using, say, Futures instead.) It should be noted that some of the reads complete successfully.

When I run it on my laptop it completes successfully, but when I run it on a beefier EC2 instance (m5a.2xlarge) it fails. I suspect the concurrency level and I/O throughput set the conditions. It appears to work when setting --master=local[1].

Edit: my laptop is macOS, whereas the EC2 instance is Linux. That may be the pertinent variable instead of instance size. Ran in Docker locally with 4 cores and the job succeeded.
Edit: Configured Docker to run with 8 cores on my laptop and it failed!
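
A minimal sketch of how the Futures-based reduction mentioned above might look, assuming the failures depend only on how many threads read at once. `ConcurrentReadHarness` and its `run` method are hypothetical names introduced for illustration, not part of GeoTrellis or Spark. The harness uses only the Scala standard library: it runs a task `n` times on a fixed pool of `parallelism` threads and tallies the failures by exception message, so the same read can be compared at 1, 4, and 8 threads.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration
import scala.util.{Failure, Try}

// Hypothetical helper, not a GeoTrellis or Spark API.
object ConcurrentReadHarness {

  /** Runs `task` `n` times on a fixed pool of `parallelism` threads.
    * Returns the number of successful runs and the failure counts
    * keyed by exception message.
    */
  def run[T](n: Int, parallelism: Int)(task: Int => T): (Int, Map[String, Int]) = {
    val pool = Executors.newFixedThreadPool(parallelism)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      // Wrap each call in Try so that one failure does not hide the others.
      val futures = (0 until n).map(i => Future(Try(task(i))))
      val results = Await.result(Future.sequence(futures), Duration.Inf)
      val failures = results
        .collect { case Failure(e) => String.valueOf(e.getMessage) }
        .groupBy(identity)
        .map { case (msg, hits) => msg -> hits.size }
      (results.count(_.isSuccess), failures)
    } finally pool.shutdown()
  }
}

// Assumed usage with geotrellis-gdal on the classpath, reusing the calls
// from RSRead.scala:
//
//   ConcurrentReadHarness.run(1000, 8) { _ =>
//     val rs = GDALRasterSource(path)
//     val grid = GridBounds(0, 0, rs.cols - 1, rs.rows - 1)
//     rs.readBounds(grid.split(256, 256).toSeq).foreach(_ => ())
//   }
```

If the errors show up at a parallelism of 8 but not at 1, that would point at the concurrency level rather than at Spark itself. The sketch only catches non-fatal JVM exceptions, so a crash inside the native library would still take the process down.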

Test Case

RSRead.scala

import org.apache.spark.sql.SparkSession
import geotrellis.raster._
import geotrellis.raster.gdal.GDALRasterSource

// implicit val spark = SparkSession.builder().
//    master("local[*]").appName("Hit me").getOrCreate()

val path = "https://s22s-rasterframes-integration-tests.s3.amazonaws.com/B08.jp2"

spark.range(1000).rdd.
    map(_ => path).
    flatMap(uri => {
      val rs = GDALRasterSource(uri)
      val grid = GridBounds(0, 0, rs.cols - 1, rs.rows - 1)
      val tileBounds = grid.split(256, 256).toSeq
      rs.readBounds(tileBounds)
    }).
    foreach(r => ())

Execution Command

Using Spark 2.4.4, Scala 2.11.12, GDAL 2.4.3 (released 2019/10/28)

spark-shell --packages org.locationtech.geotrellis:geotrellis-gdal_2.11:3.2.0 --repositories https://dl.bintray.com/azavea/geotrellis -I RSRead.scala

Sample Backtrace

Full log output

org.apache.spark.SparkException: Job aborted due to stage failure: Task 5 in stage 0.0 failed 1 times, most recent failure: Lost task 5.0 in stage 0.0 (TID 5, localhost, executor driver): geotrellis.raster.gdal.MalformedDataTypeException: Unable to deterime the min/max values in order to calculate CellType. GDAL Error Code: 3
        at geotrellis.raster.gdal.GDALDataset$.cellType$extension1(GDALDataset.scala:299)
        at geotrellis.raster.gdal.GDALDataset$.readTile$extension(GDALDataset.scala:315)
        at geotrellis.raster.gdal.GDALDataset$$anonfun$readMultibandTile$extension$1.apply(GDALDataset.scala:333)
        at geotrellis.raster.gdal.GDALDataset$$anonfun$readMultibandTile$extension$1.apply(GDALDataset.scala:333)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.Iterator$class.foreach(Iterator.scala:891)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
        at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.AbstractTraversable.map(Traversable.scala:104)
        at geotrellis.raster.gdal.GDALDataset$.readMultibandTile$extension(GDALDataset.scala:333)
        at geotrellis.raster.gdal.GDALRasterSource$$anonfun$readBounds$2.apply(GDALRasterSource.scala:107)
        at geotrellis.raster.gdal.GDALRasterSource$$anonfun$readBounds$2.apply(GDALRasterSource.scala:106)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
        at geotrellis.raster.gdal.GDALRasterSource.read(GDALRasterSource.scala:156)
        at geotrellis.raster.RasterSource$$anonfun$readBounds$2.apply(RasterSource.scala:164)
        at geotrellis.raster.RasterSource$$anonfun$readBounds$2.apply(RasterSource.scala:164)
        at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
        at scala.collection.Iterator$class.foreach(Iterator.scala:891)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:927)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:927)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:123)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
  at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1889)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1877)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1876)
  at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
  at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1876)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
  at scala.Option.foreach(Option.scala:257)
  at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:926)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2110)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2059)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2048)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
  at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:737)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2082)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2101)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126)
  at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:927)
  at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:925)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
  at org.apache.spark.rdd.RDD.foreach(RDD.scala:925)
  ... 94 elided
Caused by: geotrellis.raster.gdal.MalformedDataTypeException: Unable to deterime the min/max values in order to calculate CellType. GDAL Error Code: 3
  at geotrellis.raster.gdal.GDALDataset$.cellType$extension1(GDALDataset.scala:299)
  at geotrellis.raster.gdal.GDALDataset$.readTile$extension(GDALDataset.scala:315)
  at geotrellis.raster.gdal.GDALDataset$$anonfun$readMultibandTile$extension$1.apply(GDALDataset.scala:333)
  at geotrellis.raster.gdal.GDALDataset$$anonfun$readMultibandTile$extension$1.apply(GDALDataset.scala:333)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
  at scala.collection.Iterator$class.foreach(Iterator.scala:891)
  at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
  at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
  at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
  at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
  at scala.collection.AbstractTraversable.map(Traversable.scala:104)
  at geotrellis.raster.gdal.GDALDataset$.readMultibandTile$extension(GDALDataset.scala:333)
  at geotrellis.raster.gdal.GDALRasterSource$$anonfun$readBounds$2.apply(GDALRasterSource.scala:107)
  at geotrellis.raster.gdal.GDALRasterSource$$anonfun$readBounds$2.apply(GDALRasterSource.scala:106)
  at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
  at geotrellis.raster.gdal.GDALRasterSource.read(GDALRasterSource.scala:156)
  at geotrellis.raster.RasterSource$$anonfun$readBounds$2.apply(RasterSource.scala:164)
  at geotrellis.raster.RasterSource$$anonfun$readBounds$2.apply(RasterSource.scala:164)
  at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
  at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
  at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
  at scala.collection.Iterator$class.foreach(Iterator.scala:891)
  at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
  at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:927)
  at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:927)
  at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
  at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:123)
  at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)

cc: @vpipkt

@pomadchin pomadchin added the bug label Feb 7, 2020
@metasim
Member Author

metasim commented Feb 7, 2020

GDAL formats in environment

$ gdalinfo --formats
Supported Formats:
  VRT -raster- (rw+v): Virtual Raster
  DERIVED -raster- (ro): Derived datasets using VRT pixel functions
  GTiff -raster- (rw+vs): GeoTIFF
  NITF -raster- (rw+vs): National Imagery Transmission Format
  RPFTOC -raster- (rovs): Raster Product Format TOC format
  ECRGTOC -raster- (rovs): ECRG TOC format
  HFA -raster- (rw+v): Erdas Imagine Images (.img)
  SAR_CEOS -raster- (rov): CEOS SAR Image
  CEOS -raster- (rov): CEOS Image
  JAXAPALSAR -raster- (rov): JAXA PALSAR Product Reader (Level 1.1/1.5)
  GFF -raster- (rov): Ground-based SAR Applications Testbed File Format (.gff)
  ELAS -raster- (rw+v): ELAS
  AIG -raster- (rov): Arc/Info Binary Grid
  AAIGrid -raster- (rwv): Arc/Info ASCII Grid
  GRASSASCIIGrid -raster- (rov): GRASS ASCII Grid
  SDTS -raster- (rov): SDTS Raster
  DTED -raster- (rwv): DTED Elevation Raster
  PNG -raster- (rwv): Portable Network Graphics
  JPEG -raster- (rwv): JPEG JFIF
  MEM -raster- (rw+): In Memory Raster
  JDEM -raster- (rov): Japanese DEM (.mem)
  GIF -raster- (rwv): Graphics Interchange Format (.gif)
  BIGGIF -raster- (rov): Graphics Interchange Format (.gif)
  ESAT -raster- (rov): Envisat Image Format
  FITS -raster- (rw+): Flexible Image Transport System
  BSB -raster- (rov): Maptech BSB Nautical Charts
  XPM -raster- (rwv): X11 PixMap Format
  BMP -raster- (rw+v): MS Windows Device Independent Bitmap
  DIMAP -raster- (rov): SPOT DIMAP
  AirSAR -raster- (rov): AirSAR Polarimetric Image
  RS2 -raster- (rovs): RadarSat 2 XML Product
  SAFE -raster- (rov): Sentinel-1 SAR SAFE Product
  PCIDSK -raster,vector- (rw+v): PCIDSK Database File
  PCRaster -raster- (rw+): PCRaster Raster File
  ILWIS -raster- (rw+v): ILWIS Raster Map
  SGI -raster- (rw+v): SGI Image File Format 1.0
  SRTMHGT -raster- (rwv): SRTMHGT File Format
  Leveller -raster- (rw+v): Leveller heightfield
  Terragen -raster- (rw+v): Terragen heightfield
  GMT -raster- (rw): GMT NetCDF Grid Format
  netCDF -raster,vector- (rw+s): Network Common Data Format
  HDF4 -raster- (ros): Hierarchical Data Format Release 4
  HDF4Image -raster- (rw+): HDF4 Dataset
  ISIS3 -raster- (rw+v): USGS Astrogeology ISIS cube (Version 3)
  ISIS2 -raster- (rw+v): USGS Astrogeology ISIS cube (Version 2)
  PDS -raster- (rov): NASA Planetary Data System
  PDS4 -raster- (rw+vs): NASA Planetary Data System 4
  VICAR -raster- (rov): MIPL VICAR file
  TIL -raster- (rov): EarthWatch .TIL
  ERS -raster- (rw+v): ERMapper .ers Labelled
  JP2OpenJPEG -raster,vector- (rwv): JPEG-2000 driver based on OpenJPEG library
  L1B -raster- (rovs): NOAA Polar Orbiter Level 1b Data Set
  FIT -raster- (rwv): FIT Image
  GRIB -raster- (rwv): GRIdded Binary (.grb, .grb2)
  RMF -raster- (rw+v): Raster Matrix Format
  WCS -raster- (rovs): OGC Web Coverage Service
  WMS -raster- (rwvs): OGC Web Map Service
  MSGN -raster- (rov): EUMETSAT Archive native (.nat)
  RST -raster- (rw+v): Idrisi Raster A.1
  INGR -raster- (rw+v): Intergraph Raster
  GSAG -raster- (rwv): Golden Software ASCII Grid (.grd)
  GSBG -raster- (rw+v): Golden Software Binary Grid (.grd)
  GS7BG -raster- (rw+v): Golden Software 7 Binary Grid (.grd)
  COSAR -raster- (rov): COSAR Annotated Binary Matrix (TerraSAR-X)
  TSX -raster- (rov): TerraSAR-X Product
  COASP -raster- (ro): DRDC COASP SAR Processor Raster
  R -raster- (rwv): R Object Data Store
  MAP -raster- (rov): OziExplorer .MAP
  KMLSUPEROVERLAY -raster- (rwv): Kml Super Overlay
  PDF -raster,vector- (rw+vs): Geospatial PDF
  Rasterlite -raster- (rwvs): Rasterlite
  MBTiles -raster,vector- (rw+v): MBTiles
  PLMOSAIC -raster- (ro): Planet Labs Mosaics API
  CALS -raster- (rwv): CALS (Type 1)
  WMTS -raster- (rwv): OGC Web Map Tile Service
  SENTINEL2 -raster- (rovs): Sentinel 2
  MRF -raster- (rw+v): Meta Raster Format
  PNM -raster- (rw+v): Portable Pixmap Format (netpbm)
  DOQ1 -raster- (rov): USGS DOQ (Old Style)
  DOQ2 -raster- (rov): USGS DOQ (New Style)
  PAux -raster- (rw+v): PCI .aux Labelled
  MFF -raster- (rw+v): Vexcel MFF Raster
  MFF2 -raster- (rw+): Vexcel MFF2 (HKV) Raster
  FujiBAS -raster- (rov): Fuji BAS Scanner Image
  GSC -raster- (rov): GSC Geogrid
  FAST -raster- (rov): EOSAT FAST Format
  BT -raster- (rw+v): VTP .bt (Binary Terrain) 1.3 Format
  LAN -raster- (rw+v): Erdas .LAN/.GIS
  CPG -raster- (rov): Convair PolGASP
  IDA -raster- (rw+v): Image Data and Analysis
  NDF -raster- (rov): NLAPS Data Format
  EIR -raster- (rov): Erdas Imagine Raw
  DIPEx -raster- (rov): DIPEx
  LCP -raster- (rwv): FARSITE v.4 Landscape File (.lcp)
  GTX -raster- (rw+v): NOAA Vertical Datum .GTX
  LOSLAS -raster- (rov): NADCON .los/.las Datum Grid Shift
  NTv1 -raster- (rov): NTv1 Datum Grid Shift
  NTv2 -raster- (rw+vs): NTv2 Datum Grid Shift
  CTable2 -raster- (rw+v): CTable2 Datum Grid Shift
  ACE2 -raster- (rov): ACE2
  SNODAS -raster- (rov): Snow Data Assimilation System
  KRO -raster- (rw+v): KOLOR Raw
  ROI_PAC -raster- (rw+v): ROI_PAC raster
  RRASTER -raster- (rw+v): R Raster
  BYN -raster- (rw+v): Natural Resources Canada's Geoid
  ARG -raster- (rwv): Azavea Raster Grid format
  RIK -raster- (rov): Swedish Grid RIK (.rik)
  USGSDEM -raster- (rwv): USGS Optional ASCII DEM (and CDED)
  GXF -raster- (rov): GeoSoft Grid Exchange Format
  DODS -raster- (ro): DAP 3.x servers
  KEA -raster- (rw+): KEA Image Format (.kea)
  BAG -raster- (rwv): Bathymetry Attributed Grid
  HDF5 -raster- (rovs): Hierarchical Data Format Release 5
  HDF5Image -raster- (rov): HDF5 Dataset
  NWT_GRD -raster- (rw+v): Northwood Numeric Grid Format .grd/.tab
  NWT_GRC -raster- (rov): Northwood Classified Grid Format .grc/.tab
  ADRG -raster- (rw+vs): ARC Digitized Raster Graphics
  SRP -raster- (rovs): Standard Raster Product (ASRP/USRP)
  BLX -raster- (rwv): Magellan topo (.blx)
  PostGISRaster -raster- (rws): PostGIS Raster driver
  SAGA -raster- (rw+v): SAGA GIS Binary Grid (.sdat, .sg-grd-z)
  IGNFHeightASCIIGrid -raster- (rov): IGN France height correction ASCII Grid
  XYZ -raster- (rwv): ASCII Gridded XYZ
  HF2 -raster- (rwv): HF2/HFZ heightfield raster
  OZI -raster- (rov): OziExplorer Image File
  CTG -raster- (rov): USGS LULC Composite Theme Grid
  E00GRID -raster- (rov): Arc/Info Export E00 GRID
  ZMap -raster- (rwv): ZMap Plus Grid
  NGSGEOID -raster- (rov): NOAA NGS Geoid Height Grids
  IRIS -raster- (rov): IRIS data (.PPI, .CAPPi etc)
  PRF -raster- (rov): Racurs PHOTOMOD PRF
  RDA -raster- (ro): DigitalGlobe Raster Data Access driver
  EEDAI -raster- (ros): Earth Engine Data API Image
  SIGDEM -raster- (rwv): Scaled Integer Gridded DEM .sigdem
  GPKG -raster,vector- (rw+vs): GeoPackage
  CAD -raster,vector- (rovs): AutoCAD Driver
  PLSCENES -raster,vector- (ro): Planet Labs Scenes API
  NGW -raster,vector- (rw+s): NextGIS Web
  GenBin -raster- (rov): Generic Binary (.hdr Labelled)
  ENVI -raster- (rw+v): ENVI .hdr Labelled
  EHdr -raster- (rw+v): ESRI .hdr Labelled
  ISCE -raster- (rw+v): ISCE raster
  HTTP -raster,vector- (ro): HTTP Fetching Wrapper

@pomadchin
Member

pomadchin commented Feb 7, 2020

@metasim just to confirm, have you tried GDAL 2.4.4? OSGeo/gdal#1244
Does the same issue happen with TIFFs, or only with JP2K?

@metasim
Member Author

metasim commented Feb 7, 2020

@metasim just to confirm, have you tried GDAL 2.4.4? OSGeo/gdal#1244

Not sure... I'll try that later today.

Does the same issue happen with TIFFs, or only with JP2K?

Don't know. Took me a week to get to a repeatable test case, so those sorts of refinements are needed.

@metasim
Member Author

metasim commented Feb 7, 2020

@pomadchin Confirmed the bug occurs under GDAL 2.4.4, released 2020/01/08

19:15:34 ERROR Executor: Exception in task 3.0 in stage 0.0 (TID 3)
geotrellis.raster.gdal.MalformedDataException: Unable to construct a RasterExtent from the Transformation given. GDAL Error Code: 3
        at geotrellis.raster.gdal.GDALDataset$.rasterExtent$extension1(GDALDataset.scala:143)
        at geotrellis.raster.gdal.GDALRasterSource.gridExtent$lzycompute(GDALRasterSource.scala:93)
        at geotrellis.raster.gdal.GDALRasterSource.gridExtent(GDALRasterSource.scala:93)
        at geotrellis.raster.RasterMetadata$class.cols(RasterMetadata.scala:52)
        at geotrellis.raster.RasterSource.cols(RasterSource.scala:44)
        at $line20.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$anonfun$2.apply(RSRead.scala:36)
        at $line20.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$anonfun$2.apply(RSRead.scala:34)
        at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
        at scala.collection.Iterator$class.foreach(Iterator.scala:891)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:927)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:927)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:123)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
19:15:34 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
geotrellis.raster.gdal.MalformedDataException: A bandCount of <= 0 was found. GDAL Error Code: 3
        at geotrellis.raster.gdal.GDALDataset$.bandCount$extension1(GDALDataset.scala:206)
        at geotrellis.raster.gdal.GDALDataset$.bandCount$extension0(GDALDataset.scala:196)
        at geotrellis.raster.gdal.GDALRasterSource.bandCount$lzycompute(GDALRasterSource.scala:82)
        at geotrellis.raster.gdal.GDALRasterSource.bandCount(GDALRasterSource.scala:82)
        at geotrellis.raster.RasterSource$$anonfun$readBounds$2.apply(RasterSource.scala:164)
        at geotrellis.raster.RasterSource$$anonfun$readBounds$2.apply(RasterSource.scala:164)
        at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
        at scala.collection.Iterator$class.foreach(Iterator.scala:891)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:927)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:927)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:123)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
19:15:34 WARN TaskSetManager: Lost task 3.0 in stage 0.0 (TID 3, localhost, executor driver): geotrellis.raster.gdal.MalformedDataException: Unable to construct a RasterExtent from the Transformation given. GDAL Error Code: 3
        at geotrellis.raster.gdal.GDALDataset$.rasterExtent$extension1(GDALDataset.scala:143)
        at geotrellis.raster.gdal.GDALRasterSource.gridExtent$lzycompute(GDALRasterSource.scala:93)
        at geotrellis.raster.gdal.GDALRasterSource.gridExtent(GDALRasterSource.scala:93)
        at geotrellis.raster.RasterMetadata$class.cols(RasterMetadata.scala:52)
        at geotrellis.raster.RasterSource.cols(RasterSource.scala:44)
        at $line20.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$anonfun$2.apply(RSRead.scala:36)
        at $line20.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$anonfun$2.apply(RSRead.scala:34)
        at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
        at scala.collection.Iterator$class.foreach(Iterator.scala:891)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:927)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:927)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:123)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

19:15:34 ERROR TaskSetManager: Task 3 in stage 0.0 failed 1 times; aborting job
19:15:34 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): geotrellis.raster.gdal.MalformedDataException: A bandCount of <= 0 was found. GDAL Error Code: 3
        at geotrellis.raster.gdal.GDALDataset$.bandCount$extension1(GDALDataset.scala:206)
        at geotrellis.raster.gdal.GDALDataset$.bandCount$extension0(GDALDataset.scala:196)
        at geotrellis.raster.gdal.GDALRasterSource.bandCount$lzycompute(GDALRasterSource.scala:82)
        at geotrellis.raster.gdal.GDALRasterSource.bandCount(GDALRasterSource.scala:82)
        at geotrellis.raster.RasterSource$$anonfun$readBounds$2.apply(RasterSource.scala:164)
        at geotrellis.raster.RasterSource$$anonfun$readBounds$2.apply(RasterSource.scala:164)
        at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
        at scala.collection.Iterator$class.foreach(Iterator.scala:891)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:927)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:927)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:123)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

[Stage 0:>                                                                                                                     (0 + 6) / 8]
org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 0.0 failed 1 times, most recent failure: Lost task 3.0 in stage 0.0 (TID 3, localhost, executor driver): geotrellis.raster.gdal.MalformedDataException: Unable to construct a RasterExtent from the Transformation given. GDAL Error Code: 3
        at geotrellis.raster.gdal.GDALDataset$.rasterExtent$extension1(GDALDataset.scala:143)
        at geotrellis.raster.gdal.GDALRasterSource.gridExtent$lzycompute(GDALRasterSource.scala:93)
        at geotrellis.raster.gdal.GDALRasterSource.gridExtent(GDALRasterSource.scala:93)
        at geotrellis.raster.RasterMetadata$class.cols(RasterMetadata.scala:52)
        at geotrellis.raster.RasterSource.cols(RasterSource.scala:44)
        at $anonfun$2.apply(RSRead.scala:36)
        at $anonfun$2.apply(RSRead.scala:34)
        at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
        at scala.collection.Iterator$class.foreach(Iterator.scala:891)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:927)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:927)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:123)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
  at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1889)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1877)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1876)
  at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
  at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1876)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
  at scala.Option.foreach(Option.scala:257)
  at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:926)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2110)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2059)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2048)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
  at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:737)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2082)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2101)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126)
  at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:927)
  at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:925)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
  at org.apache.spark.rdd.RDD.foreach(RDD.scala:925)
  ... 94 elided
Caused by: geotrellis.raster.gdal.MalformedDataException: Unable to construct a RasterExtent from the Transformation given. GDAL Error Code: 3
  at geotrellis.raster.gdal.GDALDataset$.rasterExtent$extension1(GDALDataset.scala:143)
  at geotrellis.raster.gdal.GDALRasterSource.gridExtent$lzycompute(GDALRasterSource.scala:93)
  at geotrellis.raster.gdal.GDALRasterSource.gridExtent(GDALRasterSource.scala:93)
  at geotrellis.raster.RasterMetadata$class.cols(RasterMetadata.scala:52)
  at geotrellis.raster.RasterSource.cols(RasterSource.scala:44)
  at $anonfun$2.apply(RSRead.scala:36)
  at $anonfun$2.apply(RSRead.scala:34)
  at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
  at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
  at scala.collection.Iterator$class.foreach(Iterator.scala:891)
  at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
  at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:927)
  at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:927)
  at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
  at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:123)
  at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)

@pomadchin
Member

@metasim perfect (in terms of debugging) :D

@metasim
Member Author

metasim commented Feb 7, 2020

Just ran test against this GeoTIFF:

https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/017/033/LC08_L1TP_017033_20181010_20181030_01_T1/LC08_L1TP_017033_20181010_20181030_01_T1_B4.TIF

And it does complete successfully. Perhaps it's a GDAL JP2 issue?

@pomadchin
Member

@metasim ¯\_(ツ)_/¯ this requires a bit more investigation; it could just be because of the random nature of this issue. I wish we could reproduce it on a laptop :/

@metasim
Member Author

metasim commented Feb 10, 2020

In RasterFrames, I added a global thread lock around GDALRasterSource reads of JP2 files, and with it the job completes (albeit extremely slowly). Another sign pointing toward a race condition.

Screen Shot 2020-02-10 at 11 17 24 AM
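For readers following along, here is a minimal sketch of the global-lock workaround described above, written in Python for brevity. It is not the RasterFrames code: `read_with_jp2_lock` and `read_fn` are hypothetical names, and the only assumption carried over from the comment is that JP2 reads are serialized while other formats are left alone.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One process-wide lock shared by every reader thread (hypothetical name).
_JP2_LOCK = threading.Lock()


def read_with_jp2_lock(path, read_fn):
    """Run read_fn(path), serializing calls for JPEG 2000 files.

    read_fn is a stand-in for whatever actually reads the raster; it is an
    assumption of this sketch, not a GeoTrellis or RasterFrames function.
    """
    if path.lower().endswith(".jp2"):
        # Only one thread at a time may be inside a JP2 read.
        with _JP2_LOCK:
            return read_fn(path)
    # Other formats (e.g. GeoTIFF) are read concurrently as before.
    return read_fn(path)


def read_all(paths, read_fn, workers=8):
    """Read many paths on a thread pool, applying the JP2 lock per read."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: read_with_jp2_lock(p, read_fn), paths))
```

Serializing every JP2 read is what makes the job so slow: with 8 workers, at most one of them is doing JP2 I/O at any moment. That fits the race-condition theory but does not prove it.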

metasim added a commit to s22s/rasterframes that referenced this issue Feb 10, 2020
@pomadchin
Member

pomadchin commented Feb 10, 2020

@metasim sounds really sad, slow, and not too reliable

@metasim
Member Author

metasim commented Feb 10, 2020

Looking to try to reproduce at a lower level.

@metasim
Member Author

metasim commented Feb 10, 2020

Wondering if this might be the cause (fixed in 3.0.2):

https://github.com/OSGeo/gdal/blob/ee535a1a3f5b35b0d231e1faac89ac1f889f7988/gdal/NEWS#L232-L238

@pomadchin
Member

@metasim I think it makes sense to try to use GDAL 3.0.4

@metasim
Member Author

metasim commented Feb 10, 2020

Working on it.

@metasim
Member Author

metasim commented Feb 10, 2020

@pomadchin gdal-warp-bindings won't link against 3.0.4... looks like it's requiring the 2.x line.

java.lang.UnsatisfiedLinkError: /tmp/nativeutils837692180397/libgdalwarp_bindings.so: libgdal.so.20: cannot open shared object file: No such file or directory

@metasim
Member Author

metasim commented Feb 11, 2020

I was able to hack together a new gdal-warp-bindings for Linux linked against GDAL 3.0.4. Good news is that they link:

from pyrasterframes.utils import gdal_version
gdal_version()
...
'GDAL 3.0.4, released 2020/01/28'

Bad news is that the bug is still there. 😢

geotrellis.raster.gdal.MalformedDataException: Unable to construct a RasterExtent from the Transformation given. GDAL Error Code: 3
	at geotrellis.raster.gdal.GDALDataset$.rasterExtent$extension1(GDALDataset.scala:143)
	at geotrellis.raster.gdal.GDALRasterSource.gridExtent$lzycompute(GDALRasterSource.scala:93)
	at geotrellis.raster.gdal.GDALRasterSource.gridExtent(GDALRasterSource.scala:93)

JupyterLab.pdf

@metasim
Member Author

metasim commented Feb 11, 2020

BTW, it may be worth trying to run the Test Case on a non-AWS Linux machine or Docker container. My laptop is MacOS, so OS is a variable changed between local vs remote execution. It may not have to do with it being EC2 or a particular instance size.

metasim added a commit to s22s/rasterframes that referenced this issue Feb 11, 2020
* feature/gt-3.0:
  Made JP2 GDAL thread lock configurable.
  Added global thread lock on JP2 GDAL file reading. See locationtech/geotrellis#3184
  Refactor clip to clamp
  Use rf_local_clip and other viz functions in supervised learning doc page
  rf_rescale tests and refinements
  Add Rescale function to Scala, Python and SQL APIs
  Add rf_where and rf_standardize functions
  Add rf_local_min, rf_local_max, and rf_local_clip functions
@metasim
Member Author

metasim commented Feb 11, 2020

Test case using custom build

Custom gdal-warp-bindings built against GDAL 3.0.4, Custom GeoTrellis 3.2.x build.

First create a shell in the environment:

$ docker run -it s22s/rasterframes-notebook:0.9.0-astraea.452747b4  bash
wget https://gist.githubusercontent.com/metasim/5332ac959d97d9747921197cd4307948/raw/662687c9b5c52083b007b451b6530f0505b2c9fc/ParallelJP2.scala && echo ':load ParallelJP2.scala' | spark-shell --jars /opt/conda/lib/python3.7/site-packages/pyrasterframes/jars/pyrasterframes-assembly-0.9.0-astraea.452747b4.jar 

Note: Running this locally does not fail. Maybe 8 or more cores are needed?

Edit: With Docker on MacOS configured with all 8 cores, the job above does indeed fail.

@metasim
Member Author

metasim commented Feb 11, 2020

Custom gdal-warp-bindings

Create the file testing.list in gdal-warp-bindings/Docker with this:

deb  [ allow-insecure=yes ] http://http.us.debian.org/debian testing main non-free contrib
deb-src  [ allow-insecure=yes ] http://http.us.debian.org/debian testing main non-free contrib

Replace the # Build GDAL 2.4.3 Linux section of gdal-warp-bindings/Docker/Dockerfile.environment with this:

COPY testing.list  /etc/apt/sources.list.d/
RUN apt-get update -q && apt-get install -y -q --allow-unauthenticated libgdal-dev=3.0.4+dfsg-1

Build the image and note its ID (or tag it).

In the gdal-warp-bindings directory run

docker run -it --rm \
      -v $(pwd):/workdir \
      -e CC=gcc -e CXX=g++ \
      -e CFLAGS="-Wall -Wno-sign-compare -Werror -O0 -ggdb3 -DSO_FINI -D_GNU_SOURCE" \
      -e BOOST_ROOT="/usr/local/include/boost_1_69_0" \
      -e JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64" \
      <image tag or ID from above> make -j4 -C src tests

Note the location of file gdal-warp-bindings/src/main/gdalwarp.jar.

Edit geotrellis/project/Dependencies.scala and replace

val gdalWarp            = "com.azavea.gdal"              % "gdal-warp-bindings"      % Version.gdalWarp

with

val gdalWarp = "com.azavea.gdal" % "gdal-warp-bindings"  % Version.gdalWarp from("file:/path/to/gdal-warp-bindings/src/main/gdalwarp.jar")

Build GeoTrellis.

@metasim
Member Author

metasim commented Feb 11, 2020

Tweaking Number of Cores in Docker

  • 8: Fail
  • 7: Fail
  • 5: Fail
  • 4: Success
  • 3: Success

Edit: I was running this at home over mediocre WiFi. The office environment is 1Gbps wired.

@vpipkt
Member

vpipkt commented Feb 12, 2020

Update on the script to reproduce it. From within the docker container:

 $ PROJ_LIB=/opt/conda/share/proj spark-shell --master local[8] --jars /opt/conda/lib/python3.7/site-packages/pyrasterframes/jars/pyrasterframes-assembly-0.9.0-astraea.452747b4.jar 
 scala> :load ParallelJP2.scala

However, I could not reproduce the failure with 8 cores.

Test case using custom build

Custom gdal-warp-bindings built against GDAL 3.0.4, Custom GeoTrellis 3.2.x build.

First create a shell in the environment:

$ docker run -it s22s/rasterframes-notebook:0.9.0-astraea.452747b4  bash
$ wget https://gist.githubusercontent.com/metasim/5332ac959d97d9747921197cd4307948/raw/662687c9b5c52083b007b451b6530f0505b2c9fc/ParallelJP2.scala
$ spark-shell --jars /opt/conda/lib/python3.7/site-packages/pyrasterframes/jars/pyrasterframes-assembly-0.9.0-astraea.452747b4.jar 
scala> :load ParallelJP2.scala

Note: Running this locally does not fail. Maybe 8 or more cores are needed?

Edit: With Docker on MacOS configured with all 8 cores, the job above does indeed fail.

@metasim
Member Author

metasim commented Feb 12, 2020

@vpipkt What happens if you leave out the --master local[8]? I did not specify the number of cores that way.... I just left it to Spark defaults, but configured Docker to have 8 cores.

@metasim
Member Author

metasim commented Feb 12, 2020

@vpipkt Also, if you re-run it, can you do docker pull s22s/rasterframes-notebook:0.9.0-astraea.452747b4 first? I updated it to have the PROJ_LIB done for you.

@vpipkt
Member

vpipkt commented Feb 12, 2020

I pulled the image again (image id 26d9771deb79), and ran again omitting the explicit --master local[8] and did not reproduce the bug. :-(

@metasim
Member Author

metasim commented Feb 12, 2020

Same.... on wired internet at work it's passing. 😠 These results were from running it at home on mediocre WiFi.

@metasim
Member Author

metasim commented Feb 12, 2020

Using my phone's hot spot with 8 cores, it fails.

@metasim
Member Author

metasim commented Feb 12, 2020

Bandwidth Limiting on MacOS

The "Additional Tools for Xcode 11" package includes a tool called Network Link Conditioner that simulates slow or error prone networks:

Screen Shot 2020-02-12 at 10 08 21 AM

Using this tool (and remembering to flip the "On" switch) makes the test fail.

Edit: If it disappears from your System Preferences after install, do this: https://agilewarrior.wordpress.com/2018/10/31/trouble-installing-link-conditioner/

@pomadchin
Member

pomadchin commented Mar 2, 2020

Also added a new issue to make this exception easier to track in the future: geotrellis/gdal-warp-bindings#83

@metasim
Member Author

metasim commented Mar 5, 2020

Still getting this error in our environment, but need to confirm that dependencyOverrides propagated to assembly generation:

geotrellis.raster.gdal.MalformedDataException: Unable to construct a RasterExtent from the Transformation given. GDAL Error Code: 3
	at geotrellis.raster.gdal.GDALDataset$.rasterExtent$extension1(GDALDataset.scala:143)
	at geotrellis.raster.gdal.GDALRasterSource.gridExtent$lzycompute(GDALRasterSource.scala:93)
	at geotrellis.raster.gdal.GDALRasterSource.gridExtent(GDALRasterSource.scala:93)

@metasim
Member Author

metasim commented Mar 5, 2020

Confirmed md5sum values are the same :-(

@pomadchin
Member

@metasim gotcha; I'll move it into in progress and will work on geotrellis/gdal-warp-bindings#83 and geotrellis/gdal-warp-bindings#84 next, so that a unique error is thrown when a locked dataset still cannot be acquired.

@pomadchin
Member

After releasing geotrellis/gdal-warp-bindings#83 I will ask you to run the tests again; if this error happens again, we'll add a parameterized timeout setting (it is sleep(0) by default).
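As an illustration of the idea in this comment (a bounded number of attempts to acquire a locked dataset, with a configurable pause between attempts and a dedicated error when they run out), here is a rough Python sketch. It is not the gdal-warp-bindings implementation: `acquire_with_retry`, `try_acquire`, and `DatasetLockedError` are hypothetical names chosen for this example.

```python
import time


class DatasetLockedError(Exception):
    """Hypothetical error: the dataset was still locked after all attempts."""


def acquire_with_retry(try_acquire, attempts, delay_seconds=0.0, sleep=time.sleep):
    """Call try_acquire() up to `attempts` times.

    try_acquire is assumed to return a dataset handle, or None while the
    dataset is locked by another thread. delay_seconds defaults to 0.0,
    mirroring the sleep(0) default mentioned in the comment.
    """
    for attempt in range(attempts):
        handle = try_acquire()
        if handle is not None:
            return handle
        # Pause before the next attempt, but not after the final one.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    # A distinct error type keeps "still locked" separate from real I/O failures.
    raise DatasetLockedError(f"dataset still locked after {attempts} attempts")
```

Raising a distinct error type is the point of the proposal: a caller can then tell "ran out of attempts" apart from a genuine GDAL read failure, which a bare error code does not allow.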

@metasim
Member Author

metasim commented Mar 5, 2020

The error happens in this notebook Docker environment:

s22s/rasterframes-notebook:0.9.0-1ce1ff3

The test triggering it is attached:

gt-3184-test.zip

Edit: misinterpreted error message in this environment. Ignore for now while I revisit.

@metasim
Member Author

metasim commented Mar 5, 2020

To be clear, we are still having the error identified here in our environment: #3184 (comment)

Just need to fix the RasterFrames notebook to reproduce.

@pomadchin
Member

pomadchin commented Mar 5, 2020

@metasim could you also print all the available GDALOptions from the application.conf file?

println(geotrellis.raster.gdal.config.GDALOptionsConfig.conf)

@metasim
Member Author

metasim commented Mar 5, 2020

'GDALOptionsConfig(Map(CPL_VSIL_CURL_CHUNK_SIZE -> 1000000, CPL_VSIL_CURL_ALLOWED_EXTENSIONS -> .tif,.tiff,.jp2,.mrf,.idx,.lrc,.mrf.aux.xml,.vrt, AWS_REQUEST_PAYER -> requester, GDAL_HTTP_MAX_RETRY -> 4, GDAL_PAM_ENABLED -> NO, GDAL_DISABLE_READDIR_ON_OPEN -> YES, GDAL_CACHEMAX -> 512, GDAL_HTTP_RETRY_DELAY -> 1),List(SOURCE, WARPED),1048576)'

Also: running against GDAL 2.4.4

@pomadchin
Member

pomadchin commented Mar 5, 2020

@metasim look into configuration:

GDALOptionsConfig(...,1048576)

@metasim
Member Author

metasim commented Mar 5, 2020

@pomadchin
Member

pomadchin commented Mar 5, 2020

@metasim it wasn't picked up; are you sure that the assembly jar contains the appropriate configuration file?

@metasim
Member Author

metasim commented Mar 5, 2020

Good question... I'll double check.

@metasim
Member Author

metasim commented Mar 5, 2020

crap... the assembly merged the geotrellis reference.conf and the rasterframes reference.conf, with the former overriding the latter.

Suggestions on how to override a GT setting?... have an application.conf in the assembly?

@pomadchin
Member

@metasim application.conf can be the way, and you can probably also work on the merge strategy for the reference.conf files

@pomadchin
Member

Another option is to keep only your reference.conf ._. or to exclude the GDAL reference.conf

@metasim
Member Author

metasim commented Mar 5, 2020

Testing with -Dgeotrellis.raster.gdal.number-of-attempts=2147483647 and so far it's still running.

@pomadchin
Member

@metasim have you printed the GDALOptionsConfig after adding this option?

@metasim
Member Author

metasim commented Mar 5, 2020

I moved the geotrellis overrides to an application.conf (as they should be) and can see the value getting set properly:

'GDALOptionsConfig(Map(CPL_VSIL_CURL_CHUNK_SIZE -> 1000000, CPL_VSIL_CURL_ALLOWED_EXTENSIONS -> .tif,.tiff,.jp2,.mrf,.idx,.lrc,.mrf.aux.xml,.vrt, AWS_REQUEST_PAYER -> requester, GDAL_HTTP_MAX_RETRY -> 10, CPL_DEBUG -> ON, GDAL_PAM_ENABLED -> NO, GDAL_DISABLE_READDIR_ON_OPEN -> YES, GDAL_CACHEMAX -> 512, GDAL_HTTP_RETRY_DELAY -> 2),List(SOURCE, WARPED),2147483647)'
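The reason the move to application.conf works can be sketched as a layered lookup. In the standard Typesafe Config load order, system properties override application.conf, which in turn overrides reference.conf. The Python below only models that order; the values are the ones printed in this thread, under the assumption that the last field of GDALOptionsConfig is the number of attempts. `resolve_setting` is a hypothetical helper, not part of any library.

```python
def resolve_setting(key, system_properties, application_conf, reference_conf):
    """Return the value for key from the highest-precedence layer that has it.

    Layers are checked in the order: system properties (-D flags),
    then application.conf, then reference.conf.
    """
    for layer in (system_properties, application_conf, reference_conf):
        if key in layer:
            return layer[key]
    raise KeyError(key)


KEY = "geotrellis.raster.gdal.number-of-attempts"

# Assumed mapping of the printed values onto the config layers.
reference_conf = {KEY: 1048576}        # library default seen in the earlier output
application_conf = {KEY: 2147483647}   # override placed in application.conf

# With the override in application.conf, the library default no longer wins.
effective = resolve_setting(KEY, {}, application_conf, reference_conf)
```

This also shows why putting the override in a second reference.conf was fragile: both files sit at the same layer, so which value survives depends on how the assembly merges them.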

@metasim
Member Author

metasim commented Mar 5, 2020

Sadly, after all this, it still appears to be happening:

Caused by: geotrellis.raster.gdal.MalformedDataException: Unable to construct a RasterExtent from the Transformation given. GDAL Error Code: 4
	at geotrellis.raster.gdal.GDALDataset$.rasterExtent$extension1(GDALDataset.scala:143)
	at geotrellis.raster.gdal.GDALRasterSource.gridExtent$lzycompute(GDALRasterSource.scala:93)
Py4JJavaError: An error occurred while calling o123.collectToPython.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 135 in stage 1.0 failed 1 times, most recent failure: Lost task 135.0 in stage 1.0 (TID 192, localhost, executor driver): java.lang.IllegalArgumentException: Error fetching data for one of: GDALRasterSource(s3://sentinel-s2-l2a/tiles/22/L/EP/2019/5/31/0/R60m/B08.jp2), GDALRasterSource(s3://sentinel-s2-l2a/tiles/22/L/EP/2019/5/31/0/R60m/B12.jp2), GDALRasterSource(s3://sentinel-s2-l2a/tiles/22/L/EP/2019/9/13/0/R60m/B08.jp2), GDALRasterSource(s3://sentinel-s2-l2a/tiles/22/L/EP/2019/9/13/0/R60m/B12.jp2)
	at org.locationtech.rasterframes.expressions.generators.RasterSourceToRasterRefs.eval(RasterSourceToRasterRefs.scala:81)
	at org.apache.spark.sql.execution.GenerateExec$$anonfun$1$$anonfun$3.apply(GenerateExec.scala:95)
	at org.apache.spark.sql.execution.GenerateExec$$anonfun$1$$anonfun$3.apply(GenerateExec.scala:92)
	at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
	at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
	at scala.collection.Iterator$JoinIterator.hasNext(Iterator.scala:212)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$13$$anon$1.hasNext(WholeStageCodegenExec.scala:636)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:255)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:836)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:836)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:123)
	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: geotrellis.raster.gdal.MalformedDataException: Unable to construct a RasterExtent from the Transformation given. GDAL Error Code: 4
	at geotrellis.raster.gdal.GDALDataset$.rasterExtent$extension1(GDALDataset.scala:143)
	at geotrellis.raster.gdal.GDALRasterSource.gridExtent$lzycompute(GDALRasterSource.scala:93)
	at geotrellis.raster.gdal.GDALRasterSource.gridExtent(GDALRasterSource.scala:93)
	at geotrellis.raster.RasterMetadata$class.cols(RasterMetadata.scala:52)
	at geotrellis.raster.RasterSource.cols(RasterSource.scala:44)
	at org.locationtech.rasterframes.ref.SimpleRasterInfo$.apply(SimpleRasterInfo.scala:71)
	at org.locationtech.rasterframes.ref.GDALRasterSource$$anonfun$tiffInfo$1.apply(GDALRasterSource.scala:53)
	at org.locationtech.rasterframes.ref.GDALRasterSource$$anonfun$tiffInfo$1.apply(GDALRasterSource.scala:53)
	at scala.compat.java8.functionConverterImpls.AsJavaFunction.apply(FunctionConverters.scala:262)
	at com.github.benmanes.caffeine.cache.LocalCache.lambda$statsAware$0(LocalCache.java:139)
	at com.github.benmanes.caffeine.cache.UnboundedLocalCache.lambda$computeIfAbsent$2(UnboundedLocalCache.java:238)
	at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
	at com.github.benmanes.caffeine.cache.UnboundedLocalCache.computeIfAbsent(UnboundedLocalCache.java:234)
	at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108)
	at com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:62)
	at com.github.blemale.scaffeine.Cache.get(Cache.scala:40)
	at org.locationtech.rasterframes.ref.SimpleRasterInfo$.apply(SimpleRasterInfo.scala:49)
	at org.locationtech.rasterframes.ref.GDALRasterSource.tiffInfo(GDALRasterSource.scala:53)
	at org.locationtech.rasterframes.ref.GDALRasterSource.extent(GDALRasterSource.scala:57)
	at org.locationtech.rasterframes.ref.RFRasterSource.rasterExtent(RFRasterSource.scala:71)
	at org.locationtech.rasterframes.expressions.generators.RasterSourceToRasterRefs$$anonfun$1.apply(RasterSourceToRasterRefs.scala:65)
	at org.locationtech.rasterframes.expressions.generators.RasterSourceToRasterRefs$$anonfun$1.apply(RasterSourceToRasterRefs.scala:63)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
	at org.locationtech.rasterframes.expressions.generators.RasterSourceToRasterRefs.eval(RasterSourceToRasterRefs.scala:63)
	... 29 more

Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1889)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1877)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1876)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1876)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:926)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2110)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2059)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2048)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:737)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2082)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2101)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:945)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.RDD.collect(RDD.scala:944)
at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:299)
at org.apache.spark.sql.Dataset$$anonfun$collectToPython$1.apply(Dataset.scala:3263)
at org.apache.spark.sql.Dataset$$anonfun$collectToPython$1.apply(Dataset.scala:3260)
at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3370)
at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3369)
at org.apache.spark.sql.Dataset.collectToPython(Dataset.scala:3260)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Error fetching data for one of: GDALRasterSource(s3://sentinel-s2-l2a/tiles/22/L/EP/2019/5/31/0/R60m/B08.jp2), GDALRasterSource(s3://sentinel-s2-l2a/tiles/22/L/EP/2019/5/31/0/R60m/B12.jp2), GDALRasterSource(s3://sentinel-s2-l2a/tiles/22/L/EP/2019/9/13/0/R60m/B08.jp2), GDALRasterSource(s3://sentinel-s2-l2a/tiles/22/L/EP/2019/9/13/0/R60m/B12.jp2)
at org.locationtech.rasterframes.expressions.generators.RasterSourceToRasterRefs.eval(RasterSourceToRasterRefs.scala:81)
at org.apache.spark.sql.execution.GenerateExec$$anonfun$1$$anonfun$3.apply(GenerateExec.scala:95)
at org.apache.spark.sql.execution.GenerateExec$$anonfun$1$$anonfun$3.apply(GenerateExec.scala:92)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
at scala.collection.Iterator$JoinIterator.hasNext(Iterator.scala:212)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$13$$anon$1.hasNext(WholeStageCodegenExec.scala:636)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:255)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:836)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:836)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
... 1 more
Caused by: geotrellis.raster.gdal.MalformedDataException: Unable to construct a RasterExtent from the Transformation given. GDAL Error Code: 4
at geotrellis.raster.gdal.GDALDataset$.rasterExtent$extension1(GDALDataset.scala:143)
at geotrellis.raster.gdal.GDALRasterSource.gridExtent$lzycompute(GDALRasterSource.scala:93)
at geotrellis.raster.gdal.GDALRasterSource.gridExtent(GDALRasterSource.scala:93)
at geotrellis.raster.RasterMetadata$class.cols(RasterMetadata.scala:52)
at geotrellis.raster.RasterSource.cols(RasterSource.scala:44)
at org.locationtech.rasterframes.ref.SimpleRasterInfo$.apply(SimpleRasterInfo.scala:71)
at org.locationtech.rasterframes.ref.GDALRasterSource$$anonfun$tiffInfo$1.apply(GDALRasterSource.scala:53)
at org.locationtech.rasterframes.ref.GDALRasterSource$$anonfun$tiffInfo$1.apply(GDALRasterSource.scala:53)
at scala.compat.java8.functionConverterImpls.AsJavaFunction.apply(FunctionConverters.scala:262)
at com.github.benmanes.caffeine.cache.LocalCache.lambda$statsAware$0(LocalCache.java:139)
at com.github.benmanes.caffeine.cache.UnboundedLocalCache.lambda$computeIfAbsent$2(UnboundedLocalCache.java:238)
at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
at com.github.benmanes.caffeine.cache.UnboundedLocalCache.computeIfAbsent(UnboundedLocalCache.java:234)
at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108)
at com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:62)
at com.github.blemale.scaffeine.Cache.get(Cache.scala:40)
at org.locationtech.rasterframes.ref.SimpleRasterInfo$.apply(SimpleRasterInfo.scala:49)
at org.locationtech.rasterframes.ref.GDALRasterSource.tiffInfo(GDALRasterSource.scala:53)
at org.locationtech.rasterframes.ref.GDALRasterSource.extent(GDALRasterSource.scala:57)
at org.locationtech.rasterframes.ref.RFRasterSource.rasterExtent(RFRasterSource.scala:71)
at org.locationtech.rasterframes.expressions.generators.RasterSourceToRasterRefs$$anonfun$1.apply(RasterSourceToRasterRefs.scala:65)
at org.locationtech.rasterframes.expressions.generators.RasterSourceToRasterRefs$$anonfun$1.apply(RasterSourceToRasterRefs.scala:63)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at org.locationtech.rasterframes.expressions.generators.RasterSourceToRasterRefs.eval(RasterSourceToRasterRefs.scala:63)
... 29 more

Reproducible in this notebook environment:

docker run -p 8888:8888 s22s/rasterframes-notebook:0.9.0-9560e45

with this notebook and data: gt-3184-test.zip

You can run the notebook on the command line with ipython if you prefer.

@metasim
Member Author

metasim commented Mar 5, 2020

@pomadchin
Member

@metasim okay, this is a different error code after all: error code 4.
Do you have any core dumps or something like that?

How do I run this notebook?

Do I need an EC2 m4.4xlarge?

@metasim
Member Author

metasim commented Mar 5, 2020

I get the error running on my 8 core macbook.

To run, unpack gt-3184-test.zip and execute this:

docker run --rm -v $PWD:/home/jovyan s22s/rasterframes-notebook:0.9.0-9560e45 ipython gt-3184.ipynb

edit: no coredumps

@metasim
Member Author

metasim commented Mar 5, 2020

@pomadchin I have a sneaking suspicion that "GDAL Error Code: 4" here might be triggered by an AWS identity error caused by reading a requester-pays bucket and not having a ~/.aws/credentials file or env vars to tell S3 who you are. I just added a .aws/credentials file and it's running much longer than usual (I expect the job to take 1.5hrs), so it's not definitive, but I bet if we had more error message context it would point that way.
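A quick pre-flight check for the two identity sources mentioned here (environment variables or a ~/.aws/credentials file) could look like the sketch below. `has_aws_identity` is a hypothetical helper; it only checks that something is present, not that the credentials are valid or authorized for a requester-pays bucket, and it ignores other mechanisms such as instance profiles.

```python
import os
from pathlib import Path


def has_aws_identity(env=None, home=None):
    """Return True if AWS key env vars or a ~/.aws/credentials file exist.

    env and home are injectable for testing; by default the real process
    environment and the current user's home directory are used.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    # Both halves of the key pair must be set to count as an identity.
    has_env_keys = bool(env.get("AWS_ACCESS_KEY_ID")) and bool(
        env.get("AWS_SECRET_ACCESS_KEY")
    )
    has_credentials_file = (home / ".aws" / "credentials").is_file()
    return has_env_keys or has_credentials_file
```

Running a check like this before the Spark job starts would turn a late, opaque "GDAL Error Code: 4" into an early and explicit configuration message.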

@pomadchin
Member

@metasim Error code 4 means GDAL failed to open the dataset, so that may well be the case; waiting for confirmation from you, then. Thanks for the update!

@metasim
Member Author

metasim commented Mar 6, 2020

Pretty sure this last round was a false alarm due to:

  1. dependencyOverrides not being transitive
  2. reference.conf values not getting overridden
  3. Missing AWS credentials
  4. Error codes masking true error causes

With all those things addressed, the original test case now completes.

I suggest this ticket be closed once an updated GT referencing com.azavea.gdal:gdal-warp-bindings:33.f746890 is published to Maven Central.

@pomadchin
Member

@metasim 👍

@pomadchin
Member

GDAL 1.0.0 is published! Also look at the CHANGELOG for all the other changes that were part of this release. Closing this now; feel free to reopen it or open a new issue if something like this happens again!
