New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate Windows That Conform To GeoTiff Segments #2402

Merged
merged 14 commits into from Oct 11, 2017

Conversation

Projects
4 participants
@jamesmcclain
Member

jamesmcclain commented Oct 2, 2017

  • Ensure that S3 window size is being properly interpreted (is windowsSize a length or an area in this context?)
  • Ensure that handling of striped segments is viable (are tiles of one pixel in height okay?)

Connects #2173

@jamesmcclain jamesmcclain force-pushed the jamesmcclain:fix/confirming-windows branch 4 times, most recently from bc6626c to 860cf98 Oct 3, 2017

@jamesmcclain jamesmcclain changed the title from [WiP] Generate Windows That Conform GeoTiff Segments to [WiP] Generate Windows That Conform To GeoTiff Segments Oct 4, 2017

@jamesmcclain jamesmcclain force-pushed the jamesmcclain:fix/confirming-windows branch 3 times, most recently from bb9d4d1 to bccb415 Oct 4, 2017

@jamesmcclain jamesmcclain changed the title from [WiP] Generate Windows That Conform To GeoTiff Segments to Generate Windows That Conform To GeoTiff Segments Oct 5, 2017

@jamesmcclain jamesmcclain referenced this pull request Oct 5, 2017

Merged

Filter GeoTiffRDDs by Geometry #2409

1 of 1 task complete
val segments: RDD[(String, Array[GridBounds])] =
sourceGeoTiffInfo.segmentsByPartitionBytes(partitionBytes, windowSize)
case (_, Some(partitionBytes)) => {
val maxSize = math.min(options.maxTileSize.getOrElse(1<<10), windowSize.getOrElse(1<<10)) // XXX is windowSize a length or an area?

This comment has been minimized.

@echeipesh

echeipesh Oct 6, 2017

Contributor

windowSize is length, implicitly for square windows.

This comment has been minimized.

@pomadchin

pomadchin Oct 10, 2017

Member
  1. Why can't we write here just 1024 instead of 1 << 10?
  2. Don't think that getOrElse is a good idea here, as it can hide a possible user error. What logic do you want to follow here? Mb we can add these attributes into the match function above? match supports logical or too.
  3. windowSize is length

@echeipesh echeipesh requested a review from pomadchin Oct 6, 2017

@echeipesh

This comment has been minimized.

Contributor

echeipesh commented Oct 6, 2017

Ensure that handling of striped segments is viable (are tiles of one pixel in height okay?)

It seems strange but the thinking went is that if you don't have an opinion about the shape of tiles you want you're probably using this RDD as input to tiler. From the perspective of optimizing for IO reading the segments as skinny tiles is fine, from the perspective of tiler chunking a skinny tile into multiple tiles is not problematic either.

@jamesmcclain

This comment has been minimized.

Member

jamesmcclain commented Oct 6, 2017

It seems strange but the thinking went is that if you don't have an opinion about the shape of tiles you want you're probably using this RDD as input to tiler. From the perspective of optimizing for IO reading the segments as skinny tiles is fine, from the perspective of tiler chunking a skinny tile into multiple tiles is not problematic either.

Sounds capital.

@pomadchin

It works fine! A couple of comments and most of them are related to code style / api questions.

val segments: RDD[(String, Array[GridBounds])] =
sourceGeoTiffInfo.segmentsByPartitionBytes(partitionBytes, windowSize)
case (_, Some(partitionBytes)) => {
val maxSize = math.min(options.maxTileSize.getOrElse(1<<10), windowSize.getOrElse(1<<10)) // XXX is windowSize a length or an area?

This comment has been minimized.

@pomadchin

pomadchin Oct 10, 2017

Member
  1. Why can't we write here just 1024 instead of 1 << 10?
  2. Don't think that getOrElse is a good idea here, as it can hide a possible user error. What logic do you want to follow here? Mb we can add these attributes into the match function above? match supports logical or too.
  3. windowSize is length
val layout = sourceGeoTiffInfo.getGeoTiffInfo(s"s3://$bucket/$key").segmentLayout.tileLayout
RasterReader
.listWindows(cols, rows, options.maxTileSize.getOrElse(1<<10), layout.tileCols, layout.tileRows)

This comment has been minimized.

@pomadchin

pomadchin Oct 10, 2017

Member

The same is here:

Looks not safe:

options.maxTileSize.getOrElse(1<<10)

Mb to throw an exception or to handle it in a way user will know that smth is used by default (at least to add some warning). In addition, somewhere above you already used getOrElse two times with 1024 default value, mb it makes sense to add these default values somewhere? As we can easily forget about these default values.

Finally the question, mb Options should contain 1024 value for maxTileSize and windowSize by default? (everything is defined in S3GeoTiffRDD object).

This comment has been minimized.

@jamesmcclain

jamesmcclain Oct 11, 2017

Member

As long as changing the default maxTileSize is not construed to constitute an API change, I prefer that.

@@ -59,15 +60,70 @@ private [geotrellis] trait GeoTiffInfoReader extends LazyLogging {
}
}
def windowsByBytes(

This comment has been minimized.

@pomadchin

pomadchin Oct 10, 2017

Member

Function description is missing, would be good to add.
Btw have you compared it to segmentsByPartitionBytes? Should we remove segmentsByPartitionBytes at all?

(options.maxTileSize, options.partitionBytes) match {
case (_, Some(partitionBytes)) => {
val windows: RDD[(String, Array[GridBounds])] =
sourceGeoTiffInfo.windowsByBytes(partitionBytes, options.maxTileSize.getOrElse(1<<10))

This comment has been minimized.

@pomadchin

pomadchin Oct 10, 2017

Member

The same comments about maxTileSize and windowSize and 1 << 10 are valid for Hadoop too.

options: Options
)(implicit sc: SparkContext, rr: RasterReader[Options, (I, V)]): RDD[(K, V)] = {
val conf = new SerializableConfiguration(configuration(path, options))

This comment has been minimized.

@pomadchin

pomadchin Oct 10, 2017

Member

Why conf should be wrapped here? Looks like it's only used for input formats arguments => can be used a common configuration without extra wrapper.

This comment has been minimized.

@jamesmcclain

jamesmcclain Oct 11, 2017

Member

That variable makes its way into a sc.parallelize inside of HadoopGeoTiffInfoReader.

This comment has been minimized.

@pomadchin

pomadchin Oct 11, 2017

Member

Good point, my bad.

)(implicit sc: SparkContext, rr: RasterReader[Options, (I, V)]): RDD[(K, V)] = {
val conf = new SerializableConfiguration(configuration(path, options))
val path2 = path.toString

This comment has been minimized.

@pomadchin

pomadchin Oct 10, 2017

Member

Well it's always a bit confusing when you see in the code base such variable names. In fact you can write just: // to avoid variable naming confusion

HadoopGeoTiffInfoReader(path.toString, conf, options.tiffExtensions)
val layout = info.getGeoTiffInfo(objectRequest.toString).segmentLayout.tileLayout
RasterReader
.listWindows(cols, rows, options.maxTileSize.getOrElse(1<<10), layout.tileCols, layout.tileRows)

This comment has been minimized.

@pomadchin

pomadchin Oct 10, 2017

Member

maxTileSize comment here too

@jamesmcclain jamesmcclain force-pushed the jamesmcclain:fix/confirming-windows branch from 80cdadb to 395fe2d Oct 11, 2017

@jamesmcclain

This comment has been minimized.

Member

jamesmcclain commented Oct 11, 2017

All comments have been addressed, I believe.

@jamesmcclain

This comment has been minimized.

Member

jamesmcclain commented Oct 11, 2017

💯

@rossbernet rossbernet added this to the 1.2 milestone Oct 11, 2017

@rossbernet rossbernet added this to Needs review in 1.2 release Oct 11, 2017

@echeipesh echeipesh merged commit f2be4df into locationtech:master Oct 11, 2017

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
ip-validation
Details

@echeipesh echeipesh referenced this pull request Oct 11, 2017

Closed

Partition GeoTiff reading by GeoTiff segment layout #2173

2 of 2 tasks complete

@echeipesh echeipesh moved this from Needs review to DONE! in 1.2 release Oct 12, 2017

@jamesmcclain jamesmcclain deleted the jamesmcclain:fix/confirming-windows branch Nov 17, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment