Windowed Reading GeoTiffs from S3 and Hdfs #1763

Merged
lossyrob merged 45 commits into locationtech:master from jbouffard:geotiff-rdd Nov 15, 2016

4 participants
@jbouffard
Contributor

jbouffard commented Nov 1, 2016

This PR was based on another, preexisting PR: #1758

This PR adds two new features to GeoTrellis: windowed reading of GeoTiffs from S3 and from HDFS. Windowed reading allows a large GeoTiff to be broken up into smaller tiles that are spread over a distributed system. With this method, the computation time for each GeoTiff that needs to be analyzed can be shortened considerably.

To do:

  • Unit Testing for Windowed Reading for Hdfs
  • Implement Windowed Reading for S3
    • SpatialGeoTiffs
    • TemporalGeoTiffs
  • Unit Testing for Windowed Reading for S3
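
As an illustration of the windowing idea in the description above, here is a minimal sketch of how a raster's pixel grid might be split into windows of bounded size. `Window`, `windowsFor`, and `maxSize` are hypothetical names chosen for this example; they are not the API added by this PR, and the actual implementation may partition the GeoTiff differently.

```scala
object WindowSketch {
  // Hypothetical type: an inclusive pixel window into a larger raster.
  final case class Window(colMin: Int, rowMin: Int, colMax: Int, rowMax: Int) {
    def width: Int = colMax - colMin + 1
    def height: Int = rowMax - rowMin + 1
  }

  // Splits a cols x rows grid into windows no larger than maxSize x maxSize.
  // Windows on the right and bottom edges are clipped to the raster bounds.
  def windowsFor(cols: Int, rows: Int, maxSize: Int): Seq[Window] = {
    require(cols > 0 && rows > 0 && maxSize > 0, "dimensions must be positive")
    for {
      rowMin <- 0 until rows by maxSize
      colMin <- 0 until cols by maxSize
    } yield Window(
      colMin,
      rowMin,
      math.min(colMin + maxSize, cols) - 1,
      math.min(rowMin + maxSize, rows) - 1
    )
  }
}
```

Each window could then be handed to a separate task, so that no single task has to read the whole file.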
@lossyrob

Member

lossyrob commented Nov 1, 2016

@jbouffard can you make sure to include the changes @pomadchin mentions here: #1758 (comment)

@jbouffard

Contributor

jbouffard commented Nov 1, 2016

@lossyrob Yeah, I was just about to do that.

@lossyrob lossyrob added this to the 1.0 milestone Nov 1, 2016

@jbouffard jbouffard force-pushed the jbouffard:geotiff-rdd branch from ad86812 to d54fe8c Nov 4, 2016

@jbouffard jbouffard force-pushed the jbouffard:geotiff-rdd branch 2 times, most recently from f393fb2 to 2eaecd5 Nov 8, 2016

jbouffard added some commits Nov 8, 2016

Fixed issues that caused Travis to fail
Fixed issue that caused Travis to fail

Fixed the errors for real this time

@jbouffard jbouffard force-pushed the jbouffard:geotiff-rdd branch from 2eaecd5 to cf6e572 Nov 9, 2016

@jbouffard jbouffard changed the title from [WIP] Windowed Reading GeoTiffs from S3 and Hdfs to Windowed Reading GeoTiffs from S3 and Hdfs Nov 9, 2016

@echeipesh echeipesh self-assigned this Nov 11, 2016

@@ -51,12 +51,9 @@ object ArraySegmentBytes {
val result = Array.ofDim[Array[Byte]](offsets.size)
cfor(0)(_ < offsets.size, _ + 1) { i =>
-byteReader.position(offsets(i).toInt)
-result(i) = byteReader.getSignedByteArray(byteCounts(i).toInt)
+result(i) = byteReader.getSignedByteArray(byteCounts(i), offsets(i))

@echeipesh

echeipesh Nov 14, 2016

Contributor

Stale doc string; byteBuffer is no longer a parameter.

@echeipesh

echeipesh Nov 14, 2016

Contributor

oldOffset is unused. It looks like the previous version of this file used it to return the position. I would guess that's not a good idea, and this variable needs to be removed now.

@jbouffard

jbouffard Nov 14, 2016

Contributor

Fixed the doc string. Yeah, oldOffset was actually always redundant since getSignedByteArray does it already.

@@ -51,12 +51,9 @@ object ArraySegmentBytes {
val result = Array.ofDim[Array[Byte]](offsets.size)
cfor(0)(_ < offsets.size, _ + 1) { i =>
-byteReader.position(offsets(i).toInt)
-result(i) = byteReader.getSignedByteArray(byteCounts(i).toInt)
+result(i) = byteReader.getSignedByteArray(byteCounts(i), offsets(i))

@echeipesh

echeipesh Nov 14, 2016

Contributor

This may be a conversation that already happened, but why are there ByteReaderExtensions when we control the trait we're extending? Why shouldn't getSignedByteArray and the others be methods on the ByteReader trait directly?

@jbouffard

jbouffard Nov 14, 2016

Contributor

ByteReader is meant to be a replacement for ByteBuffer, so I wanted the two to have the same methods (or at least have them be as close as possible). Since none of the methods in ByteReaderExtensions are in ByteBuffer, they were made into their own separate thing.
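
The separation described in this reply can be sketched roughly as follows. This is a simplified illustration, not the GeoTrellis source: `SimpleByteReader` and `SimpleByteReaderExtensions` are hypothetical stand-ins, the core trait mirrors only a few `java.nio.ByteBuffer`-style methods, and the extension is written as an implicit class, which may differ from how `ByteReaderExtensions` is actually wired in.

```scala
import java.nio.ByteBuffer

// Hypothetical, simplified stand-in for the ByteReader trait.
// It exposes only methods that have a ByteBuffer counterpart.
trait SimpleByteReader {
  def position: Long
  def position(i: Long): SimpleByteReader
  def get: Byte
}

object SimpleByteReader {
  // Adapts an in-memory ByteBuffer to the reader interface.
  def fromBuffer(bb: ByteBuffer): SimpleByteReader = new SimpleByteReader {
    def position: Long = bb.position().toLong
    def position(i: Long): SimpleByteReader = { bb.position(i.toInt); this }
    def get: Byte = bb.get()
  }
}

// Helpers with no ByteBuffer counterpart live outside the core trait.
object SimpleByteReaderExtensions {
  implicit class Ops(reader: SimpleByteReader) {
    // Seeks to valueOffset, then reads `length` bytes one at a time.
    def getSignedByteArray(length: Long, valueOffset: Long): Array[Byte] = {
      val arr = Array.ofDim[Byte](length.toInt)
      reader.position(valueOffset)
      var i = 0
      while (i < arr.length) { arr(i) = reader.get; i += 1 }
      arr
    }
  }
}
```

With `import SimpleByteReaderExtensions._` in scope, `reader.getSignedByteArray(length, offset)` is available on any reader while the core trait stays close to `ByteBuffer`. The trade-off raised in the review still applies: putting the methods directly on the trait would avoid the extra import.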

@@ -17,9 +17,9 @@ import spire.syntax.cfor._
* @param byteReader: A ByteReader that contains bytes of the GeoTiff
* @param storageMethod: The [[StorageMethod]] of the GeoTiff
* @param tifftags: The [[TiffTags]] of the GeoTiff
-* @return A new instance of BufferSegmentBytes
+* @return A new instance of LazySegmentBytes

@echeipesh

echeipesh Nov 14, 2016

Contributor

@param section is out of sync with code

@jbouffard

jbouffard Nov 14, 2016

Contributor

Fixed

@@ -173,33 +173,22 @@ trait ByteReaderExtensions {
}
final def getSignedByteArray(length: Long, valueOffset: Long): Array[Byte] = {
val arr = Array.ofDim[Byte](length.toInt)
// NOTE: We don't support lengths greater than Int.MaxValue yet (or ever).

@echeipesh

echeipesh Nov 14, 2016

Contributor

This comment should be in a doc string

@echeipesh

echeipesh Nov 14, 2016

Contributor

Also if so, why is the length parameter Int?

@jbouffard

jbouffard Nov 14, 2016

Contributor

Fixed it. As for why it was converted to Int: you can't call Array.ofDim with a Long.
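
To make the Long-to-Int constraint concrete: `Array.ofDim` takes `Int` dimensions, and `Long.toInt` silently wraps for values above `Int.MaxValue`. The sketch below shows one way the limit could be checked explicitly before allocating. `allocate` is a hypothetical helper written for this example; it is not code from this PR, which documents the limit in a comment rather than enforcing it this way.

```scala
object ByteArraySketch {
  // Hypothetical helper: rejects lengths that do not fit in an Int
  // instead of letting toInt wrap around to a negative or wrong size.
  def allocate(length: Long): Array[Byte] = {
    require(
      length >= 0 && length <= Int.MaxValue,
      s"length $length is outside the supported range [0, ${Int.MaxValue}]"
    )
    Array.ofDim[Byte](length.toInt)
  }
}
```

Without such a check, a length of `Int.MaxValue.toLong + 1` converts to `Int.MinValue`, and the allocation fails with a `NegativeArraySizeException` rather than a descriptive message.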

@lossyrob lossyrob merged commit 35b7570 into locationtech:master Nov 15, 2016

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

@lossyrob lossyrob removed the Needs Review label Nov 15, 2016
