Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windowed Reading GeoTiffs from S3 and Hdfs #1763

Merged
merged 45 commits into from Nov 15, 2016

Conversation

jbouffard
Copy link

@jbouffard jbouffard commented Nov 1, 2016

This PR was based on another, preexisting PR: #1758

Two new features are added to Geotrellis with this PR: Reading windowed GeoTiffs from S3 and Hdfs. These features will allow for the breaking up of large GeoTiffs into smaller tiles over a distributed system. By using this method, the computation time for each GeoTiff that needs to analyzed can be shortened considerably.

To do:

  • Unit Testing for Windowed Reading for Hdfs
  • Implement Windowed Reading for S3
    • SpatialGeoTiffs
    • TemporalGeoTiffs
  • Unit Testing for Windowed Reading for S3

@lossyrob
Copy link
Member

lossyrob commented Nov 1, 2016

@jbouffard can you make sure to include the changes @pomadchin mentions here: #1758 (comment)

@jbouffard
Copy link
Author

@lossyrob Yeah, I was just about to do that.

@lossyrob lossyrob added this to the 1.0 milestone Nov 1, 2016
@jbouffard jbouffard force-pushed the geotiff-rdd branch 2 times, most recently from f393fb2 to 2eaecd5 Compare November 9, 2016 13:48
Fixed issue that caused Travis to fail

Fixed the errors for real this time
@jbouffard jbouffard changed the title [WIP] Windowed Reading GeoTiffs from S3 and Hdfs Windowed Reading GeoTiffs from S3 and Hdfs Nov 9, 2016
@echeipesh echeipesh self-assigned this Nov 11, 2016
@@ -51,12 +51,9 @@ object ArraySegmentBytes {
val result = Array.ofDim[Array[Byte]](offsets.size)

cfor(0)(_ < offsets.size, _ + 1) { i =>
byteReader.position(offsets(i).toInt)
result(i) = byteReader.getSignedByteArray(byteCounts(i).toInt)
result(i) = byteReader.getSignedByteArray(byteCounts(i), offsets(i))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale doc string, byteBuffer no longer a prameter

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oldOffset is unused. Looks like the previous version of this file used it to return position, I would guess thats not a good idea and this variable needs to be removed now.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the doc string. Yeah, oldOffset was actually always redundant since getSignedByteArray does it already.

@@ -51,12 +51,9 @@ object ArraySegmentBytes {
val result = Array.ofDim[Array[Byte]](offsets.size)

cfor(0)(_ < offsets.size, _ + 1) { i =>
byteReader.position(offsets(i).toInt)
result(i) = byteReader.getSignedByteArray(byteCounts(i).toInt)
result(i) = byteReader.getSignedByteArray(byteCounts(i), offsets(i))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This maybe a conversation that already happened but why are there ByteReaderExtensions when we control the trait we're extending? Why shouldn't getSignedByteArray and others be methods on ByteReader trait directly?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ByteReader is meant to be a replacement for ByteBuffer, so I wanted the two to have the same methods (or at least have them be as close as possible). Since none of the methods in ByteReaderExtensions are in ByteBuffer, they were made into their own separate thing.

@@ -17,9 +17,9 @@ import spire.syntax.cfor._
* @param byteReader: A ByteReader that contains bytes of the GeoTiff
* @param storageMethod: The [[StorageMethod]] of the GeoTiff
* @param tifftags: The [[TiffTags]] of the GeoTiff
* @return A new instance of BufferSegmentBytes
* @return A new instance of LazySegmentBytes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@param section is out of sync with code

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -173,33 +173,22 @@ trait ByteReaderExtensions {
}

final def getSignedByteArray(length: Long, valueOffset: Long): Array[Byte] = {
val arr = Array.ofDim[Byte](length.toInt)

// NOTE: We don't support lengths greater than Int.MaxValue yet (or ever).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment should be in a doc string

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also if so, why is the length parameter Int?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed it. Why it was converted to Int? That's because you can't make an Array.ofDim using a Long.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

4 participants