# 0. [Dependencies](https://spacenetchallenge.github.io/#Dependencies)
> The [AWS Command Line Interface (CLI)](https://aws.amazon.com/cli/) must be installed, and you need an active AWS account. Configure the AWS CLI by running `aws configure`.

# 1. [Accessing the SpaceNet Data on AWS](https://aws.amazon.com/public-datasets/spacenet/#Accessing_the_SpaceNet_Data_on_AWS)
> The SpaceNet dataset is being released in several Areas of Interest. All AOIs will follow a similar directory structure and data format. The imagery is GeoTIFF satellite imagery and corresponding GeoJSON building footprints...

> For more detailed information on how to access specific files within the dataset, see [here](https://github.com/SpaceNetChallenge/utilities/tree/master/content/download_instructions).

> _The spacenet-dataset S3 bucket is provided as a Requester Pays bucket, see [here](https://docs.aws.amazon.com/AmazonS3/latest/dev/RequesterPaysBuckets.html) for more information._

# 2. Downloading Rio raster and vector data with [Boto](https://boto3.readthedocs.io/en/latest/index.html)
Because the bucket is Requester Pays, anonymous requests such as a plain curl of an image URL fail. Boto 3, the AWS SDK for Python, can download files from Requester Pays buckets: the `download_file` method of the [S3Transfer](https://boto3.readthedocs.io/en/latest/reference/customizations/s3.html#boto3.s3.transfer.S3Transfer) class accepts a `RequestPayer` entry in its `extra_args` argument.

### Setting up paths for download

In [1]:
import os
bucket = "spacenet-dataset"

aoi_path = "AOI_1_Rio"
aoi_data_path = os.path.join(aoi_path, "srcData")
building_labels_path = os.path.join(aoi_data_path, "buildingLabels")
mosaic_3band_path = os.path.join(aoi_data_path, "mosaic_3band")

tmp_path = "/tmp"

### Setting up Boto for download

In [2]:
import boto3
from boto3.s3.transfer import S3Transfer

client = boto3.client("s3")
transfer = S3Transfer(client)

### Getting list of imagery files to download

In [3]:
mosaic_3band_object_list = client.list_objects_v2(
    Bucket=bucket, Prefix=mosaic_3band_path,
    RequestPayer='requester')
mosaic_3band_key_list = [obj["Key"] for obj in mosaic_3band_object_list["Contents"]]
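
A single `list_objects_v2` call returns at most 1,000 keys, so a larger prefix would be silently truncated. A minimal sketch of a paginated listing follows; `list_keys` is a hypothetical helper name, not part of Boto, and it assumes the same Requester Pays setting as the cell above.

```python
def list_keys(client, bucket, prefix):
    """Collect every object key under `prefix`, following pagination.

    `client` is expected to be a boto3 S3 client; `list_keys` itself is a
    hypothetical helper introduced for illustration.
    """
    paginator = client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(
            Bucket=bucket, Prefix=prefix, RequestPayer="requester"):
        # "Contents" is missing from a page when no objects match
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys
```

With the names defined earlier, this could be called as `list_keys(client, bucket, mosaic_3band_path)`.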

### Downloading Rio imagery files, if not downloaded

In [4]:
for mosaic_3band_key in mosaic_3band_key_list:
    # Use the last path component of the S3 key as the local file name
    mosaic_3band_name = mosaic_3band_key.split("/")[-1]
    mosaic_3band_filename = os.path.join(tmp_path, mosaic_3band_name)
    if not os.path.isfile(mosaic_3band_filename):
        transfer.download_file(
            bucket=bucket, key=mosaic_3band_key, filename=mosaic_3band_filename,
            extra_args={"RequestPayer": "requester"})

### Downloading Rio outline, if not downloaded

In [5]:
outline_filename = "Rio_OUTLINE_Public_AOI.geojson"
outline_key = os.path.join(building_labels_path, outline_filename)
full_outline_filename = os.path.join(tmp_path, outline_filename)

if not os.path.isfile(full_outline_filename):
    transfer.download_file(
        bucket=bucket, key=outline_key, filename=full_outline_filename,
        extra_args={"RequestPayer": "requester"})

### Downloading Rio building footprints file, if not downloaded

In [6]:
buildings_filename = "Rio_Buildings_Public_AOI_v2.geojson"
buildings_key = os.path.join(building_labels_path, buildings_filename)
full_buildings_filename = os.path.join(tmp_path, buildings_filename)

if (not os.path.isfile(full_buildings_filename)):
    transfer.download_file(
        bucket=bucket, key=buildings_key, filename=full_buildings_filename,
        extra_args={"RequestPayer": "requester"})
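
The three download cells above repeat one pattern: derive a local file name from the key, skip the file if it already exists, and otherwise download it with the Requester Pays flag. A minimal sketch of that pattern as a single function is shown below. `download_if_missing` is a hypothetical helper name, and `transfer` is assumed to be an `S3Transfer` instance like the one created earlier.

```python
import os
import posixpath


def download_if_missing(transfer, bucket, key, dest_dir):
    """Download `key` into `dest_dir` unless a file with that name exists.

    Hypothetical helper for illustration; returns the local file path.
    """
    # S3 keys always use "/" as separator, so posixpath is used for the key
    filename = os.path.join(dest_dir, posixpath.basename(key))
    if not os.path.isfile(filename):
        transfer.download_file(
            bucket=bucket, key=key, filename=filename,
            extra_args={"RequestPayer": "requester"})
    return filename
```

The existence check only looks at the file name, so it does not detect a file that is present but incomplete or out of date.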

# 3. Wrangling imagery with [GDAL](http://www.gdal.org/gdal_translate.html)
The downloaded imagery is JPEG-compressed, and [GeoTrellis's Decompressor.scala](https://github.com/locationtech/geotrellis/blob/master/raster/src/main/scala/geotrellis/raster/io/geotiff/compression/Decompressor.scala#L119-L122) throws an exception stating that "Compression type JPEG is not supported by this reader." The Python binding of the [gdal_translate](http://www.gdal.org/gdal_translate.html) utility, `gdal.Translate`, can rewrite the images with a compression that the reader supports (LZW is used here).

### Making directory for converted images, if not created

In [7]:
spacenet_data_path = os.path.join(tmp_path, "spacenet-data")
if not os.path.exists(spacenet_data_path):
    os.makedirs(spacenet_data_path)

### Converting imagery to remove JPEG compression, if not converted

In [8]:
from osgeo import gdal

for mosaic_3band_key in mosaic_3band_key_list:
    mosaic_3band_name = mosaic_3band_key.split("/")[-1]
    mosaic_3band_filename = os.path.join(tmp_path, mosaic_3band_name)
    translated_mosaic_3band_filename = os.path.join(spacenet_data_path, mosaic_3band_name)
    # Convert only when the translated copy does not exist yet
    if not os.path.isfile(translated_mosaic_3band_filename):
        gdal.Translate(
            destName=translated_mosaic_3band_filename, srcDS=mosaic_3band_filename,
            creationOptions=['COMPRESS=LZW'])
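
To confirm which compression a file actually uses before and after conversion, the Compression tag can be read straight from the TIFF header. The sketch below is an illustration only: it uses just the standard library, handles classic TIFF (not BigTIFF), inspects only the first image directory, and `read_tiff_compression` is a hypothetical helper name.

```python
import struct

# A few Compression (tag 259) values defined for TIFF
TIFF_COMPRESSION_NAMES = {
    1: "NONE", 5: "LZW", 6: "OLD_JPEG", 7: "JPEG",
    8: "DEFLATE", 32773: "PACKBITS",
}


def read_tiff_compression(path):
    """Return the Compression tag value from the first IFD of a classic TIFF."""
    with open(path, "rb") as f:
        header = f.read(8)
        if len(header) < 8 or header[:2] not in (b"II", b"MM"):
            raise ValueError("not a TIFF file: %s" % path)
        # "II" means little-endian, "MM" means big-endian
        endian = "<" if header[:2] == b"II" else ">"
        magic, ifd_offset = struct.unpack(endian + "HI", header[2:8])
        if magic != 42:
            raise ValueError("only classic TIFF (magic number 42) is handled")
        f.seek(ifd_offset)
        (entry_count,) = struct.unpack(endian + "H", f.read(2))
        for _ in range(entry_count):
            entry = f.read(12)  # tag, type, count, value-or-offset
            tag = struct.unpack(endian + "H", entry[:2])[0]
            if tag == 259:
                # A single SHORT value sits in the first 2 bytes of the value field
                return struct.unpack(endian + "H", entry[8:10])[0]
    return 1  # the tag defaults to 1 (no compression) when absent
```

For example, `TIFF_COMPRESSION_NAMES.get(read_tiff_compression(path))` would be expected to give `"JPEG"` for a JPEG-compressed GeoTIFF and `"LZW"` for a copy written with `COMPRESS=LZW`.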

# 4. Ingesting imagery for fast viewing with [GeoPySpark](https://github.com/locationtech-labs/geopyspark)

### Setting up Spark context for ingest of Rio imagery

In [9]:
import geopyspark as gps
from pyspark import SparkContext
conf = gps.geopyspark_conf("local[*]", "spacenet-ingest")
sc = SparkContext.getOrCreate(conf)

### Ingesting Rio imagery files

In [11]:
from geopyspark.geotrellis.geotiff import get
from geopyspark.geotrellis.constants import LayerType
from geopyspark.geotrellis.catalog import write

# Read the GeoTiff locally
catalog_uri = spacenet_data_path

rdd = get(LayerType.SPATIAL, catalog_uri)
# Error: https://github.com/locationtech/geotrellis/issues/2268
metadata = rdd.collect_metadata()

# tile the rdd to the layout defined in the metadata
laid_out = rdd.tile_to_layout(metadata)

# reproject the tiled rasters using a ZoomedLayoutScheme
reprojected = laid_out.reproject("EPSG:3857").cache().repartition(200)

# pyramid the TiledRasterRDD to create 12 new TiledRasterRDDs
# one for each zoom level
pyramided = reprojected.pyramid(start_zoom=12, end_zoom=1)

# Save each TiledRasterRDD locally
for tiled in pyramided:
    write("file:///tmp/spacenet-catalog", "spacenet-ingest", tiled)

ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/usr/local/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1035, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 883, in send_command
    response = connection.send_command(command)
  File "/usr/local/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1040, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:40417)
Traceback (most recent call last):
  File "/usr/lib/py

Py4JError: An error occurred while calling o29.collectMetadata
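
The ingest cell asks for one pyramid layer per zoom level from 12 down to 1. As a rough guide to what those levels mean in EPSG:3857, the sketch below computes the tile count per axis and the ground resolution at the equator for each level. It assumes 256-pixel tiles, which is an assumption made here and should be checked against the layout scheme actually in use; the function names are illustrative.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857
TILE_SIZE_PX = 256          # assumed tile size


def resolution_m_per_px(zoom):
    """Ground resolution at the equator for a Web Mercator zoom level."""
    return 2 * math.pi * EARTH_RADIUS_M / (TILE_SIZE_PX * 2 ** zoom)


def tiles_per_axis(zoom):
    """Number of tiles along each axis of the world grid at `zoom`."""
    return 2 ** zoom


# Same range as pyramid(start_zoom=12, end_zoom=1): twelve levels
zoom_levels = list(range(12, 0, -1))
for zoom in zoom_levels:
    print(zoom, tiles_per_axis(zoom), round(resolution_m_per_px(zoom), 2))
```

Each step down in zoom halves the tile count per axis and doubles the metres covered by one pixel.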

# 5. Showing Rio’s outline, imagery, and building footprints on a map with [GeoNotebook](https://github.com/OpenGeoscience/geonotebook)

### Reading geojson of Rio outline to vector data

In [None]:
from geonotebook.wrappers import VectorData
# Use the full local path; the outline was downloaded to tmp_path
outline_vector = VectorData(full_outline_filename)

### Centering map at centroid of Rio outline vector

In [None]:
outline_polygons = [polygon for polygon in outline_vector.polygons]
outline_polygon = outline_polygons[0]
outline_centroid = outline_polygon.centroid
x = outline_centroid.x
y = outline_centroid.y
z = 12
M.set_center(x, y, z)
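
The same centre point can be cross-checked without GeoNotebook by computing the centroid directly from the GeoJSON coordinates. The sketch below is a simplified illustration: it takes the exterior ring of the first polygon only, ignores holes, and treats longitude and latitude as planar coordinates. `exterior_ring` and `ring_centroid` are hypothetical helper names.

```python
import json


def exterior_ring(geojson):
    """Return the exterior ring of the first polygon in a GeoJSON object."""
    if geojson["type"] == "FeatureCollection":
        geojson = geojson["features"][0]
    if geojson["type"] == "Feature":
        geojson = geojson["geometry"]
    if geojson["type"] == "Polygon":
        return geojson["coordinates"][0]
    if geojson["type"] == "MultiPolygon":
        return geojson["coordinates"][0][0]
    raise ValueError("unsupported geometry type: %s" % geojson["type"])


def ring_centroid(ring):
    """Centroid of a closed ring via the shoelace formula."""
    twice_area = cx = cy = 0.0
    for p0, p1 in zip(ring, ring[1:]):
        (x0, y0), (x1, y1) = p0[:2], p1[:2]  # ignore any z value
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("ring has zero area")
    # Area A = twice_area / 2 and the centroid sum is divided by 6A
    return cx / (3 * twice_area), cy / (3 * twice_area)


# Possible usage with the file downloaded earlier:
# with open(full_outline_filename) as f:
#     x, y = ring_centroid(exterior_ring(json.load(f)))
```

For an outline of this size, the planar approximation should be close enough for centring a map, but it is not a geodesic centroid.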

### Adding layer of Rio outline vector

In [None]:
outline_name = "Rio outline"  # display name for the layer (assumed value)
M.add_layer(outline_vector, name=outline_name);

In [None]:
# buildings_vector = VectorData(full_buildings_filename)
# M.add_layer(buildings_vector, name="Rio buildings");