# Find, process, and download ASTER satellite imagery from Google Earth Engine

This notebook generates the satellite imagery that will be used to train a machine learning model to distinguish between Porphyry Copper Deposits (PCDs) and places lacking Porphyry Copper Deposits (nPCDs).

It takes locations where PCDs are known and where PCDs are known to be absent, generates square images of the areas around those locations, and exports them to a destination folder with a naming convention that distinguishes between them.

Locations of PCDs were taken from the [Global Assessment of Undiscovered Copper Resources](https://mrdata.usgs.gov/sir20105090z/) (GAUCR), provided by the United States Geological Survey (USGS). Only deposits that had never been exploited were included.

Locations of nPCDs were compiled from a variety of mineral databases also hosted by the USGS, on the assumption that a record of other minerals without PCDs indicates a true absence of PCDs at that location. Locations that fell within 20 km of any PCD in the GAUCR or in a second database, [Porphyry Copper Deposits of the World](https://mrdata.usgs.gov/porcu/), were excluded. Locations that fell outside the PCD 'permissive tracts' in the GAUCR were also excluded. As a result, the nPCD database contains locations where PCDs could theoretically occur but do not.

## Preparation

### Import Google Earth Engine, sign in, and initialize it

In [None]:
import ee
ee.Authenticate()
ee.Initialize()

### Mount the drive and load in the files containing true positives and true negatives

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
import json

In [None]:
positives_file = "" # path to geojson file containing locations of known, unexploited PCDs
with open(positives_file) as data_file:
  p = json.load(data_file)

In [None]:
negatives_file = "" # path to geojson file containing locations where PCDs are known not to occur
with open(negatives_file) as data_file:
  n = json.load(data_file)
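
The export function later in this notebook reads `feature['geometry']['coordinates']` from every feature, `feature['properties']['GMRAP_ID']` from PCD features, and `feature['id']` from nPCD features. The sketch below shows a minimal GeoJSON structure that satisfies those lookups, plus a check for features that would fail them. The coordinates and IDs are invented for illustration, and `check_features` is a hypothetical helper, not part of the notebook.

```python
import json

# Hypothetical GeoJSON text; coordinates and IDs are placeholders, not real data.
sample_text = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "id": "example-1",
      "properties": {"GMRAP_ID": "EX001"},
      "geometry": {"type": "Point", "coordinates": [-110.5, 32.25]}
    }
  ]
}
"""

sample = json.loads(sample_text)


def check_features(collection, category):
    """Return the features missing a field that the export step relies on."""
    bad = []
    for feat in collection["features"]:
        geom = feat.get("geometry") or {}
        coords = geom.get("coordinates")
        # The export step expects a point given as [longitude, latitude].
        has_point = (
            geom.get("type") == "Point"
            and isinstance(coords, list)
            and len(coords) == 2
        )
        if category == "PCD":
            has_id = "GMRAP_ID" in (feat.get("properties") or {})
        else:
            has_id = "id" in feat
        if not (has_point and has_id):
            bad.append(feat)
    return bad


print(len(check_features(sample, "PCD")))
```

Running the same check on `p` with `"PCD"` and on `n` with `"nPCD"` before exporting would surface malformed features early rather than partway through the export loop.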

### Import custom modules for ASTER imagery preprocessing and band engineering

When working in Colab, upload these modules into session storage first.

In [None]:
import preprocessing

In [None]:
import band_engineering

### Declare a function for creating a box around a point

This bounding box should be bigger than the desired target image size. The box is defined using coordinates in latitude and longitude, but to maintain consistency in areal coverage, the imagery is projected before it is downloaded. As a result, the box appears tilted in the projected image, and clipping the resulting numpy array will cut off the corners.

This function takes a pair of coordinates in X, Y (i.e., Longitude, Latitude) format, draws a circle centered on that point with a radius given by the `distance` parameter (in meters), and then creates a bounding box around that circle. It returns an ee.Geometry object.

In [None]:
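def create_box(coords, distance=5000):
  # coords is [longitude, latitude]; distance is the buffer radius in meters
  point = ee.Geometry.Point(coords)
  buffer = point.buffer(distance)
  # bounds() returns the bounding box of the circular buffer
  bbox = buffer.bounds()
  return bbox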
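
To get a feel for the extent of the box without calling Earth Engine, the sketch below approximates the same idea in plain Python. It converts a ground distance in meters to degree offsets on a spherical Earth and returns the west, south, east, and north edges. This is an approximation under a spherical assumption, not a reproduction of what `buffer()` and `bounds()` compute internally, and `approx_box` is a hypothetical helper name.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation


def approx_box(coords, distance=5000):
    """Approximate (west, south, east, north) in degrees around a lon/lat point."""
    lon, lat = coords
    # Degrees of latitude covered by the given ground distance.
    dlat = math.degrees(distance / EARTH_RADIUS_M)
    # Meridians converge toward the poles, so the same ground distance
    # spans more degrees of longitude at higher latitudes.
    dlon = math.degrees(distance / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)


west, south, east, north = approx_box([-110.5, 32.25])
print(round(north - south, 4), round(east - west, 4))
```

Away from the equator the box spans more degrees of longitude than latitude, even though both sides cover roughly 10 km on the ground. This is one reason a box defined in latitude and longitude does not line up with the pixel grid of a projected image.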

## Execution

### Declare a function to export one image based on the location of one feature

This function takes a single feature from a geojson file, a category (PCD or nPCD), and a destination folder. It creates a box centered on the feature's point and calls `aster_pre_processing()` from the `preprocessing` module to retrieve and preprocess ASTER satellite imagery. That call returns a dictionary containing an ee.Image object, a coordinate reference system (CRS), and a CRS transform; the latter two are used to project the exported image.

It then calls `band_engineering()` from the `band_engineering` module to calculate band combinations that highlight features indicative of PCDs.

It uses the input category and an ID field from the geojson file to create a name, and then exports the image to the given folder with that name.

In [None]:
def export_image_TIFF(feature, category, folder):
  coords = feature['geometry']['coordinates']
  bbox = create_box(coords)
  image_dict = preprocessing.aster_pre_processing(ee.ImageCollection("ASTER/AST_L1T_003"), bbox)
  pp_image = image_dict['imagery']
  pp_crs = image_dict['crs']
  pp_crs_transform = image_dict['transform']
  eng_image = band_engineering.band_engineering(pp_image)
  if category == 'PCD':
    name = f"{category}_{feature['properties']['GMRAP_ID']}"
  elif category == 'nPCD':
    name = f"{category}_{feature['id']}"
  else:
    raise ValueError(f"Unknown category {category!r}; expected 'PCD' or 'nPCD'")
  task = ee.batch.Export.image.toDrive(
      image = eng_image,
      description = name,
      folder = folder,
      scale = 30,
      crs = pp_crs,
      crsTransform = pp_crs_transform,
      region = bbox
  )
  task.start()
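
The naming step can be pulled out into a small pure function, which makes it easy to check without submitting an export task. The sketch below is one way to do that. `make_export_name` is a hypothetical helper, not part of the notebook, and the example feature uses invented IDs.

```python
def make_export_name(feature, category):
    """Build the export name from the category and the feature's ID field."""
    if category == "PCD":
        return f"PCD_{feature['properties']['GMRAP_ID']}"
    if category == "nPCD":
        return f"nPCD_{feature['id']}"
    # Fail loudly so a mistyped category cannot pass unnoticed in a loop.
    raise ValueError(f"Unknown category: {category!r}")


# Placeholder feature with made-up IDs.
example = {"id": 7, "properties": {"GMRAP_ID": "EX001"}}
print(make_export_name(example, "PCD"))   # prints PCD_EX001
print(make_export_name(example, "nPCD"))  # prints nPCD_7
```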

### Use a for loop to call `export_image_TIFF` on all features in the PCD and non-PCD datasets

In [None]:
destination_folder = "" # name of the destination Google Drive folder (a folder name, not a file-system path)

In [None]:
for feat in p['features']:
  export_image_TIFF(feat, 'PCD', destination_folder)

In [None]:
for feat in n['features']:
  export_image_TIFF(feat, 'nPCD', destination_folder)
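
Each call to `task.start()` only queues an export; the images appear in Drive later, as Earth Engine works through the queue. The sketch below shows one way to summarize progress by counting tasks per state. `summarize_states` is a hypothetical helper, and the status dictionaries are invented stand-ins. In an authenticated session, a comparable list could be built from `ee.batch.Task.list()` and `task.status()`, as noted in the comment.

```python
from collections import Counter


def summarize_states(statuses):
    """Count export tasks by state, given status dictionaries with a 'state' key."""
    return Counter(status.get("state", "UNKNOWN") for status in statuses)


# Hypothetical status dictionaries standing in for live results.
# In an authenticated session, a similar list could be built with:
#   statuses = [task.status() for task in ee.batch.Task.list()]
example_statuses = [
    {"description": "PCD_EX001", "state": "COMPLETED"},
    {"description": "nPCD_7", "state": "RUNNING"},
    {"description": "nPCD_8", "state": "FAILED"},
]

print(summarize_states(example_statuses))
```

Failed tasks are worth inspecting individually, since a location with no usable ASTER coverage would otherwise be missing from the training set without any visible error in the notebook.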