<a href="https://colab.research.google.com/github/remis/mining-discovery-with-deep-learning/blob/master/loadSentinelDams.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Creates image samples from Sentinel 2 collections

This script is part of a research project published in the paper "Mining and Tailings Dam Detection In Satellite Imagery Using Deep Learning" by Remis Balaniuk, Olga Isupova and Steven Reece. The project was developed at the University of Oxford from September 2019 to February 2020.
It is designed to run on the Google Colaboratory platform (see https://colab.research.google.com/notebooks/welcome.ipynb ).  

In [21]:
# !pip install earthengine-api
# !pip install geopandas
import os
import sys
import math

The user must have a Google account and be signed up for Google Earth Engine (see https://earthengine.google.com/).

In [2]:
# Import the Earth Engine library.
import ee
# Trigger the authentication flow.
ee.Authenticate()



Image samples will be saved to the user's Google Drive. The drive must be mounted before proceeding.

In [20]:
from google.colab import drive
drive.mount('/content/drive')


In [11]:
# Import the Earth Engine Python Package
import ee
import pandas as pd
import numpy as np
import geopandas as gpd

In [10]:
# Initialize the Earth Engine object, using the authentication credentials.
ee.Initialize()

In [22]:
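# Cloud masking function for Sentinel-2.
def maskS2clouds(image):
  # In the QA60 band, bit 10 flags opaque clouds and bit 11 flags cirrus clouds.
  cloudBitMask = ee.Number(2).pow(10).int()
  cirrusBitMask = ee.Number(2).pow(11).int()
  qa = image.select('QA60')
  # Keep only the pixels where both flags are zero (clear conditions).
  mask = qa.bitwiseAnd(cloudBitMask).eq(0).And(
    qa.bitwiseAnd(cirrusBitMask).eq(0))
  # `bands` is defined in the next cell; dividing by 10000 converts to reflectance.
  return image.updateMask(mask).select(bands).divide(10000)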
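
The bit test behind this kind of mask can be sketched in plain Python, without Earth Engine. The snippet assumes the QA60 layout documented in the Earth Engine catalog (bit 10 for opaque clouds, bit 11 for cirrus). The names `clear_pixel`, `flag_bits` and the sample values are illustrative and not part of the notebook.

```python
def clear_pixel(qa_value, flag_bits=(10, 11)):
    """Return True when none of the given flag bits is set in qa_value."""
    for bit in flag_bits:
        if qa_value & (1 << bit):
            return False
    return True


# Hypothetical QA60 values: clear, opaque cloud, cirrus, both flags set.
qa_samples = [0, 1 << 10, 1 << 11, (1 << 10) | (1 << 11)]
clear_flags = [clear_pixel(q) for q in qa_samples]
print(clear_flags)  # [True, False, False, False]
```

Only the first sample survives the mask, which is what `updateMask` does per pixel on the real image.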


By editing the next cell, the user can select the spectral bands to include in the image patches.

In [23]:
# Use these bands for prediction.
bands = ['B1', 'B2', 'B3', 'B4', 'B5','B6', 'B7', 'B8', 'B9', 'B10', 'B11', 'B12']

# Use Sentinel-2 Level-1C (top-of-atmosphere reflectance) data.
sentinel = ee.ImageCollection("COPERNICUS/S2")


By editing the next cell, the user can select the time interval (filterDate) and the maximum cloud cover percentage ('CLOUDY_PIXEL_PERCENTAGE') used to filter the images that compose the patches. The shorter the interval, the greater the chance that some pixels have no data to display. Regions with frequent cloud cover, such as rainforest, require a long interval to ensure a complete pixel set.

In [24]:
# The input image is a cloud-masked median composite.
image = sentinel.filterDate('2018-01-01','2020-01-01').filter(ee.Filter.lte('CLOUDY_PIXEL_PERCENTAGE', 20)).map(maskS2clouds).median().toFloat()
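
The effect of the interval length on data gaps can be illustrated without Earth Engine. The following is a minimal pure-Python sketch, not the Earth Engine implementation: each acquisition is a list of pixel values, `None` stands for a cloud-masked pixel, and the composite takes the per-pixel median of the valid observations. The function name `median_composite` and the sample values are hypothetical.

```python
from statistics import median


def median_composite(observations):
    """Per-pixel median over time, ignoring masked (None) observations."""
    n_pixels = len(observations[0])
    composite = []
    for p in range(n_pixels):
        valid = [obs[p] for obs in observations if obs[p] is not None]
        # A pixel with no valid observation stays masked in the composite.
        composite.append(median(valid) if valid else None)
    return composite


# Three hypothetical acquisitions of a 4-pixel strip.
acquisitions = [
    [0.10, None, 0.30, None],
    [0.12, 0.20, None, None],
    [0.14, 0.22, 0.34, None],
]

short_interval = median_composite(acquisitions[:1])  # one acquisition only
long_interval = median_composite(acquisitions)       # all three acquisitions

print(short_interval.count(None), "masked pixels with the short interval")
print(long_interval.count(None), "masked pixels with the long interval")
```

With a single acquisition two pixels remain empty. With all three, only the pixel that was cloudy every time remains empty, which is why persistently cloudy regions need a longer interval.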


In the following cells, the user chooses a csv file from the root of their Google Drive containing the coordinates (latitude and longitude) of the spots from which the image patches will be extracted. The user is also prompted for the column separator used in the csv file. 

The polygons delimiting the areas of interest described in the csv file can be defined using one of the following schemes:

1: two pairs of coordinates indicating the lower-left (southwest) and the upper-right (northeast) corners of the polygon;

2: the coordinates of a central point, with a square drawn around that point. The side length of the square (in meters) is entered at a prompt.

The scheme is inferred from the number of columns in the csv file: five columns select scheme 1 and three columns select scheme 2. All records in the file must therefore use the same scheme.

The last column of the csv file gives a class name for the sample. This class name is used as the prefix of the image file names.

The csv records should look like this (the first line of the file is read as a header row):

#### Column separator ';' and scheme 1:

> lower left y latitude; lower left x longitude; upper right y latitude; upper right x longitude;  class

> -20.893706;-45.271998;-18.854222;-41.958905;area1


#### Column separator ';' and scheme 2:

> central point latitude; central point longitude; class name

> -23.82113889;-50.42022222;dam
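
As a quick way to check a file against these two layouts, the sketch below parses csv text with the standard library and picks the scheme from the column count (five columns for scheme 1, three for scheme 2), mirroring the test done in the main workflow cell. It assumes a header row, as `pandas.read_csv` does by default. The function `read_records` and the short header names are hypothetical.

```python
import csv
import io


def read_records(text, sep=';'):
    """Parse csv text with a header row and return (scheme, rows)."""
    reader = csv.reader(io.StringIO(text), delimiter=sep)
    next(reader)  # skip the header row
    rows = [[field.strip() for field in row] for row in reader if row]
    n_cols = len(rows[0])
    if n_cols == 5:
        scheme = 1  # lower-left and upper-right corners
    elif n_cols == 3:
        scheme = 2  # central point
    else:
        raise ValueError("expected 3 or 5 columns, got %d" % n_cols)
    # Every column except the last (the class name) is a coordinate.
    parsed = [[float(v) for v in row[:-1]] + [row[-1]] for row in rows]
    return scheme, parsed


scheme1_text = ("ll_lat;ll_lon;ur_lat;ur_lon;class\n"
                "-20.893706;-45.271998;-18.854222;-41.958905;area1\n")
scheme2_text = "lat;lon;class\n-23.82113889;-50.42022222;dam\n"

print(read_records(scheme1_text))
print(read_records(scheme2_text))
```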



In [25]:
def offset(lat, lon, dNorth, dEast):
  # Shift (lat, lon), given in degrees, by dNorth and dEast meters.

  # Earth's equatorial radius in meters (spherical approximation).
  R = 6378137

  # Coordinate offsets in radians.
  dLat = dNorth / R
  dLon = dEast / (R * math.cos(math.pi * lat / 180))

  return lat + dLat * 180 / math.pi, lon + dLon * 180 / math.pi


def exportImage(data, scheme, size=0):

  # Loop over the csv records.
  for d in range(data.shape[0]):

    if scheme == 2:
      # Central point plus a square of side `size` (in meters).
      lat = data[d][0]
      lon = data[d][1]

      llLat, llLon = offset(lat, lon, -size / 2, -size / 2)
      urLat, urLon = offset(lat, lon, size / 2, size / 2)

      label = data[d][2]

    else:
      # Lower-left and upper-right corners.
      llLat = data[d][0]
      llLon = data[d][1]
      urLat = data[d][2]
      urLon = data[d][3]

      label = data[d][4]

    # Earth Engine expects coordinate pairs in [longitude, latitude] order.
    geometry = [[llLon, llLat], [llLon, urLat], [urLon, urLat], [urLon, llLat]]

    task_config = {
      'scale': 10,
      'region': geometry
    }

    name = str(label) + str(d)
    # Create a task.
    task = ee.batch.Export.image(image, name, task_config)

    # Send the task to Earth Engine.
    task.start()
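
To make the central-point scheme concrete, the sketch below applies the same spherical approximation as `offset` to compute the corners of a square and then measures the result back in meters. It is a self-contained illustration under that approximation. The helper `square_bounds` and the constant `EARTH_RADIUS_M` are hypothetical names, and the sample point is the one from the scheme 2 example.

```python
import math

EARTH_RADIUS_M = 6378137  # same spherical radius as used in offset()


def square_bounds(lat, lon, side_m):
    """Return (ll_lat, ll_lon, ur_lat, ur_lon) of a square centered on (lat, lon)."""
    half = side_m / 2
    # Meters north/south convert directly to an angle on the sphere.
    d_lat = math.degrees(half / EARTH_RADIUS_M)
    # Meters east/west cover more degrees of longitude away from the equator.
    d_lon = math.degrees(half / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
    return lat - d_lat, lon - d_lon, lat + d_lat, lon + d_lon


center_lat, center_lon = -23.82113889, -50.42022222
ll_lat, ll_lon, ur_lat, ur_lon = square_bounds(center_lat, center_lon, 1000)

# Convert the corner differences back to meters as a sanity check.
height_m = math.radians(ur_lat - ll_lat) * EARTH_RADIUS_M
width_m = (math.radians(ur_lon - ll_lon) * EARTH_RADIUS_M
           * math.cos(math.radians(center_lat)))
print(round(height_m, 3), round(width_m, 3))
```

Both measurements come back as approximately 1000 m, so the square keeps its requested side length even though the longitude span in degrees is larger than the latitude span.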

In [27]:
#MAIN WORKFLOW

# Assumes the csv file is in the "My Drive" root folder (change the %cd line otherwise).
%cd /content/drive/My Drive/
files = []
count=0
for f in os.listdir('./'):
  name, ext = os.path.splitext(f)
  if ext == '.csv':
    files.append(f)
    count+=1
    print(count,":",f)

if not files:
  print("No csv file found!")
  sys.exit(0)

# Keep asking until a valid file number is entered.
print("Choose your file:")
r = 0
while not 1 <= r <= len(files):
  try:
    r=int(input('Input:'))
  except ValueError:
    print("Not a number")

print("csv separator? (typically ';' or ',')")
sep=input('Input:')

data = pd.read_csv(files[r-1], sep= sep)
data = data.values

print(data.shape[0],"records with",data.shape[1],"columns")

if data.shape[1]==3:
  print("Central point scheme. Please enter the square side length (in meters):")
  # Keep asking until a positive integer is entered.
  size = 0
  while size <= 0:
    try:
      size=int(input('Input:'))
    except ValueError:
      print("Not a number")
  exportImage(data,2,size)
elif data.shape[1]==5:
  exportImage(data,1)
else:
  print("Invalid csv file!")
  sys.exit(0)


If the script ran successfully, the tasks should be visible in the Google Earth Engine Code Editor (https://code.earthengine.google.com/). The user must log in to authorize the execution of the tasks.
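
Task progress can also be followed from Python. The sketch below is one possible approach, assuming each task object exposes a `status()` method returning a dictionary with a `'state'` entry (and usually a `'description'` entry), as Earth Engine batch tasks do. The helper `wait_for_tasks`, its parameters and the set `FINISHED_STATES` are hypothetical, and the notebook's `exportImage` would first need to collect its tasks in a list.

```python
import time

# States after which a task will not change any more (assumed set).
FINISHED_STATES = {'COMPLETED', 'FAILED', 'CANCELLED'}


def wait_for_tasks(tasks, poll_seconds=30, max_polls=None):
    """Poll task.status() until every task has finished.

    Returns a dict mapping each task description to its last reported state.
    """
    polls = 0
    while True:
        states = {}
        for task in tasks:
            status = task.status()
            states[status.get('description', str(task))] = status['state']
        polls += 1
        if all(state in FINISHED_STATES for state in states.values()):
            return states
        if max_polls is not None and polls >= max_polls:
            return states  # give up and report the current states
        time.sleep(poll_seconds)
```

A call such as `wait_for_tasks(my_tasks, poll_seconds=60)` would block until every export has completed, failed or been cancelled, where `my_tasks` is a list of started tasks.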