<a href="https://colab.research.google.com/github/YoungHyunKoo/GEE_remote_sensing/blob/main/test2024_5_1_Deep_Learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### **[GEO 6083] Remote Sensing Image Processing - Spring 2024**
# **WEEK 5-1. Deep learning with Earth Engine**

### OBJECTIVES
1. Import Earth Engine images as NumPy arrays
2. Train and test a simple neural network (NN) model with the imported images

Created by Younghyun Koo (kooala317@gmail.com)

## **Convert Earth Engine image into Xarray format**

In this tutorial, we will train and test a **neural network (NN)** model to detect and filter out cloud cover in a Landsat 8 image. Earth Engine offers its own machine learning workflow through Vertex AI ([Earth Engine Vertex AI example](https://developers.google.com/earth-engine/guides/ml_examples)), but that service requires a commercial license and charges for Google Cloud usage. Instead of using the GEE AI platform, we will therefore export the `ee.Image` into a more accessible, Python-compatible array format that works with standard machine learning libraries. We will use an external library called `wxee` to convert the GEE image into `xarray` format. `xarray` is a popular package for processing labeled multi-dimensional array data. You can find more information here: https://xarray.dev/



In [None]:
# Import ee library
import ee

# Authenticate
ee.Authenticate()

# Initialize with your own project.
ee.Initialize(project = "utsa-spring2024")

In [None]:
# Import geemap library
import geemap

In [None]:
# Import geopandas, pandas, and NumPy libraries
import geopandas
import pandas as pd
import numpy as np

In [None]:
# Install wxee library (convert ee to xarray)
!pip install wxee

In [None]:
# For interactive plots
!pip install ipympl

In [None]:
import wxee
import matplotlib.pyplot as plt

In [None]:
# Target area: San Antonio area
AOI = ee.Geometry.Rectangle(
  [
    [-98.50, 29.20],
    [-98.30, 29.50]
  ]
)

# Import Landsat 8 TOA image data
# Keep only cloudy scenes (CLOUD_COVER > 20) so the image contains clouds to detect
dataset = ee.ImageCollection("LANDSAT/LC08/C02/T1_TOA")\
.filterDate('2015-01-01', '2015-12-31')\
.filterMetadata('CLOUD_COVER', 'greater_than', 20) \
.filterMetadata('WRS_ROW', 'equals', 40)\
.filterMetadata('WRS_PATH', 'equals', 27)\
.filterBounds(AOI) \
.sort("CLOUD_COVER")

# Take the least cloudy scene that passes the filters and clip it to the AOI
img = dataset.first().clip(AOI)

# Visualization parameters for a true color (RGB) composite
trueColorVis = {
    'bands': ['B4', 'B3', 'B2'],
    'min': 0.0,
    'max': 0.3,
}

Map = geemap.Map()

Map.centerObject(img, 12)
Map.addLayer(img, trueColorVis, 'True Color')

Map

In [None]:
# Get the information about this image
img.getInfo()

In [None]:
# Export this image into xarray format (B1-B7)
bands = ['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7']
arr = img.select(bands).wx.to_xarray(scale = 30, crs = "EPSG:32614", progress = True)

In [None]:
arr

In [None]:
# Show RGB true color image
RGB = np.dstack([arr['B4'][0], arr['B3'][0], arr['B2'][0]])
plt.imshow(RGB)
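
Matplotlib expects floating-point RGB values between 0 and 1, so a composite often looks clearer after a contrast stretch. The sketch below applies the same 0.0 to 0.3 stretch used in `trueColorVis` to a small synthetic array. `stretch_rgb` is a hypothetical helper written for this illustration; it is not part of `wxee` or `geemap`, and the band values are made up.

```python
import numpy as np

def stretch_rgb(red, green, blue, vmin=0.0, vmax=0.3):
    """Stack three 2-D bands into a (rows, cols, 3) array scaled to [0, 1]."""
    rgb = np.dstack([red, green, blue]).astype(float)
    rgb = (rgb - vmin) / (vmax - vmin)
    # Values outside the stretch range are clipped to 0 or 1
    return np.clip(rgb, 0.0, 1.0)

# Synthetic 2 x 2 reflectance bands (illustrative values only)
red = np.array([[0.05, 0.15], [0.30, 0.60]])
green = np.array([[0.06, 0.12], [0.28, 0.55]])
blue = np.array([[0.08, 0.10], [0.25, 0.50]])

rgb_demo = stretch_rgb(red, green, blue)
print(rgb_demo.shape)
```

With the notebook's data, the same helper could be called as `stretch_rgb(arr['B4'][0], arr['B3'][0], arr['B2'][0])` before passing the result to `plt.imshow`.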

## **Prepare training datasets**

The NN model is fully **data-driven**: it learns to distinguish cloud from non-cloud pixels only from the examples it is given. Training it therefore requires many labeled samples. In this section, we will label cloud and non-cloud pixels manually by clicking on the image: a left click marks a cloud pixel and a right click marks a non-cloud pixel.

In [None]:
# Define a function that extracts band information for a specific point
def derive_inputs(data, coord, bands):
  # data: xarray format of the input image
  # coord: x and y pixel location of the selected point
  # bands: input bands (B1-B7)

  # Output - dataframe format
  df = pd.DataFrame({})

  if len(data) > 0:

    for band in bands:

      df.loc[0, band] = data[band][0][(coord[1], coord[0])]

  return df
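
Note that `derive_inputs` swaps the coordinate order when indexing: a click returns `(x, y)`, but a 2-D image array is indexed as `[row, column]`, that is `[y, x]`. A minimal NumPy sketch of this convention, using a made-up 3 x 4 array:

```python
import numpy as np

# A toy image with 3 rows (y) and 4 columns (x)
image = np.arange(12).reshape(3, 4)

# A click at column x = 3, row y = 1
x, y = 3, 1

# Index with the row first, then the column
value = image[y, x]
print(value)  # 7
```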

In [None]:
%matplotlib widget
fig, ax = plt.subplots(constrained_layout=True, figsize = (8, 8))

plt.imshow(RGB)

# Function for storing and showing the clicked values
coord1 = []
coord2 = []

def onclick(event):

    global coord1, coord2, cloud

    # Ignore clicks outside the image axes (xdata and ydata are None there)
    if event.xdata is None or event.ydata is None:
        return

    x = int(event.xdata)
    y = int(event.ydata)

    if event.button == 1: # Left click (clouds)
        coord1.append((x, y))
        ax.scatter(x, y, marker = "x", color = "r")
        df = derive_inputs(arr, (x,y), bands)
        df['cloud'] = 1
        cloud = pd.concat([cloud, df]).reset_index(drop = True)

    elif event.button == 3: # Right click (non-clouds)
        coord2.append((x, y))
        ax.scatter(x, y, marker = "x", color = "b")
        df = derive_inputs(arr, (x,y), bands)
        df['cloud'] = 0
        cloud = pd.concat([cloud, df]).reset_index(drop = True)

    fig.canvas.draw() # Redraw the figure

cloud = pd.DataFrame({})

fig.canvas.mpl_connect('button_press_event', onclick)

plt.show()

In [None]:
%matplotlib widget
fig, ax = plt.subplots(constrained_layout=True, figsize = (8, 8))

plt.imshow(RGB)

# Function for storing and showing the clicked values (cloud shadows)
coord1 = []
coord2 = []

def onclick(event):

    global coord1, coord2, cloud, shadow

    # Ignore clicks outside the image axes (xdata and ydata are None there)
    if event.xdata is None or event.ydata is None:
        return

    x = int(event.xdata)
    y = int(event.ydata)

    if event.button == 1: # Left click (cloud shadows)
        coord1.append((x, y))
        ax.scatter(x, y, marker = "x", color = "r")
        df = derive_inputs(arr, (x,y), bands)
        df['cloud'] = 2
        # DataFrame.append was removed in pandas 2.0; use pd.concat instead
        shadow = pd.concat([shadow, df]).reset_index(drop = True)

    fig.canvas.draw() # Redraw the figure

shadow = pd.DataFrame({})

fig.canvas.mpl_connect('button_press_event', onclick)

plt.show()

In [None]:
cloud

In [None]:
# X: input data (band values); y: output data (binary classification of cloud and non-cloud)
X = cloud[bands]
y = cloud['cloud']

In [None]:
X

In [None]:
y

In [None]:
from sklearn.model_selection import train_test_split

# Split train and test datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.4)
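
`train_test_split` shuffles the samples before splitting, so each run produces a different split unless `random_state` is fixed. The NumPy-only sketch below illustrates the idea of a seeded shuffle followed by a hold-out split. `simple_split` is a hypothetical helper for illustration and is not scikit-learn's implementation; the data are synthetic.

```python
import numpy as np

def simple_split(features, labels, test_size=0.4, seed=0):
    """Shuffle rows with a fixed seed, then hold out a fraction for testing."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_test = int(np.ceil(len(features) * test_size))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return features[train_idx], features[test_idx], labels[train_idx], labels[test_idx]

# Synthetic data: 10 samples, 2 features, alternating labels
features = np.arange(20, dtype=float).reshape(10, 2)
labels = np.array([0, 1] * 5)

Xtr, Xte, ytr, yte = simple_split(features, labels)
print(Xtr.shape, Xte.shape)
print("Cloud fraction in the training set:", ytr.mean())
```

Checking the class fraction after splitting is a quick way to spot a training set that contains almost no cloud (or non-cloud) samples.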

## **Train neural network**

A neural network, specifically a **multi-layer perceptron (MLP)** in this case, is a supervised learning algorithm that learns a function by training on a dataset. Its basic building block is the **perceptron (or neuron)**. In simple terms, a perceptron receives inputs, multiplies them by weights, sums the results together with a bias term, and passes that sum through an activation function (such as logistic, relu, tanh, or identity) to produce an output. Stacking layers of these perceptrons produces a multi-layer perceptron. A neural network has three types of layers: input, hidden, and output. The input layer receives the data directly, the output layer produces the required output, and the hidden layers in between carry out the intermediate computation. See the scikit-learn guide: [Multilayer perceptron](https://scikit-learn.org/stable/modules/neural_networks_supervised.html)

https://dev.to/dattran1999/how-neural-networks-work-dma
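
As a rough illustration of the perceptron described above, the following sketch computes a weighted sum of seven inputs plus a bias and passes it through a sigmoid activation. The input values, weights, and bias are arbitrary numbers chosen for the example; a trained network would learn its own weights.

```python
import numpy as np

def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = np.dot(inputs, weights) + bias
    return 1.0 / (1.0 + np.exp(-z))

# Seven illustrative band values (B1-B7) and arbitrary weights
inputs = np.array([0.30, 0.28, 0.27, 0.29, 0.35, 0.20, 0.15])
weights = np.full(7, 0.5)
bias = -0.5

output = perceptron(inputs, weights, bias)
print(round(float(output), 3))
```

The output always falls between 0 and 1, which is why a sigmoid is a natural choice for the final layer of a binary classifier.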

<img src = "https://www.ibm.com/content/dam/connectedassets-adobe-cms/worldwide-content/cdp/cf/ul/g/3a/b8/ICLH_Diagram_Batch_01_03-DeepNeuralNetwork.component.simple-narrative-xl.ts=1708454686214.png/content/adobe-cms/us/en/topics/neural-networks/jcr:content/root/table_of_contents/body/content_section_styled/content-section-body/simple_narrative_2144712998/image" width = 600>

<img src = "https://res.cloudinary.com/practicaldev/image/fetch/s--nlat4t7K--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://pythonmachinelearning.pro/wp-content/uploads/2017/09/Single-Perceptron.png.webp" width = 600>

In [None]:
# Dependencies: Import keras package
import keras
from keras.models import Sequential
from keras.layers import Dense

# Design a neural network
model = Sequential()
# Use the lowercase activation name 'relu', which Keras accepts across versions
model.add(Dense(10, input_dim=7, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
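
The hidden layers above use the ReLU activation and the output layer uses a sigmoid. A minimal NumPy sketch of what these functions compute (with tanh for comparison) is shown below; the function names are defined here for illustration and are not Keras calls.

```python
import numpy as np

def relu(z):
    """Pass positive values through and set negative values to zero."""
    return np.maximum(0.0, z)

def sigmoid(z):
    """Squash any real value into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 2.0])
print("relu   :", relu(z))
print("sigmoid:", sigmoid(z))
print("tanh   :", np.tanh(z))
```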

**NOTE:** Various types of activation functions:

<img src = "https://aman.ai/primers/ai/assets/activation/1.png" width = 500>

In [None]:
model.summary()
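
The parameter counts reported by `model.summary()` can be checked by hand: each dense layer has one weight per input-unit pair plus one bias per unit. The sketch below does this arithmetic for the 7-10-10-1 network defined above; `dense_params` is a hypothetical helper name.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Layer sizes: 7 input bands, two hidden layers of 10 units, 1 output unit
layer_sizes = [7, 10, 10, 1]

counts = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
print(counts)       # [80, 110, 11]
print(sum(counts))  # 201
```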

In [None]:
# Compile model: loss function, gradient optimizer, etc.
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=["accuracy"])
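
The `binary_crossentropy` loss penalizes predicted probabilities that are far from the true label (0 or 1). The following NumPy sketch shows the standard formula on made-up predictions; it is meant as an illustration of the idea, not as Keras's internal implementation.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    # Clip probabilities to avoid log(0)
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))

y_true = np.array([1, 0, 1, 0])
confident = np.array([0.9, 0.1, 0.8, 0.2])  # close to the true labels
unsure = np.array([0.6, 0.4, 0.5, 0.5])     # close to 0.5

loss_confident = binary_crossentropy(y_true, confident)
loss_unsure = binary_crossentropy(y_true, unsure)
print(loss_confident < loss_unsure)  # True
```

Predictions that are confident and correct give a lower loss than hesitant ones, which is the signal the optimizer uses to update the weights.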

<img src = "https://miro.medium.com/v2/resize:fit:1400/format:webp/1*SCz0aTETjTYC864Bqjt6Og.png" width = 600>

In [None]:
# Train model: set up some parameters, e.g., epochs, batch size, etc.
history = model.fit(X_train, y_train, validation_data = (X_test, y_test), epochs=100, batch_size=16)
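
The `epochs` and `batch_size` arguments together determine how many weight updates the model performs. The sketch below works through the arithmetic for a hypothetical training set of 60 samples; the actual number depends on how many points were clicked.

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """Number of batches (weight updates) needed to see every sample once."""
    return math.ceil(n_samples / batch_size)

n_train = 60      # hypothetical number of training samples
batch_size = 16
epochs = 100

steps = updates_per_epoch(n_train, batch_size)
print(steps)           # 4
print(steps * epochs)  # 400
```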

In [None]:
# Check final test accuracy
test_loss, test_acc = model.evaluate(X_test,  y_test, verbose=2)
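
Accuracy alone can be misleading when one class dominates the samples. As a complement, the sketch below counts true and false positives and negatives and derives precision and recall from made-up labels and probabilities. `binary_metrics` is a hypothetical helper, not a Keras or scikit-learn function.

```python
import numpy as np

def binary_metrics(y_true, y_prob, threshold=0.5):
    """Confusion-matrix counts plus accuracy, precision, and recall."""
    y_pred = (y_prob > threshold).astype(int)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "accuracy": accuracy, "precision": precision, "recall": recall}

# Made-up labels (1 = cloud) and predicted probabilities
y_true_demo = np.array([1, 1, 1, 0, 0, 0])
y_prob_demo = np.array([0.9, 0.7, 0.4, 0.6, 0.2, 0.1])

metrics = binary_metrics(y_true_demo, y_prob_demo)
print(metrics)
```

With the notebook's variables, one possible call would be `binary_metrics(y_test.to_numpy(), model.predict(X_test).ravel())`.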

In [None]:
# Switch matplotlib back to static (inline) mode
%matplotlib inline

# Draw learning plot
plt.figure()
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

In [None]:
def predict_cloud(data, model, bands):
  # data: xarray Dataset with one variable per band
  # model: trained Keras NN model
  # bands: list of band names (variables in the xarray Dataset) used as input features

  # Initialize the output result with zeros (same grid size as the input data)
  result = np.zeros((data.y.shape[0], data.x.shape[0]))

  # Get all pixel values as a single dataframe (initialize dataframe)
  x_input = pd.DataFrame(columns = bands)

  # Assign pixel values to dataframe
  for b in bands:
    x_input[b] = data[b][0].values.flatten()

  # Predict cloud probability using NN model
  result0 = model.predict(x_input)

  # Reshape the tabular data to grid format (same gridsize with the original array)
  result0 = result0.reshape((data.y.shape[0], data.x.shape[0]))

  # Assign binary values (0 or 1) to result (threshold 0.5)
  result[result0 > 0.5] = 1

  return result


In [None]:
# Apply model to the entire image
result = predict_cloud(arr, model, bands)
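
`predict_cloud` relies on a flatten-and-reshape round trip: each band is flattened into one long column so the model can score every pixel, and the predictions are then reshaped back to the image grid. The sketch below shows that round trip on a tiny synthetic grid, with a stand-in for the model output (the `fake_prob` values are made up, not real predictions).

```python
import numpy as np

n_rows, n_cols = 3, 4

# A toy band with 3 rows and 4 columns
band = np.arange(12, dtype=float).reshape(n_rows, n_cols)

# Flatten to one value per pixel, as done for each band in predict_cloud
flat = band.flatten()

# Stand-in for model.predict output, which has shape (n_pixels, 1)
fake_prob = (flat / flat.max()).reshape(-1, 1)

# Reshape back to the image grid and apply the 0.5 threshold
grid = fake_prob.reshape((n_rows, n_cols))
mask = (grid > 0.5).astype(int)

print(grid.shape)
print(mask)
```

Because `flatten` and `reshape` both use row-major order by default, each prediction lands back on the pixel it came from.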

In [None]:
%matplotlib inline
fig, ax = plt.subplots(1,2,figsize = (12,5))
# imshow ignores vmin/vmax for RGB arrays, so stretch the values to 0.0-0.2 manually
ax[0].imshow(np.clip(RGB / 0.2, 0, 1))
ax[0].set_title("Original RGB True color")
ax[1].imshow(result)
ax[1].set_title("Result: detected clouds")
plt.show()
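
Once a binary cloud mask is available, it can be used to summarize cloud cover or to blank out cloudy pixels before further analysis. A minimal sketch with a made-up 2 x 3 mask and band follows; the variable names are hypothetical and the values are illustrative only.

```python
import numpy as np

# Made-up cloud mask (1 = cloud, 0 = clear) and a matching reflectance band
cloud_mask = np.array([[1, 0, 0],
                       [1, 1, 0]])
band_demo = np.array([[0.40, 0.10, 0.12],
                      [0.45, 0.50, 0.08]])

# Fraction of pixels flagged as cloud
cloud_fraction = cloud_mask.mean()
print(cloud_fraction)  # 0.5

# Replace cloudy pixels with NaN so they are ignored by nan-aware statistics
clear_band = np.where(cloud_mask == 1, np.nan, band_demo)
print(np.nanmean(clear_band))
```

With the notebook's variables, the same pattern could be applied as `np.where(result == 1, np.nan, arr['B4'][0])`.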

## References
- https://wxee.readthedocs.io/en/latest/
- https://www.tensorflow.org/tutorials