# Transfer Learning with Convolutional Neural Networks for Hydrological Streamline Detection

## Authors:
Nattapon Jaroenchai a, b
Shaowen Wang a, b, *
Lawrence V. Stanislawski c
Ethan Shavers c
E. Lynn Usery c
Shaohua Wang a, b
Sophie Wang
Li Chen a, b

a Department of Geography and Geographic Information Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
b CyberGIS Center for Advanced Digital and Spatial Studies, University of Illinois at Urbana-Champaign, Urbana, IL, USA
c U.S. Geological Survey, Center of Excellence for Geospatial Information Science, Rolla, MO, USA
d School of Geoscience and Info-Physics, Central South University, Changsha, Hunan, China

## Last Updated Date: August 10, 2023

## Abstract
Streamline network delineation plays a vital role in various scientific disciplines and business applications, such as agricultural sustainability, river dynamics, wetland inventory, watershed analysis, surface water management, and flood mapping. Traditionally, streamlines have been extracted with flow accumulation techniques, which delineate them primarily based on topological information. Recently, machine learning techniques such as the U-net model have been applied to streamline delineation. Although such a model shows promising performance in the geographic areas it has been trained on, its performance drops significantly when it is applied to other areas. In this paper, we apply a transfer learning approach: we start from network architectures pre-trained on a large dataset, ImageNet, and fine-tune them using smaller datasets collected from the Rowan Creek and Covington areas in the US. When we compared the models pre-trained on ImageNet with an attention U-net model after fine-tuning on the Rowan Creek area, the DenseNet169 model achieved an F1-score of 85%, about 4% higher than the attention U-net model. Then, to compare the transferability of the models, the top three models in the Rowan Creek area and the attention U-net model were fine-tuned further with samples from the Covington area. We achieved an F1-score of 71.87% in predicting stream pixels in the Covington area, which is significantly higher than training the model from scratch with samples collected from the Covington area and slightly higher than the attention U-net model.

## Keywords:
Transfer Learning, Convolutional neural network, Remote Sensing, Streamline detection

## Table of Contents
1. [Study Areas and Input Data](https://colab.research.google.com/drive/1IpcytvADe_jgczhTnPRg2iTlPTAV37zG#scrollTo=fu-qrWMzfDh_&line=1&uniqifier=1)
2. [Model Training Process](#)
3. [Model Evaluation Process](#)


## 1. Introduction

In this study, we investigate the transferability of models across two distinct locations: the watershed in Rowan County, North Carolina, and the Covington area in Virginia.

### 1.1 Study Areas

#### 1.1.1 Rowan County, North Carolina

The data for Rowan County, North Carolina (Figure 1), is sourced from the study by Xu et al. (2021). This area comprises a network of tributaries flowing into Second Creek, the primary flowline feature of 12-digit NHD watershed 030401020504. The dataset encompasses 1,400 training samples and 30 validation samples extracted from the upper portion of the area. The test data covers the entire lower area.

![Figure 1: Rowan County area](notebook_data/rowan_county_figure.jpg)
*Figure 1: Rowan County area (left: boundary of North Carolina; middle: a 1-m resolution image of the study area from the National Agriculture Imagery Program (NAIP); right: reference stream features). Source: Xu et al., 2021.*

Eight raster layers are stacked to create the dataset, including a 1-m resolution digital elevation model (DEM), geometric curvature, topographic position index (TPI), zenith angle positive openness, return intensity, and point density information. The statistics for each raster layer are summarized in Table 1.

**Table 1: Summary statistics of the raster images for Rowan County, NC**  

|Raster Image Name|Minimum|Maximum|Mean|Standard Deviation|Range|  
|---|---|---|---|---|---|  
|Digital elevation model (meters)|194.11|256.19|229.07|12.96|62.07|  
|Geometric curvature|-97.25|97.93|0.01|3.05|195.18|  
|Topographic position index (3x3 window)|-8.59|5.58|6.38|0.18|14.17|  
|Topographic position index (21x21 window)|-13.62|13.29|0|0.93|26.91|  
|Openness (R10, D32) degrees|21.52|118.8|83.41|7.35|97.28|  
|Return intensity|0|55185.39|29047.18|10624.11|55185.39|  
|Return point density 1 ft above ground (points per m²)|0|0.94|0.02|0.04|0.94|  
|Return point density 3 ft above ground (points per m²)|0|2.89|0.12|0.23|2.89|  

*Source: Xu et al., 2021.*
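
The stacking of the eight layers described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the layer names, the `stack_layers` helper, the 64 x 64 size, and the random stand-in data are all assumptions made for the example, and real rasters would first need to be co-registered on the same 1-m grid.

```python
import numpy as np

# Hypothetical short names for the eight layers, in the order of Table 1.
LAYER_NAMES = [
    "dem", "curvature", "tpi_3", "tpi_21", "openness",
    "intensity", "density_1ft", "density_3ft",
]


def stack_layers(layers):
    """Stack co-registered 2-D rasters into one (H, W, C) float32 array."""
    shapes = {layer.shape for layer in layers}
    if len(shapes) != 1:
        raise ValueError(f"all layers must share one shape, got {shapes}")
    return np.stack(layers, axis=-1).astype(np.float32)


# Synthetic stand-ins for the real rasters (assumption: 64 x 64 pixels).
rng = np.random.default_rng(0)
layers = [rng.random((64, 64)) for _ in LAYER_NAMES]
stacked = stack_layers(layers)
print(stacked.shape)  # (64, 64, 8)
```

Placing the layers on the last axis gives the channels-last layout that most image-model frameworks accept by default; a channels-first layout would work equally well if the model expects it.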

#### 1.1.2 Covington River Watershed, Virginia

The second study area is the 12-digit NHD Hydrologic Unit (HU) 020801030302 watershed, encompassing the primary tributaries of the Covington and Rush Rivers in Rappahannock County, northern Virginia (Figure 2). The area covers 108 square kilometers and exhibits diverse land cover, temperature ranges, and elevation characteristics. The watershed's features were rasterized to 1-m resolution for reference.

![Figure 2: Covington area](notebook_data/covington_area_figure.jpg)
*Figure 2: Covington area (left: boundary of Virginia, USA; middle: a 1-m resolution image of the study area from the National Agriculture Imagery Program (NAIP); right: reference stream features).*

Eight 1-m resolution lidar and elevation-derived raster data layers were employed for training, validation, and testing. These layers comprise a digital elevation model (DEM), geometric curvature, slope, positive openness, topographic position index (TPI) at two moving window sizes (3 and 21), return intensity, and geomorphons. Summary statistics for these layers are presented in Table 2.

**Table 2: Summary statistics of the raster images for Covington River watershed, VA**

|Raster Image Name|Minimum|Maximum|Mean|Standard Deviation|Range|
|---|---|---|---|---|---|
|Digital elevation model (DEM) (m)|125.45|1039.15|365.90|190.98|913.70|
|Geometric curvature|-1.99|1.99|0.0001|0.0941|3.99|
|TPI with moving window size 3|-5.70|5.72|0.00000235|0.0661|11.42|
|TPI with moving window size 21|-14.70|12.80|0.0001|0.2873|27.50|
|Positive openness|45.35|162.61|88.83|2.4694|117.26|
|Lidar reflectance|0.00|255.00|39.32|12.61|255.00|
|Slope data (degree)|0.00|14.26|0.23|0.17|14.26|

Note: Geomorphons are integer-coded discrete classes; therefore, their statistics are not included in this table.
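
Summary statistics of the kind reported in Tables 1 and 2 can be computed per layer from a stacked array. The sketch below is illustrative only: `layer_statistics` and the tiny two-layer demo array are hypothetical, and the source does not state whether the reported standard deviation is the population or the sample form, so the population form (NumPy's default) is assumed here.

```python
import numpy as np


def layer_statistics(stack, names):
    """Return min, max, mean, std, and range for each channel of an (H, W, C) stack."""
    stats = {}
    for index, name in enumerate(names):
        band = stack[..., index]
        lo, hi = float(band.min()), float(band.max())
        stats[name] = {
            "min": lo,
            "max": hi,
            "mean": float(band.mean()),
            "std": float(band.std()),  # assumption: population form (ddof=0)
            "range": hi - lo,
        }
    return stats


# Tiny synthetic two-layer stack so the numbers can be checked by hand.
demo = np.zeros((2, 2, 2))
demo[..., 0] = [[1.0, 2.0], [3.0, 4.0]]
demo[..., 1] = [[-1.0, 0.0], [0.0, 1.0]]
summary = layer_statistics(demo, ["layer_a", "layer_b"])
print(summary["layer_a"]["mean"], summary["layer_a"]["range"])  # 2.5 3.0
```

Two quick consistency checks follow from these definitions: the range column should equal maximum minus minimum, and each mean should lie between its layer's minimum and maximum.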

For training, 200 initial sample patches were extracted and augmented to generate 1,400 samples for the training dataset. Additionally, 30 unaugmented samples were extracted for the validation dataset. The southern region of the study area served as the test dataset for evaluating model performance and generalization.
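
Going from 200 patches to 1,400 samples implies seven samples per patch. The source does not say which augmentations were used, so the sketch below assumes one plausible scheme: the original patch plus three rotations, two flips, and a transpose. The `augment_patch` and `build_training_set` helpers, the 16 x 16 patch size, and the random data are hypothetical.

```python
import numpy as np


def augment_patch(patch):
    """Return 7 variants of a square (H, W, C) patch: the original plus 6 transforms."""
    return [
        patch,
        np.rot90(patch, k=1, axes=(0, 1)),  # rotate 90 degrees
        np.rot90(patch, k=2, axes=(0, 1)),  # rotate 180 degrees
        np.rot90(patch, k=3, axes=(0, 1)),  # rotate 270 degrees
        np.flip(patch, axis=1),             # left-right flip
        np.flip(patch, axis=0),             # up-down flip
        np.transpose(patch, (1, 0, 2)),     # swap rows and columns
    ]


def build_training_set(patches):
    """Augment every patch and stack the results into one array."""
    return np.stack([variant for p in patches for variant in augment_patch(p)])


# Synthetic stand-in: 200 patches with 8 channels (patch size is an assumption).
rng = np.random.default_rng(0)
patches = rng.random((200, 16, 16, 8))
train = build_training_set(patches)
print(train.shape)  # (1400, 16, 16, 8)
```

Whatever transforms are chosen, the same transform has to be applied to each patch's reference stream mask so that inputs and labels stay aligned.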
