# Using GreenlandCHANGES

The goals of this tutorial are:
1. describe the steps to initiate the changes module
2. highlight several features of the GreenlandCHANGES class to facilitate custom usage
3. show an example to run GreenlandCHANGES to compile ArcticDEM elevation data

Note that an `init.py` file is also provided in the `changes` module; it can be edited and run to achieve results similar to those described below.

## Initializing the GreenlandCHANGES class

To initialize the module from any directory, first add the package to your path:

In [1]:
import sys
sys.path.insert(0, "/Users/mhwood/Documents/Research/Scripts/CHANGES/GreenlandCHANGES")

Next, define two directories on your local drive as follows:

| directory | purpose | approximate file size |
|-----------|---------|-----------------------|
|`project_folder` | This is the path where output data from the changes module will be stored - the data to be used directly for analysis. | The file sizes for this path can be up to a few GB depending on the size and resolution of the sample domain, and the number of sources accessed. |
|`data_folder` | This is the path where ice velocity and elevation data, from their respective sources, will be stored. The `data_folder` option was created to facilitate data storage on external drives. | Depending on the data source and whether raw data is kept on disk, this can be several hundred GB. |

In [2]:
project_folder = '/Users/mhwood/Documents/Research/Projects/CHANGES/Examples/'
data_folder='/Volumes/mhwood/Research/Data Repository/Greenland'
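Because the `data_folder` often lives on an external drive, it can help to confirm that both paths resolve before initializing the class. The following is a minimal sketch using only the standard library; `check_folder` is a hypothetical helper written for this tutorial, not part of the `changes` module.

```python
from pathlib import Path


def check_folder(path, create=False):
    """Return True if `path` is an existing directory.

    Hypothetical helper (not part of the changes module).
    If `create` is True, the directory and any missing parents are created first.
    """
    folder = Path(path).expanduser()
    if create:
        folder.mkdir(parents=True, exist_ok=True)
    return folder.is_dir()
```

One possible usage is `check_folder(project_folder, create=True)` followed by `check_folder(data_folder)`. Leaving `create` off for the `data_folder` is a deliberate choice in this sketch: if an external volume is not mounted, a `False` result is more useful than silently creating an empty directory somewhere else.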

Next, initialize the GreenlandCHANGES object - this object will contain all pertinent information to initialize the data grids in your region of interest.

In [3]:
import changes.initiation.GreenlandCHANGES as gc
GC = gc.GreenlandCHANGES(project_folder,data_folder)

To view the initial attributes in the `GC` class, use the method `print_initiation_parameters`:

In [4]:
GC.print_initiation_parameters()

Region Parameters:
    region_initiated:  False
    region_name:  untitled_region
    extents:  []
 
Velocity Parameters:
    compile_velocity:  True
    velocity_grid_posting:  300
    velocity_grid_epsg:  3413
    create_velocity_stacks:  True
    Velocity Sources:
        compile_golive_data: True
        compile_tsx_data: True
 
Elevation Parameters:
    compile_elevation:  True
    elevation_grid_posting:  50
    elevation_grid_epsg:  3413
    create_elevation_stacks:  True
    Elevation Sources:
        compile_arcticDEM_data: True
        compile_gimp_data: True
        compile_glistin_data: True
        compile_icesat2_data: True
        compile_oib_data: True


There are a few things to note for the initial parameters:
1. The region has not yet been initiated
2. The default parameters for velocity and elevation are all "True". In other words, the module, by default, will attempt to download ALL available velocity and elevation data.

In the next steps, we will adjust these parameters to our needs.

**When all of the parameters have been set correctly, the module will be ready to run using the `execute_velocity_and_elevation_compilations` method**

## Defining the region of interest

First, try to run the `execute_velocity_and_elevation_compilations` method without specifying the region and its extents:

In [5]:
GC.execute_velocity_and_elevation_compilations()

!! Alert !!
!! Please define a region name and valid extents before running the compilation !!
   There are two options:
   Option 1: Pre-defined glacier extent:
      GC.set_extents_by_glacier([glacier name here])
   Option 2: Custom region and extents:
      GC.region_name = [your region here]
      GC.extents = [min_x,min_y,max_x,max_y]


Here, we get an alert indicating that the region has not yet been defined - the module does not yet know where to look for available ice velocity and elevation data. Luckily, there are two options for defining it.

#### 1. Defining the domain with pre-defined glacier extents

In the GreenlandCHANGES package, we have provided approximate extents for over 200 Greenland glaciers. These are stored in the `reference` repository.

For example, if you were interested in studying Helheim glacier, you can define the region and extents as follows:

In [6]:
GC.set_extents_by_glacier('Helheim')

Now, you can check the parameters of the `GC` class to see that the region has now been defined and the extents are present:

In [7]:
GC.print_initiation_parameters()

Region Parameters:
    region_initiated:  True
    region_name:  Helheim
    extents:  [291775.0, -2597975.0, 331975.0, -2557775.0]
 
Velocity Parameters:
    compile_velocity:  True
    velocity_grid_posting:  300
    velocity_grid_epsg:  3413
    create_velocity_stacks:  True
    Velocity Sources:
        compile_golive_data: True
        compile_tsx_data: True
 
Elevation Parameters:
    compile_elevation:  True
    elevation_grid_posting:  50
    elevation_grid_epsg:  3413
    create_elevation_stacks:  True
    Elevation Sources:
        compile_arcticDEM_data: True
        compile_gimp_data: True
        compile_glistin_data: True
        compile_icesat2_data: True
        compile_oib_data: True


#### 2. Defining a custom region and extents

Suppose you wanted to look at a region which did not come pre-defined in this package. In this case, you can manually set the extents of the class by simply accessing and editing the `region_name` and `extents` attributes. 

For example, if you wanted to look at Petermann glacier, and the pre-defined extents were not suitable for your needs, you could set the region and extents as follows:

In [8]:
GC.region_name = 'Petermann'
min_x = -306933
min_y = -1019405
max_x = -200129
max_y = -916708
GC.extents = [min_x, min_y, max_x, max_y] #note the order of the extents

Note that the current version of this package only supports extents given in polar stereographic coordinates (the default grids use EPSG:3413, as shown in the printed parameters above).
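Since the order of the extents matters, a quick sanity check before assigning them can catch swapped values. The sketch below is illustrative only: `validate_extents` and `grid_shape` are hypothetical helpers, not functions from the `changes` module, and the cell count is a rough estimate that may differ from how the package actually builds its grids (for example, if it includes both end points).

```python
def validate_extents(extents):
    """Check that extents are ordered [min_x, min_y, max_x, max_y].

    Hypothetical helper, not part of the changes module.
    """
    if len(extents) != 4:
        raise ValueError("extents must contain exactly four values")
    min_x, min_y, max_x, max_y = extents
    if min_x >= max_x or min_y >= max_y:
        raise ValueError("expected [min_x, min_y, max_x, max_y] with min < max")
    return [min_x, min_y, max_x, max_y]


def grid_shape(extents, posting):
    """Estimate (rows, cols) of a grid covering the extents at `posting` meters."""
    min_x, min_y, max_x, max_y = validate_extents(extents)
    cols = int((max_x - min_x) // posting)
    rows = int((max_y - min_y) // posting)
    return rows, cols


# Petermann extents from the cell above, in polar stereographic meters
petermann = [-306933, -1019405, -200129, -916708]
print(grid_shape(petermann, 50))   # default elevation grid posting
print(grid_shape(petermann, 300))  # default velocity grid posting
```

Estimating the grid shape this way also gives a feel for how the posting affects output size: at 50 m the Petermann domain is roughly six times larger along each axis than at 300 m.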

Using the `print_initiation_parameters` method again shows the changes to the class:

In [9]:
GC.print_initiation_parameters()

Region Parameters:
    region_initiated:  True
    region_name:  Petermann
    extents:  [-306933, -1019405, -200129, -916708]
 
Velocity Parameters:
    compile_velocity:  True
    velocity_grid_posting:  300
    velocity_grid_epsg:  3413
    create_velocity_stacks:  True
    Velocity Sources:
        compile_golive_data: True
        compile_tsx_data: True
 
Elevation Parameters:
    compile_elevation:  True
    elevation_grid_posting:  50
    elevation_grid_epsg:  3413
    create_elevation_stacks:  True
    Elevation Sources:
        compile_arcticDEM_data: True
        compile_gimp_data: True
        compile_glistin_data: True
        compile_icesat2_data: True
        compile_oib_data: True


## An Example: Elevation data from ArcticDEM

For this example, we will demonstrate how to use the `changes` module to obtain [ArcticDEM data from the Polar Geospatial Center](https://www.pgc.umn.edu/data/arcticdem/). Here, we will use Kangerlussuaq glacier as a test example. As described in the steps above, we will initiate the class and define the region:

In [19]:
GC = gc.GreenlandCHANGES(project_folder,data_folder)
GC.region_name = 'Kangerlussuaq'
min_x = 474475.0
min_y = -2314325.0
max_x = 514675.0
max_y = -2273975.0
GC.extents = [min_x, min_y, max_x, max_y]

As we saw in the examples above, all velocity and elevation sources will be run by default. To run the routine for the ArcticDEM data only, we turn off all other sources as follows:

In [11]:
GC.deactivate_all_sources()
GC.compile_elevation = True
GC.compile_arcticDEM_data = True

We can check that the run parameters are stored as expected by calling the `print_initiation_parameters` method:

In [12]:
GC.print_initiation_parameters()

Region Parameters:
    region_initiated:  True
    region_name:  Kangerlussuaq
    extents:  [474475.0, -2314325.0, 514675.0, -2273975.0]
 
Velocity Parameters:
    compile_velocity:  False
 
Elevation Parameters:
    compile_elevation:  True
    elevation_grid_posting:  50
    elevation_grid_epsg:  3413
    create_elevation_stacks:  True
    Elevation Sources:
        compile_arcticDEM_data: True
        compile_gimp_data: False
        compile_glistin_data: False
        compile_icesat2_data: False
        compile_oib_data: False


Each data source contains additional parameters, which can be viewed with the `print_[source]_parameters` commands:

In [13]:
GC.print_arcticDEM_parameters()

ArcticDEM Parameters:
    compile_arcticDEM_data:  True
    download_new_arcticDEM_data:  True
    keep_high_resolution_arcticDEM_data:  False
    max_number_of_arcticDEM_files:  all


Here, we see a few more details:
1. First, the routine will download new arcticDEM data, but the high resolution data will not be saved. Instead, the high resolution data will be downloaded, down-sampled to the posting of the elevation grid, and then the high resolution data will be deleted. This option is the default because the ArcticDEM files are quite large and take up excessive disk space.
2. The routine will be run for all available data that overlaps the domain - this option can be changed for testing or for examples.

In this example, we will set the maximum number of ArcticDEM files to be 5 - a small subset of the total files for the purposes of illustration in this tutorial.

In [14]:
GC.max_number_of_arcticDEM_files = 5

Now, we are ready to run the module! Note that this will download 5 files for a total of approximately 2 GB (but much of this data will be deleted when the routine is complete). This may take several minutes, depending on your internet and processing speed.

In [15]:
GC.execute_velocity_and_elevation_compilations()

Creating elevation compilation for Kangerlussuaq
    Running compilation for the ArcticDEM (Worldview) data
        Finding a list of ArcticDEM files which overlap the domain
            Searching through shapefile provided by PGC to find overlapping files
                (This may take a minute or two)
            Found 5 files
        Downloading files and down-sampling (if not already available)
            Checking file SETSM_W1W1_20100417_102001000B398600_102001000DD59500_seg1_2m_v3.0 (n68w034, 1 of 5)
              Downloading file...
              Downsampling file... 
                Untaring the data
                Working on the down sample
                  Reading in the file
                  Finding the nearest neighbors
            Checking file SETSM_W1W1_20110413_10200100127D6D00_10200100136A9F00_seg1_2m_v3.0 (n68w033, 2 of 5)
              Downloading file...
              Downsampling file... 
                Untaring the data
                Working on the down sam

### Examining the outputs

As the elevation data is compiled, a lot of metadata is printed out to inform you of the progress. When the routine is complete, let's take a look at the outputs:

### project_folder

First, start with your `project_folder`. We can see that a directory named `Kangerlussuaq` was created in your `project_folder` with the following contents:

| File | Description |
|------|-------------|
|Kangerlussuaq CHANGES Process Metadata.txt|A concise output of the metadata printed from the routine|
|Elevation\Metadata\Kangerlussuaq ArcticDEM Files.csv|A list of all files which overlap the region of interest defined by the extents|
|Elevation\Data\Kangerlussuaq ArcticDEM Elevation Grids.nc|A netCDF4 file which contains the elevation grids sampled onto the same grid|

The final file output is the heart of this package - the homogenized elevation grids.

By examining the file in a netCDF4 viewer (such as [Panoply](https://www.giss.nasa.gov/tools/panoply/), recommended), we can see there are 6 output variables - one each for the `x` and `y` dimensions, and one for each of the 4 data grids. Each data grid is stored by its date (YYYYMMDD) and contains metadata for the files used to generate the grid.

But wait! Didn't we specify 5 files? Yes! But two of the files represent elevation measurements on the same day, so they were stitched together into one grid. Unfortunately, the first 3 dates in this output (20100417, 20100413, and 20100421) do not contain any data on the glacier itself - they just overlap the domain in some way. The final layer (20100425) contains data on the actual glacier terminus. By running this routine for all files available (as listed in `Kangerlussuaq ArcticDEM Files.csv`), we can identify all available ArcticDEM data for Kangerlussuaq glacier.
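To see why 5 files can yield only 4 grids, it may help to group file names by date. The sketch below is not the package's own logic; `group_by_date` is a hypothetical helper, and it assumes the eight digits following the sensor code in each ArcticDEM strip name (as in the file names printed in the log above) are the acquisition date. The third file name in the example is made up purely to illustrate two files sharing a date.

```python
import re
from collections import defaultdict

# Assumed naming pattern: SETSM_<sensor code>_<YYYYMMDD>_...
DATE_PATTERN = re.compile(r"^SETSM_[A-Z0-9]+_(\d{8})_")


def group_by_date(file_names):
    """Group strip file names by their YYYYMMDD date string (hypothetical helper)."""
    groups = defaultdict(list)
    for name in file_names:
        match = DATE_PATTERN.match(name)
        if match is None:
            raise ValueError(f"could not find a date in {name!r}")
        groups[match.group(1)].append(name)
    return dict(groups)


files = [
    # The first two names are taken from the log output above
    "SETSM_W1W1_20100417_102001000B398600_102001000DD59500_seg1_2m_v3.0",
    "SETSM_W1W1_20110413_10200100127D6D00_10200100136A9F00_seg1_2m_v3.0",
    # Hypothetical name, invented here to show a same-day pair
    "SETSM_W1W1_20100417_EXAMPLEID1_EXAMPLEID2_seg2_2m_v3.0",
]

for date, names in sorted(group_by_date(files).items()):
    print(date, len(names))
```

Each key of the returned dictionary would correspond to one date-named layer, and any key with more than one file marks a date where several strips would need to be combined.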

### data_folder

Now, take a look at your `data_folder`. Here you will find a new `Elevation` directory, which contains a new `ArcticDEM` directory, which itself contains 3 new subdirectories as follows:

#### 2m_tiles

The `2m_tiles` directory will have 2 subdirectories - `n68w033` and `n68w034`. However, these directories are empty! They were used to store the initial data downloaded from PGC, but the files were deleted after processing to save room on the drive. If you would like to keep the raw files, you can set the `keep_high_resolution_arcticDEM_data` attribute of the `GC` object to `True` prior to running the routine.

#### Metadata

This directory now contains a shapefile from PGC which outlines the extent of all high resolution ArcticDEM data files available.


#### Regridded_50m_tiles

In this directory, we have two subdirectories which mirror those in the `2m_tiles` directory: `n68w033` and `n68w034`. These contain the regridded files, sampled at 50 m - the posting specified for the output grid. These grids total just over 10 MB in size, compared to the original data size of 2 GB.