Friedrich Knuth edited this page Sep 13, 2018 · 10 revisions

GlacierHack 2018

Exploring the data science landscape to perform time series analysis on DEMs (digital elevation models).

Team

Team Lead
Shashank Bhushan - Glaciology and Geospatial Image Analysis

Team Members
Elad Dente - Hydro-Geomorphology
Håvard Holm - Applied Mathematics
Daniel Howard - Applied Mathematics
Michelle Hu - Hydrology - Snow Water Runoff
Lynn Kaak - Applied Mathematics
Joachim Meyer - Computer Science and Software Development
Wei Wei - Geophysics - Ice Sheets and Ocean Interactions

Data Science Lead
Friedrich Knuth - Data Science Methods and Geospatial Image Analysis

Dataset

Khumbu Time Series

Data Science Questions (What?)

  • Can we quantify inter-annual changes in digital elevation models that represent glacial mass balance?
  • Can we improve upon time series analysis methods capturing changes in digital elevation models (DEMs)?
  • What can we learn from image analysis and statistical methods (machine learning) applied to this four-dimensional array?
  • How do our solutions perform at scale? Can we leverage the xarray stack and the processing power of Pangeo? Pangeo is a Kubernetes-powered JupyterHub configuration that enables distributed data processing and analysis through Dask and xarray.
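
As a starting point for the first two questions, a per-pixel linear trend can be fitted through a stack of co-registered DEMs. The following is only a minimal sketch with NumPy, not the team's method: the function name `per_pixel_trend`, the (time, y, x) array layout, and the use of decimal years are all assumptions made for illustration, and the input is synthetic.

```python
import numpy as np


def per_pixel_trend(stack, times):
    """Least-squares elevation trend for every pixel of a DEM stack.

    stack: array of shape (n_times, ny, nx); NaN marks missing data.
    times: array of shape (n_times,), e.g. decimal years (assumed here).
    Returns an (ny, nx) array of slopes; NaN where fewer than two
    valid observations exist.
    """
    stack = np.asarray(stack, dtype=float)
    times = np.asarray(times, dtype=float)
    valid = np.isfinite(stack)
    n = valid.sum(axis=0)
    t = times[:, None, None]

    # Per-pixel means computed over the valid observations only.
    t_mean = np.where(valid, t, 0.0).sum(axis=0) / np.maximum(n, 1)
    z_mean = np.where(valid, stack, 0.0).sum(axis=0) / np.maximum(n, 1)

    # Deviations from the means; missing observations contribute zero.
    dt = np.where(valid, t - t_mean, 0.0)
    dz = np.where(valid, stack - z_mean, 0.0)

    num = (dt * dz).sum(axis=0)
    den = (dt ** 2).sum(axis=0)

    slope = np.full(n.shape, np.nan)
    ok = (n >= 2) & (den > 0)
    slope[ok] = num[ok] / den[ok]
    return slope


# Synthetic example: a 2x2 "glacier" thinning by 1.5 units per year.
times = np.array([2014.0, 2015.5, 2016.0, 2018.0])
stack = 5000.0 - 1.5 * (times[:, None, None] - times[0]) * np.ones((1, 2, 2))
stack[1, 0, 0] = np.nan   # one missing observation
stack[1:, 1, 1] = np.nan  # pixel with a single valid observation
trend = per_pixel_trend(stack, times)
```

A plain linear fit like this is unlikely to be robust for surging glaciers or noisy DEMs, which is where the more advanced time series methods raised in the questions above would come in.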

Relevance (So What?)

  • Predict the fate of glaciers and the impact on water resource management. How much water is being released from / added to the system?
  • Explore if methods developed for this dataset can be applied to other glacier systems, such as glaciers that experience periodic surges. Is the trend fitting robust to systems that experience high variability?
  • Learn new data science methods.

Objectives (Now What?)

Day 1)

  • Form a team (done!)
  • Identify the dataset (done!)
  • Identify the data science questions (done but also in progress... you know)

Day 2)
What can we realistically accomplish during our time together?

Day 3) Episodes and Leads:

  • Run rainier_dem_example workflow on Khumbu dataset. Leads: Lynn, Håvard
  • Read DEMs into xarray and compare performance to pygeotools operations. Leads: Joe, Shashank
  • Visualizing and extracting elevation profiles from DEMs. Leads: Michelle, Elad
  • Read in DEM data straight from Google Drive. Leads: Friedrich, Daniel
  • Velocity maps using vmap ASP. Leads: Mei
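
For the elevation profile episode, one simple approach is to sample the DEM along a straight line between two pixel locations. The sketch below uses nearest-neighbour sampling with NumPy only; the function name `elevation_profile`, the (row, col) convention, and the `pixel_size` parameter are assumptions for illustration and do not come from pygeotools or any other library mentioned here.

```python
import numpy as np


def elevation_profile(dem, start, end, n_samples=100, pixel_size=1.0):
    """Sample a DEM along a straight line (nearest-neighbour).

    dem: 2D array of elevations.
    start, end: (row, col) pixel coordinates of the profile end points.
    pixel_size: ground distance covered by one pixel (assumed square).
    Returns (distance, elevation), both of length n_samples.
    """
    dem = np.asarray(dem, dtype=float)
    rows = np.linspace(start[0], end[0], n_samples)
    cols = np.linspace(start[1], end[1], n_samples)

    # Round to the nearest pixel and keep indices inside the array.
    r = np.clip(np.rint(rows).astype(int), 0, dem.shape[0] - 1)
    c = np.clip(np.rint(cols).astype(int), 0, dem.shape[1] - 1)

    elevation = dem[r, c]
    distance = np.hypot(rows - rows[0], cols - cols[0]) * pixel_size
    return distance, elevation


# Synthetic 5x5 DEM, profile along the main diagonal.
dem = np.arange(25, dtype=float).reshape(5, 5)
distance, elevation = elevation_profile(dem, (0, 0), (4, 4),
                                        n_samples=5, pixel_size=2.0)
```

The returned pair can be passed straight to `matplotlib.pyplot.plot(distance, elevation)`. For real data the end points would first need converting from map coordinates to pixel indices using the raster's geotransform.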

Each episode will be delivered as a notebook presentation in the main folder, then compiled into a single presentation.

Product Ideas (add libraries in parentheses)

  • Simple notebook to open and visualize co-registered DEMs (pygeotools, rasterio, matplotlib)
  • Interactive IPython widget - 2D plot - altitude vs. time with lat and lon sliders (ipywidgets)
  • Interactive IPython widget - raster images - rendering elevation as color relative to the mean (ipywidgets)
  • Getting a stack of GeoTIFF raster DEMs into an nD xarray object, then performing some basic manipulations of the elevation time series in xarray (xarray)
  • Getting a stack of GeoTIFF raster DEMs into a Dask array and performing distributed computing on pangeo.pydata.org (xarray and dask)
  • Exploring our dataset and products in Google Earth Engine
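
To make the xarray ideas above more concrete, here is a rough sketch of stacking 2D DEM arrays into a (time, y, x) `DataArray`, computing elevation relative to the temporal mean, and pulling out the time series at one location. The helper names `build_dem_stack` and `elevation_anomaly` are hypothetical, and the small synthetic arrays stand in for bands that would in practice be read from co-registered GeoTIFFs.

```python
import numpy as np
import xarray as xr


def build_dem_stack(arrays, times, y, x):
    """Combine 2D elevation arrays into one (time, y, x) DataArray."""
    data = np.stack([np.asarray(a, dtype=float) for a in arrays])
    return xr.DataArray(
        data,
        dims=("time", "y", "x"),
        coords={
            "time": np.asarray(times, dtype="datetime64[ns]"),
            "y": y,
            "x": x,
        },
        name="elevation",
    )


def elevation_anomaly(stack):
    """Elevation relative to the per-pixel mean over time."""
    return stack - stack.mean(dim="time")


# Synthetic stand-in for three co-registered DEMs on a 3x2 grid.
y = np.array([3.0, 2.0, 1.0])
x = np.array([10.0, 20.0])
base = np.full((3, 2), 5000.0)
arrays = [base, base - 2.0, base - 4.0]
times = ["2015-10-01", "2016-10-01", "2017-10-01"]

stack = build_dem_stack(arrays, times, y, x)
anomaly = elevation_anomaly(stack)

# Elevation time series at a single grid location.
series = stack.sel(y=2.0, x=20.0)
```

The anomaly array is the kind of quantity the "elevation as color relative to the mean" widget would render, and `series` is what an altitude vs. time plot would show for a chosen location. If Dask is installed, calling `stack.chunk({"time": 1})` before these operations should make them lazy, which is the usual first step toward distributed processing on a Pangeo deployment.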

Roles

GitHub Integrator: Joachim