Open-access database of englacial temperature measurements compiled from published literature and submissions.
The dataset adheres to the Frictionless Data Tabular Data Package specification.
The metadata in `datapackage.yaml` describes, in detail, the contents of the tabular data files:

- `data/source.csv`: Description of each data source (either a direct contribution or a reference to a published study).
- `data/borehole.csv`: Description of each borehole (location, elevation, etc.), linked to `source.csv` via `source_id` and less formally via source identifiers in `notes`.
- `data/profile.csv`: Description of each profile (date and time), linked to `borehole.csv` via `borehole_id` and to `source.csv` via `source_id`.
- `data/measurement.csv`: Description of each measurement (depth and temperature), linked to `profile.csv` via `borehole_id` and `profile_id`.
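The four tables can be joined on the documented keys. A minimal sketch with `pandas`, using tiny in-memory stand-ins for the CSV files (column names such as `id` in `profile.csv` are assumptions inferred from the key descriptions above):

```python
import pandas as pd

# Tiny stand-ins for data/borehole.csv, data/profile.csv, and
# data/measurement.csv (column names are assumptions based on the keys above).
borehole = pd.DataFrame({"id": [1], "source_id": [1], "glacier_name": ["Example Glacier"]})
profile = pd.DataFrame({"id": [1], "borehole_id": [1], "source_id": [1]})
measurement = pd.DataFrame({
    "borehole_id": [1, 1],
    "profile_id": [1, 1],
    "depth": [5.0, 10.0],
    "temperature": [-0.2, -0.5],
})

# measurement -> profile via (borehole_id, profile_id);
# profile -> borehole via borehole_id.
full = (
    measurement
    .merge(profile, left_on=["borehole_id", "profile_id"], right_on=["borehole_id", "id"])
    .merge(borehole, left_on="borehole_id", right_on="id", suffixes=("", "_borehole"))
)
print(full[["glacier_name", "depth", "temperature"]])
```

The same joins apply unchanged when the frames are loaded with `pd.read_csv` from the real files.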
For boreholes with many profiles (e.g. from automated loggers), pairs of `profile.csv` and `measurement.csv` files are stored separately in subfolders of `data` named `{source.id}-{glacier}`, where `glacier` is a simplified and kebab-cased version of the glacier name (e.g. `flowers2022-little-kluane`).
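Kebab-casing a glacier name can be sketched as follows (illustrative only; the project's actual simplification rules may differ, e.g. for accented characters):

```python
import re

def kebab_case(name: str) -> str:
    """Simplify a glacier name for folder naming (illustrative sketch)."""
    # Lowercase, replace runs of non-alphanumeric characters with a single
    # hyphen, and trim hyphens from the ends.
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

print(kebab_case("Little Kluane"))  # little-kluane
```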
The `sources` folder contains subfolders (with names matching column `source.id`) with files that document how and from where the data was extracted. Files with a `.png`, `.jpg`, or `.pdf` extension are figures, tables, maps, or text from the publication. Pairs of files with `.pgw` and `.{png|jpg}.aux.xml` extensions georeference a `.{png|jpg}` image, and files with a `.geojson` extension are the subsequently-extracted spatial coordinates. Files with an `.xml` extension document how numeric values were extracted from maps and figures using [Plot Digitizer](https://plotdigitizer.sourceforge.net). Of these, digitized temperature profiles are named `{borehole.id}_{profile.id}{suffix}`, where `borehole.id` and `profile.id` are either a single value or a hyphenated range (e.g. `1-8`). Those without the optional `suffix` use `temperature` and `depth` as axis names. Those with a `suffix` are unusual cases which, for example, may be part of a series (e.g. `_lower`) or use a non-standard axis (e.g. `_date`).
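The filename convention can be matched with a small regular expression. A hypothetical parser (assuming lowercase suffixes and the `{borehole.id}_{profile.id}{suffix}` pattern described above):

```python
import re

# Hypothetical parser for digitized-profile filenames such as
# "7_1-8.xml" or "7_2_lower.xml".
FILENAME = re.compile(
    r"^(?P<borehole>\d+(?:-\d+)?)"  # single id or hyphenated range
    r"_(?P<profile>\d+(?:-\d+)?)"   # single id or hyphenated range
    r"(?P<suffix>_[a-z]+)?"         # optional suffix, e.g. _lower, _date
    r"\.xml$"
)

match = FILENAME.match("7_1-8.xml")
print(match.group("borehole"), match.group("profile"), match.group("suffix"))
```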
To contribute data, send an email to jacquemart@vaw.baug.ethz.ch. Please structure your data as either comma-separated values (CSV) files (`borehole.csv` and `measurement.csv`) or as an Excel file (with sheets `borehole` and `measurement`). The required and optional columns for each table are described below and in the submission metadata: `contribute/datapackage.yaml`. Consider using our handy Excel template: `contribute/template.xlsx`!

Note: We also welcome submissions of data that have already been digitized, as they allow us to assess the accuracy of the digitization process.
**borehole**

| name | description | type | constraints |
| --- | --- | --- | --- |
| `id` | Unique identifier. | integer | required: True, unique: True, minimum: 1 |
| `glacier_name` | Glacier or ice cap name (as reported). | string | required: True, pattern: `[^\s]+( [^\s]+)*` |
| `glims_id` | Global Land Ice Measurements from Space (GLIMS) glacier identifier. | string | pattern: `G[0-9]{6}E[0-9]{5}[NS]` |
| `latitude` | Latitude (EPSG 4326). | number [degree] | required: True, minimum: -90, maximum: 90 |
| `longitude` | Longitude (EPSG 4326). | number [degree] | required: True, minimum: -180, maximum: 180 |
| `elevation` | Elevation above sea level. | number [m] | required: True, maximum: 9999.0 |
| `label` | Borehole name (e.g. as labeled on a plot). | string | |
| `date_min` | Begin date of drilling, or if not known precisely, the first possible date (e.g. 2019 → 2019-01-01). | date | format: `%Y-%m-%d` |
| `date_max` | End date of drilling, or if not known precisely, the last possible date (e.g. 2019 → 2019-12-31). | date | format: `%Y-%m-%d` |
| `drill_method` | Drilling method: mechanical (push, percussion, rotary, ...), thermal (hot point, electrothermal, steam, ...), or combined (mechanical and thermal). | string | enum: `['mechanical', 'thermal', 'combined']` |
| `ice_depth` | Starting depth of ice. Infinity (`INF`) indicates that ice was not reached. | number [m] | |
| `depth` | Total borehole depth (not including drilling into the underlying bed). | number [m] | |
| `to_bed` | Whether the borehole reached the glacier bed. | boolean | |
| `temperature_accuracy` | Thermistor accuracy or precision (as reported). Typically understood to represent one standard deviation. | number [°C] | |
| `notes` | Additional remarks about the study site, the borehole, or the measurements therein. Literature references should be formatted as `{url}` or `{author} {year} ({url})`. | string | pattern: `[^\s]+( [^\s]+)*` |
**measurement**

| name | description | type | constraints |
| --- | --- | --- | --- |
| `borehole_id` | Borehole identifier. | integer | required: True |
| `depth` | Depth below the glacier surface. | number [m] | required: True |
| `temperature` | Temperature. | number [°C] | required: True |
| `date_min` | Measurement date, or if not known precisely, the first possible date (e.g. 2019 → 2019-01-01). | date | format: `%Y-%m-%d` |
| `date_max` | Measurement date, or if not known precisely, the last possible date (e.g. 2019 → 2019-12-31). | date | format: `%Y-%m-%d`, required: True |
| `time` | Measurement time. | time | format: `%H:%M:%S` |
| `utc_offset` | Time offset relative to Coordinated Universal Time (UTC). | number [h] | |
| `equilibrated` | Whether temperatures have equilibrated following drilling. | boolean | |
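As a worked example, a minimal submission satisfying the required columns above could look like the following (all values are invented for illustration). `borehole.csv`:

```csv
id,glacier_name,latitude,longitude,elevation,date_min,date_max
1,Example Glacier,46.50,8.00,3100,2019-07-01,2019-07-10
```

`measurement.csv`:

```csv
borehole_id,depth,temperature,date_max
1,10.0,-0.50,2019-07-10
1,20.0,-0.35,2019-07-10
```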
You can validate your CSV files (`borehole.csv` and `measurement.csv`) before submitting them using the `frictionless` Python package.

1. Clone this repository.

   ```sh
   git clone https://github.com/mjacqu/glenglat.git
   cd glenglat
   ```

2. Either install the `glenglat-contribute` Python environment (with `conda`):

   ```sh
   conda env create --file contribute/environment.yaml
   conda activate glenglat
   ```

   or install `frictionless` into an existing environment (with `pip`):

   ```sh
   pip install "frictionless~=5.13"
   ```

3. Validate, fix any reported issues, and rejoice! (`path/to/csvs` is the folder containing your CSV files.)

   ```sh
   python contribute/validate_submission.py path/to/csvs
   ```
Follow the instructions below to run a full test of the data package.

1. Clone this repository.

   ```sh
   git clone https://github.com/mjacqu/glenglat.git
   cd glenglat
   ```

2. Install the `glenglat` Python environment (with `conda`):

   ```sh
   conda env create --file tests/environment.yaml
   conda activate glenglat
   ```

3. Run the basic (`frictionless`) tests.

   ```sh
   frictionless validate datapackage.yaml
   ```

4. Run the custom (`pytest`) tests.

   ```sh
   pytest tests
   ```

5. An optional test checks that `borehole.glims_id` is consistent with borehole coordinates. This requires a GeoParquet file of glacier outlines from the GLIMS dataset with columns `geometry` (glacier outline) and `glac_id` (glacier id). To run, first install `geopandas` and `pyarrow`, then set the `GLIMS_PATH` environment variable before calling `pytest`.

   ```sh
   conda install -c conda-forge geopandas=0.13 pyarrow
   GLIMS_PATH=/path/to/parquet pytest tests
   ```
The `scripts` directory contains Python scripts that update certain files:

- `build_zenodo_json.py`: Builds the `.zenodo.json` file (for Zenodo releases) from `datapackage.yaml` and `data/source.csv`.
- `build_submission_yaml.py`: Builds `contribute/datapackage.yaml` from `datapackage.yaml`.
- `build_submission_md.py`: Updates tables in `README.md` from `contribute/datapackage.yaml`.
- `build_submission_xlsx.py`: Builds `contribute/template.xlsx` from `contribute/datapackage.yaml`.

Assuming the `glenglat` Python environment is installed and activated (see above), they can be run as follows:

```sh
python scripts/build_zenodo_json.py
python scripts/build_submission_yaml.py
python scripts/build_submission_md.py
python scripts/build_submission_xlsx.py
```