# Standardize the format of the global datasets

We have data in several formats (netCDF, CSV, GDF), which can be hard to deal with. Most of our global data are grids, so they could be distributed as netCDF files to make the students' lives easier.

In this notebook, we'll load these datasets, clean them up and combine them (when necessary), and then export them to netCDF.

In [1]:
from pathlib import Path
import lzma
import numpy as np
import xarray as xr
import harmonica as hm
import boule as bl

In [2]:
data_dir = Path("../data")

## Global gravity and topography

We'll start with the gravity, topography, and geoid data from ICGEM. After loading each file, copy the `attrs` of the `Dataset` to the `DataArray` so that the metadata doesn't get lost when merging all of the datasets.

In [3]:
with lzma.open(data_dir / "EIGEN-6C4-geoid.gdf.xz", mode="rt") as file:
    geoid = hm.load_icgem_gdf(file, dtype="float32")
geoid.geoid.attrs = geoid.attrs
geoid
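
To illustrate why the attributes are copied by hand, here is a minimal sketch using a small synthetic `Dataset`. The variable name `demo`, the grid values, and the metadata entries are all made up for illustration and are not taken from the ICGEM files. It shows that metadata set on a `Dataset` is not automatically present on the variables it contains.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for a loaded grid; values and metadata are invented.
demo = xr.Dataset(
    {"geoid": (("latitude", "longitude"), np.zeros((2, 3), dtype="float32"))},
    coords={"latitude": [-10.0, 10.0], "longitude": [0.0, 120.0, 240.0]},
    attrs={"modelname": "example-model", "unit": "meter"},
)

# Dataset-level metadata does not propagate to the variables inside it.
attrs_before = dict(demo.geoid.attrs)

# Copy the metadata onto the variable, mirroring the cell above.
demo.geoid.attrs = demo.attrs

# The metadata now travels with the variable when it is extracted and merged.
merged = xr.merge([demo.geoid.to_dataset(name="geoid_height")])
attrs_after = dict(merged.geoid_height.attrs)
```

Here `attrs_before` is empty, while `attrs_after` carries the copied metadata.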

In [4]:
with lzma.open(data_dir / "EIGEN-6C4-gravity.gdf.xz", mode="rt") as file:
    gravity = hm.load_icgem_gdf(file, dtype="float32").rename(
        {"h_over_geoid": "height", "gravity_earth": "gravity"}
    )
gravity.gravity.attrs = gravity.attrs

In [5]:
with lzma.open(data_dir / "etopo1.gdf.xz", mode="rt") as file:
    topography = hm.load_icgem_gdf(file, dtype="float32").rename(
        {"topography_grd": "topography"}
    )
topography.topography.attrs = topography.attrs

In [6]:
with lzma.open(data_dir / "etopo1-filtered.gdf.xz", mode="rt") as file:
    topography_smooth = hm.load_icgem_gdf(file, dtype="float32").rename(
        {"topography_shm": "topography"}
    )
topography_smooth.topography.attrs = topography_smooth.attrs

Convert all heights to geometric (ellipsoidal) heights by adding the geoid height.

In [7]:
gravity["height"] += geoid.geoid
topography["topography"] += geoid.geoid
topography_smooth["topography"] += geoid.geoid
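
The conversion relies on the relation h = H + N, where H is the height above the geoid, N is the height of the geoid above the ellipsoid, and h is the geometric height. A minimal sketch with made-up numbers (not values from the grids above):

```python
import numpy as np

# Made-up sample values in meters, for illustration only.
orthometric_height = np.array([0.0, 1500.0, -3000.0])  # H, relative to the geoid
geoid_height = np.array([-30.0, 45.0, 10.0])  # N, geoid relative to the ellipsoid

# Geometric (ellipsoidal) height: h = H + N
geometric_height = orthometric_height + geoid_height
```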

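Calculate the gravity disturbance by subtracting the normal gravity of the WGS84 ellipsoid, evaluated at the observation height, from the observed gravity.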

In [8]:
disturbance = gravity.gravity - bl.WGS84.normal_gravity(gravity.latitude, gravity.height)
disturbance.attrs = gravity.attrs
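
For intuition about what is being subtracted, the sketch below evaluates normal gravity with Somigliana's closed-form formula. This is a simplification: it is only valid on the surface of the ellipsoid (height zero), whereas the cell above evaluates normal gravity at the observation height. The function name `somigliana_normal_gravity` is hypothetical, and the constants are the commonly published WGS84 values.

```python
import math

# WGS84 constants: semi-axes in meters, gravity in m/s².
SEMIMAJOR_AXIS = 6378137.0
SEMIMINOR_AXIS = 6356752.3142
GRAVITY_EQUATOR = 9.7803253359
GRAVITY_POLE = 9.8321849378


def somigliana_normal_gravity(latitude):
    """
    Normal gravity on the ellipsoid surface (height 0) in mGal.

    The latitude is geodetic and given in degrees. This hypothetical helper
    does not account for height above the ellipsoid.
    """
    lat = math.radians(latitude)
    cos2 = math.cos(lat) ** 2
    sin2 = math.sin(lat) ** 2
    numerator = (
        SEMIMAJOR_AXIS * GRAVITY_EQUATOR * cos2
        + SEMIMINOR_AXIS * GRAVITY_POLE * sin2
    )
    denominator = math.sqrt(
        SEMIMAJOR_AXIS**2 * cos2 + SEMIMINOR_AXIS**2 * sin2
    )
    # Convert from m/s² to mGal
    return 1e5 * numerator / denominator


gamma_equator = somigliana_normal_gravity(0)
gamma_mid = somigliana_normal_gravity(45)
gamma_pole = somigliana_normal_gravity(90)
```

Normal gravity increases from the equator to the poles, so the disturbance removes this dominant latitude-dependent signal from the observations.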

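Then remove the gravitational effect of the topography with a simple Bouguer correction to obtain the Bouguer disturbance.

In [9]: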
bouguer = disturbance - hm.bouguer_correction(topography.topography)
bouguer.attrs = gravity.attrs
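
As a rough illustration of the correction used above, the following sketch implements the simple Bouguer plate approximation, 2πGρ|h|. It assumes the conventional densities of 2670 kg/m³ for the crust and 1040 kg/m³ for ocean water. The function name `bouguer_plate` is hypothetical and the library implementation may differ in its details.

```python
import numpy as np

GRAVITATIONAL_CONSTANT = 6.6743e-11  # m³ kg⁻¹ s⁻²


def bouguer_plate(topography, density_crust=2670.0, density_water=1040.0):
    """
    Gravitational effect of an infinite plate of topography, in mGal.

    Heights are in meters. Above zero, the plate has crustal density. Below
    zero (oceans), the plate has the density contrast between water and crust.
    """
    topography = np.asarray(topography, dtype="float64")
    density = np.where(
        topography >= 0, density_crust, density_water - density_crust
    )
    # The factor of 1e5 converts from m/s² to mGal
    return (
        1e5 * 2 * np.pi * GRAVITATIONAL_CONSTANT * density * np.abs(topography)
    )


# Made-up heights: a 1000 m mountain, sea level, and a 1000 m deep ocean
correction = bouguer_plate([1000.0, 0.0, -1000.0])
```

The correction is positive over continents (mass excess) and negative over oceans (mass deficit), so subtracting it lowers the disturbance over mountains and raises it over ocean basins.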

Merge all of the grids into one so we can save a single file.

In [10]:
grid = xr.merge([
    disturbance.to_dataset(name="gravity_disturbance"),
    bouguer.to_dataset(name="gravity_bouguer"),
    topography_smooth.topography.to_dataset(name="topography_smoothed"),
    topography.topography.to_dataset(name="topography"),
]).assign_coords(height=gravity.height)
grid

Export to netCDF using the `netcdf4` engine, which results in smaller files.

In [11]:
grid.to_netcdf(
    data_dir / "global-gravity.nc",
    format="NETCDF4",
    engine="netcdf4",
)