# SNODAS Data Access

This script is designed to access and process data from the Snow Data Assimilation (SNODAS) system. 

Data is accessed through NSIDC. Because SNODAS is not available through the cloud, we download the archive files over HTTPS and process them locally.
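As a rough illustration of the HTTPS approach, the sketch below builds a monthly archive URL and filters a directory listing down to its `.tar` links. The helper names (`month_archive_url`, `tar_links`) are hypothetical, and the `MM_Mon` directory pattern is inferred from the archive URL used in this notebook. It relies only on the standard library and makes no network request itself.

```python
from html.parser import HTMLParser

ARCHIVE_ROOT = "https://noaadata.apps.nsidc.org/NOAA/G02158/masked"
MONTH_ABBR = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def month_archive_url(year, month):
    """Build the listing URL for one month (assumed pattern: YYYY/MM_Mon/)."""
    return f"{ARCHIVE_ROOT}/{year}/{month:02d}_{MONTH_ABBR[month - 1]}/"


class _LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def tar_links(listing_html):
    """Return only the links in a directory listing that end in .tar."""
    parser = _LinkCollector()
    parser.feed(listing_html)
    return [h for h in parser.hrefs if h.endswith(".tar")]
```

Filtering on the `.tar` suffix avoids depending on the position of the parent-directory link in the listing.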

This script is partially adapted from code written by Aakash Ahamed (https://github.com/kashingtonDC/SNODAS).

In [1]:
import requests
from bs4 import BeautifulSoup

import os
import glob
import time
import gzip
import shutil

import datetime
import subprocess 

from tqdm import tqdm

In [2]:
# Year/month setup for SNODAS archive (monthly directories are named MM_Mon, e.g. 03_Mar)
year = "2023"
month = "Mar" # 3-character abbreviation for month
month_num = "03" # 2-digit number for month

# Get URLs for SNODAS archive
archive_url = f'https://noaadata.apps.nsidc.org/NOAA/G02158/masked/{year}/{month_num}_{month}/'
r = requests.get(archive_url)
r.raise_for_status()
data = BeautifulSoup(r.text, "html.parser")

# Download each file linked from the archive listing
# (the first link is skipped; it is assumed to be the parent-directory entry)
out_dir = "/home/jovyan/shared-public/SnowPit/tmp/"
for link in data.find_all("a")[1:]:
    r = requests.get(archive_url + link['href'])
    with open(os.path.join(out_dir, link['href']), 'wb') as f:
        f.write(r.content)

Using the above cell, we are able to download all of the SNODAS data from March 2023. However, the data is provided as `.tar` archives, each of which contains gzip-compressed (`.gz`) files that must be unpacked in turn. The below function extracts the data we need to `.dat` and `.txt` formats.

Notice also the `snovars` input: the given input (`1034`) extracts the SNODAS snow water equivalent (SWE) variable from each tar file.
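To make the variable matching concrete, here is a small sketch that pulls the four-digit product code and the date out of a SNODAS file name. The function name `parse_snodas_name` and the regular expression are illustrative assumptions based on the file names that appear in this notebook, not an official naming specification.

```python
import re

# "ssmv", one digit, then the 4-digit product code; the date follows "TS".
_NAME_RE = re.compile(r"ssmv\d(\d{4}).*?TS(\d{8})")


def parse_snodas_name(filename):
    """Return (product_code, yyyymmdd) for a SNODAS file name, or None."""
    match = _NAME_RE.search(filename)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Example: keep only SWE files (product code 1034).
names = [
    "us_ssmv11034tS__T0001TTNATS2023030105HP001.dat.gz",
    "us_ssmv11036tS__T0001TTNATS2023030105HP001.dat.gz",
]
swe_names = [n for n in names if parse_snodas_name(n)[0] == "1034"]
```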

In [3]:
def process_tarfile(tarfile, writedir, snovars=['1034']):
    # Extract date from tarfile
    date = os.path.splitext(os.path.split(tarfile)[1])[0].replace("SNODAS_","")
    
    # Untar the files using OS commands
    cmd = '''tar -xvf {} -C {}'''.format(tarfile, writedir)
    os.system(cmd)

    # Find untarred .gz files
    gz_files = [os.path.join(writedir,x) for x in os.listdir(writedir) if date in x if x.endswith(".gz")]

    # Get variable strings from each file
    varstrs = [x[x.find("ssmv")+5:x.find("ssmv")+9] for x in gz_files]

    # Compare variable strings to wanted variables
    for varstr,file in zip(varstrs, gz_files):
        outfn = os.path.splitext(file)[0]
        if varstr in snovars:
            with gzip.open(file, 'r') as f_in, open(outfn, 'wb') as f_out:
                shutil.copyfileobj(f_in, f_out)
        else:
            continue

    datfiles = [os.path.join(writedir,x) for x in os.listdir(writedir) if date in x if x.endswith(".dat")]
    txtfiles = [os.path.join(writedir,x) for x in os.listdir(writedir) if date in x if x.endswith(".txt")]
    gz_files = [os.path.join(writedir,x) for x in os.listdir(writedir) if date in x if x.endswith(".gz")]

    return datfiles, txtfiles, gz_files
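An alternative to shelling out to `tar` is Python's built-in `tarfile` module, which can also skip unwanted variables before anything is written to disk. The sketch below is one possible approach, not the method used in this notebook; `extract_variables` is a hypothetical helper, and it assumes the same file-name layout as `process_tarfile`.

```python
import gzip
import os
import shutil
import tarfile


def extract_variables(tar_path, out_dir, codes=("1034",)):
    """Extract and decompress only the .gz members whose product code is in `codes`."""
    written = []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            name = os.path.basename(member.name)
            pos = name.find("ssmv")
            if not member.isfile() or pos == -1 or not name.endswith(".gz"):
                continue
            # Same slice as process_tarfile: 4 characters starting 5 past "ssmv".
            if name[pos + 5:pos + 9] not in codes:
                continue
            out_path = os.path.join(out_dir, name[:-3])  # drop the ".gz" suffix
            with tar.extractfile(member) as raw, \
                    gzip.open(raw, "rb") as f_in, \
                    open(out_path, "wb") as f_out:
                shutil.copyfileobj(f_in, f_out)
            written.append(out_path)
    return sorted(written)
```

Because only matching members are decompressed, the intermediate `.gz` files for the other variables never reach the output directory.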

In [8]:
tmp_dir = "/home/jovyan/shared-public/SnowPit/tmp/"
# Extract the SWE files from each downloaded .tar archive
for file in os.listdir(tmp_dir):
    if file.endswith(".tar"):
        datfiles,txtfiles,gz_files = process_tarfile(os.path.join(tmp_dir, file), tmp_dir)

us_ssmv11050lL00T0024TTNATS2023030105DP000.txt.gz
us_ssmv11050lL00T0024TTNATS2023030105DP000.dat.gz
us_ssmv11044bS__T0024TTNATS2023030105DP000.txt.gz
us_ssmv11044bS__T0024TTNATS2023030105DP000.dat.gz
us_ssmv11039lL00T0024TTNATS2023030105DP000.txt.gz
us_ssmv11039lL00T0024TTNATS2023030105DP000.dat.gz
us_ssmv11038wS__A0024TTNATS2023030105DP001.txt.gz
us_ssmv11038wS__A0024TTNATS2023030105DP001.dat.gz
us_ssmv11036tS__T0001TTNATS2023030105HP001.txt.gz
us_ssmv11036tS__T0001TTNATS2023030105HP001.dat.gz
us_ssmv11034tS__T0001TTNATS2023030105HP001.txt.gz
us_ssmv11034tS__T0001TTNATS2023030105HP001.dat.gz
us_ssmv01025SlL01T0024TTNATS2023030105DP001.txt.gz
us_ssmv01025SlL01T0024TTNATS2023030105DP001.dat.gz
us_ssmv01025SlL00T0024TTNATS2023030105DP001.txt.gz
us_ssmv01025SlL00T0024TTNATS2023030105DP001.dat.gz
us_ssmv11050lL00T0024TTNATS2023030205DP000.txt.gz
us_ssmv11050lL00T0024TTNATS2023030205DP000.dat.gz
us_ssmv11044bS__T0024TTNATS2023030205DP000.txt.gz
us_ssmv11044bS__T0024TTNATS2023030205DP000.dat

In order to interpret the `.dat` files in Python, we need to convert the text files to the ENVI header format. This requires us to define a few parameters for the header, which were obtained from the following website: https://nsidc.org/data/user-resources/help-center/how-do-i-convert-snodas-binary-files-geotiff-or-netcdf

In [4]:
# Dictionary of header parameters
hdr_parms = {
    "samples": 6935,
    "lines": 3351,
    "bands": 1,
    "data_type": 2,
    "interleave": "bsq",
    "byte_order": 1
}
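The header parameters above describe a single band of 16-bit signed integers (ENVI data type 2) stored big-endian (byte order 1) on a 6935 x 3351 grid. As a sanity check on what that means, the following sketch computes the expected `.dat` file size and reads such a file with the standard library only. Both function names are hypothetical, and for full-size grids a NumPy or GDAL reader would be far more practical than Python lists.

```python
import sys
from array import array


def expected_size_bytes(samples, lines, bytes_per_value=2):
    """Size of a single-band raster with 2-byte values and no header offset."""
    return samples * lines * bytes_per_value


def read_int16_big_endian(path, samples, lines):
    """Read a flat big-endian int16 raster into a list of rows."""
    values = array("h")  # signed 16-bit integers
    with open(path, "rb") as f:
        values.fromfile(f, samples * lines)
    if sys.byteorder == "little":
        values.byteswap()  # file is big-endian; swap on little-endian hosts
    return [values[r * samples:(r + 1) * samples].tolist() for r in range(lines)]


# A full SNODAS masked grid should therefore be 6935 * 3351 * 2 bytes.
print(expected_size_bytes(6935, 3351))
```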

In [5]:
def create_envi_header(txt_path, hdr_path, hdr_parms):
    """
    Creates an ENVI header (.hdr) file.

    Args:
        txt_path (str): Path to the input .txt file (used only for the 'map info' field).
        hdr_path (str): Path to save the output .hdr file.
        hdr_parms (dict): Header parameters with the following keys:
            samples (int): Number of samples (columns).
            lines (int): Number of lines (rows).
            bands (int): Number of bands.
            data_type (int): ENVI data type code (e.g., 2 for 16-bit signed integer).
            interleave (str): Interleave type ('bsq', 'bip', or 'bil').
            byte_order (int): Byte order (0 for little-endian, 1 for big-endian).
    """
    # Read the .txt file and look for a 'map info' line
    with open(txt_path, 'r') as file:
        lines_txt = file.readlines()
    map_info_line = next((line for line in lines_txt if "map info" in line.lower()), None)

    # Write header parameters to .hdr file
    print(f"Saving header to file: {hdr_path}")
    with open(hdr_path, 'w') as hdr_file:
        hdr_file.write("ENVI\n")
        hdr_file.write(f"samples = {hdr_parms['samples']}\n")
        hdr_file.write(f"lines = {hdr_parms['lines']}\n")
        hdr_file.write(f"bands = {hdr_parms['bands']}\n")
        hdr_file.write("header offset = 0\n")
        hdr_file.write("file type = ENVI Standard\n")
        hdr_file.write(f"data type = {hdr_parms['data_type']}\n")
        hdr_file.write(f"interleave = {hdr_parms['interleave']}\n")
        hdr_file.write("sensor type = Unknown\n")
        hdr_file.write(f"byte order = {hdr_parms['byte_order']}\n")
        if map_info_line:
            hdr_file.write(map_info_line)
        else:
            # Placeholder map info; the projection and extent are assigned later by gdal_translate
            hdr_file.write("map info = {UTM, 1.000, 1.000, 0.000, 0.000, 1, 1, WGS-84, units=Meters}\n")

In [9]:
tmp_dir = "/home/jovyan/shared-public/SnowPit/tmp/"
# Create ENVI headers, and save to tmp directory
for file in os.listdir(tmp_dir):
    if file.endswith(".txt"):
        file_path = os.path.join(tmp_dir, file)
        hdr_path = f"{file_path[:-4]}.hdr"
        create_envi_header(file_path, hdr_path, hdr_parms)

Saving header to file: /home/jovyan/shared-public/SnowPit/tmp/us_ssmv11034tS__T0001TTNATS2023030105HP001.hdr
Saving header to file: /home/jovyan/shared-public/SnowPit/tmp/us_ssmv11034tS__T0001TTNATS2023030205HP001.hdr
Saving header to file: /home/jovyan/shared-public/SnowPit/tmp/us_ssmv11034tS__T0001TTNATS2023030305HP001.hdr
Saving header to file: /home/jovyan/shared-public/SnowPit/tmp/us_ssmv11034tS__T0001TTNATS2023030405HP001.hdr
Saving header to file: /home/jovyan/shared-public/SnowPit/tmp/us_ssmv11034tS__T0001TTNATS2023030505HP001.hdr
Saving header to file: /home/jovyan/shared-public/SnowPit/tmp/us_ssmv11034tS__T0001TTNATS2023030605HP001.hdr
Saving header to file: /home/jovyan/shared-public/SnowPit/tmp/us_ssmv11034tS__T0001TTNATS2023030705HP001.hdr
Saving header to file: /home/jovyan/shared-public/SnowPit/tmp/us_ssmv11034tS__T0001TTNATS2023030805HP001.hdr
Saving header to file: /home/jovyan/shared-public/SnowPit/tmp/us_ssmv11034tS__T0001TTNATS2023030905HP001.hdr
Saving header to fi

Now that we have both `.dat` and `.hdr` files, we can finally convert them to GeoTIFFs for analysis.

In [10]:
def dat2tif(datfiles, writedir):
    # Set dictionary of desired parameters
    prod_lookup = {
        "1034": "SNWE"
    }

    outfnsv1 = {}

    # Create output tiff file for snow output
    for file in datfiles:
        date = file[file.find("TS")+2:file.find("TS")+10]
        for k,v in prod_lookup.items():
            if k in file:
                outfnsv1[file] = date + v + ".tif"

    # Create path for saved file
    outfnsvf = {}
    for k,v in outfnsv1.items():
        outfnsvf[k] = os.path.join(writedir, v)

    # Use GDAL to convert .dat and .hdr to .tif
    outfiles = []
    for infile,outfile in outfnsvf.items():
        if not os.path.exists(outfile):
            cmd = '''gdal_translate -of GTiff -a_srs '+proj=longlat +ellps=WGS84 +no_defs' -a_nodata -9999 -a_ullr -124.73333333 52.87500000 -66.94166667 24.95000000 {} {}'''.format(infile,outfile)
            os.system(cmd)
        else:
            print("{} already exists - moving to next file".format(outfile))

        outfiles.append(outfile)

    return outfiles
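The `gdal_translate` call in `dat2tif` is assembled as a shell string, which can break on paths containing spaces or quotes. One possible alternative, sketched below, is to build the same command as an argument list and hand it to `subprocess.run`. The helper names are hypothetical; the flags and coordinates are copied from the command above.

```python
import subprocess


def build_gdal_translate_cmd(infile, outfile):
    """Return the gdal_translate call from dat2tif as an argument list."""
    return [
        "gdal_translate",
        "-of", "GTiff",
        "-a_srs", "+proj=longlat +ellps=WGS84 +no_defs",
        "-a_nodata", "-9999",
        # Upper-left x/y, then lower-right x/y, of the masked SNODAS grid
        "-a_ullr", "-124.73333333", "52.87500000", "-66.94166667", "24.95000000",
        infile,
        outfile,
    ]


def convert(infile, outfile):
    """Run the conversion; raises CalledProcessError if gdal_translate fails."""
    subprocess.run(build_gdal_translate_cmd(infile, outfile), check=True)
```

With `check=True`, a failed conversion raises an exception instead of passing silently, which `os.system` would not do.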

In [11]:
# Look through .dat files for conversion to .tif
for file in os.listdir(tmp_dir):
    if file.endswith(".dat"):
        print(file)
        dat_path = os.path.join(tmp_dir, file)
        tiffile = dat2tif([dat_path], tmp_dir)

us_ssmv11034tS__T0001TTNATS2023030105HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023030205HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...70...80...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023030305HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023030405HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023030505HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...70...80...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023030605HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023030705HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023030805HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023030905HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023031005HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023031105HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023031205HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023031305HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023031405HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023031505HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023031605HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023031705HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023031805HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023031905HP001.dat
Input file size is 6935, 3351
0...10...20...30...40

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...50...60...70...80...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023032005HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023032105HP001.dat
Input file size is 6935, 3351
0...10...20...30...40

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...50...60...70...80...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023032205HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023032305HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023032405HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023032505HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023032605HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023033105HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023032705HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023032805HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023032905HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.
us_ssmv11034tS__T0001TTNATS2023033005HP001.dat
Input file size is 6935, 3351
0...10...20...30...40...50...60...70...80

ERROR 1: PROJ: proj_create_conversion_utm: Invalid zone number
ERROR 1: PROJ: proj_create_projected_crs: missing required input


...90...100 - done.


In [1]:
# Clean up tmp directory, removing excess .tar, .gz, .txt, .dat, .hdr files
import os

tmp_dir = "/home/jovyan/shared-public/SnowPit/tmp/"
files = os.listdir(tmp_dir)

for f in files:
    file_path = os.path.join(tmp_dir, f)
    if os.path.isfile(file_path) and not file_path.endswith(".tif"):
        os.remove(file_path)

print("Excess files deleted. Only Tiffs remaining.")

Excess files deleted. Only Tiffs remaining.


It took a fair bit of effort, but we finally have some usable SNODAS data. Let's take a look at one of the files.

In [2]:
import rioxarray as rxr

ds = rxr.open_rasterio("/home/jovyan/shared-public/SnowPit/tmp/20230301SNWE.tif")
ds

In [None]:
ds.plot(vmin=0, vmax=500)

The values look a bit large for SWE. That's because the data are stored as scaled integers, so we need to apply a scale factor to get proper values.

According to the SNODAS User Guide (https://nsidc.org/sites/default/files/g02158-v001-userguide_2_1.pdf), the scale factor for SWE is 1000: dividing the stored values by 1000 converts them to units of meters. If desired, the stored values can be kept as they are to represent SWE in millimeters (mm).
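As a small worked example of the scaling, the sketch below converts a few raw values to meters and leaves no-data cells unset. The function name `swe_to_meters` is hypothetical, and the no-data value of -9999 is taken from the `-a_nodata` flag used in the conversion step.

```python
def swe_to_meters(raw_values, scale=1000, nodata=-9999):
    """Divide raw SWE integers by the scale factor; map no-data cells to None."""
    return [None if v == nodata else v / scale for v in raw_values]


# A raw value of 250 corresponds to 250 mm, i.e. 0.25 m of SWE.
print(swe_to_meters([0, 250, -9999, 1500]))
```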

In [3]:
# Convert SWE data to meters
ds_meters = ds/1000

In [None]:
import matplotlib.pyplot as plt

# Plot SWE data over continental U.S.
fig, ax = plt.subplots()
ds_meters.where(ds>0).plot(ax=ax, vmin=0, vmax=1, cbar_kwargs={'label': "SWE [m]"})
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.set_title(" ")
fig.tight_layout()