# erddapy and urllib in Python
Feb. 13, 2025

## About
This document shows how I used Python to search the CEOTR ERDDAP server for Halifax Line glider data and download the results. The general procedure for searching and downloading is described in https://data.ceotr.ca/erddap/tabledap/documentation.html; the steps below apply it with erddapy and urllib.

erddapy: Install from https://github.com/ioos/erddapy#--erddapy \
urllib: part of the Python standard library

## Search the ERDDAP server
Here I used the CEOTR ERDDAP server (Dal glider data).

In [3]:
from urllib.request import urlretrieve
from erddapy import ERDDAP
import pandas as pd

server = 'https://data.ceotr.ca/erddap'
e = ERDDAP(server=server)

# Search keywords
# search_for = "Halifax Line Monitoring -realtime"
search_for = "halifax glider delayed -'wave glider'"

url = e.get_search_url(search_for=search_for, response="csv")

results = pd.read_csv(url)  # Returns a pandas dataframe

In [5]:
# Print the dataset id column of the dataframe
results['Dataset ID']

0            we10_20140902_58_delayed
1            we04_20150728_59_delayed
2           fundy_20180517_83_delayed
3          dal556_20170425_71_delayed
4          otn200_20160624_62_delayed
                   ...               
59    peggy_20220310_146_delayed_test
60    fundy_20220413_149_delayed_test
61    fundy_20200601_111_delayed_test
62    cabot_20220525_153_delayed_test
63    cabot_20210513_129_delayed_test
Name: Dataset ID, Length: 64, dtype: object
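
Some of the dataset IDs in the output above end in `_test`. If those should be handled separately, they can be split off before downloading. This is a minimal sketch, not part of the original workflow: the helper name `split_test_datasets` is hypothetical, and treating the `_test` suffix as a marker for test datasets is an assumption based only on the IDs shown above.

```python
import pandas as pd


def split_test_datasets(results, id_column="Dataset ID", suffix="_test"):
    """Split search results into (regular, test) dataframes by ID suffix.

    Hypothetical helper: assumes IDs ending in `suffix` are test datasets.
    """
    is_test = results[id_column].str.endswith(suffix)
    regular = results[~is_test].reset_index(drop=True)
    test = results[is_test].reset_index(drop=True)
    return regular, test


# Small stand-in for the search results, using IDs from the output above
example = pd.DataFrame({"Dataset ID": [
    "we10_20140902_58_delayed",
    "fundy_20180517_83_delayed",
    "peggy_20220310_146_delayed_test",
]})
regular, test = split_test_datasets(example)
print(list(regular["Dataset ID"]))
print(list(test["Dataset ID"]))
```

The index is reset on both pieces so that any later lookup by row number still lines up with the filtered rows.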

## Download search results to netCDF
On the CEOTR ERDDAP server, the data from each mission are separated by cast. Downloads can be slow, especially when a mission has many casts (e.g. Mission \#39 was too big to download into one file). The following code loops over the search results and saves each mission's data to its own netCDF file.

You can specify a list of variables to download from each mission, but the request fails with an error if a mission lacks any of the listed variables, so I downloaded all variables by default.
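
One possible way around the missing-variable error is to check which variables a dataset has before asking for them. The sketch below is an assumption, not what I ran: `available_variables` is a hypothetical helper, the variable names `depth` and `temperature` are placeholders, and the table layout (`Row Type` and `Variable Name` columns) is what ERDDAP's dataset info CSV normally contains. With a live server, the table would come from `pd.read_csv(e.get_info_url(dataset_id=dataset_id, response="csv"))`.

```python
import pandas as pd


def available_variables(info, wanted):
    """Return the names in `wanted` that the dataset info table lists as variables.

    Hypothetical helper; `info` is an ERDDAP info table read into pandas.
    """
    variable_rows = info[info["Row Type"] == "variable"]
    present = set(variable_rows["Variable Name"])
    return [name for name in wanted if name in present]


# Stand-in for an info table, so the example runs without network access
info = pd.DataFrame({
    "Row Type": ["attribute", "variable", "attribute", "variable"],
    "Variable Name": ["NC_GLOBAL", "time", "time", "depth"],
    "Attribute Name": ["title", "", "units", ""],
    "Data Type": ["String", "double", "String", "float"],
    "Value": ["example", "", "seconds since 1970-01-01T00:00:00Z", ""],
})

wanted = ["time", "depth", "temperature"]  # placeholder variable names
variables = available_variables(info, wanted)
print(variables)
```

The resulting list could then be passed as `variables=variables` to `e.get_download_url`, at the cost of one extra info request per mission.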

In [None]:
import os

# Initialize a list to hold the names of the saved netCDF files
filenames = []
output_dir = './erddapy_search_results/'  # Output directory
os.makedirs(output_dir, exist_ok=True)  # urlretrieve does not create missing directories

# Download every mission found by the search (64 here), using the dataset IDs
# in the results dataframe. Wrapping the iterable in the tqdm package's "tqdm"
# function would show a progress bar.
for dataset_id in results['Dataset ID']:
    url1 = e.get_download_url(
        dataset_id=dataset_id, protocol='tabledap',  # variables=variables,
        response='nc',
    )
    filename = os.path.join(output_dir, dataset_id + '.nc')
    print(filename)
    path, headers = urlretrieve(url1, filename)
    filenames.append(filename)
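
Because downloads can be slow and a large mission can fail, a variant of the loop that records failures and skips files already on disk may be useful when rerunning. This is a sketch under assumptions rather than the code I used: `download_missions`, `make_url`, and `fetch` are hypothetical names, and `fetch` defaults to `urlretrieve` only so that it can be swapped out for testing.

```python
import os
from urllib.request import urlretrieve


def download_missions(dataset_ids, make_url, output_dir, fetch=urlretrieve):
    """Download one netCDF file per dataset ID; return (saved, failed).

    Hypothetical helper. `make_url` maps a dataset ID to its download URL,
    and `fetch(url, filename)` performs the download.
    """
    os.makedirs(output_dir, exist_ok=True)
    saved, failed = [], []
    for dataset_id in dataset_ids:
        filename = os.path.join(output_dir, dataset_id + ".nc")
        if os.path.exists(filename):
            # Already downloaded in an earlier run
            saved.append(filename)
            continue
        try:
            fetch(make_url(dataset_id), filename)
        except OSError as err:  # URLError and HTTPError are subclasses of OSError
            # Remove any partial file so a rerun retries this mission
            if os.path.exists(filename):
                os.remove(filename)
            failed.append((dataset_id, str(err)))
        else:
            saved.append(filename)
    return saved, failed


# Usage with the objects defined earlier (not run here):
# saved, failed = download_missions(
#     results['Dataset ID'],
#     lambda ds: e.get_download_url(dataset_id=ds, protocol='tabledap', response='nc'),
#     './erddapy_search_results/',
# )
```

Missions listed in `failed` can be retried later or downloaded in smaller pieces.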