# Hycom Download Demo

## Download a batch of FMRC Files
Once a `HycomDataset` entry has been defined from the Hycom catalog, the `updateFMRCData` method takes care of identifying avaiable FMRCs as well as staging and downloading the Data files.
Below is a demonstration using default options except for `limit` which is set to 12 to avoid creating an uncessasary server load.

### Expore Options For Download Jobs

A series of types have been created to specify options:  
`FMRCSubsetOptions` - Hycom related options for the Thredds server   
`FMRCDownloadOptions` - Options realted to how to request files from the server  
`FMRCDownloadJobOptions` - C3 Batch Job Options for managing the downloads  
The next cell displays the default settings for each option type.

In [3]:
# Default subset options
print(c3.FMRCSubsetOptions())
# Default download options
print(c3.FMRCDownloadOptions())
#print(dlo)
# Use default download options
print(c3.FMRCDownloadJobOptions())

c3.FMRCSubsetOptions(
 timeStride=1,
 vars='surf_el,salinity,water_temp,water_u,water_v',
 disableLLSubset='on',
 disableProjSubset='on',
 horizStride=1,
 vertStride=1,
 addLatLon='false',
 accept='netcdf4')
c3.FMRCDownloadOptions(
 externalDir='hycom-data',
 maxTimesPerFile=1,
 maxForecastDepth=-1,
 defaultHycomDatasetId='GOMu0.04_901m000_FMRC_1.0.1')
c3.FMRCDownloadJobOptions(batchSize=4, limit=-1)


### Update FMRC's and submit the Download Job

In [4]:
# Ensure we have a Dataset entry for the desired catalog
cat_url = "https://tds.hycom.org/thredds/catalog/GOMu0.04/expt_90.1m000/FMRC/runs/catalog.xml"
gom_dataset = c3.HycomDataset.upsertHycomDatasetFromCatalog(url = cat_url)

# Create an updateFMRCData job.
# Use defaults for Subset ond download options
# Limit # of files downloaded for demo

data_dir = 'hycom-test'

job = c3.HycomFMRC.updateFMRCData(
    fmrcSubsetOptions = c3.FMRCSubsetOptions(),
    fmrcDownloadOptions = c3.FMRCDownloadOptions(
        **{
            'externalDir': data_dir
        }
    ),
    fmrcDownloadJobOptions = c3.FMRCDownloadJobOptions(
        **{
            'batchSize': 4,
            'limit': 20
        }
    )
)
job.status()

c3.BatchJobStatus(
 started=datetime.datetime(2021, 9, 29, 21, 32, 11, tzinfo=datetime.timezone.utc),
 startedby='podolsky@berkeley.edu',
 status='running',
 newBatchSubmitted=True)

In [5]:
# Monitor the job status until completed and display total # of files downloaded
import time
from IPython.display import clear_output
status = job.status()
while status.status == 'running':
    time.sleep(2)
    clear_output()
    status = job.status()
    gom_dataset = c3.HycomDataset.fetch().objs[0]
    filecount = c3.FMRCFile.fetchCount(spec={'filter':"status=='downloaded'"})
    print(f"FMRC Archive Size: {round(gom_dataset.fmrcArchiveSize/(1024**3),2)} GiB")
    print(f"Download count: {filecount}")
    print (status)

FMRC Archive Size: 3.5 GiB
Download count: 99
c3.BatchJobStatus(
 started=datetime.datetime(2021, 9, 29, 21, 32, 11, tzinfo=datetime.timezone.utc),
 startedby='podolsky@berkeley.edu',
 completed=datetime.datetime(2021, 9, 30, 20, 53, 55, tzinfo=datetime.timezone.utc),
 status='failed',
 errors=c3.Arry<JobRunErrorDetail>([c3.JobRunErrorDetail(
           failedActionId='1966.1187849',
           errorMsg='Error executing command: '
                     '/usr/local/share/c3/condaEnvs/dev/tc01d/py-hycom_1_0_0/bin/python '
                     '/tmp/pythonActionSourceCache605507753435165592/FMRCDownloadJob_processBatch.py\n'
                     'p_logger=main url=http://dev-dti-app-w-001:8080 '
                     'connector=null mode="thick" Action failed!\n'
                     'Traceback (most recent call last):\n'
                     '  File '
                     '"/tmp/pythonActionSourceCache605507753435165592/FMRCDownloadJob_processBatch.py", '
                     'line 379, in

In [6]:
#job.status()

## Open NetCDF files from Archive

In [4]:
# Pick the first avalable file, open it and print the NetCDF metadata
file = c3.FMRCFile.fetch().objs[1]
ds = c3.HycomUtil.nc_open(file.file.url,'/tmp')
print(ds)
c3.HycomUtil.nc_close(ds, file.file.url, '/tmp')

<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
    classification_level: UNCLASSIFIED
    distribution_statement: Approved for public release. Distribution unlimited.
    downgrade_date: not applicable
    classification_authority: not applicable
    institution: Naval Oceanographic Office
    source: HYCOM archive file
    history: archv2ncdf3z ;
FMRC Run 2021-09-10T12:00:00Z Dataset
    field_type: instantaneous
    Conventions: CF-1.4, NAVO_netcdf_v1.1
    cdm_data_type: GRID
    featureType: GRID
    location: Proto fmrc:GOMu0.04_901m000_FMRC
    History: Translated to CF-1.0 Conventions by Netcdf-Java CDM (CFGridWriter2)
Original Dataset = fmrc:GOMu0.04_901m000_FMRC; Translation Date = 2021-09-16T18:12:30.911Z
    geospatial_lat_min: 18.1200008392334
    geospatial_lat_max: 31.920000076293945
    geospatial_lon_min: -98.0
    geospatial_lon_max: -76.4000244140625
    dimensions(sizes): time(1), lat(346), lon(541), depth(40)
    variables(dime

1

## Cleanup Records and Files Created for this Demo

In [3]:
# Cleanup
#print(f"Removed {c3.HycomFMRCFile.removeAll()} HycomFMRCFile records.")
print(f"Removed {c3.FMRCFile.removeAll()} FMRCFile records.")
print(f"Removed {c3.FMRCDataArchive.removeAll()} FMRCDataArchive records.")
print(f"Removed {c3.HycomFMRC.removeAll()} HycomFMRC records.")
print(f"Removed {c3.HycomDataset.removeAll()} HycomDataset records")
files = c3.FileSystem.inst().listFiles(data_dir)
if files.files:
    print(f"Deleting {len(files.files)} files")
    c3.FileSystem.inst().deleteFilesBatch(files.files)
print("Done.")

Removed 0 FMRCFile records.
Removed 0 FMRCDataArchive records.
Removed 0 HycomFMRC records.
Removed 0 HycomDataset records


NameError: name 'data_dir' is not defined

## Testing

In [None]:
# Check for errors
errors = c3.FMRCFile.fetchCount(spec={'filter':"status=='error'"})
print(errors)

In [None]:
# Grab a dataArchive record
da = c3.FMRCDataArchive.fetch(spec = {'include': 'this,dataFiles'}).objs[0]
da

In [None]:
# Check the total size of the files in this archive
print(f"Archive Size: {round(da.archiveSize/(1024**3),2)} GiB")

In [None]:
# Check the total Size of All FMRC Data
gom_dataset = c3.HycomDataset.fetch().objs[0] # The HycomFMRC is the first on only record here, now.
print(f"FMRC Archive Size: {round(gom_dataset.fmrcArchiveSize/(1024**3),2)} GiB")

In [1]:
from some_package.some_module import hey
hey()

ModuleNotFoundError: No module named 'some_package'

In [3]:
c3.FileSystem.inst().listFiles("hycom-data")

c3.ListFilesResult(
 files=c3.Arry<File>([c3.AzureFile(
          contentLength=39715517,
          contentLocation='fs/dti/mpodolsky/hycom-data/GOMu0.04_901m000_FMRC_RUN_2021-09-10T12:00:00Z-2021-09-10T12:00:00Z.nc',
          eTag='"0x8D9793D8BE260F2"',
          lastModified=datetime.datetime(2021, 9, 16, 18, 12, 29, tzinfo=datetime.timezone.utc),
          contentMD5='Ty6OY12e3zZ3I/M9350wXw==',
          hasMetadata=False,
          url='azure://dev-dti/fs/dti/mpodolsky/hycom-data/GOMu0.04_901m000_FMRC_RUN_2021-09-10T12:00:00Z-2021-09-10T12:00:00Z.nc',
          blobType='BLOCK_BLOB'),
         c3.AzureFile(
          contentLength=39704070,
          contentLocation='fs/dti/mpodolsky/hycom-data/GOMu0.04_901m000_FMRC_RUN_2021-09-10T12:00:00Z-2021-09-10T13:00:00Z.nc',
          eTag='"0x8D9793D94E13207"',
          lastModified=datetime.datetime(2021, 9, 16, 18, 12, 44, tzinfo=datetime.timezone.utc),
          contentMD5='wouV/KyZIdN/gFFTPR1ZHA==',
          hasMetadata=False,
     