# UFS Prototype 8: ocn_2D data download notebook

*Loren Doyle*

This notebook shows how I downloaded the ocn_2D data from the NOAA AWS bucket. It is a bit clunky and brute-force, and there is most likely a better way to download it, but this was the easiest way for me. (I tried using glob, but it would not work with the cached data.)

This notebook is run in the clim_model environment.

## Import the necessary libraries

In [38]:
import xarray as xr
import numpy as np
import xesmf as xe

import fsspec
import os

import datetime
import cftime
import gc

## Define the cache directory to store the temp files

Make a folder in your scratch directory and change `cachedir` to that path.

In [39]:
# directory to store temporary files
cachedir = '/scratch/ldoyle4/ufs8/cache/'
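If you would rather create the folder from inside the notebook, something like the following should work. This is only a sketch: `ensure_cache_dir` is a hypothetical helper, not part of the original notebook, and a temporary directory stands in for the scratch path.

```python
import os
import tempfile

def ensure_cache_dir(path):
    """Create the cache folder (and any missing parents) if it does not exist."""
    os.makedirs(path, exist_ok=True)
    return path

# Stand-in for a scratch path such as '/scratch/<username>/ufs8/cache/'
demo_root = tempfile.mkdtemp()
cachedir_demo = ensure_cache_dir(os.path.join(demo_root, 'ufs8', 'cache'))
print(os.path.isdir(cachedir_demo))
```

Calling it a second time on the same path is harmless, because `exist_ok=True` suppresses the error for an existing folder.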



## Define the time elements

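Here is where it gets a bit clunky. I define the initialization months, days, and years for the path. Then I define the days of the month, the 6-hourly steps, and the pair of months (the initialization month and the one after it) that appear in the file names inside each gfs.[] folder.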

In [3]:
# initialization times for path
init_months = ['01','02','03','04','05','06','07','08','09','10','11','12']
init_days = ['01','15']
init_years = ['2014']

#days
days_31 = ['01','02','03','04','05','06','07','08','09','10','11','12','13','14',
        '15','16','17','18','19','20','21','22','23','24','25','26','27','28',
        '29','30','31']
days_30 = ['01','02','03','04','05','06','07','08','09','10','11','12','13','14',
        '15','16','17','18','19','20','21','22','23','24','25','26','27','28',
        '29','30']
days_28 = ['01','02','03','04','05','06','07','08','09','10','11','12','13','14',
        '15','16','17','18','19','20','21','22','23','24','25','26','27','28']
days_29 = ['01','02','03','04','05','06','07','08','09','10','11','12','13','14',
        '15','16','17','18','19','20','21','22','23','24','25','26','27','28','29']

#steps
steps = ['00','06','12','18']

#months
january = ['201401','201402']
february = ['201402','201403']
march = ['201403','201404']
april = ['201404','201405']
may = ['201405','201406']
june = ['201406','201407']
july = ['201407','201408']
august = ['201408','201409']
september = ['201409','201410']
october = ['201410','201411']
november = ['201411','201412']
december = ['201412','201501']
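The day lists and month pairs above could also be generated instead of typed out. The sketch below uses the standard-library `calendar` module; `day_strings` and `month_pair` are hypothetical helper names, not part of the original notebook.

```python
import calendar

def day_strings(year, month):
    """Zero-padded day-of-month strings for the given month, e.g. '01'..'31'."""
    n_days = calendar.monthrange(year, month)[1]
    return [f'{day:02d}' for day in range(1, n_days + 1)]

def month_pair(year, month):
    """'YYYYMM' strings for a month and the month after it (rolls over the year)."""
    next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
    return [f'{year}{month:02d}', f'{next_year}{next_month:02d}']

print(day_strings(2014, 2)[-1])   # last day of February 2014
print(month_pair(2014, 12))       # December rolls over into January 2015
```

With these, `day_strings(2014, 2)` would take the place of `days_28` and `month_pair(2014, 2)` the place of `february`, and leap years are handled without a separate `days_29` list.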


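Here, I write a function that builds the file URLs for one initialization date, downloads them to the temporary cache, and returns the local paths. The URL list covers two full months (the initialization month and the next one), so I slice it down to the 35 days of the forecast: 35 days at 4 files per day is 140 files.

For the 01 initialization, we take list positions 1 to 140 (the slice `[1:141]`), which gives us the 35 days of the forecast starting on the 1st of the month.
For the 15 initialization, we take list positions 57 to 196 (the slice `[57:197]`), which gives us the 35 days of the forecast starting on the 15th of the month.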
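The two slices follow from simple arithmetic on the list layout (one entry per day and step, starting at day 01, step 00). Here is a small sketch of that arithmetic; `forecast_slice` is a hypothetical helper, and it assumes, as the slices in this notebook do, that the '00' entry of the initialization day is skipped.

```python
def forecast_slice(init_day, n_days=35, steps_per_day=4):
    """Slice into the day-by-step file list for one forecast.

    The list starts at day 01, step 00. The slice starts one position past
    the init day's '00' entry and spans n_days * steps_per_day files.
    """
    start = (init_day - 1) * steps_per_day + 1
    stop = start + n_days * steps_per_day
    return slice(start, stop)

print(forecast_slice(1))
print(forecast_slice(15))
```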

In [4]:
def files_01(months, days, steps, init_months, init_days, init_years):
    file = [f'filecache::https://noaa-ufs-prototypes-pds.s3.amazonaws.com/Prototype8/{iy}{im}{ids}/ocn_2D/gfs.{iy}{im}{ids}/00/ocean/ocn_2D_{month}{day}{step}.01.{iy}{im}{ids}00.nc' 
            for month in months for day in days for step in steps for iy in init_years for im in init_months for ids in init_days]
    of = fsspec.open_local(file[1:141],s3={'anon': True}, filecache={'cache_storage':cachedir})
    
    return(of)

def files_15(months, days, steps, init_months, init_days, init_years):
    file = [f'filecache::https://noaa-ufs-prototypes-pds.s3.amazonaws.com/Prototype8/{iy}{im}{ids}/ocn_2D/gfs.{iy}{im}{ids}/00/ocean/ocn_2D_{month}{day}{step}.01.{iy}{im}{ids}00.nc' 
            for month in months for day in days for step in steps for iy in init_years for im in init_months for ids in init_days]
    of = fsspec.open_local(file[57:197],s3={'anon': True}, filecache={'cache_storage':cachedir})
    
    return(of)


    #print(file)
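An alternative to slicing a two-month list is to step forward from the initialization time in 6-hour increments, which removes the need for the per-month day lists. This is a sketch only: `forecast_urls` is a hypothetical helper, the URL pattern is copied from the functions above, and I have not checked the result against the bucket for every date.

```python
from datetime import datetime, timedelta

BASE = 'https://noaa-ufs-prototypes-pds.s3.amazonaws.com/Prototype8'

def forecast_urls(init, n_days=35, step_hours=6):
    """URLs for one forecast, following the file naming used in files_01/files_15."""
    init_str = init.strftime('%Y%m%d')
    n_files = n_days * 24 // step_hours
    urls = []
    for k in range(1, n_files + 1):
        # time stamp in the file name: init time plus k steps
        valid = init + timedelta(hours=step_hours * k)
        urls.append(
            f'filecache::{BASE}/{init_str}/ocn_2D/gfs.{init_str}/00/ocean/'
            f'ocn_2D_{valid:%Y%m%d%H}.01.{init_str}00.nc'
        )
    return urls

urls = forecast_urls(datetime(2014, 1, 1))
print(len(urls))
print(urls[0].rsplit('/', 1)[-1])
print(urls[-1].rsplit('/', 1)[-1])
```

The returned list has the same 140 entries that `file[1:141]` selects for a 01 initialization, so it could presumably be passed to `fsspec.open_local` with the same keyword arguments used above.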

I run the function for each month. I could not figure out how to write a loop that does it for me, so this was the best solution I could come up with.

The cells below use the 1st as the initialization day. To switch to the 15th, call files_15 instead of files_01 and change the initialization day argument from ['01'] to ['15'].

I print the first few results, just to make sure they are correct.

In [5]:
jan_files_01 = files_01(january, days_31, steps, ['01'], ['01'], ['2014'])
print(jan_files_01)



['/scratch/ldoyle4/ufs8/cache/21469ddb40a13ed5912a0a3f35d586ba4d78f6950d664e1d9932e59d782449c2', '/scratch/ldoyle4/ufs8/cache/ac6616883e9abffc30419b6e76aa19aa368fc7e8a89b198ec2e5b99445b8c01f', '/scratch/ldoyle4/ufs8/cache/cfe2cd4613ab737e2f33a4feec2e8b437bf79bec3196d4475c554ed3070f2928', '/scratch/ldoyle4/ufs8/cache/e14b4e6af4e56224816da0156fdd7bd560c68d5424cead6b89aba1050bbf5d4d', '/scratch/ldoyle4/ufs8/cache/bad9b3bd78fe5d9c5c36659b4e17782698313e699c18b91857b5375b8a94a2ee', '/scratch/ldoyle4/ufs8/cache/3fb884419d1ea2d8b8bb1869b4ca5af37a231085f5c3d1e0cd29e8d7ce9a612a', '/scratch/ldoyle4/ufs8/cache/49fabc8d30dd467943dd544e2ce1232749f70c49b0e8a9b964570c2403058dad', '/scratch/ldoyle4/ufs8/cache/f4371f74b0461a264e0f1448054299e479e0f48f43a4967d31233fe53c8ba0d7', '/scratch/ldoyle4/ufs8/cache/49a56a9a03489584894cd6f23303ff1e5b310ac8861cefe1c0fee554af87fc8d', '/scratch/ldoyle4/ufs8/cache/5319bf87338ea0051ecc4c7a7f53143c7daba38d50aa2b4a522a91ae32555988', '/scratch/ldoyle4/ufs8/cache/39a3ab2253

In [7]:
feb_files_01 = files_01(february, days_28, steps, ['02'], ['01'], ['2014'])
print(feb_files_01)



['/scratch/ldoyle4/ufs8/cache/d927d0913651d23a51bc96d6d9197a1ea5df5cb1ce8c7648ff9909f426813f51', '/scratch/ldoyle4/ufs8/cache/27b1878c8f45f047a71e7499581dd631829e77a92b0fc18b6009028f6b0434e7', '/scratch/ldoyle4/ufs8/cache/d9c32466f96b36688dfc08365365041d2588b5b1c5eeb15a576afe3eced67f03', '/scratch/ldoyle4/ufs8/cache/f3a753d3a872acde66e68f4241c5652d1a23be8d84bc483b23e9b3216f16266d', '/scratch/ldoyle4/ufs8/cache/6242a948e62535e6da22104542beea775f6ac04681121885fee9f874ec36b295', '/scratch/ldoyle4/ufs8/cache/47348c27787eabf0a92c0bea26b5547f28e521e4bdbfbf99d82faa0a87857507', '/scratch/ldoyle4/ufs8/cache/528372eaee68d211dbae0cd5125b2c7b7b55b883237fbd894d7d8974330d07ed', '/scratch/ldoyle4/ufs8/cache/e730114a31ebf8663222f06628612a86c0e282f26070e95dab4d3b5737f75c83', '/scratch/ldoyle4/ufs8/cache/26bc2180858ed96fe433982ebbd5b17b28cef2853a02b1ecb8c4669ab94e33c8', '/scratch/ldoyle4/ufs8/cache/8b463dcb605ddf1c3d8a3764a9c0dac2236e4eaa9687a1e6deed00532904d554', '/scratch/ldoyle4/ufs8/cache/208594167f

In [8]:
mar_files_01 = files_01(march, days_31, steps, ['03'], ['01'], ['2014'])
print(mar_files_01)



['/scratch/ldoyle4/ufs8/cache/5fc63ee7ca55c5a28f225b7fe1a36b3baa844cec4c1b2c33089684f798cd8ed3', '/scratch/ldoyle4/ufs8/cache/d6c486ddc32a0f9894c23c6650c79c84662cd790c0cb0d521717411c91ce3104', '/scratch/ldoyle4/ufs8/cache/f1732a4de0600cc680f973deddb14873da1c8e135200ac57d8ecbc8ac15df812', '/scratch/ldoyle4/ufs8/cache/e5ff416573e8fc1dee68da4da10167290f6738be1b9a56c5010c2357bde30533', '/scratch/ldoyle4/ufs8/cache/002e86859ad912c33003f3db4d702187a2e83bf31058e69d3b59fe79698cf56c', '/scratch/ldoyle4/ufs8/cache/c4e02831127042b9d4c2858b0d6ee80f4d1cf32c3c7db95e48a6c7793c8f74dc', '/scratch/ldoyle4/ufs8/cache/13b141d2a8f7a27cbe70aec30bf426e758b38436df34636067219ccbdd0042a8', '/scratch/ldoyle4/ufs8/cache/3935f6e34257942b6484fe5a02a1b6df0542be65a9df890e8dfca9d2ccec2944', '/scratch/ldoyle4/ufs8/cache/92ad9b907441b4740043044672a30478bcf23adb0d99230ddb55106ca807d28f', '/scratch/ldoyle4/ufs8/cache/0a0a6236af17f5f601a4976d8db7b19bacfa166ff145162dafcbb6ecfa09c018', '/scratch/ldoyle4/ufs8/cache/c44119cebd

In [9]:
apr_files_01 = files_01(april, days_30, steps, ['04'], ['01'], ['2014'])
print(apr_files_01)



['/scratch/ldoyle4/ufs8/cache/f669b35e510db65925d68a1e6dec780b447741ded76a68e02cf8ddaf8ff537c8', '/scratch/ldoyle4/ufs8/cache/1b4ec6165f569ed4cac9ecb2df59d8eeec47fee2c2a7ab2b541e3fa1d3d257c7', '/scratch/ldoyle4/ufs8/cache/04d6275fb78ec69ef9f4d15da9929605f04e78e75130363acfe431f1887acd72', '/scratch/ldoyle4/ufs8/cache/2278b4c7681ab2aa7428c9964bb3c554e12dfe50871d6d0a972caa692f534da6', '/scratch/ldoyle4/ufs8/cache/c54de3f8074bcbf2974c12fc744f0bfe2986f6b4924d79810c45636b6a2496f0', '/scratch/ldoyle4/ufs8/cache/b19a06c7553cc2b00facea088da11105288d502f86eecb4bc7c15da9d21edbde', '/scratch/ldoyle4/ufs8/cache/5b6866b42a727c7cdad576aa0151754ab76ed74e7cd7c272307a285f2473d164', '/scratch/ldoyle4/ufs8/cache/3c3b41f56b2f72d0a504d52aec3ad757cbb04ad2c834e1b605e92d200090318f', '/scratch/ldoyle4/ufs8/cache/263c5cb502a6025fd08611096508583cdb6798fa052a394b7a9325a16d103989', '/scratch/ldoyle4/ufs8/cache/8b5847c0886a3bf39f6e970be3f525510063e1c6abb8f2e4338f9e24861c5a93', '/scratch/ldoyle4/ufs8/cache/c65582ff88

In [10]:
ma_files_01 = files_01(may, days_31, steps, ['05'], ['01'], ['2014'])
#print(ma_files_01)


In [11]:
jun_files_01 = files_01(june, days_30, steps, ['06'], ['01'], ['2014'])
#print(jun_files_01)



In [12]:
jul_files_01 = files_01(july, days_31, steps, ['07'], ['01'], ['2014'])
#print(jul_files_01)



In [13]:
aug_files_01 = files_01(august, days_31, steps, ['08'], ['01'], ['2014'])
#print(aug_files_01)



In [14]:
sep_files_01 = files_01(september, days_30, steps, ['09'], ['01'], ['2014'])
#print(sep_files_01)



In [15]:
oct_files_01 = files_01(october, days_31, steps, ['10'], ['01'], ['2014'])
#print(oct_files_01)



In [16]:
nov_files_01 = files_01(november, days_30, steps, ['11'], ['01'], ['2014'])
#print(nov_files_01)



In [17]:

dec_files_01 = files_01(december, days_31, steps, ['12'], ['01'], ['2014'])
#print(dec_files_01)

Then, I make a list that contains each month.

In [18]:
month_files_01 = [jan_files_01, feb_files_01, mar_files_01, apr_files_01, ma_files_01, jun_files_01, jul_files_01,
                  aug_files_01, sep_files_01, oct_files_01, nov_files_01, dec_files_01]
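The twelve calls and the list above could be collapsed into one loop. The sketch below is one possible way to do it: `collect_month_files` is a hypothetical helper that takes the download function as an argument, so it can be dry-run here with a stand-in that only records its arguments and downloads nothing.

```python
import calendar

def collect_month_files(fetch, year=2014, init_day='01'):
    """Call fetch once per initialization month and return the 12 results in order.

    fetch is expected to take the same arguments as files_01/files_15:
    (months, days, steps, init_months, init_days, init_years).
    """
    steps = ['00', '06', '12', '18']
    results = []
    for month in range(1, 13):
        n_days = calendar.monthrange(year, month)[1]
        days = [f'{d:02d}' for d in range(1, n_days + 1)]
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        months = [f'{year}{month:02d}', f'{next_year}{next_month:02d}']
        results.append(fetch(months, days, steps, [f'{month:02d}'], [init_day], [str(year)]))
    return results

# Dry run with a stand-in that only records its arguments (no download)
calls = collect_month_files(lambda *args: args)
print(calls[1][0], len(calls[1][1]))
```

In the notebook itself, `month_files_01 = collect_month_files(files_01)` would be the intended usage, and `collect_month_files(files_15, init_day='15')` the equivalent for the 15th.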

## Open the data as an xarray dataset

List the variables you want to drop, since the files contain lots of variables.

I'm using taux, tauy, SST, SSU, and SSV, so I drop everything else, but if there is a variable you want to keep, just remove it from drop_var.

In [19]:
drop_var=['Heat_PmE','LW','LwLatSens','MLD_003','MLD_0125','SSH','SSS','SW','average_DT','average_T1','average_T2',
          'cos_rot','ePBL','evap','fprec','frazil','geolon','geolat','geolon_c','geolat_c','geolon_u','geolat_u', 
          'geolon_v','geolat_v','latent','lprec','lrunoff','sensible','sin_rot','speed','wet_c','wet_u','wet_v']
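Instead of maintaining the drop list by hand, it can be derived from the list of variables you want to keep. A minimal sketch, where `variables_to_drop` is a hypothetical helper and the example names are a short subset of the ones above:

```python
def variables_to_drop(all_variables, keep):
    """Names in all_variables that are not in keep, in their original order."""
    keep = set(keep)
    return [name for name in all_variables if name not in keep]

# Short subset of the variable names in the files, for illustration only
example_vars = ['SSH', 'SST', 'SSU', 'SSV', 'taux', 'tauy', 'evap']
print(variables_to_drop(example_vars, keep=['taux', 'tauy', 'SST', 'SSU', 'SSV']))
# → ['SSH', 'evap']
```

With real data, `all_variables` could come from opening a single cached file and taking `list(ds.data_vars)`.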



I wrote a function that concatenates the files along the time dimension and opens them as a single xarray dataset, and then I run it in a for loop for the entire year.

In [20]:
def dataset(of, drop_variables):
    ds_iso = xr.open_mfdataset(of, engine='netcdf4', drop_variables=drop_variables, combine='nested', concat_dim='time')
    return(ds_iso)

In [21]:
data = []

for mon in month_files_01:
    datas = dataset(mon, drop_var)
    data.append(datas)
    

In [22]:
data

[<xarray.Dataset>
 Dimensions:    (time: 140, yh: 1080, xh: 1440, xq: 1440, yq: 1080, nv: 2, z_i: 41, z_l: 40)
 Coordinates:
   * nv         (nv) float64 1.0 2.0
   * time       (time) object 2014-01-01 03:00:00 ... 2014-02-04 21:00:00
   * xh         (xh) float64 -299.7 -299.5 -299.2 -299.0 ... 59.53 59.78 60.03
   * xq         (xq) float64 -299.6 -299.3 -299.1 -298.9 ... 59.66 59.91 60.16
   * yh         (yh) float64 -80.39 -80.31 -80.23 -80.15 ... 89.73 89.84 89.95
   * yq         (yq) float64 -80.35 -80.27 -80.19 -80.11 ... 89.78 89.89 90.0
   * z_i        (z_i) float64 0.0 10.0 20.0 ... 3.728e+03 4.225e+03 4.737e+03
   * z_l        (z_l) float64 5.0 15.0 25.0 ... 3.489e+03 3.977e+03 4.481e+03
 Data variables:
     SST        (time, yh, xh) float32 dask.array<chunksize=(1, 1080, 1440), meta=np.ndarray>
     SSU        (time, yh, xq) float32 dask.array<chunksize=(1, 1080, 1440), meta=np.ndarray>
     SSV        (time, yq, xh) float32 dask.array<chunksize=(1, 1080, 1440), meta=np.nda

## Process the data: daily means and regridding

Here is where it gets a bit clunky again. This needs to be run 5 times, once for each variable: taux, tauy, SST, SSU, SSV.

The attributes for each variable are:

taux: 
* units = Pa
* name = Zonal Wind Stress

tauy:
* units = Pa
* name = Meridional Wind Stress

SST:
* units = degC
* name = Sea Surface Temperature

SSU:
* units = m s-1
* name = Sea Surface Zonal Velocity

SSV:
* units = m s-1
* name = Sea Surface Meridional Velocity
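The attributes listed above can be kept in one dictionary so that switching variables only means changing a key. This is a sketch: `VAR_ATTRS` and `apply_attrs` are hypothetical names, and a `SimpleNamespace` with an `attrs` dict stands in for an xarray DataArray so the example runs without any data.

```python
from types import SimpleNamespace

# units and names from the list above, with 'Wind Stress' spelled as in the code cell
VAR_ATTRS = {
    'taux': {'units': 'Pa', 'name': 'Zonal Wind Stress'},
    'tauy': {'units': 'Pa', 'name': 'Meridional Wind Stress'},
    'SST': {'units': 'degC', 'name': 'Sea Surface Temperature'},
    'SSU': {'units': 'm s-1', 'name': 'Sea Surface Zonal Velocity'},
    'SSV': {'units': 'm s-1', 'name': 'Sea Surface Meridional Velocity'},
}

def apply_attrs(array, variable):
    """Copy the units and name for `variable` into array.attrs and return array."""
    array.attrs.update(VAR_ATTRS[variable])
    return array

# Stand-in object with an .attrs dict, in place of an xarray DataArray
demo = apply_attrs(SimpleNamespace(attrs={}), 'SST')
print(demo.attrs)
```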

Resample the time axis for each variable from 6-hourly to daily by taking the mean of the four values in each day. Then regrid the data from the tripolar grid to a regular lat-lon grid. I chose 0.25 degree resolution as the basis, but you can change it if you need to.
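To see why 140 six-hourly values become 35 daily values, here is a plain-Python illustration of the grouping that the daily resample performs on this time axis. It is not how xarray implements `resample`; `daily_means` is a hypothetical helper and the values are made up.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def daily_means(times, values):
    """Average values that share a calendar date; returns {date: mean}."""
    groups = defaultdict(list)
    for t, v in zip(times, values):
        groups[t.date()].append(v)
    return {day: sum(vals) / len(vals) for day, vals in sorted(groups.items())}

# 140 six-hourly time stamps starting at 03:00, like the time axis of the datasets
start = datetime(2014, 1, 1, 3)
times = [start + timedelta(hours=6 * k) for k in range(140)]
values = [float(k) for k in range(140)]  # made-up data

means = daily_means(times, values)
print(len(means))             # number of daily values
print(means[start.date()])    # mean of 0, 1, 2, 3
```

Each calendar day holds the 03:00, 09:00, 15:00, and 21:00 time stamps, so every daily mean is built from exactly four values.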

I print out the results of the regrid just to make sure everything is running okay.

In [45]:
tauy_ds = []

# target grid: regular 0.25 degree lat-lon (the same for every month)
new_lat = np.arange(-90, 90.25, 0.25)
new_lon = np.arange(0, 360.25, 0.25)
regrid = xr.Dataset({'lat': (['lat'], new_lat),
                     'lon': (['lon'], new_lon)})

for i in data:
    # load one forecast into memory
    tauy = i.tauy.load()

    # 6-hourly -> daily means
    tauy = tauy.resample(time='D').mean()

    # tripolar -> regular lat-lon
    regridder = xe.Regridder(tauy, regrid, 'bilinear', periodic=True)
    tauy = regridder(tauy)

    # name and attributes for the output file
    tauy = tauy.rename('tauy')
    tauy.attrs['units'] = 'Pa'
    tauy.attrs['name'] = 'Meridional Wind Stress'

    print(tauy)

    tauy_ds.append(tauy)




<xarray.DataArray 'tauy' (time: 35, lat: 721, lon: 1441)>
array([[[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        ...,
        [-0.04434917, -0.04508682, -0.03978698, ..., -0.02180599,
         -0.03446795, -0.04434917],
        [-0.03528704, -0.03670482, -0.03393634, ..., -0.02028754,
         -0.02870704, -0.03528704],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan]],

       [[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
...
     



<xarray.DataArray 'tauy' (time: 35, lat: 721, lon: 1441)>
array([[[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        ...,
        [-0.0270551 , -0.02710105, -0.02331325, ..., -0.01677369,
         -0.02298932, -0.0270551 ],
        [-0.02392963, -0.02234001, -0.01881228, ..., -0.01793003,
         -0.02256072, -0.02392963],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan]],

       [[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
...
     



<xarray.DataArray 'tauy' (time: 35, lat: 721, lon: 1441)>
array([[[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        ...,
        [-0.11958265, -0.12093299, -0.12897582, ..., -0.11491521,
         -0.11903284, -0.11958265],
        [-0.1271704 , -0.13424598, -0.14123721, ..., -0.12661146,
         -0.1285263 , -0.1271704 ],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan]],

       [[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
...
     



<xarray.DataArray 'tauy' (time: 35, lat: 721, lon: 1441)>
array([[[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        ...,
        [-0.05046664, -0.05427557, -0.05354462, ..., -0.03606118,
         -0.0437356 , -0.05046664],
        [-0.04630261, -0.04948849, -0.04947748, ..., -0.03309929,
         -0.04022257, -0.04630261],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan]],

       [[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
...
     



<xarray.DataArray 'tauy' (time: 35, lat: 721, lon: 1441)>
array([[[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        ...,
        [-0.03460768, -0.03683424, -0.0395206 , ..., -0.02499877,
         -0.0306513 , -0.03460768],
        [-0.03301555, -0.03544063, -0.03764489, ..., -0.02227001,
         -0.02867498, -0.03301555],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan]],

       [[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
...
     



<xarray.DataArray 'tauy' (time: 35, lat: 721, lon: 1441)>
array([[[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        ...,
        [-0.00704531, -0.00875491, -0.01091996, ..., -0.00779193,
         -0.00706464, -0.00704531],
        [-0.00803525, -0.00689333, -0.00725865, ..., -0.01064076,
         -0.01010047, -0.00803525],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan]],

       [[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
...
     



<xarray.DataArray 'tauy' (time: 35, lat: 721, lon: 1441)>
array([[[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        ...,
        [-0.0010156 , -0.0021597 , -0.00359746, ..., -0.00045127,
         -0.0004549 , -0.0010156 ],
        [-0.00058169, -0.00106636, -0.00205157, ..., -0.00025187,
         -0.00033328, -0.00058169],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan]],

       [[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
...
     



<xarray.DataArray 'tauy' (time: 35, lat: 721, lon: 1441)>
array([[[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        ...,
        [ 0.00509659,  0.00479363,  0.00543724, ...,  0.0079796 ,
          0.00635948,  0.00509659],
        [ 0.00713181,  0.00647253,  0.00665404, ...,  0.01009113,
          0.00860981,  0.00713181],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan]],

       [[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
...
     



<xarray.DataArray 'tauy' (time: 35, lat: 721, lon: 1441)>
array([[[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        ...,
        [-0.01162241, -0.01031824, -0.0095852 , ..., -0.01227439,
         -0.01229501, -0.01162241],
        [-0.0172869 , -0.01629017, -0.01500601, ..., -0.01741893,
         -0.01756516, -0.0172869 ],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan]],

       [[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
...
     



<xarray.DataArray 'tauy' (time: 35, lat: 721, lon: 1441)>
array([[[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        ...,
        [ 0.15245084,  0.15281777,  0.15440355, ...,  0.14799473,
          0.15161881,  0.15245084],
        [ 0.15107481,  0.15432997,  0.1575411 , ...,  0.14911357,
          0.15012321,  0.15107481],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan]],

       [[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
...
     



<xarray.DataArray 'tauy' (time: 35, lat: 721, lon: 1441)>
array([[[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        ...,
        [-0.01601858, -0.01889497, -0.02008468, ..., -0.006675  ,
         -0.01167454, -0.01601858],
        [-0.01478155, -0.01659759, -0.01724824, ..., -0.0075727 ,
         -0.0115095 , -0.01478155],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan]],

       [[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
...
     



<xarray.DataArray 'tauy' (time: 35, lat: 721, lon: 1441)>
array([[[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        ...,
        [ 0.01883476,  0.01539223,  0.01415206, ...,  0.03543208,
          0.02576859,  0.01883476],
        [ 0.01928522,  0.01362881,  0.01003331, ...,  0.03284436,
          0.02629942,  0.01928522],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan]],

       [[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
...
     

Here I print the appended data to make sure it's correct (this output is from the SST run).

In [81]:
SST_ds

[<xarray.DataArray 'SST' (time: 35, lat: 721, lon: 1441)>
 array([[[       nan,        nan,        nan, ...,        nan,
                 nan,        nan],
         [       nan,        nan,        nan, ...,        nan,
                 nan,        nan],
         [       nan,        nan,        nan, ...,        nan,
                 nan,        nan],
         ...,
         [-1.7855469, -1.7294363, -1.6925484, ..., -1.8333464,
          -1.8240175, -1.7855469],
         [-1.794058 , -1.7396836, -1.7169994, ..., -1.8190552,
          -1.8209257, -1.794058 ],
         [       nan,        nan,        nan, ...,        nan,
                 nan,        nan]],
 
        [[       nan,        nan,        nan, ...,        nan,
                 nan,        nan],
         [       nan,        nan,        nan, ...,        nan,
                 nan,        nan],
         [       nan,        nan,        nan, ...,        nan,
                 nan,        nan],
 ...
         [-1.8188515, -1.8047773, -1.7

## Save the data to your computer

The last place it gets clunky. Again, I couldn't figure out a way to make this smoother, but feel free to mess around with it and make it more functional.

I define the date for each file. 

In [46]:
years = ['2014']

months = [f'{month:02d}' for month in range(1, 13)]
print(months)

day_init = ['01'] # or ['15']

['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12']


Write the dates using an f-string.

In [47]:
date_for_file = [f'{year}{month}{day}' for year in years for month in months for day in day_init]
print(date_for_file)

['20140101', '20140201', '20140301', '20140401', '20140501', '20140601', '20140701', '20140801', '20140901', '20141001', '20141101', '20141201']


I wrote a function that saves the data for the date given. The variable should be changed to whichever variable you are using at the time.

In [48]:
def save_as(path, dataset, date_range, encoding):
    """
    Given a path, the dataset, the date range, and the encoding: will write out the data as a netcdf and save to the path given.
    
    Make sure to change the variable!
    
    The structure of the function return should have the format:
    return(dataset.to_netcdf((path)+(variable: taux, tauy, SST, SSU, SSV)+"_"+(date_range)+".nc", encoding=encoding))
    """
    
    return(dataset.to_netcdf((path)+"tauy_"+(date_range)+".nc", encoding=encoding))


Then I run the function as a for loop over the date_for_file dates for each object in the appended dataset of resampled and regridded data.

Again, the variable should be changed to the one you're using (I use it in my path, the dataset, and the encoding). If you have your data organized differently, change the path to match your own structure.

In [49]:
for i in range(len(date_for_file)):
    save_as(path= "/scratch/ldoyle4/ufs8/daily/mean/tauy/",
            dataset = tauy_ds[i],
            date_range = date_for_file[i],
            encoding = {'lat': {'dtype':'float32','_FillValue': None},
                'lon': {'dtype':'float32','_FillValue': None},
                'time': {'dtype':'int32'},
                'tauy':{'dtype':'float32'}
                })
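To avoid editing the function body, the path, and the encoding every time the variable changes, the variable name can be passed in as a parameter. The sketch below only builds the file name and the encoding dictionary, so it runs without any data; `output_path` and `make_encoding` are hypothetical helpers, and the encoding settings are copied from the cell above.

```python
import os

def output_path(directory, variable, date):
    """Build '<directory>/<variable>_<date>.nc'."""
    return os.path.join(directory, f'{variable}_{date}.nc')

def make_encoding(variable):
    """Encoding dict with the same settings as the cell above, keyed by variable name."""
    return {
        'lat': {'dtype': 'float32', '_FillValue': None},
        'lon': {'dtype': 'float32', '_FillValue': None},
        'time': {'dtype': 'int32'},
        variable: {'dtype': 'float32'},
    }

print(output_path('/scratch/ldoyle4/ufs8/daily/mean/SST/', 'SST', '20140301'))
print(sorted(make_encoding('SST')))
```

Inside the save loop, the call would then look like `SST_ds[i].to_netcdf(output_path(path, 'SST', date_for_file[i]), encoding=make_encoding('SST'))`, assuming the SST run and an existing output folder.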

## Sanity check

Make sure the data saved correctly by reading it back in. It should contain 35 time steps, the lat and lon you regridded to, and the units and variable name.

In [50]:
path = '/scratch/ldoyle4/ufs8/daily/mean/tauy/'
fname = 'tauy_20140301.nc'
ds = xr.open_dataset(path+fname)
ds
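The same checks can be written down so they do not rely on reading the printed output. This is a sketch under the grid choices used in this notebook (35 days, 721 latitudes, 1441 longitudes); `check_saved` is a hypothetical helper that takes plain mappings, so it can be tried here with dictionaries.

```python
def check_saved(sizes, attrs, n_time=35, n_lat=721, n_lon=1441):
    """Return a list of problems found; an empty list means the file looks right."""
    problems = []
    expected = {'time': n_time, 'lat': n_lat, 'lon': n_lon}
    for dim, n in expected.items():
        if sizes.get(dim) != n:
            problems.append(f'{dim}: expected {n}, found {sizes.get(dim)}')
    for key in ('units', 'name'):
        if key not in attrs:
            problems.append(f'missing attribute: {key}')
    return problems

# Dictionaries standing in for ds.sizes and ds['tauy'].attrs
good = check_saved({'time': 35, 'lat': 721, 'lon': 1441},
                   {'units': 'Pa', 'name': 'Meridional Wind Stress'})
bad = check_saved({'time': 34, 'lat': 721, 'lon': 1441}, {'units': 'Pa'})
print(good)
print(bad)
```

On the dataset opened above, the call would be `check_saved(ds.sizes, ds['tauy'].attrs)`.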