# `NCTiles.jl` Creates Files With Meta Data

[NCTiles.jl](https://gaelforget.github.io/NCTiles.jl/dev/) creates [NetCDF](https://en.wikipedia.org/wiki/NetCDF) files that follow the [CF Metadata Conventions](http://cfconventions.org). It can be used either (1) in stand-alone mode or (2) in combination with [MeshArrays.jl](https://juliaclimate.github.io/MeshArrays.jl/dev/). The examples below include:

1. Writing mapped model output, on a regular `lat-lon` grid, to a single `NetCDF` file
  - 2D example
  - 3D example
2. Writing tiled model output, on `C-grid` subdomains, to a collection of `NetCDF` files
  - 2D surface example
  - 3D temperature example
  - 3D staggered vector example

### Packages & Helper Functions

_These will be used throughout the notebook_

In [1]:
if false
    using Pkg
    Pkg.add(PackageSpec(name="NCTiles", rev="master"))
    Pkg.add(PackageSpec(name="MITgcmTools", rev="master"))
end

using NCTiles
include("helper_functions.jl");

### File Paths & I/O Back-End

_These will be used throughout the notebook_

In [2]:
# File Paths
inputs = "../inputs/nctiles-testcases/"
get_testcases_if_needed(inputs)
pth=input_file_paths(inputs)

outputs = "../outputs/nctiles-newfiles/"
if ~ispath(outputs); mkpath(outputs); end

# I/O Back-End
nc=NCTiles.NCDatasets

NCDatasets

## Interpolated Data Examples


This example uses 2D and 3D model output that has been interpolated to a rectangular half-degree grid. It reads the data from binary files, adds meta data, and then writes it all to a single `NetCDF` file per model variable. 
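
As a quick orientation, the axes of such a grid can be written as plain Julia ranges. The sketch below is only a consistency check on the sizes and units that appear in the `grid_etc_interp` output further down, not a description of how that helper works. The variable names and the conversion of `tim` to calendar dates are illustrative assumptions.

```julia
using Dates

# Cell-center coordinates of a global half-degree grid
lon_c = -179.75:0.5:179.75
lat_c = -89.75:0.5:89.75
(length(lon_c), length(lat_c))   # (720, 360), matching "n1" and "n2"

# Time axis in "days since 1992-01-01", one value per input file
tim = 1.5:3.0:7.5
t0 = DateTime(1992, 1, 1)
# Assumed conversion: fractional days -> milliseconds added to the reference date
dates = [t0 + Millisecond(round(Int, 86_400_000 * t)) for t in tim]
dates[1]                         # 1992-01-02T12:00:00
```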

First, we need to define coordinate variables, array sizes, and meta data:

In [3]:
writedir = joinpath(outputs,"interp") #output files path
if ~ispath(writedir); mkpath(writedir); end

Γ = grid_etc_interp(pth) #dimensions, sizes, and meta data

Dict{String,Any} with 9 entries:
  "lon_c"  => NCvar("lon_c", "degrees_east", (720,), -179.75:0.5:179.75, Dict("…
  "dep_l"  => NCvar("dep_l", "m", (50,), Float32[10.0, 20.0, 30.0, 40.0, 50.0, …
  "dep_c"  => NCvar("dep_c", "m", (50,), Float32[5.0, 15.0, 25.0, 35.0, 45.0, 5…
  "lat_c"  => NCvar("lat_c", "degrees_north", (360,), -89.75:0.5:89.75, Dict("l…
  "n1"     => 720
  "readme" => ["Please replace this placeholder file with a descriptive", "para…
  "tim"    => NCvar("tim", "days since 1992-01-01 0:0:0", Inf, 1.5:3.0:7.5, Dic…
  "n2"     => 360
  "n3"     => 50

### 2D example

Choose the variable to process and get the corresponding list of input files.

In [4]:
prec = Float32
dataset = "state_2d_set1"
fldname = "ETAN"
flddatadir = joinpath(pth["interp"],fldname)
fnames = joinpath.(Ref(flddatadir),filter(x -> occursin(".data",x), readdir(flddatadir)))

3-element Array{String,1}:
 "../inputs/nctiles-testcases/diags_interp/ETAN/ETAN.0000000732.data"
 "../inputs/nctiles-testcases/diags_interp/ETAN/ETAN.0000001428.data"
 "../inputs/nctiles-testcases/diags_interp/ETAN/ETAN.0000002172.data"

Get the meta data for the chosen variable.

In [5]:
diaginfo = readAvailDiagnosticsLog(pth["diaglist"],fldname)

Dict{String,Any} with 7 entries:
  "mate"    => ""
  "units"   => "m"
  "diagNum" => 23
  "fldname" => "ETAN"
  "title"   => "Surface Height Anomaly"
  "code"    => "SM      M1"
  "levs"    => 1

Define:

- a `BinData` struct that holds the file names, precision, and array size.
- an `NCvar` struct that sets up the subsequent `write` operation (it includes the `BinData` struct).

In [6]:
flddata = BinData(fnames,prec,(Γ["n1"],Γ["n2"]))
dims = [Γ["lon_c"],Γ["lat_c"],Γ["tim"]]
field = NCvar(fldname,diaginfo["units"],dims,flddata,
    Dict("long_name" => diaginfo["title"]),nc)

NCvar("ETAN", "m", NCvar[NCvar("lon_c", "degrees_east", (720,), -179.75:0.5:179.75, Dict("long_name" => "longitude"), NCDatasets), NCvar("lat_c", "degrees_north", (360,), -89.75:0.5:89.75, Dict("long_name" => "longitude"), NCDatasets), NCvar("tim", "days since 1992-01-01 0:0:0", Inf, 1.5:3.0:7.5, Dict("long_name" => "time","standard_name" => "time"), NCDatasets)], BinData(["../inputs/nctiles-testcases/diags_interp/ETAN/ETAN.0000000732.data", "../inputs/nctiles-testcases/diags_interp/ETAN/ETAN.0000001428.data", "../inputs/nctiles-testcases/diags_interp/ETAN/ETAN.0000002172.data"], Float32, (720, 360), 1), Dict("long_name" => "Surface Height Anomaly"), NCDatasets)
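
The `BinData` struct shown above records only file names, precision, and array size, so the data stay on disk until they are written. As a rough sketch of what reading one such record involves (not NCTiles.jl's own reader), the following writes a small big-endian `Float32` array to a temporary file and reads it back. Big-endian byte order is an assumption based on MITgcm's default output, and `read_record` is a hypothetical helper.

```julia
# Hypothetical helper: read one big-endian record of element type T and shape dims
function read_record(fname::AbstractString, ::Type{T}, dims::Dims) where {T}
    A = Array{T}(undef, dims)
    open(fname, "r") do io
        read!(io, A)          # raw bytes, still in file (big-endian) order
    end
    return ntoh.(A)           # convert to host byte order
end

# Round trip on a small stand-in for the (720, 360) fields
dims = (4, 3)
orig = reshape(Float32.(1:prod(dims)), dims)
fname = tempname()
write(fname, hton.(orig))     # store as big-endian
fld = read_record(fname, Float32, dims)
fld == orig                   # true
rm(fname)
```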

Create the NetCDF file and write data to it.

In [7]:
# Create the NetCDF file and populate with dimension and field info
ds,fldvar,dimlist = createfile(joinpath(writedir,fldname*".nc"),field,Γ["readme"])

# Add field and dimension data
addData(fldvar,field)
addDimData.(Ref(ds),field.dims)

# Close the file
close(ds)

### 3D example

In [9]:
# Get the filenames for this variable and other information about the field.
dataset = "WVELMASS"
fldname = "WVELMASS"
flddatadir = joinpath(pth["interp"],fldname)
fnames = joinpath.(Ref(flddatadir),filter(x -> occursin(".data",x), readdir(flddatadir)))
diaginfo = readAvailDiagnosticsLog(pth["diaglist"],fldname)

# Define the field for writing using an NCvar struct. BinData contains the filenames
# where the data sits so it's only loaded when needed.
flddata = BinData(fnames,prec,(Γ["n1"],Γ["n2"],Γ["n3"]))
dims = [Γ["lon_c"],Γ["lat_c"],Γ["dep_l"],Γ["tim"]]
field = NCvar(fldname,diaginfo["units"],dims,flddata,Dict("long_name" => diaginfo["title"]),nc)

# Create the NetCDF file and populate with dimension and field info
ds,fldvar,dimlist = createfile(joinpath(writedir,fldname*".nc"),field,Γ["readme"])

# Add field and dimension data
addData(fldvar,field)
addDimData.(Ref(ds),field.dims)

# Close the file
close(ds)

## Tiled Data Examples

This example reads in global variables defined over a collection of subdomain arrays (_tiles_) using `MeshArrays.jl`, and writes them to a collection of `NetCDF` files (_nctiles_) using `NCTiles.jl`.

First, we need to define coordinate variables, array sizes, and meta data:

In [10]:
writedir = joinpath(outputs,"tiled")
~ispath(writedir) ? mkpath(writedir) : nothing

Γ=grid_etc_native(pth);

### 2D example

Choose the variable to process and get the corresponding list of input files.

In [11]:
prec = Float32
dataset = "state_2d_set1"
fldname = "ETAN"
fnames = pth["native"]*'/'.*filter(x -> (occursin(".data",x) && occursin(dataset,x)), readdir(pth["native"]))
savepath = joinpath(writedir,fldname)
if ~ispath(savepath); mkpath(savepath); end
savenamebase = joinpath.(Ref(savepath),fldname)
diaginfo = readAvailDiagnosticsLog(pth["diaglist"],fldname);

In [12]:
flddata = BinData(fnames,prec,(Γ["n1"],Γ["n2"]))

BinData(["../inputs/nctiles-testcases/diags//state_2d_set1.0000000732.data", "../inputs/nctiles-testcases/diags//state_2d_set1.0000001428.data", "../inputs/nctiles-testcases/diags//state_2d_set1.0000002172.data"], Float32, (90, 1170), 1)
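
The array size `(90, 1170)` in the output above equals thirteen 90×90 blocks stacked along the second dimension (1170 = 13 × 90). The sketch below illustrates the general idea of cutting such an array into fixed-size tiles using array views. It is a simplified illustration with plain arrays: the tile size is an assumed value (the notebook takes it from `Γ["tilesize"]`), `split_tiles` is a hypothetical helper, and the actual `TileData` and `MeshArrays.jl` machinery also has to account for the layout of the grid.

```julia
# Hypothetical helper: cut a 2D array into non-overlapping tiles of size (ni, nj)
function split_tiles(A::AbstractMatrix, ni::Int, nj::Int)
    n1, n2 = size(A)
    (n1 % ni == 0 && n2 % nj == 0) || error("tile size must divide array size")
    # Views avoid copying; tiles are ordered with the first dimension varying fastest
    return [view(A, i:i+ni-1, j:j+nj-1) for j in 1:nj:n2 for i in 1:ni:n1]
end

A = reshape(Float32.(1:90*1170), 90, 1170)
tiles = split_tiles(A, 90, 90)   # assumed tile size
length(tiles)                    # 13
size(tiles[1])                   # (90, 90)
```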

Prepare a dictionary of `NCvar` structs and write it to `NetCDF` files.

In [13]:
flddata = BinData(fnames,prec,(Γ["n1"],Γ["n2"]))
tilfld = TileData(flddata,Γ["tilesize"],Γ["mygrid"])
numtiles = Γ["numtiles"]

dims = [Γ["icvar"],Γ["jcvar"],Γ["tim"]]
coords = join(replace([dim.name for dim in dims],"i_c" => "lon", "j_c" => "lat")," ")
flds = Dict([fldname => NCvar(fldname,diaginfo["units"],dims,tilfld,Dict(["long_name" => diaginfo["title"], "coordinates" => coords]),nc),
            "lon" => Γ["loncvar"],
            "lat" => Γ["latcvar"],
            "area" => Γ["areacvar"],
            "land" => Γ["land2Dvar"]
])

writeNetCDFtiles(flds,savenamebase,Γ["readme"])
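
The `coordinates` attribute built in the cell above is a space-separated list of names in which the index dimensions are swapped for the longitude and latitude variables. A minimal illustration, assuming the dimension names are `i_c`, `j_c`, and `tim` (inferred from the `replace` pairs and the earlier output, not confirmed by the source):

```julia
dimnames = ["i_c", "j_c", "tim"]   # assumed names of the NCvar dimensions
coords = join(replace(dimnames, "i_c" => "lon", "j_c" => "lat"), " ")
# coords == "lon lat tim"
```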

### 3D example

In [14]:
# Get the filenames for this dataset and other information about the field.
prec = Float32
dataset = "state_3d_set1"
fldname = "THETA"
fnames = pth["native"]*'/'.*filter(x -> (occursin(".data",x) && occursin(dataset,x)), readdir(pth["native"]))
savepath = joinpath(writedir,fldname)
if ~ispath(savepath); mkpath(savepath); end
savenamebase = joinpath.(Ref(savepath),fldname)
diaginfo = readAvailDiagnosticsLog(pth["diaglist"],fldname)

# Fields to be written to the file are indicated with a dictionary of NCvar structs.
flddata = BinData(fnames,prec,(Γ["n1"],Γ["n2"],Γ["n3"]))
dims = [Γ["icvar"],Γ["jcvar"],Γ["dep_c"],Γ["tim"]]
tilfld = TileData(flddata,Γ["tilesize"],Γ["mygrid"])
coords = join(replace([dim.name for dim in dims],"i_c" => "lon", "j_c" => "lat")," ")
flds = Dict([fldname => NCvar(fldname,diaginfo["units"],dims,tilfld,Dict(["long_name" => diaginfo["title"], "coordinates" => coords]),nc),
            "lon" => Γ["loncvar"],
            "lat" => Γ["latcvar"],
            "area" => Γ["areacvar"],
            "land" => Γ["land3Dvar"],
            "thic" => Γ["thiccvar"]
])

# Write to NetCDF files
writeNetCDFtiles(flds,savenamebase,Γ["readme"])

### 3D vector example

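Here we process the three components of a vector field (`UVELMASS`, `VVELMASS`, and `WVELMASS`). On a `C-grid`, these components are staggered in space: each one is defined at a different location within a grid cell, so each uses its own set of coordinate variables.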
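
The vertical part of this staggering can be seen in the depth values printed earlier: `dep_c` starts at 5, 15, 25 and `dep_l` at 10, 20, 30, which is consistent with layer centers and lower interfaces. In the cells below, `WVELMASS` uses `dep_l` while `UVELMASS` and `VVELMASS` use `dep_c`. A minimal sketch, assuming five uniform 10 m layers (only the first five values are visible in the truncated output, and the thicknesses of the 50 actual levels are not shown):

```julia
thick = fill(10.0f0, 5)          # assumed layer thicknesses (m)
dep_l = cumsum(thick)            # lower interfaces: 10, 20, 30, 40, 50
dep_c = dep_l .- thick ./ 2      # layer centers:     5, 15, 25, 35, 45
```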

First component: `UVELMASS`

In [15]:
# Get the filenames for this dataset and other information about the field
prec = Float32
dataset = "trsp_3d_set1"
fldname = "UVELMASS"
fnames = pth["native"]*'/'.*filter(x -> (occursin(".data",x) && occursin(dataset,x)), readdir(pth["native"]))
savepath = joinpath(writedir,fldname)
if ~ispath(savepath); mkpath(savepath); end
savenamebase = joinpath.(Ref(savepath),fldname)
diaginfo = readAvailDiagnosticsLog(pth["diaglist"],fldname)

# Define the field. BinData stores only the file names, so the data is loaded only when needed.
flddata = BinData(fnames,prec,(Γ["n1"],Γ["n2"],Γ["n3"]))
dims = [Γ["iwvar"],Γ["jwvar"],Γ["dep_c"],Γ["tim"]]
tilfld = TileData(flddata,Γ["tilesize"],Γ["mygrid"])
coords = join(replace([dim.name for dim in dims],"i_w" => "lon", "j_w" => "lat")," ")
flds = Dict([fldname => NCvar(fldname,diaginfo["units"],dims,tilfld,Dict(["long_name" => diaginfo["title"], "coordinates" => coords]),nc),
            "lon" => Γ["lonwvar"],
            "lat" => Γ["latwvar"],
            "area" => Γ["areawvar"],
            "land" => Γ["landwvar"],
            "thic" => Γ["thiccvar"]
        ])

writeNetCDFtiles(flds,savenamebase,Γ["readme"])

Second component: `VVELMASS`

In [16]:
# Get the filenames for this dataset and other information about the field
prec = Float32
dataset = "trsp_3d_set1"
fldname = "VVELMASS"
fnames = pth["native"]*'/'.*filter(x -> (occursin(".data",x) && occursin(dataset,x)), readdir(pth["native"]))
savepath = joinpath(writedir,fldname)
if ~ispath(savepath); mkpath(savepath); end
savenamebase = joinpath.(Ref(savepath),fldname)
diaginfo = readAvailDiagnosticsLog(pth["diaglist"],fldname)

# Define the field. BinData stores only the file names, so the data is loaded only when needed.
flddata = BinData(fnames,prec,(Γ["n1"],Γ["n2"],Γ["n3"]))
dims = [Γ["isvar"],Γ["jsvar"],Γ["dep_c"],Γ["tim"]]
tilfld = TileData(flddata,Γ["tilesize"],Γ["mygrid"])
coords = join(replace([dim.name for dim in dims],"i_s" => "lon", "j_s" => "lat")," ")
flds = Dict([fldname => NCvar(fldname,diaginfo["units"],dims,tilfld,Dict(["long_name" => diaginfo["title"], "coordinates" => coords]),nc),
            "lon" => Γ["lonsvar"],
            "lat" => Γ["latsvar"],
            "area" => Γ["areasvar"],
            "land" => Γ["landsvar"],
            "thic" => Γ["thiccvar"]
])

writeNetCDFtiles(flds,savenamebase,Γ["readme"])

Third component: `WVELMASS`

In [17]:
# Get the filenames for this dataset and other information about the field
prec = Float32
dataset = "trsp_3d_set1"
fldname = "WVELMASS"
fnames = pth["native"]*'/'.*filter(x -> (occursin(".data",x) && occursin(dataset,x)), readdir(pth["native"]))
savepath = joinpath(writedir,fldname)
if ~ispath(savepath); mkpath(savepath); end
savenamebase = joinpath.(Ref(savepath),fldname)
diaginfo = readAvailDiagnosticsLog(pth["diaglist"],fldname)

# Define the field. BinData stores only the file names, so the data is loaded only when needed.
flddata = BinData(fnames,prec,(Γ["n1"],Γ["n2"],Γ["n3"]))
dims = [Γ["icvar"],Γ["jcvar"],Γ["dep_l"],Γ["tim"]]
tilfld = TileData(flddata,Γ["tilesize"],Γ["mygrid"])
coords = join(replace([dim.name for dim in dims],"i_c" => "lon", "j_c" => "lat")," ")
flds = Dict([fldname => NCvar(fldname,diaginfo["units"],dims,tilfld,Dict(["long_name" => diaginfo["title"], "coordinates" => coords]),nc),
            "lon" => Γ["loncvar"],
            "lat" => Γ["latcvar"],
            "area" => Γ["areacvar"],
            "land" => Γ["land3Dvar"],
            "thic" => Γ["thiclvar"]
])

writeNetCDFtiles(flds,savenamebase,Γ["readme"])