# Reading WRF data into Xarray and Calculating CAPE

The ***typical*** workflow in the Python ecosystem for Weather Research and Forecasting (WRF) data uses the [wrf-python](https://wrf-python.readthedocs.io/en/latest/) package. Traditionally, it has been difficult to use the `xarray` data model with WRF data; doing so requires the following steps:
- Read the data into wrf-python
- Calculate your diagnostics
- Convert to an xarray dataset

In this example, we show how you can use the ***extremely experimental*** package `xWRF`, together with the new experimental package `xCAPE`, to read in WRF data and apply a calculation. We will also contrast this with the previous implementation, providing a timing comparison between the two.

Again, the stress here is on **experimental**: this is a proof of concept, not meant to be used directly in workflows, but rather to show what is ***possible*** given further development.

## Imports
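The workflow relies on only a few packages: `xwrf`, `xcape`, `dask`, and `xarray`. The cell below also pulls in `metpy` for unit handling and `ncar_jobqueue` to request a cluster.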

In [66]:
import glob

import xarray as xr
import xwrf
from distributed import Client
from metpy.units import units
from ncar_jobqueue import NCARCluster

## Spin up a Cluster

In [17]:
cluster = NCARCluster()
cluster.scale(10)
client = Client(cluster)
client

Client
- Connection method: Cluster object
- Cluster type: PBSCluster
- Dashboard: https://jupyterhub.hpc.ucar.edu/stable/user/mgrover/proxy/8787/status

Cluster info
- Workers: 0
- Total threads: 0
- Total memory: 0 B

Scheduler
- Comm: tcp://10.12.206.37:33972
- Started: Just now


## Grab a list of files

In [7]:
files = sorted(glob.glob('/glade/scratch/bruyerec/IAG/METGRID/*.nc'))

In [25]:
file_subset = files[-80:]

## Examine one of the files
We can open one of the files and inspect which variables it contains, along with the description of each variable.

In [57]:
wrf_ds = xr.open_dataset(files[0])
wrf_ds

Well, ***that*** wasn't super helpful... let's take a look at what's in those variables.

In [60]:
for var in wrf_ds:
    try:
        print(f'variable: {var}, description: {wrf_ds[var].description}')
    except AttributeError:
        # Skip variables that have no description attribute
        pass

variable: PRES, description: 
variable: SOIL_LAYERS, description: 
variable: SM, description: 
variable: ST, description: 
variable: GHT, description: Height
variable: SM100289, description: Soil moisture of 100-289 cm ground layer
variable: SM028100, description: Soil moisture of 28-100 cm ground layer
variable: SM007028, description: Soil moisture of 7-28 cm ground layer
variable: SM000007, description: Soil moisture of 0-7 cm ground layer
variable: ST100289, description: T of 100-289 cm ground layer
variable: ST028100, description: T of 28-100 cm ground layer
variable: ST007028, description: T of 7-28 cm ground layer
variable: ST000007, description: T of 0-7 cm ground layer
variable: SNOWH, description: Physical Snow Depth
variable: SNOW, description: Water Equivalent of Accumulated Snow Depth
variable: SST, description: Sea-Surface Temperature
variable: SEAICE, description: Sea-Ice Fraction
variable: SKINTEMP, description: Sea-Surface Temperature
variable: PMSL, description: Sea-le

We are only interested in the variables required for calculating Convective Available Potential Energy (CAPE) here, which are:
- Temperature
- Pressure
- Dewpoint

You'll notice here that Dewpoint **is not** one of the variables written out of the model, but we **can** calculate it given the following variables:
- Temperature
- Relative Humidity

## Read in the Dataset

In [55]:
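%%time
# Keep only the fields needed for CAPE: pressure, temperature, and relative humidity
variables = ["PRES", "TT", "RH"]


def preprocess(ds):
    # Subset each file to the required variables before the files are combined
    return ds[variables]


ds = xr.open_mfdataset(
    file_subset,
    engine="xwrf",
    parallel=True,  # open the files in parallel on the Dask cluster
    concat_dim="Time",  # stack the files along the Time dimension
    combine="nested",
    preprocess=preprocess,
    chunks={'Time': 80},
)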

CPU times: user 5.18 s, sys: 845 ms, total: 6.02 s
Wall time: 11.9 s


## Convert the Units
For this calculation, `xCAPE` requires its inputs in the following units:
- Pressure (mb)
- Temperature (degC)
- Dewpoint (degC)
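The conversions themselves are simple arithmetic. The sketch below assumes pressure arrives in Pa (consistent with the `units.Pa` conversion used in this notebook) and temperature in kelvin; the kelvin assumption, the helper names, and the sample values are ours for illustration and should be checked against the file metadata.

```python
import numpy as np

PA_PER_MB = 100.0       # 1 mb (= 1 hPa) is 100 Pa
KELVIN_OFFSET = 273.15  # 0 degC expressed in kelvin


def pa_to_mb(pressure_pa):
    """Convert pressure from Pa to mb (hPa)."""
    return pressure_pa / PA_PER_MB


def kelvin_to_degc(temp_k):
    """Convert temperature from kelvin to degC."""
    return temp_k - KELVIN_OFFSET


# Hypothetical sample values, not read from the WRF files
pressure_mb = pa_to_mb(np.array([100000.0, 85000.0, 50000.0]))
temp_c = kelvin_to_degc(np.array([293.15, 273.15]))
print(pressure_mb, temp_c)
```

Using `metpy.units` as elsewhere in this notebook is an alternative that keeps the units attached to the values; plain arithmetic is shown here only to make the factors explicit.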

In [61]:
ds

| Arrays | Shape | Chunk shape | Bytes (array) | Bytes (chunk) | Count | Type |
|---|---|---|---|---|---|---|
| Time coordinate | (80,) | (1,) | 640 B | 8 B | 240 Tasks, 80 Chunks | datetime64[ns], numpy.ndarray |
| 2-D arrays | (378, 480) | (378, 480) | 708.75 kiB | 708.75 kiB | 395 Tasks, 1 Chunks | float32, numpy.ndarray |
| 4-D arrays | (80, 38, 378, 480) | (1, 38, 378, 480) | 2.05 GiB | 26.30 MiB | 240 Tasks, 80 Chunks | float32, numpy.ndarray |
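The sizes reported for the 4-D arrays follow directly from their shape and dtype. As a quick check, the sketch below recomputes them with the standard library only; the helper name `nbytes` is ours, and the shapes are copied from the dataset summary above.

```python
from math import prod

BYTES_PER_FLOAT32 = 4


def nbytes(shape, itemsize=BYTES_PER_FLOAT32):
    """Number of bytes needed for an array of the given shape."""
    return prod(shape) * itemsize


full_shape = (80, 38, 378, 480)   # (Time, level, south_north, west_east)
chunk_shape = (1, 38, 378, 480)   # one time step per chunk

full_gib = nbytes(full_shape) / 2**30
chunk_mib = nbytes(chunk_shape) / 2**20
n_chunks = full_shape[0] // chunk_shape[0]

print(f"{full_gib:.2f} GiB in {n_chunks} chunks of {chunk_mib:.2f} MiB")
# prints "2.05 GiB in 80 chunks of 26.30 MiB"
```

Each chunk holds a single time step, so a worker needs roughly 26 MiB per variable per task before any intermediate arrays are counted.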


In [51]:
(wrf_ds.PRES.isel(south_north=0, west_east=0).values * units.Pa).to('hPa')

0,1
Magnitude,[[1028.593994140625 1000.0 975.0 950.0 925.0 900.0 875.0 850.0 825.0  800.0 775.0 750.0 700.0 650.0 600.0 550.0 500.0 450.0 400.0 350.0 300.0  250.0 225.0 200.0 175.0 150.0 125.0 100.0 70.0 50.0 30.0 20.0 10.0 7.0  5.0 3.0 2.0 1.0]]
Units,hectopascal


In [None]:
imp