## GREMLIN dataset

The dataset is available at https://mountainscholar.org/handle/10217/235392 and is mostly licensed under CC BY.

The paper describing the U-Net they used is here: https://journals.ametsoc.org/view/journals/apme/60/1/jamc-d-20-0084.1.xml?tab_body=pdf

In [1]:
import xarray as xr  # a Python netCDF backend (e.g. netCDF4) must be installed as well
import numpy as np

import pandas as pd

import matplotlib.pyplot as plt

In [2]:
from platform import python_version

print(python_version())

3.8.10


## Loading the netCDF

In [3]:
data = 'gremlin_conus2_dataset.nc'

In [4]:
ds = xr.open_dataset(data)

In [5]:
ds

The dataset loads fine, but it lacks coordinates. Let's try to fix that. A short explainer on the xarray terminology: https://docs.xarray.dev/en/stable/user-guide/terminology.html
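One way to attach coordinates is `Dataset.assign_coords`. The sketch below uses a small synthetic dataset rather than the GREMLIN file, and the dimension names `sample`, `y`, and `x` are assumptions; the real names should be read from the output of `ds`. Plain integer indices stand in for real coordinates, since the file may not carry time or latitude/longitude information.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the real file: 4 samples of 3x3 pixels.
# The dimension names here are assumed, not taken from the GREMLIN file.
toy = xr.Dataset(
    {"GOES_ABI_C07": (("sample", "y", "x"), np.zeros((4, 3, 3), dtype="float32"))}
)

# Attach an integer index coordinate to every dimension.
toy = toy.assign_coords({dim: np.arange(size) for dim, size in toy.sizes.items()})

print(list(toy.coords))
```

With coordinates in place, label-based selection such as `toy.GOES_ABI_C07.sel(sample=2)` becomes available.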

## Variable Load

In [6]:
np.shape(ds.GOES_ABI_C07.data)

(2246, 256, 256)

In [7]:
C07 = ds.GOES_ABI_C07.data.ravel()

In [8]:
np.shape(C07)

(147193856,)

Is this the same size as the product of the original dimensions?

In [9]:
2246*256*256

147193856

Yes! Now let's make the other variables:

In [10]:
C09 = ds.GOES_ABI_C09.data.ravel()
C13 = ds.GOES_ABI_C13.data.ravel()

GLM = ds.GOES_GLM_GROUP.data.ravel()

In [11]:
MRMS = ds.MRMS_REFC.data.ravel()
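`ravel()` flattens in row-major (C) order by default, so the last axis varies fastest and each 256 × 256 image occupies one contiguous run of the flat array. A minimal sketch of mapping a flat position back to its three-dimensional index with `np.unravel_index`, using the shape printed above. It assumes the first axis indexes samples, and the flat index is an arbitrary illustrative value:

```python
import numpy as np

shape = (2246, 256, 256)  # shape of ds.GOES_ABI_C07.data, from the output above

flat_index = 65536  # arbitrary example: 256 * 256, the first pixel of the second image
sample, row, col = np.unravel_index(flat_index, shape)
print(sample, row, col)  # 1 0 0

# The inverse mapping recovers the flat position.
assert np.ravel_multi_index((sample, row, col), shape) == flat_index
```

Because all five variables are flattened the same way, the same flat index refers to the same pixel in each array.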

We can turn this into a pandas DataFrame.

In [12]:
d = {'ABI_C07': C07, 
     'ABI_C09': C09,
     'ABI_C13': C13,
     'GLM': GLM,
     'MRMS_REFC': MRMS
    }


df = pd.DataFrame(data=d)

In [14]:
df

| | ABI_C07 | ABI_C09 | ABI_C13 | GLM | MRMS_REFC |
|---|---|---|---|---|---|
| 0 | 286.763000 | 246.694229 | 280.063721 | 0.0 | -99.0 |
| 1 | 283.740601 | 246.943253 | 275.322601 | 0.0 | -99.0 |
| 2 | 279.786469 | 246.943253 | 269.778900 | 0.0 | -99.0 |
| 3 | 280.359436 | 247.517395 | 270.916687 | 0.0 | -99.0 |
| 4 | 283.418549 | 247.923325 | 274.986267 | 0.0 | -99.0 |
| ... | ... | ... | ... | ... | ... |
| 147193851 | 273.754028 | 232.555328 | 255.567947 | 0.0 | -99.0 |
| 147193852 | 270.319733 | 232.431717 | 252.671616 | 0.0 | -99.0 |
| 147193853 | 265.780182 | 232.307632 | 248.288879 | 0.0 | -99.0 |
| 147193854 | 263.070190 | 232.183105 | 246.337784 | 0.0 | -99.0 |


In [None]:
df.describe()
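
Every `MRMS_REFC` value in the preview above is -99.0, which suggests a fill value for missing radar data rather than a real reflectivity. That reading is an assumption and should be checked against the dataset documentation. If it holds, the statistics from `describe()` would be skewed by the fill value. A minimal sketch on a small made-up frame that masks it first:

```python
import numpy as np
import pandas as pd

# Made-up values for illustration; not taken from the GREMLIN file.
toy_df = pd.DataFrame({"MRMS_REFC": [-99.0, -99.0, 12.5, 30.0]})

# Assumption: -99.0 marks missing data. Replace it with NaN so that
# describe() skips those rows when computing count, mean, min, etc.
masked = toy_df.replace(-99.0, np.nan)

print(masked.describe())
```

The same `replace` call could be applied to the `MRMS_REFC` column of `df` alone, which avoids scanning the other four columns.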