In [None]:
%load_ext autoreload
%autoreload 1
%matplotlib inline

In [None]:
import pandas as pd
from ecodatatk.Spatial import NetCDFSpatial

#### a) Initialize a NetCDFSpatial instance with the file path:


In [14]:
file = r'../Example Files/netcdf/Mangueira2000_01_01to2018_01_01.nc'
dataset = NetCDFSpatial(file)


#### b) Inspect the loaded data and the variables in the file:


In [15]:
dataset.data
dataset.variables

['latitude', 'longitude', 'sp', 'u10', 'v10', 't2m', 'd2m', 'skt']

#### c) We can choose to work with only the variables of interest:

In [None]:
dataset.var_choice()
dataset.variables

If you already know the variable names, you can pass them as a list:

In [16]:
dataset.var_choice(variables = ['sp', 'u10','v10','t2m'])
dataset.variables

Index(['sp', 'u10', 'v10', 't2m', 'longitude', 'latitude', 'geometry'], dtype='object')

This method filters the dataset so that it keeps only the specified variables.

#### d) We can apply a temporal filter to the dataset:

In [17]:
# Dates must be given in Month/Day/Year format
StartDate = "01/21/2000"
EndDate = "01/21/2000"

In [18]:
dataset.temporalFilter(StartDate,EndDate)
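
As a rough illustration of what a date-range filter of this kind does, the sketch below parses Month/Day/Year strings with the standard library and keeps the timestamps that fall inside the range. It does not use ecodatatk: `filter_by_date` is a hypothetical helper, the timestamps are made up, and treating the end date as covering the whole day is an assumption rather than documented behaviour of `temporalFilter`.

```python
from datetime import datetime, timedelta

def filter_by_date(timestamps, start, end, fmt="%m/%d/%Y"):
    """Keep timestamps t with start <= t < end + 1 day (hypothetical helper)."""
    start_dt = datetime.strptime(start, fmt)
    # Assumption: the end date covers the whole day, so the upper bound
    # is midnight of the following day (exclusive).
    end_dt = datetime.strptime(end, fmt) + timedelta(days=1)
    return [t for t in timestamps if start_dt <= t < end_dt]

# Made-up timestamps around the date used in this notebook
stamps = [
    datetime(2000, 1, 20, 18),
    datetime(2000, 1, 21, 0),
    datetime(2000, 1, 21, 12),
    datetime(2000, 1, 22, 0),
]
selected = filter_by_date(stamps, "01/21/2000", "01/21/2000")
print(selected)
```

With equal start and end dates, as in the cell above, only the records of that single day are kept.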

#### e) It is also possible to perform a spatial filter using one of two methods.

The first method takes the coordinates of a point of interest, searches the dataset for the grid point closest to it, and extracts the data series at that point:

In [None]:
# Set interest point coordinates (Lon/Lat):
interest_point = [-50.75, -33.55]

In [None]:
dataset.pointFilter(interest_point)
dataset.data

The dataset returned by this method is simply the time series of the chosen variables at the grid point found.

The second method performs the spatial filter using a search radius measured from the centroid of a given shapefile:
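
To make the nearest-point idea concrete, here is a minimal sketch that is independent of ecodatatk. `nearest_point` is a hypothetical helper, the grid nodes are made up, and the use of plain Euclidean distance in degrees is an assumption; the actual search inside `pointFilter` may differ.

```python
import math

def nearest_point(grid_points, target):
    """Return the (lon, lat) grid point closest to target (hypothetical helper)."""
    lon0, lat0 = target
    # Assumption: Euclidean distance in degrees, which is adequate for
    # picking a neighbour on a small regular grid.
    return min(grid_points, key=lambda p: math.hypot(p[0] - lon0, p[1] - lat0))

# Made-up grid nodes with 0.25-degree spacing (not read from the example file)
grid = [(-50.75, -33.50), (-50.50, -33.50), (-50.75, -33.75), (-51.00, -33.50)]
interest_point = [-50.75, -33.55]

closest = nearest_point(grid, interest_point)
print(closest)
```

Once the closest node is known, the time series at that node is all that remains in the filtered data.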

In [None]:
# Read dataset again and apply a temporal filter:
dataset = NetCDFSpatial(file)
dataset.temporalFilter(StartDate,EndDate)

In [19]:
# Shapefile path:
mask_path =  r"../Example Files/shapefile/polimirim.shp"

In [20]:
# Apply the filter:
dataset.centroidFilter(mask_path, radius=1) 
# Note: the radius unit depends on the CRS; here radius=1 is equivalent to 1 degree

Invalid Shape or non-existent path.


Some datasets have a coarse grid. For small study areas, a filter that uses only the shapefile area could therefore return an empty result; searching within a radius of the centroid avoids this.
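
The effect of the search radius on a coarse grid can be sketched without ecodatatk. In the example below, `points_within_radius` is a hypothetical helper, and both the grid and the centroid are made-up values rather than anything computed from the shapefile; distances are in the same units as the coordinates (degrees here).

```python
import math

def points_within_radius(grid_points, center, radius):
    """Keep grid points whose distance to center is at most radius."""
    cx, cy = center
    return [p for p in grid_points if math.hypot(p[0] - cx, p[1] - cy) <= radius]

# Made-up coarse grid (0.5-degree spacing) and a made-up polygon centroid
grid = [
    (-53.0, -33.0), (-52.5, -33.0), (-52.0, -33.0),
    (-53.0, -33.5), (-52.5, -33.5), (-52.0, -33.5),
]
centroid = (-52.6, -33.2)

# A very small search area contains no grid node at all
tiny = points_within_radius(grid, centroid, 0.1)
# A wider radius picks up the neighbouring nodes
wide = points_within_radius(grid, centroid, 0.4)

print(tiny)
print(wide)
```

The first call returns an empty list, which mirrors the situation described above when the study area is smaller than the grid spacing.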

#### f) Since we are working with time series, we may want to resample them in time:

In [21]:
dataset.resampleDS(freq='1D', method='ffill', order=1)
dataset.data

| time | sp | u10 | v10 | t2m | longitude | latitude | geometry |
|---|---|---|---|---|---|---|---|
| 2000-01-21 | 100229.28125 | -2.094112 | -5.796221 | 298.854767 | -52.625 | -33.174999 | POINT (-52.62500 -33.17500) |
| 2000-01-21 | 100229.28125 | -2.094112 | -5.796221 | 298.854767 | -52.625 | -33.174999 | POINT (-52.62500 -33.17500) |


The resampling frequency is set with the 'freq' argument and the resampling method with the 'method' argument. The frequency can be a fraction of an hour, day or month (e.g. freq = '0.5H' resamples to 30-minute intervals). For more information on the available resampling methods, see help(NetCDFSpatial.resampleDS).
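
For readers unfamiliar with resampling, the sketch below shows the general idea with plain pandas on a made-up series. It is not a description of how `resampleDS` is implemented; it only illustrates what a 30-minute forward-fill resample (the equivalent of the '0.5H' example) and a daily aggregation produce.

```python
import pandas as pd

# Made-up hourly temperatures in kelvin (not taken from the example file)
series = pd.Series(
    [298.0, 299.0],
    index=pd.to_datetime(["2000-01-21 00:00", "2000-01-21 01:00"]),
    name="t2m",
)

# Upsample to 30-minute steps; forward fill repeats the last known value
half_hourly = series.resample("30min").ffill()

# Downsample to one value per day using the mean
daily = series.resample("1D").mean()

print(half_hourly)
print(daily)
```

Forward fill introduces no new values, whereas an interpolating method would estimate the intermediate steps instead.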


#### g) We can export the dataset to .csv format:

In [None]:
# Apply a point filter:
interest_point = [-50.75, -33.55]
a = dataset
a.pointFilter(interest_point)

In [23]:
# Export to csv:
dataset.pts2csv('output/dataset-to-csv.csv')

In [24]:
# Read csv file:
df = pd.read_csv('output/dataset-to-csv.csv', sep =';')
df.head()

|  | time | sp | u10 | v10 | t2m | longitude | latitude | geometry |
|---|---|---|---|---|---|---|---|---|
| 0 | 2000-01-21 | 100229.28 | -2.094112 | -5.796221 | 298.85477 | -52.625 | -33.174999 | POINT (-52.625 -33.17499923706055) |
| 1 | 2000-01-21 | 100229.28 | -2.094112 | -5.796221 | 298.85477 | -52.625 | -33.174999 | POINT (-52.625 -33.17499923706055) |


#### h) We can also export the data to .tiff format:

- The data is interpolated onto a rectangular raster whose bounds are defined by the user through the outputBounds argument:


    outputBounds = [upper-left longitude, upper-left latitude, lower-right longitude, lower-right latitude]

    By default, outputBounds takes the outermost points in the dataset and uses them to define the interpolation area.

- We can pass a shapefile mask to crop the resulting .tif file through the 'mask_shp' argument. A buffer around the mask area can be added through the buffer_mask (percentage) argument.

- It is possible to select the variables to export through the 'var_list' argument.
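
The ordering of outputBounds can be illustrated with a small sketch. `default_bounds` is a hypothetical helper and the points are made up; it assumes that the default bounds are simply the extreme coordinates of the dataset, which is how the description above reads, not a confirmed detail of `ptsTime2Raster`.

```python
def default_bounds(points):
    """Return [upper-left lon, upper-left lat, lower-right lon, lower-right lat]."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    # Upper-left corner: smallest longitude, largest latitude.
    # Lower-right corner: largest longitude, smallest latitude.
    return [min(lons), max(lats), max(lons), min(lats)]

# Made-up (lon, lat) points, not read from the example file
pts = [(-53.0, -33.0), (-52.5, -33.5), (-52.0, -32.75)]

bounds = default_bounds(pts)
print(bounds)
```

Passing an explicit list in this order is useful when several rasters must share exactly the same extent.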

In [None]:
# Read the dataset again:
file = r'../Example Files/netcdf/Mangueira2000_01_01to2018_01_01.nc'
dataset = NetCDFSpatial(file)

# Apply a temporal filter:
StartDate = "01/21/2000"
EndDate   =  "01/21/2000"
dataset.temporalFilter(StartDate,EndDate)

# Shapefile path:
mask_path = r"../Example Files/shapefile/polimirim.shp"

# Apply centroid filter:
dataset.centroidFilter(mask_path, radius=1) 

# Resample to daily frequency:
dataset.resampleDS(freq='1D', method='ffill', order=1)

In [None]:
import geopandas as gpd
gpd.GeoDataFrame.from_file(mask_path)

In [None]:
# Export the data:
dataset.ptsTime2Raster(
    '../Example Files/raster-output/raster_base_name',
    var_list=['sp', 't2m'],
    outputBounds=None,
    outCRS='WGS84',
    mask_shp=mask_path,
    buffer_mask=0,
)