# Example usage

## Initialize a GeslaDataset object
Place the `gesla.py` file in your working directory (or elsewhere on your path), and import the `GeslaDataset` class. Selecting and loading data files requires paths to the metadata .csv file and the directory containing the data files. Initialize a `GeslaDataset` object with these paths as follows.

In [3]:
from gesla import GeslaDataset

meta_file = "resources/gesla/GESLA3_ALL.csv"
data_path = "resources/gesla/GESLA3.0_ALL.zip"

g3 = GeslaDataset(meta_file=meta_file, data_path=data_path)
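
If initialization fails, a mistyped path is a likely cause. The following is a minimal sketch of a check you could run first. It assumes nothing about `GeslaDataset` itself, and the helper name `check_gesla_paths` is hypothetical and not part of `gesla.py`.

```python
from pathlib import Path


def check_gesla_paths(meta_file, data_path):
    """Return a list of problems found with the two paths (empty if none).

    Hypothetical helper for illustration; it is not part of gesla.py.
    """
    problems = []
    if not Path(meta_file).is_file():
        problems.append(f"metadata file not found: {meta_file}")
    if not Path(data_path).exists():
        problems.append(f"data location not found: {data_path}")
    return problems


# Report each problem before trying to build the GeslaDataset object.
for problem in check_gesla_paths("resources/gesla/GESLA3_ALL.csv",
                                 "resources/gesla/GESLA3.0_ALL.zip"):
    print(problem)
```

An empty list means both paths resolve on disk. It does not confirm that the files have the expected contents.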

## Load data from a single file
If you want to work with data from a single record, and you know the filename you want, use the function `file_to_pandas` as follows. The function returns a `pandas.DataFrame` with data and flags, and a `pandas.Series` containing metadata.

In [4]:
filename = 'degerby-deg-fin-cmems'
data, meta = g3.file_to_pandas(filename)

data

date_time,sea_level,qc_flag,use_flag
1971-01-01 00:00:00,0.028,1,1
1971-01-01 01:00:00,0.025,1,1
1971-01-01 02:00:00,0.021,1,1
1971-01-01 03:00:00,0.026,1,1
1971-01-01 04:00:00,0.023,1,1
...,...,...,...
2020-12-17 03:00:00,-0.106,1,1
2020-12-17 04:00:00,-0.100,1,1
2020-12-17 05:00:00,-0.084,1,1
2020-12-17 06:00:00,-0.079,1,1
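
Because the flags come back as ordinary columns, you can filter on them with standard pandas indexing. The sketch below uses a small synthetic DataFrame with the same column names as the output above; the values are made up for illustration. It also assumes that `use_flag == 1` marks values recommended for analysis, so check the GESLA documentation for the exact flag definitions.

```python
import pandas as pd

# Synthetic stand-in for the DataFrame returned by file_to_pandas.
# The values below are invented for illustration only.
times = pd.to_datetime([
    "1971-01-01 00:00",
    "1971-01-01 01:00",
    "1971-01-01 02:00",
])
sample = pd.DataFrame(
    {
        "sea_level": [0.028, 0.025, -9.999],
        "qc_flag": [1, 1, 3],
        "use_flag": [1, 1, 0],
    },
    index=pd.DatetimeIndex(times, name="date_time"),
)

# Keep only the rows flagged for use (assumption: 1 means "use").
usable = sample.loc[sample["use_flag"] == 1, "sea_level"]
print(usable)
```

The same boolean mask applied to the real `data` object would drop any rows that the dataset marks as not for use.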


## Load data from a list of files
If you want to work with data from multiple files, and you know the filenames you want, use the function `files_to_xarray` as follows. The function returns an `xarray.Dataset` object containing data, flags, and metadata.

In [3]:
filenames = [
    "durban-181a-zaf-uhslc",
    "dutch_harbor_ak-041b-usa-uhslc", 
    "duxbury-8446166-usa-noaa", 
]
xr_dataset = g3.files_to_xarray(filenames)
xr_dataset

## Load data from the N closest records to a lat/lon location
Load data from records close to a particular location using the function `load_N_closest` as follows. Provide a lat/lon location and the number of desired records. The function returns an `xarray.Dataset` object containing data, flags, and metadata.

Note the `UserWarning` that occurs when duplicate timestamps are encountered. The function `file_to_pandas` used to read each individual file keeps only the first of any duplicate timestamps.

In [6]:
data = g3.load_N_closest(lat=60.0, lon=20.0, N=2)
print(data.file_name.values)

['foglo_degerby-134252-fin-fmi' 'degerby-deg-fin-cmems']
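
The "keep only the first duplicate" behaviour described above can be reproduced in plain pandas. The following is a minimal sketch on synthetic data. It shows the general technique and is not necessarily how `file_to_pandas` implements it.

```python
import pandas as pd

# Synthetic series with one repeated timestamp (01:00 appears twice).
times = pd.to_datetime([
    "1971-01-01 00:00",
    "1971-01-01 01:00",
    "1971-01-01 01:00",
    "1971-01-01 02:00",
])
raw = pd.DataFrame(
    {"sea_level": [0.028, 0.025, 0.999, 0.021]},
    index=pd.DatetimeIndex(times, name="date_time"),
)

# Index.duplicated(keep="first") marks every repeat after the first
# occurrence; negating the mask keeps the first row per timestamp.
deduped = raw[~raw.index.duplicated(keep="first")]
print(deduped)
```

If you would rather keep the last value for a repeated timestamp, `keep="last"` gives the opposite choice.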


## Load data from the records in a lat/lon range
Load data from records in a rectangular lat/lon range using the function `load_lat_lon_range` as follows. Provide lat/lon extents of the range. The function returns an `xarray.Dataset` object containing data, flags, and metadata.

In [2]:
# Bounding box of the ERA5 preprocessing area.
# A range this large can exhaust memory and raise the MemoryError shown below.
south_lat = 30
north_lat = 80
west_lon = -70
east_lon = 50

data = g3.load_lat_lon_range(
    south_lat=south_lat,
    north_lat=north_lat,
    west_lon=west_lon,
    east_lon=east_lon,
)
print(data.site_name.values)



MemoryError: Unable to allocate 26.0 MiB for an array with shape (3409859,) and data type datetime64[ns]
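
One possible way around this error is to split the range into narrower longitude tiles and process them one at a time. The sketch below only computes the tile boundaries. The helper `split_lon_range` is hypothetical and not part of `gesla.py`, and whether tiling avoids the error depends on how many records fall in each tile and how much memory is available.

```python
def split_lon_range(west_lon, east_lon, n_tiles):
    """Split [west_lon, east_lon] into n_tiles equal-width (west, east) pairs.

    Hypothetical helper for illustration; it is not part of gesla.py.
    """
    width = (east_lon - west_lon) / n_tiles
    return [
        (west_lon + i * width, west_lon + (i + 1) * width)
        for i in range(n_tiles)
    ]


# Four 30-degree-wide tiles covering the longitude range -70 to 50.
tiles = split_lon_range(-70, 50, 4)
print(tiles)

# Each tile could then be loaded separately, for example:
# for west, east in tiles:
#     subset = g3.load_lat_lon_range(
#         south_lat=30, north_lat=80, west_lon=west, east_lon=east
#     )
#     # Reduce or save `subset` here before loading the next tile.
```

A record that sits exactly on a shared tile boundary may be returned for both neighbouring tiles, so check for repeated file names if you combine the results.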