further debugging
BaptisteVandecrux committed Jun 11, 2024
1 parent e0d93b5 commit a6e8807
Showing 3 changed files with 28 additions and 17 deletions.
8 changes: 6 additions & 2 deletions src/pypromice/process/get_l2.py
@@ -3,6 +3,8 @@
 from argparse import ArgumentParser
 import pypromice
 from pypromice.process.aws import AWS
+from pypromice.process.write import prepare_and_write
+from pypromice.process.load import getVars, getMeta

 def parse_arguments_l2():
     parser = ArgumentParser(description="AWS L2 processor")
@@ -55,8 +57,10 @@ def get_l2():

     # Write out level 2
     if args.outpath is not None:
-        aws.writeL2(args.outpath)
+        if aws.L2.attrs['format'] == 'raw':

@PennyHow (Member) commented on Jun 11, 2024:
I don't think this is needed. Within the AWS class, the resampling should automatically be defined. I'll just go and check this.

@PennyHow (Member) commented on Jun 11, 2024:
Yes, it does automatically define the resampling frequency when processing is performed through the aws class:

def writeArr(self, dataset, outpath):
    '''Write L3 data to .nc and .csv hourly and daily files

    Parameters
    ----------
    dataset : xarray.Dataset
        Dataset to write to file
    outpath : str
        Output directory
    '''
    f = [l.attrs['format'] for l in self.L0]
    if 'raw' in f or 'STM' in f:
        write.prepare_and_write(dataset, outpath, self.vars, self.meta, '10min')
    else:
        write.prepare_and_write(dataset, outpath, self.vars, self.meta, '60min')

So this edit should not be needed.

@BaptisteVandecrux (Author, Member) commented on Jun 11, 2024:

I really like the prepare_and_write function, and I don't see the need to maintain a separate writeL2. Here, for instance, I want to write both the 10 min data (if it is a 'raw' file) and the hourly averages to file. It would be strange to use two different functions for this.

@PennyHow (Member) commented on Jun 11, 2024:
Yes, I'm happy for you to remove the writeL2() and writeL3() class functions. They are made redundant now that we have the pypromice.process.write.prepare_and_write() function.

@PennyHow (Member) commented on Jun 11, 2024:
Do you think it is better to explicitly define the resampling frequency here in the pypromice.process.aws.AWS.writeArr() function? Or is it okay to infer the resampling frequency from the dataset attributes?

@BaptisteVandecrux (Author, Member) commented on Jun 12, 2024:

I think it is best that the frequency is specified explicitly. For example here, we want the raw data written to file at both 10 min and hourly frequency.

@PennyHow (Member) commented on Jun 12, 2024:
I'll change it so that we specify it then.

+            prepare_and_write(aws.L2, args.outpath, getVars(), getMeta(), '10min')
+        prepare_and_write(aws.L2, args.outpath, getVars(), getMeta(), '60min')


if __name__ == "__main__":
get_l2()
Expand Down
5 changes: 5 additions & 0 deletions src/pypromice/process/get_l2tol3.py
@@ -44,6 +44,11 @@ def get_l2tol3():
     # Define Level 2 dataset from file
     l2 = xr.open_dataset(args.inpath)

+    if 'bedrock' in l2.attrs.keys():
+        l2.attrs['bedrock'] = l2.attrs['bedrock'] == 'True'
+    if 'number_of_booms' in l2.attrs.keys():
+        l2.attrs['number_of_booms'] = int(l2.attrs['number_of_booms'])
+
     # Perform Level 3 processing
     l3 = toL3(l2)

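The attribute coercion added above is needed because netCDF attributes have no boolean type, so values like bedrock come back from a .nc file as the strings 'True'/'False', and integers can come back as strings too. A minimal sketch of the same logic, using a plain dict with hypothetical attribute values rather than an actual xarray dataset:

```python
def coerce_attrs(attrs):
    """Coerce string-typed netCDF attributes back to Python types."""
    # netCDF has no boolean attribute type, so 'bedrock' round-trips as
    # the string 'True'/'False'; comparing against 'True' recovers a bool.
    if 'bedrock' in attrs:
        attrs['bedrock'] = attrs['bedrock'] == 'True'
    # Numeric attributes written as strings are cast back to int.
    if 'number_of_booms' in attrs:
        attrs['number_of_booms'] = int(attrs['number_of_booms'])
    return attrs

print(coerce_attrs({'bedrock': 'False', 'number_of_booms': '2'}))
# → {'bedrock': False, 'number_of_booms': 2}
```

Note that the `== 'True'` comparison maps any other spelling (e.g. 'true') to False, so it relies on the attribute having been written with Python's str(True) formatting.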
32 changes: 17 additions & 15 deletions src/pypromice/process/join_levels.py
Expand Up @@ -24,18 +24,23 @@ def parse_arguments_join():
     return args

 def loadArr(infile):
-    if infile.split('.')[-1].lower() in 'csv':
+    print(infile)
+    if infile.split('.')[-1].lower() == 'csv':
         df = pd.read_csv(infile, index_col=0, parse_dates=True)
         ds = xr.Dataset.from_dataframe(df)

-    elif infile.split('.')[-1].lower() in 'nc':
+    elif infile.split('.')[-1].lower() == 'nc':
         ds = xr.open_dataset(infile)

     try:
-        name = ds.attrs['station_name']
+        name = ds.attrs['station_id']
     except:
         name = infile.split('/')[-1].split('.')[0].split('_hour')[0].split('_10min')[0]

+    ds.attrs['station_id'] = name
+    if 'bedrock' in ds.attrs.keys():
+        ds.attrs['bedrock'] = ds.attrs['bedrock'] == 'True'
+    if 'number_of_booms' in ds.attrs.keys():
+        ds.attrs['number_of_booms'] = int(ds.attrs['number_of_booms'])
+
     print(f'{name} array loaded from {infile}')
     return ds, name
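The change from `in 'csv'` to `== 'csv'` in loadArr fixes a real bug rather than just style: for Python strings, `in` is a substring test, so the old check accepted any extension that happens to be a substring of 'csv' or 'nc'. A quick illustration of the generic behavior (not pypromice-specific code):

```python
# For strings, `in` is a substring test, not an equality test, so the old
# `infile.split('.')[-1].lower() in 'csv'` check matched too much:
print('c' in 'csv')    # → True: 'c' is a substring of 'csv', so a '.c' file slips through
print('cs' in 'csv')   # → True
print('' in 'csv')     # → True: the empty string is a substring of everything
print('csv' == 'csv')  # → True: equality only matches the exact extension
```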

@@ -88,16 +93,13 @@ def join_levels():
     else:
         print(f'Invalid files {args.file1}, {args.file2}')
         exit()

-    # Define output directory subfolder
-    out = os.path.join(args.outpath, name)
-
-    # Resample to hourly, daily and monthly datasets and write to file
-    prepare_and_write(all_ds, out, v, m, '60min')
-    prepare_and_write(all_ds, out, v, m, '1D')
-    prepare_and_write(all_ds, out, v, m, 'M')
-
-    print(f'Files saved to {os.path.join(out, name)}...')
+    prepare_and_write(all_ds, args.outpath, v, m, '60min')

@PennyHow (Member) commented on Jun 11, 2024:
Yes, I wasn't sure if we would want all resampling performed in this step. Fine with me.

@BaptisteVandecrux (Author, Member) commented on Jun 11, 2024:

Actually, if resampling has already been done on l2_raw and l2_tx, do we need another resampling of l2_raw.combine_first(l2_tx)? I might remove it later on.
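A toy pandas illustration of the question above (hypothetical data, not pypromice code): when the two inputs already sit on the same regular time grid, resampling their combine_first union at that same frequency is a no-op, provided the union has no missing timestamps (gaps would still gain NaN rows from the resample):

```python
import pandas as pd

# Two already-hourly series, overlapping at 02:00 (hypothetical values)
idx_raw = pd.date_range('2024-06-11 00:00', periods=3, freq='h')
idx_tx = pd.date_range('2024-06-11 02:00', periods=3, freq='h')
raw = pd.Series([1.0, 2.0, 3.0], index=idx_raw)
tx = pd.Series([30.0, 40.0, 50.0], index=idx_tx)

# combine_first keeps 'raw' where both exist and fills from 'tx' elsewhere
combined = raw.combine_first(tx)

# Every timestamp already lies on the hourly grid with no gaps,
# so a second hourly resample changes nothing here
assert combined.equals(combined.resample('h').mean())
print(combined.tolist())  # → [1.0, 2.0, 3.0, 40.0, 50.0]
```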

+    # prepare_and_write(all_ds, out, v, m, '1D')
+    # prepare_and_write(all_ds, out, v, m, 'M')
+    print(f'Files saved to {os.path.join(args.outpath, name)}...')

if __name__ == "__main__":
join_levels()
