Merging meteofiles in the modelbuilder using dfmt.preprocess_merge_meteofiles_era5() is quite slow. The example code below contains the bottleneck code. Reading the data is fast, but writing the merged data to netcdf is slow.
import os
import dfm_tools as dfmt
import datetime as dt

model_name = 'sss'
dir_base = r'p:\11210331-004-bes-modellering-2024\4_simulations\hydrodynamica\preprocessing\modelbuilder'
dir_output = os.path.join(dir_base, f'modelbuilder_output_{model_name}')

date_min = '2020-12-01'
date_max = '2023-01-01'
time_slice = slice(date_min, date_max)

# define paths and pattern of source data
dir_data_era5 = os.path.join(dir_output, 'data', 'ERA5')
varlist_lists = [['msl','u10n','v10n','chnk'],['d2m','t2m','tcc'],['ssr','strd'],['mer','mtpr']]
varkey_list = varlist_lists[0]
fn_match_pattern = f'era5_.*({"|".join(varkey_list)})_.*.nc' # simpler but selects more files: 'era5_*.nc'
file_out_prefix = f'era5_{"_".join(varkey_list)}_'
preprocess = dfmt.preprocess_ERA5 # reduce expver dimension if present
file_nc = os.path.join(dir_data_era5, fn_match_pattern)

# read multifile dataset
data_xr_tsel = dfmt.merge_meteofiles(file_nc=file_nc, time_slice=time_slice, preprocess=preprocess)

# write to netcdf file (slow)
print('>> writing file (can take a while): ', end='')
dtstart = dt.datetime.now()
times_pd = data_xr_tsel['time'].to_series()
time_start_str = times_pd.iloc[0].strftime("%Y%m%d")
time_stop_str = times_pd.iloc[-1].strftime("%Y%m%d")
file_out = os.path.join(dir_output, f'{file_out_prefix}{time_start_str}to{time_stop_str}_ERA5.nc')
data_xr_tsel.to_netcdf(file_out)
print(f'{(dt.datetime.now()-dtstart).total_seconds():.2f} sec')
This could be faster by using chunks when reading the files, or using a different merging method. Some investigation is still needed.
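As a starting point for that investigation, here is a minimal sketch of the "chunks when reading" idea using plain xarray. It does not show what `dfmt.merge_meteofiles()` does internally. The helpers `merge_and_write` and `make_demo_files`, the `time_chunk=24` value, and the synthetic `era5_msl_*.nc` files are all hypothetical and only there to make the example runnable without the ERA5 data on the p-drive. Whether explicit time chunks actually speed up the write would still need to be measured on the real files.

```python
import os
import tempfile
import time

import numpy as np
import pandas as pd
import xarray as xr


def merge_and_write(file_list, file_out, time_slice, time_chunk=24):
    """Hypothetical helper (not part of dfm_tools): lazily merge files and write them."""
    # chunks= gives dask-backed variables, so data is only read while writing
    with xr.open_mfdataset(file_list, combine="by_coords",
                           chunks={"time": time_chunk}) as ds:
        ds_tsel = ds.sel(time=time_slice)
        ds_tsel.to_netcdf(file_out)
        return ds_tsel.sizes["time"]


def make_demo_files(dir_data, n_files=2, n_times=48):
    """Write small synthetic hourly files as stand-ins for the ERA5 downloads."""
    file_list = []
    for i in range(n_files):
        start = pd.Timestamp("2020-12-01") + pd.Timedelta(hours=i * n_times)
        times = pd.date_range(start=start, periods=n_times, freq=pd.Timedelta(hours=1))
        data = np.random.rand(n_times, 4, 5).astype("float32")
        ds = xr.Dataset(
            {"msl": (("time", "latitude", "longitude"), data)},
            coords={"time": times,
                    "latitude": np.arange(4.0),
                    "longitude": np.arange(5.0)},
        )
        file_nc = os.path.join(dir_data, f"era5_msl_{i}.nc")
        ds.to_netcdf(file_nc)
        file_list.append(file_nc)
    return file_list


# time the write step in the same way as the example in this issue
dir_demo = tempfile.mkdtemp()
file_list = make_demo_files(dir_demo)
file_out = os.path.join(dir_demo, "merged_ERA5.nc")
t0 = time.perf_counter()
n_written = merge_and_write(file_list, file_out, slice("2020-12-01", "2020-12-03"))
print(f"wrote {n_written} timesteps in {time.perf_counter() - t0:.2f} sec")
```

Running this with a few different `time_chunk` values on a subset of the real files could show whether the chunk size affects the write time at all, before changing anything in the modelbuilder.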