ANMN WAVE - schema has reference to non existing files #487

Closed
lbesnard opened this issue Jul 20, 2021 · 1 comment

@lbesnard

By the look of it, the ANMN_WAVE harvester, which writes to the anmn_wave schema, has references to the ANMN NRS wave data files available at http://imos-data.s3-website-ap-southeast-2.amazonaws.com/?prefix=IMOS/ANMN/NRS/REAL_TIME/NRSDAR.

These files belong to the anmn_nrs_dar_yon schema and the similarly named harvester, so I don't understand why they are part of the anmn_wave schema as well.

Anyway, the anmn_wave schema seems to have references to files which don't exist such as:

IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2014/QAQC/IMOS_ANMN_W_20141201T015928Z_FV01_END-20141231T225808Z_C-20150325T103021Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2010/QAQC/IMOS_ANMN_W_20100703T095512Z_FV01_END-20100731T235544Z_C-20150325T102955Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2010/QAQC/IMOS_ANMN_W_20100801T015512Z_FV01_END-20100831T235544Z_C-20150325T102955Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2013/QAQC/IMOS_ANMN_W_20130501T005944Z_FV01_END-20130531T225808Z_C-20150325T103009Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2013/QAQC/IMOS_ANMN_W_20130701T003824Z_FV01_END-20130731T233840Z_C-20150325T103010Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2013/QAQC/IMOS_ANMN_W_20130801T013808Z_FV01_END-20130831T233840Z_C-20150325T103011Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2015/QAQC/IMOS_ANMN_W_20150101T005944Z_FV01_END-20150131T225808Z_C-20150325T103022Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2015/QAQC/IMOS_ANMN_W_20150801T012936Z_FV01_END-20150901T000000Z_C-20150901T020228Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2016/QAQC/IMOS_ANMN_W_20160101T015928Z_FV01_END-20160201T000000Z_C-20160201T020554Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2016/QAQC/IMOS_ANMN_W_20160201T015928Z_NRSDAR_FV01_END-20160201T175928Z_C-20160201T200315Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2012/QAQC/IMOS_ANMN_W_20121209T160000Z_FV01_END-20121231T225808Z_C-20150325T103005Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2013/QAQC/IMOS_ANMN_W_20130101T005944Z_FV01_END-20130201T000000Z_C-20150325T103006Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2013/QAQC/IMOS_ANMN_W_20130201T015928Z_FV01_END-20130301T000000Z_C-20150325T103007Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2010/QAQC/IMOS_ANMN_W_20100901T015512Z_FV01_END-20100914T185456Z_C-20150325T102956Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2011/QAQC/IMOS_ANMN_W_20110102T105912Z_FV01_END-20110129T105912Z_C-20150325T102958Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2015/QAQC/IMOS_ANMN_W_20150501T002952Z_FV01_END-20150531T222816Z_C-20150601T180221Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2015/QAQC/IMOS_ANMN_W_20151101T005944Z_FV01_END-20151130T225808Z_C-20151201T170316Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2015/QAQC/IMOS_ANMN_W_20151201T005944Z_FV01_END-20160101T000000Z_C-20160101T020532Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2013/QAQC/IMOS_ANMN_W_20130301T015928Z_FV01_END-20130331T225808Z_C-20150325T103008Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2013/QAQC/IMOS_ANMN_W_20130901T013808Z_FV01_END-20130930T223856Z_C-20150325T103012Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2013/QAQC/IMOS_ANMN_W_20131001T003824Z_FV01_END-20131031T233840Z_C-20150325T103012Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2013/QAQC/IMOS_ANMN_W_20131101T013808Z_FV01_END-20131130T173808Z_C-20150325T103013Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2014/QAQC/IMOS_ANMN_W_20140124T035856Z_FV01_END-20140201T000000Z_C-20150325T103014Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2014/QAQC/IMOS_ANMN_W_20140201T015928Z_FV01_END-20140301T000000Z_C-20150325T103014Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2014/QAQC/IMOS_ANMN_W_20140301T015928Z_FV01_END-20140331T225808Z_C-20150325T103015Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2014/QAQC/IMOS_ANMN_W_20140401T005944Z_FV01_END-20140430T225808Z_C-20150325T103016Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2014/QAQC/IMOS_ANMN_W_20140501T005944Z_FV01_END-20140601T000000Z_C-20150325T103017Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2014/QAQC/IMOS_ANMN_W_20140601T015928Z_FV01_END-20140701T000000Z_C-20150325T103017Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2014/QAQC/IMOS_ANMN_W_20140701T015928Z_FV01_END-20140731T225808Z_C-20150325T103018Z.nc
IMOS/ANMN/NRS/REAL_TIME/NRSDAR/Peak_Wave_Period/peak_wave_period_channel_3001/2014/QAQC/IMOS_ANMN_W_20140801T005944Z_FV01_END-20140901T000000Z_C-20150325T103019Z.nc
....

It seems like this harvester doesn't clean data properly.
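
One way to confirm which of the listed keys are really missing is to send a HEAD request for each one. This is only a minimal sketch: the `BUCKET_URL` endpoint below is an assumption (the issue only gives the bucket's website listing URL), and the helper names `object_url`, `object_exists` and `find_missing` are hypothetical, not part of any IMOS tooling.

```python
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Iterable, List

# Assumption: objects in the imos-data bucket are served from this endpoint.
BUCKET_URL = "https://imos-data.s3-ap-southeast-2.amazonaws.com/"


def object_url(key: str, base_url: str = BUCKET_URL) -> str:
    """Build the URL for an object key, percent-encoding unsafe characters."""
    return base_url + urllib.parse.quote(key)


def object_exists(key: str, base_url: str = BUCKET_URL) -> bool:
    """Return True if a HEAD request for the object succeeds.

    A 403 or 404 response is treated as "missing or inaccessible";
    any other HTTP error is re-raised.
    """
    request = urllib.request.Request(object_url(key, base_url), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code in (403, 404):
            return False
        raise


def find_missing(
    keys: Iterable[str],
    exists: Callable[[str], bool] = object_exists,
) -> List[str]:
    """Return the keys for which the existence check fails, in input order."""
    return [key for key in keys if not exists(key)]
```

The `exists` argument is injectable so the filtering logic can be exercised without network access, for example `find_missing(keys, exists=lambda k: False)` returns every key.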

@lbesnard

lbesnard commented Aug 2, 2021

So those files were indexed in the anmn_wave schema. However, they were not harvested because their content didn't match what the harvester expected.

Removing the entries from the harvester means that the chef-private databags have to be modified, since these files currently don't match the regex (which was changed to be more specific), so the po_s3_del command wouldn't work as it stands.
Alternatively, the harvester could be changed to remove these entries.

However, the references to those files exist only in the indexed_file table. As proof, there is no data associated with these files in either the WMS or the WFS:

select * from anmn_wave.anmn_wave_map where file_id in (select id from anmn_wave.indexed_file where url like 'IMOS/ANMN/NRS/REAL_TIME/%' and not deleted);
SELECT 0

select * from anmn_wave.anmn_wave_data where file_id in (select id from anmn_wave.indexed_file where url like 'IMOS/ANMN/NRS/REAL_TIME/%' and not deleted);
SELECT 0
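
The two checks above can also be folded into a single query that lists index entries with no rows in either data table. The sketch below uses an in-memory SQLite database purely as a stand-in for the real PostgreSQL instance. The table and column names (`indexed_file`, `anmn_wave_map`, `anmn_wave_data`, `id`, `url`, `deleted`, `file_id`) come from the queries above, but the column types, the sample rows, the `example_*.nc` paths and the helper names are made up for illustration.

```python
import sqlite3
from typing import List


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE indexed_file (id INTEGER PRIMARY KEY, url TEXT, deleted INTEGER);
        CREATE TABLE anmn_wave_map (file_id INTEGER);
        CREATE TABLE anmn_wave_data (file_id INTEGER);
        INSERT INTO indexed_file VALUES
            (1, 'IMOS/ANMN/NRS/REAL_TIME/NRSDAR/example_a.nc', 0),
            (2, 'IMOS/ANMN/NRS/REAL_TIME/NRSDAR/example_b.nc', 1),
            (3, 'IMOS/ANMN/OTHER/example_c.nc', 0);
        INSERT INTO anmn_wave_data VALUES (3);
        """
    )
    return conn


def orphaned_index_entries(conn: sqlite3.Connection, prefix: str) -> List[str]:
    """Return URLs of non-deleted indexed files under `prefix` that have
    no associated rows in anmn_wave_map or anmn_wave_data."""
    query = """
        SELECT f.url
        FROM indexed_file AS f
        WHERE f.url LIKE ?
          AND NOT f.deleted
          AND NOT EXISTS (SELECT 1 FROM anmn_wave_map AS m WHERE m.file_id = f.id)
          AND NOT EXISTS (SELECT 1 FROM anmn_wave_data AS d WHERE d.file_id = f.id)
        ORDER BY f.url
    """
    return [row[0] for row in conn.execute(query, (prefix + "%",))]
```

With the sample rows, `orphaned_index_entries(build_demo_db(), "IMOS/ANMN/NRS/REAL_TIME/")` returns only the `example_a.nc` entry: `example_b.nc` is flagged as deleted and `example_c.nc` falls outside the prefix.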

I'd suggest doing nothing and just closing this issue.

FYI @ggalibert

lbesnard closed this as completed Aug 3, 2021