Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Content issues identified through data reports #290

Open
xhoenner opened this issue Aug 23, 2017 · 12 comments
Open

Content issues identified through data reports #290

xhoenner opened this issue Aug 23, 2017 · 12 comments
Assignees
Labels
data >= 12 months old task that has been put aside in the past 12 months because of other priorities

Comments

@xhoenner
Copy link

Will update this issue as I review the summary data reports on a monthly basis, refer to the latest set of reports for more details, in /Volumes/research/IMOS/emiiSheryl/eMII_data_report/Data_Reports.

@jachope, @smancini, @mhidas, @ggalibert, @bpasquer, @lbesnard feel free to add stuff in as you see fit. Note though that improvements to data reports will be listed here.

  • AATAMS satellite tagging: Campaigns without species names
  • Argo: @lbesnard
    1/ No new data since end of June, no active floats;
    2/ Query to generate views too slow, will have to change the harvester to produce some metrics used for reporting;
    3/ Problem with time range.
  • ABOS: Problem with time range. @mhidas
  • ANMN: RT: Problem with time range + null site name. @mhidas
  • ACORN: null station code for radials. Misspelling of Turquoise Coast site code @ggalibert
  • SOOP XBT NRT: Problem with time range. @smancini
  • FAIMMS: Problem with MYRWS site name. Should it be Myrmidon Reef instead? @lbesnard
@ggalibert
Copy link

ggalibert commented Aug 31, 2017

  • ACORN: null station code for radials. Misspelling of Turquoise Coast site code

Cannot find any trace of a radial with platform_code set to null.

@xhoenner
Copy link
Author

xhoenner commented Sep 3, 2017

@ggalibert There's an inconsistency in the site_code field as demonstrated by the output of the following query:

select distinct site_code FROM acorn_radial_nonqc.acorn_radial_nonqc_timeseries_url
site_code
"TURQ, Turquoise Coast"
"TURQ"
"CBG"
"SAG"
"ROT"
"BONC"
"COF"

@ggalibert
Copy link

This inconsistency: "TURQ, Turquoise Coast", has been introduced the 2017-06-01 by ACORN UWA when they started to produce the SeaSonde radial FV00 netCDF files themselves (without passing through SSR on our RT processing server). Have written to Simone and Andrew to see if they could update their script. If so then I would manually correct the historical inconsistent files.

@ggalibert
Copy link

ggalibert commented Sep 5, 2017

Other inconsistencies I'm putting there for the record:

select distinct site_code
from acorn_hourly_avg_qc.acorn_hourly_avg_qc_timeseries_url
site_code
"ROT, Rottnest Shelf"
"SAG, South Australian Gulf"
"SAG, South Australia Gulf"
"BONC, Bonney Coast"
"TURQ, Turqoise Coast"
"COF, Coffs Harbour"
"CBG, Capricorn Bunker Group"
select distinct site_code
from acorn_hourly_avg_nonqc.acorn_hourly_avg_nonqc_timeseries_url
site_code
"ROT, Rottnest Shelf"
"TURQ, Turqoise Coast"
"BONC, Bonney Coast"
"SAG, South Australia Gulf"
"COF, Coffs Harbour"
"CBG, Capricorn Bunker Group"
select distinct site_code
from acorn_radial_qc.acorn_radial_qc_timeseries_url
site_code
"CBG"
"ROT"
"SAG"
"COF , Coffs Harbour (NSW), North Nambucca (NSW)"
"COF , Coffs Harbour (NSW), Red Rock (NSW)"
"COF"

Would that be affecting your reporting as well?

@xhoenner
Copy link
Author

xhoenner commented Sep 5, 2017

Would that be affecting your reporting as well?

Thanks for digging in a bit deeper @ggalibert. Yes any inconsistency in the site_code field will have an impact on the data reports, see for example the 'Hourly Vectors - QC' section in /Volumes/research/IMOS/emiiSheryl/eMII_data_report/Data_Reports/2017/Aug_2017/Ocean_Radar/OceanRadar_hourlyVectors_Summary.pdf

@ggalibert
Copy link

ACORN UWA has updated their scripts and I have just fixed this inconsistency: "TURQ, Turquoise Coast" instead of "TURQ" for historical SeaSonde radial FV00 netCDF files.

select distinct site_code FROM acorn_radial_nonqc.acorn_radial_nonqc_timeseries_url
site_code
"TURQ"
"CBG"
"SAG"
"ROT"
"BONC"
"COF"

@ggalibert
Copy link

Also, fixed this inconsistency: "SAG, South Australia Gulf" instead of "SAG, South Australian Gulf".

select distinct site_code
from acorn_hourly_avg_qc.acorn_hourly_avg_qc_timeseries_url
site_code
"ROT, Rottnest Shelf"
"SAG, South Australian Gulf"
"BONC, Bonney Coast"
"TURQ, Turqoise Coast"
"COF, Coffs Harbour"
"CBG, Capricorn Bunker Group"

@xhoenner
Copy link
Author

Thanks @ggalibert for looking into this. Still some issues with the following:

1. acorn_radial_qc

select distinct site_code, platform_code from acorn_radial_qc.acorn_radial_qc_timeseries_url order by site_code, platform_code;

which returns "COF, Coffs Harbour (NSW), North Nambucca (NSW)" instead of "COF" for platform_code 'NNB' and 'RRK'.

2. acorn_hourly_avg_nonqc

select distinct site_code from acorn_hourly_avg_nonqc.acorn_hourly_avg_nonqc_timeseries_url

which returns "TURQ, Turqoise Coast" instead of "TURQ, Turquoise Coast" and "SAG, South Australia Gulf" instead of "SAG, South Australian Gulf"

3. acorn_hourly_avg_qc

select distinct site_code from acorn_hourly_avg_qc.acorn_hourly_avg_qc_timeseries_url

which returns "TURQ, Turqoise Coast" instead of "TURQ, Turquoise Coast"

@ggalibert
Copy link

ggalibert commented Sep 13, 2017

@xhoenner

  1. getting there, currently uploading corrected files. There are 92 076 of them and it takes 5 sec to harvest 1 file...

  2. and 3. These are less priority / annoying problems. Basically just a typo which at least is consistent within the whole collection.

@ggalibert
Copy link

1. acorn_radial_qc fixed:

select distinct site_code, platform_code from acorn_radial_qc.acorn_radial_qc_timeseries_url order by site_code, platform_code;
site_code platform_code
"CBG" "LEI"
"CBG" "TAN"
"COF" "NNB"
"COF" "RRK"
"ROT" "FRE"
"ROT" "GUI"
"SAG" "CSP"
"SAG" "CWI"

@ggalibert ggalibert added the >= 12 months old task that has been put aside in the past 12 months because of other priorities label Apr 20, 2020
@leonardolaiolo
Copy link

I started investigating this and I can see in DB (acorn_radial_qc shema) that there are 6 new site_code than the ones reported in the SQL query. I believe 2 of them (CORL and NEWC) are new sites and should be added in the SQL, as they are in DB and in geonetwork consistently registered as sites.
I do no know what to do with the following site_code:

- COF, Coffs Harbour (NSW), Red Rock (NSW)
- COF, Coffs Harbour (NSW), North Nambucca (NSW)
- NWS
- “NWS”

@ggalibert I'm not sure what to do with the 4 site_code above as I cannot see them in the metadata registered as sites. Are these error in DB? I don't know much of these data but could be that this list above of site_code are actually platform_code?
Here the query to check all site_code in DB:

select distinct site_code FROM acorn_radial_nonqc.acorn_radial_nonqc_timeseries_url

Here a table to summarise all site_code found in DB and what I found in Geonetwork:

site_code geonetwork
TURQ Turquoise Coast (TURQ)
SAG South Australia Gulfs (SAG)
“NWS” ???
COF, Coffs Harbour (NSW), Red Rock (NSW) ???
COF Coffs Harbour (COF)
CORL Coral Coast (CORL)
CBG Capricorn Bunker Group (CBG)
ROT Rottnest Shelf (ROT)
COF, Coffs Harbour (NSW), North Nambucca (NSW) ???
NWS ???
BONC Bonney Coast (BONC)
NEWC Newcastle (NEWC)

@atkinsn
Copy link

atkinsn commented Aug 9, 2021

NWS = NWA: Northwest Shelf

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
data >= 12 months old task that has been put aside in the past 12 months because of other priorities
Projects
None yet
Development

No branches or pull requests

4 participants