Error catches to add to the PlutoF download and QC script #8

Closed
kmexter opened this issue Sep 20, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

Contributor

kmexter commented Sep 20, 2022

The script that downloads the PlutoF metadata and runs a QC on the station and ARMS unit names needs to catch the following cases.

The input to this QC is the CSV file PlutoF_QC_StationsARMSnames.csv. It will be updated (usually by Katrina), with git history keeping track of the changes made to it.
This file has 6 columns: Station, Station corrected, Country, Country corrected, ARMS unit, ARMS unit corrected. The QC script changes every Station, ARMS unit, and Country value that it encounters in the PlutoF download to the corresponding "corrected" value.
If PlutoF contains a different number of station + ARMS unit combinations than this spreadsheet (because more have been added to PlutoF since the spreadsheet was made), then the QC may not be done fully.
To avoid this, the script should do the following:

  1. check that all the "stations" downloaded from PlutoF are in the spreadsheet, and that the name in the Station column matches the name in PlutoF exactly
  2. for each station, check that all the "ARMS units" downloaded from PlutoF are in the spreadsheet, and that the name in the ARMS unit column matches the name in PlutoF exactly
  3. produce a QC report, called "PlutoF_HarvestQCreport.csv", which can be a copy of the contents of "PlutoF_QC_StationsARMSnames.csv" with additional QC columns, so that its columns are:
    Station; Station corrected; Station QC; Country; Country corrected; Country QC; ARMS unit; ARMS unit corrected; ARMS unit QC
  • If a station, country, or ARMS unit listed in PlutoF_QC_StationsARMSnames.csv matches the value in PlutoF: the entry in the respective QC column of PlutoF_HarvestQCreport.csv is "passed"
  • If an entry in PlutoF_QC_StationsARMSnames.csv is not found in PlutoF: the entry in the respective QC column is "not passed"
  • If there is an entry in PlutoF (a station, a country, or a unit) that is not in PlutoF_QC_StationsARMSnames.csv: add a new row to PlutoF_HarvestQCreport.csv holding the "station", "country", and "ARMS unit" values from PlutoF, with the word "new" in the QC column of whichever part is new (be that station and/or country and/or unit)
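
Checks 1 and 2 above could be sketched roughly as follows. This is only an illustration, not the actual script: the function names `load_qc_rows` and `find_unlisted` are hypothetical, the CSV is assumed to be comma-separated, and the PlutoF download is assumed to have already been reduced to (station, ARMS unit) pairs.

```python
import csv


def load_qc_rows(path="PlutoF_QC_StationsARMSnames.csv"):
    """Read the QC input file into a list of dicts keyed by column name.

    Assumes a comma-separated file with a header row (an assumption,
    the delimiter is not stated in this issue).
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def find_unlisted(qc_rows, plutof_pairs):
    """Return the stations and (station, ARMS unit) pairs that occur in
    PlutoF but not in the QC input.

    Matching is exact string equality, so differences in case or stray
    whitespace count as a mismatch.
    """
    known_stations = {row["Station"] for row in qc_rows}
    known_units = {(row["Station"], row["ARMS unit"]) for row in qc_rows}

    unlisted_stations = sorted(
        {station for station, _ in plutof_pairs if station not in known_stations}
    )
    unlisted_units = sorted(
        {pair for pair in plutof_pairs if tuple(pair) not in known_units}
    )
    return unlisted_stations, unlisted_units


# Example with made-up PlutoF pairs; "koster" differs only in case.
qc_rows = [{"Station": "Koster", "ARMS unit": "Koster_VH1"}]
plutof_pairs = [
    ("Koster", "Koster_VH1"),
    ("Koster", "Koster_VH4"),
    ("koster", "Koster_VH1"),
]
unlisted_stations, unlisted_units = find_unlisted(qc_rows, plutof_pairs)
```

If either returned list is non-empty, the spreadsheet is out of date with respect to PlutoF and the corrections cannot be applied fully.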

So, if the QC input is:
Station | Station corrected | Country | Country corrected | ARMS unit | ARMS unit corrected
Koster |   | Sweden |   | Koster_VH1 | VH1
....

Then the QC output would look something like:
Station | Station corrected | Station QC | Country | Country corrected | Country QC | ARMS unit | ARMS unit corrected | ARMS unit QC
Koster | | passed | Sweden | | passed | Koster_VH1 | VH1 | passed
....
Koster | | | Sweden | | | Koster_VH4 | | new
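
A minimal sketch of how such a report could be built, for illustration only. The names `build_qc_report` and `report_to_csv_text` are hypothetical, the PlutoF download is assumed to be available as (station, country, ARMS unit) tuples, and an ARMS unit is assumed to count as "found" only when it appears under the same station.

```python
import csv
import io

REPORT_COLUMNS = [
    "Station", "Station corrected", "Station QC",
    "Country", "Country corrected", "Country QC",
    "ARMS unit", "ARMS unit corrected", "ARMS unit QC",
]


def build_qc_report(qc_rows, plutof_records):
    """Combine the QC input rows with the PlutoF records into report rows."""
    records = sorted(set(plutof_records))  # drop duplicates, fix the order
    pf_stations = {s for s, _, _ in records}
    pf_countries = {c for _, c, _ in records}
    pf_units = {(s, u) for s, _, u in records}

    def verdict(found):
        return "passed" if found else "not passed"

    # Existing spreadsheet rows: "passed" / "not passed" per part.
    report = []
    for row in qc_rows:
        out = dict(row)
        out["Station QC"] = verdict(row["Station"] in pf_stations)
        out["Country QC"] = verdict(row["Country"] in pf_countries)
        out["ARMS unit QC"] = verdict((row["Station"], row["ARMS unit"]) in pf_units)
        report.append(out)

    # PlutoF entries missing from the spreadsheet: one extra row each,
    # with "new" only in the QC column of the part that is new.
    qc_stations = {r["Station"] for r in qc_rows}
    qc_countries = {r["Country"] for r in qc_rows}
    qc_units = {(r["Station"], r["ARMS unit"]) for r in qc_rows}
    for station, country, unit in records:
        flags = {
            "Station QC": "" if station in qc_stations else "new",
            "Country QC": "" if country in qc_countries else "new",
            "ARMS unit QC": "" if (station, unit) in qc_units else "new",
        }
        if any(flags.values()):
            report.append({
                "Station": station, "Station corrected": "",
                "Country": country, "Country corrected": "",
                "ARMS unit": unit, "ARMS unit corrected": "",
                **flags,
            })
    return report


def report_to_csv_text(report):
    """Serialise the report rows as comma-separated text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=REPORT_COLUMNS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(report)
    return buffer.getvalue()


# The Koster example from this issue, with Koster_VH4 present only in PlutoF.
qc_rows = [{
    "Station": "Koster", "Station corrected": "",
    "Country": "Sweden", "Country corrected": "",
    "ARMS unit": "Koster_VH1", "ARMS unit corrected": "VH1",
}]
plutof_records = [
    ("Koster", "Sweden", "Koster_VH1"),
    ("Koster", "Sweden", "Koster_VH4"),
]
report = build_qc_report(qc_rows, plutof_records)
print(report_to_csv_text(report))
```

Writing the returned text to PlutoF_HarvestQCreport.csv would give the report file; the sketch keeps it in memory so that it can be inspected first.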

I see that the Belgian station and one other station now have ARMS units that are not in the current QC input file, so if you test it out now there should be some "new" entries in the QC report.

kmexter added the enhancement label Sep 20, 2022
Contributor Author

kmexter commented Sep 20, 2022

Reminder to self: once this has been done, write about it in the QC explanation text file that I have now added to the Data/FromPlutoF folder.

kmexter closed this as completed Sep 30, 2022
Contributor Author

kmexter commented Sep 30, 2022

done!
