Fix potential errors and inconsistencies in bud burst data #21

Open
StefanVriend opened this issue Dec 1, 2023 · 5 comments
Labels
hurdle (Hurdles/problems we encounter throughout the workflow), takeaway (General lesson learned from this issue)

Comments

@StefanVriend
Collaborator

There are some inconsistencies (e.g., missing coordinates, inconsistent spelling) and errors (e.g., duplicated records) in the bud burst data. These need to be dealt with.

StefanVriend added this to the Bud burst data milestone Dec 1, 2023
@StefanVriend
Collaborator Author

StefanVriend commented Dec 1, 2023

The following issues have been dealt with:

  • Add missing coordinates for trees
  • Remove duplicated records
  • Fix erroneous records
  • Sort and standardise remarks field
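
A minimal sketch of how two of these checks could be scripted, not the procedure actually used for this issue. The field names `TreeID` and `Date`, the choice of key for spotting duplicates, and the remark normalisation rules are all assumptions made for illustration.

```python
from collections import Counter


def find_duplicates(records, key_fields=("TreeID", "Date")):
    """Return the key tuples that occur more than once in `records`.

    `records` is a list of dicts; `key_fields` (hypothetical names) are the
    fields that should uniquely identify one observation.
    """
    counts = Counter(tuple(rec[field] for field in key_fields) for rec in records)
    return sorted(key for key, n in counts.items() if n > 1)


def standardise_remark(remark):
    """Trim, collapse internal whitespace and lower-case a free-text remark.

    Empty or missing remarks are returned as None.
    """
    if remark is None:
        return None
    cleaned = " ".join(remark.split()).lower()
    return cleaned or None
```

Flagged duplicates would still need a manual decision on which record to keep.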

@StefanVriend
Collaborator Author

On Friday 15/12/2023, we fixed a few instances where trees go back to a previous bud burst stage (i.e., TreeTopScore at t is lower than TreeTopScore at t-1), which is biologically impossible.
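
The check described above (a score at t lower than the score at t-1) could be sketched as follows. This is an illustration under assumptions, not the code used here: observations are taken to be `(tree_id, date, score)` tuples, scores are compared only within one tree and one calendar year, and missing scores (`None`) are skipped.

```python
from collections import defaultdict
from datetime import date


def find_score_regressions(observations):
    """Flag consecutive observations where a tree's score decreases.

    `observations` is an iterable of (tree_id, obs_date, score) tuples.
    Returns a list of (tree_id, prev_date, cur_date, prev_score, cur_score).
    """
    # Group by tree and year (assumption: scoring restarts every season).
    series_by_key = defaultdict(list)
    for tree_id, obs_date, score in observations:
        if score is not None:
            series_by_key[(tree_id, obs_date.year)].append((obs_date, score))

    regressions = []
    for key in sorted(series_by_key):
        series = sorted(series_by_key[key])  # chronological order
        for (prev_date, prev_score), (cur_date, cur_score) in zip(series, series[1:]):
            if cur_score < prev_score:
                regressions.append((key[0], prev_date, cur_date, prev_score, cur_score))
    return regressions
```

Each flagged pair only says that one of the two scores is suspect; which one is wrong has to be settled against the field books.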

On Monday 18/12/2023, we fixed a large number of instances where bud burst scores had unexpected values. In most years, trees are scored between 0 and 3 in steps of 0.5; in some years (1989-1990, 2001-2008), trees are also scored in steps of 0.25. All other scores are incorrect and have been fixed. Most errors were due to rounding (e.g., 1.8 instead of 1.75) or missing digits (e.g., 5 instead of 0.5). Some other values were set to NA after checking the field books.
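
The scoring rule above can be written as a small validity check. This is a sketch under assumptions: the year ranges are read as inclusive, missing scores (`None`, i.e. NA) are treated as acceptable, and the function name is hypothetical. It only flags suspicious values; it does not guess the correction.

```python
# Years in which 0.25-steps were also used (assumed inclusive ranges).
QUARTER_STEP_YEARS = set(range(1989, 1991)) | set(range(2001, 2009))


def is_valid_score(score, year):
    """Return True if `score` is an allowed bud burst score for `year`."""
    if score is None:
        return True  # NA is allowed
    if not 0 <= score <= 3:
        return False
    step = 0.25 if year in QUARTER_STEP_YEARS else 0.5
    # 0.25 and 0.5 are exact in binary floating point, so this test is safe.
    return (score / step).is_integer()
```

For example, `is_valid_score(1.8, 2005)` and `is_valid_score(5, 1995)` both return `False`, matching the two error types mentioned above, while `is_valid_score(1.75, 2005)` returns `True`.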

We did not check all TreeAllScores, which still contain some of the same types of errors we found for TreeTopScore.

@CherineJ
Collaborator

There are also errors in the TreeAllScores, which are similar to the ones in TreeTopScore (scoring in 0.25 steps, rounding issues, missing digits, typos). We checked them against the field books as well (Thursday 21/12/2023) and forwarded them to the AnE database to be corrected.

@CherineJ
Collaborator

All errors in TreeTopScore and TreeAllScore were fixed on 08/01/2024. Additionally, the missing observer IDs needed to fix #3 have been assigned.

@CherineJ
Collaborator

Takeaway

We were able to solve most of the problems in the original data because we are the data owner: we could go back to the field books, talk to Marcel, etc. This is unlikely to be the case if the data come from elsewhere, so such errors can be a more severe problem there, especially if metadata such as a contact person is missing.

CherineJ added the hurdle and takeaway labels Jan 22, 2024
2 participants