Get basis of record categories #28

Closed
2 tasks done
peterdesmet opened this issue Jan 16, 2015 · 4 comments

@peterdesmet
Member

Description

For a given dataset, I want to know how many records have a certain basis of record. I also want to know how many of those are invalid. I envision this as a bar chart, where the records are grouped in categories based on basis of record.

Outcome

dataset_key
bor_preserved_specimen  // Preserved specimens
bor_fossil_specimen     // Fossil specimens
bor_living_specimen     // Living specimens
bor_material_sample     // Material samples
bor_observation         // Observations
bor_human_observation   // Human observations
bor_machine_observation // Machine observations
bor_literature          // Literature occurrences
bor_unknown             // Unknown

Terms we need

basisOfRecord

Ideally, use null if count is 0.
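A minimal sketch of the intended row shape, assuming the counts arrive as a plain dictionary keyed by the column names listed under Outcome. The function name `build_row` and the dataset key in the usage line are hypothetical and only serve the illustration.

```python
# Column names taken from the Outcome list in this issue.
BOR_COLUMNS = [
    "bor_preserved_specimen",
    "bor_fossil_specimen",
    "bor_living_specimen",
    "bor_material_sample",
    "bor_observation",
    "bor_human_observation",
    "bor_machine_observation",
    "bor_literature",
    "bor_unknown",
]


def build_row(dataset_key, counts):
    """Return one output row; a count of 0 (or a missing count) becomes None."""
    row = {"dataset_key": dataset_key}
    for column in BOR_COLUMNS:
        count = counts.get(column, 0)
        row[column] = count if count > 0 else None
    return row


# Hypothetical usage: only two categories were counted for this dataset.
row = build_row("example-dataset-key", {"bor_preserved_specimen": 12, "bor_unknown": 0})
```

Here `None` stands in for the null value that would end up in the table.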

Questions

  • BASIS_OF_RECORD_INVALID returns 0 results. It is most likely covered by Unknown evidence and can thus be ignored.
  • The basisOfRecord categories that GBIF provides are mutually exclusive.
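Since the categories are mutually exclusive, the per-category counts for a dataset should add up to its total record count, provided every record ends up in some category (including Unknown). A small sanity check along those lines, as a sketch only; `categories_cover_total` is a hypothetical helper, and null counts are treated as 0.

```python
def categories_cover_total(category_counts, total_records):
    """True when the per-category counts add up to the dataset's record count.

    None values (used for empty categories) are treated as 0.
    """
    return sum(count or 0 for count in category_counts.values()) == total_records


# Hypothetical usage with made-up numbers.
ok = categories_cover_total({"bor_observation": 3, "bor_unknown": 2, "bor_literature": None}, 5)
```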

Process

/* Map basisOfRecord to categories */
@peterdesmet peterdesmet added this to the Term metrics milestone Jan 16, 2015
@peterdesmet
Member Author

@peterdesmet peterdesmet modified the milestone: Coordinate quality categories Jan 19, 2015
@peterdesmet peterdesmet changed the title Basis of record categories Get basis of record categories Jan 19, 2015
@peterdesmet peterdesmet added this to the Basis of record milestone Jan 19, 2015
@peterdesmet
Member Author

Using the count API would require another method plus 9 API calls, one per category, so I decided to use metrics_store instead. All 9 fields are now available in CartoDB.
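To illustrate where the 9 calls come from: each basis of record value needs its own count request per dataset. The sketch below only builds the request URLs and makes no network call. The endpoint and the `datasetKey` / `basisOfRecord` parameter names are assumptions based on the public GBIF occurrence count API, and `count_urls` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed endpoint; check the GBIF API documentation before relying on it.
COUNT_ENDPOINT = "https://api.gbif.org/v1/occurrence/count"

# The nine basis of record values discussed in this issue.
BASIS_OF_RECORD_VALUES = [
    "PRESERVED_SPECIMEN",
    "FOSSIL_SPECIMEN",
    "LIVING_SPECIMEN",
    "OBSERVATION",
    "HUMAN_OBSERVATION",
    "MACHINE_OBSERVATION",
    "MATERIAL_SAMPLE",
    "LITERATURE",
    "UNKNOWN",
]


def count_urls(dataset_key):
    """Build one count URL per basis of record value for a single dataset."""
    return [
        COUNT_ENDPOINT + "?" + urlencode({"datasetKey": dataset_key, "basisOfRecord": value})
        for value in BASIS_OF_RECORD_VALUES
    ]


# Hypothetical usage: nine URLs, hence nine requests per dataset.
urls = count_urls("example-dataset-key")
```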

@bartaelterman
Member

@niconoe based on the architecture proposal, I'm expecting the following tags in the extraction module output:

PRESERVED_SPECIMEN
FOSSIL_SPECIMEN
LIVING_SPECIMEN
OBSERVATION
HUMAN_OBSERVATION
MACHINE_OBSERVATION
MATERIAL_SAMPLE
LITERATURE
UNKNOWN

All under the tag basisofRecords.

I'll aggregate them and create a table with the desired columns defined in this issue's body.
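The aggregation step could look roughly like this, assuming the extraction module yields one basis of record tag per record. The actual output format of that module is not specified here, so the input shape and the name `aggregate_basis_of_record` are assumptions.

```python
from collections import Counter

# Mapping from the expected tags to the column names in the issue body.
TAG_TO_COLUMN = {
    "PRESERVED_SPECIMEN": "bor_preserved_specimen",
    "FOSSIL_SPECIMEN": "bor_fossil_specimen",
    "LIVING_SPECIMEN": "bor_living_specimen",
    "MATERIAL_SAMPLE": "bor_material_sample",
    "OBSERVATION": "bor_observation",
    "HUMAN_OBSERVATION": "bor_human_observation",
    "MACHINE_OBSERVATION": "bor_machine_observation",
    "LITERATURE": "bor_literature",
    "UNKNOWN": "bor_unknown",
}


def aggregate_basis_of_record(dataset_key, tags):
    """Count tags per category and return one table row for the dataset.

    Tags outside TAG_TO_COLUMN are ignored in this sketch.
    """
    counts = Counter(tags)
    row = {"dataset_key": dataset_key}
    for tag, column in TAG_TO_COLUMN.items():
        # Counter returns 0 for absent tags; store None instead of 0.
        row[column] = counts[tag] or None
    return row


# Hypothetical usage with three made-up records.
row = aggregate_basis_of_record("example-dataset-key", ["OBSERVATION", "OBSERVATION", "UNKNOWN"])
```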

@niconoe
Member

niconoe commented Jan 26, 2015

Very good! I'll implement a first version of my module today and will let you know!


3 participants