suggest to support EML json to define table structures #942

Closed
jhpoelen opened this issue Nov 28, 2023 · 5 comments

Comments

@jhpoelen
Member

Following example by @zedomel in globalbioticinteractions/carvalheiro2023@23d3ba6#diff-8e200691cc9a15b8059aab1f16475a480536b1b07858c89e58aedcd126d44215

jhpoelen referenced this issue in globalbioticinteractions/carvalheiro2023 Nov 28, 2023
@jhpoelen
Member Author

@mbjones - an EML question - is there a preferred filename for an EML metadata descriptor for a dataset?

eml.json
eml.xml
metadata.xml
metadata.json

What do you typically encounter when working with DataONE datasets?

@mbjones

mbjones commented Nov 28, 2023

For EML, we see everything under the sun, but mostly people use the .xml extension. For the name, we see either a project-based name like some-project-2006.xml, and less frequently we'll see a generic name like metadata.xml. For our own data packages, we name the metadata files from the dataset title, mangled to remove illegal and non-printing characters from the filename. For example, for the DOI https://doi.org/10.18739/A2ZS2KF4B we have the metadata filename Retrogressive_thaw_slump_data_derived_from_ArcticD.xml.
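
A minimal sketch of this kind of title-to-filename mangling follows. It is not the actual DataONE rule, which is not spelled out in this thread. The helper name `title_to_filename`, the choice to replace every run of non-alphanumeric characters with a single underscore, the 50-character cut (inferred only from the length of the example filename above), and the input title are all assumptions for illustration.

```shell
# Hypothetical helper: derive a metadata filename from a dataset title.
# Assumptions (not confirmed in this thread): runs of characters outside
# [A-Za-z0-9] become one underscore, and the base name is cut at 50 characters.
title_to_filename() {
  printf '%s.xml\n' "$(printf '%s' "$1" \
    | LC_ALL=C tr -cs 'A-Za-z0-9' '_' \
    | cut -c1-50)"
}

# Hypothetical input title, chosen so the result matches the filename quoted above.
title_to_filename "Retrogressive thaw slump data derived from ArcticDEM"
# prints Retrogressive_thaw_slump_data_derived_from_ArcticD.xml
```

Setting `LC_ALL=C` keeps the `A-Za-z0-9` ranges byte-based, so accented or other non-ASCII characters are also replaced rather than kept.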

@jhpoelen
Member Author

@mbjones thanks for sharing your insights on the use/naming of EML files.

jhpoelen pushed a commit to globalbioticinteractions/carvalheiro2023 that referenced this issue Nov 28, 2023
@jhpoelen
Member Author

jhpoelen commented Nov 28, 2023

A first pass at using EML to define table schemas for species interaction data tables is available via

elton v0.12.13

@zedomel @Filipi-Soares

example:

elton pull globalbioticinteractions/carvalheiro2023

yields

updating [globalbioticinteractions/carvalheiro2023]... done.

followed by

elton interactions globalbioticinteractions/carvalheiro2023\
  | head -n2\
  | mlr --itsvlite --oxtab cat

yields:

argumentTypeId                         https://en.wiktionary.org/wiki/support
sourceOccurrenceId                     
sourceCatalogNumber                    
sourceCollectionCode                   
sourceCollectionId                     
sourceInstitutionCode                  
sourceTaxonId                          
sourceTaxonName                        Pilosella oficinarum
sourceTaxonRank                        species
sourceTaxonPathIds                     
sourceTaxonPath                        Plantae
sourceTaxonPathNames                   kingdom
sourceBodyPartId                       
sourceBodyPartName                     
sourceLifeStageId                      
sourceLifeStageName                    
sourceSexId                            
sourceSexName                          
interactionTypeId                      http://purl.obolibrary.org/obo/RO_0002623
interactionTypeName                    flowersVisitedBy
targetOccurrenceId                     
targetCatalogNumber                    
targetCollectionCode                   
targetCollectionId                     
targetInstitutionCode                  
targetTaxonId                          
targetTaxonName                        Andrena
targetTaxonRank                        genus
targetTaxonPathIds                     
targetTaxonPath                        Hymenoptera | Andrenidae
targetTaxonPathNames                   order | family
targetBodyPartId                       
targetBodyPartName                     
targetLifeStageId                      
targetLifeStageName                    
targetSexId                            
targetSexName                          
basisOfRecordId                        
basisOfRecordName                      
http://rs.tdwg.org/dwc/terms/eventDate 
decimalLatitude                        
decimalLongitude                       
localityId                             
localityName                           England
referenceDoi                           
referenceUrl                           https://docs.google.com/spreadsheets/u/1/d/1cJ0qX9ppqHoSyqFykwYJef-DFOzoutthBXjwKRY81T8/export?format=tsv&id=1cJ0qX9ppqHoSyqFykwYJef-DFOzoutthBXjwKRY81T8&gid=776329546
referenceCitation                      https://docs.google.com/spreadsheets/u/1/d/1cJ0qX9ppqHoSyqFykwYJef-DFOzoutthBXjwKRY81T8/export?format=tsv&id=1cJ0qX9ppqHoSyqFykwYJef-DFOzoutthBXjwKRY81T8&gid=776329546
namespace                              globalbioticinteractions/carvalheiro2023
citation                               WorldFAIR pilot data from: VisitationData_Luisa_Carvalheiro.
archiveURI                             https://github.com/globalbioticinteractions/carvalheiro2023/archive/f498352c233d0172af29df964406f170c072d668.zip
lastSeenAt                             2023-11-28T21:44:11.747Z
contentHash                            8c8dea35d77ca7e73702839e3753a147215fdfd70886974ff434a24330247597
eltonVersion                           0.12.13
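
Most fields in the record above are empty, so it can help to look at only a few columns. The sketch below picks columns by header name from a tab-separated stream using plain `awk`. The helper name `select_columns` is hypothetical, and the sample input is a cut-down copy of three fields from the record above rather than real `elton` output.

```shell
# Hypothetical helper: print selected columns of a TSV stream, chosen by header name.
# Usage: ... | select_columns name1,name2
select_columns() {
  awk -F'\t' -v OFS='\t' -v want="$1" '
    NR == 1 {
      n = split(want, names, ",")
      for (i = 1; i <= NF; i++) pos[$i] = i   # map header name -> column index
    }
    {
      line = ""
      for (j = 1; j <= n; j++) {
        # Unknown column names yield an empty field instead of the whole line.
        val = (names[j] in pos) ? $(pos[names[j]]) : ""
        line = line (j > 1 ? OFS : "") val
      }
      print line
    }'
}

# Sample input: three columns copied from the record shown above.
printf 'sourceTaxonName\tinteractionTypeName\ttargetTaxonName\nPilosella oficinarum\tflowersVisitedBy\tAndrena\n' \
  | select_columns sourceTaxonName,targetTaxonName
```

Assuming `elton interactions` writes a header row followed by tab-separated records (as the `--itsvlite` flag in the pipeline above suggests), the same helper could be placed after `elton interactions globalbioticinteractions/carvalheiro2023` in place of the `mlr` step.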

@jhpoelen
Member Author

jhpoelen commented Jan 5, 2024

Feature is working as expected. @zedomel et al., please holler if you find any issues or have any suggestions.

@jhpoelen jhpoelen closed this as completed Jan 5, 2024