This check verifies that the format name listed in the physical section of the metadata matches the data file's actual format.
Components
format name from physical section
file format from data file itself
Result
SUCCESS: if the two formats match
FAILURE: if the two formats do not match
ERROR: on system error or exception in the check code, representing a bug in the check system
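The three outcomes above can be sketched as follows. This is a minimal illustration in Python, not the check itself: `compare_formats` and the two callables passed to it are hypothetical names, and treating the comparison as case-insensitive is an assumption.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"
    ERROR = "ERROR"


def compare_formats(get_metadata_format, get_file_format):
    """Compare the metadata format name with the detected file format.

    Both arguments are hypothetical zero-argument callables that return
    a format name as a string. Any exception raised while fetching or
    comparing the values is reported as ERROR rather than FAILURE.
    """
    try:
        metadata_format = get_metadata_format().strip().lower()
        file_format = get_file_format().strip().lower()
    except Exception:
        # System error or bug in the check code.
        return Status.ERROR
    if metadata_format == file_format:
        return Status.SUCCESS
    return Status.FAILURE
```

Keeping ERROR separate from FAILURE means a broken check is never reported as a problem with the dataset.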
I'm starting to think about this check and wondering which pieces of information we should actually be checking. The data format of a file can be recorded in up to three places:
the file itself
sysmeta formatId
EML dataFormat OR ISO field
I'm envisioning this as a congruency check between the return value of `file test.csv` (bash, wrapped in an R system call) and either the formatId, the metadata field, or both. I see a couple of hurdles, though.
Establishing a mapping between the format names used in each location. For example, `file test.csv` returns "CSV text", which should be "text/csv" in the sysmeta, and in EML could be described in a few ways, but probably most reliably as `physical/dataFormat/textFormat` with `fieldDelimiter` set to ",".
I don't think any checks currently use sysmeta values as part of the check, so we need to review how this information could be made available to the R process.
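To make the mapping hurdle concrete, here is a rough sketch in Python (for illustration only; the check itself would presumably be written in R). Only the "CSV text" to "text/csv" pair comes from the example above; the other entries, the table name, and the function names are assumptions, and the exact wording of `file` output can vary between versions.

```python
import subprocess

# Hypothetical, partial mapping from the start of a `file --brief`
# description to a DataONE formatId. A real table would need many
# more entries and some review.
FILE_DESCRIPTION_PREFIXES = [
    ("CSV text", "text/csv"),
    ("PDF document", "application/pdf"),
    ("ASCII text", "text/plain"),
]


def detect_description(path):
    """Run `file --brief` on a path and return its description string."""
    result = subprocess.run(
        ["file", "--brief", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def to_format_id(description):
    """Map a `file` description to a formatId, or None if unmapped."""
    for prefix, format_id in FILE_DESCRIPTION_PREFIXES:
        if description.startswith(prefix):
            return format_id
    return None
```

Prefix matching is used because `file` often appends details after the format name, such as a version number or line-terminator information. An unmapped description returns `None`, so the caller can decide how to report it.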
I think the steps needed to implement this check are, in order of difficulty:
determine how (if?) sysmeta values can be passed to check code in metadig-engine
add sysmetaXML as an argument to runCheck in metadig-R
establish a mapping between the output of the `file` command for common file types and DataONE formatIds
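Once the sysmeta XML reaches the check, pulling out the formatId should be straightforward. A minimal sketch in Python, for illustration: the sample document is simplified and hypothetical, `format_id_from_sysmeta` is not an existing function, and the lookup ignores namespaces so it does not depend on the exact sysmeta schema version.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical system metadata document for illustration.
SAMPLE_SYSMETA = """
<d1:systemMetadata xmlns:d1="http://ns.dataone.org/service/types/v2.0">
  <identifier>example-id</identifier>
  <formatId>text/csv</formatId>
</d1:systemMetadata>
"""


def format_id_from_sysmeta(sysmeta_xml):
    """Return the text of the first formatId element, or None if absent."""
    root = ET.fromstring(sysmeta_xml.strip())
    for element in root.iter():
        # Drop any "{namespace}" prefix before comparing the tag name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "formatId":
            text = (element.text or "").strip()
            return text or None
    return None
```

The returned value could then be compared against the formatId derived from the `file` output.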