Determine ProvONE index fields #66

mbjones · 2015-01-05T22:44:44Z

The ProvONE model has many fields; we need to determine which should be indexed in Solr, and how.

Discussion today led us to the following new fields for the Solr index:

Discussion: Do we use DataONE account URIs for Agent properties?

e.g., https://cn.dataone.org/cn/v1/accounts/CN=Jim%20Green%20A10401,O=Google,C=US,DC=cilogon,DC=org
Or DataONE DNs? CN=Jim%20Green%20A10401,O=Google,C=US,DC=cilogon,DC=org
Or ORCIDs?
Or any of these?

Yes, any. The indexer will take any of these values, look up the other values that exist for that user in the DataONE user portal, and add those to the index as well. The model would use any of: hasOrcid, hasDN, foaf:Name, etc. The RDF Subject URI could be an anonymous blank node with each of these properties.
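As a rough illustration of the lookup described above, here is a minimal sketch in Python. It is not the indexer's actual code: `PORTAL_RECORDS` is a hypothetical in-memory stand-in for the DataONE user portal, `expand_agent` is a made-up helper name, and the ORCID value is a placeholder.

```python
# Hypothetical stand-in for the DataONE user portal: each record lists the
# identifiers known for one person. All values here are illustrative only.
PORTAL_RECORDS = [
    {
        "hasDN": "CN=Jim Green A10401,O=Google,C=US,DC=cilogon,DC=org",
        "hasOrcid": "0000-0000-0000-0000",  # placeholder, not a real ORCID
        "foaf:Name": "Jim Green",
    },
]


def expand_agent(agent, records=PORTAL_RECORDS):
    """Return the agent's properties merged with any matching portal record.

    `agent` holds whichever identifying properties the RDF supplied
    (hasOrcid, hasDN, foaf:Name, ...). A record matches when it shares at
    least one property value with the agent.
    """
    for record in records:
        if any(record.get(key) == value for key, value in agent.items()):
            merged = dict(record)   # start from everything the portal knows
            merged.update(agent)    # values from the RDF take precedence
            return merged
    # No match: index only what the RDF provided.
    return dict(agent)
```

With this sketch, an agent described only by an ORCID would be indexed with its DN and name as well, while an agent unknown to the portal is passed through unchanged.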

mbjones · 2015-01-12T21:33:27Z

Reviewed these properties on today's sem-prov call.

I have now incorporated these into the PROVAnnotation design documents in SHA 69ceea7.

Open a new enhancement ticket if additional properties should be created, or a bug if one of these should be changed or revised.

laurenwalker · 2015-01-16T00:59:23Z

I propose we also index prov:used as "used."

leinfelder · 2015-01-20T16:19:07Z

Just a thought, but perhaps the index fields should be a little more distinguishable, like with a namespace: "prov_used"

The bare term - especially with a word like "used" - seems prone to confusion, collision, and misuse.

mbjones · 2015-01-20T19:58:46Z

Good idea. We should also be clear about the namespace in the term definition.

csjx · 2015-02-05T18:42:14Z

For the UI, we want to be able to show that a metadata document describing a data file has provenance information. We decided that adding two more fields to the index would help:

prov_hasSources
prov_hasDerivations

These would list the source pids and derivation pids for the data file this metadata describes. The UI will be able to display the total number based on the list size, along with an icon indicating that provenance information is available. I'll open another ticket to add these.
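A minimal sketch of how the UI side might use the two fields, assuming they come back from Solr as multi-valued lists of pids on each result document. The helper name `provenance_summary` and the shape of `doc` are assumptions for illustration, not the actual UI code.

```python
def provenance_summary(doc):
    """Summarize provenance for one Solr result document (a plain dict).

    Assumes prov_hasSources and prov_hasDerivations are lists of pids and
    are simply absent when there is no provenance information.
    """
    sources = doc.get("prov_hasSources", [])
    derivations = doc.get("prov_hasDerivations", [])
    return {
        "source_count": len(sources),
        "derivation_count": len(derivations),
        # Show the icon when either list is non-empty.
        "show_icon": bool(sources or derivations),
    }
```

For example, a document with two source pids and no derivations would yield counts of 2 and 0 and `show_icon` set to `True`.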

csjx · 2015-02-05T18:52:31Z

To clarify where fields come from, we'll prefix them:
All fields will start with prov_ to differentiate them in the Solr index (and make searching easier for provenance-enabled data packages). I'll open another ticket to add the prefix to the names.

mbjones assigned csjx Jan 5, 2015

mbjones added this to the WBS 2.9.6 Provenance Index and Query Service milestone Jan 5, 2015

csjx mentioned this issue Jan 8, 2015

Extend the RDF/XML processor to extract provenance fields #75

Closed

mbjones closed this as completed Jan 12, 2015

mbjones added the provenance label Jan 21, 2015

csjx mentioned this issue Jan 26, 2015

Change wasExecutedBy Solr field to hadExecution #97

Closed
