Skip to content

Latest commit

 

History

History
164 lines (106 loc) · 6.46 KB

2020-06-03.dcap_zoom_call.md

File metadata and controls

164 lines (106 loc) · 6.46 KB

DCAP meeting Wednesday June 3, 2020

Time: 14:00 UTC

Zoom Join URL: https://us02web.zoom.us/j/89008422202?pwd=NEcrWkxjc3kwN3k4bUxXVmx1bS9mZz09

HackMD agenda: https://hackmd.io/xfGmpp3zROWbckhvIRmNBA

Participants

  1. Ben Riesenberg (ries07@uw.edu)
  2. Phil Barker
  3. Grace Johnson
  4. Tom Baker
  5. Nishad Thalhath
  6. John Huck
  7. Phil Barker

Agenda

Minutes of previous meeting

Examples

  1. Phil's bookclub example

  2. John's DataCite

  3. Examples will be developed for: schema recipe - Phil Wikidata hospital - KC contentDM? - KC BIBFRAME - KC DataCite - John H

Some areas that need more discussion are:

  1. Value type and value space - need clear definitions
  2. Where to put namespace columns and what to call those columns
  3. Should entity descriptions be on their own row? (Advantages: can annotate, can give cardinality)
  4. Can one designate properties as usable anywhere (aka: free-floaters)? (Idea: place them first before first entity)
  5. There is still the question of IDs for properties (KC will attempt an example)

Minutes

Phil's Bookclub e.g. discussion

  • entity on its own row means potential for more info about entity:
    • cardinality
    • labels
    • comments Ben: good that first view of entity is not at first encounter of property

Nishad: would like to have a base URI for the profile for anything without a prefix

Phil: clarify between default URI and profile URI

?? does @ represent a URI? is @ a prefix?

Nishad: makes profile entities more reusable

Tom: would shapes ever be URIs?

Phil: will need a URI for profile that is base for entities in the profile

Tom: Is there a requirement for the profile to describe itself? And where would be base ID go in the template?

  • Tom: xsd: string in value space ? Phil: actually four things:
  • enumeration of allowed values
  • concept scheme
  • conforms to specification, e.g. xsd:string, which tells you how to form the literal
  • type of URI

John: agree, probably more than one

Tom: a list can just be more than one of a thing; a list of URIs would be of type URI

Phil: is type the type (expected or valid type) of the instance data?

KC: need to define what is the funciton of type

Phil: type is useful when receiving instance data to know whether you are receiving a URI or Bnode.

Also, schema.org often gives you a choice of types

KC: in shex you can say URI or LITERAL. But with that you can't provide more specific rules. This makes more sense for validating than creating. Creation templates usually provide more structure. Template is to help people define and create data.

What does a BNODE look like in a data creation environment?

  • Prefixes Tom: not enthused about prefixes at the top, nor columns doing double duty. Prefer to document them in a separate document. Tom: 2nd column more specific property URI

  • John's example (DataCite)

    • Using their AP; probably out of data from current datacite schema.
    • includes mapping to other vocabularies
    • there is the problem of OR Tom: best to punt on OR

KC: this one possibly could be modeled as an entity:

property is: dc:subject (a literal) or dct:subject (a URI)

has subject -> Mandatory -> @subject

@subject dc:subject - optional -> literal dct:subject - optional -> URI

Ben: This type of choice is common in Sinopia. When a property has a list lookup that would be a URI, if your value is not found you have the option of entering a literal, so it can be either

Tom: Where do you program in the assumptions? Some could be in the next step, the conversion script. Use profile as a start for shex/shacl, but then edit the latter for more precision.

Next tasks: Tom: work on value type, value space. Look at examples. Thinking about conversion. Just do those 2 or 3 columns as examples, not whole templates. Phil: working on a conversion script

Definitions

Abstract model strawman: "A profile can have one or more entities, each with one or more statements."

Naming and defining:

  • Profile_Entity

    • Definition strawman: "A thing or concept that will be described by the profile."
    • Is the entity described on its own row? Does it have cardinality?
    • Column label:
    • Also suggested:
  • Property

    • Definition strawman: "The previously defined data element that can be used to describe an aspect of the entity."
    • Column label:
  • Statement

    • Definition strawman: "The property and value constraints that describe an aspect of the entity."
    • Also suggested:

(Note to kc self: to what does the annotation refer? The entire row? Some element in the row?)

Minutes

Open Questions

  1. Will we want to define URIs for these properties? Do we prefer to use existing DC (or other) terms or AP-specific?
  2. How to define namespaces within a tabular format?
  3. How far do we want to take the value definitions? Include more, or leave to developers? What is the optimum "simple" for this? Would we want to include regex formulas in this column?
  4. Can valuetype be expanded to include pick lists, URI stems? Or do we limit to standard value types?
  5. Should the entity be on a separate row, or only on the row with properties? (advantages to separate: can include cardinality; can have an entity-specific annotation)
  6. How can we include constraints for "recommended" and "mandatory if applicable"?
  7. What transformations can we add to our prototypes?
  8. Should we specifically develop a "ShEx-friendly" version? e.g. https://github.com/johnsamuelwrites/dcap/blob/master/ShExStatements/shexstatements.ipynb

Resolved

  1. Can minimalist profile be the core of a more extensive profile template? (We seem to agree that it SHOULD.)
  2. Can the "simple" be used as is, or are more properties needed? (We are going with "as is" for now)
  3. How will we handle multiple values in a single cell (e.g. a pick list of string values)? ANSWER: For now using ShEx method of space between entries.