Report for visualization of descriptive metadata that distinguishes properties with a "value" #4522

Open
ndushay opened this issue Jun 16, 2023 · 2 comments

ndushay commented Jun 16, 2023

As part of #4493, we need a report akin to "descriptive_shapes" from https://github.com/sul-dlss/dor-services-app/blob/main/app/reports/descriptive_shape.rb, which will feed visualizations in sul-dlss-labs/cocina-shapes.

The deltas needed

  • only count data when the cocina DescriptiveValue (or equivalent) has a value. For example, we don't care about the authority being LCSH if there is no actual value in a subject.
  • two kinds of counts needed:
    • count once per object if there is one or more value
    • count all occurrences with values per object

We still want the ability to select for ILS catalog links present or absent, as in the descriptive shapes report.

Refs:

Spec:

  • 2 separate types of counts to track:

    • presence of one or more values in an object
    • number of occurrences with values in objects

    if object A has 2 subjects (with value) and object B has 3 subjects (with value):
    - presence of one or more subjects per object: 2: 1 for object A and 1 for object B
    - number of occurrences of subject: 5: 2 for object A and 3 for object B
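
A minimal sketch of the two counts, assuming each object's subjects are available as parsed Cocina JSON hashes with string keys. The method names `with_value?` and `tally_subjects` and the sample data are hypothetical, and the value check here only looks at a direct, non-blank `value`:

```ruby
# Simplified check: a direct, non-blank 'value' (nested values are ignored here).
def with_value?(node)
  !node['value'].to_s.strip.empty?
end

# objects: hash of object id => array of subject hashes.
def tally_subjects(objects)
  presence = 0
  occurrences = 0
  objects.each_value do |subjects|
    count = subjects.count { |subject| with_value?(subject) }
    presence += 1 if count.positive? # once per object
    occurrences += count             # every subject with a value
  end
  { presence: presence, occurrences: occurrences }
end

objects = {
  'A' => [{ 'value' => 'Cats' }, { 'value' => 'Dogs' }],
  'B' => [{ 'value' => 'Birds' }, { 'value' => 'Fish' }, { 'value' => 'Frogs' }],
  'C' => [{ 'source' => { 'code' => 'lcsh' } }] # authority only, no value: ignored
}
counts = tally_subjects(objects)
# counts[:presence] is 2 and counts[:occurrences] is 5
```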

How to determine if a cocina descriptive property has a value:

DescriptiveBasicValue

QUESTION: Note that code property with a value is only ok if we know the source -- punting for now. Note that these are all defined as type DescriptiveValue in the openapi doc

DescriptiveValue

DescriptiveBasicValue with appliesTo property added

Treat as DescriptiveBasicValue

Cocina Description Top Level Values

Title

Treat as DescriptiveBasicValue

Contributor

only count value if direct children properties of name, note, or identifier have value per DescriptiveBasicValue
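
A rough sketch of the contributor rule, assuming parsed Cocina JSON hashes with string keys. `basic_value?` is a hypothetical stand-in for "has value per DescriptiveBasicValue": it assumes a non-blank `value` counts, and that values may be nested under `structuredValue`, `parallelValue`, or `groupedValue` (the nesting keys named elsewhere in this issue; the list may be incomplete):

```ruby
# Assumed rule: a node has a value if it has a non-blank 'value' or any
# nested structured/parallel/grouped node that does.
def basic_value?(node)
  return false unless node.is_a?(Hash)
  return true unless node['value'].to_s.strip.empty?

  %w[structuredValue parallelValue groupedValue].any? do |key|
    Array(node[key]).any? { |child| basic_value?(child) }
  end
end

# A contributor counts only if a direct name, note, or identifier child has a value.
def contributor_value?(contributor)
  %w[name note identifier].any? do |key|
    Array(contributor[key]).any? { |child| basic_value?(child) }
  end
end

named = { 'name' => [{ 'structuredValue' => [{ 'value' => 'Smith', 'type' => 'surname' }] }] }
role_only = { 'role' => [{ 'value' => 'author' }] }
contributor_value?(named)     # true: a name child has a nested value
contributor_value?(role_only) # false: role is not name, note, or identifier
```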

Event

only count value if

  • direct children properties of date, contributor, location, identifier, note have value per DescriptiveBasicValue
  • direct child parallelEvent as above, e.g. parallelEvent[].date[].value
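
One possible reading of the event rule as code, assuming parsed Cocina JSON hashes with string keys. The helper names are hypothetical, and `basic_value?` makes the same assumptions about nesting keys as the rest of this spec (`structuredValue`, `parallelValue`, `groupedValue`). Every child, including `contributor`, is treated literally as a DescriptiveBasicValue here:

```ruby
# Assumed rule: non-blank 'value', directly or under a nested value array.
def basic_value?(node)
  return false unless node.is_a?(Hash)
  return true unless node['value'].to_s.strip.empty?

  %w[structuredValue parallelValue groupedValue].any? do |key|
    Array(node[key]).any? { |child| basic_value?(child) }
  end
end

# An event counts if a direct date, contributor, location, identifier, or note
# child has a value, or if any parallelEvent satisfies the same rule.
def event_value?(event)
  direct = %w[date contributor location identifier note].any? do |key|
    Array(event[key]).any? { |child| basic_value?(child) }
  end
  direct || Array(event['parallelEvent']).any? { |parallel| event_value?(parallel) }
end

event_value?({ 'parallelEvent' => [{ 'date' => [{ 'value' => '1990' }] }] }) # true
event_value?({ 'type' => 'publication' })                                   # false
```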

Form

Treat as DescriptiveBasicValue

Language

as DescriptiveBasicValue with script property added:

  • add language[].script.value as a value (parallelValue, structuredValue, groupedValue etc. can be in path after script)
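
A small sketch of the language rule, assuming parsed Cocina JSON hashes with string keys and that `script` is a single object rather than an array. The helper names are hypothetical. `code` is deliberately not counted, in line with the punted question about codes:

```ruby
# Assumed rule: non-blank 'value', directly or under a nested value array.
def basic_value?(node)
  return false unless node.is_a?(Hash)
  return true unless node['value'].to_s.strip.empty?

  %w[structuredValue parallelValue groupedValue].any? do |key|
    Array(node[key]).any? { |child| basic_value?(child) }
  end
end

# A language counts if it has a value itself or its script does.
def language_value?(language)
  basic_value?(language) || basic_value?(language['script'])
end

language_value?({ 'script' => { 'value' => 'Latin' } }) # true
language_value?({ 'code' => 'eng' })                    # false: code alone is not counted
```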

Note

Treat as DescriptiveBasicValue

Identifier

Treat as DescriptiveBasicValue

Subject

Treat as DescriptiveBasicValue

Geographic

has only form and subject properties as direct children -- treat these as you would top level subject or form

Purl

It's a string. Count it if the value is not blank.

Access

Treat all immediate children (url, physicalLocation, digitalLocation, accessContact, digitalRepository, note) as DescriptiveBasicValue

RelatedResource

count it for any "top level" properties under relatedResource
e.g. relatedResource[].title[].structuredValue[].value - counts as a value per recursion
but relatedResource[].status - does NOT count as a value
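
To illustrate the recursion, here is a sketch that collects the path of every non-blank `value` under a node, using the bracketed path notation from this issue. It assumes parsed Cocina JSON with string keys; `value_paths` and the sample data are hypothetical. It is a simplification: it reports any path ending in `value` and does not apply the per-property or sub-property rules.

```ruby
# Walk arrays and hashes, emitting "<path>.value" for each non-blank 'value'.
def value_paths(node, path)
  case node
  when Array
    node.flat_map { |child| value_paths(child, "#{path}[]") }
  when Hash
    node.flat_map do |key, child|
      if key == 'value'
        child.to_s.strip.empty? ? [] : ["#{path}.value"]
      else
        value_paths(child, "#{path}.#{key}")
      end
    end
  else
    [] # plain strings such as status are not values
  end
end

related = [{
  'status' => 'superseded',
  'title' => [{ 'structuredValue' => [{ 'value' => 'Report', 'type' => 'main title' }] }]
}]
paths = value_paths(related, 'relatedResource')
# paths contains only "relatedResource[].title[].structuredValue[].value"
```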

AdminMetadata

follow subproperties (note, event, contributor, identifier, language, metadataStandard) and use rules of indicated value, e.g.:
adminMetadata.note[].value - count as value if present, but also add valueAt as a valid value
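
For example, the adminMetadata note rule might look like the following, assuming parsed Cocina JSON hashes with string keys. `admin_note_value?` is a hypothetical name, and only direct `value` and `valueAt` are checked in this sketch:

```ruby
# An adminMetadata note counts if either 'value' or 'valueAt' is non-blank.
def admin_note_value?(note)
  %w[value valueAt].any? { |key| !note[key].to_s.strip.empty? }
end

admin_note_value?({ 'valueAt' => 'https://example.org/record' }) # true
admin_note_value?({ 'type' => 'record origin' })                 # false
```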

SUB-PROPERTIES

These should only be counted if the parent has a value per above

Standard

ValueLanguage

as Standard plus valueScript property, which should also count as a value

Encoding

exactly equivalent to Standard; treat same as Standard

Source

as Standard, but without source property ... so code cannot count

MarcEncodedData

  • only present for RequestDescription
    Can ignore because we're going from Marc to MODS to Cocina, not directly Marc to Cocina

If we weren't ignoring: treat as DescriptiveValue


DECISION: implementing the below would probably cost more human hours than it saves in computing hours

paths we can SKIP when looking for presence of value per object (these are useful only if there is a value):
- do not count for presence of value
- count occurrences only when a value is present

  • encoding
  • source
  • standard
  • valueLanguage
  • valueScript

ndushay commented Jun 22, 2023

So, I have a DRO that has 3 top level contributors with values. Am I counting ALL subvalues for each contributor with a value, for the report that counts all occurrences? And what am I counting for “presence of one or more values”? Just the top level contributor fields? (How would I count presence of value for sub-properties? Use all 3 occurrences with values to see if any of them use a sub-property?)

If this spec is this hard ... that seems like a smell. This seems frighteningly complex for basic questions, like "does object A have a contributor in the desc metadata (that is a real value and isn't noise)?"


ndushay commented Jun 23, 2023

@arcadiafalcone: event has a top level structuredValue, in addition to parallelEvent. Should that be allowed somehow?

Also, for event, if there is a top level subproperty of "contributor:" should it be treated as a descriptiveBasicValue, or as a contributor (only care about it if there is a name, note or identifier child property)?
