New Term - dcterms:PhysicalResource #421

Closed
baskaufs opened this issue Nov 23, 2022 · 66 comments
Labels
Class - new; Task Group - Material Sample (https://www.tdwg.org/community/osr/material-sample/); Term - add

Comments

@baskaufs
Contributor

New term

  • Submitter: Steve Baskauf
  • Efficacy Justification (why is this term necessary?): At the present time, Darwin Core does not have a top level class term for describing physical entities. The Material Sample Task Group has been working for over a year to sort out the relationships among existing classes of material things in Darwin Core with the goal of addressing shortcomings in our ability to share information about physical things. This task has been hampered by the shortcomings of the current broadest term for material things: dwc:MaterialSample. Because that term requires an aspect of sampling, it conflates the role of the material (to serve as a sample) and the fundamental type of the resource (that it's a material rather than digital or information resource). This complication has hindered the progress of the group, whose work is now at a critical phase with the need to harmonize its work with that of Latimer Core (currently under review). This proposal would improve the situation by importing into Darwin Core a term from what is probably the most well-known metadata vocabulary: Dublin Core. That term has exactly the scope (material things) that is being considered by the Material Sample Task Group and importing it into Darwin Core would allow the group to clarify its work by describing the kinds of material things, rather than material samples.
  • Demand Justification (name at least two organizations that independently need this term): Material Sample Task Group, which includes representatives of over 10 organizations, GBIF because this would be a standardized term for the highest level classification in its developing "Grand Unified Model": "Material Entity"
  • Stability Justification (what concerns are there that this might affect existing implementations?): This would not prevent the use of any existing classes of material things within Darwin Core. However, it would be understood that it would be a superclass to dwc:MaterialSample, dwc:PreservedSpecimen, dwc:LivingSpecimen, dwc:FossilSpecimen. Since entailment-generating assertions are no longer included in basic "bag-of-term" metadata, these subclass relations would not be asserted directly by subClassOf statements for those other classes, although such assertions could be included in a vocabulary enhancement layered on top of the Darwin Core bag of terms. The relationship of this term to dwc:Organism should be the subject of future discussion.
  • Implications for dwciri: namespace (does this change affect a dwciri term version)?: N/A since this is not a property

Proposed attributes of the new term:

  • Term name (in lowerCamelCase for properties, UpperCamelCase for classes): dcterms:PhysicalResource
  • Organized in Class (e.g., Occurrence, Event, Location, Taxon): N/A, it is a class itself
  • Definition of the term (normative): A material thing. (taken verbatim from the DCMI Metadata Terms)
  • Usage comments (recommendations regarding content, etc., not normative):
  • Examples (not normative): a specimen, material that is destructively sampled, a physical object observed but not collected
  • Refines (identifier of the broader term this term refines; normative):
  • Replaces (identifier of the existing term that would be deprecated and replaced by this term; normative):
  • ABCD 2.06 (XPATH of the equivalent term in ABCD or EFG; not normative): none (I think? Don't see any equivalent class in the Unit Type Classes)
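
The Stability Justification above separates the plain "bag of terms" from an optional layer that carries the subclass assertions. A minimal sketch of that separation follows, assuming the four subclass pairs named in the proposal; the names `ENHANCEMENT_LAYER`, `superclasses`, and `inferred_types` are hypothetical and not part of Darwin Core.

```python
# Core "bag of terms": class identifiers only, no entailment-generating assertions.
CORE_CLASSES = {
    "dcterms:PhysicalResource",
    "dwc:MaterialSample",
    "dwc:PreservedSpecimen",
    "dwc:LivingSpecimen",
    "dwc:FossilSpecimen",
}

# Hypothetical enhancement layer: (subclass, superclass) pairs from the proposal text.
ENHANCEMENT_LAYER = {
    ("dwc:MaterialSample", "dcterms:PhysicalResource"),
    ("dwc:PreservedSpecimen", "dcterms:PhysicalResource"),
    ("dwc:LivingSpecimen", "dcterms:PhysicalResource"),
    ("dwc:FossilSpecimen", "dcterms:PhysicalResource"),
}


def superclasses(term, layer):
    """Collect every class reachable from `term` through the subclass pairs in `layer`."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for sub, sup in layer:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


def inferred_types(asserted_type, layer=frozenset()):
    """Return the asserted type plus whatever the chosen layer entails."""
    return {asserted_type} | superclasses(asserted_type, layer)


# Without the layer, nothing beyond the asserted type is entailed.
print(sorted(inferred_types("dwc:FossilSpecimen")))
# With the layer, the record is also a dcterms:PhysicalResource.
print(sorted(inferred_types("dwc:FossilSpecimen", ENHANCEMENT_LAYER)))
```

Consumers that ignore the layer still see valid records; only those that opt in get the extra class membership.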
@Jegelewicz

I support this!

The relationship of this term to dwc:Organism should be the subject of future discussion.

For some discussions already in place see
tdwg/material-sample#22
tdwg/material-sample#23

The MaterialSample Task Group concluded that an organism may be a MaterialSample, but not all MaterialSamples are organisms. I think we could safely replace MaterialSample with PhysicalResource and say the same thing.

@tucotuco
Member

tucotuco commented Nov 23, 2022 via email

@baskaufs
Contributor Author

baskaufs commented Nov 23, 2022

One interesting observation that I think is an aside. Darwin Core incorporates dcterms:type, which has a controlled vocabulary that includes dcmi:PhysicalObject, which is NOT the same as a dcterms:PhysicalResource.

One thing that came out of the discussion between the Audubon Core 3D Task Group and the DCMI usage board was that using the DCMI type vocabulary as the source of values for dcterms:type is only a suggestion, not a requirement. They pointed out that the comment is "Recommended practice is to use a controlled vocabulary such as the DCMI Type Vocabulary" (my emphasis) and that communities of practice can establish their own controlled vocabularies that they expect to be used. In the past, TDWG has been pretty focused on only using the DCMI type vocabulary, but I think it's time to consider deviating from it. I know that in the case of Audubon Core, it has pretty much been decided to mint a "digital3DResource" class to be used with dcterms:type since none of the existing terms works and the DCMI Usage Board has no intention of adding any type terms.

Another comment relevant to this is that we should consider whether it's actually more appropriate to be using rdf:type rather than dcterms:type. dcterms:type is more well known, but DCMI has actually said that at least in an RDF context, rdf:type is to be preferred over dcterms:type. I'll have to find the reference, but I think it's cited in the Darwin Core RDF Guide somewhere.
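
To make the rdf:type versus dcterms:type distinction concrete, here is a rough sketch that states the type of one resource both ways as plain triples. The subject IRI and the helper `objects` are hypothetical, and using a DCMI Type Vocabulary IRI as the dcterms:type value is only one possible choice.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCTERMS_TYPE = "http://purl.org/dc/terms/type"
PHYSICAL_RESOURCE = "http://purl.org/dc/terms/PhysicalResource"
DCMITYPE_PHYSICAL_OBJECT = "http://purl.org/dc/dcmitype/PhysicalObject"

# Hypothetical subject IRI, used only for illustration.
SPECIMEN = "http://example.org/specimen/1"

triples = [
    # Class membership stated with rdf:type.
    (SPECIMEN, RDF_TYPE, PHYSICAL_RESOURCE),
    # A type value taken from a controlled vocabulary, stated with dcterms:type.
    (SPECIMEN, DCTERMS_TYPE, DCMITYPE_PHYSICAL_OBJECT),
]


def objects(subject, predicate, graph):
    """Return the objects of every triple matching the subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


print(objects(SPECIMEN, RDF_TYPE, triples))
print(objects(SPECIMEN, DCTERMS_TYPE, triples))
```

The two statements can coexist on the same resource, so preferring rdf:type in RDF does not require dropping dcterms:type elsewhere.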

@baskaufs baskaufs reopened this Nov 23, 2022
@baskaufs
Contributor Author

Another comment relevant to this is that we should consider whether it's actually more appropriate to be using rdf:type rather than dcterms:type. dcterms:type is more well known, but DCMI has actually said that at least in an RDF context, rdf:type is to be preferred over dcterms:type. I'll have to find the reference, but I think it's cited in the Darwin Core RDF Guide somewhere.

It's here

@cboelling
Member

cboelling commented Nov 24, 2022

I support the addition of this element with the scope indicated. The terminological choice of the label "PhysicalResource" is one of a couple of options, including perhaps "MaterialEntity", "PhysicalEntity", "MaterialObject", "PhysicalObject" - see below.

I can confirm that the proposed dcterms:PhysicalResource is, in its generality, exactly what is needed to facilitate export and sharing of representations of object histories from the applications being developed in the DINA consortium.

More importantly, addition of this term will help to close a gap towards a multitude of knowledge representations in the Open Biological and Biomedical Ontologies, a number of which, for a long time already, use the element MaterialEntity as a top-level class. This class was originally defined in the Basic Formal Ontology (BFO).

To foster compatibility, I therefore propose using the label "MaterialEntity" for this new term and updating the definition along the lines of the annotation provided for the corresponding element in BFO.

In comparison to DCMI, I find that BFO is the more advanced and conceptually consistent framework for structuring knowledge representations, so it might be helpful to take it into consideration here. One example to illustrate this: alongside dcterms:PhysicalResource, DCMI defines dcterms:PhysicalMedium as "A physical material or carrier." without specifying the relation between these classes.

One note about process: personally, given the work that has been done in the Task Group, and especially the outcomes of the recent working session on Nov 7 2022, I would have preferred that the motion to add this element had been proposed in the Material Sample Task Group first, to mint a joint proposal reflecting the contributions to the task group by a number of individuals.

@tucotuco tucotuco added Class - new Task Group - Material Sample https://www.tdwg.org/community/osr/material-sample/ labels Nov 24, 2022
@tucotuco
Member

Nice summary, @cboelling. The MaterialEntity class in the Unified Model originated in BFO, so there is perfect compatibility there.
To address the concern about the source of this proposal, I have added the issue label "Task Group - Material Sample" in an attempt to give credit to its origins. We can move the issue to the Task Group issue tracker if that is more palatable, awaiting a consensus from there on how to move forward with all Material Sample considerations at once. Open to suggestions on that front.

@jbstatgen

  • Because that term requires an aspect of sampling, it conflates the role of the material (to serve as a sample) and the fundamental type of the resource (that it's a material rather than digital or information resource).

Thanks a lot @baskaufs for separating out these two aspects. With this, I feel we have a basis from which to move the next step forward.

A request for clarification: Are bfo:MaterialEntity and dcterms:PhysicalResource equivalent (exactly the same)?

Checking the definition and editor note for bfo:MaterialEntity and reading its examples, it seems that MaterialEntity includes "energy", e.g., examples are "a photon", "a tornado", "an energy wave". Thus, the core concept of MaterialEntity I would associate with "empirical".

What we are talking about in our context, I thought, is rather the "mass" than the "energy" part of "matter" (cp. the bfo:MaterialEntity editor note). dcterms:PhysicalResource seems vague on the details - would that mean that it might be more fitting?

My concern is that if we go too far into the "energy" part of matter, then we end up with a continuum between PhysicalResource and "InformationArtifact", e.g., as attributes of a term "baseTypeOfCollection" or similar at the object level, because at some point the 0's and 1's of digital information become matter.

However, if bfo:MaterialEntity and dcterms:PhysicalResource are equivalent, then there is no difference between choosing one or the other term; plus the philosophizing about matter and information might be moving too much into splitting hairs.

@cboelling
Member

cboelling commented Nov 28, 2022

A request for clarification: Are bfo:MaterialEntity and dcterms:PhysicalResource equivalent (exactly the same)?

From the available documentation, I understand that these are meant to be equivalent - although I have little experience in how dcterms:PhysicalResource is practically applied.

Checking the definition and editor note for bfo:MaterialEntity and reading its examples, it seems that MaterialEntity includes "energy", e.g., examples are "a photon", "a tornado", "an energy wave". Thus, the core concept of MaterialEntity I would associate with "empirical".

What we are talking about in our context, I thought, is rather the "mass" than the "energy" part of "matter" (cp. the bfo:MaterialEntity editor note). dcterms:PhysicalResource seems vague on the details - would that mean that it might be more fitting?

I find some of the examples of instances given for bfo:MaterialEntity in the BFO documentation confusing or even plainly wrong in light of the elucidation part of their documentation (because the BFO folks have a specific idea of what a definition should look like, they avoid the term "definition" for some of the high-level classes). The elucidation for bfo:MaterialEntity reads: "A material entity is an independent continuant that has some portion of matter as proper or improper continuant part." This makes reference to the even more general class of bfo:IndependentContinuant.

So, a thing is an instance of bfo:MaterialEntity if and only if it has some portion of matter as its part.

So an energy wave, in my view, most certainly (I am not a theoretical physicist) isn't a bfo:MaterialEntity. A tornado, in any practical application of BFO that I know of, would likely be described as a process (a bfo:Occurrent) in which numerous instances of bfo:MaterialEntity (grains of dust, droplets of water) participate. The terms "participate" and "being part" in this context (and in contrast to their colloquial use in everyday conversation) designate fundamentally different relations. In short, material entities participate in a process (which in itself isn't a material thing) but may be part of a larger material entity.

For the proposed element as a top-level class for accommodating physical (material) entities, the important thing is that any item that is ordinarily dealt with in preparing, handling or using a collection object falls squarely into the scope of bfo:MaterialEntity, or, as it seems, dcterms:PhysicalResource.

Information artifacts are clearly separated from Material Entities in the BFO world as so called generically dependent continuants, which on a conceptual level makes a lot of sense to me.

@jbstatgen

@cboelling Thanks for pointing further into the BFO. The sub-entries in the menu under "entity" are quite enlightening. Happily they seem to support some of our trains of thought so far. In addition, there is plenty more to think about.

Good to know that I'm not the only one who has been wondering about the examples for bfo:MaterialEntity.

While I might prefer "PhysicalResource" as term, I see the following:

  • "Physical": a definition?
  • "Resource": ethical concerns are arising with regard to 1) human remains and 2) objectification and "resourcifying" of nature

Thus, (a more narrow definition/use of) bfo:MaterialEntity might be a better choice.

@Jegelewicz

What we are talking about in our context, I thought, is rather the "mass" than the "energy" part of "matter" (cp. the bfo:MaterialEntity editor note). dcterms:PhysicalResource seems vague on the details - would that mean that it might be more fitting?

I am open to either PhysicalResource or MaterialEntity; however, the MaterialEntity editor notes give me pause.

Also, what appropriate term will designate "not material" stuff? Can someone point to this in either DCMI or BFO? Although it isn't part of "MaterialSample" work, it will be important to know when one has a PhysicalResource or MaterialEntity and when one has whatever else we might be interested in.

I think the best definition I have heard for "material" is that it is something made of atoms. :-)

@tucotuco
Member

In BFO the independent continuant is divided into (and only into) material entity and immaterial entity. See https://ontobee.org/ontology/BFO?iri=http://purl.obolibrary.org/obo/BFO_0000004

Ignoring 20th century physics (sticking with atoms) may be fine for our use cases, but it grates on my inner physicist. ;-)

@Jegelewicz

Ignoring 20th century physics (sticking with atoms) may be fine for our use cases, but it grates on my inner physicist. ;-)

Oh yeah! I feel that too - but it does seem appropriate for our use?

@cboelling
Member

Also, what appropriate term will designate "not material" stuff? Can someone point to this in either DCMI or BFO? Although it isn't part of "MaterialSample" work, it will be important to know when one has a PhysicalResource or MaterialEntity and when one has whatever else we might be interested in.

To answer this it would be helpful to have concrete examples of what you are thinking of with respect to not material stuff. For some entities BFO provides, in my opinion, an attractive framework to organize them and relate them to Material Entities. Note that BFO operates, as is my understanding, on the open world assumption. The classes in BFO are not meant to be exhaustive, it is valid to declare additional classes / subclasses in your own application.

@deepreef

Chiming in to voice my general support on this! One word of caution, though:

The class dcmitype:PhysicalObject would clearly not work as a broader term (superclass) for dwc:Organism, as an Organism need not be inanimate.

I'm not sure if @tucotuco meant to suggest that dwc:Organism would be treated as a subclass of this new class (whatever the label ends up being), but I would strongly advise against representing it that way. It's really important from an informatics perspective to acknowledge that the "essence" of an instance of dwc:Organism transcends its physical manifestation. An instance of dwc:Organism on the day of its birth shares very few atoms with that same dwc:Organism instance on the day of its death; yet informatically we want to treat them both as the same instance of dwc:Organism.

My inner physicist and inner biologist have discussed this with each other for many years.

@dr-shorthair

dr-shorthair commented Nov 28, 2022

Regarding the relationship between Organism and PhysicalObject or PhysicalResource, I think the set-theoretic viewpoint is helpful here.

There is a class of Physical Objects.

There is a class of Organisms.

Some Things are both Physical Objects and Organisms - the classes intersect.

We do not have to assert that Organism is a sub-class of Physical Object to have both of these concepts involved.
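
A small sketch of that set-theoretic reading, with class membership recorded per instance and no subclass statement anywhere. The instance identifiers and their memberships are invented for illustration, including the assumption that one instance is typed only as an Organism.

```python
# Hypothetical instance data: each thing lists the classes it is asserted to belong to.
instance_types = {
    "ex:thing1": {"PhysicalObject"},
    "ex:thing2": {"PhysicalObject", "Organism"},
    "ex:thing3": {"Organism"},
}


def members(cls):
    """Return the set of instances asserted to belong to `cls`."""
    return {thing for thing, classes in instance_types.items() if cls in classes}


physical_objects = members("PhysicalObject")
organisms = members("Organism")

# The classes intersect: some things are both.
overlap = physical_objects & organisms
# Nothing in the data makes Organism a subset of PhysicalObject.
organism_is_subset = organisms <= physical_objects

print(sorted(overlap))
print(organism_is_subset)
```

Both concepts are usable together here even though the data never says that one class is contained in the other.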

@deepreef

I definitely agree that Organisms are (mostly) physical stuff. But they're more at the "tornado" end of the spectrum (i.e., dynamic, with changing atoms over time, participate in actions in different locations at different times, etc.)

Conceptually, I can see an overlapping Venn diagram between dwc:Organism and dwc:PhysicalObject in the sense that some properties of an instance of dwc:Organism certainly involve physical matter.

But normally with set-theory diagrams like this (at least in our space), the diagram represents instances, as in some instances of dwc:Organism are also instances of dwc:PhysicalObject, but some instances of each are not also instances of the other. In this sense, I'm not sure a set-theory approach works (unless we're talking about sets of core properties, rather than sets of instances).

We do have to assert that Organism is a sub-class of Physical Object to have both of these concepts involved.

I'm not sure what you mean by "involved" here. Would you say the same thing about dwc:Location or dwc:Taxon? In my mind, the dwc:Organism class has informatic value that goes well beyond whatever physical manifestation a given instance of dwc:Organism happens to have at any given moment. In the same way that we represent relationships between dwc:Location-dwc:Event, or dwc:Taxon-dwc:Identification-dwc:Organism, we can also represent relationships between dwc:Organism-dwc:PhysicalObject. But I don't see why any of these relationships need to be represented as the superclass-subclass sort.

As something of a counter-point, one somewhat problematic aspect of treating something like a "specimen" as an instance of dwc:PhysicalObject is that, like dwc:Organism and tornadoes and such, the atomic composition of a preserved specimen also changes over time. It's just that these three different things (specimens, organisms, tornadoes) change their material composition at different rates.

That level of pedantry is probably unhelpful; but I remain uncomfortable regarding dwc:Organism as a subclass of dwc:PhysicalObject, unless it's constrained to an instance of an organism at a particular moment in time.

@dr-shorthair

Whoops - a very important 'not' was missing - fixed now.

@jbstatgen

@Jegelewicz Good question. I had agreed with @tucotuco about bfo:ImmaterialEntity, until @deepreef threw in dwc:Organism. That brought me back to a diagram that I had done early on in our work process (focus on overall structure)
[Image: DarwinSWdiagram_20221129 (draw.io diagram)]

Apparently the BFO deals only with bfo:Entities that are "tangible", i.e., they are events (bfo:Occurrent) or tokens/evidence (bfo:Continuant). They belong to the row of subjective perceptions, even if they are bfo:ImmaterialEntities. BFO defines "immaterial entities" as boundaries or spaces, e.g., the lumen of an intestine or a nasal cavity. Thus, they are very much concrete.

Thus, the BFO doesn't seem to be developed for abstract concepts. In the diagram these include both the top row "Models" and the second row "Reality (abstract)". For me, occurrences, organisms and taxa are abstract concepts.

As part of the Material Sample group we are focusing on the parts of reality that we can actually perceive and grab onto. We can catch the photons modified (? uuh, Physics 101) by an event (bfo:Occurrent) with a sensor in a digital camera, which points to an occurrence. We can cut wood and observe the lumen of its xylem vessels or dissect a moose and look at the size of the inside of its stomach. That is, the bfo:ImmaterialEntities are delimited/defined by concrete instances of bfo:MaterialEntities. Both are bfo:IndependentContinuants, a subcategory of bfo:Continuant. All of these bfo:Occurrents and bfo:Continuants are subcategories of bfo:Entity, which seems to always refer to something concrete, tangible.

@deepreef

I'm assuming your diagram was based on Darwin-SW, but if not, that means that three different groups independently arrived at a very similar modelling (the third group being Rob Whitton and I, who as part of the old NSF-BiSciCol project converged on exactly the same approach as Darwin-SW, except instead of "Token" we went with "Evidence".)

I agree that dwc:MaterialSample is a useful subclass of the proposed dwc:PhysicalResource (or whatever it ends up being called). My main point was to keep dwc:Organism as a separate thing (like dwc:Event, dwc:Location, dwc:Taxon, etc.)

@deepreef

Whoops - a very important 'not' was missing - fixed now.

Ah! Makes much more sense! Thanks!

@jbstatgen

@deepreef, yes, the diagram is based on Darwin-SW, thanks a lot for pointing this out. For the full context and the many people who contributed, see our discussion for issue #11 in the material-sample repository.

I found your insertion of dwc:Organism very helpful, since it informed @Jegelewicz's question about "immaterial" entities. Your and @dr-shorthair's comments, while having a different focus, gave me the background for suggesting that bfo:ImmaterialEntity, as part of concrete evidence, be differentiated from abstract concepts, something that hadn't been immediately obvious to me. I'm trying to catch up on what happened overnight.

@jbstatgen

One note about process: personally, given the work that has been done in the Task Group, and especially the outcomes of the recent working session on Nov 7 2022, I would have preferred that the motion to add this element had been proposed in the Material Sample Task Group first, to mint a joint proposal reflecting the contributions to the task group by a number of individuals.

@cboelling Until the working session after the conference, at least I had thought that the "materialSampleType" term had been sufficiently discussed, widely agreed on within the group and was unproblematic. The many comments and suggestions during the session showed that there were still open aspects. Personally, I am glad the discussion got reopened, because now the term that we will propose will have been set into a broader context and be mapped to and aligned with existing equivalent/similar terms. The proposal will be more solid for it.

Thanks a lot for reopening the discussion, so that @baskaufs proposed dcterms:PhysicalResource, and you pointed us to the BFO and its bfo:MaterialEntity.

@tucotuco
Member

tucotuco commented Nov 29, 2022

Darwin-SW is easily recognizable in the version of the GBIF Unified Model (see Appendix II and also https://github.com/gbif/model-material/blob/master/README.md) current as of the opening of the collections management systems mapping project, where Organism is a subtype of MaterialEntity, and MaterialEntity is congruent with bfo:material entity and dcterms:PhysicalResource.

@deepreef

Organism is a subtype of MaterialEntity, and MaterialEntity is congruent with bfo:material entity and dcterms:PhysicalResource.

Yeah, I guess that's what I'm hoping to avoid. I can certainly imagine "BiologicalMaterial" or something like that as a subtype/subclass of MaterialEntity/PhysicalResource; but I see that as very different from dwc:Organism, which is as much (or more) an abstract entity as (than) it is a physical/material entity. This is/was the root of the fundamental issue about where an Organism ends, and a MaterialSample begins.

@stanblum
Member

stanblum commented Nov 30, 2022

@baskaufs wrote:

The one thing I would quibble about a little is that there really should be an arrow connecting the Identification and the Organism (i.e. taxonomically homogeneous thing).

My representation (Identification to Token) is:

  1. from old training where the admonition is to minimize the number of relationships ("circles/loops among relationships indicate inconsistent business rules");
  2. because a (properly scoped) Token has the same or narrower scope as Organism; if the wolf pack organism contains wolves and a husky/wolf hybrid, it needs to be broken into at least two organisms;
  3. it's been argued by others that Identifications should be directed toward the stuff it's based on.

But I take your point that an identification that summarizes and represents a decision based on multiple Tokens would need to be attached to the Organism. But that kind of makes it different than the ID based on a narrower scope. So I guess an ideal Identification entity has the capacity to indicate specifically what it's based on.

@deepreef

deepreef commented Nov 30, 2022

Wow... lots to read and catch up on. One quick reply to @stanblum:

So I guess an ideal Identification entity has the capacity to indicate specifically what it's based on.

So, in our approach, the Organism is an abstract thing, not a physical thing. It is the continuum of matter and energy (kinetic and potential) that begins at conception or mitosis or whatever (in the biological context) and ends at death or dissolution or whatever, but it is not limited to just the physical matter that might be observed by a human at a particular moment in time.

Thus, through this lens, the Identification is an assertion that one abstract entity (an Organism) is an exemplar of another abstract entity (a Taxon). What this assertion is "based on" is what we refer to as "Evidence" (=Token in Darwin-SW). The scope of Evidence in our thinking includes things like MaterialSamples (the explicitly physical representation of Organisms), images and other media, observations, and documented information (e.g., publications). These are the tangible things that are not themselves the "Organism", but rather are physical and informatic derivatives of Organisms, upon which a taxonomic Identification may be based.

So, yes, ideally we have the ability to link a single instance of Identification to multiple "Tokens" (evidence), that serve as the basis by which the assertion of a link between an Organism and a Taxon (i.e., the Identification instance) is made.

Incidentally, the same kinds of Tokens can also serve as "evidence of Occurrence" as well as "evidence of Identification".
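
A rough sketch of the structure described above, assuming simple string identifiers: an Identification joins an Organism to a Taxon and carries any number of evidence Tokens, and the same Token can also be attached to an Occurrence. The class and field names are hypothetical and are not Darwin Core or Darwin-SW terms.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Token:
    """A piece of evidence, e.g. a MaterialSample, an image, or an observation."""
    token_id: str
    kind: str


@dataclass
class Identification:
    """Assertion that an Organism is an exemplar of a Taxon, backed by evidence."""
    identification_id: str
    organism_id: str
    taxon_id: str
    evidence: List[Token] = field(default_factory=list)


@dataclass
class Occurrence:
    """Intersection of an Organism and an Event, backed by evidence."""
    occurrence_id: str
    organism_id: str
    event_id: str
    evidence: List[Token] = field(default_factory=list)


sample = Token("tok-1", "MaterialSample")
photo = Token("tok-2", "Image")

# One Identification, several Tokens as its basis.
ident = Identification("id-1", "org-1", "taxon-1", evidence=[sample, photo])
# The same photo also documents where and when the organism occurred.
occ = Occurrence("occ-1", "org-1", "ev-1", evidence=[photo])

shared = [t.token_id for t in ident.evidence if t in occ.evidence]
print(shared)
```

Neither the Identification nor the Occurrence points at a Token as its subject; both point at the Organism and list Tokens only as support.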

Not sure if that makes any sense...

@deepreef

In response to @stanblum:

Note that cardinalities among these entities in our models are such that if you know the MaterialSample, you also know the Event, Location, Agent, and Organism; because MaterialSample is in many-to-one relationships with all those entities. There is no need for an Occurrence entity separate from that join of data.

I'll push back on this. I don't see an Occurrence as being the collective set of stuff circumscribed by the Occurrence box in your diagram. If we consider an Event to be the intersection of a Location and Time (with various associated properties), then an Occurrence is an intersection of an Organism and an Event (with various associated properties). Similar to an Identification being an intersection of an Organism and a Taxon (with various associated properties).

As I noted in my previous post, I don't think it's correct to represent an Identification as being directly linked to the Token (MaterialSample/InformationArtifact), but rather directly to the Organism (this is implied within the definition of dwc:Organism). Instead, I see the Tokens (there may be many associated with a single Identification) as the "evidence" supporting that Identification.

I definitely agree that minting OrganismID values to generate an intersection node is "the cleaner solution in the long run". It's almost exactly the same situation as minting an IdentificationID value to apply a Taxon to an Organism (and, by derivative extension, a MaterialSample or InformationArtifact).

Occurrence was created only to serve as the Union class for specimen and observation, like Token was in DarwinSW.

I agree that's why it was originally created, but I disagree that this represents its current value. The real value of an Occurrence is to represent the intersection of an Organism instance and an Event instance.

Last, the necessity to explicitly create a record for an Organism, separate from MaterialSample, is not to link multiple Identifications (see the quote in Steve's prior post), but to link multiple MaterialSamples or InformationArtifacts.

I think it's both. These aren't mutually exclusive informatic functions of Organism in the model. There may be multiple (competing/mutually exclusive) taxonomic identifications applied to a single Organism instance, and there may be multiple Tokens associated with an Organism (which can serve as evidence of identification, or evidence of occurrence, or both).

A single MaterialSample can have multiple Identifications; you don't need an Organism record until you have more than one MaterialSample from that Organism.

I disagree, for reasons already stated.

@stanblum
Member

@deepreef wrote:

If we consider an Event to be the intersection of a Location and Time (with various associated properties), [...]

This might highlight the difference(s) in our approaches to modeling. There are infinite locations on Earth; there are infinite points or intervals on the timeline. What makes some of them interesting to us is that a collecting/observing process took place where and when. So even if the result was zero (no specimens or observations), our interest and reason for recording/communicating about the when-where was the what, who, how.

Perhaps more to the point about Occurrence: I still assert that if you have:

Organism, Specimen/Observation, Location, Timeline, Agent, Method

Why do you need Occurrence? What do you want to say about it? Obviously, I'm trying to eliminate anything that isn't essential. The sampling or observation of an organism at a place and time (by agent and method) tells the complete story. The only thing I can see is the binding together of multiple samples/observations in the same occurrence, when there might also be subsequent occurrences of the same organism. But those are all "joined" by having the same Organism and Event data, no? I'm repeating myself, but I feel there is redundancy in here; i.e., Organism + Event = Occurrence, so why create Occurrence?
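
The "Organism + Event = Occurrence" argument can be sketched as a grouping over sample records, with no stored Occurrence entity. The record layout, identifiers, and the function name `derive_occurrences` are assumptions made for illustration.

```python
# Hypothetical sample/observation records, each pointing at one Organism and one Event.
samples = [
    {"sample_id": "ms-1", "organism_id": "org-1", "event_id": "ev-1"},
    {"sample_id": "ms-2", "organism_id": "org-1", "event_id": "ev-1"},
    {"sample_id": "ms-3", "organism_id": "org-1", "event_id": "ev-2"},
]


def derive_occurrences(records):
    """Group sample ids by (organism, event); each key plays the role of an Occurrence."""
    grouped = {}
    for record in records:
        key = (record["organism_id"], record["event_id"])
        grouped.setdefault(key, []).append(record["sample_id"])
    return grouped


occurrences = derive_occurrences(samples)
for (organism_id, event_id), sample_ids in occurrences.items():
    print(organism_id, event_id, sample_ids)
```

Two samples from the same organism at the same event collapse into one derived occurrence, while the later event yields a second one. What the join cannot hold is any property that belongs to the occurrence itself rather than to its samples, which is where the two positions in this thread diverge.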

[Heading out. I'll have a few hours to think more about this.]

@deepreef

deepreef commented Dec 1, 2022

There are infinite locations on Earth; there are infinite points or intervals on the timeline. What makes some of them interesting to us is that a collecting/observing process took place where and when. So even if the result was zero (no specimens or observations), our interest and reason for recording/communicating about the when-where was the what, who, how.

Agreed, but that's kind of true for all DwC classes (infinite locations, infinite possible identifications, infinite measurements/facts, etc.) We limit the scope of instances to those that matter to us. But I'm in the camp that we define our classes around fundamental concepts. And to me, an "Event" is uniquely represented by the intersection of a place and time. Now, perhaps there is a need to mint multiple Events that share precisely the same place and time but differ only in terms of other properties (e.g., human participants, specific activities, etc.). That boils down to which properties of each class are definitive, vs. supplementary information. Maybe that's the level of definition we need to help avoid having different notions of what instances of these different classes actually mean.
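To make the definitive-vs-supplementary distinction concrete, here is a minimal sketch, not anything defined by Darwin Core. The class and field names (`Event`, `ActivityScopedEvent`, `location_id`, `participants`, `activity`) are hypothetical. In the first class only place and time decide identity; in the second, the activity is promoted to a definitive property.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    # Definitive properties: these alone decide whether two Events are the same.
    location_id: str
    start: str  # ISO 8601 strings keep the sketch dependency-free
    end: str
    # Supplementary properties: excluded from equality and hashing.
    participants: tuple = field(default=(), compare=False)
    activity: str = field(default="", compare=False)


@dataclass(frozen=True)
class ActivityScopedEvent:
    # Alternative choice: the activity is also definitive.
    location_id: str
    start: str
    end: str
    activity: str
    participants: tuple = field(default=(), compare=False)


survey = Event("loc:reef-1", "2022-12-01T09:00", "2022-12-01T10:00",
               participants=("diver-1",), activity="fish survey")
transect = Event("loc:reef-1", "2022-12-01T09:00", "2022-12-01T10:00",
                 participants=("diver-2",), activity="photo transect")

# Same place and time, so the two records collapse into one Event.
same_event = survey == transect
unique_events = len({survey, transect})

scoped_survey = ActivityScopedEvent("loc:reef-1", "2022-12-01T09:00",
                                    "2022-12-01T10:00", "fish survey")
scoped_transect = ActivityScopedEvent("loc:reef-1", "2022-12-01T09:00",
                                      "2022-12-01T10:00", "photo transect")

# With activity as a definitive property, they are two distinct Events.
same_scoped_event = scoped_survey == scoped_transect
```

Which of the two behaviors is wanted is exactly the definitional question raised above; the code only shows that the choice has to be made explicitly.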

As to your point about the (lack of) need for Occurrence -- I guess I need to think more about that. I mean, I think I could make the same argument that if you have:

Location, Timeline, Agent, Specimen/Observation

Then why do we need Event? What do you want to say about an event that you can't say with an aggregate set of instances and their properties?

I think the node of an Occurrence, defined more narrowly as an intersection of an Organism and an Event (not encumbered by "tokens" such as specimens/observations), serves an important informatic purpose -- in much the same way that an Event does.

The only thing I can see is the binding together of multiple samples/observations in the same occurrence, when there might also be subsequent occurrences of the same organism.

I think this is the crux of our difference here. To me, samples/observations are Evidence of Occurrences; they're not fundamental to Occurrences themselves. I see the Occurrence as the abstract presence of an organism (or non-biological thing) at a place and time. There are infinite numbers of them, but we only care to populate our databases with the tiny subset for which we have evidence to support them.

Good conversation! It's forcing me to think about this stuff in new ways.

@deepreef

deepreef commented Dec 1, 2022

Now that I've caught up on the discussion, I owe an apology for falling into my usual trap of conceptual/philosophical pedantry of only moderate (at best) relevance to the topic at hand. So... putting aside the definition and scope of Occurrence and Organism (which, to be fair, have at least some relevance), my own thoughts are:

  1. I don't fully understand the implications of adopting dc terms vs. bfo terms, so I'll defer to the experts on that.
  2. A lot of what we're doing is putting things into boxes to facilitate consistency in information representation and exchange. There will ALWAYS be edge cases. As noted by @cboelling, tornadoes and wolf packs differ in degree, not in kind; but for pragmatic purposes, our interest in the former is clearly dominated by its kinetic and potential energy properties (with some interest in physical properties, such as when large tree branches and houses and such are hurled about), and our interest in the latter is clearly dominated by its physical properties (with some interest in kinetic and potential energy properties, such as behaviors and metabolic physiology). I imagine for most folks, the edge cases will be exceedingly rare, and most folks will have no trouble thinking of this class in terms of matter, rather than energy.
  3. I really, really, really hope we don't represent Organism as a subclass of dcterms:PhysicalResource/bfo:MaterialEntity! (not because the material composition changes over time, but because the informatic utility of an Organism is more about its conceptual existence, than its material existence).

@stanblum
Member

stanblum commented Dec 1, 2022

I have to apologize, too. I modified my previous diagram a bit to accommodate some of the comments, but I hesitate to post it here and continue this thread about Occurrence, Event, Token, etc., because it's essentially peripheral or even irrelevant to the primary issue here -- the proposal to import(?) dcterms:PhysicalResource. @tucotuco, should we move these comments on Occurrence to a discussion in DwC (not an issue)?

@cboelling
Member

  1. A lot of what we're doing is putting things into boxes to facilitate consistency in information representation and exchange. There will ALWAYS be edge cases.

While it sometimes may seem frivolous, edge cases are most helpful in challenging our means for information exchange and sharpening our thinking. It's a good thing to continue to bring them up.

As noted by @cboelling, tornadoes and wolf packs differ in degree, not in kind;

... with respect to both of them being classified as bfo:MaterialEntities. Of course, there are innumerable ways in which these are different in kind with respect to other, more confined categories.

  1. I really, really, really hope we don't represent Organism as a subclass of dcterms:PhysicalResource/bfo:MaterialEntity!

While I think that doing so would be entirely justified and would simplify things, and I would opt for it, I understand that declaring dwc:Organism a subclass of the proposed element is currently not part of this proposal.

@ghwhitbread

ghwhitbread commented Dec 1, 2022

Post it here before moving to DwC discussion, please, @stanblum. I’d like to know if it comes closer to our existing model or drifts away.

@Jegelewicz

I am in favor of the term as given in the original proposal. For me it defines the box we need for "things". The "Human Resources" argument is the best I've heard if someone should oppose the term for the ethics of using "resources" in reference to humans.

I can certainly bring this before @tdwg/material-sample but I also see no reason they couldn't just comment here. Participation in meetings has dropped off and I don't want to have this decision be made by three or five people. I will email the task group members.

@deepreef

deepreef commented Dec 1, 2022

While it sometimes may seem frivolous, edge cases are most helpful in challenging our means for information exchange and sharpening our thinking. It's a good thing to continue to bring them up.

Agreed! Indeed, it's the edge cases I always go to first to finesse where the boundary is. But there is often a point of diminishing returns where further scrutiny of rare edge cases impedes progress more than it enhances clarity and precision. In the age-old battle between the "perfect" and the "good enough" (which are often mutual enemies), I am often rooting for the "perfect" more than most people. But I also understand that this can be counterproductive.

@stanblum
Member

stanblum commented Dec 1, 2022

From the two main points of feedback (I think):

  1. An occurrence is the existence of an Organism at a Location and time
  2. Identification applies to the Organism

Key feature (a reveal to me; duh!) is that an Occurrence is then the intersection (association entity; M:M) between Event and Organism.

[Revised to correct foreign keys in Token subclasses to be OccurrenceID instead of OrganismID.]

[Diagram: DarwinSW-simplified2 drawio]

Which looks almost the same as @jbstatgen's first diagram, except for the use of arrows, I think.
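The two feedback points and the foreign-key correction above can be sketched as a small relational schema. This is only an illustration under stated assumptions, not the contents of the diagram: all table and column names, identifiers, and sample rows are hypothetical, and SQLite is used simply because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE organism (organism_id TEXT PRIMARY KEY);
CREATE TABLE event (event_id TEXT PRIMARY KEY, location TEXT, event_date TEXT);

-- Occurrence is the association (M:M) entity between Event and Organism.
CREATE TABLE occurrence (
    occurrence_id TEXT PRIMARY KEY,
    organism_id   TEXT NOT NULL REFERENCES organism(organism_id),
    event_id      TEXT NOT NULL REFERENCES event(event_id)
);

-- Tokens (specimens, observations) reference the Occurrence, not the Organism.
CREATE TABLE token (
    token_id      TEXT PRIMARY KEY,
    kind          TEXT NOT NULL,
    occurrence_id TEXT NOT NULL REFERENCES occurrence(occurrence_id)
);

-- Identification applies to the Organism.
CREATE TABLE identification (
    identification_id TEXT PRIMARY KEY,
    organism_id       TEXT NOT NULL REFERENCES organism(organism_id),
    taxon             TEXT NOT NULL
);
""")

# One organism recorded at two events, with three tokens in total.
conn.execute("INSERT INTO organism VALUES ('org:1')")
conn.executemany("INSERT INTO event VALUES (?, ?, ?)", [
    ("evt:1", "site A", "2022-11-30"),
    ("evt:2", "site B", "2022-12-01"),
])
conn.executemany("INSERT INTO occurrence VALUES (?, ?, ?)", [
    ("occ:1", "org:1", "evt:1"),
    ("occ:2", "org:1", "evt:2"),
])
conn.executemany("INSERT INTO token VALUES (?, ?, ?)", [
    ("tok:1", "specimen", "occ:1"),
    ("tok:2", "observation", "occ:1"),
    ("tok:3", "observation", "occ:2"),
])
conn.execute("INSERT INTO identification VALUES ('id:1', 'org:1', 'Taxon A')")

# Tokens reach the Organism only by passing through an Occurrence.
rows = conn.execute("""
    SELECT o.organism_id,
           COUNT(DISTINCT o.occurrence_id) AS occurrences,
           COUNT(t.token_id) AS tokens
    FROM occurrence AS o
    LEFT JOIN token AS t ON t.occurrence_id = o.occurrence_id
    GROUP BY o.organism_id
""").fetchall()
```

In this layout the only path from a Token to an Organism runs through an Occurrence row.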

@deepreef

deepreef commented Dec 1, 2022

The only modification I'd make is that the Token box should also have an arrow pointing to Identification. Also, I'm a little fuzzy in my own mind about the exact relationship between Organism and Token. E.g., must it always pass through (at least) one Occurrence? So far, that's how we do it, and we haven't found a need to change it. But it does require expanding the scope of "Occurrence" to things like subsampling in a lab, or photo sessions not related to the time and place of the organism in nature.

Ok, and like @Jegelewicz, I am likewise in favor of the term as given in the original proposal (to stay on topic...)

@dagendresen
Contributor

My preference would be to name the new class term simply Material.

And make a new property term for materialType.

And move to a controlled materialType vocabulary: FossilSpecimen, LivingSpecimen, PreservedSpecimen, and MaterialSample (the latter for tissue and environment samples, etc. as today).

Would we in such a case want to rename materialSampleID to materialID?
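A minimal sketch of how such a controlled vocabulary might be enforced when reading records. The four values are the ones listed above; everything else (the `Material` class, `parse_material`, and the `materialID`/`materialType` keys, which are only suggested terms at this point) is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class MaterialType(Enum):
    """Controlled vocabulary for the suggested materialType property."""
    FOSSIL_SPECIMEN = "FossilSpecimen"
    LIVING_SPECIMEN = "LivingSpecimen"
    PRESERVED_SPECIMEN = "PreservedSpecimen"
    MATERIAL_SAMPLE = "MaterialSample"


@dataclass(frozen=True)
class Material:
    material_id: str
    material_type: MaterialType


def parse_material(record: dict) -> Material:
    """Build a Material from a flat record.

    Raises ValueError if materialType is not in the controlled vocabulary.
    """
    return Material(
        material_id=record["materialID"],
        material_type=MaterialType(record["materialType"]),
    )


tissue = parse_material({"materialID": "mat:1", "materialType": "MaterialSample"})
```

A value outside the vocabulary, e.g. `{"materialID": "mat:2", "materialType": "Rock"}`, is rejected with a `ValueError` rather than silently accepted.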

I do prefer an explicit strong link to bfo:MaterialEntity, and I do not mind a link to dcterms:PhysicalResource (to be captured in the term comments).

I agree with @deepreef that the "essence" of Organism we want to capture here is different from the material component of it.

@dagendresen
Contributor

(I also think we do need a new class Evidence, Token, or simply Record (for recorded evidence)! And that it would be within the mandate of our task group to propose this).

@cboelling
Member

cboelling commented Dec 2, 2022

I realized that I have misused the dcterms: namespace abbreviation when writing my earlier comments in this issue. I think this has unnecessarily complicated the discussion for which I'm sorry.

I have amended my earlier posts accordingly, especially here. I think this now partly resolves the concerns mentioned by @baskaufs (Apologies for stealing your time).

Nonetheless, I stand by my proposal to formally link to bfo:MaterialEntity for the same reasons stated earlier. Applying the same argument as in the proposal made in the top comment, entailments shouldn't be an issue because they are not formally declared in the bag of terms approach.

@baskaufs
Contributor Author

baskaufs commented Dec 2, 2022

@cboelling Since you are essentially suggesting a counter-proposal, I think it would be helpful if you would create a new issue using the new term template and fill out exactly what you are suggesting as the term's metadata properties. In particular, how would you propose to make the link to the external terms? In the efficacy justifications you can reference this proposal and succinctly summarize your arguments as to why your proposal is an improvement over importing dcterms:PhysicalResource

If you are able to do that, I would like to request the members of the TAG to discuss the two proposals. In the past, if there was a well-known external term that captured the essence of what we wanted, we have imported it in preference to creating our own. In cases where our suggested use of the imported term was different or more specific than the use described by the minting organization, we have used non-normative "Notes" (dcterms:description in RDF) or normative "Usage" (skos:scopeNote in RDF) along with non-normative "Examples" (skos:example in RDF) to clarify. Your proposal is somewhat of a departure from this practice and I would like to hear what the TAG thinks about the approach since it would be setting a new precedent.
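As a rough illustration of the field-to-property mapping just described, the sketch below serializes the three documentation fields for a borrowed term as Turtle-style statements. The helper `to_turtle` and the placeholder literal values are hypothetical, and this is not how the TDWG build scripts actually generate the RDF.

```python
# Human-readable field -> (RDF property, is the field normative?)
FIELD_TO_PROPERTY = {
    "Notes": ("dcterms:description", False),
    "Usage": ("skos:scopeNote", True),
    "Examples": ("skos:example", False),
}


def to_turtle(term: str, fields: dict) -> str:
    """Render documentation fields for a term as Turtle-style statements."""
    lines = []
    for label, value in fields.items():
        prop, _normative = FIELD_TO_PROPERTY[label]
        # Escape backslashes and double quotes inside the literal.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'    {prop} "{escaped}"@en')
    return f"{term}\n" + " ;\n".join(lines) + " ."


# Placeholder text only; these are not the actual term metadata.
snippet = to_turtle(
    "dcterms:PhysicalResource",
    {
        "Usage": "placeholder normative usage guidance",
        "Notes": "placeholder non-normative note",
        "Examples": "placeholder example",
    },
)

normative_fields = [name for name, (_, normative) in FIELD_TO_PROPERTY.items() if normative]
```

Only "Usage" comes out as normative in this mapping, matching the distinction drawn above between Notes, Usage and Examples.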

@tucotuco Can we add a field in the new term form for English label? It's not there presently and probably should be.

@tucotuco
Member

tucotuco commented Dec 2, 2022

@baskaufs The two issue templates have been updated to include "* Term label (English, not normative): ".

@baskaufs
Contributor Author

baskaufs commented Dec 2, 2022

Thanks @tucotuco

@tucotuco
Member

tucotuco commented Dec 3, 2022

Sorry to be late in much of what will follow here - not enough time to keep up consistently. There has been a lot of good discussion. Though there is a lot I would say about lots of comments in this issue, I feel compelled to answer questions specifically addressed to me.

@deepreef from #421 (comment)

(@tucotuco said) The class dcmitype:PhysicalObject would clearly not work as a broader term (superclass) for dwc:Organism, as an Organism need not be inanimate.

I'm not sure if @tucotuco meant to suggest that dwc:Organism would be treated as a subclass of this new class (whatever the label ends up being), but I would strongly advise against representing it that way.

I would not suggest any subclassing in Darwin Core itself. This is something we once had with the values of basisOfRecord as a formal type vocabulary (similar to what Dublin Core has for the dcmitype: terms as a type vocabulary for dcterms:type) and we abandoned it in favor of deferring any kind of over-arching modeling for a later exercise when we were mature enough technically. We are still not engaged in that exercise in Darwin Core.

@jbstatgen from #421 (comment):

@tucotuco Following your links and looking at your figure 2 (see above) of the README.md with fresh eyes and the discussion of the past days in the back of my mind, I have several questions.

The starting point for my inquiries is that I would like to understand what you need "Organism" for in the GBIF model to have it as subcategory of bfo:Entity? What is the role of the class in the GBIF model? What is it used for?

First, a point of information. Entity in the Unified Model is currently our own fabrication. It is inspired by bfo:entity, prov:Entity, sosa:FeatureOfInterest and dsw:Token, but we have no formal ontological declarations of any kind at this point. By experience, doing that in an early stage of modeling constrains one's world view. We are only now beginning to consider the benefits of aligning formally with specific world views.

In your figure 2 you placed "Organism" in the same column as bfo:Entity and you also describe it as subcategory to bfo:Entity in your last post. However, the definition of dwc:Organism starts out with "Instances of the dwc:Organism class are intended to facilitate linking one or more dwc:Identification instances to one or more dwc:Occurrence instances. ..." This sounds more like the class being an abstract construct, maybe even "only" a tool and not something "real", tangible. It seems thus quite different from bfo:Entity

In the diagram for version 4.5 of the Unified Model, there is no implied significance of the "columns" of tables. Their arrangement is primarily a practical one to minimize clutter. The meaning is instead captured in the cardinality indicators in the connections between tables. Reading that from the diagram says that Organism (here we do mean sensu Darwin Core) is a subtype of MaterialEntity, which is a subtype of Entity.

That quote above about dwc:Organism is from the (non-normative) Comments for the term, not from the (normative) Definition. The Comments apply specifically in the Darwin Core context, which is flat with respect to describing Occurrences (the Event, the Location, the evidence, the Organism, the Identification, the Taxon are all part of one wide undifferentiated row). For anything fancier, an extension attached within the confines of the star schema constraint is required.

Just because an Organism is a MaterialEntity doesn't mean it can't be more than that. In fact, it must be, or we wouldn't bother using a separate class for it. We believe Organisms can also be Agents, for example. In the Unified Model, Organisms are not just part of a flat Occurrence with inferred links to other concepts. Some or all of the material remains of an Organism can also provide the evidence for an Identification. An Organism is also the entity/Entity/featureOfInterest of an Occurrence and its material remains can provide the evidence for that (digital evidence via DigitalEntities can also).

At the same time, bfo:Entity is the top category for bfo:Continuant and bfo:Occurrent. Yet, you place "Occurrence" to the side in a new column by itself, suggesting that it is something quite different.

Again, the arrangement in columns is not meant to have special significance and formal alignment with BFO is not established. The significance is in cardinality of the relationships provided. In the Unified Model, an Occurrence is a subtype of Event. It's a special kind of Event in which there was evidence of an Organism having been within a Location during some period of time.

Finally, I have become a fan of the PROV model, since I like to see the DES and the future of data modeling as fundamentally transactional with an event-based data model at their heart. The colored boxes denote the three foundational elements of the PROV standard: entities, agents and activities. In PROV they are all directly connected in a kind of triangle. In the GBIF model you are placing "Occurrences" in between the "Entities" and the "Activities". I don't understand why you need them there. I would appreciate it if you could explain and maybe provide an example.

In the Unified Model, as a subtype of Event, an Occurrence is a prov:Activity where the prov:Entity is the Organism and there are various possibilities of prov:Agents associated with that connection, such as observer, collector, photographer, etc.
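The subtype relationships described in these responses can be sketched as a small class hierarchy. This is only an illustration: as noted above, the Unified Model has no formal ontological declarations yet, and the field names, identifiers and role strings below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    entity_id: str


@dataclass
class MaterialEntity(Entity):
    """A subtype of Entity."""


@dataclass
class Organism(MaterialEntity):
    """A subtype of MaterialEntity (Organism sensu Darwin Core)."""


@dataclass
class Agent:
    agent_id: str
    role: str  # e.g. observer, collector, photographer


@dataclass
class Event:
    event_id: str
    location: str
    period: str


@dataclass
class Occurrence(Event):
    """A special kind of Event with evidence that an Organism was at the Location.

    In PROV terms it plays the role of the Activity, the Organism the Entity.
    """
    organism: Organism
    agents: list = field(default_factory=list)


sighting = Occurrence(
    event_id="evt:1",
    location="loc:1",
    period="2022-12-01",
    organism=Organism("org:1"),
    agents=[Agent("agent:1", "observer"), Agent("agent:2", "photographer")],
)

roles = [agent.role for agent in sighting.agents]
```

Here `sighting` is both an `Occurrence` and an `Event`, and its `organism` is an `Organism`, a `MaterialEntity` and an `Entity` at once.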

For me, reality at some place and time results in an occurrence. In our perception we might focus on the rock facies component of reality (gneiss and basalt as igneous rocks, not sandstone) and not the organism(s) growing on its surface (e.g., algae and lichens). The rocks and organisms can be classified according to some relationship/similarity/ancestry scheme. Up to now, the rock facies and organisms are abstract, general (universal? there is an expression for this) concepts. Once we move in an activity/event from the concepts to the specific instances, we have an empirical fact to collect, preserve and share, an entity.

Thus, I would at this point argue that dwc:Organisms are of a different "quality" than bfo:Entity, though I will be happy to better understand your perspective on this.

I hope my responses help. Let me know if they leave any doubts.

@stanblum from #421 (comment):

I have to apologize, too. I modified my previous diagram a bit to accommodate some of the comments, but I hesitate to post it here and continue this thread about Occurrence, Event, Token, etc., because it's essentially peripheral or even irrelevant to the primary issue here -- the proposal to import(?) dcterms:PhysicalResource. @tucotuco, should we move these comments on Occurrence to a discussion in DwC (not an issue)?

It might have been a great idea to start this conversation in a GitHub Discussion in this repository, but given that all of this commentary has been transmitted via email to anyone who is watching the Darwin Core issues (with links to specific comments and such), it would be problematic to move them. I think that if we can periodically provide a summary-so-far comment of the stuff directly relevant to the proposal, we should be fine staying here in this issue with the diverse connected conversations. If we come to any concrete conclusions about how to save the (modeling) world on related topics, we should probably do our best to make that happen in other appropriate places as well. For the Unified Model stuff, that would be GBIF's Discourse forum.

@cboelling
Member

cboelling commented Dec 7, 2022

@baskaufs:

@cboelling Since you are essentially suggesting a counter-proposal, I think it would be helpful if you would create a new issue using the new term template and fill out exactly what you are suggesting as the term's metadata properties. In particular, how would you propose to make the link to the external terms? In the efficacy justifications you can reference this proposal and succinctly summarize your arguments as to why your proposal is an improvement over importing dcterms:PhysicalResource

In short, I see 3 alternatives:

  1. import dcterms:PhysicalResource (this proposal)
  2. import bfo:MaterialEntity
  3. create dwc:MaterialEntity as a separate resource under control of DwC but terminologically (through the term label) and conceptually (through the (possibly adapted) definition) informally linked to bfo:MaterialEntity

I can do as you suggest but I would like to run this by the Material Sample Task Group. In my opinion, incorporating a top level term for material entities with accompanying metadata and documentation is the key result of this chartered task group, whichever alternative the TG gets behind.

In cases where our suggested use of the imported term was different or more specific than the use described by the minting organization, we have used non-normative "Notes" (dcterms:description in RDF) or normative "Usage" (skos:scopeNote in RDF) along with non-normative "Examples" (skos:example in RDF) to clarify.

These don't seem to be part of the new term form or is there a mapping?

@baskaufs
Contributor Author

baskaufs commented Dec 7, 2022

@cboelling The new term template is a little unclear about this, sorry. Usage comments (recommendations regarding content, etc., not normative) in the template is what ends up in the Notes field in the Darwin Core List of Terms (and List of Terms documents for other vocabularies) and in the Comments field in the Darwin Core Quick Reference Guide. It is the dcterms:description value in the RDF. Darwin Core does not (yet) have a usage field (skos:scopeNote) for any terms because few of its terms are imported from other vocabularies. It has been used commonly in Audubon Core, which borrows many terms whose definitions are set outside of TDWG and therefore sometimes needs to provide normative guidance on how these terms should be used in the TDWG context. So "usage comments" in the Darwin Core template does not correspond to the "Usage" field as it appears in the Audubon Core list of terms.

These patterns are historical artifacts and we probably should get our act together to make the terminology more consistent across documents. Hope this helps clarify.

@Jegelewicz

This was discussed at length today in the @tdwg/material-sample meeting. At this time, we plan to review the very detailed proposal made by @cboelling and decide, as a committee, which of the three choices he proposed we prefer. That meeting is scheduled for January 18. We request that this proposal be held until at least then. For more information and to join the discussion, see tdwg/material-sample#31

@baskaufs
Contributor Author

Following the discussion at the January 18 meeting, I would like to request that this proposal be withdrawn (closed) in favor of the proposal for dwc:MaterialEntity soon to be submitted by the MaterialSample task group.
