New Term - materialSampleType #345

Closed
deepreef opened this issue Apr 28, 2021 · 21 comments

Comments

@deepreef

New term

  • Submitter: Richard Pyle
  • Proponents (at least two independent parties who need this term): Bishop Museum, TaxonWorks, multiple other institutions and data managers as expressed in commentary on several related issues.
  • Justification (why is this term necessary?): Multiple implementations of biodiversity data management systems that manage instances of MaterialSample assign a "type" value as a way to classify the general category of MaterialSample instance. A standard term for this property would facilitate data exchange regarding MaterialSample records, and would encourage convergence on a common vocabulary of commonly used types of MaterialSample instances.

Proposed attributes of the new term:

  • Proposed definition of the new term: "The type or general nature of a MaterialSample."
  • Term name (in lowerCamelCase): materialSampleType
  • Class (e.g. Location, Taxon): MaterialSample
  • Comment (recommendations regarding content, etc.): Recommended best practice is to use a controlled vocabulary.
  • Examples: lot, whole specimen, specimen part, tissue sample, multiple fossils, serial thin sections, microfossil, water sample, soil sample, microbial sample, nest
  • Refines (identifier of the broader term this term refines, if applicable): None
  • Replaces (identifier of the existing term that would be deprecated and replaced by this term, if applicable): None
  • ABCD 2.06 (XPATH of the equivalent term in ABCD, if applicable): not in ABCD

See also the discussion around changes to MaterialSample in DwC (#314) and GBIF issue #37. This new term has direct relevance to dwc:preparations. Some additional discussion is required to determine which values are more appropriate to represent as preparations, and which as materialSampleType.
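
As a rough illustration of the "use a controlled vocabulary" recommendation, the sketch below checks incoming values against a fixed list. It assumes the vocabulary is simply the example values from this proposal, which is not a ratified vocabulary; the function name and the sample records are hypothetical.

```python
# Illustrative only: this list is just the example values from the proposal,
# not a ratified controlled vocabulary.
EXAMPLE_MATERIAL_SAMPLE_TYPES = frozenset({
    "lot", "whole specimen", "specimen part", "tissue sample",
    "multiple fossils", "serial thin sections", "microfossil",
    "water sample", "soil sample", "microbial sample", "nest",
})


def normalize_material_sample_type(value):
    """Return the vocabulary form of value, or None if it is not in the list."""
    # Collapse runs of whitespace and ignore case before comparing.
    candidate = " ".join(value.split()).lower()
    return candidate if candidate in EXAMPLE_MATERIAL_SAMPLE_TYPES else None


# Hypothetical records; the second mixes several parts in one value, the way
# a free-text preparations string often does.
records = [
    {"materialSampleID": "ms-1", "materialSampleType": "Tissue  Sample"},
    {"materialSampleID": "ms-2", "materialSampleType": "skin, skull"},
]
for record in records:
    normalized = normalize_material_sample_type(record["materialSampleType"])
    print(record["materialSampleID"], normalized)
```

Values that come back as None would need a human decision: either map them to an existing vocabulary entry or treat them as preparations rather than as a materialSampleType.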

deepreef changed the title from "New Term -" to "New Term - materialSampleType" on Apr 28, 2021
@dshorthouse

dshorthouse commented Apr 28, 2021

Where I get stuck on preparation or this proposed materialSampleType is on their use as a verb or as a noun. This proposal appears to be centred on its use as a noun. Is there merit to explicitly disentangling this whereby there's differentiation between a preparation process - that which generated or changed the state of a materialSample - versus an expression of the outcome of that transformation? And, as spill-over from this, do these states (= "types") back-propagate when a materialSample is destructively sampled? If I pluck a leg off a beetle to obtain a DNA barcode, does that beetle remain a whole specimen? If a materialSample is entirely disentangled and used-up such that there is no longer any vestige of physical representation, does that parent materialSample retain its materialSampleType? If not, do we need to split this term into originalMaterialSampleType and presentMaterialSampleType (= none / no longer exists).

@deepreef
Author

@dshorthouse : Excellent points! (as always!) And I think you really get to the heart of what I've struggled with both for preparations and the proposed materialSampleType.

And, as spill-over from this, do these states (= "types") back-propagate when a materialSample is destructively sampled? If I pluck a leg off a beetle to obtain a DNA barcode, does that beetle remain a whole specimen?

Yes, I've been down that road as well. If I pluck a leg off a beetle, I'd be inclined to brand the leg as a "part", and retain the rest of the beast as "whole". But if I split it right down the middle, I'd probably call them both "parts".

However, this is a bit beside the point, because technically no MaterialSample is ever a "whole organism" if you want to be pedantic about it -- in fact, no Organism is ever whole once any time passes, as a very large number of water molecules likely depart from the physical content of the Organism with each passing microsecond.

So the point is, there will always be subjectivity in deciding how to split those hairs (no pun intended), when deciding among the options for either preparation or materialSampleType for any particular instance. But the key question is:

What is the lesser of evils? Dealing with subjectivity at the edge-cases, or having no idea whether a particular MaterialSample instance represents a jug of water, or a (mostly) whole fish specimen, or a tissue sample?

If not, do we need to split this term into originalMaterialSampleType and presentMaterialSampleType (= none / no longer exists).

Probably... but I'll leave that for someone else to propose.

@tucotuco
Member

tucotuco commented Apr 29, 2021

The term preparations is what we have come to call a convenience term, meaning it has a use, but that use is not very well defined semantically because it mixes concepts and/or allows for multiple instances to be captured within the content of a single term. The term preparations mixes parts of organisms with the preparation and preservation methods for those parts, and can do so for as many parts as one likes. In essence it is to capture the "stuff" that was saved from an Occurrence and the state it is currently in.
This materialSampleType term is an attempt to disambiguate preparations (in part - sorry Rich, but you started the punning). The proposed materialSampleType covers only the stuff, and only one at a time. In other words, it is well poised to form part of an extension that can connect zero to many MaterialSamples to an Occurrence (and thence to an Organism at a particular place and time). There has been work on this front in the past in discussions around a preparations extension, inspired in part by the desire to capture the preparation and preservation histories of material samples in order to understand their viability for a variety of purposes, particularly DNA extraction. The current enthusiasm around MaterialSample might signal a time to rekindle the effort.
Based on the related discussions to date, my recommendation is to charter a Task Group within the Observations & Specimens Interest Group for the specification of a MaterialSample extension, including the development of the needed terms and vocabularies as a package.
Related issues are Issue #1, Issue #3, Issue #24 (reopened because of renewed interest), Issue #314, Issue #332, Issue #344, Issue #346, and Issue #347.
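
To make the contrast concrete, here is a minimal sketch of how one free-text preparations value compares with zero-to-many MaterialSample records attached to an Occurrence, each carrying a single materialSampleType. The class layout, field names, identifiers, and sample values are all assumptions for illustration, not part of any ratified extension.

```python
from dataclasses import dataclass, field


@dataclass
class MaterialSample:
    material_sample_id: str
    material_sample_type: str  # one kind of "stuff" per record


@dataclass
class Occurrence:
    occurrence_id: str
    # Convenience term: free text that can mix parts and preservation methods.
    preparations: str = ""
    # Hypothetical extension rows: zero to many per Occurrence.
    material_samples: list = field(default_factory=list)


# An observation with nothing collected has no MaterialSample rows at all.
observed_only = Occurrence("occ-1")

# A collected organism: one mixed preparations string versus three rows.
collected = Occurrence(
    "occ-2",
    preparations="skin; skull; tissue (ethanol)",
    material_samples=[
        MaterialSample("ms-1", "specimen part"),
        MaterialSample("ms-2", "specimen part"),
        MaterialSample("ms-3", "tissue sample"),
    ],
)

for occ in (observed_only, collected):
    types = [ms.material_sample_type for ms in occ.material_samples]
    print(occ.occurrence_id, types)
```

In this layout, preparation and preservation history could hang off each MaterialSample row separately instead of being packed into one string, which is the kind of detail a Task Group would need to specify.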

@wouteraddink

@tucotuco I like your idea for a task group to specify a MaterialSample extension! Regarding MaterialSampleType, note that this is also being discussed in the MIDS (see tdwg/mids#14) and CD task group, and related to work going on in iSamples and DiSSCo. Ideally the results should be aligned. I am currently writing a proposal to describe the 'what is it' terms for MIDS.
@dshorthouse, a specimen is an individually curated object, at least that is how we defined it in DiSSCo after community consultation. So it does not say anything about completeness as an organism individual. It may consist of a whole organism individual, a part, or multiple individuals of one or multiple organisms. It does not change if you remove a part, so I see no need for a split into originalMaterialSampleType and presentMaterialSampleType.

@dshorthouse

a specimen is an individually curated object, at least that is how we defined it in DiSSCo after community consultation

I'm assuming this means that there is no room in the model for an individually curated object that no longer has a physical manifestation, right? Is there a mechanism to strike a digital record from the index when its physical counterpart has evaporated?

@tucotuco
Member

tucotuco commented Apr 29, 2021 via email

@wouteraddink

Yes, that is one reason why we have the concept of a digital specimen, which will keep existing if the physical specimen no longer exists. In a catalog of physical specimens the record might be deleted, but the digital specimen record will be maintained and further curated.

@dshorthouse

dshorthouse commented Apr 29, 2021

@tucotuco @wouteraddink Fair enough. What I'm mindful of in this instance is the potential for a philosophical schism or a functional disconnect between the physical items cared for by collections/museums and their virtual, digital representations that may lead fundamentally different lives.

@deepreef
Author

deepreef commented Apr 29, 2021

@tucotuco :

my recommendation is to charter a Task Group within the Observations & Specimens Interest Group for the specification of a MaterialSample extension, including the development of the needed terms and vocabularies as a package.

Yes! I think that is exactly the right way to proceed.

@wouteraddink :

Regarding MaterialSampleType, note that this is also being discussed in the MIDS (see tdwg/mids#14) and CD task group, and related to work going on in iSamples and DiSSCo.

I've subscribed to the MIDS discussions, and they always end up in my Outlook calendar, but they usually happen in the middle of the night or very early morning Hawaii time. This, itself, is not a problem for me, except that by pure coincidence, each one so far has been on a morning that I needed to stay up late, and therefore could not drag myself out of bed to attend. I see the next one is May 6 at 3:30am Hawaii time, so I'll make a special effort to get to bed early the night before, if this topic is likely to be on the discussion agenda.

a specimen is an individually curated object

This is exactly my working definition for MaterialSample, so perhaps I've been subconsciously joining the MIDS discussions in spirit?

On a related question (to all): Do you consider "Specimen" to be synonymous with MaterialSample? If not, what does the Venn diagram look like for these two concepts?

Yes, that is one reason why we have the concept of a digital specimen, which will keep existing if the physical specimen no longer exists. In a catalog of physical specimens the record might be deleted, but the digital specimen record will be maintained and further curated.

What I'm mindful of in this instance is the potential for a philosophical schism or a functional disconnect between the physical items cared for by collections/museums and their virtual, digital representations that may lead fundamentally different lives.

The digital record is the proxy for the physical thing (or in the case of Events, Occurrences, Taxa, etc., the abstract idea). They lead fundamentally different lives from the moment they are born. The trick is for the digital proxies to accurately capture the information concerning the physical/abstract "thing" that we care about, as completely and accurately as possible.

The significance of the "end of life" of a physical or abstract thing is not that the data records should be deleted, or even that the data records cannot change. The only real significance is that the physical/abstract thing can no longer yield new derivatives, or in many/most cases, participate in any new relationships, after the end of life.

For example, I would define the lifespan of an Event as the period of time between the Event StartTime and the Event EndTime. Related Occurrences can only exist if they occur within that range of time. Similarly, an Organism lifespan begins at fertilization (or division for single-cell organisms) and ends either when it ceases to be alive, or perhaps when all the molecules have dissociated (disintegration). Between those two points in time, the Organism may participate in multiple Occurrences, and be the source of multiple MaterialSamples. The lifespan of a MaterialSample begins whenever the curation process begins for biological material (When someone picks the dead bird off the road? When the dead bird arrives at a Museum?), and ends when all the molecules have dissociated (or perhaps when it ceases to be curated). Within that timeframe, there may be multiple derivative MaterialSamples extracted from it.

One exception to the inability to "participate in any new relationships" thing is with respect to Identification instances. An Organism could participate in new Identification instances into perpetuity, even after the end of life for the Organism (and after the end of life for any of its derived MaterialSample instances).
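
The lifespan rules described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class, the function names, and the dates are hypothetical, and where exactly a lifespan starts or ends is left as open as it is in the comment.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Lifespan:
    start: date
    end: Optional[date] = None  # None means the thing still exists

    def contains(self, when: date) -> bool:
        return self.start <= when and (self.end is None or when <= self.end)


def can_yield_derivative(parent: Lifespan, when: date) -> bool:
    """A new derivative MaterialSample needs a parent that exists at that time."""
    return parent.contains(when)


def can_be_identified(subject: Lifespan, when: date) -> bool:
    """Identifications stay possible after end of life, so only start matters."""
    return when >= subject.start


# Hypothetical specimen curated from 2000 until it was used up in 2010.
specimen = Lifespan(start=date(2000, 1, 1), end=date(2010, 1, 1))
print(can_yield_derivative(specimen, date(2005, 6, 1)))
print(can_yield_derivative(specimen, date(2020, 6, 1)))
print(can_be_identified(specimen, date(2020, 6, 1)))
```

Note that nothing here deletes the record at end of life; the end date only limits which new derivatives and relationships are allowed.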

@tucotuco
Member

tucotuco commented Apr 29, 2021 via email

@wouteraddink

@deepreef you may contact Alex and Elspeth; they are probably willing to see if they can accommodate your timezone if you want to participate in the MIDS meetings.

@wouteraddink

@deepreef I think a specimen is a kind of MaterialSample, a subset perhaps, as there is a subtle difference between a specimen and a sample. Not in what they are but in how they are treated: a specimen acts as a representation of a class (organism or type of geological material) and the aim is therefore to preserve it, while a sample is an example, and is usually destroyed after doing some measurements or data extraction.

@baskaufs

I don't remember from the original discussion of creating the MaterialSample class that there was any expectation that it be destroyed. It could be destroyed, but didn't have to be.

I'm with @tucotuco, I don't see how a twig clipped from a tree to be glued to an herbarium sheet is fundamentally different from a twig clipped from a tree for some other purpose. They are both the same material, they are a sample from an organism, and they could have the same metadata. The only difference to me is in our heads.

@wouteraddink

Yes, that is what I was saying; perhaps I was not clear. There is no difference in what they are; we just treat them differently, for a different purpose.

@deepreef
Author

@tucotuco

Try as I might, I can not get my head around a "Specimen" not being a MaterialSample.

Same here, but the other part of the question is: are there any instances of MaterialSample that are not also Specimens? In other words, do the two terms represent congruent concepts? My answer: Yes, they are congruent. But that's why I asked for a Venn diagram from those who think otherwise.

@wouteraddink :

Yes, that is what I was saying; perhaps I was not clear. There is no difference in what they are; we just treat them differently, for a different purpose.

I guess my question is: Is there value in maintaining the two terms ("specimen" and "MaterialSample") with distinct (if overlapping) definitions in our biodiversity informatics discussions? Or, for purposes of defining the various biodiversity standards, can we use the two terms interchangeably? I agree that each word invokes different presumptions (e.g., destined for destruction or preservation), but for the purposes of modelling the data we care about, I think they are two words that mean the same thing.

@deepreef you may contact Alex and Elspeth; they are probably willing to see if they can accommodate your timezone if you want to participate in the MIDS meetings.

Thanks, but the time zone thing isn't a problem, except when I have late-night obligations the previous day (which is rare, but happened to coincide with the last several MIDS gatherings). I willingly accept that one of the costs of living in the middle of the Pacific is that I often get the short end of the stick for international meeting scheduling. That's PERFECTLY fine with me, and a price that I'm more than happy to pay.

@dagendresen
Contributor

From the term name materialSampleType I initially get the association to a splitting of basisOfRecord so that PreservedSpecimen, FossilSpecimen and LivingSpecimen (?) would move to the new term materialSampleType -- and that HumanObservation and MachineObservation remain as an "occurrenceType".

@wouteraddink

A specimen is a materialSample (so in a venn diagram it falls into that), but I think a materialSample is not always a specimen, e.g. if it is not a curated object. A specimen should always have a storage location (past or present). I think for the model that does not matter, they are the same 'thing' but a specimen is required to have some specific associated data regarding curation and storage location. I think a livingSpecimen also falls completely into materialSample, it is different from an Organism in that it is a curated object. But it can change back to an Organism when it escapes into the wild and is no longer a curated object.

@deepreef
Author

@dagendresen :

From the term name materialSampleType I initially get the association to a splitting of basisOfRecord so that PreservedSpecimen, FossilSpecimen and LivingSpecimen (?) would move to the new term materialSampleType -- and that HumanObservation and MachineObservation remain as an "occurrenceType".

I've stayed quiet on the issue of basisOfRecord, on the doctrine of "if you have nothing nice to say, then don't say anything at all". But I've indirectly hinted at my dissatisfaction with the mixing of "proper" DwC classes (Occurrence, Taxon, Event, Location, MaterialSample, Organism, etc.) with what I somewhat disparagingly refer to as the DwC "pseudo classes" (the ones you list). My sense from some of the discussion at last year's TDWG meetings was that I'm not unique in having this dissatisfaction.

While I agree that terms like PreservedSpecimen, FossilSpecimen and LivingSpecimen are "kinds" of MaterialSample; I'm not sure they're granular enough to capture what I'd like to see as standard values for materialSampleType. I would prefer values that distinguish whether a particular MaterialSample instance represents a (mostly) whole specimen, or aggregate of multiple specimens of the same taxon (e.g., "lot"), or aggregate of multiple specimens of potentially different taxa (e.g., water sample, soil sample), or a portion of an organism with potentially useful morphological and biochemical characters (e.g., "organism part"), or a portion of an organism with potentially useful biochemical characters (e.g., "tissue sample"). Whether or not it's alive, dead but preserved in some chemical way, or dead and gone but represented by a mineralized surrogate (i.e., fossil) seems to me to be a different property in need of a different term(?).

With respect to these "pseudo classes", I'm also a little fuzzy on how best to characterize the situation where a person first sees an organism in-situ, then photographs it using a digital camera, then collects it and preserves the specimen. Does the "PreservedSpecimen" supersede the "HumanObservation"? What does the photograph represent? A "MachineObservation"? Or is it only a "MachineObservation" if no organism identifiable to Homo sapiens participated in the Event? What if the photograph was taken by an autonomous robot that was also present at the Event? (Don't laugh -- we are actively working on this with non-trivial funding).

To me, what is interesting/important to us are:

  • The asserted Occurrence of an Organism at an Event
  • The asserted taxonomic identity of the Organism
  • The evidence supporting the assertion that the Organism occurred at the Event
  • The evidence supporting the assertion of a taxonomic identity of the Organism
  • The state, disposition, and fit-for-purpose for different kinds of subsequent analysis of curated physical biological objects

I would like to see our information structures oriented around tracking these things. I'm just not sure how including the "pseudo classes" among the options for basisOfRecord helps us achieve this.

@tucotuco
Member

This conversation has become relevant to #302 and would also benefit from related discussions in tdwg/dwc-qa#134.

tucotuco added the "Task Group - Material Sample" label (https://www.tdwg.org/community/osr/material-sample/) on Jun 2, 2021
@smrgeoinfo

The proposed definition and vocabulary correspond nicely to the SpecimenType we're working on in iSamples. See the Decision tree for the vocabulary on GitHub. We're taking a broader cross-domain view of the kinds of physical samples that are in scope, and not including information objects (which are linked to samples as related resources).

For the examples given in the proposed definition:
lot --> 'biome aggregation' (if I understand what 'lot' means)
whole specimen --> ?? does this mean 'whole organism' or 'other solid object'?
specimen part --> ?? organism part?
tissue sample --> organism part
multiple fossils --> fossil
serial thin sections --> analytical preparation
microfossil --> fossil
water sample --> container with fluid
soil sample --> aggregation
microbial sample --> biome aggregation
nest --> product of an organism

The concepts in the iSamples SpecimenType are generally higher level than those in the examples; the more domain-specific specimen types would be in vocabularies with a narrower scope, as subtypes.
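
The mapping listed above can be written down as a simple lookup from the narrower example values to the broader concepts. This is only a sketch of that list as given in this comment: the entries marked as uncertain there are left as None rather than guessed, and the dictionary and function names are hypothetical, not part of iSamples.

```python
# Mapping as suggested in the comment; None marks examples where the
# commenter was unsure which broader concept applies.
PROPOSED_TO_ISAMPLES = {
    "lot": "biome aggregation",  # tentative: "if I understand what 'lot' means"
    "whole specimen": None,      # 'whole organism' or 'other solid object'?
    "specimen part": None,       # possibly "organism part"
    "tissue sample": "organism part",
    "multiple fossils": "fossil",
    "serial thin sections": "analytical preparation",
    "microfossil": "fossil",
    "water sample": "container with fluid",
    "soil sample": "aggregation",
    "microbial sample": "biome aggregation",
    "nest": "product of an organism",
}


def broader_type(material_sample_type):
    """Return the suggested broader concept, or None if unresolved or unknown."""
    return PROPOSED_TO_ISAMPLES.get(material_sample_type)


# List the examples that still need a decision.
unresolved = sorted(k for k, v in PROPOSED_TO_ISAMPLES.items() if v is None)
print(unresolved)
print(broader_type("microfossil"))
```

Several narrower values land on the same broader concept, which matches the point that domain-specific types would sit in narrower vocabularies as subtypes.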

@tucotuco
Member

This issue has been superseded by #454

7 participants