adjust verbatim collector to verbatim agent #1492
Comments
There's some discussion at #1239. I have no strong feelings, but the concept is named after the TABLE (collector) for which it's a text-value (vs. data object) replacement, NOT the role of any "collectors" (http://arctos.database.museum/info/ctDocumentation.cfm?table=CTCOLLECTOR_ROLE). I sort of think that we should somehow attempt to retain that association: the attribute (whatever it's called) applies to things which, with more data, would be entered into table collector, and the attribute should not be used for things which, with more data, might appear as Agents outside that table. |
Any reason we couldn't have both?
|
I am in agreement with @AJLinn. Based upon @dustymc's comments, I suggest we either have multiple attributes (verbatim collector, verbatim determiner, verbatim preparator, verbatim maker, etc.) or change to an ATTRIBUTE of verbatim agent, enter the agent's name in ATTRIBUTE_VALUE, and make ATTRIBUTE_UNITS required with a controlled vocabulary of collector, determiner, preparator, maker, etc. |
Technically? Nope, none at all. From a usability standpoint,
|
Attributes with units (which are always required when they can exist for an attribute) force value to be numeric. |
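To make the constraint above concrete, here is a minimal sketch of the rule "units imply a numeric value". It is not Arctos code: the function name `validate_attribute` and its signature are assumptions made purely for illustration.

```python
from typing import Optional


def validate_attribute(value: str, units: Optional[str]) -> bool:
    """Hypothetical check: an attribute that carries units must have a numeric value."""
    if units is None:
        # No units: any free-text value is acceptable.
        return True
    try:
        float(value)
    except ValueError:
        # Units were supplied but the value is not a number.
        return False
    return True
```

Under this rule, a name string such as "A. B. C." paired with a "unit" of maker would be rejected, which is why a role vocabulary cannot simply be placed in ATTRIBUTE_UNITS.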
Maybe the solution is to change the TABLE (collector) to TABLE (agent_actor), since the name in that table may be a collector, preparator, maker, etc. Which also brings me to determiners? How do they fit? I've definitely had some determiners that ended up as unknown because all I had were three initials, but they aren't listed in the collector table.... |
I'm certainly not in love with the table name (and it's far from alone in being poorly-named), but it's also much more structurally-specific than "agent who did something." If some random person wanders in with a specimen, I think most of us would make them a collector. Most of us would probably not put much value in what they think they have - someone in the collection would identify the material and be recorded as the determiner. From that, I would probably make the 3-letter determiners into agents (with verbose remarks) on the assumption that they're "us" and will eventually be resolved to known agents. Along with labels, there are thousands of remarks in most every table with the same sort of information - "did something, {date}, ABC." I'm sure those initials make perfect sense to 3 or 4 people for a year or two.... |
I could support "Verbatim agent" with the type of agent in remarks. That
would be simplest and work for the broadest number of collection types, but
with the understanding that the agent type would not be readily searchable
because it would be an open text field with no controlled vocabulary.
I could also support "verbatim_xxx" with each different type of agent.
Either way, I support making this option usable by other collection types
for maker etc.
|
One of us isn't understanding something. Maker etc data are in table "collector," as attributes those data are in a thing called "...-collector." It's usable by everyone, and always has been. This thing is for probably-low-quality data. Absolutely nothing about it is usefully searchable, and results derived from low-quality data can be nothing but low quality. Anyone searching this for anything except VERY specific reasons (eg, to reverse an agent merger) is basically wasting time. FWIW, this concept was introduced for the bot who merges agents. Every merger gets a....
with any specimen using the "bad" agent. Can I somehow make this more clear in the definition?? |
with any specimen using the "bad" agent. This is what I'm trying to figure out how to fix. What's the best way for me to handle situations where I know a name string for a maker of a piece, but it would typically be flagged by your merge-agent-bot and automatically turn my name fragment into "unknown", and if I don't pay attention to the message I get sent, I lose the one string of info that might be important for a user? Let me give you a real-world example: we just got some objects donated that are Inuit soapstone carvings. They have artists' signatures on them using Canadian syllabics. Depending on the orientation of the piece, you might get 2 totally different literal translations of a name. For example, one carving's literal translations could be Paulusi Tunili or Lunuta Suliapu, depending on which direction you read the syllabics. I would not create two different agents for these names knowing that one of them is not correct, but it would be useful for me or someone who is doing research on these carvings to search this field for a name string literally translated, while the official maker is listed as unknown. Do you think there's a different way I could be handling this situation? If this info is placed into an attribute called 'verbatim collector', someone searching it is going to think they are looking for a different kind of person (I know, this is an agent role). I just think it's not useful for us to retain a confusing label for our users just because we want to hold onto a naming concept that is no longer exclusive. The names we call things make a difference in how our users undertake searches and interpret what they see. They don't know, or care, what a code table might be called or why it was called that. For the relative of an artist to see their family member listed as a collector rather than as the maker of a piece could be offensive, depending on the contact history of that particular Indigenous community. |
The entire concept is not usefully searchable. "A. B. C." may be entered as "ABC" or "C, AB" or "A.,;:''B(Y&***C" or anything else anyone felt like typing, for each and every specimen to which it's attached, and any possible metadata are limited to another single uncontrolled field. There's absolutely no predictability in the data, and how they got there doesn't much change that. Unpredictable data are undiscoverable data. Your example is absolutely an entity, and you are absolutely correct that it should be entered as a SINGLE entity. I'd use the Canadian syllabics (https://en.wikipedia.org/wiki/Unified_Canadian_Aboriginal_Syllabics_(Unicode_block)) for preferred name and add all possible translations as other names - anyone searching any of that stuff will get where they want to go, and from there find a link to the agent record for clarification. The "verbatim" thing (as it might be used by anyone except the Arctos scripts) is meant more for "ABC", where ABC is some random person-or-something-probably that dropped off (or some collector_role'd) a specimen and you don't really know who they are or expect to hear from them again. The various agent tools scripts are meant to help you do your job, not prevent that. If they're misbehaving (eg, not following http://handbook.arctosdb.org/documentation/agent.html#creating--maintaining-agents - which we can always discuss), PLEASE let me know. |
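The unpredictability described above can be shown with a small sketch. It uses the four spellings quoted in the comment; the helper `crude_key` is a hypothetical normalizer invented for this illustration and is not something Arctos provides.

```python
def crude_key(name: str) -> str:
    """Hypothetical normalizer: keep letters only and uppercase them."""
    return "".join(ch for ch in name if ch.isalpha()).upper()


# The four ways of typing the same initials, as quoted in the comment above.
variants = ["A. B. C.", "ABC", "C, AB", "A.,;:''B(Y&***C"]

# An exact-match search for "ABC" finds only one of the four entries.
exact_hits = [v for v in variants if v == "ABC"]

# Even after crude normalization the variants do not collapse to a single key.
keys = {crude_key(v) for v in variants}
```

Here `exact_hits` contains a single entry and `keys` still holds three distinct strings, so neither an exact search nor a simple cleanup step reliably finds every record for the same person.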
From above:
So, can we do this in documentation? Attribute Type: Y'all tell me. Description: Verbatim text string representing an Agent as associated with a specimen. Usage should generally be limited to low-quality agents (e.g., those not likely to become associated with more data) and to data normally recorded in table Collector (under any role in https://arctos.database.museum/info/ctDocumentation.cfm?table=CTCOLLECTOR_ROLE). When known, Collector Role should be entered in Attribute Remarks. Edits and/or better ideas greatly appreciated. Please do NOT make any code table changes: this is used by application code, and I'll sync everything up when we have an acceptable solution. |
@dustymc
|
The funky agent code is mostly looking for diacritics, which isn't relevant here (unlike eg "ñ", there is no "close enough ASCII version" of those characters.) I don't have the capacity to translate - I have no idea if "Kigai" is a useful representation of ᑭᒐᐃ or not - so in the next version, when preferred name is all non-ASCII and there's something in 'aka', 'alternate spelling', or 'full,' the scripts will assume that whatever's in the AKA is a useful translation and not report a potential problem. |
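The rule described above might look something like the following sketch. This is a guess at the shape of the logic, not the actual Arctos script: the function names, the dictionary layout of `other_names`, the decision to ignore whitespace when testing for "all non-ASCII", and the assumption that any other name containing non-ASCII characters gets reported are all illustrative.

```python
from typing import Mapping

# Name types treated as a usable translation, per the comment above.
TRANSLATION_TYPES = ("aka", "alternate spelling", "full")


def is_all_non_ascii(name: str) -> bool:
    """True if every non-whitespace character is outside the ASCII range."""
    chars = [ch for ch in name if not ch.isspace()]
    return bool(chars) and all(ord(ch) > 127 for ch in chars)


def should_report(preferred_name: str, other_names: Mapping[str, str]) -> bool:
    """Hypothetical check: should this agent be reported as a potential problem?"""
    if not any(ord(ch) > 127 for ch in preferred_name):
        return False  # plain ASCII name: nothing for this check to flag
    has_translation = any(
        other_names.get(name_type, "").strip() for name_type in TRANSLATION_TYPES
    )
    if is_all_non_ascii(preferred_name) and has_translation:
        return False  # assume the alternate name is a useful translation
    return True
```

With this sketch, `should_report("ᑭᒐᐃ", {"aka": "Kigai"})` is False, while the same preferred name with no alternate names is still reported.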
Perfect, thanks. |
Well it turns out spaces are ASCII and my testing was insufficient - it's patched, should run correctly tonight. Is the rest of this still a problem? Is this a useful definition for the concept?
FYI here's what the data I insert look like:
And as an interesting index of how free-text anything tends to work, there are roughly 50 ways of saying "information from label" in the few human-generated remarks of this new-ish concept. |
Picking up this issue of allowing an agent for a specimen who is NOT an Arctos Agent. Is that the correct interpretation? |
I think that would work! Just want to make sure that we don't mess with anything given Dusty's request above that we don't change code tables:
|
I remain hesitant to extend this beyond Agents who would normally be linked to specimens via table Collector, and I think the imperfect but consistent naming convention helps clarify that. Attributes with units must be numeric: http://handbook.arctosdb.org/how_to/Understanding-Attribute-Errors.html The merger-bot adds a remark of the format given in #1492 (comment). Anything different will make these data even less discoverable, should that be possible. |
Tentatively closing. |
OH I get it - so if we WANT to add a not-in-Arctos name as a preparator, then we should add "verbatim preparator" to the attribute code table? |
What would you hope to accomplish by that? The (initial, anyway) point of this was primarily to log agent mergers. That's really only something you'd be interested in after you'd found a record (eg, you don't think it was actually {collector-roled} by AgentID=12345), and you can get the details of that (if my scripts introduced the attribute) from remarks. I don't think more structure would lead to better usability; it would just lead to very confusing data as http://arctos.database.museum/info/ctDocumentation.cfm?table=CTCOLLECTOR_ROLE evolves. The concept has also been useful for initial import: just throw whatever you've got in the attribute, don't clean up or load "collectors" as agents. Remarks is a perfectly suitable mechanism for recording collector role there as well (although I don't think we've ever used it for data which had more than collector_role=collector). |
I'm trying to follow this and failing. Can't this just be an interface issue? Keep the table structure as is, change the UI term to verbatim agent? Define verbatim agent for this use case but keep in the Table collector? The issue here is what people see, not what's under the hood. |
This is a data-driven ultra-normalized part of Arctos; those are one and the same. |
Would anyone be opposed to adjusting the CTATTRIBUTE_TYPE "verbatim collector" to "verbatim agent"?
More often than not, we're entering the name of the maker/artist who only has a name string signed on the piece, and we may or may not be able to figure out the actual agent name. By changing this to "verbatim agent", everyone could enter the qualifying info in the "remarks" and "det. method" fields to indicate which agent type and then use the preferred "unknown" as the agent.
As it is, "verbatim collector" adds confusion, and I would not have my staff/students use this field.