adjust verbatim collector to verbatim agent #1492

Closed
AJLinn opened this issue Apr 2, 2018 · 24 comments
Labels
Function-Agents, Function-CodeTables, Priority-High (Needed for work: high because this is causing a delay in important collection work)

Comments

@AJLinn

AJLinn commented Apr 2, 2018

Would anyone be opposed to adjusting the CTATTRIBUTE_TYPE "verbatim collector" to "verbatim agent" ?

More often than not, we're entering the name of the maker/artist who only has a name string signed on the piece and we may or may not be able to figure out the actual agent name. By changing this to "verbatim agent" everyone could enter the qualifying info in the "remarks" and "det. method" fields to indicate which agent type and then use the preferred "unknown" as the agent.

As it is with verbatim collector it adds confusion and I would not have my staff/students use this field.

@dustymc
Contributor

dustymc commented Apr 2, 2018

There's some discussion at #1239

I have no strong feelings, but the concept is named after the TABLE (collector) for which it's a text-value (vs. data object) replacement, NOT the role of any "collectors" (http://arctos.database.museum/info/ctDocumentation.cfm?table=CTCOLLECTOR_ROLE)

I sort of think that we should somehow attempt to retain that association - the attribute (whatever it's called) applies to things which with more data would be entered into table collector, and the attribute should not be used for things which with more data might appear as Agents outside that table.

@campmlc

campmlc commented Apr 2, 2018 via email

@Jegelewicz
Member

I am in agreement with @AJLinn . Based upon @dustymc 's comments, I suggest we either have multiple attributes: verbatim collector, verbatim determiner, verbatim preparator, verbatim maker, etc. or we change to an ATTRIBUTE of verbatim agent, enter the agent's name in ATTRIBUTE_VALUE and make ATTRIBUTE_UNITS required with a controlled vocabulary of collector, determiner, preparator, maker, etc.

@dustymc
Contributor

dustymc commented Apr 2, 2018

Any reason we couldn't have both?

Technically? Nope, none at all.

From a usability standpoint,

  1. that gives us two things doing the same job; about 50% of users will find one of them, assume they've found everything, and leave without the information they're looking for.
  2. We currently have 133 attributes, which is absolutely enough that most folks won't find what they want in there. I'm very hesitant to add more unless there's a really good reason to do so. This doesn't look like a very good reason to me, but I could be convinced that it is.
  3. undefined authority values #1450

@dustymc
Contributor

dustymc commented Apr 2, 2018

ATTRIBUTE_UNITS required

Attributes with units (which are always required when they can exist for an attribute) force value to be numeric.
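
A rough sketch of that constraint, for anyone following along. This is not Arctos code: the function name `validate_attribute`, the `UNIT_CODE_TABLES` mapping, and the sample "weight" entry are all hypothetical. The only rule taken from this thread is that when units can exist for an attribute type they are required, and the value must then be numeric.

```python
# Hypothetical sketch: names and sample data are assumptions, not Arctos code.
UNIT_CODE_TABLES = {
    "weight": {"g", "kg"},  # sample attribute type that can carry units
}

def validate_attribute(attr_type, value, units=None):
    """Return a list of problems found for one attribute row."""
    problems = []
    allowed_units = UNIT_CODE_TABLES.get(attr_type)
    if allowed_units is None:
        # No units exist for this type, so the value stays free text.
        if units is not None:
            problems.append("units not allowed for this attribute type")
        return problems
    # Units can exist for this type, so they are required...
    if units not in allowed_units:
        problems.append("units required from the controlled vocabulary")
    # ...and the value must be numeric.
    try:
        float(value)
    except (TypeError, ValueError):
        problems.append("value must be numeric when units apply")
    return problems
```

Under a rule like this, giving a name-valued attribute a units vocabulary of roles would reject every name entered as the value.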

@Jegelewicz
Member

Maybe the solution is to change the TABLE (collector) to TABLE (agent_actor), since the name in that table may be a collector, preparator, maker, etc. Which also brings me to determiners? How do they fit? I've definitely had some determiners that ended up as unknown because all I had were three initials, but they aren't listed in the collector table....

@dustymc
Contributor

dustymc commented Apr 2, 2018

I'm certainly not in love with the table name (and it's far from alone in being poorly-named), but it's also much more structurally-specific than "agent who did something."

If some random person wanders in with a specimen, I think most of us would make them a collector. Most of us would probably not put much value in what they think they have - someone in the collection would identify the material and be recorded as the determiner. From that, I would probably make the 3-letter determiners into agents (with verbose remarks) on the assumption that they're "us" and will eventually be resolved to known agents.

Along with labels, there are thousands of remarks in most every table with the same sort of information - "did something, {date}, ABC." I'm sure those initials make perfect sense to 3 or 4 people for a year or two....

@campmlc

campmlc commented Apr 2, 2018 via email

@dustymc
Contributor

dustymc commented Apr 3, 2018

making this option usable by other collection types

One of us isn't understanding something. Maker etc. data are in table "collector"; as attributes, those data are in a thing called "...-collector." It's usable by everyone, and always has been.

This thing is for probably-low-quality data. Absolutely nothing about it is usefully searchable, and results derived from low-quality data can be nothing but low quality. Anyone searching this for anything except VERY specific reasons (eg, to reverse an agent merger) is basically wasting time.

FWIW, this concept was introduced for the bot who merges agents. Every merger gets a....

'Automated insertion from agent merger process - #escapeQuotes(bads.agent_pref_name)# --> #escapeQuotes(bads.rel_agent_pref_name)# for collector role ' || COLLECTOR_ROLE, sysdate'

with any specimen using the "bad" agent.

Can I somehow make this more clear in the definition??

@AJLinn
Author

AJLinn commented Apr 3, 2018

This thing is for probably-low-quality data. Absolutely nothing about it is usefully searchable, and results derived from low-quality data can be nothing but low quality. Anyone searching this for anything except VERY specific reasons (eg, to reverse an agent merger) is basically wasting time.

FWIW, this concept was introduced for the bot who merges agents. Every merger gets a....

'Automated insertion from agent merger process - #escapeQuotes(bads.agent_pref_name)# --> #escapeQuotes(bads.rel_agent_pref_name)# for collector role ' || COLLECTOR_ROLE, sysdate'

with any specimen using the "bad" agent.

This is what I'm trying to figure out how to fix. What's the best way for me to handle situations where I know a name string for the maker of a piece, but that string would typically be flagged by your merge-agent bot and automatically turned into "unknown"? If I don't pay attention to the message I get sent, I lose the one string of info that might be important for a user.

Let me give you a real-world example: we just got some objects donated that are Inuit soapstone carvings. They have artist's signatures on them using Canadian syllabics. Depending on the orientation of the piece you might get 2 totally different literal translations of a name. For example, one carving's literal translations could be Paulusi Tunili or Lunuta Suliapu, depending on which direction you read the syllabics. I would not create two different agents for these names knowing that one of them is not correct, but it would be useful for me or someone who is doing research in these carvings to search this field for a name string literally translated, while the official maker is listed as unknown.

Do you think there's a different way I could be handling this situation? By placing this info into an attribute called 'verbatim collector' someone searching that is going to think they are looking for a different kind of person (I know, this is an agent role). I just think it's not useful for us to retain a confusing label for our users just because we want to hold onto a naming concept that is no longer exclusive. The names we call things make a difference in how our users undertake searches and interpret what they see. They don't know, or care, what a code table might be called or why it was called that. For the relative of an artist to see their family member listed as a collector rather than as the maker of a piece could be offensive, depending on the contact history of that particular Indigenous community.

@dustymc
Contributor

dustymc commented Apr 3, 2018

with any specimen using the "bad" agent.

The entire concept is not usefully searchable. "A. B. C." may be entered as "ABC" or "C, AB" or "A.,;:''B(Y&***C" or anything else anyone felt like typing, for each and every specimen to which it's attached, and any possible metadata are limited to another single uncontrolled field. There's absolutely no predictability in the data, and how they got there doesn't much change that. Unpredictable data are undiscoverable data.
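
To make the unpredictability concrete, here is a small sketch using the example strings from the paragraph above. The helper `initials_key` is hypothetical and deliberately naive: it just keeps the letters, uppercased, in the order they were typed.

```python
import re

def initials_key(text):
    """Naive matching key: letters only, uppercased, in typed order."""
    return "".join(re.findall(r"[A-Za-z]", text)).upper()

# The variants quoted above for the same three-initial person.
variants = ["A. B. C.", "ABC", "C, AB", "A.,;:''B(Y&***C"]
keys = {v: initials_key(v) for v in variants}
distinct_keys = set(keys.values())
```

Even after stripping punctuation, the four spellings collapse to three different keys rather than one, so a search on any single form misses the others.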

Your example is absolutely an entity, and you are absolutely correct that it should be entered as a SINGLE entity. I'd use the Canadian syllabics (https://en.wikipedia.org/wiki/Unified_Canadian_Aboriginal_Syllabics_(Unicode_block)) for preferred name and add all possible translations as other names - anyone searching any of that stuff will get where they want to go, and from there find a link to the agent record for clarification.

The "verbatim" thing (as it might be used by anyone except the Arctos scripts) is meant more for "ABC", where ABC is some random person-or-something-probably that dropped off (or some collector_role'd) a specimen and you don't really know who they are or expect to hear from them again.

The various agent tools scripts are meant to help you do your job, not prevent that. If they're misbehaving (eg, not following http://handbook.arctosdb.org/documentation/agent.html#creating--maintaining-agents - which we can always discuss), PLEASE let me know.

@dustymc
Contributor

dustymc commented Apr 3, 2018

naming

From above:

I sort of think that we should somehow attempt to retain that association - the attribute (whatever it's called) applies to things which with more data would be entered into table collector, and the attribute should not be used for things which with more data might appear as Agents outside that table.

So, can we do this in documentation?

Attribute Type: Y'all tell me.

Description: Verbatim text string representing an Agent as associated with a specimen. Usage should generally be limited to low-quality agents (e.g., those not likely to become associated with more data) and to data normally recorded in table Collector (under any role in https://arctos.database.museum/info/ctDocumentation.cfm?table=CTCOLLECTOR_ROLE). When known, Collector Role should be entered in Attribute Remarks.

Edits and/or better ideas greatly appreciated.

Please do NOT make any code table changes - this is used by application code, I'll sync everything up when we have an acceptable solution.

@AJLinn
Author

AJLinn commented Apr 12, 2018

@dustymc
How do I fix this given your recommendation of using the syllabics for the preferred name? We do have the literal translation as an aka.

Agents which may not comply with the Arctos Agent Creation Guidelines (http://handbook.arctosdb.org/documentation/agent.html#general-agent-creation-and-maintenance-guidelines) have been detected. If you are receiving this email, you have either created a potentially noncompliant agent or have manage_collection roles for a user who has created a potentially noncompliant agent. If you are a collection manager, please ensure that everyone with manage_agents rights in your collection has read and understands the agent creation guidelines.
Please use the contact link at the bottom of any Arctos form if you believe you have received this mail in error, or if you wish to discuss the Arctos Agent Creation Guidelines.

Please review the following agents and make corrections or additions as appropriate.

ᐅᓱᐃᑐ ᐄᐱᓕ ᑭᒐᐅ
CreatedBy: Mahriena Ellanna
Problem: no_ascii_variant
ᐸᐅᓗᓯ ᑐᓂᓕ
CreatedBy: Mahriena Ellanna
Problem: no_ascii_variant
ᑭᒐᐃ
CreatedBy: Mahriena Ellanna
Problem: no_ascii_variant

@dustymc
Contributor

dustymc commented Apr 16, 2018

The funky agent code is mostly looking for diacritics, which isn't relevant here (unlike eg "ñ", there is no "close enough ASCII version" of those characters.)

I don't have the capacity to translate - I have no idea if "Kigai" is a useful representation of ᑭᒐᐃ or not - so in the next version, when preferred name is all non-ASCII and there's something in 'aka', 'alternate spelling', or 'full,' the scripts will assume that whatever's in the AKA is a useful translation and not report a potential problem.
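
The rule described above could be sketched roughly like this. It is a hypothetical re-statement, not the actual Arctos script: the function name `should_flag_no_ascii_variant` and the shape of `other_names` (a mapping from name type to a list of names) are assumptions.

```python
def should_flag_no_ascii_variant(preferred_name, other_names):
    """Hypothetical sketch of the rule above; not the Arctos script.

    other_names maps a name type ('aka', 'alternate spelling', 'full', ...)
    to a list of names recorded for the agent.
    """
    # Ignore whitespace: spaces are ASCII and would otherwise make an
    # all-syllabics name look partly ASCII.
    chars = [c for c in preferred_name if not c.isspace()]
    all_non_ascii = bool(chars) and all(ord(c) > 127 for c in chars)
    if not all_non_ascii:
        return False  # this sketch only covers the all-non-ASCII case
    accepted_types = ("aka", "alternate spelling", "full")
    has_variant = any(other_names.get(t) for t in accepted_types)
    # Report a potential problem only when no usable variant is recorded.
    return not has_variant
```

With this sketch, `should_flag_no_ascii_variant("ᐸᐅᓗᓯ ᑐᓂᓕ", {"aka": ["Paulusi Tunili"]})` is not reported, while the same preferred name with no other names is.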

@AJLinn
Author

AJLinn commented Apr 16, 2018

Perfect, thanks.

@dustymc
Contributor

dustymc commented Apr 17, 2018

next version

Well it turns out spaces are ASCII and my testing was insufficient - it's patched, should run correctly tonight.

Is the rest of this still a problem?

Is this a useful definition for the concept?

Verbatim text string representing an Agent as associated with a specimen. Usage should generally be limited to low-quality agents (e.g., those not likely to become associated with more data) and to data normally recorded in table Collector (under any role in https://arctos.database.museum/info/ctDocumentation.cfm?table=CTCOLLECTOR_ROLE). When known, Collector Role should be entered in Attribute Remarks.

FYI here's what the data I insert look like:

Automated insertion from agent merger process - Alaska Department of Transportation --> State of Alaska Department of Transportation & Public Facilities for collector role maker
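
For anyone who needs to pull the pieces back out of such a remark (for example, when reviewing a merger), a minimal sketch follows. The regular expression mirrors the remark format shown above; the function name `parse_merger_remark` and the returned dictionary keys are assumptions, and the pattern assumes the old agent name does not itself contain " --> ".

```python
import re

# Pattern mirrors the merger remark format quoted in this thread.
# Assumption: the old agent name never contains " --> ".
MERGER_REMARK = re.compile(
    r"^Automated insertion from agent merger process - "
    r"(?P<old>.+?) --> (?P<new>.+) for collector role (?P<role>.+)$"
)

def parse_merger_remark(remark):
    """Return old agent, new agent, and role, or None for other remarks."""
    match = MERGER_REMARK.match(remark.strip())
    if match is None:
        return None  # e.g. a human-written remark
    return {key: value.strip() for key, value in match.groupdict().items()}
```

Human-generated remarks will not match the pattern and come back as `None`, which is one way to separate the scripted rows from everything else.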

And as an interesting index of how free-text anything tends to work, there are roughly 50 ways of saying "information from label" in the few human-generated remarks of this new-ish concept.

@Jegelewicz added the Priority-High (Needed for work), Function-Agents, and Function-CodeTables labels on Jul 12, 2018
@mkoo
Member

mkoo commented Jul 12, 2018

Picking up this issue of allowing an agent for a specimen who is NOT an Arctos Agent.
The proposal seems like it could be straightforward if you go with the proposal to adjust CTATTRIBUTE_TYPE:
change the ATTRIBUTE "verbatim collector" to "verbatim agent", enter the agent's name in ATTRIBUTE_VALUE, and make ATTRIBUTE_UNITS required with a controlled vocabulary of collector, determiner, preparator, maker (basically using CTCOLLECTOR_ROLE, which does not currently have determiner).

Is that the correct interpretation?

@Jegelewicz
Member

I think that would work! Just want to make sure that we don't mess with anything given Dusty's request above that we don't change code tables:

Please do NOT make any code table changes - this is used by application code, I'll sync everything up when we have an acceptable solution.

@dustymc
Contributor

dustymc commented Jul 16, 2018

I remain hesitant to extend this beyond Agents who would normally be linked to specimens via table Collector, and I think the imperfect but consistent naming convention helps clarify that.

Attributes with units must be numeric: http://handbook.arctosdb.org/how_to/Understanding-Attribute-Errors.html

The merger-bot adds a remark of the format given in #1492 (comment). Anything different will make these data even less discoverable, should that be possible.

@dustymc
Contributor

dustymc commented Feb 10, 2020

  1. The initial problem has been fixed and wasn't actually related to this attribute.
  2. The proposal to make this 'verbatim agent' effectively disables the reason for which this attribute was created. With that change, I can't tell why the attribute exists; it becomes a random string presumably representing some agent-like entity that did something vaguely related to something that was at some point somehow related to the record.
  3. The entire point of this attribute is and always has been centered on "NOT an Arctos Agent."
  4. The suggestions regarding units on strings are not structurally feasible.

Tentatively closing.

@dustymc closed this as completed on Feb 10, 2020
@Jegelewicz
Member

OH I get it - so if we WANT to add a not-in-Arctos name as a preparator, then we should add "verbatim preparator" to the attribute code table?

@dustymc
Contributor

dustymc commented Feb 10, 2020

add "verbatim preparator" to the attribute code table

What would you hope to accomplish by that?

The (initial, anyway) point of this was primarily to log agent mergers. That's really only something you'd be interested in after you'd found a record (eg, you don't think it was actually {collector-roled} by AgentID=12345), and you can get the details of that (if my scripts introduced the attribute) from remarks. I don't think more structure would lead to better usability, it would just lead to very confusing data as http://arctos.database.museum/info/ctDocumentation.cfm?table=CTCOLLECTOR_ROLE evolves.

The concept has also been useful for initial import; just throw whatever you've got in the attribute, don't clean up or load "collectors" as agents. Remarks is a perfectly suitable mechanism for recording collector role there as well (although I don't think we've ever used it for data which had more than collector_role=collector).

@campmlc

campmlc commented Mar 1, 2020

I'm trying to follow this and failing. Can't this just be an interface issue? Keep the table structure as is, change the UI term to verbatim agent? Define verbatim agent for this use case but keep in the Table collector? The issue here is what people see, not what's under the hood.

@dustymc
Contributor

dustymc commented Mar 2, 2020

The issue here is what people see, not what's under the hood.

This is a data-driven ultra-normalized part of Arctos; those are one and the same.

5 participants