What are our Agent goals? #6813

Closed
dustymc opened this issue Oct 7, 2023 · 18 comments

Labels
  • Enhancement: I think this would make Arctos even awesomer!
  • Function-Agents
  • Help wanted: I have a question on how to use Arctos
  • Let's talk!: GH comments are exhausted; to be discussed in realtime
  • testable: feature is running in test, feedback is welcome

Comments

@dustymc
Contributor

dustymc commented Oct 7, 2023

Latest Edit

My understanding of The Community's goals is summarized in https://docs.google.com/document/d/1tSk344mJsIRYZ-BJcYfyaPEmbrFS-wSqZzJ5dO1E2EU/edit. I believe that's all immediately actionable. CHAS heavily curates Agents and has expressed a strong desire to maintain high-quality Agent data; @droberts49 indicates that they could work within the proposed model. I believe it would also easily accommodate @DerekSikes' needs.

@Jegelewicz's experiment in agent attributes (https://docs.google.com/spreadsheets/d/1h0cxNvMyIJuM9M30g75t5hG7k6V1FbxEOlUb7c7cnvA/edit#gid=0) indicates a mechanism - I think it's capable of carrying nearly any amount of information, with versioning providing a full history including editors.

Is this an accurate summary of The Community's views and goals? If not, how do we fix it? If so, how do we proceed?

HELP!

Edit

There are at this point many agent issues, some of which would require a different outlook if not structure. It's not clear to me what The Community's goals are, but I think they may not be what I'd initially assumed below. This is a plea to do what the amended title of this Issue requests: clarify the goals. (The rules and structure should be easy once that's accomplished, and cannot be considered while the goals are unclear.)

#7135 contains an explicit request to make verbatim agents into Agents, which I believe is also an implicit request to remove all "requirements," possibly including that of unique preferred names. I think that is most-shaping at this point, but I also don't know if the implications are understood and/or accepted by The Community.

These are affected Issues, but perhaps do not drive the goals, or do so to a lesser extent.

Original Below

Is your feature request related to a problem? Please describe.

associate of [ link ] is being used to game the 'agents need data' system. I don't think it adds anything of value, just a lot more work for overall lower-quality data.

Describe what you're trying to accomplish

  • Call people what they want to be called.
  • Provide proper attribution.

Describe the solution you'd like

No hiding behind flaky relationships!

Describe alternatives you've considered

  • Don't call people what they want to be called.
  • Don't provide proper attribution.

Additional context

Priority

Anything that might help stop unnecessary agents seems pretty critical from here.

@dustymc dustymc added the Enhancement I think this would make Arctos even awesomer! label Oct 7, 2023
@ewommack

ewommack commented Oct 7, 2023

I have several people who are associated with organizations, but not hired by them, volunteers, or students. Associate is a nice catch all term that allows us to catch these more broad relationships.

Are there other terms that you would suggest?

@AJLinn

AJLinn commented Oct 7, 2023

I agree with @ewommack - associate of is a valuable and completely legitimate relationship for actual research associates of museums (see my list of my own department: https://www.uaf.edu/museum/collections/ethno/staff/), as well as those she describes.

Likewise, currently all the members of the various Arctos teams have "associate of" Arctos Working Group, etc. (See my own agent profile as an example: https://arctos.database.museum/agents.cfm?agent_id=21248712)

I think this is actually a symptom of a bigger problem in the forced verbatimization of agents that we underwent last year that many of us are still bitter about. The fact that operators feel the need to add the "associate of" relationship as a way of getting around the verbatim agent status is a major signal of a portion of the community's feeling about this new lower-status agent and the additional work being placed upon us to re-agentify those previously deemed low-data status and those that might be flagged as such in the future.

Please don't get rid of this useful relationship.

Maybe instead, think about why people are using it in the way you are perceiving and if verbatim agent as an attribute is serving the purpose that was intended. I know for myself, it's created a huge backlog of additional work I'm being forced to take on so my users and myself are able to access the data they expect (e.g., including verbatim agent in my search results is a cost of 50, which means my search results are limited to 500 records, when there should be 3x that number).

Screenshot 2023-10-07 at 11 42 28 AM

@mkoo
Member

mkoo commented Oct 9, 2023

I know it's inconsistently applied but this is what we did before that relationship (I think it was before!)
Firefox_Screenshot_2023-10-09T04-45-13 562Z

We should clean these two names up and could use 'associate of' as well as 'not the same as' to distinguish, right?

'associate of' definitely has its uses

(Actually Arctos won't allow two agents with the same preferred name even with other distinguishing attributes! What's the best course of action then?)

@dustymc
Contributor Author

dustymc commented Oct 9, 2023

'not the same as'

That's also used to mean "no data but gonna do this anyway."

Arctos won't allow two agents with the same preferred name

And cannot as long as we're allowing Agents with data which keeps them indistinguishable. This can probably be dropped as unrealistic, but we are going to be stuck behind that silly constraint until and unless we find some way to stop creating low-information Agents. (And neither David Stephen Taylor has any information which might require an Agent...)

@mkoo
Member

mkoo commented Oct 9, 2023

ah the realities of curation! See my other issue about low information agents as a necessary step in data migration and clean-up (#6114). We need a reality check on the pace of cleaning up agents.

Here's the way I see this: we aim for high quality agents, which is achievable through coalescence of lots of disparate data sources (e.g. obituaries, museum transactions, publications, etc.). However, we don't always have access to that IMMEDIATELY at data entry. Sometimes it's a short period dependent on data entry, sometimes it's an unknown period dependent on idiosyncratic actions and serendipity at maybe more than one collection (e.g. archives, records at different institutions, etc.).

But we will almost never have it entirely at data entry. And that's about 80% of the issues regarding Agents.

We struggle with entering in agents with dup names, unusual names, incomplete names, etc. But that is the reality that we face with both legacy data and incoming data at the museums (e.g. salvage, etc.). So I'm seeing lots of "workarounds" like the above which I think is worse for data quality (Dusty called them "barely above the bar" or something!).

If we don't use 'associate of' then someone will request a new term for even vaguer relationship? (oh I know you hate that idea but that's what's happening now)

So it's not incompatible that we care about unambiguous agents (known knowns in the words of a former defense secretary) because we also need to accommodate known ambiguous agents (the known unknowns). We just want to avoid the unknown unknowns!

@mkoo mkoo added this to the Needs Discussion milestone Oct 9, 2023
@dustymc
Contributor Author

dustymc commented Oct 9, 2023

we don't always have access to that IMMEDIATELY

Then there's no NEED for an Agent, and creating them at this time assures they can't be disambiguated from other Agents. I still suggest

  1. Issue for whatever verbatim agents lack, and
  2. verbatim agents until the data demand otherwise

need to accommodate known ambiguous agents

Yes, but not as Agents, verbatim agent exists for this very purpose.

then someone will request a new term for even vaguer relationship

Yea, so let's kill this attempt at addressing a symptom and talk big picture: What are the goals for Agents in Arctos, and what are we willing to do to accommodate them?

FWIW the agent project was intended to make entry much easier (by requiring the complexity of a full-blown data object only when the data require such a thing), and make agent creation much easier when necessary (by not requiring it when there's not much data, and putting it in the context of related data - e.g. somewhere Arctos tools can get at it, rather than in a spreadsheet in front of and separate from an incoming collection). Creating low-quality agents does neither of those things, and clearly can't support what I understand to be the end goals (providing proper attribution and allowing nonunique preferred names).

@Jegelewicz
Member

What are the goals for Agents in Arctos, and what are we willing to do to accommodate them?

The fact that we couldn't answer that question is why @ArctosDB/agents-committee is on hiatus. If we aren't going to use the Agent table as an authority, then we shouldn't prevent any new agent from being created. I don't think anyone is going to sign up to "clean up agents" if any old agent can be created because there is absolutely no way to handle any agent used by more than "my" collection.

I don't care which path we choose, but we need to choose one because standing in the middle of "create an agent for A. B. C." and "thou shalt know thy agents (and prove it with some good information)" is making everything too hard.

@dustymc dustymc changed the title Feature Request - remove agent relationship 'associate of' What are our Agent goals? Oct 16, 2023
@dustymc
Contributor Author

dustymc commented Oct 16, 2023

I don't care which path we choose

Thanks, title edited. I don't care either, but what we do (or don't do) in the data 100% controls what we can do with the data. Agree being stuck in the middle (where we always seem to end up...) is the least-comfortable position. If we don't care about these data then we should make it easy to not care about these data and lose the bar that's clearly easy to duck anyway, if we do care (or want to answer questions which require that care, or whatever) then we need to address this (and I've been meaning to file an Issue about auto-deleting noncompliant agents, and....).

@ewommack

While I understand the great desire to have all of our agents be full data, I just do not think we have the ability to do that. I also think that some of the power of Arctos just comes with the ability to track agents through individual collections. Maintaining that connection for even low information agents means that we can go back and fix them later.

I know we switched to the verbatim agent attribute for lower quality agents, but I don't think people feel like they get the same tracking and data power from them as they do from making a full agent still.
If I have someone who I can only track down one association with, but they have over 100 objects in my collection, I might feel like they need to be a full agent to really make sure I could utilize that data well.

@dustymc
Contributor Author

dustymc commented Oct 16, 2023

understand the great desire to have all of our agents be full data, I just do not think we have the ability to do that.

Clearly true, we're always going to have strings which can't be resolved to a definable entity.

feel like they need to be a full agent

I'm still fine with that, but it has what I believe to be unavoidable costs, the most obvious of which are

  1. I can see absolutely no way that Arctos can function with multiple "Some Agent" preferred name low-data agents. (Been there, done that, we stopped for reasons which became compelling even with a much smaller Arctos.) If we're not requiring some functionally useful minimal data then I think we can only tolerate one "Some Agent." Clearly lots of people share names so that's a bit of a conflict with reality, but it's also what we currently live under (albeit with different expectations, at least from me) and is mostly functional.
  2. I'm absolutely confident that in the above scenario we cannot provide proper attribution. We'll end up with "Some Agent" referring to 5 entities, and also "S. Agent" and "Some Person" and ... 30 other indistinguishable variations also referring to one or more of them. There's just no possible way to know all of what that "Some Agent," and only that "Some Agent," has done, so there's no possible way we're going to give them proper attribution in a system without minimal standards.
  3. (Parenthetical because I'm not sure a messier mess has any functional implications, but dropping even the current expectations - or my interpretation of them - would likely result in orders of magnitude more similar agents which would be a very different UX experience, but again the split seems to be 'standards or not' so this is perhaps not worth mentioning at all.)

What I'm not fine with is the middle ground. If we're moving to a no-standards model then we need to drop a bunch of rules and update the documentation so that people have realistic expectations of things like the benefits of cleaning Agents or receiving attribution for their contributions, and if we as a Community happen to find value in any of the aforementioned then we need to find a way to shape our data around those things.

@AJLinn

AJLinn commented Oct 16, 2023

As someone who personally views the data associated with the people in Arctos as being equally if not more important than the data contained in the catalog records, I appreciate where our Arctos staff are coming from.

And I do believe there is a valid place in Arctos for verbatim agents, like all the other verbatim fields we have recently added. I also want to publicly say that I very much appreciate the addition of the verbatim agents in the Manage Agents interface with a link to the catalog record where they show up. This is super helpful.

As a collection that has a majority of agents with Alaska Native / Native American / Canadian First Nations cultural origins and therefore, an inherently different naming structure and sometimes lettering system (e.g., Canadian syllabics) for preferred names, with the other part being corporations or companies or LLCs owned and bought and sold by a whole host of different organizations over time... (don't get me started)... what I object to is our decidedly rigid format that is setting the standard, and any automatic "bad duplicate of unknown" process that is being forced upon those records at what seems like totally random schedules. Which then causes anyone with those agents to drop everything else we're doing and address this brushfire immediately or else face the verbatimization of an agent like "Edison Electric Appliance Company" who was clearly a real organization connected to items in our collection before we were required to have these other data affiliated with the record to make it a real thing.

Rather, I'd prefer to have the ability to have these agent records flagged in some way for managing when we set our schedule, without the threat of elimination if I'm not able to suddenly prioritize this task. I'm still going thru the very long list of agents for cleanup - my team and I have successfully fixed 391 agents, 359 of which have been re-agentified and only 28 remained as verbatim, and three were discovered to be bad duplicates. But I still have 1,426 to review (and I no longer have a team to do this work!) and fix. Checking Ancestry, etc. takes a lot of time to make these updates, and yes, when I get that data input I feel like I've just brought someone back into existence again! But even doing this in my free time on weekends, I can only get through about 10 records in ~2 hours. Having to totally re-create an agent record from scratch and then re-add this person to all the records they were originally connected to just adds to that elapsed time.

Perhaps we just need to be okay with our "functionally useful minimal data" being that this agent is affiliated with a particular collection, who is likely going to be best suited to figure out the mystery of who they really are, given they have the objects, the specimen tags, the physical paperwork and ancillary products potentially connected to that person.

@dustymc
Contributor Author

dustymc commented Oct 16, 2023

"functionally useful minimal data" being that this agent is affiliated with a particular collection

valid place in Arctos for verbatim agents

These statements seem conflicting to me. If that's the bar then I don't know what a verbatim agent could DO, other than add complication.

@mkoo
Member

mkoo commented Oct 16, 2023

We may be talking (posting) past each other now, which is counterproductive. I am going to request we take this to an online meeting. Unfortunately I can't make any of the Arctos office hours this week. Any availability this Friday or next Monday?

@AJLinn

AJLinn commented Oct 17, 2023

Thanks Michelle - I agree we should start talking. I'm at a conference W-F this week, but would happily meet next Monday.

@ewommack

I'm at a conference W-F this week, but would happily meet next Monday.

Me too. Unfortunately I do not think Angie and I are off to the same conference.

@Jegelewicz
Member

Bionomia will be running a virtual workshop titled "Roundtripping People Identifiers in Collections Management Systems" in early December (Dec 6/7 depending on where you are in the world). See https://www.eventbrite.ca/e/roundtripping-people-identifiers-in-collections-management-systems-tickets-733839462587 for more information.

@mkoo
Member

mkoo commented Jan 19, 2024

After a lot of discussion, the first step towards the new Agents wish list will be implementation of the verification as an attribute. The new attribute type would accept only: verified | accepted | unverified as values. Agent Committee can discuss criteria for each but for now:

| status | symbol | definition |
|---|---|---|
| verified | gold star | something a curator would add to an agent to signify that it has the validated and meaningful information to the collection. |
| accepted | silver star | potentially could be bot-driven based on a minimum number of attributes |
| unverified | no star for you! | has no other data, i.e. low quality data |

We can work on the symbology (Arctos bear with stars?) and where they would appear but definitely on public page. Did I get that right @dustymc ?

Meanwhile you can use these font-awesome codes:
<i class="fa-solid fa-star"></i> "gold star"
<i class="fa-duotone fa-star"></i> "silver star"
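
For anyone who wants to reason about the three-value attribute concretely, here is a minimal Python sketch of how it might behave. It is only an illustration, not how Arctos implements or will implement it: the names `VerificationStatus`, `suggest_status`, `parse_status`, and the threshold `MIN_ATTRIBUTES_FOR_ACCEPTED` are all hypothetical, and the actual criteria are left to the Agent Committee as noted above.

```python
from enum import Enum


class VerificationStatus(Enum):
    """The only three values the proposed attribute would accept."""
    VERIFIED = "verified"      # gold star: added by a curator
    ACCEPTED = "accepted"      # silver star: could be bot-driven
    UNVERIFIED = "unverified"  # no star: no other data


# Hypothetical threshold; the thread does not define the minimum.
MIN_ATTRIBUTES_FOR_ACCEPTED = 2


def suggest_status(attribute_count, curator_verified=False):
    """Suggest a status from an agent's attribute count.

    Assumption: only a curator can mark an agent verified; an automated
    check can promote an agent no further than accepted.
    """
    if curator_verified:
        return VerificationStatus.VERIFIED
    if attribute_count >= MIN_ATTRIBUTES_FOR_ACCEPTED:
        return VerificationStatus.ACCEPTED
    return VerificationStatus.UNVERIFIED


def parse_status(value):
    """Convert user input to a status; raises ValueError for any other value."""
    return VerificationStatus(value.strip().lower())
```

Under these assumptions an agent with no attributes stays unverified, a bot could move an agent with enough attributes to accepted, and verified is reserved for an explicit curator decision.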

@Jegelewicz
Member

That's how I remember it!


5 participants