As a dataverse administrator, I want to remove my name from contributors list of a dataset version #6274

Closed
tcoupin opened this issue Oct 15, 2019 · 17 comments

Comments

@tcoupin
Member

tcoupin commented Oct 15, 2019

All editors of a dataset appear in the contributors list of the published dataset version.

In the following snapshot, only the scientist (Christian DEPRAETERE) should be displayed, not the administrators who made only small adjustments.
[image: snapshot of the contributors list]

@tcoupin
Member Author

tcoupin commented Oct 15, 2019

A temporary workaround could be to delete rows from the datasetversionuser table. What do you think about that?

@djbrooke
Contributor

djbrooke commented Oct 15, 2019

Hi @tcoupin, in #5541 there were changes that allowed updates to be made by superusers without making a version change. This was included with 4.12. I'd suggest using this feature instead of removing records that were created.

I acknowledge that the use case that you present is slightly different, but I'm concerned about providing extended functionality in this area because each additional option for editing without a record reflected in the versions view takes away from what the versioning system was meant to support in the first place. Suggestions/discussion welcome!

@tcoupin
Member Author

tcoupin commented Oct 17, 2019

Hey @djbrooke,
I use this feature, but my name still appears in the contributors list. Since this feature is for superusers only, we could remove names when publishing without a version increase.

@djbrooke
Contributor

Hi @tcoupin, I'm not sure I understand. Are you saying that you are able to use the feature to publish without incrementing the version, but your name still appears on the contributor list?

@tcoupin
Member Author

tcoupin commented Oct 18, 2019 via email

@djbrooke
Contributor

Thanks @tcoupin, I just tested this in this way:

A new version was not created (as expected) and I did see the superuser account added to the contributors list as you mention.

@qqmyers and @adam3smith, do you have any thoughts on this? I suppose I was expecting the superuser account not to be added to the contributors list for the current version, but thinking about it more, this is correct and makes sense. Just wondering if it came up as you were developing #4760/#5541.

@scolapasta
Contributor

I can see both sides of this. A change was made to that version, so we should track it as if it had happened when in draft. But I also see the idea of having a superuser make changes for the contributor and not needing to show up as a contributor.

I've seen other applications where a superuser could log in "as" another user. Not sure if that kind of functionality is overkill for this or even wanted generally.

@qqmyers
Member

qqmyers commented Oct 18, 2019

The Contributors as listed in the versions table are not the Creators listed in the dataset metadata and the citations. Would changing the labeling, or what's displayed in the table resolve the issue (versus actually changing who is considered a Contributor to the dataset object in Dataverse)?

In some sense, who contributed in Dataverse is not the same kind of info as the rest (i.e. it's only interesting to someone tracking use in Dataverse as opposed to someone coming in to check out the data), so perhaps it should not appear in the Versions table. Or be a column that appears only for admins, or is on a separate admin/mgmt tab, etc.

(FWIW, the work to do archiving is similar, in that it is mostly of interest to admins/curators/managers. For QDR, we've (for now) added a column to the versions tab, visible only to admins, that shows whether the archival copy exists. If there start to be several things that are management-related, perhaps a separate mgmt/admin tab would be the better option.)

@adam3smith
Contributor

I want the information captured and I think having a log of who made changes, including to metadata, is the right way to go (it certainly is for us).

Completely agree with @qqmyers though that the label may need a change to avoid confusion between this and the metadata label of the same name. How about "Edited by"?

@stevenmce

stevenmce commented Oct 20, 2019 via email

@djbrooke
Contributor

Thanks all for the discussion here. I hear the arguments about a label change being a possibility here, but I also like "contributors" because it recognizes the act of curation as a contribution, not just editing (I think I'm echoing @stevenmce). I encourage further conversation here but making changes in this section isn't something that's currently a priority.

@tcoupin, the team at IQSS has other priorities right now and we can't really dig into this in detail from a design perspective. In the short term, we'd also want to avoid any PRs that remove contributors (even those just API based). We don't want to add something right now that will make it harder to support workflows for sensitive data in the future, which require auditing and other characteristics. A DB edit may meet your needs but I'm not sure.

@scolapasta
Contributor

A DB edit can solve any problem. :) But yes, in this case, it's a straightforward delete from one table and there are no dependencies on those rows, so you can definitely do manual DB deletes for this.
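
For readers following along, here is a minimal sketch of what such a manual delete might look like. It is an illustration only, not a supported procedure: it runs against an in-memory SQLite stand-in rather than a real Dataverse database, and the column names `authenticateduser_id` and `datasetversion_id`, the helper `remove_contributor`, and all ids are assumptions made for the example. Check the actual schema of the datasetversionuser table (and take a backup) before attempting anything similar.

```python
import sqlite3


def remove_contributor(conn, dataset_version_id, user_id):
    """Delete the rows linking one user to one dataset version; return the count."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "DELETE FROM datasetversionuser "
            "WHERE datasetversion_id = ? AND authenticateduser_id = ?",
            (dataset_version_id, user_id),
        )
    return cur.rowcount


# Stand-in table so the sketch runs on its own (assumed columns, made-up ids).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE datasetversionuser ("
    "id INTEGER PRIMARY KEY, "
    "authenticateduser_id INTEGER, "
    "datasetversion_id INTEGER)"
)
conn.executemany(
    "INSERT INTO datasetversionuser (authenticateduser_id, datasetversion_id) "
    "VALUES (?, ?)",
    [(1, 10), (2, 10), (2, 11)],  # user 1 = scientist, user 2 = administrator
)
conn.commit()

# Remove the administrator from version 10 only; version 11 is left untouched.
deleted = remove_contributor(conn, dataset_version_id=10, user_id=2)
remaining = conn.execute(
    "SELECT authenticateduser_id, datasetversion_id "
    "FROM datasetversionuser ORDER BY id"
).fetchall()
print(deleted, remaining)
```

Scoping the delete to one version and one user, and running it inside a transaction, keeps the change narrow and easy to roll back if the wrong ids were supplied.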

@TaniaSchlatter
Member

TaniaSchlatter commented Oct 21, 2019

Maybe in the short term we could add a tool tip that clarifies the term "Contributors" as used in the versions table.

@djbrooke
Contributor

I'm going to close this for now since we won't be making changes, but we can of course reference it in the future.

@BPeuch
Contributor

BPeuch commented Apr 3, 2020

Hello everybody,

@djbrooke explained that several things discussed here entail large-scale changes and that they're not priorities at the moment, and I took good note of that. But I just wanted to make a few suggestions in case they can help in future discussions.

  1. I very much agree with the idea of specifying who exactly did what, Wikipedia style. I think it would add a lot of value in terms of transparency, especially considering the present use case: admins and/or superusers modifying datasets alongside the depositors / other authorized users. Even if an admin/super's modifications are minor, monitoring who did what exactly will prevent users from suspecting foul play or feeling like the Dataverse admins can do what they want in total secrecy. In turn it will also allow the admins to monitor their users and e.g. reach out to users who fail to encode (meta)data properly (cf. Pareto principle).

  2. That being said, I very much side with @stevenmce and think that it's also important to make roles explicit, yet without erasing the admin/superuser's name because that information is still extremely relevant if we are to do things in an honest, transparent manner. In this way it would become immediately clear why this John Smith or that Jane Doe so very often modify datasets: because they're admins/superusers, so they're just carrying out maintenance/technical tasks, cleaning things up, correcting typos and what not, as is expected from admins.

  3. Finally, I entirely agree with @scolapasta. The archivist in me is outraged at the thought of not documenting changes, however small they might be, and in spite of the fact that they might have been carried out by administrators. Building trust in our users entails being as transparent as may be (within, of course, reasonable limits; e.g. it's not about publishing the list of our users' email addresses) and modifying their precious datasets is something that must be done according to public policies and under automated monitoring.

@BPeuch BPeuch added this to Santa's watching (Keep an eye on) in Dataverse SODHA (Belgium) Apr 3, 2020
@BPeuch BPeuch removed this from Santa's watching (Keep an eye on) in Dataverse SODHA (Belgium) Nov 26, 2020
@pdurbin
Member

pdurbin commented Feb 2, 2021

This issue seems to be a decent place for me to write up some behavior I noticed while working on #4475.

The ReturnDatasetToAuthorCommand calls updateDatasetUser, which adds a row to the datasetversionuser table containing the id of the user who called that command. The datasetversionuser table is what's used to populate the "Contributors" list, which looks like this:

[image: screenshot of the "Contributors" list, 2021-02-02]

All this is to say that in some cases curators might be noticing their names under "Contributors" not because they edited the dataset but because they clicked "Return to Author" (the second half of the "Submit for Review" workflow).

Here's the code mentioned above: https://github.com/IQSS/dataverse/blob/v5.3/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/ReturnDatasetToAuthorCommand.java#L47
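
The behavior described above can be illustrated with a small toy model. This is a sketch under stated assumptions, not the Dataverse implementation: a dictionary stands in for the datasetversionuser table, and `edit_metadata`, `return_to_author`, and `contributors` are hypothetical helpers. Only the idea that both actions go through the same "record this user" step comes from the description above.

```python
from datetime import datetime, timezone

# In-memory stand-in for the datasetversionuser table:
# (datasetversion_id, user_id) -> timestamp of the user's last recorded action.
dataset_version_user = {}


def update_dataset_user(version_id, user_id):
    """Record that user_id acted on version_id (insert or refresh the row)."""
    dataset_version_user[(version_id, user_id)] = datetime.now(timezone.utc)


def edit_metadata(version_id, user_id):
    # Hypothetical edit action: the editor is recorded, as one would expect.
    update_dataset_user(version_id, user_id)


def return_to_author(version_id, curator_id):
    # No metadata is edited here, yet the curator is still recorded.
    update_dataset_user(version_id, curator_id)


def contributors(version_id):
    """Everyone with a row for this version, regardless of what they did."""
    return sorted(u for (v, u) in dataset_version_user if v == version_id)


edit_metadata(version_id=1, user_id="author")
return_to_author(version_id=1, curator_id="curator")
print(contributors(1))  # ['author', 'curator']
```

Because the list is built from rows alone, it cannot distinguish an edit from a "Return to Author" click, which is why a curator can show up without having changed anything.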

@scolapasta
Contributor

Hi, @eunices, yes, deleting the rows directly from that table should be OK. It would be nice if we had a superuser API though, as that would of course leave less room for error. But in the meantime, deleting manually is the way to do this.


9 participants