
User moderation #2290

Open
11 of 15 tasks
zzacharo opened this issue Jul 7, 2023 · 9 comments
Assignees
Labels
EPIC stale No activity for more than 60 days.

Comments

zzacharo (Member) commented Jul 7, 2023

This is a task to design user moderation in InvenioRDM. It will include capabilities to mark a user as spam, deactivate a user, delete all of a user's content, and safelist a user. We will need to implement both the backend and the frontend for this feature.

Resources

See mockups here under the "Users dashboard (List records)"

See RFC here

Instructions on how to test moderation until we have a UI: CodiMD

Actions

  • Design the feature, writing a small RFC to align the architecture of the backend with the rest of the team
  • Ensure that the UI design is agreed on and that the interfaces from the mockups are clearly updated in the RFC
  • Define current scope from the RFC, both for the backend and frontend implementation
  • Make sure the tasks are clearly stated in issues and tracked by this parent issue

Issues

alejandromumo (Member) commented Jul 26, 2023

1st Deliverable

From discussions, the following was agreed:

  • We will use invenio-requests for moderation requests. We will create a new request type for user moderation and use the underlying generic requests service/resource to interact with these requests.
  • User moderation requests are created when the user first publishes a record / creates a community.
  • Actions on users (block, approve, suspend) should live on the users service.
  • Permissions-wise, we should have more granular permissions for administration, e.g. some admin users might be able to moderate users while others cannot.
  • "After-actions" (e.g. what happens after we block a user) can be asynchronous for cases where they could take a long time (e.g. deleting thousands of records).
  • Permissions on the users service are dictated by the service permission policy. We add new actions like "can_manage" that are only allowed to users with a certain role/group (typically one that enables user moderation).
  • Permissions on the requests are dictated by the request's entities. We create a new entity resolver for roles/groups, and these require the moderation role.
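
The permission split described above can be illustrated with a small, self-contained sketch. This is not the real invenio-access/invenio-users-resources API; the policy class, the `can_manage` lookup, and the `user-moderation` role name are all hypothetical stand-ins for the idea of an admin action gated by a specific role:

```python
# Minimal sketch of role-gated service permissions: an action such as
# "manage" is only allowed to users holding the role that enables user
# moderation. All names here are illustrative, not the invenio APIs.

class User:
    def __init__(self, name, roles=()):
        self.name = name
        self.roles = set(roles)

class UsersServicePermissionPolicy:
    # action name -> roles allowed to perform it
    can_manage = {"user-moderation"}  # hypothetical role name

    @classmethod
    def allows(cls, action, user):
        required = getattr(cls, f"can_{action}", set())
        return bool(user.roles & required)

moderator = User("alice", roles={"user-moderation"})
regular = User("bob")

print(UsersServicePermissionPolicy.allows("manage", moderator))  # True
print(UsersServicePermissionPolicy.allows("manage", regular))    # False
```

This mirrors the point above: the service policy grants "can_manage" per role, while some admin users without that role would still be denied.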

We agreed on splitting these requirements into multiple deliverables:

  • The first deliverable delivers the bare minimum, e.g. we can block users from the platform and they can't log in anymore.
  • The second deliverable delivers 1) after-actions, e.g. after blocking a user we also execute side effects like deleting all of the user's records; and 2) automatic request creation after the user publishes for the first time.
  • From the second deliverable we have multiple open questions, e.g. how we add the user's "activity" to the request so the moderator has more context on the user, such as the list of the user's records.

alejandromumo (Member) commented Jul 27, 2023

After an IRL meeting, the following was discussed:

  • Naming-wise, the entity should be called "group" instead of "role". However, the entity being resolved by a "GroupResolver" is a "Role".
  • Ideally, we should address these groups (or roles) by ID rather than by name. However, this can be implemented afterwards.
  • We might have use cases where the receiver of a request can be multiple entities, e.g. the receiver can be a user or a group of users. So we might have "receivers" instead of a single "receiver".

For the usage of role id instead of role name:

  • We agreed to tackle this in a separate issue so it doesn't block the feature from being merged.
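
The naming agreed above can be illustrated with a minimal sketch: a resolver named after the "group" entity that resolves a Role under the hood, addressed by ID rather than by name. The classes and the registry here are hypothetical, not the actual invenio-requests entity-resolver API:

```python
# Sketch: the request entity is called "group", but what a
# GroupResolver resolves is a Role. Lookup is by ID, per the
# agreement to move away from addressing roles by name.

class Role:
    def __init__(self, id, name):
        self.id = id
        self.name = name

# Hypothetical in-memory registry standing in for the accounts datastore.
ROLES_BY_ID = {"r-42": Role("r-42", "user-moderation")}

class GroupResolver:
    """Resolves a {"group": <role id>} reference to a Role."""

    type_key = "group"  # entity name exposed on requests

    def resolve(self, reference):
        role_id = reference[self.type_key]
        return ROLES_BY_ID[role_id]

resolver = GroupResolver()
role = resolver.resolve({"group": "r-42"})
print(role.name)  # user-moderation
```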

alejandromumo (Member) commented

2nd Deliverable

Regarding moderation actions (e.g. what happens after a user is blocked), we discussed:

  • Modules should register their actions separately. For now, we can assume they are independent since we'll be dealing with soft deletions.
    • For now, we can omit per-instance configuration and add it later.
  • The users service should "notify" the registered modules to run their actions.
  • Alongside "block" actions, we also need "restore" actions.
  • We must not block the main execution thread, to avoid timeouts (e.g. we can kick off a celery task to handle after-actions).
  • Actions should be idempotent.
  • Record deletion can range from deleting one record to deleting thousands of records. We need to be careful not to exhaust a lot of resources at the same time.
  • We need to think about reliability and consistency, e.g. deal with failures and consider retries.

Following these points, the strategy for now is the following:

  • Think about decentralized actions registration (e.g. maybe entry points)
  • Think about distributed execution of the tasks
    • e.g. we can delete a user's records and communities separately
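
The registration-and-notify strategy above can be sketched in a few lines. In real code the registry would be populated via entry points and the actions dispatched as celery tasks; here a plain dict and a synchronous loop stand in for both, and all names are hypothetical:

```python
# Sketch of decentralized moderation actions: each module registers a
# "block" and a "restore" hook, and the users service notifies all
# registered modules. Actions are written to be idempotent, so a
# retried or duplicated run is safe.

MODERATION_ACTIONS = {}  # module name -> {"block": fn, "restore": fn}

def register_actions(module, block, restore):
    MODERATION_ACTIONS[module] = {"block": block, "restore": restore}

def notify(action, user_id, state):
    """The users service 'notifies' registered modules to run an action."""
    for module, actions in MODERATION_ACTIONS.items():
        actions[action](user_id, state)

# Example module: records. Idempotent by construction (set semantics).
def block_records(user_id, state):
    state.setdefault("deleted_records_of", set()).add(user_id)

def restore_records(user_id, state):
    state.setdefault("deleted_records_of", set()).discard(user_id)

register_actions("records", block_records, restore_records)

state = {}
notify("block", "u-1", state)
notify("block", "u-1", state)   # idempotent: running twice is safe
print(state)                    # {'deleted_records_of': {'u-1'}}
notify("restore", "u-1", state)
print(state)                    # {'deleted_records_of': set()}
```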

alejandromumo (Member) commented

We agreed to ship in the second deliverable the following features:

  • Registration mechanism for moderation actions (users service)
  • Empty actions (records service)

The implementation of the actions will come in the next deliverable.

alejandromumo (Member) commented Aug 4, 2023

3rd Deliverable

We agreed on having a third deliverable focusing mostly on re-indexing user records after moderation. It includes:

  • Update the record's API and mapping and add an is_verified property to it
  • The property should be computed from the owner (parent.access.owner) for now. In the future, it might need to be computed using other information (e.g. communities).
  • Update the search to sort by is_verified (similar to Zenodo), so unverified records are moved to the bottom of the search results.
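
The computation and sorting described above can be sketched with plain dicts. The `parent.access.owner` path follows the comment above; the user store, score field, and sort key are illustrative, not the actual record API or search mapping:

```python
# Sketch: is_verified is denormalized onto each record from its owner
# at index time, then search results are sorted so unverified records
# sink to the bottom (similar to Zenodo).

USERS = {"u-1": {"verified": True}, "u-2": {"verified": False}}

def compute_is_verified(record):
    owner = record["parent"]["access"]["owner"]
    return USERS[owner]["verified"]

records = [
    {"id": "rec-a", "score": 3.0, "parent": {"access": {"owner": "u-2"}}},
    {"id": "rec-b", "score": 1.0, "parent": {"access": {"owner": "u-1"}}},
]
for r in records:
    r["is_verified"] = compute_is_verified(r)  # computed from the owner

# Verified records first, then by relevance score within each group.
ranked = sorted(records, key=lambda r: (not r["is_verified"], -r["score"]))
print([r["id"] for r in ranked])  # ['rec-b', 'rec-a']
```

Even though rec-a scores higher on relevance, its unverified owner pushes it below the verified rec-b.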

alejandromumo (Member) commented Aug 8, 2023

  • Re-indexing of records done in two parts:
    • Latest versions re-indexed immediately
    • All versions re-indexed afterwards (ideally in bulk).
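
A minimal sketch of this two-part split, with a list standing in for the bulk background task (names and record shape are illustrative):

```python
# Sketch: re-index latest record versions immediately, and queue the
# remaining versions for a single bulk re-index afterwards.

indexed, background_queue = [], []

RECORDS = [
    {"id": "rec-1:v2", "is_latest": True},
    {"id": "rec-1:v1", "is_latest": False},
    {"id": "rec-2:v1", "is_latest": True},
]

def reindex_user_records(records):
    # 1) latest versions are re-indexed right away
    for rec in records:
        if rec["is_latest"]:
            indexed.append(rec["id"])
    # 2) all other versions are queued for one bulk re-index
    background_queue.append([r["id"] for r in records if not r["is_latest"]])

reindex_user_records(RECORDS)
print(indexed)           # ['rec-1:v2', 'rec-2:v1']
print(background_queue)  # [['rec-1:v1']]
```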

alejandromumo (Member) commented Aug 14, 2023

4th Deliverable

We agreed on having a fourth deliverable focusing on protecting the user from concurrent moderation actions, e.g.:

  • A user is blocked and then restored while the block actions are still running (e.g. deleting a large number of records)

The agreed solution was the following:

  • Adding a lock mechanism (similar to CDS videos) to prevent concurrent accesses to user moderation on the same user (e.g. a Mutex)
  • The lock will be created in invenio-cache for now

Some details on the implementation were also discussed and agreed:

  • The lock is acquired as soon as possible (e.g. when the service method is called) and released as late as possible.
    • This point is a bit tricky, since the "real" latest release of the lock would be when the actions finish their execution.
    • That point in time sits inside an asynchronous context (a celery task).
  • Due to the previous issue, we agreed on having the locking mechanism done in two steps:
    1. acquire the lock in the service call (with a timeout to avoid deadlocks in case of failures);
    2. refresh the lock inside the celery task.

Refreshing the lock means that the celery task will have access to the lock unconditionally, so two tasks could actually contend for the lock. We believe this can be mitigated if the lock timeout on the service call is long enough to prevent a new moderation call from being accepted at the service layer.
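
The two-step locking can be sketched with a dict of expiry timestamps standing in for invenio-cache; function names and the timeout value are hypothetical:

```python
# Sketch of the per-user moderation lock: acquired with a timeout at
# the service call, refreshed (unconditionally) from the async task,
# and released when the actions finish.
import time

LOCKS = {}  # user_id -> expiry timestamp

def acquire(user_id, timeout=30.0, now=None):
    now = time.monotonic() if now is None else now
    expiry = LOCKS.get(user_id)
    if expiry is not None and expiry > now:
        return False  # another moderation on this user is in flight
    LOCKS[user_id] = now + timeout  # timeout avoids deadlocks on failure
    return True

def refresh(user_id, timeout=30.0, now=None):
    # The celery task extends the lock unconditionally (see caveat above).
    now = time.monotonic() if now is None else now
    LOCKS[user_id] = now + timeout

def release(user_id):
    LOCKS.pop(user_id, None)

assert acquire("u-1", timeout=30.0, now=0.0)       # service call takes the lock
assert not acquire("u-1", timeout=30.0, now=10.0)  # concurrent call is rejected
refresh("u-1", timeout=30.0, now=10.0)             # task keeps the lock alive
release("u-1")                                     # actions finished
assert acquire("u-1", timeout=30.0, now=20.0)      # next moderation may proceed
```

Because `refresh` does not check ownership, correctness rests on the service-call timeout being generous enough, exactly the mitigation described above.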

alejandromumo (Member) commented

5th Deliverable

We agreed on having a fifth deliverable focusing on the creation of moderation requests when the user publishes a record.

The agreed solution was the following:

  • When a user publishes a record, we will create a moderation request asynchronously if the user is not verified yet, avoiding overloading the publish operation.
    • It is expected that this step adds one db query on the user that has to be executed synchronously (to know whether the user is verified).
  • We only allow one open moderation request for now.
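
The flow above can be sketched as follows, with a list standing in for the celery queue and dicts for the services (all names illustrative, not the actual publish flow):

```python
# Sketch: publish does one synchronous check on the user's verified
# flag, defers moderation-request creation to a background task, and
# at most one open moderation request exists for a user.

USERS = {"u-1": {"verified": False}, "u-2": {"verified": True}}
OPEN_MODERATION_REQUESTS = set()
task_queue = []  # stands in for celery

def create_moderation_request(user_id):
    # Only one open moderation request is allowed.
    if user_id not in OPEN_MODERATION_REQUESTS:
        OPEN_MODERATION_REQUESTS.add(user_id)

def publish(user_id, record):
    # one synchronous db lookup: is the user verified?
    if not USERS[user_id]["verified"]:
        task_queue.append(lambda: create_moderation_request(user_id))
    return {"published": record, "by": user_id}

publish("u-1", "rec-a")
publish("u-1", "rec-b")   # second publish still yields one open request
publish("u-2", "rec-c")   # verified user: no request at all
for task in task_queue:
    task()
print(OPEN_MODERATION_REQUESTS)  # {'u-1'}
```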

We also agreed on adding the user management REST endpoints in this deliverable, since the overhead of adding them is small for now.

We briefly discussed feature flags, namely:

  • Adding a flag to set verified to True / False by default. Some instances might want to have all new users verified by default, while others (like Zenodo) don't.
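
As a configuration fragment, the discussed flag might look like this (the variable name is invented for illustration; the actual flag name was not decided above):

```python
# Hypothetical feature flag: whether newly registered users are
# verified by default. Instances that want first-time publishers to
# go through moderation (like Zenodo) would set this to False.
USERS_DEFAULT_VERIFIED = True
```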

github-actions bot (Contributor) commented Nov 7, 2023

This issue was automatically marked as stale.

github-actions bot added the "stale" label (No activity for more than 60 days.) on Nov 7, 2023