Skip to content

Conversation

@marcospri
Copy link
Contributor

@marcospri marcospri commented Apr 8, 2025

This PR:

Migrate AnnotationModeration

  • Creates a new moderation_status column on Annotation (and Annotation Slim)

Every existing annotation will start with a null value there.

  • Creates a new moderation_log table.

  • Updates the current hide/unhide feature to also take into account the new column.

We'll set it to DENIED when hiding and to APPROVED when "`unhidden".

We'll also create an audit record on moderation_log for it.

  • Migration to backfill "DENIED" based on the existing AnnotationModeration rows.

  • TODO Migration to backfill moderation_log based on AnnotationModeration.

  • TODO switch reads off AnnotationModeration to the new moderation_status.

Pre moderated group creation

  • TODO feature flag for pre-moderation
  • Migration for Group.pre_moderated
  • TODO expose "pre_moderated" on the groups API used in the grup edit form.
  • TODO consume "pre_moderated" on the group edit API.
  • TODO UI for pre_moderated toggle.

Moderation queue

  • New endpoint to list the annotations of a group
  • New endpoint to change the moderation status of an annotation.
  • Expose new endpoints to the edit UI config.
  • FE for moderation
  • TODO Link to annotation in context
  • TODO allow moderated (or pending) annotations to be visible in context
  • TODO link to user

@marcospri marcospri changed the title Moderation column [PoC] Pre moderation Apr 8, 2025
@marcospri marcospri force-pushed the moderation-column branch from e1a888f to fe7c4a7 Compare April 9, 2025 12:55
@marcospri marcospri changed the title [PoC] Pre moderation Moderation proof of concept implementation Apr 9, 2025
@marcospri marcospri force-pushed the moderation-column branch from fe7c4a7 to f87c9b0 Compare April 9, 2025 13:39


class Timestamps:
class CreatedMixin:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting this in two, we not always need updated but if we have created we most likely want also updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case where updated is seemingly unnecessary (e.g. the moderation_log) I'd be tempted to just add updated anyway. You never know!

@@ -1,7 +1,12 @@
from typing import Literal

AnnotationAction = Literal["create", "update", "delete"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember why I did this but I reckon I was looking for possible values

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know about typing.Literal 👍

down_revision = "c8f748cbfb8f"


def upgrade() -> None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This migrates existing AnnotationModeration rows to the moderation_status column.

"""
)
)
print("\tUpdated annotations as DENIED:", result.rowcount) # noqa: T201
Copy link
Contributor Author

@marcospri marcospri Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Also backfill moderation_log based on AnnotationModeration.

This is not going to be possible as we don't have the user who hide them.

We could:

  • Make user nullable on moderation_log.
  • Just don't migrate these to moderation_log
  • Pick one user, the creator of the group? Could be confusing on the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I hadn't really considered that we might want to back-fill moderation_log, but that might be good to have. Will we be able to set moderation_log.created correctly based on annotation_moderation.created?

I think we could go with either just not back-filling it or letting user_id be nullable.

nullable=True,
),
)
op.add_column(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • AnnotationSlim has a boolean moderated column. I'm going to keep it like that but take into consideration the new Annotation.moderation_status to keep them in sync.

We should have a spike/discussion about AnnotationSlim. We did some work there but we haven't committed to move the codebase over that table.

My current thinking is that we should:

  • Move annotation_metadata to be related to annotation.id
  • Make a assessment of current slow queries, existing indexes in annotation and make the necessary adjustments.
  • Move existing queries of AnnotationSlim back to Annotation.

I don't think we should block the pre-moderation work over this but it something we could work in parallel or right afterwards.

)

def upsert_annotation_slim(self, annotation):
def upsert_annotation_slim(self, annotation: Annotation, user: User):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are passing user to various methods above now, pass it here to and avoid this query.

permission=Permission.Annotation.MODERATE,
)
def change_annotation_moderation_status(context, request):
status = Annotation.ModerationStatus(request.json_body["moderation_status"].upper())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • This endpoint needs the mechanism to avoid updates if the annotation the client saw was updated.

@view_config(
route_name="group_edit_moderation",
request_method="GET",
permission=Permission.Group.EDIT, # TODO permission for moderation?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exposing the new endpoints should be based on:

  • A new feature flag
  • Whether or not you are moderator.

assert svc.all_hidden([]) == set()

# fmt: off
@pytest.mark.parametrize("action,is_private,pre_moderation_enabled,existing_status,expected_status", [
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the test that don't require shared changing.

We should test those here too here for coverage but I wanted to make sure all looked good so those are tested as func test here.

)


class TestAnnotationModerationServiceAllHidden:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored this to one test class per target class instead of one test class per method.

@marcospri marcospri force-pushed the moderation-column branch from f87c9b0 to a6671c5 Compare April 9, 2025 14:08
# moderated and hidden.
parents_and_replies = [self.annotation.id] + self.annotation.thread_ids # noqa: RUF005
parents_and_replies = [self.annotation.id, *self.annotation.thread_ids]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO This method below is not reading from the new status column yet.

This has the effect that for example single annotation threads are not hidden from the response (but their content is hidden).

(True, ModerationStatus.SPAM, ModerationStatus.SPAM),
],
)
def test_sharing_a_private_annotation(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using func test for this cases that require a visibility change.

This would be the edit view, calling the anno write service and that calling the moderation service.

These same cases need to be unit tested but I wanted to see if everything was wired correctly.

def delete(context, request):
request.find_service(AnnotationWriteService).unhide(context.annotation)
request.find_service(AnnotationWriteService).unhide(
context.annotation, request.user
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding user to these to keep the audit log up to date.

def _present_for_user(service, annotation, user):
annotation_json = service.present_for_user(annotation, user, hide_moderated=False)

annotation_json["moderation_status"] = (
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is duplicated on the group_annotations endpoint as well. We'll need a better way to deal with this.

def list_annotations(context: GroupContext, request):
group = context.group
params = validate_query_params(PaginationQueryParamsSchema(), request.params)
page_number = params["page[number]"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't tested any of the pagination.


@staticmethod
def _annotation_search_query(
def annotation_search_query(
Copy link
Contributor Author

@marcospri marcospri Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making this public and adding a bunch of parameters.

return {m.annotation_id for m in query}

def set_status(
self, annotation: Annotation, user: User, status: ModerationStatus | None
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we set an status (almost) unconditionally.

We log it on the moderation_log table as well.

)
annotation.moderation_status = status

def update_status(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we might change the status on creation/updates

@marcospri
Copy link
Contributor Author

I reckon this PR has serve its function. Closing it.

@marcospri marcospri closed this Apr 17, 2025


class Timestamps:
class CreatedMixin:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case where updated is seemingly unnecessary (e.g. the moderation_log) I'd be tempted to just add updated anyway. You never know!

@@ -1,7 +1,12 @@
from typing import Literal

AnnotationAction = Literal["create", "update", "delete"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know about typing.Literal 👍

Comment on lines +14 to +22
sa.text(
"""
UPDATE annotation
SET moderation_status = 'DENIED'
FROM annotation_moderation
WHERE annotation.id = annotation_moderation.annotation_id
AND annotation.moderation_status is null
"""
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think there's few enough rows in annotation_moderation for this to work as a single query? Look like almost 9,000 rows, so I guess this should be fine.

"""
)
)
print("\tUpdated annotations as DENIED:", result.rowcount) # noqa: T201
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I hadn't really considered that we might want to back-fill moderation_log, but that might be good to have. Will we be able to set moderation_log.created correctly based on annotation_moderation.created?

I think we could go with either just not back-filling it or letting user_id be nullable.

Comment on lines +13 to +16
"APPROVED",
"DENIED",
"SPAM",
"PENDING",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to put these actual strings in the DB in case we decide to change them in future. For example Rob was saying he thought "Rejected" would be better than "Denied". I'm wondering if the DB should just contain ints, but then the code can contain aliases/constants for the ints like APPROVED = 1
etc.

@marcospri marcospri deleted the moderation-column branch May 14, 2025 07:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants