Skip to content

Commit

Permalink
refactor: allows Taxonomy to be subclassed
Browse files Browse the repository at this point in the history
* adds optional taxonomy_class property+field to Taxonomy
* adds Taxonomy cast() method to use this class
* oel_tagging.api uses Taxonomy.cast() whenever practical
* moves ObjectTag validation back to Taxonomy
* removes ObjectTag.resync() logic -- we don't need it yet.
* removes ObjectTag.object_type field -- we're not using it for anything.
* squashes migrations from previous commits
  • Loading branch information
pomegranited committed Jul 18, 2023
1 parent bbbb4cc commit a7661bb
Show file tree
Hide file tree
Showing 12 changed files with 438 additions and 626 deletions.
15 changes: 2 additions & 13 deletions docs/decisions/0007-tagging-app.rst
Original file line number Diff line number Diff line change
Expand Up @@ -19,15 +19,15 @@ Taxonomy

The ``openedx_tagging`` module defines ``openedx_tagging.core.models.Taxonomy``, whose data and functionality are self-contained to the ``openedx_tagging`` app. However in Studio, we need to be able to limit access to some Taxonomy by organization, using the same "course creator" access which limits course creation for an organization to a defined set of users.

So in edx-platform, we will create the ``openedx.features.content_tagging`` app, to contain the models and logic for linking Organization owners to Taxonomies. There's no need to subclass Taxonomy here; we can enforce access using the ``content_tagging.api``.
So in edx-platform, we will create the ``openedx.features.content_tagging`` app, to contain the models and logic for linking Organization owners to Taxonomies. Here, we can subclass ``Taxonomy`` as needed, preferably using proxy models. The APIs are responsible for ensuring that any ``Taxonomy`` instances are cast to the appropriate subclass.

ObjectTag
~~~~~~~~~

Similarly, the ``openedx_tagging`` module defined ``openedx_tagging.core.models.ObjectTag``, also self-contained to the
``openedx_tagging`` app.

But to tag content in the LMS/Studio, we need to enforce ``object_id`` as a CourseKey or UsageKey type. So to do this, we subclass ``ObjectTag``, and use the ``openedx.features.tagging.registry`` to register the subclass, so that it will be picked up when this tagging API creates or resyncs object tags.
But to tag content in the LMS/Studio, we need to enforce ``object_id`` as a CourseKey or UsageKey type. So to do this, we subclass ``ObjectTag``, and use this class when creating object tags for the content taxonomies. Once the ``object_id`` is set, it is not editable, and so this key validation need not happen again.

Rejected Alternatives
---------------------
Expand All @@ -38,14 +38,3 @@ Embed in edx-platform
Embedding the logic in edx-platform would provide the content tagging logic specifically required for the MVP.

However, we plan to extend tagging to other object types (e.g. People) and contexts (e.g. Marketing), and so a generic, standalone library is preferable in the log run.


Subclass Taxonomy
~~~~~~~~~~~~~~~~~

Another approach considered was to encapsulate the complexity of ObjectTag validation in the Taxonomy class, and so use subclasses of Taxonomy to validate different types of ObjectTags, and use `multi-table inheritance`_ and django polymorphism when pulling the taxonomies from the database.

However, because we will have a number of different types of Taxonomies, this proved non-performant.


-.. _multi-table inheritance: https://docs.djangoproject.com/en/3.2/topics/db/models/#multi-table-inheritance
110 changes: 17 additions & 93 deletions openedx_tagging/core/tagging/api.py
Original file line number Diff line number Diff line change
Expand Up @@ -25,31 +25,40 @@ def create_taxonomy(
required=False,
allow_multiple=False,
allow_free_text=False,
taxonomy_class: Type = None,
) -> Taxonomy:
"""
Creates, saves, and returns a new Taxonomy with the given attributes.
"""
taxonomy = Taxonomy.objects.create(
taxonomy = Taxonomy(
name=name,
description=description,
enabled=enabled,
required=required,
allow_multiple=allow_multiple,
allow_free_text=allow_free_text,
)
return taxonomy
if taxonomy_class:
taxonomy.taxonomy_class = taxonomy_class
taxonomy.save()
return taxonomy.cast()


def get_taxonomy(id: int) -> Union[Taxonomy, None]:
"""
Returns a Taxonomy of the appropriate subclass which has the given ID.
Returns a Taxonomy cast to the appropriate subclass which has the given ID.
"""
return Taxonomy.objects.filter(id=id).first()
taxonomy = Taxonomy.objects.filter(id=id).first()
return taxonomy.cast() if taxonomy else None


def get_taxonomies(enabled=True) -> QuerySet:
"""
Returns a queryset containing the enabled taxonomies, sorted by name.
We return a QuerySet here for ease of use with Django Rest Framework and other query-based use cases.
So be sure to use `Taxonomy.cast()` to cast these instances to the appropriate subclass before use.
If you want the disabled taxonomies, pass enabled=False.
If you want all taxonomies (both enabled and disabled), pass enabled=None.
"""
Expand All @@ -65,37 +74,11 @@ def get_tags(taxonomy: Taxonomy) -> List[Tag]:
Note that if the taxonomy allows free-text tags, then the returned list will be empty.
"""
return taxonomy.get_tags()


def cast_object_tag(object_tag: ObjectTag) -> ObjectTag:
"""
Casts/copies the given object tag data into the ObjectTag subclass most appropriate for this tag.
"""
return object_tag.cast_object_tag()


def resync_object_tags(object_tags: QuerySet = None) -> int:
"""
Reconciles ObjectTag entries with any changes made to their associated taxonomies and tags.
By default, we iterate over all ObjectTags. Pass a filtered ObjectTags queryset to limit which tags are resynced.
"""
if not object_tags:
object_tags = ObjectTag.objects.select_related("tag", "taxonomy")

num_changed = 0
for tag in object_tags:
object_tag = cast_object_tag(tag)
changed = object_tag.resync()
if changed:
object_tag.save()
num_changed += 1
return num_changed
return taxonomy.cast().get_tags()


def get_object_tags(
object_id: str, object_type: str = None, taxonomy: Taxonomy = None, valid_only=True
object_id: str, taxonomy: Taxonomy = None, valid_only=True
) -> Iterator[ObjectTag]:
"""
Generates a list of object tags for a given object.
Expand All @@ -112,13 +95,10 @@ def get_object_tags(
.select_related("tag", "taxonomy")
.order_by("id")
)
if object_type:
tags = tags.filter(object_type=object_type)
if taxonomy:
tags = tags.filter(taxonomy=taxonomy)

for tag in tags:
object_tag = cast_object_tag(tag)
for object_tag in tags:
if not valid_only or object_tag.is_valid():
yield object_tag

Expand All @@ -127,8 +107,6 @@ def tag_object(
taxonomy: Taxonomy,
tags: List,
object_id: str,
object_type: str,
object_tag_class: Type = None,
) -> List[ObjectTag]:
"""
Replaces the existing ObjectTag entries for the given taxonomy + object_id with the given list of tags.
Expand All @@ -139,58 +117,4 @@ def tag_object(
Raised ValueError if the proposed tags are invalid for this taxonomy.
Preserves existing (valid) tags, adds new (valid) tags, and removes omitted (or invalid) tags.
"""

if not taxonomy.allow_multiple and len(tags) > 1:
raise ValueError(_(f"Taxonomy ({taxonomy.id}) only allows one tag per object."))

if taxonomy.required and len(tags) == 0:
raise ValueError(
_(f"Taxonomy ({taxonomy.id}) requires at least one tag per object.")
)

current_tags = {
tag.tag_ref: tag
for tag in ObjectTag.objects.filter(
taxonomy=taxonomy, object_id=object_id, object_type=object_type
)
}
updated_tags = []
for tag_ref in tags:
if tag_ref in current_tags:
object_tag = cast_object_tag(current_tags.pop(tag_ref))
else:
try:
tag = taxonomy.tag_set.get(
id=tag_ref,
)
value = tag.value
except (ValueError, Tag.DoesNotExist):
# This might be ok, e.g. if taxonomy.allow_free_text.
# We'll validate below before saving.
tag = None
value = tag_ref

object_tag = ObjectTag(
taxonomy=taxonomy,
object_id=object_id,
object_type=object_type,
tag=tag,
value=value,
name=taxonomy.name,
)
if object_tag_class:
object_tag.object_tag_class = object_tag_class
object_tag = cast_object_tag(object_tag)

object_tag.resync()
updated_tags.append(object_tag)

# Save all updated tags at once to avoid partial updates
for object_tag in updated_tags:
object_tag.save()

# ...and delete any omitted existing tags
for old_tag in current_tags.values():
old_tag.delete()

return updated_tags
return taxonomy.cast().tag_object(tags, object_id)
79 changes: 79 additions & 0 deletions openedx_tagging/core/tagging/migrations/0002_auto_20230718_2026.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
# Generated by Django 3.2.19 on 2023-07-18 05:54

import django.db.models.deletion
from django.db import migrations, models

import openedx_learning.lib.fields


class Migration(migrations.Migration):
dependencies = [
("oel_tagging", "0001_initial"),
]

operations = [
migrations.AddField(
model_name="taxonomy",
name="system_defined",
field=models.BooleanField(
default=False,
editable=False,
help_text="Indicates that tags and metadata for this taxonomy are maintained by the system; taxonomy admins will not be permitted to modify them.",
),
),
migrations.AlterField(
model_name="tag",
name="parent",
field=models.ForeignKey(
default=None,
help_text="Tag that lives one level up from the current tag, forming a hierarchy.",
null=True,
on_delete=django.db.models.deletion.CASCADE,
related_name="children",
to="oel_tagging.tag",
),
),
migrations.AlterField(
model_name="tag",
name="taxonomy",
field=models.ForeignKey(
default=None,
help_text="Namespace and rules for using a given set of tags.",
null=True,
on_delete=django.db.models.deletion.CASCADE,
to="oel_tagging.taxonomy",
),
),
migrations.AddField(
model_name="taxonomy",
name="visible_to_authors",
field=models.BooleanField(
default=True,
editable=False,
help_text="Indicates whether this taxonomy should be visible to object authors.",
),
),
migrations.RemoveField(
model_name="objecttag",
name="object_type",
),
migrations.AddField(
model_name="taxonomy",
name="_taxonomy_class",
field=models.CharField(
help_text="Taxonomy subclass used to instantiate this instance.Must be a fully-qualified module and class name.",
max_length=255,
null=True,
),
),
migrations.AlterField(
model_name="objecttag",
name="object_id",
field=openedx_learning.lib.fields.MultiCollationCharField(
db_collations={"mysql": "utf8mb4_unicode_ci", "sqlite": "NOCASE"},
editable=False,
help_text="Identifier for the object being tagged",
max_length=255,
),
),
]

This file was deleted.

41 changes: 0 additions & 41 deletions openedx_tagging/core/tagging/migrations/0003_objecttag_proxies.py

This file was deleted.

36 changes: 0 additions & 36 deletions openedx_tagging/core/tagging/migrations/0004_tag_cascade_delete.py

This file was deleted.

Loading

0 comments on commit a7661bb

Please sign in to comment.