Skip to content
This repository has been archived by the owner on Dec 7, 2022. It is now read-only.

Master/Detail tasks #3394

Closed
wants to merge 9 commits into from
Closed

Master/Detail tasks #3394

wants to merge 9 commits into from

Conversation

asmacdo
Copy link
Contributor

@asmacdo asmacdo commented Mar 23, 2018

No description provided.

@pep8speaks
Copy link

pep8speaks commented Mar 23, 2018

Hello @asmacdo! Thanks for updating the PR.

Line 180:1: E303 too many blank lines (3)

Line 245:101: E501 line too long (101 > 100 characters)

Comment last updated on May 31, 2018 at 22:34 Hours UTC

@@ -110,8 +110,8 @@ def endpoint_pieces(cls):
except AttributeError:
# no endpoint_name defined, need to get more specific in the MRO
continue

pieces = (master_endpoint_name, cls.endpoint_name)
pieces = tuple(cls.endpoint_name.split('/'))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

allows plugins to namespace tasks.

v3/tasks/core-update/ becomes tasks-core-update-list and tasks-core-update-detail

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably better examples might read:

  • v3/tasks/core-update/ becomes tasks-core-update-list (or -detail)
  • v3/tasks/file/syncs/ becomes tasks-file-syncs-list
  • v3/tasks/docker/flashlight/strobes/123/ becomes tasks-docker-flashlight-strobes-detail

To emphasize the the plug-in namespacing options

@@ -43,6 +45,14 @@ def cancel(self, request, pk=None):
return Response(status=status.HTTP_204_NO_CONTENT)


class CoreUpdateTaskViewSet(TaskViewSet):

endpoint_name = 'core/updates'
Copy link
Contributor Author

@asmacdo asmacdo Mar 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

includes namespace which allows plugins to use the same name. like tasks/file/sync/ and tasks/python/sync/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering what the actual core objects are; reading thru the app/viewsets/base mixin docstrings, isn't it just a repo all the time? Or is this more about creating a generic behavior rather than providing a core/updates endpoint?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that all tasks that are deployed by pulpcore will be listed at v3/tasks/core/, but CoreTask is just abstract.
Updates to Repositories, Importers, Publishers will deploy CoreUpdateTask, listed at v3/tasks/core/updates/
Deletes of the same will deploy CoreDeleteTask, listed at v3/tasks/core/deletes/.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, cool

@@ -176,7 +177,8 @@ class UserFacingTask(PulpTask):
# this tells celery to not automatically log tracebacks for these exceptions
throws = (PulpException,)

def apply_async_with_reservation(self, resources, args=None, kwargs=None, **options):
def apply_async_with_reservation(self, reservations, task_status=None, args=None, kwargs=None,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bmbouter, didn't want to slip this in without you noticing

task_name = self.name

# Set the parent attribute if being dispatched inside of a Task
parent_arg = self._get_parent_arg()

TaskStatus.objects.create(pk=inner_task_id, state=TaskStatus.WAITING, **parent_arg)
if task_status is None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO
Task (task_status) should be created and passed in here everytime. Creation of CoreUpdateTask is temporary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this deserves a doc block update, the new task_status parameter description as well as the logic description update

@@ -237,7 +246,7 @@ def apply_async(self, args=None, kwargs=None, **options):
parent_arg = self._get_parent_arg()

try:
TaskStatus.objects.create(pk=async_result.id, state=TaskStatus.WAITING, **parent_arg)
CoreUpdateTask.objects.create(pk=async_result.id, state=TaskStatus.WAITING, **parent_arg)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO Should this be a new Task type?

@@ -31,7 +31,8 @@ def __init__(self, task_results, request):
"""
tasks = []
for result in task_results:
task = {"_href": reverse('tasks-detail', args=[result.task_id], request=request),
task = {"_href": reverse('tasks-core-updates-detail', args=[result.task_id],
request=request),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ODO: Can this be generalized?

something like:

task = CoreTask.objects.get(id=result.task_id).cast()

Except I don't think the model object knows about its serializer. Could we find some way to use a serializer to have DetailRelatedField(many=True) that would cast() to render with the detail serializer? Would replace "OperationPostponedResponse"

Maybe here? a new serializer that is used here, and in the AsyncRemoveMixin https://github.com/pulp/pulp/pull/3394/files#diff-ae7f375b1b9eaee42aecdf250679cd6eR232

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, esp. in case this was an async result of some plug-in specific thing, returning this hard-coded tasks-core-updates-detail reverse might not be cool.

I think part of the issue is that we change this logic to be an object-like endpoint instead a verb; posting to a serializer to create an object, we might want to avoid returning a list of them, it feels odd.

If instead there was a WaitingTask that had a custom, read-only serializer, a read only parent field and a RunningTask had a read-only children field in addition, we might avoid having to have a custom response handling logic (here) and make the tasks appear what they actually are, a lazily-unspooling, directed acyclic graph.

See also my off-topic comment below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OperationPostponed will only ever be used for asynchronous updates/deletes. The reason for this is that these tasks are created by the ViewSets for the objects that are being updated/deleted (repos, importers, publishers).

Other tasks are created/read/listed at their own endpoints and handled by their own viewsets.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now the hard-coded values might not be an issue; btw the tasks.base.general_update.apply_async_with_reservation function doesn't return a Task instance but a Celery handle so I guess that's where this magic comes from.

Anyways, I believe this should be better done in a Serializer thru some sort of a HyperlinkedRelatedField or such..

Copy link
Contributor

@dparalen dparalen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @asmacdo!

Thanks for the patch!
So far I've been just able to read thru so please don't take me too seriously here ;) See inline...

Cheers,
milan

@@ -177,7 +177,7 @@ class TaskLock(Model):
lock = models.TextField(unique=True, null=False, choices=LOCK_STRINGS)


class Task(Model):
class Task(MasterModel):
"""
Represents a task

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about: Represents a generic task since this is now a MasterModel subclass i.e the most generic non-abstract thing


pieces = (master_endpoint_name, cls.endpoint_name)
pieces = tuple(cls.endpoint_name.split('/'))
pieces = (master_endpoint_name,) + pieces

# ensure that neither piece is None/empty and that they are not equal.
if not all(pieces) or pieces[0] == pieces[1]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsure about the check of equality here; shouldn't this be an 'for-all-style' check now that there can be multiple names? tuple(sorted(set(pieces))) != tuple(sorted(pieces))

@@ -110,8 +110,8 @@ def endpoint_pieces(cls):
except AttributeError:
# no endpoint_name defined, need to get more specific in the MRO
continue

pieces = (master_endpoint_name, cls.endpoint_name)
pieces = tuple(cls.endpoint_name.split('/'))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably better examples might read:

  • v3/tasks/core-update/ becomes tasks-core-update-list (or -detail)
  • v3/tasks/file/syncs/ becomes tasks-file-syncs-list
  • v3/tasks/docker/flashlight/strobes/123/ becomes tasks-docker-flashlight-strobes-detail

To emphasize the the plug-in namespacing options

@@ -43,6 +45,14 @@ def cancel(self, request, pk=None):
return Response(status=status.HTTP_204_NO_CONTENT)


class CoreUpdateTaskViewSet(TaskViewSet):

endpoint_name = 'core/updates'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering what the actual core objects are; reading thru the app/viewsets/base mixin docstrings, isn't it just a repo all the time? Or is this more about creating a generic behavior rather than providing a core/updates endpoint?

task_name = self.name

# Set the parent attribute if being dispatched inside of a Task
parent_arg = self._get_parent_arg()

TaskStatus.objects.create(pk=inner_task_id, state=TaskStatus.WAITING, **parent_arg)
if task_status is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this deserves a doc block update, the new task_status parameter description as well as the logic description update

kwargs, options),
queue=RESOURCE_MANAGER_QUEUE)
return AsyncResult(inner_task_id)
return AsyncResult(task_status.pk)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just an off-topic brain dump here; we use the Task (model) as a state machine so why doesn't this return an updated Task instance with an async_result attribute attached to it? Ideally, we'd have a a task subtype for a state e.gWaitingTask; with an e.g BaseTask being abstract with e.g apply_async raising a 400 error; Task.apply_async returning a WaitingTask instance that again would 400 on WaitingTask.apply_async. All these states of a task could have different behaviors of DRF serialization too. Plug-ins might not need to override too much of such code either (probably just the Task.apply_async i.e the serializer create method).

nested_layers = nested_string.split('.')
value = self.task
for layer in nested_layers:
value = getattr(value, layer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is odd; why loop here if you just return the last item eventually

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this allows us to create a string structure like "importer.pk"
first loop does task.importer, second loop does task.importer.pk

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aha, it's getattr thru each nested layer; probably a sample would help

@@ -298,6 +299,14 @@ def release_resources(self):
if not reservation.tasks.exists():
reservation.delete()

class CoreTask(Task):

TYPE = "core"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just wondering, isn't this always a repo task that is hiding behind the core here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either Repository, Importer, or Publisher

@@ -31,7 +31,8 @@ def __init__(self, task_results, request):
"""
tasks = []
for result in task_results:
task = {"_href": reverse('tasks-detail', args=[result.task_id], request=request),
task = {"_href": reverse('tasks-core-updates-detail', args=[result.task_id],
request=request),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, esp. in case this was an async result of some plug-in specific thing, returning this hard-coded tasks-core-updates-detail reverse might not be cool.

I think part of the issue is that we change this logic to be an object-like endpoint instead a verb; posting to a serializer to create an object, we might want to avoid returning a list of them, it feels odd.

If instead there was a WaitingTask that had a custom, read-only serializer, a read only parent field and a RunningTask had a read-only children field in addition, we might avoid having to have a custom response handling logic (here) and make the tasks appear what they actually are, a lazily-unspooling, directed acyclic graph.

See also my off-topic comment below

bmbouter pushed a commit to bmbouter/pulp that referenced this pull request Mar 28, 2018
@asmacdo asmacdo changed the title Master task hack Master/Detail tasks Apr 5, 2018
@bmbouter bmbouter closed this May 31, 2018
@dralley dralley changed the base branch from 3.0-dev to master May 31, 2018 22:34
@dralley dralley reopened this May 31, 2018
@asmacdo
Copy link
Contributor Author

asmacdo commented Jun 6, 2018

Closing for now, I'll rebase and reopen if it is helpful for discussion.

@asmacdo asmacdo closed this Jun 6, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
5 participants