Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to save diagnostics data to artifacts #5583

Merged
merged 1 commit into from
Jul 24, 2024

Conversation

lubosmj
Copy link
Member

@lubosmj lubosmj commented Jul 16, 2024

closes #5422

@lubosmj
Copy link
Member Author

lubosmj commented Jul 16, 2024

Logs:

[pulp]  | ('pulp [85744c1da9104ab4a348cd9c8d1780ac]: ::ffff:127.0.0.1 - admin [16/Jul/2024:18:30:58 +0000] "POST /pulp/default/api/v3/repositories/file/file/0190bccf-de1b-7aac-a32b-baa7ff3a5153/sync/ HTTP/1.0" 202 75 "-" "Pulp-CLI/0.25.0"',)
[pulp]  | pulp [None]: pulpcore.tasking._util:INFO: Writing task memory data to /var/tmp/pulp/0190bccf-e208-7bc1-87d9-c5b3c0150167/memory.datum
[pulp]  | pulp [85744c1da9104ab4a348cd9c8d1780ac]: pulpcore.tasking.tasks:INFO: Starting task 0190bccf-e208-7bc1-87d9-c5b3c0150167
[pulp]  | ('pulp [85744c1da9104ab4a348cd9c8d1780ac]: ::ffff:127.0.0.1 - admin [16/Jul/2024:18:30:59 +0000] "GET /pulp/default/api/v3/tasks/0190bccf-e208-7bc1-87d9-c5b3c0150167/ HTTP/1.0" 200 1230 "-" "Pulp-CLI/0.25.0"',)
[pulp]  | ('pulp [85744c1da9104ab4a348cd9c8d1780ac]: ::ffff:127.0.0.1 - admin [16/Jul/2024:18:31:00 +0000] "GET /pulp/default/api/v3/tasks/0190bccf-e208-7bc1-87d9-c5b3c0150167/ HTTP/1.0" 200 1359 "-" "Pulp-CLI/0.25.0"',)
[pulp]  | pulp [85744c1da9104ab4a348cd9c8d1780ac]: pulpcore.tasking.tasks:INFO: Task completed 0190bccf-e208-7bc1-87d9-c5b3c0150167
[pulp]  | ('pulp [85744c1da9104ab4a348cd9c8d1780ac]: ::ffff:127.0.0.1 - admin [16/Jul/2024:18:31:01 +0000] "GET /pulp/default/api/v3/tasks/0190bccf-e208-7bc1-87d9-c5b3c0150167/ HTTP/1.0" 200 1480 "-" "Pulp-CLI/0.25.0"',)
[pulp]  | pulp [85744c1da9104ab4a348cd9c8d1780ac]: pulpcore.tasking._util:INFO: Profile data for memory diagnostics available at '/pulp/default/api/v3/artifacts/0190bccf-ede4-7740-888b-dad33e8757ff/'

Artifacts API:

http :5001/pulp/default/api/v3/artifacts/0190bccf-ede4-7740-888b-dad33e8757ff/
{
    "file": "artifact/54/52b2e058aabe113f7502adfbc6adc261bc8aa5840bf68ce4f10274bcba7f9a",
    "md5": null,
    "pulp_created": "2024-07-16T18:31:01.759519Z",
    "pulp_href": "/pulp/default/api/v3/artifacts/0190bccf-ede4-7740-888b-dad33e8757ff/",
    "pulp_last_updated": "2024-07-16T18:31:01.759541Z",
    "sha1": null,
    "sha224": "862292f6fa5e8ca364b2c114b5ab6283190493cd78c42381fb1085d3",
    "sha256": "5452b2e058aabe113f7502adfbc6adc261bc8aa5840bf68ce4f10274bcba7f9a",
    "sha384": "a4571c6ad55c03f1307d8d03f404ae4f7eb5fd209b5683797855e5324962fb3da8a88ddc6e7de67f932e5487324f6888",
    "sha512": "045d93dd116c61f9e3f3c2a33df51e72438bc6efed29040252ae4cee1bc4dd4a6b6c6aa8838f34f73439448a0d34ce07a7def056483785edd04eaec61cdddc71",
    "size": 31
}

Inside the container:

[root@396ee91e87d7 /]# cat /var/tmp/pulp/0190bccf-e208-7bc1-87d9-c5b3c0150167/memory.datum
# Seconds	Memory in MB
0	86.56
[root@396ee91e87d7 /]# cat /var/lib/pulp/media/artifact/54/52b2e058aabe113f7502adfbc6adc261bc8aa5840bf68ce4f10274bcba7f9a
# Seconds	Memory in MB
0	86.56

@lubosmj
Copy link
Member Author

lubosmj commented Jul 16, 2024

Open questions:

  1. Would it be better to give users an option to decide whether they want to save artifacts or not? This requires introducing a new Pulp setting.
  2. If not, does it still make sense to store profiling data under both the VAR_TMP_PULP and MEDIA_ROOT directories? We duplicate the data.
  3. Do we want to provide a way to download these artifacts from the storage? It is a bit cumbersome to shell into a container/storage and manually look for the stored artifacts.

@dkliban
Copy link
Member

dkliban commented Jul 16, 2024

  1. No, an artifact should be created every time.
  2. We should not store the data in temp storage.
  3. Yes, an admin should have a way to easily download the artifact.

@lubosmj lubosmj force-pushed the 5422-save-diagnostic-data branch 2 times, most recently from 35ed9e7 to 415b612 Compare July 17, 2024 13:01
@lubosmj
Copy link
Member Author

lubosmj commented Jul 17, 2024

Done.

Logs:

[pulp]  | pulp [34a692fc6e6b4a74b53902b8ffb4a437]: pulpcore.tasking.tasks:INFO: Starting task 0190c0c7-747c-7199-acbd-a8126d3816ff
[pulp]  | pulp [34a692fc6e6b4a74b53902b8ffb4a437]: pulp_file.app.tasks.publishing:INFO: Publishing: repository=test, version=1, manifest=PULP_MANIFEST
[pulp]  | ('pulp [34a692fc6e6b4a74b53902b8ffb4a437]: ::ffff:127.0.0.1 - admin [17/Jul/2024:13:00:15 +0000] "GET /pulp/default/api/v3/tasks/0190c0c7-747c-7199-acbd-a8126d3816ff/ HTTP/1.0" 200 877 "-" "Pulp-CLI/0.25.0"',)
[pulp]  | pulp [34a692fc6e6b4a74b53902b8ffb4a437]: pulp_file.app.tasks.publishing:INFO: Publication: 0190c0c7-7508-73da-9088-411837444853 created
[pulp]  | pulp [34a692fc6e6b4a74b53902b8ffb4a437]: pulpcore.tasking.tasks:INFO: Task completed 0190c0c7-747c-7199-acbd-a8126d3816ff
[pulp]  | pulp [34a692fc6e6b4a74b53902b8ffb4a437]: pulpcore.tasking._util:INFO: Profile data for pyinstrument available at '/pulp/default/api/v3/artifacts/0190c0c7-784c-7a63-a799-ffedfd2a5d73/'
[pulp]  | pulp [34a692fc6e6b4a74b53902b8ffb4a437]: pulpcore.tasking._util:INFO: Profile data for memory diagnostics available at '/pulp/default/api/v3/artifacts/0190c0c7-78db-7924-b3a3-f933eaa34a93/'
http :5001/pulp/default/api/v3/artifacts/0190c0c7-784c-7a63-a799-ffedfd2a5d73/download/
{
    "url": "http://localhost:5001/pulp/content/default/e3f31f43fb3a98e605fed31a3f144c0cff74112a249cfaf8d60aac3f7e6e3026/0190c0c7-784c-7a63-a799-ffedfd2a5d73?expires=1721224899&validate_token=c2c1a2a074285deb5e37976d0c34060553e0a1bdb784a3a2b1831e4323256e4d:b7b551e3543361afedd2b383e65db93ddb21b62c6ece02bbaf15ca8fec22e3a9"
}
wget -O output.html 'http://localhost:5001/pulp/content/default/e3f31f43fb3a98e605fed31a3f144c0cff74112a249cfaf8d60aac3f7e6e3026/0190c0c7-784c-7a63-a799-ffedfd2a5d73?expires=1721224899&validate_token=c2c1a2a074285deb5e37976d0c34060553e0a1bdb784a3a2b1831e4323256e4d:b7b551e3543361afedd2b383e65db93ddb21b62c6ece02bbaf15ca8fec22e3a9'
--2024-07-17 15:02:01--  http://localhost:5001/pulp/content/default/e3f31f43fb3a98e605fed31a3f144c0cff74112a249cfaf8d60aac3f7e6e3026/0190c0c7-784c-7a63-a799-ffedfd2a5d73?expires=1721224899&validate_token=c2c1a2a074285deb5e37976d0c34060553e0a1bdb784a3a2b1831e4323256e4d:b7b551e3543361afedd2b383e65db93ddb21b62c6ece02bbaf15ca8fec22e3a9
Resolving localhost (localhost)... ::1, ::1, 127.0.0.1, ...
Connecting to localhost (localhost)|::1|:5001... connected.
HTTP request sent, awaiting response... 200 OK
Length: 63220 (62K) [application/octet-stream]
Saving to: ‘output.html’

output.html                             100%[=============================================================================>]  61.74K  --.-KB/s    in 0s      

2024-07-17 15:02:01 (580 MB/s) - ‘output.html’ saved [63220/63220]

output.html:
image

@lubosmj lubosmj marked this pull request as ready for review July 17, 2024 13:15
@@ -86,6 +87,14 @@ def destroy(self, request, pk):
data = {"detail": msg}
return Response(data, status=status.HTTP_409_CONFLICT)

@action(detail=True)
def download(self, request, pk):
Copy link
Member

@ipanova ipanova Jul 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a new action for all artifacts, not just the ones produces by task diagnostics, it needs a separate changelog as .feature

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot just do that without talking about RBAC.

Copy link
Member Author

@lubosmj lubosmj Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What RBAC? There are some discrepancies that should be resolved in a separate PR first. We say that only admins can see the endpoint, yet, regular users can upload raw artifacts to Pulp: #5525.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading other peoples artifacts is a new level here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add that as an action to the tasks viewset instead?

Copy link
Member Author

@lubosmj lubosmj Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.

Will I need to remove permission_classes = (permissions.IsAuthenticated,) on the ViewSet in order to define an access policy statement for this specific action? If yes, this would mean that the endpoint will be locked for users, and as @gerrod3 pointed out in his comment in the attached issue, every uploading scenario in pulp-cli will start to fail.

Or, do you want me to create a workaround with a custom permission class?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add that as an action to the tasks viewset instead?

Can you elaborate, please? Would you like to return a task instead of an immediate response?

@ggainey ggainey closed this Jul 17, 2024
@ggainey ggainey reopened this Jul 17, 2024
@lubosmj lubosmj closed this Jul 18, 2024
@lubosmj lubosmj reopened this Jul 18, 2024
@lubosmj lubosmj closed this Jul 18, 2024
@lubosmj lubosmj reopened this Jul 18, 2024
@lubosmj
Copy link
Member Author

lubosmj commented Jul 18, 2024

After discussing this with @mdellweg, we reached a consensus on updating this PR. The current implementation is not the best one. Right now, it adds a new download action on the artifact's detail endpoint that returns a pre-signed URL for downloading the artifact itself. This has bad implications for RBAC.

Key points we talked about:

  1. The artifacts endpoint should not be used in any workflow. Artifacts are just an implementation detail that should not be visible to users.
  2. If a user has permissions to view a task, she can also have permissions to preview profiler data.
  3. It would be better to reference the profiler data from a task itself. Looking for HREFs in logs is a bit cumbersome. The profiler data should be still stored within artifacts with on_delete=SET_NULL. Orphan cleanup will handle the removal.

Open questions:

  1. Should we create a separate action on the task's detail endpoint returning URLs for downloading profiler data? Or, is it better to preview the URLs when listing all tasks? The benefit of having a separate action is that it allows us to provide extra RBAC granularity for previewing profiler data. +1 to the first one -- better RBAC handling
  2. Should we create a new DB table that will hold a reference to a task, artifact, and profiler data name? Or, is it better to directly reference the profiler data from a task itself via foreign keys? +1 to the first one -- more extensible, no sha256 conflicts

@lubosmj lubosmj marked this pull request as draft July 18, 2024 14:05
@bmbouter
Copy link
Member

Being able to download the Artifact was what I initially wanted so this is looking good. Is the idea to have the new action API be something like {task_href}/profile or maybe {task_href}/debug_profile?

@lubosmj
Copy link
Member Author

lubosmj commented Jul 19, 2024

Yes. And, this should not have any side effect since we can protect it with the RBAC.

@lubosmj lubosmj force-pushed the 5422-save-diagnostic-data branch 2 times, most recently from 719e1e0 to 0a2ce88 Compare July 22, 2024 12:36
@lubosmj
Copy link
Member Author

lubosmj commented Jul 22, 2024

The artifacts can be now downloaded from the task's detail endpoint:

http :5001/pulp/default/api/v3/tasks/0190da64-dfb4-70b1-b61c-095a670890d3/profile_data/
{
    "memory_profile": "http://localhost:5001/pulp/content/default/c2ecd4c6dae70a22e5684ff123210180e9f1a3436ec516a0040c843ba84dafcf/0190da64-e1dd-741d-a15a-221cf4bd03b6?expires=1721654577&validate_token=b151f73c5aec8353fc7c143c24284b1653accb408cac287df6e2dcc3dbca5a80:72d7e7110328b7383afdb64859577ca7465e8b90c356a22fd79ca30863361a40",
    "pyinstrument_data": "http://localhost:5001/pulp/content/default/c2ecd4c6dae70a22e5684ff123210180e9f1a3436ec516a0040c843ba84dafcf/0190da64-e1c6-7df6-912b-6b3dd558810b?expires=1721654577&validate_token=7292e01559e3b4bcfa93a85e3442973587ca28306f83ea79de19589b3f68f4fb:38c63ab1066dfa6ef90e1defe1f867d6d4fb58f1d4cb70fcf298c2de2916b7b9"
}

@lubosmj lubosmj marked this pull request as ready for review July 22, 2024 12:36
@@ -242,6 +242,19 @@ def purge(self, request):
)
return OperationPostponedResponse(task, request)

@action(detail=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be an attached serializer.
Also i think we need to mark the action as "GET".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"The action decorator will route GET requests by default". I am going only to attach the serializer there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the best I could do:

            "ProfileDataResponse": {
                "type": "object",
                "properties": {
                    "urls": {
                        "type": "object",
                        "additionalProperties": {
                            "type": "string",
                            "format": "uri"
                        }
                    }
                },
                "required": [
                    "urls"
                ]
            },

http :5001/pulp/default/api/v3/tasks/0190db0f-e1d9-71fb-a514-fb8d8fbee514/profile_data/
{
    "urls": {
        "memory_profile": "http://localhost:5001/pulp/content/default/3ae491a65a6819a786e90583e44bba65cf2ebba89e03b3db9913e19cd4e848e5/0190db0f-e306-7909-91b4-50e839307282?expires=1721729565&validate_token=8f4bdddf4c68e08473881524710f6ccf6ae3579bb14a93882d527ce70257d15b:7425724ea8fee084509927e0a0cd8233cd4338db9213d63ba963a0b1e4b6ec80"
    }
}

I could not make this working without having the "urls" key. I am not sure it is right to define a raw generic schema for the response that returns only "object" without "properties".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more specific we are, the more levels will the bindings create objects instead of dicts.
This is a playground for subtly breaking backwards compatibility of the generated bindings when in fact both specs describe the very same api in correct ways.

@@ -100,6 +113,8 @@ class Task(BaseModel, AutoAddObjPermsMixin):
pulp_domain = models.ForeignKey("Domain", default=get_domain_pk, on_delete=models.CASCADE)
versions = HStoreField(default=dict)

profile_data = models.ManyToManyField("Artifact", through=ProfileData)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
profile_data = models.ManyToManyField("Artifact", through=ProfileData)
profile_artifacts = models.ManyToManyField("Artifact", through=ProfileData)

@@ -87,7 +88,7 @@ class TaskViewSet(
"statements": [
{"action": ["list"], "principal": "authenticated", "effect": "allow"},
{
"action": ["retrieve", "my_permissions"],
"action": ["retrieve", "profile_data", "my_permissions"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be sure this will not be the granular RBAC you promoted, but will enable an admin with sufficient Ssechel make it granular. Don't we need to provide a permission for it too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can introduce a new permission for this. Yet, I would say that people who can preview a task, should also see its profile data.

responses=inline_serializer(
"ProfileDataResponse",
fields={"urls": DictField(child=URLField())},
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good.

@lubosmj lubosmj requested a review from mdellweg July 23, 2024 10:31
@lubosmj lubosmj force-pushed the 5422-save-diagnostic-data branch 2 times, most recently from 14ff3aa to 304214b Compare July 23, 2024 10:48
name = models.TextField()

class Meta:
unique_together = ("artifact", "name")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
unique_together = ("artifact", "name")
unique_together = ("task", "name")

@@ -286,6 +301,7 @@ class Meta:
]
permissions = [
("manage_roles_task", "Can manage role assignments on task"),
("view_task_profile_artifacts", "Can preview profile data for task"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
("view_task_profile_artifacts", "Can preview profile data for task"),
("view_task_profile_artifacts", "Can view profile data for task"),

Comment on lines +392 to +393
The default setting is `False`. When set to `True`, each task records various diagnostics (listed below)
and stores them as separate artifacts. To download the data, issue GET requests to `${TASK_HREF}profile_artifacts/`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably mention that they will be cleaned up automatically.

@lubosmj lubosmj marked this pull request as draft July 23, 2024 11:27
@lubosmj lubosmj marked this pull request as ready for review July 23, 2024 13:04
@mdellweg mdellweg enabled auto-merge (rebase) July 24, 2024 12:24
@mdellweg mdellweg merged commit 0b94beb into pulp:main Jul 24, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add ability to save Task Diagnostic data
6 participants