Add ability to save Task Diagnostic data #5422

Open
bmbouter opened this issue May 24, 2024 · 13 comments · May be fixed by #5583

@bmbouter
Member

When using a k8s hosted version of Pulp, we can enable TASK_DIAGNOSTICS, but getting the saved /var/tmp/pulp/<task_UUID>/ is really difficult. We greatly need this data.

What would be easy in this type of deployment would be for the data to be saved to an S3 bucket somehow.

@mdellweg
Member

Time for dogfooding: Let's put any type of report in artifacts.

@decko
Member

decko commented May 28, 2024

Maybe use pulp_file to save this data.
It's gonna use the Storage that Pulp is using.

@decko
Member

decko commented May 28, 2024

What are we getting from the diagnostics? Can we get the same information in other ways, such as container metrics or OTEL?

@bmbouter
Member Author

bmbouter commented Jun 3, 2024

Saving the data as a pulp_file artifact I think is a great idea. That would be even easier for us to pull the data back out then. So would that be:

  1. pulp admin creates a pulp-file repo
  2. pulp admin configures the TASK_DIAGNOSTIC_FILE_REPO_URL equal to its URL? or maybe TASK_DIAGNOSTIC_FILE_REPO_ID equal to its UUID?

What do you think?

@mdellweg
Member

mdellweg commented Jun 4, 2024

> Saving the data as a pulp_file artifact I think is a great idea. That would be even easier for us to pull the data back out then. So would that be:
>
> 1. pulp admin creates a pulp-file repo
> 2. pulp admin configures the `TASK_DIAGNOSTIC_FILE_REPO_URL` equal to its URL? or maybe `TASK_DIAGNOSTIC_FILE_REPO_ID` equal to its UUID?
>
> What do you think?

Adding a file to that repository requires a task. That may create an infinite loop.
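To make the loop concern concrete, here is a toy simulation. It is not pulpcore code: `run_tasks`, the task names, and the `max_tasks` cap are hypothetical and exist only for this sketch. It assumes diagnostics are collected for every task, including any task used to store a previous report.

```python
from collections import deque


def run_tasks(initial, save_via_task, max_tasks=10):
    """Run a toy task queue and return the names of executed tasks, in order."""
    queue = deque([initial])
    executed = []
    # max_tasks is only a safety cap so the sketch terminates.
    while queue and len(executed) < max_tasks:
        name = queue.popleft()
        executed.append(name)
        if save_via_task:
            # Storing the report in a repository needs its own task, which is
            # profiled too and therefore produces yet another report to store.
            queue.append(f"save-diagnostics({name})")
        # Otherwise the report is written directly and nothing is enqueued.
    return executed


looping = run_tasks("sync", save_via_task=True, max_tasks=5)
direct = run_tasks("sync", save_via_task=False, max_tasks=5)
print(len(looping), len(direct))
```

With task-based saving the queue never drains and only the cap stops it. Writing the report directly leaves a single task.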

@ipanova
Member

ipanova commented Jun 4, 2024

I see the value of storing the output files solely in artifacts because those could be saved without depending on the tasking system. Also they will automatically be picked up by the orphan clean up.
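As a rough illustration of why a bare artifact needs no task, the sketch below writes a report into a toy content-addressed store using only the standard library. It is a stand-in, not the pulpcore Artifact model: `store_report`, the report filename, and the directory layout are assumptions made for this example.

```python
import hashlib
import tempfile
from pathlib import Path


def store_report(report: Path, storage_root: Path) -> Path:
    """Copy a report into a toy content-addressed store keyed by SHA-256."""
    data = report.read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    # Layout is an assumption for this sketch: first two hex chars as a folder.
    target = storage_root / digest[:2] / digest[2:]
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        # Identical content is stored once; a second save is a no-op.
        target.write_bytes(data)
    return target


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    report = root / "memory_profile.txt"  # hypothetical diagnostics file
    report.write_text("rss=123\n")
    first = store_report(report, root / "artifact")
    second = store_report(report, root / "artifact")
    same_path = first == second
    stored_ok = first.read_text() == "rss=123\n"
    print(same_path, stored_ok)
```

Nothing here goes through a queue or a worker, which is the property that sidesteps the loop described above.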

@bmbouter
Member Author

bmbouter commented Jun 4, 2024

Saving them as an artifact sounds even better for all the reasons mentioned. Also it would avoid the admin even having to create a dedicated pulp_file repo.

When a task runs, how can I know how to fetch that saved artifact? If that's easy then I'm +1 on this.

@bmbouter bmbouter changed the title from "Add ability to save Task Diagnostic data to an S3 bucket" to "Add ability to save Task Diagnostic data" Jun 4, 2024
@bmbouter
Member Author

bmbouter commented Jun 5, 2024

I thought about this some more, and what would be nice is if the Task got a created_resources entry pointing to the Artifact URL. Is that an option?

@ipanova
Member

ipanova commented Jun 6, 2024

We should at least be able to log the href of the created artifact whenever task_diagnostics is on.

@bmbouter
Member Author

bmbouter commented Jun 6, 2024

Logging would work. Having it on the task is more beneficial for our use for a few reasons:

  • Sometimes the logs can be hard to get a hold of on the hosted installations because it's not a system you just log into
  • We'll leave this enabled on staging, and we want our hosted users to be able to get their cProfile data in a self-service way (without having to contact us for each cProfile they want).
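For context on the self-service case, this is roughly what a user could do with a downloaded cProfile file, using only the standard library `cProfile` and `pstats` modules. The workload `busy_work` and the file name `task.prof` are hypothetical. The sketch produces the file locally only so that it runs on its own.

```python
import cProfile
import io
import pstats
import tempfile
from pathlib import Path


def busy_work(n: int) -> int:
    """Stand-in for the code a task would run."""
    return sum(i * i for i in range(n))


with tempfile.TemporaryDirectory() as tmp:
    profile_path = Path(tmp) / "task.prof"  # hypothetical file name

    # Producer side: profile the work and dump the raw stats to a file.
    profiler = cProfile.Profile()
    profiler.enable()
    result = busy_work(10_000)
    profiler.disable()
    profiler.dump_stats(str(profile_path))

    # Consumer side: load the file and print the top entries.
    buffer = io.StringIO()
    stats = pstats.Stats(str(profile_path), stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    report_text = buffer.getvalue()

print(report_text)
```

The consumer half is the part that matters for hosted users: once they can fetch the file, no access to the worker is needed to read it.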

@ipanova
Member

ipanova commented Jun 6, 2024

@bmbouter I agree that having it directly on the task is more transparent. It's just that we have never placed artifacts or content in created_resources, only resources that require locking, like repos, remotes, distributions, etc.

@bmbouter
Member Author

bmbouter commented Jun 6, 2024

@ipanova I agree, and I'm also a little uneasy about setting that precedent here. Is there another option to have that info added to the task somehow?

@bmbouter
Member Author

We discussed this at the pulpcore meeting today. Here's my summary of what I think the plan is.

Let's implement this setting to save the bare Artifact (not a file content unit, not in a file repo) and just have it log the href for now. We can look at improving the usability later as a separate issue. This avoids us having to set the precedent of adding an Artifact to the created_resources data of a task.

@pulp/core is this right?
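A minimal sketch of the "log the href" half of that plan, under stated assumptions: the logger name, the helper `log_diagnostics_artifact`, the `enabled` flag, and the href format are all illustrative and not taken from pulpcore.

```python
import io
import logging
import uuid

# Capture log output in memory so the sketch is self-contained.
log_stream = io.StringIO()
logger = logging.getLogger("task_diagnostics_sketch")  # hypothetical name
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(log_stream))
logger.propagate = False


def log_diagnostics_artifact(task_id, artifact_id, enabled):
    """Log where a task's diagnostics artifact lives; do nothing if disabled."""
    if not enabled:
        return None
    # Assumed href shape, for illustration only.
    href = f"/pulp/api/v3/artifacts/{artifact_id}/"
    logger.info("Task %s: diagnostics saved as artifact %s", task_id, href)
    return href


task_id = uuid.uuid4()
artifact_id = uuid.uuid4()
href = log_diagnostics_artifact(task_id, artifact_id, enabled=True)
skipped = log_diagnostics_artifact(task_id, artifact_id, enabled=False)
logged = log_stream.getvalue()
print(logged)
```

Because the task ID is part of the message, someone with log access can search for a task and find its artifact, which is the usability gap left for the follow-up issue.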

@lubosmj lubosmj self-assigned this Jul 11, 2024
lubosmj added a commit to lubosmj/pulpcore that referenced this issue Jul 16, 2024
@lubosmj lubosmj linked a pull request Jul 16, 2024 that will close this issue
lubosmj added a commit to lubosmj/pulpcore that referenced this issue Jul 16, 2024
lubosmj added a commit to lubosmj/pulpcore that referenced this issue Jul 17, 2024
lubosmj added a commit to lubosmj/pulpcore that referenced this issue Jul 17, 2024
lubosmj added a commit to lubosmj/pulpcore that referenced this issue Jul 17, 2024
lubosmj added a commit to lubosmj/pulpcore that referenced this issue Jul 17, 2024
Status: Needs review

6 participants