repo version repair command errors with traceback #4776

Closed
jlsherrill opened this issue Nov 27, 2023 · 10 comments · Fixed by #4815

@jlsherrill
Contributor

Version
"rpm": "3.24.0",
"core": "3.41.0",
"file": "3.41.0",

Describe the bug
When running a repo repair for an rpm repo, we got a traceback on a couple of the tasks. It sort of looks like a failed download, but it could be something else. If it is a failed download, does the repair stop immediately?

To Reproduce
have a repo version with missing units
Repair the repo version with verify_checksums set to true
simulate a download failure (maybe?)
Get the traceback below

Expected behavior
Repair will try to redownload as much as possible before failing

Additional context

"traceback: File "/usr/local/lib/python3.8/site-packages/pulpcore/tasking/tasks.py", line 60, in _execute_task\n result = func(*args, **kwargs)\n File "/usr/local/lib/python3.8/site-packages/pulpcore/app/tasks/repository.py", line 184, in repair_version\n loop.run_until_complete(\n File "/usr/lib64/python3.8/asyncio/base_events.py", line 616, in run_until_complete\n return future.result()\n File "/usr/local/lib/python3.8/site-packages/pulpcore/app/tasks/repository.py", line 146, in _repair_artifacts_for_content\n valid = await loop.run_in_executor(\n File "/usr/lib64/python3.8/concurrent/futures/thread.py", line 57, in run\n result = self.fn(*self.args, **self.kwargs)\n File "/usr/local/lib/python3.8/site-packages/pulpcore/app/tasks/repository.py", line 103, in _verify_artifact\n for chunk in fp.chunks(CHUNK_SIZE):\n File "/usr/local/lib/python3.8/site-packages/django/core/files/base.py", line 55, in ...

@mdellweg
Member

reformatted traceback:

File "/usr/local/lib/python3.8/site-packages/pulpcore/tasking/tasks.py", line 60, in _execute_task
result = func(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/pulpcore/app/tasks/repository.py", line 184, in repair_version
loop.run_until_complete(
File "/usr/lib64/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/usr/local/lib/python3.8/site-packages/pulpcore/app/tasks/repository.py", line 146, in _repair_artifacts_for_content
valid = await loop.run_in_executor(
File "/usr/lib64/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.8/site-packages/pulpcore/app/tasks/repository.py", line 103, in _verify_artifact
for chunk in fp.chunks(CHUNK_SIZE):
File "/usr/local/lib/python3.8/site-packages/django/core/files/base.py", line 55, in ...

@mdellweg
Member

Can you add the error text (I think it is above the traceback in the logs)? Also it looks like the traceback was truncated.
All I can derive from this so far is that _verify_artifact may have failed.

@dkliban
Member

dkliban commented Nov 27, 2023

pulp [4697efb989ad48c5942472475c18308b]: pulpcore.tasking.tasks:INFO: Task 018bf877-c044-761d-a688-1fd3811acef7 failed (Invalid endpoint: s3.us-east-1.amazonaws.com)
pulp [4697efb989ad48c5942472475c18308b]: pulpcore.tasking.tasks:INFO: File "/usr/local/lib/python3.8/site-packages/pulpcore/tasking/tasks.py", line 60, in _execute_task
result = func(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/pulpcore/app/tasks/repository.py", line 184, in repair_version
loop.run_until_complete(
File "/usr/lib64/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/usr/local/lib/python3.8/site-packages/pulpcore/app/tasks/repository.py", line 146, in _repair_artifacts_for_content
valid = await loop.run_in_executor(
File "/usr/lib64/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.8/site-packages/pulpcore/app/tasks/repository.py", line 103, in _verify_artifact
for chunk in fp.chunks(CHUNK_SIZE):
File "/usr/local/lib/python3.8/site-packages/django/core/files/base.py", line 55, in chunks
self.seek(0)
File "/usr/local/lib/python3.8/site-packages/django/core/files/utils.py", line 46, in <lambda>
seek = property(lambda self: self.file.seek)
File "/usr/local/lib/python3.8/site-packages/django/db/models/fields/files.py", line 48, in _get_file
self._file = self.storage.open(self.name, "rb")
File "/usr/local/lib/python3.8/site-packages/django/core/files/storage/base.py", line 22, in open
return self._open(name, mode)
File "/usr/local/lib/python3.8/site-packages/storages/backends/s3.py", line 461, in _open
f = S3File(name, mode, self)
File "/usr/local/lib/python3.8/site-packages/storages/backends/s3.py", line 126, in __init__
self.obj = storage.bucket.Object(name)
File "/usr/local/lib/python3.8/site-packages/storages/backends/s3.py", line 444, in bucket
self._bucket = self.connection.Bucket(self.bucket_name)
File "/usr/local/lib/python3.8/site-packages/storages/backends/s3.py", line 411, in connection
self._connections.connection = session.resource(
File "/usr/local/lib/python3.8/site-packages/boto3/session.py", line 446, in resource
client = self.client(
File "/usr/local/lib/python3.8/site-packages/boto3/session.py", line 299, in client
return self._session.create_client(
File "/usr/local/lib/python3.8/site-packages/botocore/session.py", line 997, in create_client
client = client_creator.create_client(
File "/usr/local/lib/python3.8/site-packages/botocore/client.py", line 159, in create_client
client_args = self._get_client_args(
File "/usr/local/lib/python3.8/site-packages/botocore/client.py", line 490, in _get_client_args
return args_creator.get_client_args(
File "/usr/local/lib/python3.8/site-packages/botocore/args.py", line 137, in get_client_args
endpoint = endpoint_creator.create_endpoint(
File "/usr/local/lib/python3.8/site-packages/botocore/endpoint.py", line 402, in create_endpoint
raise ValueError("Invalid endpoint: %s" % endpoint_url)

@dkliban
Member

dkliban commented Nov 27, 2023

So this might be related to boto/boto3#2131 (comment)

@dkliban
Member

dkliban commented Nov 29, 2023

I was able to reproduce this only by setting the endpoint_url in the domain storage_settings to s3.us-east-1.amazonaws.com without the https://. However, when I do this, all other operations, such as sync and publish, also stop working, whereas the instance where we are seeing this behavior continues to sync and publish correctly.
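
For anyone checking their own settings, here is a minimal sketch of how a scheme-less endpoint value differs from a full URL when parsed. The helper name `has_http_scheme` is hypothetical, and this is not botocore's own validation, only an illustration of why `s3.us-east-1.amazonaws.com` on its own is not treated as a URL.

```python
from urllib.parse import urlsplit


def has_http_scheme(endpoint_url: str) -> bool:
    """Return True if the value parses as an http(s) URL with a host part.

    Hypothetical helper for illustration; not part of pulpcore or botocore.
    """
    parts = urlsplit(endpoint_url)
    # Without "https://", the whole string is parsed as a path, so both
    # scheme and netloc come back empty.
    return parts.scheme in ("http", "https") and bool(parts.netloc)


with_scheme = has_http_scheme("https://s3.us-east-1.amazonaws.com")
without_scheme = has_http_scheme("s3.us-east-1.amazonaws.com")
```

A check like this could be run against AWS_S3_ENDPOINT_URL or a domain's endpoint_url before starting a repair, assuming those values are meant to be full URLs.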

@dkliban
Member

dkliban commented Nov 30, 2023

I was fully able to reproduce the original issue reported here. I set AWS_S3_ENDPOINT_URL to s3.us-east-1.amazonaws.com. I left the domain configured properly. When I try to repair a repository in a non-default domain, the repair task produces the traceback from the original issue description.

@mdellweg
Member

> I was fully able to reproduce the original issue reported here. I set AWS_S3_ENDPOINT_URL to s3.us-east-1.amazonaws.com. I left the domain configured properly. When I try to repair a repository in a non-default domain, the repair task produces the traceback from the original issue description.

Is repair repairing the artifacts in the proper domain at all? Or is it always using the default domain's storage?

@gerrod3
Contributor

gerrod3 commented Nov 30, 2023

What values are being set for the domain in question? Is it possible it doesn't have all the necessary settings being set?

@dkliban
Member

dkliban commented Nov 30, 2023

> > I was fully able to reproduce the original issue reported here. I set AWS_S3_ENDPOINT_URL to s3.us-east-1.amazonaws.com. I left the domain configured properly. When I try to repair a repository in a non-default domain, the repair task produces the traceback from the original issue description.
>
> Is repair repairing the artifacts in the proper domain at all? Or is it always using the default domain's storage?

Yes, the repair operates on the correct domain. There is just some code path that causes the default storage for Pulp to be evaluated.

@dkliban
Member

dkliban commented Nov 30, 2023

> What values are being set for the domain in question? Is it possible it doesn't have all the necessary settings being set?

The domain's storage settings are correct. The domain works properly. It's only the code that does the repair that performs a file field lookup on the artifact and causes this to occur.

dkliban added a commit to dkliban/pulpcore that referenced this issue Nov 30, 2023
dkliban added a commit to dkliban/pulpcore that referenced this issue Nov 30, 2023
This patch also adds repair API tests for S3 storage backend.

fixes: pulp#4776
fixes: pulp#4806
dkliban added a commit to dkliban/pulpcore that referenced this issue Dec 1, 2023
This patch also adds repair API tests for S3 and Azure storage backends.

fixes: pulp#4776
fixes: pulp#4806
dkliban added a commit to dkliban/pulpcore that referenced this issue Dec 4, 2023
This patch also adds repair API tests for S3 and Azure storage backends.

This patch also adds assertions that a repository version repair task can be run in a non-default
domain.

fixes: pulp#4776
fixes: pulp#4806
dkliban added a commit to dkliban/pulpcore that referenced this issue Dec 5, 2023
The 'current_domain' ContextVar was not getting copied into the thread that was running the artifact
verification. The context is now explicitly copied into that thread.

This patch also adds repair API tests for S3 and Azure storage backends.

This patch also adds test assertions that a repository version repair task can be run in a non-default
domain.

fixes: pulp#4776
fixes: pulp#4806
dkliban added a commit that referenced this issue Dec 5, 2023
The 'current_domain' ContextVar was not getting copied into the thread that was running the artifact
verification. The context is now explicitly copied into that thread.

This patch also adds repair API tests for S3 and Azure storage backends.

This patch also adds test assertions that a repository version repair task can be run in a non-default
domain.

fixes: #4776
fixes: #4806
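
The behavior described in the commit message can be sketched roughly as follows. This is an illustration only, not the actual pulpcore change: the `current_domain` name comes from the commit message, while the default value, the `"my-domain"` string, and the `read_domain` helper are assumptions made for the example. A callable passed to `loop.run_in_executor` runs in a worker thread that does not automatically see the caller's context, so the context has to be copied and used to run the callable explicitly.

```python
import asyncio
import contextvars

# Stand-in for the ContextVar named in the commit message; the default
# value here is an assumption for illustration.
current_domain = contextvars.ContextVar("current_domain", default="default")


def read_domain() -> str:
    # Hypothetical stand-in for code running in the executor thread that
    # looks up the active domain (for example, to pick a storage backend).
    return current_domain.get()


async def main():
    current_domain.set("my-domain")
    loop = asyncio.get_running_loop()

    # Plain run_in_executor: the worker thread does not receive the
    # caller's context, so on a standard CPython build the lookup is
    # expected to fall back to the ContextVar's default.
    without_copy = await loop.run_in_executor(None, read_domain)

    # Explicitly copy the current context and run the callable inside it.
    ctx = contextvars.copy_context()
    with_copy = await loop.run_in_executor(None, ctx.run, read_domain)
    return without_copy, with_copy


result = asyncio.run(main())
```

This would also fit the symptom discussed above: if the domain lookup falls back to a default inside the verification thread, the default storage settings (and a malformed AWS_S3_ENDPOINT_URL) end up being evaluated even though the repair targets a correctly configured domain.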
patchback bot pushed a commit that referenced this issue Dec 5, 2023
The 'current_domain' ContextVar was not getting copied into the thread that was running the artifact
verification. The context is now explicitly copied into that thread.

This patch also adds repair API tests for S3 and Azure storage backends.

This patch also adds test assertions that a repository version repair task can be run in a non-default
domain.

fixes: #4776
fixes: #4806
(cherry picked from commit 95a8f47)
mdellweg pushed a commit that referenced this issue Dec 6, 2023
The 'current_domain' ContextVar was not getting copied into the thread that was running the artifact
verification. The context is now explicitly copied into that thread.

This patch also adds repair API tests for S3 and Azure storage backends.

This patch also adds test assertions that a repository version repair task can be run in a non-default
domain.

fixes: #4776
fixes: #4806
(cherry picked from commit 95a8f47)