Skip to content
This repository has been archived by the owner on Dec 7, 2022. It is now read-only.

Content app tries all RemoteArtifacts #3826

Merged
merged 1 commit into from Jan 28, 2019
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
75 changes: 46 additions & 29 deletions pulpcore/content/handler.py
Expand Up @@ -8,6 +8,7 @@
import django # noqa otherwise E402: module level not at top of file
django.setup() # noqa otherwise E402: module level not at top of file

from aiohttp.client_exceptions import ClientResponseError
from aiohttp.web import FileResponse, StreamResponse
from aiohttp.web_exceptions import HTTPForbidden, HTTPNotFound
from django.core.exceptions import MultipleObjectsReturned, ObjectDoesNotExist
Expand Down Expand Up @@ -215,42 +216,58 @@ async def _stream_content_artifact(self, request, response, content_artifact):
"""
Stream and optionally save a ContentArtifact by requesting it using the associated remote.

If a fatal download failure occurs while downloading and there are additional
:class:`~pulpcore.plugin.models.RemoteArtifact` objects associated with the
:class:`~pulpcore.plugin.models.ContentArtifact` they will also be tried. If all
:class:`~pulpcore.plugin.models.RemoteArtifact` downloads raise exceptions, an HTTP 502
Copy link
Contributor

@jortel jortel Jan 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced that 502 is the appropriate response. The RFC[1] defines it as

"while acting as a gateway or proxy, received an invalid response from an
inbound server"
.

My understanding is that a 502 is typically raised for protocol mismatch or negotiation errors between the proxy and the upstream. In our case, we either don't have an upstream (matching RemoteArtifact) to try, or we tried and failed (likely a 404) to GET the file.

I think a more appropriate response would be a 404 but open to being convinced otherwise.

[1] https://tools.ietf.org/html/rfc7231#section-6.6.3

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you. I just now force-pushed a revision that switches it to HttpNotFound instead.

error is returned to the client.

Args:
request(:class:`~aiohttp.web.Request`): The request to prepare a response for.
response (:class:`~aiohttp.web.StreamResponse`): The response to stream data to.
content_artifact (:class:`~pulpcore.plugin.models.ContentArtifact`): The ContentArtifact
to fetch and then stream back to the client
"""
remote_artifact = content_artifact.remoteartifact_set.get()
remote = remote_artifact.remote.cast()

async def handle_headers(headers):
for name, value in headers.items():
if name.lower() in HOP_BY_HOP_HEADERS:
continue
response.headers[name] = value
await response.prepare(request)

async def handle_data(data):
await response.write(data)
if remote.policy != Remote.STREAMED:
await original_handle_data(data)

async def finalize():
Raises:
:class:`~aiohttp.web.HTTPNotFound` when no
:class:`~pulpcore.plugin.models.RemoteArtifact` objects associated with the
:class:`~pulpcore.plugin.models.ContentArtifact` returned the binary data needed for
the client.
"""
for remote_artifact in content_artifact.remoteartifact_set.all():
remote = remote_artifact.remote.cast()

async def handle_headers(headers):
for name, value in headers.items():
if name.lower() in HOP_BY_HOP_HEADERS:
continue
response.headers[name] = value
await response.prepare(request)

async def handle_data(data):
await response.write(data)
if remote.policy != Remote.STREAMED:
await original_handle_data(data)

async def finalize():
if remote.policy != Remote.STREAMED:
await original_finalize()

downloader = remote.get_downloader(remote_artifact=remote_artifact,
headers_ready_callback=handle_headers)
original_handle_data = downloader.handle_data
downloader.handle_data = handle_data
original_finalize = downloader.finalize
downloader.finalize = finalize
try:
download_result = await downloader.run()
except ClientResponseError:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bmbouter Sorry for the late (again) comment: Do you know what happens if the ClientResponseError occurs in an ongoing download (e.g. server closes connection unexpectedly)?

The code seems to look for the next remote artifact and put it into the same response (which may cause an exception for response.prepare or, even worse, just append the content to the broken content of the first artifact.

(I have not tested this, I may be wrong)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finally got around to filing this https://pulp.plan.io/issues/4434

continue
if remote.policy != Remote.STREAMED:
await original_finalize()

downloader = remote.get_downloader(remote_artifact=remote_artifact,
headers_ready_callback=handle_headers)
original_handle_data = downloader.handle_data
downloader.handle_data = handle_data
original_finalize = downloader.finalize
downloader.finalize = finalize
download_result = await downloader.run()
if remote.policy != Remote.STREAMED:
self._save_content_artifact(download_result, content_artifact)
await response.write_eof()
return response
self._save_content_artifact(download_result, content_artifact)
await response.write_eof()
return response
raise HTTPNotFound()

def _save_content_artifact(self, download_result, content_artifact):
"""
Expand Down