This repository has been archived by the owner on Jan 16, 2023. It is now read-only.

Properly handle non-S3 storage backends #124

Closed


johananl

This PR addresses #120. This bug causes Collectfast to crash when a non-S3 storage backend is configured.

I am not completely sure about the way I've implemented the storage type check, so please feel free to suggest alternative approaches. Any other comments and suggestions are very welcome.

Checking for self.storage.preload_metadata when collectfast is
disabled isn't necessary and also crashes the command when the
static files storage type isn't S3.
"setting `AWS_PRELOAD_METADATA` to `True`. Overriding "
"`storage.preload_metadata` and continuing.")
if settings.enabled:
if django_settings.STATICFILES_STORAGE not in VALID_BACKENDS:

one problem with this is that it would cause a RuntimeError if you're using a subclass of S3BotoStorage/S3Boto3Storage. I see two other options:

(1) import the classes and use if isinstance(self.storage, S3BotoStorage) or isinstance(self.storage, S3Boto3Storage) (which could be refactored into a helper: if is_s3_backend(self.storage))
(2) duck typing: if not hasattr(self.storage, 'preload_metadata'): raise --> but I'm wondering if we'd just want to set preload_metadata defensively in that case:

if not getattr(self.storage, 'preload_metadata', False):
    self.storage.preload_metadata = True
    warnings.warn(...)
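
A minimal sketch of why the two checks behave differently for subclasses. The classes below are hypothetical stand-ins for the django-storages classes (so the snippet runs without Django or boto), the contents of `VALID_BACKENDS` are assumed, and `is_s3_backend` is only the helper name floated above, not an existing Collectfast function.

```python
# Hypothetical stand-ins for the django-storages classes.
class S3BotoStorage:
    preload_metadata = False


class S3Boto3Storage:
    preload_metadata = False


class CustomS3Storage(S3Boto3Storage):
    """A project-specific subclass of an S3 backend."""


class FileSystemStorage:
    """Stand-in for a non-S3 backend: it has no preload_metadata attribute."""


# Assumed contents of VALID_BACKENDS: dotted paths of the supported classes.
VALID_BACKENDS = (
    "storages.backends.s3boto.S3BotoStorage",
    "storages.backends.s3boto3.S3Boto3Storage",
)


def is_valid_by_path(dotted_path):
    """String membership check, as in the diff under review."""
    return dotted_path in VALID_BACKENDS


def is_s3_backend(storage):
    """Instance check: also accepts subclasses of the S3 backends."""
    return isinstance(storage, (S3BotoStorage, S3Boto3Storage))
```

With these definitions, a project that configures `myproject.storage.CustomS3Storage` would be rejected by the path check but accepted by `is_s3_backend`.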

Author

Yes, that's a good point @pachewise.

Actually I wanted to implement it as in your first suggestion but encountered problems with mocked objects in some existing tests.

In any case, I think both of your suggestions could work. However, it looks like we have bigger questions now, following @antonagestam's comment below. Let's clarify the desired behavior first, and then I'll solve the problems in the implementation.

raise RuntimeError(
"Collectfast is intended to work with an S3 storage "
"backend only."
)
Owner

I don't think this is the desired behaviour. In case a storage backend is configured that isn't S3 we want to handle it gracefully. I've had a plan (in my head) for some time to introduce an abstraction layer for what strategy to use for different backends, but I think that's a much larger scope than what's needed here.

I believe all we want to do is skip the md5/etag comparison if the backend isn't S3. If it's feasible, I think it'd be cool if we could still have threading enabled.
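
A rough sketch of what "skip the comparison if the backend isn't S3" could look like. Everything here is hypothetical: `FakeS3Storage`, its `get_etag` method, and `hashes_match` are illustration-only names, and the fake stores plain MD5 hex digests rather than real S3 ETags.

```python
import hashlib


class FakeS3Storage:
    """Hypothetical S3-like backend that can report a remote hash per path."""

    def __init__(self, remote_hashes):
        self._remote_hashes = remote_hashes

    def get_etag(self, path):
        return self._remote_hashes.get(path)


class FakeLocalStorage:
    """Hypothetical non-S3 backend: offers no remote hash at all."""


def hashes_match(storage, path, local_content):
    """Return True only when the backend reports a hash equal to the local one.

    For backends without a hash lookup the comparison is skipped and False is
    returned, so the caller falls through to its normal copy logic.
    """
    get_etag = getattr(storage, "get_etag", None)
    if get_etag is None:
        return False
    local_hash = hashlib.md5(local_content).hexdigest()
    return get_etag(path) == local_hash
```

A caller would treat a `True` result as "file unchanged, skip the upload" and anything else as "proceed as usual", which keeps non-S3 backends working without raising.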

Author

johananl commented Feb 5, 2018

Thanks for your comment @antonagestam.

Yes, I understand the desire to handle non-S3 backends gracefully. Just to verify I understood you correctly, you're saying that when settings.enabled == True we should handle things gracefully even if the storage backend isn't S3, right?
If that's the case, we must check the backend type before referencing self.storage.preload_metadata, otherwise we get an AttributeError with non-S3 storages. As a reminder, in #120 you suggested doing:

if settings.enabled and self.storage.preload_metadata is not True:
    ...

instead of checking the storage type.

So, now I'm confused :-) If we want to allow settings.enabled == True + non-S3 backend, we must check the storage type, unless I'm missing something.

Could you please clarify the right approach, the way you see it? I'd love to solve this one way or the other, without increasing the scope of this bugfix too much.

Please let me know your thoughts.

Thanks!

@antonagestam - are you suggesting something like:

self.collectfast_enabled = settings.enabled
if not hasattr(self.storage, 'preload_metadata'):  # or S3 checks
    # Non-S3 storage: disable collectfast instead of crashing.
    self.collectfast_enabled = False
    warnings.warn(
        'Your current storage does not have preload_metadata! '
        'Disabling collectfast and continuing...'
    )
if self.collectfast_enabled and self.storage.preload_metadata is not True:
    self.storage.preload_metadata = True
    warnings.warn(
        # ...
    )

@pachewise

Any progress here, @antonagestam @johananl?

@johananl
Author

I'd love to move on with this @pachewise, however I am not clear on the desired behavior. I think we need @antonagestam here.

@antonagestam
Owner

antonagestam commented Mar 16, 2018

I've created a separate patch for this issue as I found it difficult to express the behavior I was trying to explain without writing the code. I take a slightly different approach than you and instead only check the preload_metadata attribute if the storage is a boto instance.
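
For readers without #127 at hand, a minimal sketch of the idea described above, not the actual patch. `BotoStorage`, `LocalStorage`, and `ensure_preload_metadata` are hypothetical names, and the warning text is a placeholder.

```python
import warnings


class BotoStorage:
    """Hypothetical stand-in for a boto-based S3 storage class."""

    preload_metadata = False


class LocalStorage:
    """Hypothetical non-S3 storage: no preload_metadata attribute."""


def ensure_preload_metadata(storage, enabled=True):
    """Force preload_metadata on, but only for boto storages.

    Returns True if the attribute was overridden. Other storages are left
    untouched, so no AttributeError and no RuntimeError is raised for them.
    """
    if not enabled or not isinstance(storage, BotoStorage):
        return False
    if storage.preload_metadata is not True:
        storage.preload_metadata = True
        warnings.warn("Overriding storage.preload_metadata and continuing.")
        return True
    return False
```

Because the attribute is only read after the `isinstance` check passes, a non-S3 storage never reaches the `preload_metadata` lookup.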

I did not want to introduce a new RuntimeError. One reason is that it would prevent this from being released in a fix version; another is that it goes in the opposite direction from where I want to take the project in the future. I want Collectfast to grow to work for other storage backends as well.

Please check it out and play around with it to see if it fixes your issue in the way you expect!

#127

Also, sorry for the slight delay here.

@pachewise

we can probably close this, since it has been fixed already

@antonagestam
Owner

Thanks @pachewise, closing
