Skip to content
This repository has been archived by the owner on Jan 16, 2023. It is now read-only.

Set preload_metadata on the fly #30

Closed
Gwildor opened this issue Apr 1, 2014 · 7 comments
Closed

Set preload_metadata on the fly #30

Gwildor opened this issue Apr 1, 2014 · 7 comments

Comments

@Gwildor
Copy link

Gwildor commented Apr 1, 2014

Setting AWS_PRELOAD_METADATA = True is not always an option, because it makes some commands in the S3BotoStorage really slow. But when it's not enabled, the command does not work. So, preload_metadata should be set when it's not set or set to false when running the command.

Relevant:
https://bitbucket.org/david/django-storages/issue/106/preload-breaks-things-like-exist
AGoodId/django-s3-collectstatic#2

@antonagestam
Copy link
Owner

Sounds scary to override the settings, wouldn't it be preferable to extend S3BotoStorage as suggested in the thread that you linked?

It's also worth noting that while Collectfast was specifically created for my needs when using S3, it currently would work for all storages that behaves the same way and the package doesn't actually depend on S3BotoStorage. Ideally the solution should work for other storage backends too.

From the issue that you linked it kind of sounds like this is something that should be fixed in django-storages though I'm not against implementing a workaround.

Pull requests are welcome! :)

@Gwildor
Copy link
Author

Gwildor commented Apr 1, 2014

Hmm, now that I'm rereading that topic, I might see possibilities for my our case where that would fix things. We were doing thumbnails as well, and we concluded that the entries method(/property) of S3BotoStorage is slow with AWS_PRELOAD_METADATA = True because it keeps loading the whole bucket (unless it's appending to the entries list, but have not looked at the storage code that thoroughly yet), but actually the case of false negatives might be even more likely, which would indeed be fixed by the code shared in that topic.

Relevant links for this edge case:

jazzband/sorl-thumbnail#92
SmileyChris/easy-thumbnails#195
SmileyChris/easy-thumbnails#283

Anyway, of course that has nothing to do with Collectfast, I'm trailing a bit off here and just putting it out there in case anyone else ran into this the same route I did.

As for Collectfast, I agree that changing a setting as you please is a bit dangerous. In this case, however, there are just two possible settings for the particular setting: on or off. When it's turned on, Collectfast works out of the box as intended. When it's turned off, however, Collectfast does not work at all either (at least, zero advantage over regular collectstatic). That being the case, I would say that there are a few possible solutions here:

  • Set the setting for the length of the command, in a similar way as django-s3-collectstatic (mentioned in my earlier post) does (which solves it in a pretty clear way, in my opinion).
  • Refuse to work, or at least log a big warning, when the setting is turned off, being that in that case Collectfast won't help you.
  • Document why the setting must be turned on, and possible even give an example of how to combine this with turning it on just for Collectfast (my first hunch would be using an environment setting to turn it on for Collectfast, or turn it off when it's not set).

On the note that ideally it should work on other storages as well: I agree, and it would be great if it does (although I have no experience with other remote storages than S3). At the moment, however, S3 is specifically stated in the readme, so I would say it is not that bad to add code just for S3 while no other storages are supported yet. To go even further: other small bits of code will probably have to be added later on to support other remote storages in its fullest.

Whatever the option you decide to go with, I would be happy to create and PR something, although I won't make any promises on the time span (probably this weekend).

@antonagestam
Copy link
Owner

I think any combination of options 2 and 3 would be the most compelling way to go. I could also see how going with option 1 but logging a warning if we're overriding the setting could work. Thoughts? I really don't have that strong opinions about this to be honest. Hard to make a choice since I am not familiar with the way AWS_PRELOAD_METADATA is implemented and what all the consequences are.

Feel free to add more to the discussion, otherwise I'll try to have a decision ASAP. I appreciate you taking your time!

@antonagestam
Copy link
Owner

I've submitted a PR for this, can you confirm it works as expected?

@Gwildor
Copy link
Author

Gwildor commented Apr 10, 2014

Sorry, I was not by able to quickly test this, which is also why I haven't worked on a PR myself. I tested this today, and it seems to work fine, thank you! :)

Edit: on second thought, I noticed something with the PR, I've responded on it there.

@antonagestam
Copy link
Owner

@Gwildor Great, thanks! No stress! 👍

@antonagestam
Copy link
Owner

closed in #33

Gwildor added a commit to Gwildor/collectfast that referenced this issue Feb 27, 2017
This fixes uploading a static file with the same name twice resulting in
a second file with a randomized suffix in the name, and of course the
first one never to be updated to the newer version.

This is a bit related to antonagestam#30 and basically has the same fix. It's a bit
surprising that this never was a problem before, but that's probably
because, in contrary to the setting from antonagestam#30, that default in
django-storages for this setting is the one we need. This also means
that the impact of this change is a lot smaller. It only affects
projects which have manually set this setting to False, most likely to
prevent user uploads from overwriting each other.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants