Google Cloud backend not a very good replacement for Django's Media or Static handling #491

Closed
elnygren opened this Issue Apr 25, 2018 · 14 comments

elnygren commented Apr 25, 2018

The Google Cloud backend isn't a very good candidate for Django's Media or Static files storage backend without some additional configuration and a study of the sources. Ideally, the docs would say a little about issues such as #463 and #413.

Here's my solution:

"""
GoogleCloudStorage extensions suitable for handling Django's
Static and Media files.

Requires the following settings:
MEDIA_URL, GS_MEDIA_BUCKET_NAME
STATIC_URL, GS_STATIC_BUCKET_NAME

In addition to
https://django-storages.readthedocs.io/en/latest/backends/gcloud.html
"""
from django.conf import settings
from storages.backends.gcloud import GoogleCloudStorage
from storages.utils import setting
from urllib.parse import urljoin


class GoogleCloudMediaStorage(GoogleCloudStorage):
    """GoogleCloudStorage suitable for Django's Media files."""

    def __init__(self, *args, **kwargs):
        if not settings.MEDIA_URL:
            raise Exception('MEDIA_URL has not been configured')
        kwargs['bucket_name'] = setting('GS_MEDIA_BUCKET_NAME', strict=True)
        super(GoogleCloudMediaStorage, self).__init__(*args, **kwargs)

    def url(self, name):
        """.url that doesn't call Google."""
        return urljoin(settings.MEDIA_URL, name)


class GoogleCloudStaticStorage(GoogleCloudStorage):
    """GoogleCloudStorage suitable for Django's Static files"""

    def __init__(self, *args, **kwargs):
        if not settings.STATIC_URL:
            raise Exception('STATIC_URL has not been configured')
        kwargs['bucket_name'] = setting('GS_STATIC_BUCKET_NAME', strict=True)
        super(GoogleCloudStaticStorage, self).__init__(*args, **kwargs)

    def url(self, name):
        """.url that doesn't call Google."""
        return urljoin(settings.STATIC_URL, name)
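
One thing worth checking with the `url` methods above is how `urljoin` treats the base URL. The sketch below is illustrative only and uses just the standard library; the bucket name and object name are placeholders, not values from this thread.

```python
from urllib.parse import urljoin

# Base ends with a slash: the bucket path segment is kept.
with_slash = urljoin('https://storage.googleapis.com/my-bucket-name/', 'images/a.png')

# Base has no trailing slash: urljoin replaces the last path segment,
# so the bucket name is silently dropped from the result.
without_slash = urljoin('https://storage.googleapis.com/my-bucket-name', 'images/a.png')

# Bucket given as a host name: there is no path segment to lose.
subdomain = urljoin('https://my-bucket-name.storage.googleapis.com', 'images/a.png')

# Empty base: the name comes back unchanged, with no host at all.
empty = urljoin('', 'images/a.png')

print(with_slash)
print(without_slash)
print(subdomain)
print(empty)
```

If MEDIA_URL or STATIC_URL contains a path, ending it with a slash avoids the second case.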
nbau21 commented Jun 1, 2018

Thanks for doing the legwork. I tried implementing this snippet but I keep getting nulls. Am I missing something?

In settings.py:

DEFAULT_FILE_STORAGE = 'api.storage.backends.GoogleCloudStaticStorage'
GS_BUCKET_NAME = 'my-bucket-name'
GS_STATIC_BUCKET_NAME = 'my-bucket-name'

STATIC_URL = ''

And my images are in gs://my-bucket-name/images and gs://my-bucket-name/thumbnail

Edit: Alright, after a good night's sleep, I realized I should be using my bucket's URL for STATIC_URL. Ex: https://my-bucket-name.storage.googleapis.com.

Again, thanks for doing the legwork!

sww314 added the google label Jun 5, 2018

edevil commented Jun 15, 2018

Maybe a PR to the docs would move this along? Thanks for your work.

chrishiestand commented Jul 13, 2018

@elnygren And where does your file get stored?

chrishiestand commented Jul 13, 2018

Never mind, I realized you reference these from settings.py:

STATICFILES_STORAGE = 'googlecloudstorage.GoogleCloudStaticStorage'
DEFAULT_FILE_STORAGE = 'googlecloudstorage.GoogleCloudMediaStorage'

So you just need to store them somewhere Python can import them from.

Owner

jschneier commented Aug 12, 2018

Okay, so as currently written we hit Google every time .url is called. This is obviously substandard for performance reasons. It is really easy to update; the following is taken from the Google Cloud library. PR anyone?

    def public_url(self):
        """The public URL for this blob's object.

        :rtype: `string`
        :returns: The public URL for this blob.
        """
        return '{storage_base_url}/{bucket_name}/{quoted_name}'.format(
            storage_base_url=_API_ACCESS_ENDPOINT,
            bucket_name=self.bucket.name,
            quoted_name=quote(self.name.encode('utf-8')))

No one has asked for signed URLs yet, but we'll need them eventually, I think. Amazon is nice because signing is on the bucket, whereas here we need to fetch the blob.
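
For illustration, the same string formatting can be done as a standalone function with no network call. This is only a sketch: the function signature is hypothetical, and the endpoint value is an assumption standing in for the library's `_API_ACCESS_ENDPOINT`.

```python
from urllib.parse import quote

# Assumed value; the library excerpt refers to this as _API_ACCESS_ENDPOINT.
API_ACCESS_ENDPOINT = 'https://storage.googleapis.com'


def public_url(bucket_name, blob_name):
    """Build a public object URL by string formatting only (no request to Google)."""
    # quote() keeps '/' by default, so object "folders" stay readable.
    quoted_name = quote(blob_name.encode('utf-8'))
    return '{storage_base_url}/{bucket_name}/{quoted_name}'.format(
        storage_base_url=API_ACCESS_ENDPOINT,
        bucket_name=bucket_name,
        quoted_name=quoted_name)


print(public_url('my-bucket-name', 'images/my photo.png'))
```

A URL built this way only resolves if the object is publicly readable; private objects still need a signed URL.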

sheepeatingtaz commented Aug 12, 2018

Not sure if this is the place, as it seems like I'm hijacking another issue, but...

> No one has asked for signed urls yet but we'll need them eventually I think.

I'd like signed URLs, please! See my PR #326, or #399, which extends it further.

I know for mine, I've just not had the time or experience to see how to fit it in with the tests, but I could help add documentation if that would help.

xtechgamer735 commented Aug 17, 2018

As someone looking to implement this in a Django project, is it really the way forward, given the issues that have been raised? The lack of documentation made me uneasy in the first place, as I am new to Django, but I can't find anything else.

I'm struggling to see how I would implement this at first, but I can get my head around it if it's worth it.

Collaborator

sww314 commented Aug 17, 2018

@xtechgamer735 I use this project in production with Google Cloud Storage. There are some issues that are being worked on, but the project is actively maintained.

Personally, I do not use GCS for static file storage (try this instead: http://whitenoise.evans.io/en/stable/); it is simpler without the fragile deployment to a third-party network storage solution. Instead, I use GCS for media storage, and it works fine with only a few configuration settings.

xtechgamer735 commented Aug 17, 2018

@sww314 Ah thanks for the link! That looks great.

I think GCS is the best way forward for me; it was the comment regarding inefficient url usage that got me most worried. Is that nothing really to be concerned about?

If you don't mind me asking, what type of data are you storing?

Collaborator

sww314 commented Aug 17, 2018

The url call makes listing many files slow. In particular, it makes collectstatic pretty slow.
For the typical flow of getting and saving a media file (images, docs, etc.) it works fine, since Google's replies are fast.

sergioisidoro referenced this issue Aug 21, 2018: Add support to google cloud storage #2626 (Open)

Owner

jschneier commented Aug 26, 2018

Okay, so I think what we want is essentially what we have in the boto backends:

  1. if Custom URL: return that
  2. if Signed URL: fetch the blob and return the generated signed URL
  3. else:
       name = self._encode_name(self._normalize_name(clean_name(name)))
       return 'https://storage.googleapis.com/{}/{}'.format(self.bucket_name, quote(name, safe=''))

I ripped that out of the Google Cloud storage library. Can someone try it out and confirm it at least passes a sanity check?

Then we can move on to points 1 & 2, after which I can ship 1.7.
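
The three-way decision above could be sketched roughly as follows. This is not the backend's actual code: `resolve_url`, `custom_endpoint`, and `sign` are hypothetical names, and `sign` is a stand-in callable for whatever fetches the blob and generates a signed URL.

```python
from urllib.parse import quote


def resolve_url(name, bucket_name, custom_endpoint=None, sign=None):
    """Pick a URL strategy in the order listed: custom URL, signed URL, public URL."""
    # 1. Custom URL: return it joined with the object name, no call to Google.
    if custom_endpoint:
        return '{}/{}'.format(custom_endpoint.rstrip('/'), quote(name))
    # 2. Signed URL: delegate to a caller-supplied signer (hypothetical hook).
    if sign is not None:
        return sign(name)
    # 3. Public URL: pure string formatting, as in the comment above.
    #    Note that safe='' percent-encodes '/' as well.
    return 'https://storage.googleapis.com/{}/{}'.format(
        bucket_name, quote(name, safe=''))


print(resolve_url('images/a.png', 'my-bucket'))
print(resolve_url('images/a.png', 'my-bucket', custom_endpoint='https://cdn.example.com/'))
print(resolve_url('images/a.png', 'my-bucket', sign=lambda n: 'signed:' + n))
```

Only the signed branch would need a round trip to Google, so listing many public files stays cheap.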

Collaborator

sww314 commented Aug 26, 2018

@jschneier for 3, you can just use self.bucket.blob(name).public_url (it is a property, so no call is needed).
No need to move code from the Google library.

Owner

jschneier commented Aug 26, 2018

Collaborator

sww314 commented Aug 27, 2018

#570 fixes 2 & 3, which makes the backend usable for public files and for files whose URLs you want to expire. It also greatly improves performance for public files.

I would suggest that item 1 (custom URL) be addressed afterwards. I think custom URLs have some limitations on GCS.
