GoogleCloudStorage.save fails from FileField #1010

Open
Hatell opened this issue May 6, 2021 · 2 comments

Comments

@Hatell
Contributor

Hatell commented May 6, 2021

I found an issue when saving to a GoogleCloudStorage from a FileField whose FileField.storage is a different storage.
The issue also occurs if that storage is a GoogleCloudStorage with a different bucket.

Here is sample code that reproduces the bug:

models.py

from django.core.files.storage import FileSystemStorage
from django.db import models
from storages.backends.gcloud import GoogleCloudStorage

# STORAGE_A can be any other storage, including GCS with a different bucket
STORAGE_A = FileSystemStorage(location='/home/user')
STORAGE_B = GoogleCloudStorage(
  # settings here,
)

class SourceFiles(models.Model):
  source_file = models.FileField(storage=STORAGE_A)

bug.py

from django.core.files.base import ContentFile
from .models import SourceFiles, STORAGE_A, STORAGE_B

source = SourceFiles.objects.create()

# Use STORAGE_A
source.source_file.save(
  'a.txt',
  ContentFile(b'Testfile'),
)

# Copy to STORAGE_B
STORAGE_B.save(
  'b.txt',
  source.source_file,  # FieldFile bound to STORAGE_A; this triggers the bug
)
source.source_file.close()

Traceback when STORAGE_A is a FileSystemStorage:

  File "/home/user/.virtualenvs/cv/lib64/python3.9/site-packages/django/core/files/storage.py", line 53, in save
    return self._save(name, content)
  File "/home/user/.virtualenvs/cv/lib64/python3.9/site-packages/storages/backends/gcloud.py", line 159, in _save
    content, rewind=True, size=content.size,
  File "/home/user/.virtualenvs/cv/lib64/python3.9/site-packages/django/db/models/fields/files.py", line 70, in size
    return self.storage.size(self.name)
  File "/home/user/.virtualenvs/cv/lib64/python3.9/site-packages/django/core/files/storage.py", line 327, in size
    return os.path.getsize(self.path(name))
  File "/usr/lib64/python3.9/genericpath.py", line 50, in getsize
    return os.stat(filename).st_size
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/b.txt'

Traceback when STORAGE_A is a GoogleCloudStorage:

  File "/home/user/.virtualenvs/cv/lib64/python3.9/site-packages/django/core/files/storage.py", line 53, in save
    return self._save(name, content)
  File "/home/user/.virtualenvs/cv/lib64/python3.9/site-packages/storages/backends/gcloud.py", line 159, in _save
    content, rewind=True, size=content.size,
  File "/home/user/.virtualenvs/cv/lib64/python3.9/site-packages/django/db/models/fields/files.py", line 70, in size
    return self.storage.size(self.name)
  File "/home/user/.virtualenvs/cv/lib64/python3.9/site-packages/storages/backends/gcloud.py", line 212, in size
    blob = self._get_blob(name)
  File "/home/user/.virtualenvs/cv/lib64/python3.9/site-packages/storages/backends/gcloud.py", line 206, in _get_blob
    raise NotFound('File does not exist: {}'.format(name))
google.api_core.exceptions.NotFound: 404 File does not exist: b.txt

The problem is at line 155 of storages/backends/gcloud.py, where the name of the content being saved is overwritten from a.txt to b.txt. Later, the code resolves the size of that content through content.size, and this fails because b.txt does not exist in STORAGE_A.
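
The mechanism can be reproduced without Django or GCS. This is only a minimal sketch: DictStorage, BoundFile and the two save functions are hypothetical stand-ins I made up for illustration, not code from Django or django-storages. It shows why renaming the source object before asking for its size breaks the lookup.

```python
class DictStorage:
    """Hypothetical in-memory stand-in for a storage backend."""

    def __init__(self):
        self.files = {}

    def size(self, name):
        # Like a real backend, fail when the name is unknown to this storage.
        if name not in self.files:
            raise FileNotFoundError(name)
        return len(self.files[name])


class BoundFile:
    """Stand-in for a FieldFile: size is looked up by name in its own storage."""

    def __init__(self, storage, name):
        self.storage = storage
        self.name = name

    @property
    def size(self):
        return self.storage.size(self.name)

    def read(self):
        return self.storage.files[self.name]


def save_renaming_source(dest, name, content):
    """Mimics the reported behaviour: rename the source, then ask for its size."""
    content.name = name      # the source object is now called 'b.txt'
    size = content.size      # looked up in the SOURCE storage, so it fails
    dest.files[name] = content.read()
    return size


def save_without_renaming(dest, name, content):
    """Same copy, but the source object's name is left untouched."""
    size = content.size
    dest.files[name] = content.read()
    return size


storage_a, storage_b = DictStorage(), DictStorage()
storage_a.files["a.txt"] = b"Testfile"

try:
    save_renaming_source(storage_b, "b.txt", BoundFile(storage_a, "a.txt"))
    failed = False
except FileNotFoundError:
    failed = True

copied_size = save_without_renaming(storage_b, "b.txt", BoundFile(storage_a, "a.txt"))
```

The first call fails with FileNotFoundError for the same reason as the tracebacks: the size is requested from the source storage under the destination name.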

@Hatell changed the title from "GoogleCloudStorage.save fails from FileField to different storage" to "GoogleCloudStorage.save fails from FileField" on May 6, 2021
@sww314
Contributor

sww314 commented May 6, 2021

@Hatell In general, django-storages is clunky when moving files between network storages. Its API is based on Django's file-based view of the world and does not support bucket-to-bucket copies.

Typically, when I need to do this, I fall back to the Google API/library. Maybe something like this:

source = SourceFiles.objects.create()

# Use STORAGE_A
source.source_file.save(
  'a.txt',
  ContentFile(b'Testfile'),
)

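source_bucket = source.source_file.storage.bucket
# Look up the source blob by the stored file name
source_blob = source_bucket.blob(source.source_file.name)
dest_bucket = STORAGE_B.bucket
# Server-side copy; the file contents are not downloaded locally
new_blob = source_bucket.copy_blob(source_blob, dest_bucket, "b.txt")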

I use similar code to copy between buckets. Otherwise, with the default behavior, the files pass through the local machine.

@Hatell
Contributor Author

Hatell commented May 7, 2021

The issue also affects saving from any storage to GCS, not just bucket to bucket. FileSystemStorage is one example.

A quick fix was to comment out the line where the source content.name is modified. I have no idea why the GoogleCloudStorage._save method needs to change the source content's name. Changing the destination content's name I understand.
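
A caller-side alternative that leaves the library untouched might be to hand save() a detached in-memory copy instead of the FieldFile itself. This is an untested sketch under assumptions: copy_to_storage and its make_content parameter are hypothetical names, the default relies on Django's ContentFile, and the whole file is read into memory, so it only suits small files.

```python
def copy_to_storage(field_file, dest_storage, dest_name, make_content=None):
    """Copy a stored file into another storage via a detached in-memory copy.

    make_content is a hypothetical hook for building the content object;
    when omitted, Django's ContentFile is used.
    """
    if make_content is None:
        # Imported lazily so the function can be exercised without Django.
        from django.core.files.base import ContentFile
        make_content = ContentFile

    field_file.open("rb")
    try:
        data = field_file.read()
    finally:
        field_file.close()

    # The new object carries its own size and has no link to the source
    # storage, so renaming it inside save() cannot break the size lookup.
    return dest_storage.save(dest_name, make_content(data))
```

With the models from the report, the call would look like copy_to_storage(source.source_file, STORAGE_B, 'b.txt').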
