Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed #35326 -- added allow_overwrite parameter to FileSystemStorage. #18020

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bcail
Copy link
Contributor

@bcail bcail commented Mar 26, 2024

Trac ticket number

ticket-35326

Branch description

This patch handled temporary uploaded files when the filesystem storage allows overwriting files.

Checklist

  • This PR targets the main branch.
  • The commit message is written in past tense, mentions the ticket number, and ends with a period.
  • I have checked the "Has patch" ticket flag in the Trac system.
  • I have added or updated relevant tests.
  • I have added or updated relevant docs, including release notes if applicable.
  • For UI changes, I have attached screenshots in both light and dark modes.

@bcail bcail force-pushed the issue-35326-overwriting-storage branch 3 times, most recently from e9653cd to 027188b Compare March 28, 2024 13:25
@bcail bcail marked this pull request as ready for review March 28, 2024 14:40
@bcail bcail force-pushed the issue-35326-overwriting-storage branch from 027188b to 0ec0a4d Compare April 1, 2024 19:08
@jschneier
Copy link
Contributor

What about using the OPTIONS dict to implement this feature more generally?

@bcail
Copy link
Contributor Author

bcail commented Apr 22, 2024

OK, so then I would add an allow_overwrite parameter to both the file & memory storage __init__ methods? That seems doable.

@bcail bcail force-pushed the issue-35326-overwriting-storage branch 2 times, most recently from ce1431e to 59124ab Compare April 23, 2024 20:19
Copy link
Contributor

@sarahboyce sarahboyce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this @bcail! I have initial comments 👍

django/core/files/storage/filesystem.py Outdated Show resolved Hide resolved
django/core/files/storage/filesystem.py Show resolved Hide resolved
Comment on lines 68 to 97
def get_available_name(self, name, max_length=None):
if self._allow_overwrite:
name = str(name).replace("\\", "/")
dir_name, file_name = os.path.split(name)
if ".." in pathlib.PurePath(dir_name).parts:
raise SuspiciousFileOperation(
"Detected path traversal attempt in '%s'" % dir_name
)
validate_file_name(file_name)
return name
else:
return super().get_available_name(name, max_length)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why this change is required? I also believe it is untested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is needed so that the name is allowed even if there's already a file by that name. If I remove this change, the two file_storage.tests.OverwritingStorageTests tests fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to add file_storage.test_generate_filename.GenerateFilenameStorageTests.test_storage_dangerous_paths_dir_name or create a very similar test for when to FileSystemStorage has allow_overwrite=True

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the tests.

Copy link
Contributor

@sarahboyce sarahboyce May 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think instead of overwriting get_available_name here, we can overwrite exists to something like:

    def exists(self, name):
        exists = os.path.lexists(self.path(name))
        if self._allow_overwrite:
            return False
        return exists

It requires some test updates (these need a think and can probably be better) but something like:

@@ -99,14 +99,14 @@ class FileStorageTests(SimpleTestCase):
         f = self.storage.open("storage_test", "w")
         f.write("storage contents")
         f.close()
-        self.assertTrue(self.storage.exists("storage_test"))
+        self.assertTrue(os.path.exists(os.path.join(self.temp_dir, "storage_test")))

         f = self.storage.open("storage_test", "r")
         self.assertEqual(f.read(), "storage contents")
         f.close()

         self.storage.delete("storage_test")
-        self.assertFalse(self.storage.exists("storage_test"))
+        self.assertFalse(os.path.exists(os.path.join(self.temp_dir, "storage_test")))

     def _test_file_time_getter(self, getter):
         # Check for correct behavior under both USE_TZ=True and USE_TZ=False.
@@ -275,10 +275,10 @@ class FileStorageTests(SimpleTestCase):
         """
         Saving a pathname should create intermediate directories as necessary.
         """
-        self.assertFalse(self.storage.exists("path/to"))
+        self.assertFalse(os.path.exists(os.path.join(self.temp_dir, "path/to")))
         self.storage.save("path/to/test.file", ContentFile("file saved with path"))

-        self.assertTrue(self.storage.exists("path/to"))
+        self.assertTrue(os.path.exists(os.path.join(self.temp_dir, "path/to")))
         with self.storage.open("path/to/test.file") as f:
             self.assertEqual(f.read(), b"file saved with path")

@@ -694,13 +694,11 @@ class OverwritingStorageTests(FileStorageTests):
         stored_name_1 = self.storage.save(name, f_1)
         try:
             self.assertEqual(stored_name_1, name)
-            self.assertTrue(self.storage.exists(name))
             self.assertTrue(os.path.exists(os.path.join(self.temp_dir, name)))
             with self.storage.open(name) as fp:
                 self.assertEqual(fp.read(), content_1)
             stored_name_2 = self.storage.save(name, f_2)
             self.assertEqual(stored_name_2, name)
-            self.assertTrue(self.storage.exists(name))
             self.assertTrue(os.path.exists(os.path.join(self.temp_dir, name)))
             with self.storage.open(name) as fp:
                 self.assertEqual(fp.read(), content_2)
@@ -722,13 +720,11 @@ class OverwritingStorageTests(FileStorageTests):
         stored_name_1 = self.storage.save(name, f_1)
         try:
             self.assertEqual(stored_name_1, name)
-            self.assertTrue(self.storage.exists(name))
             self.assertTrue(os.path.exists(os.path.join(self.temp_dir, name)))
             with self.storage.open(name) as fp:
                 self.assertEqual(fp.read(), content_1)
             stored_name_2 = self.storage.save(name, f_2)
             self.assertEqual(stored_name_2, name)
-            self.assertTrue(self.storage.exists(name))
             self.assertTrue(os.path.exists(os.path.join(self.temp_dir, name)))
             with self.storage.open(name) as fp:

@bcail
Copy link
Contributor Author

bcail commented Apr 30, 2024

@sarahboyce Thanks for your review. I've added OS_OPEN_FLAGS back in, for deprecation, and I added some documentation. Do we need to add a deprecation warning in the code if OS_OPEN_FLAGS is used? If so, I'm not sure how to do that since we're always setting OS_OPEN_FLAGS here in the django code. Should I check for whether OS_OPEN_FLAGS has been changed from the default and raise a warning if so?

I added a commit with a warning if the default OS_OPEN_FLAGS is overridden.

Copy link
Contributor

@sarahboyce sarahboyce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a commit with a warning if the default OS_OPEN_FLAGS is overridden.

Thank you for doing this @bcail, I think this approach looks good to me.

I have a few minor comments, this is getting really close 👍

docs/ref/files/storage.txt Outdated Show resolved Hide resolved
docs/releases/5.1.txt Outdated Show resolved Hide resolved
docs/releases/5.1.txt Outdated Show resolved Hide resolved
django/core/files/storage/filesystem.py Outdated Show resolved Hide resolved
django/core/files/storage/filesystem.py Show resolved Hide resolved
@bcail bcail force-pushed the issue-35326-overwriting-storage branch from bb1bfef to a876585 Compare May 1, 2024 14:47
@jschneier
Copy link
Contributor

Kind of derailed the original ticket so would rename the title

@bcail bcail changed the title Fixed #35326 -- handled temporary uploaded files for OverwritingStorage. Fixed #35326 -- added allow_overwrite parameter to FileSystemStorage. May 1, 2024
docs/releases/5.1.txt Outdated Show resolved Hide resolved
@sarahboyce
Copy link
Contributor

Thank you for the updates @bcail, have you seen this suggestion?
As it would change how exists works (a method we should overwrite/define) rather than get_available_name (that in theory we shouldn't overwrite/define), I think this might be a better way to go 🤔

@bcail
Copy link
Contributor Author

bcail commented May 13, 2024

@sarahboyce I wonder if there would be any downstream code that would want to call exists(), to see if a file is in storage or not, and would get confused by us changing exists() to always return False for overwriting storage?

But I'll do it whichever way you think is best.

@sarahboyce
Copy link
Contributor

To clarify my thoughts, if we look at the docs for exists

Returns True if a file referenced by the given name already exists in the storage system, or False if the name is available for a new file.

I believe it is available for a new file when we are in overwrite mode regardless of whether a file already exists (though we might want to slightly tweak the wording of the true case).

I also believe this would better suit the documented approach for writing a custom file storage system (i.e. defining exists but generally leaving get_available_name alone).

@sarahboyce I wonder if there would be any downstream code that would want to call exists(), to see if a file is in storage or not, and would get confused by us changing exists() to always return False for overwriting storage?

Thank you for raising this. 👍
There should be no impact when allow_overwrite is False (which is the default).
Any other impact can be handled when someone chooses to add support for this newly added allow_overwrite parameter (including overwriting exists to be as before).
Hence, I currently believe that leaving get_available_name alone and instead handling this with exists would be the better approach.

@jschneier do you have an opinion here?

@sarahboyce sarahboyce force-pushed the issue-35326-overwriting-storage branch from 2cb63d6 to f2b4ec1 Compare May 15, 2024 13:55
@sarahboyce sarahboyce force-pushed the issue-35326-overwriting-storage branch from f2b4ec1 to 6375678 Compare May 15, 2024 14:28
@sarahboyce sarahboyce requested a review from nessita May 15, 2024 14:33
@sarahboyce
Copy link
Contributor

Thank you @bcail 🥳 this looks great, I pushed up small edits and will seek a second opinion before merging 👍

@bcail
Copy link
Contributor Author

bcail commented May 15, 2024

Thanks, @sarahboyce.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants