Conversation
Codecov Report

❌ Attention: Patch coverage is 78.72%. Your patch status has failed because the patch coverage (78.72%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files:

@@ Coverage Diff @@
## main #6161 +/- ##
==========================================
+ Coverage 86.92% 87.01% +0.09%
==========================================
Files 423 422 -1
Lines 26161 26170 +9
Branches 2842 2849 +7
==========================================
+ Hits 22741 22773 +32
+ Misses 2800 2775 -25
- Partials 620 622 +2
…-storage-capabilities
if os.path.exists(folder_path):
    import shutil

    shutil.rmtree(folder_path)
What does the folder_path evaluate to? I want to make sure we don't delete the entire contents of fides_uploads
I just went and double checked because ... paranoia...

- get_local_filename() is called with f"{self.id}/{self.file_name}".
- In src/fides/api/service/storage/util.py, get_local_filename() prepends LOCAL_FIDES_UPLOAD_DIRECTORY to the path, and LOCAL_FIDES_UPLOAD_DIRECTORY == "fides_uploads".
- os.path.dirname() is then called on this path, which returns the directory portion.

So for an attachment with:
- id = att_123
- file_name = example.pdf

folder_path evaluates to: fides_uploads/att_123

So this should be safe:
- It only deletes the specific folder for this attachment (named by its ID).
- It's contained within the fides_uploads directory.
- Each attachment gets its own subdirectory named by its ID, which follows the pattern I am using for GCS and S3.
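To make the path reasoning concrete, here is a minimal sketch of the evaluation. Note this `get_local_filename` is a simplified stand-in for the real helper in src/fides/api/service/storage/util.py (the real one may do extra validation); only the prepend-and-dirname behavior described above is modeled:

```python
import os

# Matches the constant described above.
LOCAL_FIDES_UPLOAD_DIRECTORY = "fides_uploads"

def get_local_filename(file_key: str) -> str:
    # Simplified stand-in: prepend the upload directory to the key.
    return os.path.join(LOCAL_FIDES_UPLOAD_DIRECTORY, file_key)

# For an attachment with id "att_123" and file name "example.pdf":
path = get_local_filename("att_123/example.pdf")
folder_path = os.path.dirname(path)
# folder_path is "fides_uploads/att_123", so rmtree only removes this
# attachment's own subdirectory, never fides_uploads itself.
```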
Perfect, thanks for checking!
# If the file_key ends with a '/', it's a folder prefix
if file_key.endswith("/"):
    # List all objects with the prefix
    objects_to_delete = s3_client.list_objects_v2(
I don't know how likely this would be, but if there are more than 1000 files, then we would need to paginate to get the next set of files https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3/client/list_objects_v2.html#S3.Client.list_objects_v2. We probably don't need to do this, but just thought I'd call it out
I updated to use the paginate option just in case.

If we want to optimize this further, we could batch the deletes using delete_objects with multiple keys per page, but that would make the code more complex. The current implementation prioritizes reliability and simplicity over performance, which is probably the right tradeoff given that:
- This is a deletion operation that doesn't need to be super fast.
- The number of files per attachment is likely to be small.
- The simpler code is easier to maintain and less prone to bugs.
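For reference, the batched variant mentioned above could look roughly like this. This is a sketch only, not the code in this PR: the function name `delete_prefix` is hypothetical, and it is written against the boto3 S3 client shapes (`get_paginator("list_objects_v2")`, `delete_objects`), where delete_objects accepts up to 1000 keys per call, matching list_objects_v2's default page size:

```python
def delete_prefix(s3_client, bucket: str, prefix: str) -> int:
    """Delete every object under `prefix`, one delete_objects call per page.

    Sketch: assumes `s3_client` follows the boto3 S3 client interface.
    The paginator handles the >1000-object case raised in the review.
    """
    deleted = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages with no matching objects have no "Contents" key.
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            # One bulk delete per page instead of one call per object.
            s3_client.delete_objects(Bucket=bucket, Delete={"Objects": keys})
            deleted += len(keys)
    return deleted
```

As noted, the per-object loop that was actually merged trades this throughput for simpler, easier-to-verify code.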
Co-authored-by: Adrian Galvan <adrian@ethyca.com>
…-storage-capabilities
galvana left a comment
Thanks for making the changes, looks good!
# If the file_key ends with a '/', it's a folder prefix
if file_key.endswith("/"):
    # List all objects with the prefix, handling pagination
    paginator = s3_client.get_paginator("list_objects_v2")
fides (Cypress run)

Project: fides
Branch Review: main
Run duration: 00m 52s
Committer: JadeWibbels
Test results: 0 failed, 0 flaky, 0 pending, 0 skipped, 5 passed
Closes LJ-530
Description Of Changes
Google Cloud Storage was added as a storage option after attachments were introduced, and because we do not have a single unified storage service at this time, this option needed to be added independently. The delete statements for all storage types were updated to be more thorough, but there should be no change in functionality.
Code Changes
Steps to Confirm
Create an attachment; it should show up in the test bucket.

Verify you can retrieve the attachment by id.

Verify deleting the attachment works as expected. Note: before running the delete command, if you have opened the created folder to check the attachment, make sure you are no longer in it, because a folder that is being accessed will not be auto-deleted.
Verify the attachment is no longer in GCS.

Attachments should work the same way for manual webhook steps, and there should be no change in functionality for S3 or local storage.
Pre-Merge Checklist
- CHANGELOG.md updated
- downgrade() migration is correct and works