-
Notifications
You must be signed in to change notification settings - Fork 437
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add file datatype type to support saving and reading files/folders in artifact_store #1805
Conversation
77ec8c8
to
163caa8
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1805 +/- ##
===========================================
- Coverage 80.33% 65.91% -14.43%
===========================================
Files 95 124 +29
Lines 6602 8842 +2240
===========================================
+ Hits 5304 5828 +524
- Misses 1298 3014 +1716 ☔ View full report in Codecov by Sentry. |
7f7c213
to
78ebe53
Compare
34609c5
to
54e0bd3
Compare
54e0bd3
to
8708513
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good - need to sync on plan for various Encodable
versions.
save_path = os.path.join(file_id_folder, name) | ||
logging.info(f"Copying file {file_path} to {save_path}") | ||
if path.is_dir(): | ||
shutil.copytree(file_path, save_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jieguangzhou if there is a partially copy and program crashes may be a rollback?
possibly in future if not in this pr.
Ignore if already happens in shut.copy*
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we need to do this in future, not only filesystem, all the artifact store need to support this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be handled in db.add
. We need to be able to roll back the whole add
process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Deleting already added artifacts
- Cancelling any running jobs
- Removing computed outputs
Not an easy task.
save_path = os.path.join(file_id_folder, name) | ||
logging.info(f"Copying file {file_path} to {save_path}") | ||
if path.is_dir(): | ||
shutil.copytree(file_path, save_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jieguangzhou @blythed
Since we are copying it in artifact as a snapshot of the directory, we need to caution users that if they change the source directory after db.load , it will not be reflected in the component.
although it is intended to be like this
"""Download file or folder from GridFS and return the path""" | ||
temp_dir = tempfile.mkdtemp(prefix=file_id) | ||
|
||
# try to download a file first, if it fails, assume it's a folder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this comment is not valid anymore right since you have a metadata.type with 'dir' or 'file' which will give you if folder or file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When downloading, we currently only have file_id information and no other information. Although we can also use file_id to judge first, it will increase a search.
Or save more information upstream, but actually increase the complexity
WDTY?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be explicit. If we don't have the type
then we should raise an Exception
. Don't assume anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
c0fdf02
to
a065b06
Compare
a065b06
to
61ced9a
Compare
Description
#1789
Related Issues
Checklist
make unit-testing
andmake integration-testing
successfully?Additional Notes or Comments