Arbitrary depth support for zip imports #1099
BryonLewis
left a comment
Thanks for making this work with arbitrary levels. I have a few questions and a comment based on using direct zip exports as imports without other datasets included.
        json=metdata,
    )
    for folderName, folderType in discovered_folders.items():
        subFolderName = folderName if len(discovered_folders) > 1 else ''
Removing the check for a single top-level folder means that anything in a zip exported from the website is imported as a subfolder instead of being imported directly. For example, if you export a dataset as a zip using "export all" and then import it, the result is ./Import_name/dataset_name rather than ./Import_name, so exported zips always land in a subfolder of the import. For multiple subfolders this makes sense, but I'm not sure we want this behavior for a zip that contains a single exported dataset. I realize this update now allows imports of arbitrary depth, but it comes at the cost of a smooth export/import round trip for zip exports. I just want to bring that up and see if it's worth it.
One idea: count the folders of type 'unstructured' and 'dataset', and if the total is 1, set subFolderName = ''; otherwise set it to the current subfolder name?
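The suggestion above could be sketched roughly as follows. This is only an illustration, not the code in this PR: the helper name `resolve_subfolder_names` and the literal type strings are assumptions, and `discovered_folders` is assumed to map folder names to folder types as in the snippet under review.

```python
from typing import Dict

# Type labels taken from the review comment; the real constants may differ (assumption).
IMPORTABLE_TYPES = {"unstructured", "dataset"}


def resolve_subfolder_names(discovered_folders: Dict[str, str]) -> Dict[str, str]:
    """Map each importable folder to the subfolder name it should be imported under.

    Hypothetical helper: if exactly one importable folder was discovered, it is
    imported directly (empty subfolder name); otherwise each keeps its own name.
    """
    importable = [
        name
        for name, folder_type in discovered_folders.items()
        if folder_type in IMPORTABLE_TYPES
    ]
    import_directly = len(importable) == 1
    return {name: ("" if import_directly else name) for name in importable}
```

With a single exported dataset this keeps the import at ./Import_name, while zips with several datasets still get one subfolder per dataset.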
    # remove metadata
    metdata = [constants.TypeMarker, constants.FPSMarker, constants.DatasetMarker]
    gc.sendRestRequest(
        "DELETE",
        f"folder/{folderId}/metadata",
        json=metdata,
    )
Removing this means that top-level folders keep residual metadata, including fps and a type of zip. The type may be fine as long as we never use type to search for data. The fps is relevant when creating a single dataset import, but it is a little weird if multiple zip files are being uploaded, especially because the fps value will be -1 if you set it to the video default rate. This may be okay and intended, though.
        'name': 'singleDatasetImport',
        'path': 'zipTestFiles/singleDatasetImport.zip',
Mistake on my part: this should be SingleDatasetImport, not singleDatasetImport. I should probably be more consistent in my zip file names (maybe the idea was to keep dataset imports capitalized at the beginning).
* initial MVP of zip upload working
* mend
* adding in some failure conditions
* adding in integration testing
* updating tests
* source folder
* starting to address comments
* minor fixes
* removing zipmarker
* adding in multi-import zip and dataset import zip
* gAdding in beginning of dataset import
* integration testing for multi zip files
* reviewing and simplifying
* Update docs/Web-Version.md Co-authored-by: Brandon Davis <brandon.davis@kitware.com>
* addressing comments
* Arbitrary depth support for zip imports (#1099)
* Arbitrary depth support for imports
* Respond to comments Co-authored-by: Brandon Davis <brandon.davis@kitware.com>
filenames doesn't list the directory as its own item, which was breaking stuff. See this comment for a better description of what was broken: Web Zip file importing #1062 (review)

postprocess because it makes things easier and you can leave off the "cleanup" routine in the celery task. The server should be smart enough to detect a zip upload and not go around creating empty annotation files and aux folders when it knows that it's about to perform an import.

Now, even if there's an upload where a zip file contains multiple datasets at a/b/c/d and a/1/2, things should work out properly.
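As a rough illustration of discovery at arbitrary depth, the sketch below walks an extracted directory tree and collects every folder that directly contains files. It assumes the zip has already been extracted to disk and that traversal uses `os.walk`; the helper name `find_folders_with_files` and the "directly contains files" criterion are hypothetical and not taken from this PR.

```python
import os
from pathlib import Path
from typing import List


def find_folders_with_files(root: str) -> List[str]:
    """Return sorted relative POSIX paths of directories under root that directly hold files."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # os.walk reports only the files inside dirpath in `filenames`;
        # the directory itself is never listed as an item, so it is
        # taken from dirpath instead.
        if filenames:
            found.append(Path(dirpath).relative_to(root).as_posix())
    return sorted(found)
```

For a tree holding files under a/b/c/d and a/1/2, this would report both folders regardless of how deeply they are nested.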