DataDeps doesn't check if files themselves exist (?) #10
Comments
Which files in particular? Do you mean between the fetch step and the post-fetch step? We can't check the files after the post-fetch step,
That is a good point. I'll think on this a little.
The way I do this currently in MLDatasets is that after DataDeps does its thing (i.e., checks that the folder exists), I check whether the requested file exists in that folder. If it doesn't, the code assumes that the file should be present but must have been deleted. Consequently it simply retriggers
In other words, I also don't assume that the requested file is in the specified list of to-download files (since, as you say, we don't know what the post-fetch step does). But I think the above is a fair enough assumption. We could allow this mechanism as part of
For this,
That seems reasonable.
A nice side effect of this is that the existence of the downloaded archive file is never checked. As a consequence, a user could have the dataset predownloaded and extracted without keeping the archive file around.
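The check-then-refetch mechanism discussed in the comments above could look roughly like the following. This is a language-neutral sketch written in Python, not the DataDeps.jl or MLDatasets implementation: `resolve_file`, `dep_dir`, and the `fetch` callback are hypothetical names, with `fetch` standing in for the download plus post-fetch step.

```python
import os
import tempfile


def resolve_file(dep_dir, filename, fetch):
    """Return the path to `filename` inside `dep_dir`, re-fetching if it is missing.

    `fetch` is a hypothetical callback standing in for the download and
    post-fetch step; it receives the target directory and must populate it.
    """
    path = os.path.join(dep_dir, filename)
    # Only the requested file is checked, never the downloaded archive, so a
    # manually extracted dataset passes even if the archive was deleted.
    if not os.path.isfile(path):
        os.makedirs(dep_dir, exist_ok=True)
        fetch(dep_dir)
    # The fetch step is a black box, so verify the result rather than trust it.
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"{filename!r} is still missing from {dep_dir!r} after fetching"
        )
    return path


if __name__ == "__main__":
    def demo_fetch(target_dir):
        # Stand-in for a real download: just create the expected file.
        with open(os.path.join(target_dir, "train.csv"), "w") as handle:
            handle.write("a,b\n1,2\n")

    with tempfile.TemporaryDirectory() as tmp:
        dep = os.path.join(tmp, "ExampleData")
        print(resolve_file(dep, "train.csv", demo_fetch))
```

Because the requested file is not assumed to be in the list of downloaded files, the sketch only fails once a fresh fetch has also failed to produce it.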
It seems that by design the package does not check if the specified folder actually contains the specified files. This seems like a missed opportunity to me. What are your thoughts on this?