BUG: fix an issue where missing files would be indexed without verification #3816
Conversation
For reference, I've been testing this locally with yt_astro_analysis 1.1.1 using this script (a self-contained version of OP's script):

```python
import yt
from yt.extensions.astro_analysis.halo_analysis import HaloCatalog

data_ds = yt.load("snap_000.11")
hc = HaloCatalog(data_ds=data_ds, finder_method="hop")
hc.create()
```

As I'm writing this, the script has been running on one CPU for 30 min and counting; I have no idea whether that's expected for a 500 MB dataset or if it means I'm actually stuck in an infinite loop.

edit: I used a
Force-pushed from d46f937 to c179206.
This is clearly broken. Switching to draft for now.

Force-pushed from c179206 to f9c9a79.
```python
            df = cls(
                self.dataset, self.io, template % {"num": i}, fi, (start, end)
            )
        except FileNotFoundError:
```
Note that other frontends will directly benefit from this without changes, because they call `open` or `h5py.File`, which both raise `FileNotFoundError` naturally.
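For illustration, here is why a single `except FileNotFoundError` clause covers those frontends: the built-in `open` (like `h5py.File`, which wraps the same OS-level lookup) raises `FileNotFoundError` on its own when the path does not exist, so no explicit existence check is needed. The path below is just a placeholder for a missing snapshot file.

```python
# open() raises FileNotFoundError for a nonexistent path on its own,
# so frontends calling it directly need no separate os.path.exists check.
try:
    open("/no/such/snapshot_file", "rb")
except FileNotFoundError as exc:
    print(type(exc).__name__)  # FileNotFoundError
```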
Force-pushed from a62c662 to 1c7df4e.
@matthewturk I think the codetour watch workflow is broken (and probably always was).
I don't think this is the right solution. If we are breaking out of this loop because a file is not found, then we are ignoring data files (and hence, particles) associated with this dataset. For issue #2819, it looks like the directory of the data files is getting lost somehow. I think that's the thing that needs to be fixed.
So the way I see it there are two solutions:
Personally, I would still favour the second approach because I see no harm in allowing informed users to work with partial datasets. Your call :)
I think I'm only now fully understanding the original issue, and I'm coming around to agreeing with your solution. I agree we should allow users to operate on incomplete datasets if they want to, provided they understand that that is what is happening. I'm happy with your second option of adding a warning message.
Force-pushed from 1c7df4e to cf356a9.
@brittonsmith there you go!
BUG: fix an issue where missing files would be indexed without verification (cherry picked from commit ec79686)
Backported as #3881.
Manual backport #3816 to yt-4.0.x (BUG: fix an issue where missing files would be indexed without verification)
PR Summary
I think this fixes #2819
However, I do not have a clear enough picture of what the code in the initial report is supposed to accomplish.
pinging @brittonsmith and @matthewturk for review