-
Notifications
You must be signed in to change notification settings - Fork 280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Throw an error if field accesses is ambiguous. #2967
Throw an error if field accesses is ambiguous. #2967
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for doing this. I left minor comments and suggestions but as a whole I'm happy with the PR as it is :)
edit: of course my approval is a bit premature given the test failures.
Co-authored-by: Clément Robert <cr52@protonmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am glad this feature is finally there, this is really great!
I am fine with this being merged with or without my comments taken into account (and assuming tests pass).
In order to magically get diffs that would fix the failing tests, you can use this script: import sys
import os
from difflib import unified_diff
from nose.plugins import Plugin
from yt.utilities.exceptions import YTAmbiguousFieldName
from nose.plugins.plugintest import run
ROOT = os.path.abspath(__file__)
diff_file = open("diff.txt", mode="w")
class AmbiguousResolvePlugin(Plugin):
"""Plugin that takes no command-line arguments"""
name = "ambiguous-solver"
enabled = True
def configure(self, options, conf):
pass
def options(self, parser, env={}):
pass
def addError(self, test, err: sys.exc_info):
try:
test_path = test.context.__file__
except:
test_path = os.path.join(
ROOT,
test.context.__module__.replace(".", "/")
)
t, v, tb = err
if t is not YTAmbiguousFieldName:
return
import traceback
# print('TYPE:', t)
# print('VALUE:', v)
# print('TRACEBACK:', tb)
ambiguous_fname = v.fname
ok = False
for ft, _ in traceback.walk_tb(tb):
line = open(ft.f_code.co_filename, "r").readlines()[ft.f_lineno].replace("\n", "")
# print(f"{ft.f_code.co_filename}:{ft.f_code.co_firstlineno} {line}")
if test_path == ft.f_code.co_filename:
ft_err = ft
ok = True
if not ok:
return
# Now, ft contains the current frame,
# need to correct the test!
# print(f"error from {ft_err.f_code.co_filename}:{ft_err.f_code.co_firstlineno}:{ft_err.f_lineno}")
with open(test_path, 'r') as f:
lines = f.readlines()
lineno = ft_err.f_lineno - 1
suggested_ftype = v.possible_ftypes[-1]
corrected = lines.copy()
corrected[lineno] = corrected[lineno].replace(
'"%s"' % ambiguous_fname,
'("%s", "%s")' % (suggested_ftype, ambiguous_fname)
).replace(
"'%s'" % ambiguous_fname,
"('%s', '%s')" % (suggested_ftype, ambiguous_fname)
)
rel_path = os.path.relpath(test_path)
diff_file.writelines(
unified_diff(
lines, corrected,
fromfile=rel_path, tofile=rel_path
)
)
diff_file.flush()
run(
argv=["nosetests", "yt"],
plugins=[AmbiguousResolvePlugin()]
) Example $ pwd
/path/to/yt/root
$ python nose_debug.py
[...]
$ cat diffs.txt
--- /path/to/yt/root/yt/yt/data_objects/level_sets/tests/test_clump_finding.py
+++ /path/to/yt/root/yt/yt/data_objects/level_sets/tests/test_clump_finding.py
@@ -42,7 +42,7 @@
master_clump.add_validator("min_cells", 1)
def _total_volume(clump):
- total_vol = clump.data.quantities.total_quantity(["cell_volume"]).in_units(
+ total_vol = clump.data.quantities.total_quantity([("stream", "cell_volume")]).in_units(
"cm**3"
)
return "Cell Volume: %6e cm**3.", total_vol
--- /path/to/yt/root/yt/yt/data_objects/level_sets/tests/test_clump_finding.py
+++ /path/to/yt/root/yt/yt/data_objects/level_sets/tests/test_clump_finding.py
@@ -105,7 +105,7 @@
leaf_clumps = master_clump.leaves
fn = master_clump.save_as_dataset(
- fields=["density", "x", "y", "z", "particle_mass"]
+ fields=["density", ("enzo", "x"), "y", "z", "particle_mass"]
)
ds2 = load(fn)
[…] As you can see with the last bit, it is far from perfect! |
Also, there is some freedom to pick which Note that you'll likely need to apply the following patch (if it is indeed a fix for the many fails that popped...) --- a/yt/data_objects/derived_quantities.py
+++ b/yt/data_objects/derived_quantities.py
@@ -23,7 +23,7 @@ def get_position_fields(field, data):
ftype = finfo.name[0]
position_fields = [(ftype, f"particle_position_{d}") for d in axis_names]
else:
- position_fields = axis_names
+ position_fields = [("index", d) for d in axis_names]
return position_fields |
What's the status of this PR? |
Mostly: it's harder than I thought, and then I got caught up in a bunch of
stuff. So, help would be greatly appreciated!
…On Mon, Jan 18, 2021 at 11:07 AM Corentin Cadiou ***@***.***> wrote:
What's the status of this PR?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#2967 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAVXO422KY5SUNLVAOCNGLS2RTGJANCNFSM4TU6YX6A>
.
|
FYI I have applied the nose trick in #3092 to catch as many ambiguous access calls as possible. Apart from removing ambiguous accesses, is there something more you want to add in this PR @matthewturk ? |
@matthewturk do you want help with this? I can look at the errors |
@jzuhone I've investigated a bit the issue(s) and the problem is that most of our tests use ambiguous field accesses. For example there are hundreds of occurences of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now I think this looks good!
In the triage meeting today we decided to add an interim step for the ambiguous fields that for 4.0 we will issue a warning about the field being ambiguous to allow users to migrate scripts. In 4.1 the error will start and hopefully by then users will have shifted over scripts.
@yt-fido test this please |
I am expected one unit test to fail due to the pixelization routines relying on ambiguous fields that I couldn't fix in #3101. [EDIT:] it failed indeed! |
Note for other reviewers: I have updated the PR following @munkm comments about throwing a warning rather than an error until 4.1.0 and added some more tests. |
I prefer to be considered a contributor than a reviewer on this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the tests are using pytest, the test file should be added to the ignore lists in tests/tests.yaml and nose_unit.cfg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got a few questions regarding the test but aside from that, I think this is very well done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this out, and it works great. I just have the requested comment above about informing the user what field was actually used.
yt/data_objects/static_output.py
Outdated
"is ambiguous and corresponds to any one of " | ||
f"the following field types\n {self.field_info._ambiguous_field_names[fname]}. " | ||
"Please specify the requested field as an explicit " | ||
"tuple (ftype, fname)." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be beneficial to inform the user which field it defaulted to in this pass, by including something like: "Defaulting to ('ftype', 'fname')."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be good now!
2ffa147
to
dc6460f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is looking great now!
PR Summary
This addresses both #832 and #1467.
It instruments
__setitem__
in the field info container to check if there is a possible ambiguity, which we then check against in the "guessing" portion of_get_field_info
. We need to investigate the potential performance penalty for this instrumentation of__setitem__
but I hope it will be minimal.This may also break our tests, but ideally it will not. I think it will likely break workflows where the particle type is not set, as it will throw an error for particle unions that are identical to particle types (i.e, if you just have 'nbody' particles, then 'nbody' and 'all' will be the same.)
PR Checklist
black --check yt/
isort . --check --diff
flake8 yt/
flynt yt/ --fail-on-change --dry-run -e yt/extern