Throw an error if field accesses is ambiguous. #2967

matthewturk · 2020-11-13T19:22:14Z

PR Summary

This addresses both #832 and #1467.

It instruments __setitem__ in the field info container to check if there is a possible ambiguity, which we then check against in the "guessing" portion of _get_field_info. We need to investigate the potential performance penalty for this instrumentation of __setitem__ but I hope it will be minimal.

This may also break our tests, but ideally it will not. I think it will likely break workflows where the particle type is not set, as it will throw an error for particle unions that are identical to particle types (i.e., if you just have 'nbody' particles, then 'nbody' and 'all' will be the same).
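
To make the mechanism described above concrete, here is a minimal, self-contained sketch of the idea. It is not the code in this PR: the names `FieldRegistry`, `providers`, `guess` and `AmbiguousFieldError` are hypothetical stand-ins for yt's field info container, its bookkeeping, the "guessing" step and the new exception.

```python
class AmbiguousFieldError(KeyError):
    """Hypothetical stand-in for an ambiguous-field exception."""


class FieldRegistry(dict):
    """Toy registry keyed by (ftype, fname) tuples."""

    def __init__(self):
        super().__init__()
        # fname -> set of field types that provide it
        self.providers = {}

    def __setitem__(self, key, value):
        # The instrumentation: one dict lookup and one set insertion per field
        ftype, fname = key
        self.providers.setdefault(fname, set()).add(ftype)
        super().__setitem__(key, value)

    def guess(self, fname):
        """Resolve a bare field name, refusing to guess when it is ambiguous."""
        ftypes = self.providers.get(fname, set())
        if not ftypes:
            raise KeyError(fname)
        if len(ftypes) > 1:
            raise AmbiguousFieldError(
                f"{fname!r} is provided by several field types: {sorted(ftypes)}"
            )
        (ftype,) = ftypes
        return self[ftype, fname]


registry = FieldRegistry()
registry["gas", "density"] = "gas density"
registry["index", "x"] = "cell x"
registry["gas", "x"] = "gas x"

unique = registry.guess("density")  # a single provider, so guessing is safe
try:
    registry.guess("x")
    ambiguous = False
except AmbiguousFieldError:
    ambiguous = True
```

In this toy version the extra cost per registration is small, which is the hope expressed above for the real container; the actual overhead in yt would still need to be measured.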

PR Checklist

pass black --check yt/
pass isort . --check --diff
pass flake8 yt/
pass flynt yt/ --fail-on-change --dry-run -e yt/extern
New features are documented, with docstrings and narrative docs
Adds a test for any bugs fixed. Adds tests for new features.

neutrinoceros

Thank you for doing this. I left minor comments and suggestions but as a whole I'm happy with the PR as it is :)

edit: of course my approval is a bit premature given the test failures.

yt/utilities/exceptions.py

yt/fields/field_info_container.py

Co-authored-by: Clément Robert <cr52@protonmail.com>

cphyc

I am glad this feature is finally there, this is really great!

I am fine with this being merged with or without my comments taken into account (and assuming tests pass).

yt/fields/field_info_container.py

yt/utilities/exceptions.py

cphyc · 2020-11-15T17:04:41Z

In order to magically get diffs that would fix the failing tests, you can use this script:

import os
import traceback
from difflib import unified_diff

from nose.plugins import Plugin
from nose.plugins.plugintest import run

from yt.utilities.exceptions import YTAmbiguousFieldName

# Directory containing this script (expected to be the repository root)
ROOT = os.path.dirname(os.path.abspath(__file__))

diff_file = open("diff.txt", mode="w")


class AmbiguousResolvePlugin(Plugin):
    """Plugin that takes no command-line arguments"""

    name = "ambiguous-solver"
    enabled = True

    def configure(self, options, conf):
        pass

    def options(self, parser, env=None):
        pass

    def addError(self, test, err):
        # err is a sys.exc_info()-style tuple: (type, value, traceback)
        t, v, tb = err
        if t is not YTAmbiguousFieldName:
            return

        try:
            test_path = test.context.__file__
        except AttributeError:
            # Fall back to rebuilding the path from the module name
            test_path = os.path.join(
                ROOT, test.context.__module__.replace(".", "/") + ".py"
            )

        # Keep the innermost frame that belongs to the test file
        lineno = None
        for frame, frame_lineno in traceback.walk_tb(tb):
            if frame.f_code.co_filename == test_path:
                lineno = frame_lineno - 1  # convert to a 0-based index

        if lineno is None:
            return

        # Now lineno points at the offending line, need to correct the test!
        with open(test_path) as f:
            lines = f.readlines()

        ambiguous_fname = v.fname
        suggested_ftype = v.possible_ftypes[-1]
        corrected = lines.copy()
        corrected[lineno] = corrected[lineno].replace(
            '"%s"' % ambiguous_fname,
            '("%s", "%s")' % (suggested_ftype, ambiguous_fname)
        ).replace(
            "'%s'" % ambiguous_fname,
            "('%s', '%s')" % (suggested_ftype, ambiguous_fname)
        )

        rel_path = os.path.relpath(test_path)
        diff_file.writelines(
            unified_diff(
                lines, corrected,
                fromfile=rel_path, tofile=rel_path
            )
        )
        diff_file.flush()


run(
    argv=["nosetests", "yt"],
    plugins=[AmbiguousResolvePlugin()]
)

Example

$ pwd
/path/to/yt/root
$ python nose_debug.py
[...]
$ cat diff.txt
--- /path/to/yt/root/yt/yt/data_objects/level_sets/tests/test_clump_finding.py
+++ /path/to/yt/root/yt/yt/data_objects/level_sets/tests/test_clump_finding.py
@@ -42,7 +42,7 @@
     master_clump.add_validator("min_cells", 1)
 
     def _total_volume(clump):
-        total_vol = clump.data.quantities.total_quantity(["cell_volume"]).in_units(
+        total_vol = clump.data.quantities.total_quantity([("stream", "cell_volume")]).in_units(
             "cm**3"
         )
         return "Cell Volume: %6e cm**3.", total_vol
--- /path/to/yt/root/yt/yt/data_objects/level_sets/tests/test_clump_finding.py
+++ /path/to/yt/root/yt/yt/data_objects/level_sets/tests/test_clump_finding.py
@@ -105,7 +105,7 @@
     leaf_clumps = master_clump.leaves
 
     fn = master_clump.save_as_dataset(
-        fields=["density", "x", "y", "z", "particle_mass"]
+        fields=["density", ("enzo", "x"), "y", "z", "particle_mass"]
     )
     ds2 = load(fn)
[…]

As you can see with the last bit, it is far from perfect!
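
The line-rewriting step of the script can be tried in isolation, without nose or yt. The sketch below uses a hypothetical helper name, `suggest_fix`, and made-up input lines; it applies the same quote-aware `str.replace` calls and feeds the result to `difflib.unified_diff`.

```python
from difflib import unified_diff


def suggest_fix(lines, lineno, fname, ftype):
    """Return a copy of `lines` where the bare field name on line `lineno`
    (0-based) is replaced by an explicit (ftype, fname) tuple."""
    corrected = list(lines)
    corrected[lineno] = (
        corrected[lineno]
        .replace(f'"{fname}"', f'("{ftype}", "{fname}")')
        .replace(f"'{fname}'", f"('{ftype}', '{fname}')")
    )
    return corrected


# Made-up test source, for illustration only
source = [
    "ad = ds.all_data()\n",
    'fields = ["density", "x", "y"]\n',
]
fixed = suggest_fix(source, 1, "x", "index")
diff = "".join(unified_diff(source, fixed, fromfile="test.py", tofile="test.py"))
print(diff)
```

Only the field named in the exception is rewritten, so the other bare names on the same line ("density" and "y" here) are left untouched. That is the limitation visible in the second hunk of the example output.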

cphyc · 2020-11-15T17:14:38Z

As you can see with the last bit, it is far from perfect!

Also, there is some freedom in picking which ftype to use. Here it takes the last one by default, but since the list is fed from a set, the order is arbitrary and so is the choice.
Maybe one option would be to store in the exception which field yt would have inferred, and use it as the default in the generated diff. It would also be a nice addition to inform the user what the old behavior was.
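
One possible shape for that suggestion, as a sketch only: the exception below is a hypothetical class (it is not yt's `YTAmbiguousFieldName`, and the `default_ftype` attribute is an assumption) that carries both the candidate field types and the one the old guessing logic would have picked.

```python
class AmbiguousFieldName(Exception):
    """Hypothetical exception recording what the old behavior would have done."""

    def __init__(self, fname, possible_ftypes, default_ftype):
        self.fname = fname
        # Sort so that the message and any generated diff are reproducible
        self.possible_ftypes = sorted(possible_ftypes)
        self.default_ftype = default_ftype
        super().__init__(fname, self.possible_ftypes, default_ftype)

    def __str__(self):
        candidates = ", ".join(
            repr((ftype, self.fname)) for ftype in self.possible_ftypes
        )
        return (
            f"Field name {self.fname!r} is ambiguous; candidates are {candidates}. "
            f"Previously this resolved to {(self.default_ftype, self.fname)!r}."
        )


err = AmbiguousFieldName("x", {"gas", "index", "stream"}, default_ftype="gas")
message = str(err)
```

A tool that generates diffs could then use `err.default_ftype` instead of an arbitrary element of `possible_ftypes`, so the rewritten tests keep the behavior they had before.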

Note that you'll likely need to apply the following patch (if it is indeed a fix for the many failures that popped up...)

--- a/yt/data_objects/derived_quantities.py
+++ b/yt/data_objects/derived_quantities.py
@@ -23,7 +23,7 @@ def get_position_fields(field, data):
             ftype = finfo.name[0]
         position_fields = [(ftype, f"particle_position_{d}") for d in axis_names]
     else:
-        position_fields = axis_names
+        position_fields = [("index", d) for d in axis_names]
 
     return position_fields

cphyc · 2021-01-18T17:07:31Z

What's the status of this PR?

matthewturk · 2021-01-20T14:00:39Z

Mostly: it's harder than I thought, and then I got caught up in a bunch of stuff. So, help would be greatly appreciated!


cphyc · 2021-02-24T18:16:49Z

FYI I have applied the nose trick in #3092 to catch as many ambiguous access calls as possible. Apart from removing ambiguous accesses, is there something more you want to add in this PR @matthewturk ?

jzuhone · 2021-02-25T20:53:01Z

@matthewturk do you want help with this? I can look at the errors

cphyc · 2021-02-26T14:14:58Z

@matthewturk do you want help with this? I can look at the errors

@jzuhone I've looked into the issue(s) a bit, and the problem is that most of our tests use ambiguous field accesses. For example, there are hundreds of occurrences of ad["x"], but x is provided by ("gas", "x"), ("index", "x"), ("stream", "x"), so each one will result in an error. I have come up with an automated way of fixing them (using the type inferred by yt to replace the ambiguous line), which is in #3101.

munkm

Right now I think this looks good!

In the triage meeting today we decided to add an interim step for ambiguous fields: in 4.0 we will issue a warning that the field is ambiguous, so that users can migrate their scripts. In 4.1 the error will be raised, and hopefully by then users will have migrated their scripts.
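
As a rough illustration of that two-step plan (a sketch under assumptions, not yt's code): the function `resolve`, its `strict` flag and the `AmbiguousFieldError` class below are hypothetical. With `strict=False` the ambiguity only triggers a warning and the legacy default is still returned; with `strict=True` it becomes an error.

```python
import warnings


class AmbiguousFieldError(Exception):
    """Hypothetical error raised once the transition period is over."""


def resolve(fname, candidates, default, strict=False):
    """Pick a field type for a bare field name.

    candidates: field types providing `fname`
    default: the field type the legacy guessing logic would choose
    strict: False mimics the interim "warn" step, True the later "error" step
    """
    if len(candidates) > 1:
        msg = f"{fname!r} is ambiguous, candidates: {sorted(candidates)}"
        if strict:
            raise AmbiguousFieldError(msg)
        warnings.warn(msg, stacklevel=2)
    return (default, fname)


# Interim behavior: a warning is emitted and the old default is still used
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = resolve("x", {"gas", "index"}, default="gas")
```

Keeping the old return value during the warning phase means existing scripts keep producing the same results while users switch to explicit `(ftype, fname)` tuples.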

cphyc · 2021-04-16T12:45:07Z

@yt-fido test this please

cphyc · 2021-04-16T12:46:07Z

I am expecting one unit test to fail due to the pixelization routines relying on ambiguous fields that I couldn't fix in #3101.

[EDIT:] it failed indeed!

cphyc · 2021-04-20T14:01:22Z

Note for other reviewers: I have updated the PR following @munkm's comments about throwing a warning rather than an error until 4.1.0, and added some more tests.

I prefer to be considered a contributor rather than a reviewer on this one.

neutrinoceros

Since the tests are using pytest, the test file should be added to the ignore lists in tests/tests.yaml and nose_unit.cfg

neutrinoceros

I got a few questions regarding the test but aside from that, I think this is very well done.

yt/fields/tests/test_ambiguous_fields.py

chummels

I tested this out, and it works great. I just have the requested comment above about informing the user what field was actually used.

chummels · 2021-04-20T15:55:14Z

yt/data_objects/static_output.py

+                    "is ambiguous and corresponds to any one of "
+                    f"the following field types\n {self.field_info._ambiguous_field_names[fname]}. "
+                    "Please specify the requested field as an explicit "
+                    "tuple (ftype, fname)."


I think it would be beneficial to inform the user which field it defaulted to in this pass, by including something like: "Defaulting to ('ftype', 'fname')."

cphyc

Should be good now!

yt/data_objects/static_output.py

munkm

I think this is looking great now!

Throw an error if field accesses is ambiguous.

c5bd220

neutrinoceros approved these changes Nov 15, 2020

yt/utilities/exceptions.py (outdated, resolved)

yt/fields/field_info_container.py (outdated, resolved)

matthewturk and others added 2 commits November 15, 2020 05:20

Update yt/utilities/exceptions.py

d5b1b1f

Co-authored-by: Clément Robert <cr52@protonmail.com>

Change to set and fix indentation

46171c4

matthewturk added the api-consistency, backwards incompatible, yt-4.0 and bug labels Nov 15, 2020

cphyc previously approved these changes Nov 15, 2020

yt/fields/field_info_container.py (outdated, resolved)

cphyc reviewed Nov 15, 2020

yt/utilities/exceptions.py (outdated, resolved)

cphyc mentioned this pull request Nov 16, 2020

[proof of concept] No ambiguous fields #2970

Closed

Base automatically changed from master to main January 20, 2021 15:27

cphyc and others added 2 commits February 24, 2021 17:33

Merge branch 'main' into no_ambiguous_fields

655dd2c

[pre-commit.ci] auto fixes from pre-commit.com hooks

60ad04a

for more information, see https://pre-commit.ci

cphyc mentioned this pull request Feb 26, 2021

[1/2] No ambiguous field access — Update tests & hack answer tests #3101

Merged

munkm mentioned this pull request Mar 12, 2021

Requirements for 4.0 release #3125

Closed

30 tasks

cphyc and others added 2 commits March 15, 2021 18:57

Merge branch 'main' into no_ambiguous_fields

bbe43ce

Apply suggestions from code review

10e6909

cphyc mentioned this pull request Apr 12, 2021

Suggest fixes to typos #3184

Merged

2 tasks

cphyc added a commit to cphyc/yt that referenced this pull request Apr 12, 2021

Revert changes included from yt-project#2967

4f28bcc

munkm approved these changes Apr 15, 2021


cphyc added 4 commits April 16, 2021 13:53

Merge branch 'main' into no_ambiguous_fields

84c51a8

Raise warning instad of exception for the time being

536506f

Make sure to test the behaviour

404920c

Fix logic

dbb655f

cphyc requested review from cphyc, munkm and neutrinoceros and removed request for cphyc April 20, 2021 13:55

Will throw an error in 4.1.0, not 4.0.1

572cb7c

neutrinoceros reviewed Apr 20, 2021

neutrinoceros approved these changes Apr 20, 2021

yt/fields/tests/test_ambiguous_fields.py (three resolved threads)

chummels requested changes Apr 20, 2021

cphyc added 3 commits April 20, 2021 17:00

Inform the user about the default picked

3148ce8

Do not test with nose

774319f

Move ambiguous field catching to own function

65bcea1

neutrinoceros reviewed Apr 20, 2021

yt/data_objects/static_output.py (outdated, resolved)

Make it copy-pasteable

a3bb0f7

neutrinoceros reviewed Apr 20, 2021


Rename is_ambiguous to make it obvious it is a bool

dc6460f

cphyc force-pushed the no_ambiguous_fields branch from 2ffa147 to dc6460f Compare April 20, 2021 16:28

munkm approved these changes Apr 22, 2021


chummels approved these changes Apr 23, 2021


neutrinoceros merged commit d207f80 into yt-project:main Apr 23, 2021

jzuhone mentioned this pull request Apr 23, 2021

Make ambiguous field accesses an error #1467

Closed
