Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Imported module names/function should be correctly passed to qastle. #131

Merged
merged 4 commits into from
Oct 16, 2023

Conversation

gordonwatts
Copy link
Member

@gordonwatts gordonwatts commented Oct 7, 2023

Passing through np and other module names for some functions didn't work if np had already been imported into the namespace. This fixes that and improves error messages around non-standard lambda captures.

  • ModuleType is now "captured" as just the name you use. So np and numpy are both passed down (warning!). np should no work both if or if it hasn't been imported.
  • White list for types of things that can be sent down: str, int, float, bool, complex, str, and bytes.
  • Added numpy to test install flavor so this functionality could be tested.

Fixes #127

@gordonwatts gordonwatts changed the title Add two tests to see if we can cause the problem Imported module names/function should be correctly passed to qastle. Oct 7, 2023
@codecov-commenter
Copy link

codecov-commenter commented Oct 10, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (7295efc) 96.79% compared to head (a8fc99c) 96.81%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #131      +/-   ##
==========================================
+ Coverage   96.79%   96.81%   +0.01%     
==========================================
  Files          15       15              
  Lines        1281     1286       +5     
==========================================
+ Hits         1240     1245       +5     
  Misses         41       41              
Flag Coverage Δ
unittests 96.81% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
func_adl/util_ast.py 92.33% <100.00%> (+0.15%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@masonproffitt
Copy link
Member

The problem of as_literal being called on non-literals is still there. For example, if you say np = (i for i in [1]) somewhere before the lambda, this is still going to give you ast.Constant(value=<generator object <genexpr> at [...]>), which is not a valid Python AST node. (Maybe a contrived example, but this AST would make debugging such a case much more difficult than it should be.)

@gordonwatts
Copy link
Member Author

If you have something like np * 2 in your code, that is clearly invalid and should cause a break, shouldn't it? np = xxx isn't something that appears in func-adl I think.

@masonproffitt
Copy link
Member

masonproffitt commented Oct 12, 2023

If you have something like np * 2 in your code, that is clearly invalid and should cause a break, shouldn't it?

If you have imported np as a module, sure, but np = 3; np * 2 is perfectly valid Python.

np = xxx isn't something that appears in func-adl I think.

Right, assignment can't appear within a query, but func-adl currently tries to capture arbitrary Python objects as Constants if the variable name has been defined before the query. So if I have a line like np = xxx anywhere before my query, I will get an error like cannot pickle object with no indication of how that has anything to do with my query (because the Name will have been completely removed from the AST).

My point is that the func-adl capturing mechanism should fail if the variable is anything that is not a constant (so anything other than int, float, complex, str, bytes, bool, None). It should either pass the Name unchanged (maybe with a warning) or raise an error on the spot.

@masonproffitt
Copy link
Member

Maybe this is a better example of what I'm talking about. Say I've made a typo in my query and referred to the wrong variable (var instead of var1):

var1 = 1000
query = ds.Select(lambda event: event.Jet_pt / var)

If var hasn't been defined, I'll get a nice NameError saying there's no such variable. However, if var has been defined anywhere earlier in my code as some non-constant Python object, query.value() will either result in a TypeError: cannot pickle '[...]' object error (if the object type cannot be pickled) or completely undefined behavior when the backend's AST transformer attempts to interpret the Constant node's value as a literal. I can add checking in the backend to try to guard against this (it looks like you do this in the xAOD backend), but it would better to avoid inserting a broken Constant node in the first place if we can.

@masonproffitt
Copy link
Member

As I mentioned in the issue (#127), this type checking is already done by func-adl for Python versions < 3.8, so the suggestion is just to do this for Python >= 3.8 as well.

@gordonwatts
Copy link
Member Author

Ok, @masonproffitt - I've added a whitelist for types we can capture - the error should be more informative. For example (from the test):

ValueError("Do not know how to capture data type \'bogus\' for variable \'my_var\' - only str, int, float, bool, complex, str, bytes are supported."

which happens from this test:

def test_sensible_error_with_bad_variable_capture():
    class bogus:
        def __init__(self):
            self.my_var = 10

    my_var = bogus()

    with pytest.raises(ValueError) as e:
        parse_as_ast(lambda x: x > my_var)

    assert "my_var" in str(e)

Does that get to what you are after?

@gordonwatts gordonwatts self-assigned this Oct 16, 2023
@gordonwatts gordonwatts added the bug Something isn't working label Oct 16, 2023
@gordonwatts gordonwatts merged commit dfeb858 into master Oct 16, 2023
38 checks passed
@gordonwatts gordonwatts deleted the pr_imported_modules branch October 16, 2023 23:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cannot use imported module names in lambdas
3 participants