Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Get errors for unhashable or mismatched types #7502

Merged
merged 5 commits into from Apr 10, 2019

Conversation

Projects
None yet
4 participants
@stuhood
Copy link
Member

commented Apr 4, 2019

Problem

The error for unhashable types used as Get params is unfortunate, because the call to identify fails and returns a null pointer to rust, which triggers an error... somewhere else. See #7499.

Solution

As suggested in #7499, have extern_generator_send directly identify Get params, meaning that failures to hash take advantage of the fact that extern_generator_send is already fallible. This avoids a significant number of calls back to python to identify Values (an optimization discussed in #7318 and #7114).

Result

Improved usability. Fixes #7499.

@stuhood

This comment has been minimized.

Copy link
Member Author

commented Apr 4, 2019

Commits are useful to review independently.

Show resolved Hide resolved src/rust/engine/src/externs.rs Outdated
Show resolved Hide resolved src/rust/engine/src/rule_graph.rs Outdated
Show resolved Hide resolved src/rust/engine/src/rule_graph.rs Outdated
@@ -195,7 +195,7 @@ def test_union_rules(self):
self.assertTrue(isinstance(a, A))
# Fails due to no union relationship from A -> UnionBase.
expected_msg = """\
Exception: WithDeps(Inner(InnerEntry { params: {UnionWrapper}, rule: Task(Task { product: A, clause: [Select { product: UnionWrapper }], gets: [Get { product: A, subject: UnionA }, Get { product: A, subject: UnionB }], func: a_union_test(), cacheable: true }) })) did not declare a dependency on JustGet(Get { product: A, subject: A })
Exception: (A, [UnionWrapper], [Get(A, UnionA), Get(A, UnionB)], a_union_test()) for UnionWrapper did not declare a dependency on Get(A, A)

This comment has been minimized.

Copy link
@illicitonion

illicitonion Apr 5, 2019

Contributor

I find this less informative than the previous error. What are all of these comma-separated parts of a string? As a rule-author, I don't want to have to dig into the rust code in Pants to work out what these things all are... (The old message wasn't great either, but at least it had keys associated with each value...)

This comment has been minimized.

Copy link
@stuhood

stuhood Apr 5, 2019

Author Member

So, this is currently the exact output for rule-graph visualization and tracing, but they should clearly evolve. This change mostly just got us into alignment between an error case and those other cases.

If we think it is in scope for this change, I can look at it here... it is mostly a python signature, and we could roughly shape it that way. The only issue is the Gets and other used Params. A strawperson refactoring of this example that doesn't lose any information might be:

def a_union_test(UnionWrapper) -> A { for UnionWrapper; Get(A, UnionA), Get(A, UnionB) }

or

def a_union_test -> A { for UnionWrapper; Get(A, UnionA), Get(A, UnionB) }

This comment has been minimized.

Copy link
@stuhood

stuhood Apr 5, 2019

Author Member

... I'm going to open a ticket to discuss this.

This comment has been minimized.

Copy link
@stuhood

This comment has been minimized.

Copy link
@stuhood

stuhood Apr 6, 2019

Author Member

But, in the meantime, I'd like to merge this output as-is in order to make it obvious which cases exist, and where (based on these tests now actually using Display).

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Apr 6, 2019

Contributor

I really like the stack trace in particular with rust type names as it lets me reconstruct the rust part in my head much more easily, which is useful for specifically an error trace. This seems like a large UI change wrapped in another functional change. I would really prefer that we solve the output problem in a followup PR addressing #7509 and considering the difference between what we want to display in stack traces vs what we want to display in pretty graphs, while scoping this one to the identify() and related changes.

This comment has been minimized.

Copy link
@stuhood

stuhood Apr 6, 2019

Author Member

I've dropped these changes from the PR and will wait to see what happens with #7509.

Show resolved Hide resolved src/rust/engine/src/externs.rs Outdated
@benjyw

benjyw approved these changes Apr 5, 2019

Copy link
Contributor

left a comment

Looks good as far as my limited understanding of this code goes.

@stuhood stuhood force-pushed the twitter:stuhood/improve-get-errors branch 2 times, most recently from 8cd695e to c5a9244 Apr 6, 2019

@stuhood stuhood force-pushed the twitter:stuhood/improve-get-errors branch from c5a9244 to b44073c Apr 6, 2019

@cosmicexplorer
Copy link
Contributor

left a comment

Some thoughts on comments to add in places mostly. Being able to use rust enums more naturally is very exciting!

TypeId(c.to_id(res.product)),
c.to_value(res.subject),
c.identify(res.subject),
)

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Apr 7, 2019

Contributor

This is a fantastic change. Is this relying on mostly the rust C representation of enums (with tag and (in this case) get fields), or is there some cffi magic going on to allow addressing unions in this deeply ergonomic way? Is there a link we could add documenting why this works to access our rust enums?

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Apr 7, 2019

Contributor

In this case, I actually think we don't want to do this, as we probably want to colocate parsing the python response and producing the FFI object that represents it (so the below code is needless complexity for now). But I wrote this comment out anyway so I can link to it in #7367:

Could this part be moved into a helper method? Something like:

    @memoized_property
    def _generator_response_tags_enum(self):
       class _Tags(enum([self._lib.Get, self._lib.GetMulti, self._lib.Broke, self._lib.Throw])): pass
       return _Tags

    def _set_generator_response(self, response, c, input_res, tag):
      attr, value = self._generator_response_tags_enum(tag).resolve_for_enum_variant({
        self._lib.Get: lambda: ('get', (
            TypeId(c.to_id(input_res.product)),
            c.to_value(input_res.subject),
            c.identify(input_res.subject),
          ),
        # ...
        response.tag = tag
        setattr(response, attr, value)
      })

    def extern_generator_send(...):
      # ...
              if isinstance(res, Get):
                self._set_generator_response(response, c, res, self._lib.Get)
              # ...
      return response[0]

This comment has been minimized.

Copy link
@stuhood

stuhood Apr 10, 2019

Author Member

This is a fantastic change. Is this relying on mostly the rust C representation of enums (with tag and (in this case) get fields), or is there some cffi magic going on to allow addressing unions in this deeply ergonomic way? Is there a link we could add documenting why this works to access our rust enums?

This is mostly relying on the C API that is generated by cbindgen. Kudos on integrating that.

ffi = getattr(self._ffi_module, 'ffi')
_FFISpecification(ffi).register_cffi_externs()
return ffi
return getattr(self._ffi_module, 'ffi')

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Apr 7, 2019

Contributor

Considering that register_cffi_externs() modifies the ffi object and not the lib object, I would think we would want to keep that call in this method to avoid spooky mutation of the ffi object happening whenever we first happen to access the lib object.

This comment has been minimized.

Copy link
@stuhood

stuhood Apr 10, 2019

Author Member

Unfortunately, constructing the _FFISpecification inside of self.ffi triggered infinite recursion when I attempted to feed it self.lib (which required self.ffi, etc, etc).

GetMulti(TypeIdBuffer, HandleBuffer, IdentBuffer),
// NB: Broke not Break because C keyword.
Broke(Handle),
Throw(Handle),

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Apr 7, 2019

Contributor

Much better!! Thank you for taking the time to figure out how to do this with cffi, it makes the code leagues more readable. It also obviates the need for that TODO without additional complexity in building the engine!

with_vec(self.ptr(), self.len() as usize, |vec| Self::lift(&vec[0]))
}
}

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Apr 7, 2019

Contributor

This is all really great! I really like that we can manipulate python collections in this structured, efficient, and safe way.

"Expected exactly 1 item in Buffer, but had: {}",
self.len()
);
with_vec(self.ptr(), self.len() as usize, |vec| Self::lift(&vec[0]))

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Apr 7, 2019

Contributor

Any reason this doesn't delegate to .to_vec()? Like:

  fn unwrap_one(&self) -> Output {
    assert_equal!(self.len(), 1, "Expected exactly 1 item in Buffer, but had: {}", self.len());
    self.to_vec()[0]
  }

This comment has been minimized.

Copy link
@stuhood

stuhood Apr 10, 2019

Author Member

with_vec doesn't actually "create" a vec: instead, it "views" some memory as a Vec via:

unsafe { Vec::from_raw_parts(c_ptr, c_len, c_len) }

That means that by avoiding calling to_vec(), we avoid actually copying the buffer containing the one item.

idents_ptr: *mut Ident,
idents_len: u64,
// A Handle to hold the underlying array alive.
handle_: Handle,

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Apr 7, 2019

Contributor

I know this part isn't new, but could we add a comment on one of these fields explaining why handle_ needs to exist? Is it so python doesn't gc the pointer?

This comment has been minimized.

Copy link
@stuhood

stuhood Apr 10, 2019

Author Member

That's correct. Since CI is green here, I'm going to go ahead and merge this one, but can revisit.

///
/// The response from a call to extern_generator_send. Gets include Idents for their Handles
/// in order to avoid roundtripping to intern them, and to eagerly trigger errors for unhashable
/// types on the python side where possible.

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Apr 7, 2019

Contributor

Good explanation.

@@ -213,6 +222,12 @@ def test_get_type_match_failure(self):
# `a_typecheck_fail_test` above expects `wrapper.inner` to be a `B`.
self.scheduler.product_request(A, [Params(TypeCheckFailWrapper(A()))])

def test_unhashable_failure(self):
"""Test that unhashable Get(...) params result in a structured error."""
expected_msg = """TypeError: unhashable type: 'list'"""

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Apr 7, 2019

Contributor

I would like for this message to check more specifically which object has the unhashable type, but I can't see how to do that easily, so this is definitely specific enough for now.

@cosmicexplorer cosmicexplorer referenced this pull request Apr 7, 2019

Open

share enum definitions across rust and python #7367

0 of 3 tasks complete
@illicitonion
Copy link
Contributor

left a comment

Looks great, thanks!

@stuhood stuhood merged commit d0c670f into pantsbuild:master Apr 10, 2019

1 check was pending

continuous-integration/travis-ci/pr The Travis CI build is in progress
Details

@stuhood stuhood deleted the twitter:stuhood/improve-get-errors branch Apr 10, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.