
[FFI][Windows] Fix hasattr by extracting Python error type from Windows error message #4780

Merged
merged 1 commit into from Jan 27, 2020

Conversation

soiferj
Contributor

@soiferj soiferj commented Jan 27, 2020

As discussed here, hasattr is currently broken on Windows. When an Expr does not have an attribute, a TVMError is raised rather than an AttributeError. This is caused by how we extract the error type from the logged message.

On Unix, DMLC logs the stack trace when LOG(FATAL) is called; on Windows it does not. This causes the errors to look different.

Unix:

AttributeError: relay.Call object has no attributed name_hint
Stack trace:
  File "/home/jonso/dev/TVM/src/node/reflection.cc", line 109
  [bt] (0) /home/jonso/dev/TVM/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x32) [0x7f0970787a62]
  [bt] (1) /home/jonso/dev/TVM/build/libtvm.so(tvm::ReflectionVTable::GetAttr(tvm::runtime::Object*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const+0x2c5) [0x7f0970af27f5]
  [bt] (2) /home/jonso/dev/TVM/build/libtvm.so(tvm::NodeGetAttr(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)+0x15c) [0x7f0970af2c4c]
  [bt] (3) /home/jonso/dev/TVM/build/libtvm.so(std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), void (*)(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)+0x14) [0x7f0970af6f14]
  [bt] (4) /home/jonso/dev/TVM/build/libtvm.so(TVMFuncCall+0x61) [0x7f0970fb68a1]

Windows:

[11:19:42] D:\_work\1\s\TVM\src\node\reflection.cc:109: AttributeError: relay.Call object has no attributed name_hint

This change updates the error-string parsing logic in base.py for Windows, using the generic format of this message to extract the error type.
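For illustration, the Windows-side extraction can be sketched as below. This is a simplified stand-alone version, not the actual base.py code, and the identifier check is only a stand-in for TVM's real error-name validation.

```python
# Sketch: pull the Python error type out of a Windows-format DMLC message,
# e.g. "[11:19:42] D:\path\reflection.cc:109: AttributeError: <message>".
def extract_error_type(line):
    # The error type sits between the last two ":" separators:
    # "...reflection.cc:109: AttributeError: <message>"
    end_pos = line.rfind(":")
    if end_pos == -1:
        return None
    start_pos = line.rfind(":", 0, end_pos)
    err_name = line[start_pos + 1:end_pos].strip()
    # Stand-in validity check; the real helper compares against known
    # Python error-type names.
    return err_name if err_name.isidentifier() else None
```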

@jmorrill @tqchen would you be able to take a look?

@soiferj soiferj changed the title [FFI][Windows] Extract Python error type from Windows error log [FFI][Windows] Fix hasattr by extracting Python error type from Windows error message Jan 27, 2020
@tqchen tqchen merged commit f71a10c into apache:master Jan 27, 2020
@tqchen
Member

tqchen commented Jan 27, 2020

Thanks @jonso4 !

@soiferj soiferj deleted the windows_error branch January 27, 2020 22:59
@jmorrill
Contributor

@soiferj sorry, I'm late to the party. Getting over the flu.

Anyways, I think there might be a problem.

In my test, the line variable is "AttributeError: relay.Op object has no attributed attrs".
On line 206, start_pos comes back as -1, which causes a return of None.

My issue is fixed if the body of _find_error_type is replaced with:

        end_pos = line.find(":")
        if end_pos == -1:
            return None
        err_name = line[:end_pos]
        if _valid_error_name(err_name):
            return err_name
        return None
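For what it's worth, the failure is easy to reproduce in isolation (a minimal sketch of just the colon lookup, not the base.py code):

```python
# With only a single ":" in the line, the search for an earlier colon
# before end_pos finds nothing and returns -1, so the current code
# bails out with None.
line = "AttributeError: relay.Op object has no attributed attrs"
end_pos = line.rfind(":")                # index of the only colon
start_pos = line.rfind(":", 0, end_pos)  # no earlier colon -> -1
```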

I'd attempt to fix, but I don't want to break the case you are dealing with.

@soiferj
Contributor Author

soiferj commented Jan 29, 2020

@jmorrill, I haven't seen an error formatted like that. Can you print out the full error string in the function that calls into this one? In all of my tests, the message begins with a timestamp, which contains a few ":" characters.

If this does turn out to be an issue, I think we should handle both cases. If start_pos is -1, we should grab line[:end_pos]; otherwise, grab line[start_pos + 1:end_pos]. What do you think?

@jmorrill
Contributor

I added: print("line var is: ", line)
Output is: line var is: AttributeError: relay.Op object has no attributed attrs

This happens when I run autotvm, coming down the line from extract_from_program I believe.

Here is the stack

Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Users\jeremiah.morrill\AppData\Local\Programs\Python\Python37\lib\threading.py", line 926, in _bootstrap_inner
    self.run()
  File "C:\Users\jeremiah.morrill\AppData\Local\Programs\Python\Python37\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\src\mediatubes\ml\tensortubes\src\third_party\tvm\.site-packages\python\tvm\autotvm\task\relay_integration.py", line 54, in _lower
    compiler.lower(mod, target=target)
  File "C:\src\mediatubes\ml\tensortubes\src\third_party\tvm\.site-packages\python\tvm\relay\backend\vm.py", line 455, in lower
    self._lower(mod, target, target_host)
  File "C:\src\mediatubes\ml\tensortubes\src\third_party\tvm\.site-packages\python\tvm\_ffi\_ctypes\function.py", line 207, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  File "C:\src\mediatubes\ml\tensortubes\src\third_party\tvm\.site-packages\python\tvm\_ffi\_ctypes\function.py", line 72, in cfun
    rv = local_pyfunc(*pyargs)
  File "C:\src\mediatubes\ml\tensortubes\src\third_party\tvm\.site-packages\python\tvm\relay\transform.py", line 899, in _pass_func
    return inst.transform_function(func, mod, ctx)
  File "C:\src\mediatubes\ml\tensortubes\src\third_party\tvm\.site-packages\python\tvm\relay\memory_alloc.py", line 284, in transform_function
    func = ea.visit(func)
  File "C:\src\mediatubes\ml\tensortubes\src\third_party\tvm\.site-packages\python\tvm\relay\expr_functor.py", line 43, in visit
    res = self.visit_function(expr)
  File "C:\src\mediatubes\ml\tensortubes\src\third_party\tvm\.site-packages\python\tvm\relay\expr_functor.py", line 199, in visit_function
    new_body = self.visit(fn.body)
  File "C:\src\mediatubes\ml\tensortubes\src\third_party\tvm\.site-packages\python\tvm\relay\expr_functor.py", line 47, in visit
    res = self.visit_let(expr)
  File "C:\src\mediatubes\ml\tensortubes\src\third_party\tvm\.site-packages\python\tvm\relay\memory_alloc.py", line 159, in visit_let
    new_val = self.visit(let.value)
  File "C:\src\mediatubes\ml\tensortubes\src\third_party\tvm\.site-packages\python\tvm\relay\expr_functor.py", line 45, in visit
    res = self.visit_call(expr)
  File "C:\src\mediatubes\ml\tensortubes\src\third_party\tvm\.site-packages\python\tvm\relay\memory_alloc.py", line 170, in visit_call
    if is_primitive(call):
  File "C:\src\mediatubes\ml\tensortubes\src\third_party\tvm\.site-packages\python\tvm\relay\memory_alloc.py", line 31, in is_primitive
    return hasattr(call.op, 'attrs') and hasattr(call.op.attrs, 'Primitive') and \
  File "C:\src\mediatubes\ml\tensortubes\src\third_party\tvm\.site-packages\python\tvm\_ffi\object.py", line 65, in __getattr__
    return _api_internal._NodeGetAttr(self, name)
  File "C:\src\mediatubes\ml\tensortubes\src\third_party\tvm\.site-packages\python\tvm\_ffi\_ctypes\function.py", line 207, in __call__
    raise get_last_ffi_error()
  File "C:\src\mediatubes\ml\tensortubes\src\third_party\tvm\src\node\reflection.cc", line 113
AttributeError: relay.Op object has no attributed attrs

Based on your suggestion, I threw this together, which seems to work with my use.

def _find_error_type(line):
    """Find the error name given the first line of the error message.

    Parameters
    ----------
    line : str
        The first line of error message.

    Returns
    -------
    name : str
        The error name
    """
    if sys.platform == "win32":
        # Stack traces aren't logged on Windows due to a DMLC limitation,
        # so we should try to get the underlying error another way.
        # DMLC formats errors "[timestamp] file:line: ErrorMessage"
        # ErrorMessage is usually formatted "ErrorType: message"
        # We can try to extract the error type using the final ":"
        end_pos = line.rfind(":")
        if end_pos == -1:
            return None
        start_pos = line.rfind(":", 0, end_pos)
        if start_pos == -1:
            err_name = line[:end_pos]
            if _valid_error_name(err_name):
                return err_name
            else:
                return None
        err_name = line[start_pos + 1 : end_pos].strip()
        if _valid_error_name(err_name):
            return err_name
        return None
    else:
        end_pos = line.find(":")
        if end_pos == -1:
            return None
        err_name = line[:end_pos]
        if _valid_error_name(err_name):
            return err_name
        return None

@soiferj
Contributor Author

soiferj commented Jan 29, 2020

Oh interesting - your branch is throwing, while mine is just calling LOG(FATAL) in reflection.cc. Do you know why yours is throwing?

Update: you're going through a different codepath (AutoTVM). Let's update the logic as below.

In any case, I think we can condense this:

if start_pos == -1:
    err_name = line[:end_pos]
    if _valid_error_name(err_name):
        return err_name
    else:
        return None
err_name = line[start_pos + 1 : end_pos].strip()
if _valid_error_name(err_name):
    return err_name
return None

into this:

if start_pos == -1:
    err_name = line[:end_pos].strip()
else:
    err_name = line[start_pos + 1 : end_pos].strip()
if _valid_error_name(err_name):
    return err_name
return None

What do you think? Would you be able to send a PR?

@jmorrill
Contributor

Yes, much cleaner, and it works on my PC :)
Submitted PR
#4785

alexwong pushed a commit to alexwong/tvm that referenced this pull request Feb 26, 2020
Co-authored-by: Jon Soifer <jonso@microsoft.com>
zhiics pushed a commit to neo-ai/tvm that referenced this pull request Mar 2, 2020
Co-authored-by: Jon Soifer <jonso@microsoft.com>