Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash (stack overflow) with self-referential column expressions #350

Closed
chocolateboy opened this issue Sep 6, 2019 · 2 comments
Closed
Labels

Comments

@chocolateboy
Copy link
Contributor

chocolateboy commented Sep 6, 2019

  • OS: Linux (Arch)
  • VisiData: v1.5.2 and v2.-1dev
  • Python: 3.7.4

Not sure if this is obvious, expected, a bug, or something I've overlooked in the documentation, but it's possible to crash VisiData (and python) if a column expression refers to its own column name, something which it's easy to do inadvertently/indirectly when columns are renamed.

In the attached test case, I have a bunch of TV episodes I want to assign a short season/episode identifier to, e.g. Season 6, Episode 7 -> "06x07". The generated column name is ugly and I have no particular use for its constituent columns after combining them so I name the newly-created column episode (and possibly hide the old season and episode columns).

What happens next varies. Sometimes nothing happens (i.e. no error). Sometimes I get an immediate error (red exclamation marks in the new episode column's cells). Sometimes I trigger the error if I re-order the columns (e.g. moving the new episode column before the old one) and clear it if I move the column back. And sometimes VisiData crashes (with a mangled python stack trace) and python crashes ("core dumped") if I sort an unrelated column (as demonstrated in the attached test case).

In the error case, I get the core dump if I inspect any of the flagged cells with z Ctrl-E.

In case anyone else encounters the same issue, the stack trace of the crashed python process includes the following:

_PyBytes_Resize (in /usr/lib/libpython3.7m.so.1.0)
??? (in /usr/lib/libpython3.7m.so.1.0)
_PyMethodDef_RawFastCallDict (in /usr/lib/libpython3.7m.so.1.0)
_PyCFunction_FastCallDict (in /usr/lib/libpython3.7m.so.1.0)
??? (in /usr/lib/libpython3.7m.so.1.0)
_PyMethodDef_RawFastCallDict (in /usr/lib/libpython3.7m.so.1.0)
_PyCFunction_FastCallDict (in /usr/lib/libpython3.7m.so.1.0)
_PyObject_CallMethodId_SizeT (in /usr/lib/libpython3.7m.so.1.0)
??? (in /usr/lib/libpython3.7m.so.1.0)
_PyMethodDef_RawFastCallKeywords (in /usr/lib/libpython3.7m.so.1.0)
_PyMethodDescr_FastCallKeywords (in /usr/lib/libpython3.7m.so.1.0)
@saulpw
Copy link
Owner

saulpw commented Sep 7, 2019

Yeah that's a good one, @chocolateboy. I figured that using the first one in the columns list seemed reasonable, and it seemed like a cell-error was appropriate if the column itself was first and thus self-referential. But this should not result in a significant performance penalty, much less an actual crash. I think I'm seeing the crash during replay of your test case, but it doesn't seem to happen interactively. I'll have to investigate.

saulpw added a commit that referenced this issue Sep 30, 2019
- [api] rename cached_property to lazy_property
- [api] rename LazyMap to LazyChainMap and LazyMapRow to LazyComputeRow (move to sheets.py)
- [extensible api] add new cached_property, which caches until clear_all_caches, which clears all cached_propertys
- [api] add @drawcache and @drawcache_property
- _eval_contexts cached by (sheet, rowid) during draw cycle
- add test case for #350
- [LazyComputeRow-] only catch recursion with usedcols
@saulpw
Copy link
Owner

saulpw commented Sep 30, 2019

This was great, @chocolateboy, thanks. We spent the weekend combing through the expr eval code, took notes for the book, fixed up the api, and VisiData now catches recursive expressions before the interpreter does.

With the above commit:

  • VisiData should not crash like this anymore.
  • For an identifier, the first so-named column in the columns list (shown or hidden) is used (this was also the previous behavior).
  • In your repro, the self-referential column starts out showing an error, but if you slide it to the right past the other episode, then it resolves properly (errors are not cached).
  • but values are cached, so sliding it back keeps the value; to clear the cache and see the error again, use cache-col (z').

The error message for your specific case is not great, but I'm not sure what to do about that. "TypedExceptionWrapper" is not very informative, but in this case it means you're using an error value from another column as a value in the expression.

@saulpw saulpw closed this as completed Sep 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants