-
Notifications
You must be signed in to change notification settings - Fork 39
Attaching new columns #116
Comments
A view of a Table is like a lazy filter—the reason that selections on Tables are lazy is because some of the fields might not even be in memory. Generally, all awkward operations touch their contents as little as possible so that you don't end up loading a thousand columns of data just to filter then when you only cared about the filtered versions of two or three of them. Okay, but not that leads to this problem: a filtered Table consists of the full Table and a selection—say the fill Table has a million rows and the filtered version would have a thousand—and to assign to it, you give it an array that has a thousand elements. What does it put in for those other elements? You never see them, but it has to match the size of the other columns, which haven't been evaluated with the filter. Option 1: attempts to assign could cause all other columns to get filtered, but that would break the pattern, could be prohibitive to load into memory, and would violate the profile of least surprise. Option 2: the new column could be turned into an IndexedMaskedArray, which could be surprising because it's a different data type from what you'd get off you assigned to a non-view Table. Option 3: turning it into a SparseArray would also solve it. A SparseArray is a different class—could be surprising—but it's the same high-level data type. Type-safe functions wouldn't be telling you, "You have to check for the null case." SparseArray unfortunately has a log(n) All of that is (possibly me talking to myself) ways to extend the code to allow assignments to views. To solve your particular problem right now, you need a way to strictly evaluate the view, turning it into a non-view Table. You could do this: not_a_view = Table()
for name in original_view.columns:
not_a_view[name] = original_view[name] because the If the Table is actually a jagged table, you don't want to put jagged arrays into a Table, you want a new jagged table (jaggedness on the outside). If so, do this: not_a_view = JaggedArray.zip({n: original_view[n] for n in original_view.columns}) |
Actually, what I ought to do is Option 4: wrap up the code examples I have you as a new function called When that's implemented, the warning about assigning to a view would say, "Call compact() first." For now, use one of those two recipes. |
I am worried this is a sign of an anti-pattern: making selections, then calculating derived values. |
If the motivation for selecting before computing derived quantities is to save time by avoiding computations on irrelevant events, then it could be premature optimization, or transcribing the imperative code too closely. Then again, it might be mature optimization (!), or it might be to avoid impossible situations (e.g. use the first particle of an empty event). In any of those cases, MaskedArrays should help. Instead of changing the number of events in an array, MaskedArrays replace numerical elements with The only problem with that is that the MaskedArray part of the ecosystem isn't as well fleshed out. We have MaskedArray constructors, but no convenience methods to build them from other array types. Which, I suppose, is a call to arms, an indication that the MaskedArrays really are useful, after JaggedArrays and Tables... |
Thanks for the quick reply. The whole discussion is quite interesting. Indeed the situation is part of both. On one side I am still thinking in imperative form (thanks for pointing that out) and trying to mask at each step, partly is exactly to avoid impossible situations while still keeping the whole data consistent in a Table without having to carry around too many variables. I will try with your suggestions and come back to you in case I have further questions, and I will :). |
Hi,
when I try to attach a new column to a table that is the result of a cut I get the following error:
The error very clear and so is the cause, but I do not get the reason for such limitation, and if there is a way to circumvent it.
Thanks!
The text was updated successfully, but these errors were encountered: