[ENH] Table lock: tests run with tables that are read-only by default #5381

janezd · 2021-04-02T12:55:02Z

Issue

Ref #5303.

Description of changes

Lock Table and provide a context manager that opens it.

This is a draft; it will be interesting to see situations in which it fails. Here's one fascinating find: 17d37fe.

Includes

Code changes
Tests
Documentation

janezd · 2021-05-09T11:14:06Z

@markotoplak, food for thought: some projection widgets (FreeViz, LinearProjection) output the projection matrix, that is, positions of "anchors" (projected base vectors, variables) in the projection as Table. A widget connected to this would almost always store this table as a reference, not a copy.

The user can move these "anchors", which changes the corresponding Table. The output may or may not be immediately re-sent. Even if sent immediatelly, there is a (very short) lag. Worse still, if the destination widget checks whether the new input data differs from the previousindeed changed, it won't notice it -- because it receives the same instance as before (and doesn't retain a copy of the data to check).

To avoid this problem, a widget must never send a table if it later changes it (place). In practice this means that all output tables must be explicitly constructed for output, without storing it.

markotoplak · 2021-05-14T05:54:06Z

@janezd, yes, exactly, as soon as something goes through the signal it should not be changed anymore, or we get potential subtle bugs like this one: a receiving widget's stored table is changed before a signal is emitted.

janezd · 2021-05-15T09:23:51Z

Say that some function calls eliminate_zeros on Table's X. (We may call this function, say, JaccardModel._compute_sparse.) Must it do this on a copy in order to not change the data? Or is a sparse Table not changed by merely eliminating zeros?

In practice, this should be important only if some code stores csr's internal arrays and expects they won't change, or if some code treats zeros differently than missing data.

BTW, unless I've broken something with the last changes, we're down to only 5 failing tests. Yay.

janezd · 2021-05-20T19:33:32Z

The two remaining issues are due to CatGB: its Cython code expects a writeable buffer, though it probably does not actually change it. In our code, we solve this by adding a const specifier (as in def contingency(const double[:] x, int nx, const double[:] y, int ny):). For third-party code, I implemented a Table.force_unlocked, which does what its name says, but shouldn't be used if we're not sure the table won't be modified.

So: does any of these methods (to my knowledge we support several implementations) modify the array?

codecov · 2021-05-23T12:59:10Z

Codecov Report

Merging #5381 (375e1b2) into master (2e4803d) will increase coverage by 0.01%.
The diff coverage is 90.83%.

@@            Coverage Diff             @@
##           master    #5381      +/-   ##
==========================================
+ Coverage   85.96%   85.97%   +0.01%     
==========================================
  Files         313      313              
  Lines       65471    65641     +170     
==========================================
+ Hits        56280    56435     +155     
- Misses       9191     9206      +15

janezd · 2021-05-23T15:12:36Z

This now passes the tests. What still needs to be done:

decide which version of _subarray to use (see https://github.com/biolab/orange3/pull/5381/files#diff-ac1cac15baf777a4cad890100e4bde8a562e5d63f38100fbe38a02429a29060aR2265). The one currently marked as _subarray2 doesn't create views when unnecessary. I suppose it doesn't harm.
decide about https://github.com/biolab/orange3/pull/5381/files#diff-ac1cac15baf777a4cad890100e4bde8a562e5d63f38100fbe38a02429a29060aR1601. I found this, as well as the one above, when force_unlocked failed on catboost. I later enhanced force_unlocked to unlock the base, so this change is currently probably unnecessary, but probably still useful.
improve coverage
lint
squash
add a PR that removes exceptions and show warning instead; that PR would be reverted after some time

@markotoplak, @lanzagar, could you inspect this before I continue?

irgolic · 2021-05-26T18:34:43Z

This idea's really creative, and the execution is very thorough. Good job! 👏

However, I believe #5303 could be solved simpler and easier within orange-canvas-core – after all, the problem you are solving stems from Table as a format of serialization between widgets. If widgets are linked in a single chain, it isn't wrong to alter data directly.

Throwing an exception on data modification will break a LOOOOT of code, not just within Orange, but also in scripts that use Orange as a data mining library, and other projects which use Orange Table as their backbone. And their use cases for treating Table as mutable are completely valid.

I'd rather not force people to learn to write with table.unlocked(): if not necessary, I've never seen the with keyword used like this before, do you have an example from a well-known project? I'm afraid it will raise Orange's already steep learning curve, which I tried to flatten by adding DataFrame setters. If it is unavoidable, what's the exception/warning message going to be? Please don't consider this an afterthought, the ability to teach a solution in a one-line warning message is an important characteristic in its appraisal.

janezd · 2021-05-27T19:21:59Z

I'm OK with issuing a warning, and letting it forever be a warning, not an exception. Though there's this problem that Table.X is made read-only, so some of the new exceptions are raised by numpy. We can think of a way around it.

Furthermore, I'm OK with adding a global flag whether these warnings are enabled or not. Orange canvas would set it and thus show warning (for add-ons; the core code should be clean!). Any other projects can leave the flag unset.

I believe that the problem is not related only to workflows. With any parallelism, changing the table may be problematic.

I realize that this steepens the learning curve. The current exception says that the table is read-only unless unlocked; this should make the user look into documentation.

I don't know of any uses of with that are exactly like this, but this is in line with what with is. It was intended to manage resources in such a way that something is set/opened/changed/... and at the end of the block some code that cleans this up is guaranteed to execute. (with is also presented as replacement for try/finally). One example is opening a file (and the manager from this PR opens a table :), another is mocking in unit tests. So the use of with in this context isn't a particular hack.

irgolic · 2021-05-27T19:49:26Z

Though there's this problem that Table.X is made read-only, so some of the new exceptions are raised by numpy. We can think of a way around it.

That'd be ideal.

Furthermore, I'm OK with adding a global flag whether these warnings are enabled or not.

This is a great idea :D

The current exception says that the table is read-only unless unlocked; this should make the user look into documentation.

Don't rely on users looking into documentation, it'll make most people give up instead. If this is something that almost anyone who tries using Python Script will encounter, it'd be nice if we could guide them well with an error message. Could we offer an alternative suggestion in the sense of 'you should copy the array or unlock it if you're not using parallelism'?

On that note, I'm also interested in what messages a DataFrame created from a locked numpy array will show (table.X_df).

After a little bit, there's probably going to be a StackOverflow question on the top of Google search results for 'table is read-only unless unlocked', but let's try not to rely on that.

markotoplak · 2021-05-27T21:53:46Z

If widgets are linked in a single chain, it isn't wrong to alter data directly.

@irgolic, no, it is still wrong. Imagine that a chain is modifying data, and then a user changes the parameters of a widget in the chain.

markotoplak · 2021-05-27T22:04:50Z

@irgolic, I do understand your points and agree that as long as you keep your table local, it is fine to change it.

But if the table is passed through a signal, that changes. Even without any parallelism we will get problems (because a widget in a chain can be reapplied with new settings). I am for irreversibly locking tables that passed through a signal. First as a deprecation warning, of course. That would make it easier for users - no unlocking statements necessary. :)

irgolic · 2021-05-27T22:09:12Z

no, it is still wrong. Imagine that a chain is modifying data, and then a user changes the parameters of a widget in the chain.

You're right.

I am for irreversibly locking tables that passed through a signal. First as a deprecation warning, of course. That would make it easier for users - no unlocking statements necessary.

I'm seriously against erroring on an unlocked table in Python Script. Throw a warning and briefly explain what they did wrong. Think 'best effort'.
But I'd be down for setting up our CI to fail if a table is unlocked after it's been passed through a signal?

This got me thinking about how links could visibly stale if the table they sent through is ever unlocked, and would require recomputation.

Orange/data/table.py

janezd · 2021-06-11T17:04:19Z

@markotoplak proposed a very clever solution, which I happily implemented: tables are locked within tests, but not when running Orange (or individual widgets).

Barring any bugs in implementation, locking should have no effect on runtime, but it should show any violations through failed tests. This PR takes care of core Orange, while tests for add-ons will most certainly fail after this is merged.

Implementation details: locking is regulated by Table.LOCKING flag, which is False by default. It is enabled by the three init files for tests (= in CI) and by the module that definesWidgetTest (= when testing widgets locally). Developers can manually enable it for other tests, too. If they don't, problems in their code will be exposed on CI.

Orange/base.py

Orange/widgets/data/tests/test_owmergedata.py

markotoplak · 2021-10-01T11:06:39Z

Notes to self, TODO for merge:

Make sure that table lock can be easily disabled in add-on tests. They can not be rewritten to use the unlocking decorators until a new version of Orange is released.
Decide about subarray replacement
Decide about the retain shortcut.
Check test_owmergedata.
Check Orange.data.table.
Check test_table.

markotoplak · 2021-10-05T20:06:12Z

Hm, at first I wanted to use your subarray shortcut but then test_valueFilter_string_case_insens failed. It failed in the third line:

        d = data.Table("zoo")
        with d.unlocked():
            d[d[:, "name"].metas[:, 0] == "girl", "name"] = "GIrl"

I got ValueError: assignment destination is read-only.

Let's analyze what happened: d[:, "name"] is a selection for which original metas can be perfectly reused, and is thus copied and locked. Because it is the same array, d.metas is locked too, and the assignment fails. So whenever tables share arrays, the locking mechanism can fail.

Here we saw a problem in a single thread, which we can "fix" by not doing the optimization and we get a correct behavior of not allowing to unlock [d[:, "name"] while d is still unlocked. :)

And this reminds me of race conditions: having multiple threads unlocking the same table or an array shared between tables could be an issue. Fortunately, we avoid editing tables anyway so perhaps this is not such an issue, just perhaps some badly written code could start randomly crashing if table locking was enabled. But that code was likely wrong already. So I wouldn't worry about it, or can you, @janezd, think of a problematic case?

I'd avoid the optimization and would go for un-unlockable views, so original semantics in these cases. And I would not worry about multithreading, because if someone has multiple threads editing the same table, they deserve crashes.

Finally, should we also search for all cases when tables could share arrays and try to code around them? I wouldn't worry too much about it, I'd fix it when something starts failing even though it should not.

janezd · 2021-10-07T11:04:47Z

If computation of some table('s column) is separated into two threads, which simultaneously unlock it and set the data, this would fail although it is kind of legit. But the workaround is to unlock the table before starting the threads, so I don't see it as a big problem.

Your example with metas is more disturbing because one table can share an array with another without realizing it, and this can be different to track. Maybe we should just avoid optimizations and always copy?

janezd · 2021-10-07T14:54:45Z

@markotoplak, I'm OK with all your changes. You can tidy this up by squashing them into the original two commits, if you wish.

…o locking problems

janezd force-pushed the table-lock branch from 17d37fe to 277c1f5 Compare April 9, 2021 07:54

janezd force-pushed the table-lock branch from 277c1f5 to 4849ad9 Compare April 16, 2021 09:36

janezd added the needs discussion Core developers need to discuss the issue label Apr 29, 2021

janezd force-pushed the table-lock branch from 345de6f to 339d1fd Compare May 7, 2021 09:21

janezd removed the needs discussion Core developers need to discuss the issue label May 7, 2021

janezd force-pushed the table-lock branch 3 times, most recently from d912603 to 28e51ef Compare May 8, 2021 10:38

janezd force-pushed the table-lock branch 2 times, most recently from aca50e3 to 8e50762 Compare May 15, 2021 09:20

janezd force-pushed the table-lock branch 2 times, most recently from 03ce796 to 2223f50 Compare May 15, 2021 12:33

janezd force-pushed the table-lock branch 2 times, most recently from ea1d720 to 2508651 Compare May 23, 2021 12:21

janezd force-pushed the table-lock branch 4 times, most recently from 408fa64 to 490a9b7 Compare June 11, 2021 14:03

janezd commented Jun 11, 2021

View reviewed changes

Orange/data/table.py Outdated Show resolved Hide resolved

janezd commented Jun 11, 2021

View reviewed changes

Orange/data/table.py Outdated Show resolved Hide resolved

janezd force-pushed the table-lock branch from df4a059 to e7d9d57 Compare June 11, 2021 16:24

janezd force-pushed the table-lock branch 2 times, most recently from 69dda8a to 7b736f6 Compare June 16, 2021 13:21

janezd mentioned this pull request Jun 21, 2021

Row manipulation #5496

Closed

janezd assigned markotoplak Aug 27, 2021

markotoplak reviewed Oct 1, 2021

View reviewed changes

Orange/base.py Show resolved Hide resolved

markotoplak reviewed Oct 1, 2021

View reviewed changes

Orange/base.py Show resolved Hide resolved

markotoplak reviewed Oct 1, 2021

View reviewed changes

Orange/widgets/data/tests/test_owmergedata.py Outdated Show resolved Hide resolved

markotoplak force-pushed the table-lock branch from 83a10da to dd213c1 Compare October 1, 2021 20:10

markotoplak force-pushed the table-lock branch from f134c10 to d45703c Compare October 7, 2021 14:13

janezd and others added 4 commits October 8, 2021 13:39

Table: Add locking

3a60da3

Table: Make locking optional; enable it in tests

d32bdab

Adapt curve fit tests to table locking

c87ea41

Remove suggested shortcuts that reuse tables or parts of tables due t…

55fc086

…o locking problems

markotoplak force-pushed the table-lock branch from 45d3fc5 to 55fc086 Compare October 8, 2021 11:40

SklImpute: remove seemingly unnecessary copy

375e1b2

markotoplak changed the title ~~Table lock~~ [ENH] Table lock: tests run with tables that are read-only by default Oct 8, 2021

markotoplak merged commit ec30f34 into biolab:master Oct 8, 2021

janezd mentioned this pull request Oct 8, 2021

Table has public attributes we assume won't change (but can) #5303

Closed

janezd mentioned this pull request Oct 16, 2021

Disable table locking until Orange 3.31 biolab/orange3-educational#126

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ENH] Table lock: tests run with tables that are read-only by default #5381

[ENH] Table lock: tests run with tables that are read-only by default #5381

janezd commented Apr 2, 2021

janezd commented May 9, 2021

markotoplak commented May 14, 2021

janezd commented May 15, 2021

janezd commented May 20, 2021

codecov bot commented May 23, 2021 •

edited

Loading

janezd commented May 23, 2021 •

edited by markotoplak

Loading

irgolic commented May 26, 2021 •

edited

Loading

janezd commented May 27, 2021

irgolic commented May 27, 2021

markotoplak commented May 27, 2021

markotoplak commented May 27, 2021

irgolic commented May 27, 2021 •

edited

Loading

janezd commented Jun 11, 2021

markotoplak commented Oct 1, 2021 •

edited

Loading

markotoplak commented Oct 5, 2021

janezd commented Oct 7, 2021

janezd commented Oct 7, 2021

[ENH] Table lock: tests run with tables that are read-only by default #5381

[ENH] Table lock: tests run with tables that are read-only by default #5381

Conversation

janezd commented Apr 2, 2021

Issue

Description of changes

Includes

janezd commented May 9, 2021

markotoplak commented May 14, 2021

janezd commented May 15, 2021

janezd commented May 20, 2021

codecov bot commented May 23, 2021 • edited Loading

Codecov Report

janezd commented May 23, 2021 • edited by markotoplak Loading

irgolic commented May 26, 2021 • edited Loading

janezd commented May 27, 2021

irgolic commented May 27, 2021

markotoplak commented May 27, 2021

markotoplak commented May 27, 2021

irgolic commented May 27, 2021 • edited Loading

janezd commented Jun 11, 2021

markotoplak commented Oct 1, 2021 • edited Loading

markotoplak commented Oct 5, 2021

janezd commented Oct 7, 2021

janezd commented Oct 7, 2021

codecov bot commented May 23, 2021 •

edited

Loading

janezd commented May 23, 2021 •

edited by markotoplak

Loading

irgolic commented May 26, 2021 •

edited

Loading

irgolic commented May 27, 2021 •

edited

Loading

markotoplak commented Oct 1, 2021 •

edited

Loading