Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Algorithm Nodes should NOT cache outputs by default #372

Closed
mpu-creare opened this issue Feb 18, 2020 · 1 comment
Closed

Algorithm Nodes should NOT cache outputs by default #372

mpu-creare opened this issue Feb 18, 2020 · 1 comment
Assignees
Labels
bug Something isn't working

Comments

@mpu-creare
Copy link
Contributor

Description
This is a big gotcha for new users. When developing a new algorithm node, if they (1) evaluate the results, (2) then change the code, and (3) evaluate the results again, nothing will change because the output is retrieved from the cache.

Steps to Reproduce

class MyAlg(Algorithm):
     data = NodeTrait()
     def algorithm(self, inputs):
         return inputs['data']
my_alg = MyAlg()
o = my_alg.eval(coords)

class MyAlg(Algorithm):
     data = NodeTrait()
     def algorithm(self, inputs):
         return inputs['data'] * 0 
my_alg = MyAlg()
o2 = my_alg.eval(coords)

assert np.all(o2.data != o.data)

This throws an assertion error .

Expected Behavior
I would expect o2 to be all zeros.

Observed Behavior
o2 == o, since it comes from the cache.

Additional Notes
There are multiple ways we could solve this problem. I think the best way is to make cache_update True by default for Algorithm and Compositor Nodes, and then in the documentation note how to optimize a pipeline that uses the same Algorithm/Compositor node in multiple places.

@mpu-creare mpu-creare added the bug Something isn't working label Feb 18, 2020
@mpu-creare mpu-creare added this to To do in 2.0.0 Release via automation Feb 18, 2020
@jmilloy
Copy link
Collaborator

jmilloy commented Feb 20, 2020

I think we are going to set cache_outputs = tl.Bool(default_value=False) for Algorithm nodes, maybe for other nodes as well. The caching is an optimization that some user's will want to turn on.

@mpu-creare mpu-creare assigned mpu-creare and unassigned jmilloy Apr 17, 2020
@mpu-creare mpu-creare moved this from To do to In progress in 2.0.0 Release Apr 17, 2020
mpu-creare added a commit that referenced this issue Apr 17, 2020
New users often get confused when developing new algorithms -- they make changes but get the same result. This is because the data is fetched from the cache (non-unique pipeline definition). Based on #372 we decided to change the default behaviour for most nodes.
2.0.0 Release automation moved this from In progress to Done Apr 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
No open projects
2.0.0 Release
  
Done
Development

No branches or pull requests

2 participants