
Made AutoDiffCostFunction._tmp_optim_vars copies of original #155

Merged
merged 1 commit into main on Apr 13, 2022

Conversation

@luisenp luisenp (Contributor) commented Apr 12, 2022

This preserves optimization variable types and can solve errors like this.
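Illustratively, the type-preservation point can be sketched with generic Python classes (hypothetical names, not the actual theseus implementation): rebuilding a base Variable from raw tensor data loses the subclass, whereas a copy() defined on the class can preserve it.

import torch

class Variable:
    def __init__(self, data):
        self.data = data
    def copy(self):
        # copy via type(self) so subclasses keep their type
        return type(self)(self.data.clone())

class SE3(Variable):  # stands in for a typed optimization variable
    pass

v = SE3(torch.eye(4))
print(type(Variable(v.data)).__name__)  # Variable  (type information lost)
print(type(v.copy()).__name__)          # SE3       (type preserved)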

Besides running the unit tests, I also ran the backward_modes.py script and the first tutorial. Everything seems to be working without issues after this change.

cc @exhaustin

@luisenp luisenp requested a review from mhmukadam April 12, 2022 17:18
@luisenp luisenp self-assigned this Apr 12, 2022
@luisenp luisenp added the bug label Apr 12, 2022
@exhaustin exhaustin (Contributor) commented

I don't have much context regarding the effect of this change, but thanks for fixing!

@@ -135,7 +135,7 @@ def _register_vars_in_list(var_list_, is_optim=False):

         # The following are auxiliary Variable objects to hold tensor data
         # during jacobian computation without modifying the original Variable objects
-        self._tmp_optim_vars = tuple(Variable(data=v.data) for v in optim_vars)
+        self._tmp_optim_vars = tuple(v.copy() for v in optim_vars)
@mhmukadam mhmukadam (Member) commented Apr 12, 2022

This makes sense. I dug into Variable.copy again and found that data.clone() is called. I want to make sure that the intent here is to have _tmp_optim_vars connected to the compute graph.

More generally, is this okay or should we actually be doing data.detach().clone() in all our copy methods?
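For reference, a minimal PyTorch sketch of the clone() vs. detach().clone() distinction under discussion (illustrative only, not theseus code):

import torch

x = torch.ones(3, requires_grad=True)
y = (x * 2).clone()           # clone() keeps the autograd history
z = (x * 2).detach().clone()  # detach().clone() cuts the graph

print(y.requires_grad, y.grad_fn is not None)  # True True
print(z.requires_grad, z.grad_fn is not None)  # False False

y.sum().backward()  # gradients still flow back to x through the cloned tensor
print(x.grad)       # tensor([2., 2., 2.])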

@luisenp luisenp (Contributor, Author) commented

This is a good observation! I don't think this actually matters, since _tmp_optim_vars is just a container to hold the actual tensor data when computing jacobians. Any data here gets replaced every time the cost function is called. It would roughly function like the example below:

[image: example code snippet]
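Roughly, the idea is captured by the following hypothetical sketch, with a plain container class standing in for the actual theseus Variable:

import torch

# Hypothetical stand-in for a theseus Variable: just a holder for a tensor.
class Container:
    def __init__(self, data):
        self.data = data

tmp = Container(torch.zeros(2, 3))  # whatever was copied at construction time
for _ in range(3):
    fresh = torch.randn(2, 3, requires_grad=True)
    tmp.data = fresh  # contents are overwritten on every call, so any graph
                      # history attached to the original copy never matters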

@luisenp luisenp (Contributor, Author) commented

More generally, I don't think we should be doing detach().clone() in our copy methods, but maybe we should add a flag that controls this.
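If such a flag were added, it could look roughly like the following hypothetical sketch (not the actual theseus copy() signature):

# Hypothetical sketch only; the real theseus copy() signature may differ.
def copy(self, detach=False):
    # detach=True would cut the returned copy from the compute graph
    data = self.data.detach().clone() if detach else self.data.clone()
    return type(self)(data=data)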

@mhmukadam mhmukadam (Member) commented

Maybe we can provide a separate detach() function? If that makes sense, we can open an issue for it for now.

When you call Objective.clone() to get a new objective, is it still connected to the previous compute graph?

@mhmukadam mhmukadam (Member) commented

Based on an offline conversation, we might revisit the detach idea later, since it could be tricky to set up: what should be just cloned vs. also detached when calling everything recursively from Objective.

@luisenp luisenp (Contributor, Author) commented

I guess it depends on where you call it. If you do something like the code below, I'm almost sure that the resulting graph would be connected.

objective = create_objective()
optimizer = SomeOptimizer(objective)
layer = TheseusLayer(optimizer)
input_dict = compute_some_initial_vars()
sol_dict = layer.forward(input_dict)
new_objective = objective.copy()
new_optimizer = SomeOptimizer(new_objective)
layer2 = TheseusLayer(new_optimizer)
sol = layer2.forward()  # no input here, keep whatever variables were copied
loss = do_some_stuff(sol)
loss.backward()

@mhmukadam mhmukadam merged commit 262297e into main Apr 13, 2022
@mhmukadam mhmukadam deleted the lep.autodiff_cost_tmp_vars_make_copy branch April 13, 2022 16:34
suddhu pushed a commit to suddhu/theseus that referenced this pull request Jan 21, 2023