insert nn to scatter_born can not back propagation #49

Can we insert a neural network into the scalar_born function? I have tried it, but it failed.
|
Hello,
Thank you for trying Deepwave and for sending me your question.
Can you tell me more about what you mean by inserting a neural network into
the equation? If you are able to show me the code that you tried it might
also help me to understand what you are trying to do and why it didn't work.
|
I want to insert a U-Net into the scalar_born function to produce the "scatter" parameter, but it seems to fail. I have tested the U-Net on its own and it is correct. |
The error says that parameters needed for autograd have been modified by an in-place operation. |
This is the code I use with Deepwave. |
Do you have any idea what is going on? 😭 |
Thank you. I now understand better what you are trying to achieve,
although I still have some questions.
You have a migration velocity model, v_mig1. Is this constant
(v_mig1.requires_grad = False), or do you wish for Deepwave to update
it?
You then have a network, net1, that you use to produce a scattering
model from this velocity model. You hope that Deepwave will update the
parameters of net1 during iterations over the data so that it produces
better scattering models from the velocity model. Is that correct?
Why does your code contain these lines:

```python
v_tmp = net(v_mig1.unsqueeze(0).unsqueeze(0))
scatter1 = scatter.unsqueeze(0).unsqueeze(0) + v_tmp
```

rather than simply writing:

```python
scatter = net(v_mig1.unsqueeze(0).unsqueeze(0))
```

and then passing scatter to scalar_born?
What does "net.train()" do?
If you wish to update your network model parameters after each batch
(rather than after each epoch), then you will need to call net1 in
each batch. You can do this by moving `scatter = net(v_mig1...)`
inside the `for batch in range(n_batch)` loop.
You should replace `loss.backward(retain_graph=True)` with `loss.backward()`.
|
I have done what you suggested but still have a problem. It shows:

```
RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
```
|
If I change back to loss.backward(retain_graph=True), it generates another problem:

```
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 64, 1, 1]] is at version 6; expected version 5 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True)
```
|
What I want to do is use the NN to produce the scatter, just as you said, but this error always appears, so I cannot update the parameters inside the NN. |
These lines are not necessary:

```python
scatter = torch.zeros_like(v_mig1)
scatter.requires_grad_()
```

I think the problem you encountered will be resolved by moving
`scatter = net(v_mig1...)` inside the `for batch in range(n_batch)`
loop. Once you do this you can remove `retain_graph=True`.
|
`optimiser.zero_grad()` should also be inside that loop (since you call
`optimiser.step()` within it).
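Putting these suggestions together, your batch loop might look something like the following sketch (it uses the variable names from your code, so you may need to adjust it):

```python
for batch in range(n_batch):
    optimiser.zero_grad()
    # Run the network each batch so a fresh graph is built,
    # making retain_graph=True unnecessary
    scatter = net(v_mig1.unsqueeze(0).unsqueeze(0)).squeeze(0).squeeze(0)
    s = slice(batch * n_shots_per_batch,
              (batch + 1) * n_shots_per_batch)
    out = scalar_born(v_mig1, scatter, dx, dt,
                      source_amplitudes=source_amplitudes[s],
                      source_locations=source_locations[s],
                      receiver_locations=receiver_locations[s],
                      pml_freq=freq)
    loss = 1e9 * loss_fn(out[-1] * mask[s], observed_scatter_masked[s])
    epoch_loss += loss.item()
    loss.backward()
    optimiser.step()
```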
|
If you still get an error about inplace updates, then please try
modifying your batch loop to this:
```python
for batch in range(n_batch):
    optimiser.zero_grad()
    scatter = net(v_mig1.unsqueeze(0).unsqueeze(0))
    loss = (
        1e9 * loss_fn(scatter, torch.zeros_like(scatter.detach()))
    )
    epoch_loss += loss.item()
    loss.backward()
    optimiser.step()
```
This will help to determine whether the problem occurs in scalar_born
or in net1.
|
Yes! It can run now. But a quick question: why do we need to update each batch rather than each epoch? In the examples in the Deepwave documentation, the updates happen each epoch. |
If we also update the background velocity using a NN, will this work? Do we need to set scatter.requires_grad = False, or can they be updated simultaneously? |
That is good news. If you only wish to run your network and modify its parameters once each epoch, rather than each batch, then I think your code can be modified to achieve that. It might reduce runtime (by running your network less frequently) and make updates more stable (as they will be based on the gradients from an entire epoch rather than just one batch), but it will probably also make convergence take longer, as your model will be updated much less frequently, and it will lose the randomness benefit of small batch updates. If you wish to do it, then something like this might work:

```python
for epoch in range(n_epochs):
    epoch_loss = 0
    scatter = net(v_mig1.unsqueeze(0).unsqueeze(0))
    scatter1 = scatter.detach().squeeze(0).squeeze(0)
    scatter1.requires_grad_()
    optimiser1 = torch.optim.SGD([scatter1], lr=1)
    optimiser1.zero_grad()
    for batch in range(n_batch):
        batch_start = batch * n_shots_per_batch
        batch_end = min(batch_start + n_shots_per_batch, n_shots)
        if batch_end <= batch_start:
            continue
        s = slice(batch_start, batch_end)
        out = scalar_born(v_mig1, scatter1, dx, dt,
                          source_amplitudes=source_amplitudes[s],
                          source_locations=source_locations[s],
                          receiver_locations=receiver_locations[s],
                          pml_freq=freq)
        loss = (1e9 * loss_fn(out[-1] * mask[s],
                              observed_scatter_masked[s]))
        epoch_loss += loss.item()
        loss.backward()
    optimiser1.step()  # update scatter1
    scatter1 = scatter1.detach().unsqueeze(0).unsqueeze(0)
    # train net to produce scatter1
    for it in range(n_its):
        optimiser.zero_grad()
        scatter = net(v_mig1.unsqueeze(0).unsqueeze(0))
        loss = loss_fn(scatter, scatter1)
        loss.backward()
        optimiser.step()
```

There may be other, perhaps more elegant, ways. This one separates the estimation of the scattering model each epoch from training the network to produce it.

In most cases the cost of running your neural network will be insignificant compared to the cost of running Deepwave, however, and so calling your neural network each batch (the way you currently do in your working code) will not substantially affect runtime (and avoids the complications in my code above of having multiple optimisers, etc.). You can still update its parameters only every epoch, rather than every batch, if you wish, by moving `optimiser.zero_grad()` and `optimiser.step()` outside the batch loop. A learning rate of 1 (combined with the large scaling applied to the loss) is only an initial guess, so you may need to tune it for your problem.

Regarding your second question: yes, you can also update velocity (and also source amplitude, if you wish) simultaneously. You will need to set `requires_grad` to True on the velocity model and include it in the parameters that the optimiser updates.
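For example (a rough sketch; the optimiser type and learning rates are illustrative placeholders, and the batch loop itself stays the same as in your working code):

```python
# Sketch only: make the background velocity a learnable tensor and give
# it to the same optimiser that updates the network's parameters.
v_mig1.requires_grad_()
optimiser = torch.optim.Adam([
    {'params': net.parameters()},      # updates the scatter-producing net
    {'params': [v_mig1], 'lr': 10.0},  # illustrative learning rate
])
```
|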
Thanks! That's quite useful, I will try it as you suggest! |
Hi, |
Can you explain it a little bit? |
Can you show me the code you used to conclude that the loss is unchanged when you update the background velocity model? |
And do you mean that the loss didn't change when you used completely different velocity models, or only that it didn't change over iterations when you were inverting for the velocity model? If the latter, have you checked that the velocity model actually changed over iterations? |
You mean the loss is unchanged but the velocity has already been updated? |
You appear to be trying to use scalar_born with a scattering potential
(`scatter`) that is fixed at zero. The output of Born forward modelling
(`scalar_born`) will therefore always be zero, regardless of what the
velocity is. You should either provide a scattering potential that is
constant (not updated, `requires_grad=False`) but non-zero, or one that is
updated (`requires_grad=True`, and included in the list of parameters
updated by the optimiser).
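For example (a sketch, with illustrative values):

```python
# Option 1: a constant, non-zero scattering potential (not updated)
scatter = 0.1 * torch.ones_like(v_mig1)
scatter.requires_grad_(False)

# Option 2: a scattering potential that the optimiser updates
scatter = torch.zeros_like(v_mig1)
scatter.requires_grad_()
optimiser = torch.optim.SGD([scatter], lr=1)
```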
|
Yes, I discovered this. Thanks! I have learned a lot about Deepwave. It is very powerful and useful. Thanks again.
Best wishes,
Weilin
|
One more question: if I want to understand the source code of Deepwave, do you have any reference that explains the theory behind it?
|
The code may look somewhat complicated, but it is actually quite simple.
The complicated appearance is mainly due to the way I had to write the code
to ensure that the compiler vectorised it to provide good performance. For
each propagator there is a Python file that does some setup and then calls
code in either a C++ or CUDA file, depending on whether the propagator is
running on a CPU or GPU. The C++/CUDA files contain standard forward and
adjoint finite-difference time-stepping propagators.

One difference from other implementations is that I also consider the PML
regions when calculating the adjoint. This ensures that Deepwave calculates
the exact gradient. It adds a bit more complication to the code, and is
unlikely to make a difference for seismic applications, but is useful for
testing the code, as I can verify that the calculated gradient is correct,
giving me confidence that everything is implemented correctly.

The Born propagator is also probably a bit different from other codes, as I
also made it possible to calculate the gradient with respect to the velocity
model and source amplitudes, but the equations for that can be worked out
fairly straightforwardly from the wave equation.
|
Great! Thanks for your kind reply. If we input a constant scatter into scalar_born, will this function generate the background wavefield, and then use this background wavefield as part of the source for Born modelling?
|
In Born modelling two wavefields are propagated. The first propagates
from the source. At each time step it interacts with the scattering
potential (`scatter`), acting as a source for the second (scattered)
wavefield. This scattered wavefield is what is recorded by the
receivers. So if you have one source and one point in your scattering
model that is non-zero, then the first wavefield will simply be the
source wave propagating out from the source. When this source wave
encounters the non-zero point in the scattering model, the non-zero
point will act like a source in the second, scattered, wavefield. The
wave in this scattered wavefield will propagate away from the point
and be recorded by the receivers.
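As a small illustration of that thought experiment (a sketch only: the grid, wavelet, and coordinates below are made up, and I assume the ricker helper from deepwave.wavelets):

```python
import torch
from deepwave import scalar_born
from deepwave.wavelets import ricker

ny, nx = 100, 100
dx, dt, nt, freq = 4.0, 0.004, 300, 25.0

v = 1500 * torch.ones(ny, nx)  # constant background velocity
scatter = torch.zeros(ny, nx)
scatter[50, 50] = 1.0          # a single non-zero scattering point

# One shot with one source and one receiver
source_amplitudes = ricker(freq, nt, dt, 1.5 / freq).reshape(1, 1, -1)
source_locations = torch.tensor([[[0, 10]]])
receiver_locations = torch.tensor([[[0, 90]]])

out = scalar_born(v, scatter, dx, dt,
                  source_amplitudes=source_amplitudes,
                  source_locations=source_locations,
                  receiver_locations=receiver_locations,
                  pml_freq=freq)
# The recorded data contain only the scattered wavefield, so all the
# energy at the receiver arrives via the point at (50, 50)
scattered_at_receiver = out[-1]
```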
|
Thanks! It is really helpful!
|
Hi Weilin, I am closing this issue, but please feel free to reopen it, or to create a new issue, if you have any more problems or questions. |
Thanks!
|