Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handwritten adjoints and removal of Zygote #107

Merged
merged 5 commits into from
Feb 26, 2021
Merged

Handwritten adjoints and removal of Zygote #107

merged 5 commits into from
Feb 26, 2021

Conversation

michel2323
Copy link
Member

@michel2323 michel2323 commented Feb 25, 2021

This PR introduces handwritten adjoints for all the constraints. The line flow constraint adjoints exhibit the same issues than the Zygote ones: They disagree with ForwardDiff and FD about 1e-4. This issue has to be kept open. It is either a bug or some unknown numerical issue.

@michel2323 michel2323 marked this pull request as draft February 25, 2021 03:28
@michel2323 michel2323 changed the title Resolving condition in adjoint [WIP] Resolving condition in adjoint Feb 25, 2021
Copy link
Member

@frapac frapac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some preliminary comments on this PR. I need to write down the math to better understand what we are doing 😬

@@ -244,6 +307,52 @@ function transfer!(polar::PolarForm, buffer::PolarNetworkState, u)
wait(ev)
end

KA.@kernel function adj_transfer_kernel!(
adj_vmag, adj_vang, adj_pinj, adj_qinj, adj_u, pv, pq, ref, adj_pload, adj_qload
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess here adj_pload is the adjoint of the power generation pg (adj_pg), as the load is constant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure. In that case we could remove that argument. I am not yet sure how to handle the adjoint! signatures in general. What we also have to clarify is weither the adjoint routines return a primal value. I would not recommend that.

@@ -130,14 +133,14 @@ KA.@kernel function residual_adj_kernel!(F, adj_F, v_m, adj_v_m, v_a, adj_v_a,
if i > npv
F[i + npq] -= qinj[fr]
end
@inbounds for c in ybus_re_colptr[fr]:ybus_re_colptr[fr+1]-1
@inbounds for c in colptr[fr]:colptr[fr+1]-1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice cleansing!

nvbus = length(vm)
nnz = length(ybus_re.nzval)
T = eltype(vm)
edge_vm_from = zeros(T, nnz)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if we could reuse these variables to compute the adjoint of the line flow constraints?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we could. I guess this belongs into the cache together with the adjoints.

@michel2323
Copy link
Member Author

Just some preliminary comments on this PR. I need to write down the math to better understand what we are doing 😬

I take shortcuts in the code; like for example when to use += etc. This is not the best example to learn manual differentiation.

michel2323 and others added 3 commits February 26, 2021 11:01
* currently implemented for `power_balance` and
  `reactive_power_constraints`
* homogeneize signature for AutoDiff.Hessian
@michel2323 michel2323 changed the title [WIP] Resolving condition in adjoint [WIP] Handwritten adjoints and removal of Zygote Feb 26, 2021
@michel2323 michel2323 changed the title [WIP] Handwritten adjoints and removal of Zygote Handwritten adjoints and removal of Zygote Feb 26, 2021
@michel2323 michel2323 marked this pull request as ready for review February 26, 2021 17:19
Copy link
Member

@frapac frapac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok to merge once minor comments have been addressed

@@ -20,6 +20,8 @@ jobs:
- uses: julia-actions/julia-runtest@latest

test-moonshot:
env:
CUDA_VISIBLE_DEVICES: 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why setting this value? (I am curious 😺 )

Copy link
Member Author

@michel2323 michel2323 Feb 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes the tests on moonshot run on device=1 to avoid a clash with most users due to #110.

if isa(buffer.vmag, Array)
kernel! = accumulate_view!(KA.CPU())
kernel! = adj_branch_flow_kernel!(KA.CPU())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When polar is available in the scope, I would suggest defining the kernel directly as

kernel! = adj_branch_flow_kernel!(polar.device)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh nice! indeed!

@@ -1,4 +1,5 @@
import KernelAbstractions: @index
using ForwardDiff
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to import ForwardDiff here, as it is already loaded in src/ExaPF.jl

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok!


"""
accumulate_view(x, vx, indices)
# not needed in the reverse run
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this dead comment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to have the forward code as a comment in the adjoint function. That's just a preference, we can remove it if you don't like it. It's easier for readability of the adjoint code if someone looks at it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok to keep the comment, then!

end
end
# not needed in the reverse run
# to_flow = vmag[to_bus]^2 * (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and this one?

# slines[ℓ + nlines] = to_flow

adj_to_flow = adj_slines[ℓ + nlines]
adj_vmag[to_bus] += (2 * vmag[to_bus] * ytf_abs * vmag[fr_bus]^2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would replace 2 by 2.0 (same 4 and 6)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I tried that actually yesterday to make sure that's not the cause of the numerical issues.

adj_Δθ = cosθ * adj_sinθ
adj_Δθ -= sinθ * adj_cosθ
adj_vang[fr_bus] += adj_Δθ
adj_vang[to_bus] -= adj_Δθ
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job! This adjoint does not look trivial

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's why it's off!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants